-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rgw: support new zone pools #14715
rgw: support new zone pools #14715
Conversation
1854f01
to
d83f7bc
Compare
@@ -916,11 +916,14 @@ func adjustZoneDefaultPools(zone map[string]interface{}, spec cephv1.ObjectShare | |||
"user_uid_pool": ".meta.users.uid", | |||
"otp_pool": ".otp", | |||
"notif_pool": ".log.notif", | |||
"topics_pool": ".meta.topics", // introduced in Ceph v19 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we try to set these new properties in v18, does it fail? I assume that's why you're checking if the properties exist like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, if pool is not presented in source json then we just ignore it.
} | ||
for pool, nsSuffix := range zonePoolNSSuffix { | ||
// replace rgw internal index pools with namespaced metadata pool | ||
namespacedPool := defaultMetaPool + nsSuffix | ||
prev, err := setObjProperty(zone, namespacedPool, pool) | ||
prev, err := setObjPropertyIfExists(zone, namespacedPool, pool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative to the "IfExists" approach, is to check the ceph version. Something like this:
if cephVersion.IsAtLeast(cephver.CephVersion{Major: 19, Minor: 0, Extra: 0}) {
// add new properties
}
But if we can just check if the property exists like you already implemented, that may certainly keep things simpler in the long term when other properties are added.
During an upgraded cluster (e.g. from v18 --> v19), the new properties will be found to exist and then we'll add them? Just wanted to confirm if you already tested that scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that Ceph version check approach is better for code readability. But, as you said, with "IfExists" approach we can just add any new pools and don't care about ceph version.
I personally more like "IfExists" approach because it makes RGW responsible for pools structure and rook just tries to apply namespaces to all existing pools.
I can also suggest iterating through all root string fields ending with _pool
and log warning if pool is unknown. Or try to apply some namespace anyway. What do you think?
I've upgaded my local rook-ceph from v18.something
to v19.1.1-0
. All daemon pods been restared and using v19 images, but when i run radosgw-admin zone get
i don't see any new pools in json after update:
{
"id": "e522c205-dbd2-43e1-8ef8-a81a55d44ef1",
"name": "pp-store",
"domain_root": "meta-pool:pp-store.meta.root",
"control_pool": "meta-pool:pp-store.control",
"gc_pool": "meta-pool:pp-store.log.gc",
"lc_pool": "meta-pool:pp-store.log.lc",
"log_pool": "meta-pool:pp-store.log",
"intent_log_pool": "meta-pool:pp-store.log.intent",
"usage_log_pool": "meta-pool:pp-store.log.usage",
"roles_pool": "meta-pool:pp-store.meta.roles",
"reshard_pool": "meta-pool:pp-store.log.reshard",
"user_keys_pool": "meta-pool:pp-store.meta.users.keys",
"user_email_pool": "meta-pool:pp-store.meta.users.email",
"user_swift_pool": "meta-pool:pp-store.meta.users.swift",
"user_uid_pool": "meta-pool:pp-store.meta.users.uid",
"otp_pool": "meta-pool:pp-store.otp",
"system_key": {
"access_key": "",
"secret_key": ""
},
"placement_pools": [...],
"realm_id": "ceb20ae5-3e69-40c4-af9f-fc9bab7904db",
"notif_pool": "meta-pool:pp-store.log.notif"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can also suggest iterating through all root string fields ending with _pool and log warning if pool is unknown. Or try to apply some namespace anyway. What do you think?
That sounds great to log a warning if we find an extra pool, until we are more confident with that approach.
I've upgaded my local rook-ceph from v18.something to v19.1.1-0. All daemon pods been restared and using v19 images, but when i run radosgw-admin zone get i don't see any new pools in json after update:
This is what I worried about, the schema may not be updated until some zone update is triggered. So that means we may need to use the approach to check the ceph version instead of the "IfExists" approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds great to log a warning if we find an extra pool, until we are more confident with that approach.
I really like this idea as well.
Conceptually, I much prefer the IfExists
approach, but it does seem like we have an edge condition on upgrade we need figure out how to deal with properly.
If we upgrade from v17/v18 -> v19, my expectation would be that RGW (upon upgrade) is going to create a new pool before we Rook has a chance to see if a new pool needs configured. I'm wondering what bad would happen if the RGW did that and started putting stuff into that pool before Rook has a chance to configure it. But it also seems like my expectation is not what happens IRL, which is also concerning.
I think it's best to ask @cbodley what his recommendation is for handling the upgrade case when new fields are introduced to the zone config. How does RGW operate on upgrade when the zone spec is missing configs for default pools? Does RGW create the new default pools and use them implicitly, or does it just not use the new default pools? Is there a good (preferably idempotent) way of checking to see if the zone config is missing fields it needs after an RGW upgrade? Are there other things around this we're missing?
I think the best-case scenario for Rook would be if RGW doesn't init new default pools on upgrade and we have a recommendation for detecting what pools RGW wants to be configured that we can use on an if-exists basis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've upgaded my local rook-ceph from
v18.something
tov19.1.1-0
. All daemon pods been restared and using v19 images, but when i runradosgw-admin zone get
i don't see any new pools in json after update:
@arttor that makes it sound like the radosgw-admin
binary was still v18
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, yea. You are correct, i was using old radosgw-admin
🤦 . After toolbox update i can see new pools.
For existing rgw with shared pools, new pools are using <instance_name>.rgw.meta
pool:
{
"notif_pool": "meta-pool:pp-store.log.notif", // existing is using shared metadata pool "meta-pool"
"topics_pool": "pp-store.rgw.meta:topics", // new pool using default "pp-store.rgw.meta"
"account_pool": "pp-store.rgw.meta:accounts",
"group_pool": "pp-store.rgw.meta:groups",
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if rook wants to prevent this, it would have to use an upgraded
radosgw-admin zone set
command to update those pool names before any other radosgws/radosgw-admin commands get upgraded
@BlaineEXE how can we achive this? It would be very easy if we could add new pools with old radosgw-admin
. Or it is ok to ask users to upgrade rook to the latest version before upgrading Ceph? If it is ok, then we can just always force set new pools to zone and it should work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we know about the pools, can we attempt to always set them regardless of the Ceph version? That way, they will be ready on any upgrade.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that makes it sound like the radosgw-admin binary was still v18
That's a good point, and this is accurate. Because Ceph v19 is still pre-release, Rook is using the latest v18 image for its base image. That means that the Ceph CLI and all tools a re v18 tools, including radosgw-admin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed, added WARN log message for unknown pools in zone json
} | ||
for pool, nsSuffix := range zonePoolNSSuffix { | ||
// replace rgw internal index pools with namespaced metadata pool | ||
namespacedPool := defaultMetaPool + nsSuffix | ||
prev, err := setObjProperty(zone, namespacedPool, pool) | ||
prev, err := setObjPropertyIfExists(zone, namespacedPool, pool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds great to log a warning if we find an extra pool, until we are more confident with that approach.
I really like this idea as well.
Conceptually, I much prefer the IfExists
approach, but it does seem like we have an edge condition on upgrade we need figure out how to deal with properly.
If we upgrade from v17/v18 -> v19, my expectation would be that RGW (upon upgrade) is going to create a new pool before we Rook has a chance to see if a new pool needs configured. I'm wondering what bad would happen if the RGW did that and started putting stuff into that pool before Rook has a chance to configure it. But it also seems like my expectation is not what happens IRL, which is also concerning.
I think it's best to ask @cbodley what his recommendation is for handling the upgrade case when new fields are introduced to the zone config. How does RGW operate on upgrade when the zone spec is missing configs for default pools? Does RGW create the new default pools and use them implicitly, or does it just not use the new default pools? Is there a good (preferably idempotent) way of checking to see if the zone config is missing fields it needs after an RGW upgrade? Are there other things around this we're missing?
I think the best-case scenario for Rook would be if RGW doesn't init new default pools on upgrade and we have a recommendation for detecting what pools RGW wants to be configured that we can use on an if-exists basis.
d83f7bc
to
c53d5ae
Compare
bfb3c9a
to
304c189
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've done some more testing locally. Tried 4 possible combinations with v18/v19 ceph-daemons and rook-ceph-dependencies. I was not able to find any errors in rgw or operator logs for a simple scenarios like put/get/head bucket and object. I've also tested bucket notifications for v19 rook and v18 rgw combination and it worked.
I propose to use setIfExistApproach
approach for zone pools. Otherwise there will be unnecesary attempt of zone update for rook v18 case because new pools won't be saved to zone json.
} | ||
for pool, nsSuffix := range zonePoolNSSuffix { | ||
// replace rgw internal index pools with namespaced metadata pool | ||
namespacedPool := defaultMetaPool + nsSuffix | ||
prev, err := setObjProperty(zone, namespacedPool, pool) | ||
prev, err := setObjPropertyIfExists(zone, namespacedPool, pool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as discussed, added WARN log message for unknown pools in zone json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really just a small nit around naming as far as the code is written.
if shouldExist && !ok { | ||
// not exists: | ||
return prev, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect a bool called shouldExist
to cause the function to return an error if the field doesn't exist. Is there a better naming for this? Maybe setIfExists
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@arttor this is also bringing up some related questions for me, and I'd like to clarify my understanding.
From reading the overall code, it seems like there is no error returned if the field Rook tries to set doesn't exist. Maybe I am misreading the code; if so let me know. If I'm reading correctly, is this for a specific reason?
IMO, it's probably safest for users for Rook to return an error if it tries to set a field using setObjProperty()
if the field doesn't exist. Presumably, that case would happen if (1) RGW removes a field in the future, which we want to make sure we handle properly in code if needed, or (2) a dev makes a typo/mistake in a future Rook update.
Being able to catch (2) would be good, though not critical. Being able to catch (1) could be good to help make sure Rook's handling of RGW zone configs is always aware of RGW changes. I don't expect (1) to happen often, but catching it could be important.
Also, I don't think these questions are intrinsic to this PR. We can follow up after your PTO if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I was thinking about returning an error when field is missing to log it in caller. But in this case we should implement error handling on caller side: check error type to not fail method in case of missing field. Because for v18 it is expected that some pools will be missing. I was not able to find existing "domain" error types in Rook and thought that i should not introduce error types just for this funciton.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. now it retruns error if pool not exists and the error is logged with INFO level.
304c189
to
23ff820
Compare
return setObjPropertyCommon(false, obj, val, path...) | ||
} | ||
|
||
// setObjPropertyIfExists - helper function to manipulate JSON Objects. | ||
// sets new value to json object nested field only if it is already exists in json and returns previous value. | ||
func setObjPropertyIfExists[T string | []string | map[string]interface{} | []interface{}](obj map[string]interface{}, val T, path ...string) (T, error) { | ||
return setObjPropertyCommon(true, obj, val, path...) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these switched from what they should be?
I think shouldExist=false
when using setObjPropertyIfExists()
and shouldExist=true
when using setObjProperty()
makes sense right?
return setObjPropertyCommon(false, obj, val, path...) | |
} | |
// setObjPropertyIfExists - helper function to manipulate JSON Objects. | |
// sets new value to json object nested field only if it is already exists in json and returns previous value. | |
func setObjPropertyIfExists[T string | []string | map[string]interface{} | []interface{}](obj map[string]interface{}, val T, path ...string) (T, error) { | |
return setObjPropertyCommon(true, obj, val, path...) | |
} | |
return setObjPropertyCommon(true, obj, val, path...) | |
} | |
// setObjPropertyIfExists - helper function to manipulate JSON Objects. | |
// sets new value to json object nested field only if it is already exists in json and returns previous value. | |
func setObjPropertyIfExists[T string | []string | map[string]interface{} | []interface{}](obj map[string]interface{}, val T, path ...string) (T, error) { | |
return setObjPropertyCommon(false, obj, val, path...) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it is correct. setObjPropertyIfExists()
means that shouldExist=true
. Is it makes sense to change namig?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the confusion I have is the use of shouldExist
. I am interpreting this to be the same as mustExist
, effectively failIfNotExist
.
Perhaps the parameter shouldExist
could be renamed to allowNotExists
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You interpreting it correctly. It acts like mustExist
and will return error if property is not exists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read through the unit tests, and I think I understand where my confusion is coming from.
In the unit tests, I see that setObjProperty()
will inject a property into the json if the property doesn't already exist. Based on the name, I have been assuming that the setObjProperty()
function would error out if the property it is operating on doesn't exist because I interpret "set" to be equivalent to "update".
I also see that setObjPropertyIfExists()
returns an error if the property doesn't exist. Based on the name, I have been assuming that the setObjPropertyIfExists()
doesn't need to return an error if the property is non-existent because of "if exists".
I have some follow-ups thoughts to try to get/create clarity:
Does Rook need to be able to add fields to the json struct for Ceph?
If yes, renaming setObjProperty()
to setOrAddObjProperty()
(or addOrUpdateObjProperty()
) could be a good clarifying change. Following from that, setObjPropertyIfExists()
might be better named as setObjProperty()
(or updateObjProperty()
).
If Rook doesn't need to add fields to the json struct, maybe that feature could be removed from the methods? This case probably needs more discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarification, now i also see the problem. You are right, we use this method only for zone and zonegroup JSONs. And these JSONs are returned from radosgw-admin
and should always have all properties. This means that at least for now we can use setObjPropertyIfExists()
for all cases.
I will remove setObjProperty()
method and rename setObjPropertyIfExists()
to updateObjProperty()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
prev, err := setObjProperty(zone, namespacedPool, pool) | ||
prev, err := setObjPropertyIfExists(zone, namespacedPool, pool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the pools added here, I think setObjPropertyIfExists()
makes sense, but for existing pools, should we use setObjProperty()
so that Rook can error out if the property is missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think that we already have described behaviour with setObjPropertyIfExists()
. it will return error if pool is missing.
Old pools should always be presented. Even it is missing in ceph, radosgw-admin
will set default values to json anyway. If it will be missing in json then we will see warning in rook logs.
23ff820
to
5d44698
Compare
add rgw shared pools rados ns to topics_pool, account_pool, and group_pool Signed-off-by: Artem Torubarov <artem.torubarov@clyso.com>
5d44698
to
f6a25b0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
closes #14708