14:00:46 #startmeeting cinder
14:00:47 #topic roll call
14:00:47 #link https://etherpad.openstack.org/p/cinder-ussuri-meetings
14:00:48 Meeting started Wed Apr 1 14:00:46 2020 UTC and is due to finish in 60 minutes. The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:48 o/
14:00:49 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:50 hi! o/
14:00:51 hi
14:00:52 The meeting name has been set to 'cinder'
14:00:57 sorry, had some technical difficulties there
14:00:59 hi
14:01:02 Hi
14:01:21 o/
14:01:28 hi
14:01:43 looks like a good crowd, i'll get started because there's a meeting right after ours
14:01:48 o/
14:02:15 hello
14:02:31 #topic announcements - upcoming deadlines
14:02:34 this week (tomorrow) - os-brick release
14:02:34 next week:
14:02:34 * ussuri cinderclient release
14:02:34 * ussuri milestone-3 release
14:02:34 * FEATURE FREEZE
14:02:35 two weeks after that (week of 20 April):
14:02:35 * RC-1
14:02:47 so feature freeze is coming up fast
14:03:08 let's do a quick check of the os-brick release status
14:03:22 #link https://etherpad.openstack.org/p/os-bric-ussuri-release
14:03:37 not much in there, just a few bugfixes
14:03:57 looks like no change in the status of the unmerged changes
14:04:07 #link https://review.opendev.org/#/q/status:open+project:openstack/os-brick+branch:master
14:04:30 so unless anyone has a last-minute thing, i'll propose the release from HEAD later today
14:04:43 for victoria, we will have some big changes
14:04:43 +1 for a release
14:05:15 thanks e0ne
14:05:54 #topic announcements - relaxing the two-company rule
14:06:04 #link http://lists.openstack.org/pipermail/openstack-discuss/2020-March/013423.html
14:06:27 this has become an issue with so many of us working for red hat these days
14:06:41 nova has relaxed the rule
14:06:57 i think we should as well
14:07:15 +1
14:07:25 ++
14:07:37 but with the proviso that it's always better to have a diversity of opinion, so we should try to get reviewers from different companies
14:07:43 We've suggested it in the past, but I don't think it's ever been an issue in cinder.
14:07:44 nothing changes for me :)
14:07:45 but we shouldn't hold things up too long
14:07:51 hi
14:07:59 smcginnis: ++
14:08:24 #topic announcements - virtual PTG brainstorming sessions
14:08:30 in case anyone is interested
14:08:31 rosmaita: +1
14:08:39 there are some virtual sessions over the next week or so
14:08:49 #link http://lists.openstack.org/pipermail/openstack-discuss/2020-March/013699.html
14:09:05 so if you'd like to help shape what the virtual PTG would be like, please attend
14:09:23 i think that's it for announcements
14:09:41 #topic Volume migration improvements
14:09:46 zigo: that's you
14:10:00 #link https://blueprints.launchpad.net/cinder/+spec/migration-improvement
14:11:09 looks like zigo may have been called away
14:11:31 anyway, he's interested in this, but doesn't have time to implement it
14:11:31 rosmaita: I'm in a meeting too ...
14:11:45 basically, the missing features are: progress indication, and a migration abort command. This was proposed 5 years ago, but abandoned (see the linked review on that page). It'd be nice if someone picked up the idea and implemented it for Victoria
14:11:46 (another meeting at work)
14:11:55 zigo: ok
14:12:08 It'd be very nice if the progress thing could be implemented by someone.
14:12:09 it's a good feature to have... but we need a volunteer to get this done
14:12:15 my question is: does anyone know if there was a technical reason why we didn't follow through?
14:12:28 or just lack of volunteers?
14:12:31 I don't think I'd have the needed skills anyway, I don't know the cinder project.
14:12:41 rosmaita: it could be an issue for some drivers, I guess
14:12:43 i'd imagine that measuring progress is a tricky problem to solve, since it would vary depending on the driver and how it copies data on the backend
14:12:58 rosmaita: but speaking about generic migration, it's absolutely doable
14:13:00 we'd need some system where each driver could have a hook that would report that data, if it even could
14:13:05 Having it for at least LVM would be nice already.
14:13:18 eharney: +1
14:13:53 ok
14:14:00 With the old patch, it would have worked for any driver where the compute node does a dd between volumes.
14:14:17 We need to make sure whatever we introduce can be used by all drivers.
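(A minimal sketch of what such an optional per-driver progress hook might look like. Every name below is hypothetical; no such interface exists in Cinder today.)

```python
# Hypothetical progress-hook interface, sketched from the discussion
# above; none of these names are real Cinder code.
class VolumeDriver(object):
    def get_migration_progress(self, context, volume):
        """Return percent complete (0-100) for an in-flight migration.

        Drivers that cannot measure backend copy progress inherit this
        default and report nothing.
        """
        return None


class LVMVolumeDriver(VolumeDriver):
    def __init__(self):
        # Made-up bookkeeping: volume id -> (bytes copied, total bytes),
        # updated by whatever performs the dd-style copy.
        self._copy_stats = {}

    def get_migration_progress(self, context, volume):
        copied, total = self._copy_stats.get(volume['id'], (0, 0))
        return int(100 * copied / total) if total else None
```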
14:16:16 sorry, i am thinking too much
14:16:20 2 things
14:16:38 1) is there sufficient interest on the part of driver maintainers?
14:16:46 2) someone to drive the work
14:16:56 would be good to get a sense of 1 before the PTG
14:17:09 so we don't spend a lot of time designing an interface no one will use
14:17:24 for the progress indication
14:17:39 zigo: do you want to circulate something on the ML?
14:17:59 maybe after M-3
14:18:28 ok, that's probably enough for now
14:18:45 #topic backporting 'supported' status for drivers
14:19:19 so, several vendors are getting their CIs running to be 'supported' in ussuri (they were supported == False in train)
14:19:31 the question came up: can that be backported to train?
14:19:41 initially i was thinking no
14:19:46 technically, it's not a new feature
14:19:54 but someone found a precedent:
14:19:56 since we don't have drivers running CI on stable branches a lot of the time... hard to say
14:20:07 #link https://review.opendev.org/#/c/700775/
14:20:10 eharney: exactly
14:20:12 I don't have a strong opinion on this
14:20:35 at least we have to force driver maintainers to have CI working for stable branches
14:20:36 and this time, we are only testing python 3, whereas train also supports py2
14:20:52 e0ne: good idea
14:21:18 so do we want to say, to backport 'supported', you need to have the CI working for Train?
14:22:17 is there good knowledge out there for how to do that?
14:22:31 not sure if it's just a switch to flip or more involved
14:22:41 eharney: good question, i don't know
14:23:55 well, i guess one way of looking at this is
14:24:16 actually, forget that, it doesn't make sense now that i started to write it down
14:24:46 I guess at least we know the vendor is around and presumably willing to address any issues that come up.
14:24:56 that is true
14:25:12 ok, tell you what, let's table this and return to it next week
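(For context on the code side of such a backport, as distinct from the CI question raised above: restoring a driver's supported status is typically a very small change, assuming Cinder's usual SUPPORTED class attribute. The driver class below is invented.)

```python
# Sketch of what a "restore supported status" change usually touches;
# AcmeISCSIDriver is an invented name, not a real Cinder driver.
class AcmeISCSIDriver(object):
    # This was flipped to False while the vendor CI was not reporting.
    # Setting it back to True means operators no longer need the
    # enable_unsupported_driver option in cinder.conf to load the driver.
    SUPPORTED = True
    VERSION = '1.2.3'
```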
14:25:36 #topic Bug: Cinder throws error creating incremental backup from parent in another project
14:25:42 hello
14:25:50 the floor is yours!
14:25:57 I found this bug: https://bugs.launchpad.net/cinder/+bug/1869746
14:25:58 Launchpad bug 1869746 in Cinder "Cinder throws error creating incremental backup from parent in another project" [Undecided,New]
14:26:12 and it has two possible solutions, I would like to discuss which one of them makes more sense
14:26:34 oh no
14:27:11 so, the bug in summary: if the user tries to create an incremental backup based on a backup created by another user, the code throws an error, but the backup is created anyway
14:27:24 this only happens like this with the ceph backup driver
14:27:35 ganso: the bug description says nothing about different projects, only different users are mentioned
14:27:48 e0ne: sorry, perhaps I need to improve the description
14:27:53 good question, so is it same project, different user?
14:28:33 Seems like this would only be an issue between an admin creating a backup and an owning user/project creating a backup.
14:28:33 rosmaita: that's actually a good question, I followed the steps to reproduce exactly, with the users admin and demo
14:28:35 actually the admin user could be in any project
14:28:45 They should probably be treated separately.
14:28:49 the admin user belongs to the admin project, while the demo user belongs to the demo project, so it could be both
14:29:39 Or even if created by admin, the backup should get the owning project of the volume owner. Any reason why a cloud admin would be creating backups they don't want accessible by a project?
14:29:41 the volume from which the backup is created belongs to demo
14:31:02 smcginnis: so in this specific case, we were able to spot this bug because there were a few backups created by admin
14:31:21 smcginnis: and they were thought to be harmless, but they are actually causing issues
14:31:33 ganso: so do you know, if it's just 2 users in the same project, does everything work as expected?
14:31:38 ganso: is it reproducible using different users without the admin role?
14:31:40 ganso: Is it their intent that those admin-created backups are not accessible by the volume owner?
14:32:09 rosmaita: I didn't test those other combinations: 2 different users in the same project, or whether it has to be different projects
14:32:16 e0ne: I am not sure
14:32:35 ok, because in general, i think you would not expect this to work cross-project
14:32:43 smcginnis: ^^ is that right?
14:32:45 I would guess it has to be different projects. Which would mean it has to be admin versus someone in the project.
14:32:50 it sounds to me like the real failure here is about checking if the user has access to the backup (i.e. two different projects) -- not really anything related to admin
14:32:58 smcginnis: so, it was a special case that caused this bug, the customer can just reorganize their backups to avoid it, but this specific scenario allowed the bug to be found
14:33:02 eharney: ++
14:33:08 perhaps a check is not happening early enough to catch when that should be rejected
14:33:12 eharney: +1
14:33:25 eharney: ++
14:33:26 I think 1) we should make sure the user has access to the backup before attempting to create an incremental backup from it, and
14:33:43 eharney: so, exactly, other drivers that have the chunked driver as a base class don't have this problem, because it tries to get the parent first
14:33:51 right
14:33:54 and it throws the error immediately
14:34:03 eharney +1
14:34:06 2) decide whether we need to support two different sets of backup trees by keeping these separate, or, if an admin creates a backup, whether the backup should still be set to be owned by the same project as the one that owns the volume.
14:34:08 we can check this in the API rather than in the backup driver though
14:34:36 yes, it would be good to fail as early as possible
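(A sketch of the "fail early in the API" idea being discussed. The function and its names are simplified stand-ins, not Cinder's actual backup API code; exception.InvalidBackup is a real Cinder exception.)

```python
# Simplified stand-in for an early check in the backup API, per the
# discussion above; not the real cinder.backup.api code.
from cinder import exception


def check_incremental_parent(context, latest_backup):
    """Reject an incremental backup whose parent the caller can't use."""
    if latest_backup is None:
        raise exception.InvalidBackup(
            reason="no backups available to do an incremental backup")
    if latest_backup.project_id != context.project_id:
        # The parent belongs to another project: refuse up front rather
        # than letting the backup driver fail after the new backup
        # record has already been created.
        raise exception.InvalidBackup(
            reason="parent backup is not accessible to this project")
```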
14:34:39 so, the way we found this problem was actually through the other bug: https://bugs.launchpad.net/cinder/+bug/1869749
14:34:40 Launchpad bug 1869749 in Cinder "Cinder leaves quota usage when deleting incremental backup with parent from another project" [Undecided,New]
14:34:57 we noticed that the quotas were frequently going out of sync
14:35:11 and the code in that section could be improved to prevent that from happening
14:35:20 So for my point 2, is it a valid use case that an admin wants their own backups of a tenant's volumes, or is the use case that the admin should be able to create backups on behalf of the tenant?
14:36:39 #link https://bugs.launchpad.net/cinder/+bug/1869746
14:36:39 i'd think it would be the "on behalf of" use case
14:36:40 Launchpad bug 1869746 in Cinder "Cinder throws error creating incremental backup from parent in another project" [Undecided,New]
14:37:10 but anyway, the root cause is during creation: the backup should either not be created successfully, or not raise errors. So the question is: if we are creating an incremental backup based on another backup that is not visible to that user, should the code override that (if possible) and create it anyway (since the volume owner is the user creating the incremental backup), or should we just implement a better check and throw a friendlier error (preferably at the API), and prevent the backup from being created at all?
14:37:21 In that case, we don't need to put in extra checking because a volume's backup would be for a volume they have access to... actually, scratch that - volume transfer.
14:37:28 So we do need the check.
14:37:55 And regardless of who creates a backup, the backup should be owned by the volume's project, not the creator's project.
14:38:30 And then, should we allow volume transfer to happen if there are backups present? Should those backups follow the volume like we decided to do with snapshots?
14:38:54 smcginnis: so, even if admin creates the backup, the backup should always be in the same project as the volume, that sounds good
14:39:09 and it solves the issue
14:39:35 that only works if you reject creating backups that users attempt on volumes they don't have access to
14:39:36 the volume transfer question is trickier
14:39:36 ganso: We still would need to address your concern because of backwards compatibility. We need to add a check on the owner of the last full backup before we attempt to create an incremental backup from it.
14:39:50 because you don't want one user being able to consume another user's quota with failed backups etc
14:39:52 eharney: Don't we do that now? I sure hope we do.
14:40:01 Admin being a special case.
14:40:13 well, i thought the bug report was showing that we don't do it before creating them
14:40:23 If it's admin.
14:40:39 If we allow anyone else to create a backup of someone else's volume, that would be a huge security hole.
14:40:56 eharney: the backups shouldn't be visible to that user, otherwise that would be possible
14:41:19 ok
14:42:05 also, is_public volumes...
14:42:33 eharney: good point, it will end up consuming quota against someone else's project
14:42:40 * eharney will look at some of this later
14:42:55 Ah, yep. So backup trees do need to be per-project.
14:43:22 So incremental does need a check to make sure the full backup it attempts to use is one that is actually owned by the current user.
14:43:28 And skip any that are not.
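(A sketch of the check-and-skip parent selection just described; the lookup is illustrative, not Cinder's real code, and the driver-ordering concern raised next would still apply to any such scheme.)

```python
# Hypothetical parent selection that skips backups owned by other
# projects; field names mirror Cinder's backup object, but this is
# illustrative, not the real lookup code.
def find_parent_backup(context, volume_backups):
    """Return the newest usable backup owned by the caller's project."""
    candidates = [b for b in volume_backups
                  if b.project_id == context.project_id
                  and b.status == 'available']
    if not candidates:
        return None
    # Skipping a foreign backup in the middle of the chain is where the
    # per-driver question below comes in.
    return max(candidates, key=lambda b: b.created_at)
```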
14:44:36 smcginnis: but considering the ceph driver, doesn't it cause problems for the driver? imagine that you have backup A as demo, backup B as admin, and backup C as demo; if backup C skips B according to the code and considers A as its parent, doesn't that cause problems?
14:44:46 actually, considering any driver, not just ceph
14:45:13 ganso: I'm not sure, but it sounds like it would need to be able to handle that situation.
14:46:50 I think we need to start an etherpad or something to start capturing some of this. Or a spec, but it's not really a new feature. Something somewhere to make sure everyone understands what the issue is and what any proposed fixes would be.
14:47:13 ok, so have we agreed that an incremental backup must be in the same project as the full backup it depends on?
14:47:27 rosmaita: actually no, because of is_public volumes
14:47:31 I think it has to be.
14:47:38 rosmaita: That sounds right.
14:47:43 ganso: Yes, because of is_public.
14:47:43 I'm confused
14:47:57 ganso: the backup, not the volume
14:48:05 yes... is_public was more about why a backup might not have the same owner as a volume
14:48:08 rosmaita: ooooh ok
14:48:20 but except for public volumes, it sounds like the backup should normally be in the same project as the volume
14:48:22 makes sense
14:48:41 but i agree with smcginnis
14:48:44 The open question for me is what happens with admin-created backups.
14:48:55 ganso, please write up how this should work in an etherpad
14:49:00 But I think, based on everything else, those backups would just be owned by the admin and not the owner.
14:49:04 smcginnis: I will update the launchpad bug entry with a summary of what was discussed here, with a link to the meeting notes
14:49:07 and we can all look at it and make sure it makes sense
14:49:08 *owner of the volume
14:49:14 ganso: ++
14:49:16 what does 'is_public volume' mean?
14:49:41 AFAIR, a volume can be owned by only one project
14:49:50 and we can't share volumes between projects
14:49:55 e0ne: good question ... i thought we only had public volume_types
14:50:03 e0ne: everyone can see the volume, which is owned by somebody. This would mean anybody can create a backup of the volume, since it is public
14:50:04 rosmaita: +1.
14:50:27 ganso: it's not public, it's available to everybody from the tenant
14:50:34 they're really different cases
14:51:00 it's different from visibility == public images in glance
14:51:11 hmmmm, I am not familiar with this. I know "is_public" from manila/glance/nova and thought this was the same
14:51:14 My cloud is down, so I can't check. I may have been mixing that up with volume types and images.
14:51:50 yeah, i believe e0ne is correct, it's only the volume_type that can be public
14:51:55 smcginnis: if that is the case, then the previous idea of always having the backup in the same project as the volume could work
14:52:19 ok, we are running low on time, this is another thing to investigate
14:52:24 ok
14:52:31 e0ne: thank you for raising this point
14:52:33 thanks everyone for the feedback on this! =)
14:52:46 http://paste.openstack.org/show/791472/
14:53:08 ganso: Yeah, if we don't have public, then I do think backups should always be in the project that owns the volume.
14:53:18 smcginnis: +2
14:53:37 #topic tempest tests
14:53:41 LiangFang: that's you
14:53:44 that seems like an odd thing to always require to me, but i need to think on it more
14:53:51 is that for your devstack plugin?
14:54:07 #link https://review.opendev.org/#/c/713772/
14:54:16 yes
14:54:25 I have solved the issues
14:54:31 eharney: Kind of like a snapshot. It wouldn't make sense for a volume to be in one project, and one of its snapshots to be in another. Anyway, we can think about it and talk later.
14:55:16 LiangFang: good news, i will look at the patch later
14:55:23 thanks
14:55:33 #topic open discussion
14:55:53 quickly, i learned about the reno.cache
14:56:18 is anyone aware of any changes in zuul, or something merged in cinder, causing this issue? https://review.opendev.org/#/c/697636/
14:56:23 if you get crazy release note builds (everything the same for all releases), delete the reno.cache in releasenotes/notes
14:56:33 there are a lot of valid failures from pylint, but they're unrelated to the patch
14:57:34 wow, that is pretty ugly
14:58:21 yep, most of them are in test files
14:58:32 pylint is not smart enough to understand a lot of unit test code adequately
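(A generic illustration, not taken from the Cinder patch in question, of why static analysis struggles with unit tests: mock replaces attributes at runtime, which a checker reading the source cannot see.)

```python
# Generic example of unit test code that static analysis handles
# poorly: the Mock and its assert_* members only exist at runtime.
import unittest
from unittest import mock


class FakeDriver(object):
    def attach(self, volume_id):
        raise NotImplementedError


class TestFakeDriver(unittest.TestCase):
    def test_attach_called(self):
        driver = FakeDriver()
        driver.attach = mock.Mock(return_value=None)
        driver.attach('vol-1')
        # A static checker sees FakeDriver.attach, not the Mock that
        # replaced it, so a call like this may be flagged as an error.
        driver.attach.assert_called_once_with('vol-1')


if __name__ == '__main__':
    unittest.main()
```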
I know "is_public" from manila/glance/nova and thought this was the same 14:51:14 My cloud is down, so I can't check. I may have been mixing that up with volume types and images. 14:51:50 yeah, i believe e0ne is correct, it's only the volume_type that can be public 14:51:55 smcginnis: if that is the case, then the previous idea of always having the backup the same project as the volume could work 14:52:19 ok, we are running low on time, this is another thing to investigate 14:52:24 ok 14:52:31 e0ne: thank you for raising this point 14:52:33 thanks everyone for feedback on this! =) 14:52:46 http://paste.openstack.org/show/791472/ 14:53:08 ganso: Yeah, if we don't have public, then I do think backups should always be in the project that owns the volume. 14:53:18 smcginnis: +2 14:53:37 #topic tempest tests 14:53:41 LiangFang: that's you 14:53:44 that seems like an odd thing to always require to me, but i need to think on it more 14:53:51 is that for your devstack plugin? 14:54:07 #link https://review.opendev.org/#/c/713772/ 14:54:16 yes 14:54:25 I have solve the issues 14:54:31 eharney: Kind of like a snapshot. It wouldn't make sense for a volume to be in one project, and one of its snapshots to be in another. Anyway, we can think about it and talk later. 14:55:16 LiangFang: good news, i will look at the patch later 14:55:23 thanks 14:55:33 #open discussion 14:55:53 quickly, i learned about the reno.cache 14:56:18 is anyone aware about any changes in zuul or something merged in cinder causing this issue https://review.opendev.org/#/c/697636/ 14:56:23 if you get crazy releasenote builds, everything the same for all releases, delete the reno.cache in releasenotes/notes 14:56:33 there are a lot of valid failures on pylint but unrelated to the patch 14:57:34 wow, that is pretty ugly 14:58:21 yep, most of them are in test files 14:58:32 pylint is not smart enough to understand a lot of unit test code adequately 14:58:44 whoami-rajat: https://review.opendev.org/#/c/716600/ 14:59:22 smcginnis, great, so i just need to recheck 14:59:29 eharney++ 14:59:49 Not sure how quickly that makes it into the gate. Just ignore for now. 14:59:54 But it should go away soon. 15:00:08 ok, we are out of time 15:00:16 #endmeeting