14:00:46 <rosmaita> #startmeeting cinder
14:00:47 <rosmaita> #topic roll call
14:00:47 <rosmaita> #link https://etherpad.openstack.org/p/cinder-ussuri-meetings
14:00:48 <openstack> Meeting started Wed Apr  1 14:00:46 2020 UTC and is due to finish in 60 minutes.  The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:48 <jungleboyj> o/
14:00:49 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:50 <geguileo> hi! o/
14:00:51 <e0ne> hi
14:00:52 <openstack> The meeting name has been set to 'cinder'
14:00:57 <rosmaita> sorry, had some technical difficulties there
14:00:59 <lseki> hi
14:01:02 <whoami-rajat> Hi
14:01:21 <tosky> o/
14:01:28 <eharney> hi
14:01:43 <rosmaita> looks like a good crowd, i'll get started because there's a meeting right after ours
14:01:48 <LiangFang> o/
14:02:15 <ganso> hello
14:02:31 <rosmaita> #topic announcements - upcoming deadlines
14:02:34 <rosmaita> this week (tomorrow) - os-brick release
14:02:34 <rosmaita> next week:
14:02:34 <rosmaita> * ussuri cinderclient release
14:02:34 <rosmaita> * ussuri milestone-3 release
14:02:34 <rosmaita> * FEATURE FREEZE
14:02:35 <rosmaita> two weeks after that (week of 20 April):
14:02:35 <rosmaita> * RC-1
14:02:47 <rosmaita> so feature freeze coming up fast
14:03:08 <rosmaita> let's do a quick check of the os-brick release status
14:03:22 <rosmaita> #link https://etherpad.openstack.org/p/os-bric-ussuri-release
14:03:37 <rosmaita> not much in there, just a few bugfixes
14:03:57 <rosmaita> looks like no change in the status of the unmerged changes
14:04:07 <rosmaita> #link https://review.opendev.org/#/q/status:open+project:openstack/os-brick+branch:master
14:04:30 <rosmaita> so unless anyone has a last minute thing, i'll propose the release from HEAD later today
14:04:43 <rosmaita> for victoria, we will have some big changes
14:04:43 <e0ne> +1 for a release
14:05:15 <rosmaita> thanks e0ne
14:05:54 <rosmaita> #topic announcements - relaxing the two-company rule
14:06:04 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-March/013423.html
14:06:27 <rosmaita> this has become an issue with so many of us working for red hat these days
14:06:41 <rosmaita> nova has relaxed the rule
14:06:57 <rosmaita> i think we should as well
14:07:15 <LiangFang> +1
14:07:25 <whoami-rajat> ++
14:07:37 <rosmaita> but with the proviso that it's always better to have a diversity of opinion, so we should try to get reviewers from different companies
14:07:43 <smcginnis> We've suggested it in the past, but I don't think it's ever been an issue in cinder.
14:07:44 <e0ne> nothing changes for me :)
14:07:45 <rosmaita> but we shouldn't hold things up too long
14:07:51 <enriquetaso> hi
14:07:59 <rosmaita> smcginnis: ++
14:08:24 <rosmaita> #topic announcements - virtual PTG brainstorming sessions
14:08:30 <rosmaita> in case anyone is interested
14:08:31 <e0ne> rosmaita: +1
14:08:39 <rosmaita> there are some virtual sessions over the next week or so
14:08:49 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-March/013699.html
14:09:05 <rosmaita> so if you'd like to help shape what the virtual PTG would be like, please attend
14:09:23 <rosmaita> i think that's it for announcements
14:09:41 <rosmaita> #topic Volume migration improvements
14:09:46 <rosmaita> zigo: that's you
14:10:00 <rosmaita> #link https://blueprints.launchpad.net/cinder/+spec/migration-improvement
14:11:09 <rosmaita> looks like zigo may have been called away
14:11:31 <rosmaita> anyway, he's interested in this, but doesn't have time to implement
14:11:31 <zigo> rosmaita: I'm in a meeting too ...
14:11:45 <rosmaita> basically, missing features are: progress indication and a migration abort command. This was proposed 5 years ago, but abandoned (see linked review on that page). It'd be nice if someone picked up the idea and implemented it for Victoria
14:11:46 <zigo> (another meeting with my work)
14:11:55 <rosmaita> zigo: ok
14:12:08 <zigo> It'd be very nice if the progress thing could be implemented by someone.
14:12:09 <e0ne> it's a good feature to have... but we need a volunteer to get this done
14:12:15 <rosmaita> my question is, does anyone know if there was a technical reason why we didn't follow through?
14:12:28 <rosmaita> or just lack of volunteers?
14:12:31 <zigo> I don't think I'd have the needed skills anyway, I don't know the cinder project.
14:12:41 <e0ne> rosmaita: it could be an issue for some drivers, I guess
14:12:43 <eharney> i'd imagine that measuring progress is a tricky problem to solve since it would vary depending on the driver and how it copies data on the backend
14:12:58 <e0ne> rosmaita: but speaking about generic migration it's absolutely doable
14:13:00 <eharney> we'd need some system where each driver could have a hook that would report that data, if it even could
14:13:05 <zigo> Having it for at least LVM would be nice already.
14:13:18 <e0ne> eharney: +1
14:13:53 <rosmaita> ok
14:14:00 <zigo> With the old patch, it would have worked for any driver where the compute node does a dd between volumes.
14:14:17 <smcginnis> We need to make sure whatever we introduce can be used by all drivers.
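
A minimal sketch of the optional per-driver progress hook discussed above, assuming hypothetical names (FakeDriver, migrate_volume_with_progress, run_migration) rather than any existing Cinder interface; a driver without the hook simply falls back to today's behaviour:

    # Hypothetical sketch only -- none of these names exist in Cinder today.
    # The idea: a driver reports copy progress when it can, and the manager
    # tolerates drivers that cannot.

    class FakeDriver(object):
        """Example driver that knows how much data it has copied."""

        CHUNK_MB = 256

        def migrate_volume_with_progress(self, volume, host, report):
            total_mb = volume['size'] * 1024  # assuming 'size' is in GiB
            copied_mb = 0
            while copied_mb < total_mb:
                copied_mb = min(copied_mb + self.CHUNK_MB, total_mb)
                # backend-specific copy of one chunk would happen here
                report(int(copied_mb * 100 / total_mb))  # percent complete


    def run_migration(driver, volume, host, save_progress):
        """Use the progress hook when the driver provides one."""
        hook = getattr(driver, 'migrate_volume_with_progress', None)
        if hook is not None:
            hook(volume, host, report=save_progress)
        else:
            driver.migrate_volume(volume, host)  # today's behaviour, no progress


    if __name__ == '__main__':
        run_migration(FakeDriver(), {'size': 1}, 'backend2@lvm#pool',
                      save_progress=lambda pct: print('migration %d%% done' % pct))
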
14:16:16 <rosmaita> sorry, i am thinking too much
14:16:20 <rosmaita> 2 things
14:16:38 <rosmaita> 1) is there sufficient interest on the part of driver maintainers?
14:16:46 <rosmaita> 2) someone to drive the work
14:16:56 <rosmaita> would be good to get a sense of 1 before the PTG
14:17:09 <rosmaita> so we don't spend a lot of time designing an interface no one will use
14:17:24 <rosmaita> for the progress indication
14:17:39 <rosmaita> zigo: do you want to circulate something on the ML?
14:17:59 <rosmaita> maybe after M-3
14:18:28 <rosmaita> ok, that's probably enough for now
14:18:45 <rosmaita> #topic backporting 'supported' status for drivers
14:19:19 <rosmaita> so, several vendors are getting their CIs running to be 'supported' in ussuri (they were supported == False in train)
14:19:31 <rosmaita> the question came up, can that be backported to train?
14:19:41 <rosmaita> initially i was thinking no
14:19:46 <e0ne> technically, it's not a new feature
14:19:54 <rosmaita> but someone found a precedent:
14:19:56 <eharney> since we don't have drivers running CI on stable branches a lot of the time... hard to say
14:20:07 <rosmaita> #link https://review.opendev.org/#/c/700775/
14:20:10 <rosmaita> eharney: exactly
14:20:12 <e0ne> I don't have a strong opinion on this
14:20:35 <e0ne> at least we have to force driver maintainers to have CI working for stable branches
14:20:36 <rosmaita> and this time, we are only testing python 3, whereas train also supports py2
14:20:52 <rosmaita> e0ne: good idea
14:21:18 <rosmaita> so do we want to say, to backport 'supported', you need to have the CI working for Train?
14:22:17 <eharney> is there good knowledge out there for how to do that?
14:22:31 <eharney> not sure if it's just a switch to flip or more involved
14:22:41 <rosmaita> eharney: good question, i don't know
14:23:55 <rosmaita> well, i guess one way of looking at this is
14:24:16 <rosmaita> actually, forget that, it doesn't make sense now that i started to write it down
14:24:46 <smcginnis> I guess at least we know the vendor is around and presumably willing to address any issues that come up.
14:24:56 <rosmaita> that is true
14:25:12 <rosmaita> ok, tell you what, let's table this and return to it next week
14:25:36 <rosmaita> #topic Bug: Cinder throws error creating incremental backup from parent in another project
14:25:42 <ganso> hello
14:25:50 <rosmaita> the floor is yours!
14:25:57 <ganso> I found this bug: https://bugs.launchpad.net/cinder/+bug/1869746
14:25:58 <openstack> Launchpad bug 1869746 in Cinder "Cinder throws error creating incremental backup from parent in another project" [Undecided,New]
14:26:12 <ganso> and it has two possible solutions, I would like to discuss which one of them makes more sense
14:26:34 <enriquetaso> oh no
14:27:11 <ganso> so the bug in summary: if the user tries to create an incremental backup based on a backup created by another user, the code throws an error, but the backup is created anyway
14:27:24 <ganso> this only happens like this running the ceph backup driver
14:27:35 <e0ne> ganso: bug description says nothing about different projects, only different users are mentioned
14:27:48 <ganso> e0ne: sorry perhaps I need to improve the description
14:27:53 <rosmaita> good question, so it's same project, different user?
14:28:33 <smcginnis> Seems like this would only be an issue between admin creating a backup and an owning user/project creating a backup.
14:28:33 <ganso> rosmaita: that's actually a good question, I did the steps to reproduce exactly, with users admin and demo
14:28:35 <rosmaita> actually user admin could be any project
14:28:45 <smcginnis> They should probably be treated separately.
14:28:49 <ganso> the user admin belongs to the admin project, while the demo user belongs to the demo project, so could be both
14:29:39 <smcginnis> Or even if created by admin, the backup should get the owning project of the volume. Any reason why a cloud admin would be creating backups they don't want accessible by a project?
14:29:41 <ganso> the volume from where the backup is created belongs to demo
14:31:02 <ganso> smcginnis: so in this specific case, we were able to spot this bug because there were a few backups created by admin
14:31:21 <ganso> smcginnis: and they thought to be harmless, but they are actually causing issues
14:31:33 <rosmaita> ganso: so you know if it's just 2 users in the same project, does everything work as expected?
14:31:38 <e0ne> ganso: is it reproducible using different users without admin role?
14:31:40 <smcginnis> ganso: Is it their intent that those admin created backups are not accessible by the volume owner?
14:32:09 <ganso> rosmaita: I didn't test with those other combinations of 2 different users in same project, or if it has to be different projects
14:32:16 <ganso> e0ne: I am not sure
14:32:35 <rosmaita> ok, because in general, i think you would not expect this to work cross-project
14:32:43 <rosmaita> smcginnis: ^^ is that right?
14:32:45 <smcginnis> I would guess it has to be different projects. Which would mean it has to be admin versus someone in the project.
14:32:50 <eharney> it sounds to me like the real failure here is about checking if the user has access to the backup (i.e. two different projects) -- not really anything related to admin
14:32:58 <ganso> smcginnis: so, it was a special case that caused this bug, the customer can just reorganize their backups to avoid the bug, but this specific scenario allowed the bug to be found
14:33:02 <smcginnis> eharney: ++
14:33:08 <eharney> perhaps a check is not happening early enough to catch when that should be rejected
14:33:12 <e0ne> eharney: +1
14:33:25 <rosmaita> eharney: ++
14:33:26 <smcginnis> I think 1) we should make sure the user has access to the backup before attempting to create an incremental backup from it, and
14:33:43 <ganso> eharney: so, exactly, other drivers that have chunkeddriver as a base class don't have this problem because they try to get the parent first
14:33:51 <eharney> right
14:33:54 <ganso> and it throws the error immediately
14:34:03 <enriquetaso> eharney +1
14:34:06 <smcginnis> 2) decide whether we need to support two different sets of backup trees by keeping these separate, or, if an admin creates a backup, whether the backup should still be owned by the same project as the one that owns the volume.
14:34:08 <eharney> we can check this in the API rather than in the backup driver though
14:34:36 <rosmaita> yes, would be good to fail as early as possible
14:34:39 <ganso> so, the way we found this problem was actually through the other bug: https://bugs.launchpad.net/cinder/+bug/1869749
14:34:40 <openstack> Launchpad bug 1869749 in Cinder "Cinder leaves quota usage when deleting incremental backup with parent from another project" [Undecided,New]
14:34:57 <ganso> we noticed that the quotas were frequently going off sync
14:35:11 <ganso> and the code in that section could be improved to prevent that from happening
14:35:20 <smcginnis> So for my point 2, is it a valid use case that an admin wants their own backups of a tenant's volumes, or is the use case that the admin should be able to create backups on behalf of the tenant?
14:36:39 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1869746
14:36:39 <rosmaita> i'd think it would be the "on behalf of" use case
14:36:40 <openstack> Launchpad bug 1869746 in Cinder "Cinder throws error creating incremental backup from parent in another project" [Undecided,New]
14:37:10 <ganso> but anyway, the root cause is during creation, the backup should either not be created successfully, or not raise errors. So the question is: If we are creating an incremental backup based on another backup that is not visible to that user, should the code override that (if possible) and create it anyway (since the volume owner is the user creating the incremental backup), or should we just implement a better check and throw a more friendly
14:37:10 <ganso> error (preferably at the API), and prevent the backup from being created at all
14:37:21 <smcginnis> In that case, we don't need to put in extra checking because a volume's backup would be for a volume they have access to... actually, scratch that - volume transfer.
14:37:28 <smcginnis> So we do need the check.
14:37:55 <smcginnis> And regardless of whoever creates a backup, the backup should be owned by the volume's project, not the creator's project.
14:38:30 <smcginnis> And then, should we allow volume transfer to happen if there are backups present? Should those backups follow the volume like we decided to do with snapshots?
14:38:54 <ganso> smcginnis: so, even if admin creates the backup, the backup should always be in the same project as the volume, that sounds good
14:39:09 <ganso> and it solves the issue
14:39:35 <eharney> that only works if you reject creating backups that users attempt on volumes they don't have access to
14:39:35 <rosmaita> the volume transfer question is trickier
14:39:36 <smcginnis> ganso: We still would need to address your concern because of backwards compatibility. We need to add a check on owner of the last full backup before we attempt to create an incremental backup from it.
14:39:50 <eharney> because you don't want one user being able to consume another user's quota with failed backups etc
14:39:52 <smcginnis> eharney: Don't we do that now? I sure hope we do.
14:40:01 <smcginnis> Admin being a special case.
14:40:13 <eharney> well, i thought the bug report was showing that we don't do it before creating them
14:40:23 <smcginnis> If it's admin.
14:40:39 <smcginnis> If we allow anyone else to create a backup of someone else's volume, that would be a huge security hole.
14:40:56 <ganso> eharney: the backups shouldn't be visible to that user, otherwise that would be possible
14:41:19 <eharney> ok
14:42:05 <eharney> also, is_public volumes...
14:42:33 <ganso> eharney: good point, It will end up consuming quota against someone else's project
14:42:40 * eharney will look at some of this later
14:42:55 <smcginnis> Ah, yep. So backup trees do need to be per-project.
14:43:22 <smcginnis> So incremental does need a check to make sure the full backup it attempts to use is one that is actually owned by the current user.
14:43:28 <smcginnis> And skip any that are not.
14:44:36 <ganso> smcginnis: but considering the ceph driver, doesn't it cause problems for the driver? imagine that you have backup A as demo, backup B as admin, and backup C as demo, if backup C skips B according to the code and considers A as parent, doesn't it cause problems?
14:44:46 <ganso> actually, considering any driver, not just ceph
14:45:13 <smcginnis> ganso: I'm not sure, but it sounds like it would need to be able to handle that situation.
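
A rough sketch of the API-side check discussed above, under the assumption that the parent for an incremental backup is chosen only from backups owned by the requesting project and that the request fails before anything is created when no usable parent exists; the names InvalidBackup, latest_backup_for_project and validate_incremental_request are illustrative, not real Cinder code:

    # Illustrative only; a real check would live in the backup API layer.

    class InvalidBackup(Exception):
        pass


    def latest_backup_for_project(backups, volume_id, project_id):
        """Newest available backup of this volume owned by the same project."""
        candidates = [b for b in backups
                      if b['volume_id'] == volume_id
                      and b['project_id'] == project_id   # skip other projects
                      and b['status'] == 'available']
        return max(candidates, key=lambda b: b['created_at'], default=None)


    def validate_incremental_request(backups, volume_id, project_id):
        """Reject the request at the API, before any driver code runs."""
        parent = latest_backup_for_project(backups, volume_id, project_id)
        if parent is None:
            raise InvalidBackup('no valid parent backup exists in this project; '
                                'cannot create an incremental backup')
        return parent
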
14:46:50 <smcginnis> I think we need to start an etherpad or something to start capturing some of this. Or a spec, but it's not really a new feature. Something somewhere to make sure everyone understands what the issue is and what any proposed fixes would be.
14:47:13 <rosmaita> ok, so have we agreed that an incremental backup must be in the same project as the full backup it depends on?
14:47:27 <ganso> rosmaita: actually no, because of is_public volumes
14:47:31 <smcginnis> I think it has to be.
14:47:38 <jungleboyj> rosmaita: That sounds right.
14:47:43 <smcginnis> ganso: Yes, because of is_public.
14:47:43 <ganso> I'm confused
14:47:57 <rosmaita> ganso: the backup, not the volume
14:48:05 <eharney> yes... is_public was more about why a backup might not have the same owner as a volume
14:48:08 <ganso> rosmaita: ooooh ok
14:48:20 <rosmaita> but except for public volumes, sounds like the backup should normally be in the same project as the volume
14:48:22 <ganso> makes sense
14:48:41 <rosmaita> but i agree with smcginnis
14:48:44 <smcginnis> The open question for me is what happens with admin created backups.
14:48:55 <rosmaita> ganso, please write up how this should work in an etherpad
14:49:00 <smcginnis> But I think, based on everything else, those backups would just be owned by the admin and not the owner.
14:49:04 <ganso> smcginnis: I will update the launchpad bug entry with a summary of what was discussed here, with a link to the meeting notes
14:49:07 <rosmaita> and we can all look at it and make sure it makes sense
14:49:08 <smcginnis> *owner of the volume
14:49:14 <smcginnis> ganso: ++
14:49:16 <e0ne> what does 'is_public volume' mean?
14:49:41 <e0ne> AFAIR, volume could be owned only by one project
14:49:50 <e0ne> and we can't share volumes between projects
14:49:55 <rosmaita> e0ne: good question ... i thought we only had public volume_types
14:50:03 <ganso> e0ne: everyone can see the volume, which is owned by somebody. This would mean anybody can create a backup of the volume, since it is public
14:50:04 <e0ne> rosmaita: +1.
14:50:27 <e0ne> ganso: it's not public. it's available to everybody in the tenant
14:50:34 <e0ne> they're really different cases
14:51:00 <rosmaita> it's different from visibility == public images in glance
14:51:11 <ganso> hmmmm I am not familiar with this. I know "is_public" from manila/glance/nova and thought this was the same
14:51:14 <smcginnis> My cloud is down, so I can't check. I may have been mixing that up with volume types and images.
14:51:50 <rosmaita> yeah, i believe e0ne is correct, it's only the volume_type that can be public
14:51:55 <ganso> smcginnis: if that is the case, then the previous idea of always having the backup the same project as the volume could work
14:52:19 <rosmaita> ok, we are running low on time, this is another thing to investigate
14:52:24 <ganso> ok
14:52:31 <rosmaita> e0ne: thank you for raising this point
14:52:33 <ganso> thanks everyone for feedback on this! =)
14:52:46 <e0ne> http://paste.openstack.org/show/791472/
14:53:08 <smcginnis> ganso: Yeah, if we don't have public, then I do think backups should always be in the project that owns the volume.
14:53:18 <e0ne> smcginnis: +2
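
One possible shape of the ownership rule being discussed, sketched with plain dicts rather than real Cinder objects: whoever creates the backup, the record belongs to the volume's project, so incremental chains stay inside a single project:

    def backup_owner_project(volume, request_context):
        """Ownership follows the volume, not the creator (e.g. an admin
        creating the backup on behalf of the tenant)."""
        return volume['project_id']


    volume = {'id': 'vol-1', 'project_id': 'demo'}
    admin_ctx = {'user_id': 'admin', 'project_id': 'admin'}
    assert backup_owner_project(volume, admin_ctx) == 'demo'
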
14:53:37 <rosmaita> #topic tempest tests
14:53:41 <rosmaita> LiangFang: that's you
14:53:44 <eharney> that seems like an odd thing to always require to me, but i need to think on it more
14:53:51 <rosmaita> is that for your devstack plugin?
14:54:07 <rosmaita> #link https://review.opendev.org/#/c/713772/
14:54:16 <LiangFang> yes
14:54:25 <LiangFang> I have solved the issues
14:54:31 <smcginnis> eharney: Kind of like a snapshot. It wouldn't make sense for a volume to be in one project, and one of its snapshots to be in another. Anyway, we can think about it and talk later.
14:55:16 <rosmaita> LiangFang: good news, i will look at the patch later
14:55:23 <LiangFang> thanks
14:55:33 <rosmaita> #topic open discussion
14:55:53 <rosmaita> quickly, i learned about the reno.cache
14:56:18 <whoami-rajat> is anyone aware about any changes in zuul or something merged in cinder causing this issue https://review.opendev.org/#/c/697636/
14:56:23 <rosmaita> if you get crazy release note builds (everything the same for all releases), delete the reno.cache file in releasenotes/notes
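
For reference, the cleanup described above amounts to removing the stale cache file so reno rebuilds the notes; a tiny sketch assuming the usual repository layout:

    # Remove the stale reno cache so release notes are rebuilt from scratch.
    import os

    cache_file = os.path.join('releasenotes', 'notes', 'reno.cache')
    if os.path.exists(cache_file):
        os.remove(cache_file)
        print('removed %s' % cache_file)
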
14:56:33 <whoami-rajat> there are a lot of valid failures on pylint but unrelated to the patch
14:57:34 <rosmaita> wow, that is pretty ugly
14:58:21 <whoami-rajat> yep, most of them are in test files
14:58:32 <eharney> pylint is not smart enough to understand a lot of unit test code adequately
14:58:44 <smcginnis> whoami-rajat: https://review.opendev.org/#/c/716600/
14:59:22 <whoami-rajat> smcginnis, great, so i just need to recheck
14:59:29 <whoami-rajat> eharney++
14:59:49 <smcginnis> Not sure how quickly that makes it into the gate. Just ignore for now.
14:59:54 <smcginnis> But it should go away soon.
15:00:08 <rosmaita> ok, we are out of time
15:00:16 <rosmaita> #endmeeting