14:00:02 <rosmaita> #startmeeting cinder
14:00:02 <rosmaita> #link https://etherpad.openstack.org/p/cinder-ussuri-meetings
14:00:02 <rosmaita> #topic roll call
14:00:03 <openstack> Meeting started Wed Apr 15 14:00:02 2020 UTC and is due to finish in 60 minutes.  The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:07 <openstack> The meeting name has been set to 'cinder'
14:00:16 <rajinir> hi
14:00:22 <LiangFang> hi
14:00:28 <ganso> o/
14:00:29 <m5z> hi
14:00:32 <eharney> hi
14:00:45 <vkmc> o/
14:01:00 <smcginnis> o/
14:01:01 <whoami-rajat> Hi
14:01:12 <jungleboyj> o/
14:01:16 <tosky> o/
14:01:36 <rosmaita> lots on the agenda today so i'll get started
14:01:44 <rosmaita> #topic announcements
14:01:59 <rosmaita> {mitaka,newton}-driverfixes branches have been tagged eol and deleted
14:02:09 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014055.html
14:02:27 <rosmaita> we agreed to do this a few months ago, but it required some manual intervention to make it happen
14:02:51 <rosmaita> if you haven't noticed, there's been an etherpad migration and domain change
14:03:02 <rosmaita> #link http://lists.opendev.org/pipermail/service-announce/2020-April/000003.html
14:03:31 <rosmaita> hopefully, you won't notice anything different, though on Monday the meeting agenda page was causing an internal server error
14:03:34 <rosmaita> but that was fixed
14:03:55 <rosmaita> #topic announcements - Victoria Virtual PTG
14:04:03 <rosmaita> ok, so the dates have been set
14:04:13 <rosmaita> one week earlier than the physical event
14:04:19 <rosmaita> not sure why
14:04:21 <jungleboyj> Yay!?!
14:04:27 <rosmaita> new dates: June 1 through June 5
14:04:36 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014126.html
14:04:48 <rosmaita> that email outlines the ground rules for the virtual PTG
14:05:02 <rosmaita> key things impacting us are:
14:05:09 <rosmaita> No team can sign up for more than 4 hours per UTC day
14:05:09 <enriquetaso> o/
14:05:18 <rosmaita> No team can sign up for more than 16 hours across all time slots
14:05:19 <jungleboyj> Which is good.
14:05:40 <rosmaita> yeah, our 2-hour blocks have worked well for the midcycle sessions
14:05:53 <rosmaita> anyway, here are the time slots:
14:06:04 <rosmaita> #link https://ethercalc.openstack.org/126u8ek25noy
14:06:20 * lseki sneaks in
14:06:31 <rosmaita> and here is a meeting time planner for the first day, covering the TZs usually represented at our cinder meeting
14:06:43 <rosmaita> #link https://www.timeanddate.com/worldclock/meetingtime.html?month=6&day=1&year=2020&p1=159&p2=881&p3=141&p4=367&p5=176&p6=237&iv=0
14:06:55 <rosmaita> i don't want to spend a lot of time on this today
14:07:19 <rosmaita> i guess the thing to do is, please put time suggestions on the etherpad
14:07:29 <rosmaita> #link https://etherpad.opendev.org/p/cinder-victoria-ptg-planning
14:07:35 <rosmaita> or something
14:07:45 <rosmaita> i really don't know a good way to organize this
14:07:53 <rosmaita> so feel free to suggest ideas
14:08:02 <jungleboyj> :-)
14:08:16 <rosmaita> but, do put time slot suggestions on the etherpad
14:08:33 <rosmaita> or else i will schedule everything to be convenient for Roanoke time :)
14:08:41 <xover-23> hello world friends
14:08:51 <jungleboyj> Looks like 7 am Monday is the most likely time where things start to work.
14:08:54 <rosmaita> ok, moving on
14:09:01 <e0ne> jungleboyj: :)
14:09:04 <rosmaita> #topic announcements - FFEs
14:09:25 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014035.html
14:09:33 <rosmaita> that was the announcement to the ML
14:09:48 <rosmaita> no other requests came in before yesterday's deadline, so that's all
14:10:00 <rosmaita> i've been tracking progress here:
14:10:10 <rosmaita> #link https://etherpad.opendev.org/p/cinder-ussuri-FFE-progress
14:10:16 <rosmaita> looks like stuff is happening
14:10:40 <rosmaita> i am beginning to wonder whether the nfs-volume-encryption is going to have to wait until victoria
14:10:51 <rosmaita> enriquetaso: eharney: opinions?
14:11:04 <eharney> i have also wondered the same
14:11:21 <enriquetaso> yep, I think it's going to victoria
14:11:28 <jungleboyj> Something like that doesn't sound like something that we want to rush in place.
14:11:32 <rosmaita> ok, i will re-target that bp
14:11:33 <eharney> which is to say i wouldn't be upset about moving it out
14:11:51 <rosmaita> and let's try to focus on it very early in victoria
14:12:10 <rosmaita> ok, the other FFE is macrosan, but they are on the agenda for later
14:12:18 <rosmaita> ok final item
14:12:30 <rosmaita> #topic announcements - end-of-cycle driver stuff
14:12:40 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014124.html
14:12:50 <rosmaita> there are some outstanding items
14:13:11 <rosmaita> the link above is something i sent to the ML and also forwarded directly to the driver maintainers of record
14:13:35 <rosmaita> also, tracking that on this etherpad:
14:13:46 <rosmaita> #topic https://etherpad.opendev.org/p/cinder-ussuri-driverstuff
14:13:54 <rosmaita> oops
14:13:58 <enriquetaso> :P
14:14:05 <rosmaita> #link https://etherpad.opendev.org/p/cinder-ussuri-driverstuff
14:14:23 <rosmaita> anyway, mostly driver maintainers need to check that list
14:14:40 <rosmaita> i'll raise the priority flag on reviews as they come in
14:14:57 <rosmaita> also, I'm beginning the 3rd party compliance check
14:15:08 <jungleboyj> ++
14:15:28 <rosmaita> but, like i said last week, driver maintainers can check proactively to make sure their CIs are functioning reliably
14:15:45 <rosmaita> because RC-1 is next week, and that's the deadline for having everything working
14:16:06 <rosmaita> though, it was brought to my attention that the covid-19 situation is keeping people out of datacenters
14:16:07 <smcginnis> Also a reminder that it is good to point new driver submitters to https://docs.openstack.org/cinder/latest/contributor/new_driver_checklist.html to make sure they are covering everything that we should be checking.
14:16:12 <rosmaita> and not everyone can do stuff remotely
14:16:19 <rosmaita> smcginnis: ty, good reminder
14:16:36 <rosmaita> ok, that's all for announcements
14:16:47 <rosmaita> #topic MacroSAN driver discussion
14:16:53 <rosmaita> ruffian_sheep: that's you
14:17:27 <ruffian_sheep> Regarding the tempest test case tempest.api.compute.admin.test_volume_swap.TestMultiAttachVolumeSwap.test_volume_swap_with_multiattach: it is failing and I have not been able to get it to pass.
14:18:26 <ruffian_sheep> Without any changes to the driver, if I additionally create an instance and a volume and perform the attach operation first, this test case can pass.
14:18:57 <rosmaita> you mean you can manually do what the test is doing, and you succeed?
14:19:05 <ruffian_sheep> http://120.77.149.115/88/711388/6/check/cinder-isicsi-driver/848d283/tempest_log/tox.log
14:19:09 <ruffian_sheep> http://120.77.149.115/88/711388/6/check/cinder-isicsi-driver/4a616b6/tempest_log/tox.log
14:19:19 <whoami-rajat> ruffian_sheep, did you find any other error log except from the n-cpu one?
14:19:29 <ruffian_sheep> These are the logs from the runs under the two different conditions
14:20:33 <ruffian_sheep> whoami-rajat | rosmaita: The test itself can run, but an error occurs when the resources are finally cleaned up. Moreover, in the logs for the related test cases, errors are only found in n-cpu.
14:21:40 <ruffian_sheep> In fact, the same error is still reported in n-cpu, but after changing the conditions the test case executes without error.
14:22:17 <rosmaita> when you say "changing the conditions", what do you mean exactly?
14:22:51 <whoami-rajat> the driver seems to work correctly if the test passes in the local run
14:22:57 <ruffian_sheep> Create a new instance and volume, and perform the attach_volume operation.
14:24:01 <ruffian_sheep> I don't know the specific reason, but when I do this and then execute the tempest test case, it completes correctly.
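For clarity, the manual workaround ruffian_sheep describes amounts to creating an extra instance and volume and attaching them before the tempest run; roughly the following, with placeholder resource names, image, and flavor:

    openstack volume create --size 1 extra-vol
    openstack server create --image cirros-0.5.1-x86_64-disk --flavor m1.tiny --wait extra-vm
    openstack server add volume extra-vm extra-vol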
14:24:22 <smcginnis> ruffian_sheep: Unrelated, but just want to note that the tempest run output should really be in the root log file. So job-output.log would be where most would expect to go to find that. Don't want to divert this discussion, but just letting you know.
14:24:34 <eharney> there are some strange cinder api errors associated with that test failure in screen-c-api.log
14:25:31 <ruffian_sheep> smcginnis : got it, i will change it
14:26:32 <ruffian_sheep> eharney: I also saw that, but from the execution results it seems to have no effect? And it doesn't seem directly related to the multiattach test case?
14:26:43 <eharney> c-vol shows a lock held for 51 seconds, maybe something is taking longer than tempest expects there
14:27:11 <eharney> look at req-6f364876-aafd lines in c-api and c-vol logs... probably don't have time to debug it all here
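For anyone digging into those logs later, pulling the lines for that request ID out of the two service logs is a simple grep across the files mentioned above:

    grep -n 'req-6f364876-aafd' screen-c-api.log screen-c-vol.log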
14:28:21 <rosmaita> ruffian_sheep: so, multiattach is a new feature you are adding
14:28:40 <rosmaita> i wonder whether you should hold off on that until V
14:28:58 <rosmaita> so you can get everything set to mark the driver 'supported' in U
14:29:10 <ruffian_sheep> Yes, I wanted to add this feature in the ussuri release, but given the deadline it is a bit unrealistic.
14:29:28 <rosmaita> it looks like everything else is working though?
14:30:17 <ruffian_sheep> Yes, and I initially passed the test cases on the stein version.
14:31:21 <ruffian_sheep> Because at the beginning it was unclear that the CI environment had to run regularly against the latest openstack version, I carried out the tempest testing on the stein version.
14:31:23 <rosmaita> i think the thing to do is revise your patch without multiattach and address the comments on the review
14:31:54 <rosmaita> anything else?
14:32:05 <whoami-rajat> rosmaita++
14:32:07 <smcginnis> rosmaita: ++
14:32:17 <ruffian_sheep> Yes, I confirmed this problem with whoami-rajat this afternoon. For now, I will not add the related new features.
14:32:52 <rosmaita> ok, thanks ... we will keep an eye on your patch
14:32:59 <whoami-rajat> This way the CI could be marked supported and other features (apart from multiattach) could make it as well (probably)
14:33:14 <whoami-rajat> into ussuri
14:33:29 <rosmaita> #topic Continued discussion of: Cinder throws error creating incremental backup from parent in another project
14:33:35 <rosmaita> ganso: that's you
14:33:47 <rosmaita> we started this 2 weeks ago
14:33:55 <rosmaita> link on the etherpad
14:34:37 <rosmaita> i forget where we were on this, though
14:35:15 <enriquetaso> #link https://bugs.launchpad.net/cinder/+bug/1869746
14:35:15 <openstack> Launchpad bug 1869746 in Cinder "Cinder throws error creating incremental backup from parent in another project" [Undecided,Confirmed]
14:36:12 <rosmaita> ganso: comments?
14:36:25 <ganso> oh sorry I missed the ping
14:36:30 <rosmaita> ok
14:36:51 <ganso> so, last time we discussed several different ways to tackle the problem
14:37:03 <ganso> I summarized them in this etherpad
14:37:18 <ganso> https://etherpad.opendev.org/p/cinder-backup-bug
14:38:02 <ganso> basically I came up with 3 approaches from what we discussed, each with its pros and cons
14:38:18 <rosmaita> ganso: how big a deal is this bug?
14:38:25 <eharney> i'm getting server errors trying to load the etherpad
14:38:41 <smcginnis> ganso: Having gone through all of those, is there one approach that makes the most sense to you?
14:38:57 <smcginnis> eharney: Try a hard reload (ctrl+shift+r).
14:39:12 <smcginnis> They upgraded etherpad, so it could be a bad cached js lib.
14:39:29 <ganso> rosmaita: it can be worked around, and IMO it is a consequence of the customer not doing things in the most appropriate way; my main concern is fixing it to avoid anybody else hitting it in the future.
14:39:43 <rosmaita> ganso: excellent
14:39:58 <rosmaita> i think we should fix it, just looks like it could wait for V?
14:40:10 <ganso> smcginnis: I haven't gone through them at the implementation level, I didn't have many cycles, I was mostly gathering info to see if I could map all the concerns and spot a dealbreaker
14:40:32 <ganso> rosmaita: I was actually expecting a backportable fix
14:40:35 <smcginnis> ganso: I mean just conceptually, what would be the most expected behavior of the service?
14:41:06 <ganso> rosmaita: however, as you can see in the etherpad, the fix that looks more semantically correct is (c), but that doesn't look like it can be backported
14:41:20 <smcginnis> In other words, how _should_ cinder handle something like this. What is the most correct and expected behavior under this scenario.
14:41:34 <rosmaita> ok, let's all take an item to look over the etherpad and return to this next week
14:41:52 <rosmaita> and ganso maybe you can answer smcginnis's question on the etherpad
14:41:55 <smcginnis> Based on the discussion of whether an admin should be able to backup a tenant's volumes, (c) did sound like the more correct path to me.
14:42:16 <whoami-rajat> smcginnis, can't load the etherpad with ctrl+shift+r :/
14:42:21 <ganso> smcginnis: exactly. On the expected behavior: we tackled the situation where someone creates a backup on behalf of someone else and it counts against that someone else's quota, which looks slightly unusual to me. What we see today makes sense semantically, but it causes the bug, and it becomes a bit unusual because backups have parent relationships
14:42:37 <rosmaita> more continued discussion: tosky are you around?
14:42:38 <eharney> whoami-rajat: yeah, looks broken on the server side :/
14:42:40 <enriquetaso> To add more info: after debugging this a bit more, and thanks to Rajat's investigation, there is a difference between the API using an elevated (admin-role) context and the manager using the user's role. It looks like there is no reason why we are using an elevated context in the API code (checked against the DB), but removing the elevation may break some other functionality that worked previously (in general this could affect other
14:42:40 <enriquetaso> cases).
14:42:44 <tosky> rosmaita: yep
14:42:54 * tosky waits for green light
14:42:55 <rosmaita> #topic continuation of: cinder-tempest-plugin
14:43:33 <tosky> apart from the reminder ("please go over the open cinder-tempest-plugin reviews"), I have a question about https://review.opendev.org/#/c/639205/
14:44:07 <tosky> as you can see it is an interesting experiment for a more complex scenario tests, which could deserve its own job
14:44:40 <tosky> but it requires iSCSI multipath, and I'm not sure how to setup that on the gates
14:45:11 <hemna_> you would need the multipath daemon running
14:45:28 <tosky> from some past discussions with some people, as I wrote in a comment on the review, I may have (perhaps incorrectly) understood that it's possible to use LVM for that?
14:45:30 <eharney> the theory (per geguileo) is that we could do this with the lvm driver by setting some additional options -- but not sure what all the steps are
14:45:59 <geguileo> yeah, it's easy to do
14:46:01 <eharney> i don't know if it requires configuring additional IPs or anything like that
14:46:10 <tosky> yes! So the questions are a) whether you would like to have this complex, realistic scenario in the gates, and b) whether someone could please provide instructions or guidelines for setting it up
14:46:12 <tosky> that's it
14:46:32 <enriquetaso> I think 639205 needs a rebase in order to run the new job "cinder-tempest-plugin-cbak-ceph"
14:46:48 <rosmaita> it would definitely be cool to get this running
14:46:48 <tosky> oh, sure, and also for the ddt thing
14:47:10 <hemna_> what are we testing here with that review? that the multipath daemon can handle failover, or that cinder/os-brick can do a multipath attach?
14:47:10 <rosmaita> next up: enriquetaso
14:47:10 <geguileo> one needs to set iscsi_secondary_ip_addresses with other IP addresses
14:47:21 <rosmaita> #topic continuation of: Allow removing NFS snapshots in error status is stuck
14:47:30 <geguileo> if we are using a single node deployment, one can use 127.0.0.1 as the secondary IP
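A minimal sketch of what that might look like in cinder.conf for the devstack LVM backend, based on geguileo's comments; the section name and the non-multipath options are the usual devstack defaults, assumed here rather than taken from the review:

    [lvmdriver-1]
    volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
    volume_group = stack-volumes-lvmdriver-1
    target_helper = lioadm
    # Advertise a second portal address so os-brick sees two iSCSI paths;
    # on a single-node deployment 127.0.0.1 can serve as the secondary IP.
    iscsi_secondary_ip_addresses = 127.0.0.1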
14:47:49 <tosky> geguileo: please comment on the review :)
14:48:25 <rosmaita> geguileo: doesn't have to be exact, if you can just point what to look for
14:48:25 <enriquetaso> ok.. so after discussing  with eharney, I guess the patch isn't so bad
14:48:43 <rosmaita> #link https://review.opendev.org/#/c/679138/
14:48:53 <enriquetaso> About NFS snapshots in error... the model is always: try to delete the snapshot on the backend, and if it doesn't exist, succeed. All drivers do that; this patch should just follow the same model
14:49:20 <enriquetaso> so, I should update the patch with this comment and see what happens
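A minimal sketch of that model in driver-style Python, with hypothetical helper and exception names standing in for the real NFS driver code:

    import logging

    LOG = logging.getLogger(__name__)

    class SnapshotNotFound(Exception):
        """Stand-in for the backend's 'no such snapshot' error."""

    def delete_snapshot(backend, snapshot_id):
        # Try to remove the snapshot on the backend; if it was never created
        # (e.g. the snapshot is stuck in error), treat the delete as a success
        # instead of failing, matching what other drivers already do.
        try:
            backend.delete_snapshot(snapshot_id)  # hypothetical backend call
        except SnapshotNotFound:
            LOG.info("Snapshot %s not found on the backend; "
                     "considering it already deleted.", snapshot_id)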
14:49:33 <rosmaita> that sounds sensible
14:49:36 <smcginnis> ++
14:49:48 <rosmaita> that was quick, thank you enriquetaso
14:49:54 <enriquetaso> \o/
14:50:02 <rosmaita> #topic Cinder master compatibility status with Ceph Nautilus and beyond
14:50:06 <rosmaita> vkmc: that's you
14:50:13 <vkmc> o/
14:50:17 <vkmc> hey folks
14:50:36 <vkmc> I'm working on updating the devstack-ceph-plugin script
14:50:50 <vkmc> something we use on the ci for manila, cinder, glance and nova
14:51:13 <vkmc> right now we are testing the master branch for openstack with old versions of ceph
14:51:37 <vkmc> luminous is the latest version we have in there
14:51:59 <hemna_> I had looked at rewriting the plugin using ceph-ansible since it seems to handle scenarios and versioning better than the old plugin code
14:52:12 <vkmc> so... I don't want to break your ci, and therefore I wanted to ask before moving forward
14:52:17 <hemna_> ceph-ansible can also handle ceph iscsi too
14:52:21 <vkmc> have you been testing cinder with nautilus?
14:52:22 <eharney> so if we change the default release of ceph in there, it will change it for our jobs on stable branches too, right?
14:53:00 <tosky> it depends on how it is set on that job, I guess
14:53:12 <vkmc> my idea was to submit an experimental job first
14:53:17 <vkmc> keep what we have now, continue testing with luminous
14:53:28 <tosky> only the master (and maybe ussuri) variant of the job could be changed to use nautilus
14:53:39 <vkmc> and then, with time, drop the experimental job and promote it
14:53:44 <eharney> i don't recall that we specify versions in our jobs... probably need to decide what the correct ceph version is to use for older stable branches, and if it matters
14:54:02 <vkmc> btw, this is the patch for this update I'm talking about, if you want to take a look https://review.opendev.org/#/c/676722/
14:54:31 <vkmc> eharney, we don't have that option on the plugin yet, we just pull whatever version is hardcoded there
14:54:46 <vkmc> and that's what I want to implement :)
14:55:19 <vkmc> I see you have two gates in cinder (the ones I could see)... one in the check pipeline and one for third party IBM, not sure if you have another one using the plugin
14:55:29 <eharney> oh... that CEPH_RELEASE var is misleading currently
14:55:30 <tosky> vkmc: you would need to set CEPH_RELEASE in the vars section of the branch-specific variant of the job
14:55:37 <vkmc> tosky, yes
14:55:38 <tosky> uh
14:56:37 <vkmc> <eharney> so if we change the default release of ceph in there, it will change it for our jobs on stable branches too, right? <- yes
14:57:14 <vkmc> so, experimental gate for master, continue using whatever version we were using for stable branches
14:58:23 <eharney> we just need to pick what version(s) we want to run for stable
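To make the branch-specific override tosky suggested concrete, it could look roughly like the sketch below; the variant job name, the exact variable path (plain vars vs. devstack_localrc), and the release choices are assumptions to verify against the real job definitions:

    # hedged sketch, not the actual job definitions
    - job:
        name: cinder-tempest-plugin-cbak-ceph
        vars:
          devstack_localrc:
            CEPH_RELEASE: nautilus        # new default for master
    - job:
        name: cinder-tempest-plugin-cbak-ceph-ussuri
        parent: cinder-tempest-plugin-cbak-ceph
        override-checkout: stable/ussuri
        vars:
          devstack_localrc:
            CEPH_RELEASE: luminous        # keep the old version on stable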
14:58:26 <vkmc> we have 2 more minutes, so we can continue the discussion on the cinder channel if that sounds good for you
14:58:49 <rosmaita> ok, experimental sounds good, won't break anything
14:58:56 <rosmaita> we can figure out the details on reviews
14:59:01 <rosmaita> ok, 1 minute
14:59:07 <rosmaita> thanks, vkmc
14:59:07 <vkmc> thanks folks
14:59:15 <rosmaita> #topic open discussion
14:59:18 <rosmaita> 30 seconds
14:59:21 <enriquetaso> thanks
14:59:27 <whoami-rajat> Thanks!
14:59:30 <rosmaita> anything?
14:59:34 <eharney> cinder is awesome
14:59:42 <rosmaita> ok, can't top that
14:59:42 <e0ne> :)
14:59:45 <smcginnis> Just wanted to point out that we will need this before victoria when py38 becomes voting: https://review.opendev.org/#/c/720008/
14:59:45 <whoami-rajat> :D
14:59:45 <rosmaita> #endmeeting