17:00:23 <ildikov> #startmeeting cinder-nova-api-changes
17:00:25 <openstack> Meeting started Thu May 12 17:00:23 2016 UTC and is due to finish in 60 minutes.  The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:29 <smcginnis> o/
17:00:30 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
17:00:35 <scottda> hi
17:00:36 <mriedem> o/
17:00:41 <ildikov> scottda ildikov DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis
17:00:52 <ildikov> hi
17:01:19 <alaski> o/
17:01:31 <thingee> mriedem: I'm not sure why my question is being avoided after I asked twice. Is this because it's a priority problem in nova, or because it never will be for multiattach?
17:01:33 <smcginnis> Do we have an agenda up somewhere?
17:01:36 <ildikov> as far as I know jgriffith_ is out today, but we still have a few items to touch on
17:01:40 <cFouts> o/
17:01:50 <mriedem> thingee: later
17:02:02 <ildikov> etherpad with info: #link https://etherpad.openstack.org/p/cinder-nova-api-changes
17:02:07 <aimeeu> lurking and learning
17:02:22 <mriedem> there were some items from ildikov's meeting minutes from last week
17:02:33 <ildikov> smcginnis: I added the list of items we are targeting to get done to the etherpad
17:02:38 <mriedem> " John Griffith will work on the above described solution, that target is to have patches up by next week."
17:02:39 <smcginnis> ildikov: Thanks!
17:02:49 <ildikov> we can go through those
17:03:16 <hemna> I have a question
17:03:46 <hemna> I'm working on a nova patch to not call check_attach at attach time
17:03:47 <ildikov> I haven't seen patch(es) up from John yet
17:04:14 <hemna> and check_attach does 2 things.  1) it checks internal state of the volume and 2) checks the availability zone
17:04:38 <hemna> does it make sense to add an optional AZ param to os-attach ?
17:04:50 <hemna> and have cinder check at os-reserve ?
17:05:00 <hemna> or just keep the check on the nova side only
17:05:01 <mriedem> https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/volume/cinder.py#L279
17:05:18 <hemna> https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L289-L299
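
    For context, a minimal sketch of the two concerns bundled into check_attach
    that hemna is describing; the helper names and the exception class are
    hypothetical stand-ins, not the actual nova code:

        class InvalidVolume(Exception):
            pass

        def check_volume_state(volume):
            # Concern 1: the volume's internal state must allow an attach.
            if volume['status'] != 'available':
                raise InvalidVolume("volume %s status is '%s', not 'available'"
                                    % (volume['id'], volume['status']))

        def check_availability_zone(volume, instance_az, cross_az_attach=False):
            # Concern 2: instance and volume must share an AZ, unless the
            # deployment explicitly allows cross-AZ attach.
            if cross_az_attach:
                return
            if instance_az != volume['availability_zone']:
                raise InvalidVolume("instance is in AZ '%s', volume is in AZ '%s'"
                                    % (instance_az, volume['availability_zone']))
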
17:05:38 <mriedem> so, the az stuff is a mess kind of
17:05:38 <hemna> I was just working on moving that code into a check_availability_zone() call in there instead
17:05:53 <hemna> but before I go forward, I'd like to hear opinions on it
17:05:53 <mriedem> see https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L60
17:06:40 <mriedem> ^ is really for boot from volume where nova creates the volume
17:06:52 <hemna> I'd prefer to change nova's attach code to simply call os-reserve
17:06:57 <mriedem> because nova will create the volume in the same AZ that the instance is in, which might not exist in cinder
17:07:00 <hemna> instead of a volume get, then check, then os-reserve
17:07:15 <mriedem> hemna: i think the az check in the api just needs to remain a separate thing
17:07:34 <mriedem> see my todo here https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L79
17:07:57 <mriedem> i've had a long-term wish of creating the volume in nova-api for boot from volume, and then attaching it later
17:08:05 <hemna> https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3095
17:08:07 <mriedem> so we do all of the az checking and stuff with cinder in the api rather than on the compute
17:08:08 <hemna> that thing
17:08:22 <scottda> But for nova to do the AZ check, it will still need the volume.get, which defeats the point of what hemna is trying to do.
17:08:23 <hemna> I was hoping that could simply be a call to self.volume_api.reserve_volume()
17:08:32 <hemna> scottda, +1
17:08:33 <hemna> yah
17:08:45 <hemna> so there is that.
17:09:13 <hemna> the get, then reserve means there is still a race
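
    The race hemna is pointing at, sketched with hypothetical client methods
    (reusing the helpers from the sketch above; these are not the real nova
    volume API signatures):

        def attach_flow_today(volume_api, context, volume_id, instance_az):
            # get, check, reserve: the volume's state can change between
            # the get and the reserve, so the checks run on stale data.
            volume = volume_api.get(context, volume_id)
            check_volume_state(volume)
            check_availability_zone(volume, instance_az)
            volume_api.reserve_volume(context, volume_id)

        def attach_flow_proposed(volume_api, context, volume_id, instance_az=None):
            # one call: cinder validates state (and optionally the AZ)
            # atomically while moving the volume to 'attaching'.
            volume_api.reserve_volume(context, volume_id,
                                      availability_zone=instance_az)
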
17:09:26 <mriedem> so you'd have to pass the az to os-reserve
17:09:29 <hemna> yah
17:09:42 <hemna> as an optional param
17:09:49 <hemna> if it's there, cinder tests it.
17:09:57 <mriedem> re: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L3095 ndipanov had a patch for a race in there also: https://review.openstack.org/#/c/290793/
17:09:59 <hemna> if it's not, it assumes it's open, re: no AZ
17:10:45 <mriedem> yeah, and nova's logic for passing the az would be based on what we have in https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L60
17:10:46 <mriedem> for bfv
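
    On the cinder side, the optional parameter could behave roughly as below;
    a hypothetical handler where only the AZ branch is new, and the state
    check stands in for what os-reserve already enforces today:

        def reserve_volume(context, volume, availability_zone=None):
            # New, optional AZ: if the caller supplies one, validate it here;
            # if it is omitted, this branch is a no-op and behavior is
            # unchanged from today.
            if (availability_zone is not None
                    and availability_zone != volume['availability_zone']):
                raise InvalidVolume("volume is not in availability zone '%s'"
                                    % availability_zone)
            # Existing behavior: only an 'available' volume can be reserved.
            if volume['status'] != 'available':
                raise InvalidVolume("volume status must be 'available'")
            volume['status'] = 'attaching'
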
17:11:59 <mriedem> shall we take notes in https://etherpad.openstack.org/p/cinder-nova-api-changes ?
17:11:59 <hemna> https://github.com/openstack/nova/blob/master/nova/volume/cinder.py#L289
17:12:05 <hemna> so right now, that's checked
17:12:34 <hemna> kinda the same thing
17:12:43 <mriedem> yeah, nova would just need to re-use some logic to determine if it needs to pass the az to os-reserve
17:12:52 <hemna> sounds like the _get_volume_create_az_value() needs to be public
17:12:57 <mriedem> if CONF.cinder.cross_az_attach, we'd pass None
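
    That decision reduces to a small helper; CONF here is a stand-in for
    nova's oslo.config object, and the helper name is made up for
    illustration:

        from types import SimpleNamespace

        # Stand-in for nova's config object, for illustration only.
        CONF = SimpleNamespace(cinder=SimpleNamespace(cross_az_attach=True))

        def az_to_validate(instance_az):
            """Return the AZ nova would ask cinder to validate on os-reserve."""
            if CONF.cinder.cross_az_attach:
                return None  # cross-AZ attach allowed: nothing to validate
            return instance_az
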
17:12:58 <ildikov> mriedem: I will add the decision points to the etherpad after the meeting
17:14:20 <hemna> I don't see any AZ check on the cinder side
17:14:29 <hemna> so I dunno
17:14:48 <scottda> I don't think there's any AZ checks enforced in Cinder
17:14:51 <mriedem> yes there is
17:14:55 <mriedem> when creating the volume
17:15:06 <mriedem> nova can pass an az and if it doesn't exist cinder fails the volume create request
17:15:08 <mriedem> UNLESS
17:15:14 <mriedem> you set a backdoor config option to ignore that
17:15:20 <hemna> the create flow passes in an AZ
17:15:20 <smcginnis> Unless a fallback is configured.
17:15:34 <mriedem> smcginnis: right, which was a hack because we didn't have the fix in nova
17:15:39 <mriedem> which is https://github.com/openstack/nova/blob/026468772672215d34a593e631d1e62d6a615aa4/nova/virt/block_device.py#L60
17:15:42 <hemna> bleh
17:15:47 <mriedem> https://github.com/openstack/nova/commit/f9a51b970f688b90baf0ae3ef31d79b3fec02ed1
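
    A rough sketch of that create-time fallback; the option names follow the
    linked review, but treat this as an assumption rather than the actual
    cinder code:

        def resolve_create_az(requested_az, known_azs,
                              allow_availability_zone_fallback=False,
                              default_availability_zone='nova'):
            if requested_az in known_azs:
                return requested_az
            if allow_availability_zone_fallback:
                # the backdoor: quietly use the default AZ instead of
                # failing the create
                return default_availability_zone
            raise ValueError("availability zone '%s' does not exist"
                             % requested_az)
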
17:15:52 <hemna> ok, so I don't want to make the AZ nightmare worse
17:16:05 <scottda> hemna: You made it worse by mentioning it.
17:16:19 <mriedem> well, passing the az to os-reserve and cinder checking if it's provided, isn't really making it worse
17:16:24 <mriedem> if nova doesn't provide it, it's a noop
17:16:32 <hemna> scottda, :)
17:16:43 <mriedem> if cinder microversion isn't new enough for nova to pass it, then nova still has to check like it is today
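
    Sketch of the negotiation mriedem describes; supports_az_on_reserve() is
    a hypothetical capability probe, not a real cinderclient method:

        def reserve_with_fallback(volume_api, context, volume, instance_az):
            if volume_api.supports_az_on_reserve():
                # new enough cinder: one atomic call, AZ validated server-side
                volume_api.reserve_volume(context, volume['id'],
                                          availability_zone=instance_az)
            else:
                # older cinder: keep today's client-side AZ check, races and all
                check_availability_zone(volume, instance_az)
                volume_api.reserve_volume(context, volume['id'])
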
17:16:47 <hemna> I guess the real question is, should cinder care?
17:17:00 <hemna> should cinder be doing the check and fail if the AZ doesn't match ?
17:17:10 <mriedem> so,
17:17:25 <mriedem> when i was fixing this bug in nova, i had a thread in the ML about removing the nova cross_az_attach option
17:17:25 <scottda> There are use cases where deployers had geographically distinct AZs, so this was needed.
17:17:26 <hemna> afaik AZ is a nova concept ?
17:17:30 <mriedem> and there were operators saying they relied on it
17:17:35 <mriedem> scottda: yes
17:17:36 <mriedem> that
17:17:49 <scottda> We did it in our (now defunct) public cloud...
17:18:02 <mriedem> see http://lists.openstack.org/pipermail/openstack-operators/2015-September/008252.html
17:18:06 <mriedem> for some light bedtime reading
17:18:11 <hemna> :)
17:18:17 <smcginnis> The backdoor config option work brought up the fact that AZs were never fully baked.
17:18:25 <mriedem> starts here http://lists.openstack.org/pipermail/openstack-operators/2015-September/008224.html
17:18:45 <scottda> Yeah, the terminology is vague, and that's part of the problem....but we still have to live with it.
17:18:49 <hemna> and by 'fully baked' does that mean that nova should be passing the AZ in calls to Cinder ?
17:18:56 <hemna> so that they can both be on the same page?
17:18:58 <mriedem> this is the cinder workaround https://review.openstack.org/#/c/217857/
17:19:15 <smcginnis> hemna: I think to fully support and enforce AZs, yeah. :/
17:19:47 <mriedem> there are some decent details and background in the commit message of https://review.openstack.org/#/c/227564/
17:19:53 <hemna> so if a user creates a volume, is the AZ set?  and to what?  and how is that checked against attach calls from nova ?
17:19:56 <smcginnis> But maybe we should shelve this az discussion for now and get back to multiattach. AZs are an issue for single and multi attach.
17:19:57 <hemna> bleh
17:20:01 * hemna cowers in defeat
17:20:31 <mriedem> so ftr, to fully remove nova's check_attach, cinder's os-reserve would need to take an az
17:20:33 <mriedem> to validate it
17:20:39 <hemna> mriedem, yah
17:20:40 <mriedem> at least to be consistent with how things are today
17:20:46 <hemna> that's why I brought it up
17:20:47 <mriedem> let it be written in the etherpad for all time!
17:20:47 <ildikov> I guess we can make the 'check_attach' removal a two-step process
17:21:06 <hemna> so, if I still do the AZ check on the nova side
17:21:16 <hemna> the race is smaller
17:21:24 <hemna> at least nova won't be checking volume state
17:21:48 <scottda> Yeah, but that's a bit of code churn and review time for an incomplete fix...
17:21:52 <hemna> I think eventually, we do want to just pass the AZ to cinder and then nova can call reserve w/o a get.
17:22:25 <hemna> I won't change the functionality of check_attach for now.
17:22:33 <hemna> but I will refactor the AZ check out of there
17:22:34 <mriedem> scottda: it's just a bug fix really
17:22:42 <hemna> and then simply call the new AZ check after the get.
17:22:51 <mriedem> yeah i think hemna and i are on the same page
17:22:53 <scottda> fair enough
17:22:55 <hemna> then reserve_volume will catch the state checks.
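
    hemna's interim plan end to end, again with hypothetical names and
    reusing check_availability_zone from the earlier sketch;
    behavior-preserving, not yet race-free:

        def attach_interim(volume_api, context, volume_id, instance_az):
            volume = volume_api.get(context, volume_id)
            # the AZ check stays in nova for now, just factored out of
            # check_attach into its own helper
            check_availability_zone(volume, instance_az)
            # the volume state checks move server-side: os-reserve rejects
            # volumes that are not attachable
            volume_api.reserve_volume(context, volume_id)
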
17:23:07 <ildikov> is it only the BFV case?
17:23:14 <mriedem> no
17:23:16 <scottda> ildikov: no
17:23:18 <ildikov> I mean, when will the AZ check need to be called?
17:23:35 <mriedem> so in the remaining 7 minutes i have...
17:23:42 <hemna> ok I'll forge ahead with this and push it up today then.
17:23:59 <mriedem> hemna: you might want to look at https://review.openstack.org/#/c/290793/ too
17:24:04 <ildikov> cool, added a note to the etherpad about the AZ check
17:25:11 <hemna> mriedem, ok will do
17:25:16 <ildikov> mriedem: can you check the multiattach spec when you have some time?
17:25:39 <mriedem> ildikov: is it any different from mitaka?
17:26:20 <ildikov> mriedem: slightly updated, I added a link to the etherpad so that we would not need to add implementation details to the spec regarding how to sort things out in Cinder
17:26:22 <mriedem> because i was under the impression that the multiattach spec was going to be dependent on the POC that jgriffith_ was going to be doing
17:26:59 <ildikov> does this mean we can talk about approving it, when that is ready?
17:27:31 <mriedem> i'd prefer to not land a bunch of technical debt in nova just to get this in
17:28:13 <ildikov> the Cinder part is a dependency in the spec; if these issues are not sorted out, then we're in trouble anyway
17:28:28 <ildikov> that does not mean we should sort it out in Nova instead, in my view
17:29:15 <mriedem> ok i'll have to review the spec to see the changes then
17:29:21 <ildikov> and the plan is to get them done :)
17:29:25 <mriedem> #action mriedem to review multiattach nova spec
17:29:36 <ildikov> tnx
17:29:41 <mriedem> #action hemna to poke at cleaning up nova check_attach
17:29:53 <hemna> coolio
17:29:55 <mriedem> what's the status on cinder migrate testing on the multinode job in the gate?
17:29:57 <ildikov> if there's anything Nova specific that's missing I will add it
17:30:30 <scottda> mriedem: We're starting with cinder migrate on a single node. We think we can get that working....
17:30:54 <scottda> But it looks like Devstack support for multi-backend was removed. I'm trying to figure out why, and what alternative exists.
17:31:09 <mriedem> scottda: as in resize?
17:31:10 <scottda> But eventually want multi-node as well.
17:31:30 <hemna> wait what?
17:31:38 <hemna> cinder multi-backend removed from devstack ?
17:31:41 <scottda> no, just have 2 LVM volume groups as separate backends, and migrate between them on a single node.
17:31:51 <thingee> mriedem: no, like multiple drivers
17:32:04 <scottda> hemna: No, I've actually found a way to do it, the syntax has changed...
17:32:06 <scottda> and
17:32:20 <scottda> and Tempest multi-backend tests are failing for me. Not sure why.
17:32:47 <mriedem> and that will still test swap volume?
17:33:08 <scottda> yes, calling cinder migrate will call swap volume.
17:34:35 <mriedem> ok, do we want to talk about https://review.openstack.org/#/c/312773/ ?
17:35:39 <scottda> What do you think of that patch mriedem ?
17:35:49 <mriedem> honestly i haven't had the time to dig into it
17:36:19 <mriedem> would be nice to see the live migration job or multi node job passing on it
17:36:22 <mriedem> but those are super flaky
17:36:50 <mriedem> i can dig into the test failures for volume-backed live migration
17:36:54 <mriedem> and see if they are related
17:37:54 <mriedem> finally, before i go,
17:38:04 <mriedem> anyone talked to jgriffith_ on the os-initialize_connection changes?
17:38:31 <scottda> no, I haven't
17:38:37 <ildikov> mriedem: the job says for live migration that it passed, but I might have missed smth in the logs...
17:38:57 <ildikov> mriedem: I talked to him briefly, he's working on it, but we couldn't go into details
17:39:03 <mriedem> ildikov: yeah http://logs.openstack.org/73/312773/1/experimental/gate-tempest-dsvm-multinode-live-migration/c57f6b9/console.html#_2016-05-08_09_14_44_572
17:39:11 <hemna> the experimental jobs seem.....borked almost every time.  :(
17:39:26 <scottda> I think John said in IRC that he had unit tests for his patch mostly passing...
17:39:54 <ildikov> hemna: it's a bit weird, it congratulates you and then marks the test as failed...
17:40:17 <hemna> hehe
17:40:41 <mriedem> it is
17:40:41 <scottda> like a participation trophy.
17:40:41 <mriedem> http://logs.openstack.org/73/312773/1/experimental/gate-tempest-dsvm-multinode-live-migration/c57f6b9/console.html#_2016-05-08_09_18_59_360
17:40:44 <mriedem> setting up ceph
17:40:50 <mriedem> i've pinged tdurakov on that, he works on that job
17:41:02 <ildikov> mriedem: scottda: I will try to catch him and add notes to the etherpad about that item this week or early next
17:41:03 <mriedem> that job sets up various storage backends in a single job
17:41:10 <mriedem> and runs the same 4 tests
17:41:13 <mriedem> looks like it's not working for ceph atm
17:41:21 <mriedem> ildikov: ok
17:41:42 <mriedem> alright, over by 11 minutes
17:41:44 <mriedem> anything else?
17:41:52 <ildikov> also this time next week might be tricky for me
17:42:00 <ildikov> but will try my best
17:42:20 <mriedem> change the time as needed
17:42:25 <scottda> Let's work on a new time. It'd be nice to have JohnGarbutt here, and JohnG as well
17:42:35 <ildikov> also I know johnthetubaguy cannot make it at this slot, so if it's problematic to either of you in general please let me know and then we can find another one
17:42:37 <hemna> ok
17:42:48 <mriedem> also, fyi, i'm out from 5/20-5/30
17:42:51 <hemna> thanks for the help guys
17:42:56 <mriedem> back on 5/31
17:43:09 <smcginnis> mriedem: Nice
17:43:11 <ildikov> mriedem: ok, thanks for the info
17:43:20 <scottda> ok, bye all.
17:43:33 <ildikov> I will reach out to you regarding time slots
17:43:41 <ildikov> thanks all!
17:44:15 <ildikov> #endmeeting