16:01:48 <ildikov> #startmeeting cinder-nova-api-changes
16:01:49 <openstack> Meeting started Thu Jan  4 16:01:48 2018 UTC and is due to finish in 60 minutes.  The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:52 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
16:01:56 <mriedem> o/
16:02:02 <ildikov> johnthetubaguy jaypipes e0ne jgriffith hemna mriedem patrickeast smcginnis diablo_rojo xyang xyang1 raj_singh lyarwood jungleboyj stvnoyes
16:02:21 <ildikov> mriedem: morning :)
16:02:32 <jungleboyj> @!
16:02:32 <_pewp_> jungleboyj (;-_-)ノ
16:03:03 <jungleboyj> In a workshop so I am kind of distracted.
16:03:07 <johnthetubaguy> o/
16:03:10 <ildikov> ok, let's get started
16:03:16 <ildikov> johnthetubaguy: welcome back :)
16:03:36 <jungleboyj> johnthetubaguy: Hey new Daddy!
16:03:54 <johnthetubaguy> hey good to be back :)
16:04:03 <ildikov> ah, right, Happy New Year and Parenthood Everyone! :)
16:04:14 <jungleboyj> ildikov: +2
16:04:15 <johnthetubaguy> heh, +1
16:04:25 <ildikov> and then back to the topic
16:04:53 <ildikov> the quicker part is the Cinder side changes as jgriffith promised on the Cinder channel to get something up by the end of this week
16:05:34 * jungleboyj will be watching for those.  :-)
16:05:44 <ildikov> I started to look into it, but it would be certainly faster if it's taken care of by someone, who actually has some clue about the Cinder code... :)
16:05:50 <ildikov> jungleboyj: great, thanks!
16:06:16 <ildikov> the Nova and Tempest changes currently are: https://review.openstack.org/#/q/topic:bp/multi-attach-volume+(status:open+OR+status:merged)
16:07:17 <ildikov> the very latest headache is a recent-ish Qemu change that prevents us from creating the second attachment
16:07:36 <johnthetubaguy> ... oh
16:07:36 <mriedem> i've started documenting the laundry list of issues in https://etherpad.openstack.org/p/multi-attach-volume-queens
16:07:37 <ildikov> mriedem: what's the current stage with that?
16:07:50 <mriedem> because i'm having a hard time keeping it all in my head
16:07:53 <ildikov> I got lost between backporting and version checks and flags... :/
16:08:04 <johnthetubaguy> mriedem: awesome, that is exactly the kind of thing I was about to ask for
16:08:17 <mriedem> summary is qemu 2.10 breaks the shareable flag in the disk config,
16:08:26 <mriedem> libvirt 3.10 has a fix, but we test with pike UCA which is libvirt 3.6
16:08:39 <mriedem> ubuntu might eventually backport the patches to libvirt, but we can't really wait for that
16:08:42 <ildikov> mriedem: good idea, I hoped we don't have that much, but libvirt blowing up certainly proved me wrong
16:09:00 <mriedem> alternative for testing today is to not use the pike UCA so we get qemu 2.5
16:09:06 <mriedem> i am going to test that in a bit,
16:09:14 <mriedem> i'm also told that this isn't a problem if you're using raw images,
16:09:18 <mriedem> something i'm investigating now too
16:09:32 <ildikov> so we basically slightly ignore the fact that we cannot test it properly upstream recently?
16:09:51 <mriedem> if it works with raw images, we have a way to test that by just configuring devstack properly
16:09:55 <mriedem> but i need to test that out
16:10:00 <mriedem> it would probably mean a separate CI job
16:10:03 <mriedem> long-term
16:10:34 <mriedem> the devstack patch as-is is super specific to the compute and volume types anyway, so probably not great,
16:10:44 <mriedem> when other volume backends want to claim support for multiattach and enable it in their CI
16:10:50 <ildikov> I guess we could try to group the laundry list by what needs to get landed in the upcoming 2,5 weeks and what can wait a tiny bit
16:11:14 <mriedem> well, priority 1 for me is getting a CI run where we can actually do the 2 attachments
16:11:20 <mriedem> everything falls after that
16:11:28 <jgriffith> mriedem: +1
16:11:54 <ildikov> I think we favored turning this on in Cinder one-by-one so being a bit mroe specific for now is ok
16:11:59 <ildikov> mriedem: +1
16:12:17 <mriedem> an override flag in devstack is easy to add,
16:12:20 <mriedem> i just haven't done it yet
16:13:09 <ildikov> mriedem: it doesn't sound like a famous last sentence if you say it :)
16:13:13 <mriedem> the other major things are the cinder bp for the policy rules and such, which we already mentioned,
16:13:30 <mriedem> and figuring out what to do in the compute api about the new microversion
16:13:42 <mriedem> comments in https://review.openstack.org/#/c/271047/37
16:14:22 <mriedem> as this code is written, i can attempt to multiattach regardless of knowing if the compute supports it at all
16:14:30 <ildikov> I think we said a new microversion in Cinder too, I'm still not checking for 3.48 in the API yet in Nova
16:14:34 <mriedem> and i think that needs to be tightened up
16:14:44 <mriedem> what is 3.48?
16:14:48 <mriedem> shared_targets?
16:14:52 <ildikov> yes
16:15:17 <ildikov> I think we wanted that for multi-attach and happy to have it otherwise
16:15:39 <ildikov> but well, I can bump the compute version referring to the libvirt and other changes
16:16:00 <ildikov> and do the same thing as with adding that additional reserve_volume call a while ago
16:16:21 <mriedem> i don't know that we need a minimum compute version bump though, except maybe in the boot from volume case
16:16:47 <ildikov> I guess now that we're relying on qemu and libvirt versions checking the driver capabilities is even more important early on
16:16:50 <mriedem> normal attach does an rpc call to the compute using the reserve_block_device_name method, and we can pass the multiattach flag from the api to the compute and check the capability there, and fail fast if the compute doesn't support multiattach volumes
16:17:09 <mriedem> but for boot from volume, we don't know the host yet, so in that case the min compute version matters...
16:17:41 <ildikov> and we're talking about the second and further attachments, right?
16:18:11 <mriedem> all attachments to a multiattach=True volume
16:18:17 <mriedem> because that's what the libvirt patch is checking for
16:18:32 <mriedem> if the volume says multiattach and the driver doesn't support it, the attach fails
16:18:36 <mriedem> regardless of it being the first or the 10th
16:18:43 <ildikov> you said something in the review that we shouldn't break the current behavior
16:19:06 <ildikov> and currently we don't care whether it's mutli-attach or not
16:19:30 <ildikov> I prefer failing though
16:19:34 <johnthetubaguy> who is we, cinder?
16:19:47 <ildikov> johnthetubaguy: Nova
16:19:48 <johnthetubaguy> ah, its nova
16:19:57 <johnthetubaguy> so today we allow multi-attach?
16:20:05 * johnthetubaguy is confused again
16:20:09 <mriedem> well, since pike you can't do multiattach period
16:20:13 <mriedem> no in-tree cinder drivers support it
16:20:17 <ildikov> no, I mean that today we allow attaching a multi-attach volume regardless of any support
16:20:26 <ildikov> obviously we allow it to be attached only once
16:20:29 <mriedem> https://review.openstack.org/#/c/428365/
16:20:39 <mriedem> since pike, you can't even create a multiattach=True volume
16:20:42 <mriedem> scheduling fails
16:20:43 <ildikov> yeah, that helps
16:20:56 <johnthetubaguy> right, I see
16:21:01 <ildikov> we said we turn it off to be sure
16:21:17 <ildikov> we can always argue about volumes from the ice age though...
16:21:39 <johnthetubaguy> seems like cinder needs to track new vs old attachments then
16:21:56 <ildikov> boot from volume will fail on the compute anyway
16:21:59 <johnthetubaguy> although frankly a breaking API change is probably best, given its breaking zero users
16:22:16 <ildikov> I mean if the env is new enough, but the hypervisor doesn't support it then it's the same as landing on an old compute
16:22:34 <mriedem> so i think i want to say that in nova, you can only attempt to attach a multiattach volume using the new microversion, period.
16:22:42 <mriedem> won't work with 2.1 or 2.20 or whatever
16:22:47 <johnthetubaguy> mriedem: +1 that
16:22:57 <ildikov> johnthetubaguy: except those who patched both Cinder and Nova, but I guess they will figure it out by themselves how to upgrade :)
16:22:58 <johnthetubaguy> way simpler test matrix
16:23:03 <mriedem> so if new microversion and volume.multiattach=True, then we have to figure out what checks we make about the backend support
16:23:09 <johnthetubaguy> ildikov: yup, there are in crazy land
16:23:13 <johnthetubaguy> they
16:23:35 <mriedem> 1. boot from volume could be a min compute service version check,
16:23:48 <mriedem> 2. normal attach could be checked via the reserve_block_device_name call to the compute
16:23:57 <mriedem> 3. attaching to a shelved offloaded server....i'm not sure
16:24:10 <johnthetubaguy> was reading your note about shelved offloaded, oh my
16:24:10 <mriedem> 3 is essentially the same as 1
16:24:20 <ildikov> mriedem: considering it doesn't guarantee much, I'm not sure we actually want the compute version check
16:24:21 <mriedem> you don't know the host until you unshelve
16:24:26 <johnthetubaguy> yeah, I think that is the best approach
16:24:39 <johnthetubaguy> fail as fast as we can, but meh, you have a strange setup
16:25:02 <mriedem> ildikov: without a scheduler filter, it's better than nothing
16:25:17 <johnthetubaguy> it stops a world of crazy during upgrade, that seems like a win
16:25:29 <mriedem> otherwise we just say operators have to disable multiattach on the cinder side until all of their computes are upgraded
16:26:03 <jgriffith> mriedem: that's likely the best option IMO, the other I thought we discussed was prohibit BFV for now but I don't want to go down a tangent
16:26:07 <ildikov> mriedem: if it's a mixed hypervisor environment then we're screwed anyway...
16:26:18 <mriedem> ildikov: yes in that case you are
16:26:30 <mriedem> but i don't know how many deployments actually used mixed hypervisors
16:26:38 <mriedem> i would assume that's rare
16:26:43 <mriedem> within the same region anyway
16:26:59 <stvnoyes> sorry, joining late...
16:27:05 <ildikov> jgriffith: I removed that 5 lines of code that disables booting from a multi-attach volume, you could still attach it at boot time if I put it back
16:27:12 <johnthetubaguy> well, and require multi-attach, that seems like vanishing small
16:27:16 <mriedem> jgriffith: we said we'd allow bfv with multiattach and if the operator didn't like that idea they'd disable it via policy in cinder
16:27:21 <ildikov> stvnoyes: no worries, we're still pretty much in the middle of everything :)
16:27:33 <johnthetubaguy> mriedem: +1
16:27:52 <ildikov> mriedem: ok, bumping a compute version is not a big deal, so I can do that
16:28:02 <jgriffith> mriedem: ildikov right, I wasn't proposing anything just reviewing past discussions
16:28:23 <mriedem> ildikov: yeah i think we want that for the libvirt patch that adds the capabilty flag
16:28:24 <ildikov> jgriffith: between the two of us, I liked it better disabled... :)
16:28:35 <ildikov> mriedem: ok, I will add it there
16:28:56 <ildikov> mriedem: so we should differentiate and only check it for BFV?
16:29:33 <mriedem> bfv and attaching to a shelved offloaded instance
16:29:41 <mriedem> in both cases we don't know what the host is going to be
16:29:47 <ildikov> ok
16:30:18 <mriedem> i don't think we need to check for cinder 3.48 because the code in https://review.openstack.org/#/c/529695/ is backward compatible and will just detach as before if it's not available
16:30:26 <mriedem> it just means it's not a safe detach
16:30:43 <ildikov> ok, fair enough
16:31:34 <ildikov> I have the workaround code to check the Cinder microversion now, so we can add it any time if we would change our mind later
16:33:30 <mriedem> ildikov: just a thought, but if we add a multiattach arg to the reserve_block_device_name rpc call method,
16:33:36 <mriedem> that gives us the service version bump automatically,
16:33:52 <ildikov> and for other attach cases just check whether the hypervisor supports it and fail if not
16:33:55 <mriedem> and that's the thing the normal attach flow from the api could fail fast on if the compute is too old or doesn't support multiattach
16:34:11 <mriedem> ildikov: so i think we could do that in between the libvirt patch and the api change
16:34:20 <mriedem> i could write that up today
16:34:51 <ildikov> well, I think we can make it a rule that if you write it up I won't argue :)
16:35:05 <mriedem> ok so notes are in https://etherpad.openstack.org/p/multi-attach-volume-queens
16:35:12 <ildikov> back to serious, sounds like a good enough way forward
16:35:13 <mriedem> i'll start by trying to get the test stuff sorted out
16:35:34 <ildikov> mriedem: is there anything stvnoyes and/or me can help out with?
16:35:36 <johnthetubaguy> quick question on some comments in the patchset 37?
16:35:50 <ildikov> mriedem: did you agree with stvnoyes what he could take on?
16:35:59 <johnthetubaguy> you said nova checks how many attachments the multi-attach volume already had?
16:36:06 <stvnoyes> i can take a look at the new version of libvirt and test that
16:36:08 <ildikov> so we take care of things we can independently and/or help each other where it makes sense
16:36:28 <mriedem> johnthetubaguy: where?
16:36:48 <johnthetubaguy> https://review.openstack.org/#/c/271047/37/nova/compute/api.py@3659
16:37:05 <ildikov> johnthetubaguy: I think we fall back to just don't attach a multi-attach volume if we cannot regardless of it's attached already or not
16:37:16 <johnthetubaguy> ildikov: +1 that
16:37:47 <mriedem> i'm fine with that
16:37:58 <mriedem> this was before i knew that cinder blocked multiattach altogether in pike
16:38:04 <mriedem> and was thinking about backward compat
16:38:31 <ildikov> I think we discussed that on one of the meetings back then, but it was a while ago
16:38:57 <ildikov> and I'm not really the queen of documentation, neither the Cinder guys so I will just blame them anyway :)
16:39:29 <johnthetubaguy> heh, either way, sounds like we are good there
16:39:52 <ildikov> cool, at least one thing :)
16:40:14 <ildikov> stvnoyes: I think mriedem plans a DNM devstack patch to get an earlier qemu on the gate
16:40:35 <mriedem> that or seeing if raw images work
16:40:41 <mriedem> i'm going to try the raw images thing first
16:40:56 <stvnoyes> ok, is there still a desire to test with 3.10 libvirt?
16:41:14 <ildikov> mriedem: what level of test coverage we would like to have before code freeze?
16:41:19 <mriedem> it would be nice to know that libvirt 3.10 actually works with qcow2 images for multiattach volume
16:41:35 <stvnoyes> ok, I'll take a run at that
16:42:07 <mriedem> ildikov: probably the test in https://review.openstack.org/#/c/266605/17/tempest/api/compute/volumes/test_attach_volume.py and the TODOs in there, plus attaching a multiattach volume to a shelved offloaded instance
16:42:55 <mriedem> so like,
16:43:02 <mriedem> attach a volume to an instance A,
16:43:11 <mriedem> create instance B, shelve offload it, attach volume to B and unshelve B
16:43:35 <mriedem> anyway, need to get happy path working first
16:44:10 <ildikov> sure, sounds good
16:44:26 <stvnoyes> a minor point - it would also be good to have the detaches be explicit in the test. otherwise detach failures in teardown are not as obvious to debug.,
16:44:46 <mriedem> stvnoyes: yeah andreaf pointed that out too, i plan on adding that in
16:44:58 <ildikov> stvnoyes: when the libvirt stuff seems to get together if you could take a look at the shelved offloaded case that would be pretty great
16:45:31 <stvnoyes> ok will do
16:46:06 <ildikov> ok, so summary
16:46:43 <ildikov> jgriffith works on the Cinder side policy and any other related changes to enable multi-attach, target to have some code up is end of this week
16:47:56 <ildikov> mriedem looks into the libvirt-qemu changes to get either a lower version qemu or test with raw image and he will add a patch on top of the libvirt one on adding a multiattach arg to the reserve_block_device_name rpc call method
16:48:32 <ildikov> stvnoyes looks into libvirt 3.10 and the shelved offloaded case for Tempest after that
16:49:37 <ildikov> ildikov continues to work on the libvirt patch and the dependencies below to get the shared_target changes in if anything pops up
16:50:28 <ildikov> what did I miss?
16:50:47 <mriedem> that's good for now
16:50:55 <mriedem> we'll have some api patch changes on top later
16:50:58 <mriedem> for the version checks and such
16:51:41 <jungleboyj> Sounds good.
16:52:20 <ildikov> mriedem: you mean the BFV and other checks?
16:52:31 <ildikov> I can do that once you have that additional patch
16:53:06 <mriedem> yeah, and we'll need something from the actual REST API handler code to check the microversion and pass a variable down to say if multiattach is OK
16:53:33 <mriedem> like, "allow_multiattach=False"
16:53:35 <mriedem> something like that
16:53:45 <mriedem> if microversion >= 2.59: allow_multiattach=True
16:55:42 <mriedem> shall we break?
16:56:06 <ildikov> yeah, I'm sure I will have questions to this last one, but can break for now
16:56:31 <ildikov> will monitor your etherpad
16:57:11 <ildikov> BTW, if anything new comes up (fingers crossed it doesn't happen) please add it to here: https://etherpad.openstack.org/p/multi-attach-volume-queens
16:57:24 <ildikov> thanks everyone!
16:57:50 <jungleboyj> Thanks ildikov  !
16:57:54 <jungleboyj> and everyone else.
16:57:56 <ildikov> keep on chatting on the channels
16:58:13 <ildikov> #endmeeting