21:00:08 #startmeeting cinder-nova-api-changes
21:00:11 Meeting started Wed Apr 20 21:00:08 2016 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:15 The meeting name has been set to 'cinder_nova_api_changes'
21:00:26 o/
21:00:48 scottda ildikov DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo
21:00:59 o/
21:01:14 * alaski has to run another meeting, sorry
21:01:20 hi
21:01:49 alaski: no worries, it's still just the warming up for next week
21:02:02 Hi
21:03:11 hi All
21:03:20 hi
21:04:01 basically the plan for today is to prep for next week regarding scenarios we need to discuss mostly in connection with multiattach
21:04:22 i reviewed the update attachment api spec from scottda last night
21:04:26 haven't checked back on it today,
21:04:40 but i'd like the nova portion of that fleshed out a bit before a new api is added to cinder
21:05:29 mriedem: Yes, I'm not ignoring your review...
21:05:35 mriedem, +1
21:05:48 I agree, there's also the patch that jgriffith is working on to have a PoC for alternative #2, so a few bits to put together
21:05:48 I'm just trying to gather answers.
21:07:13 mriedem: a quick question as you mentioned Nova
21:07:32 mriedem: is there a chance to have multiattach on the list of high priority items?
21:07:47 oops, accidentally closed the tab
21:08:05 mriedem: good timing, so question again
21:08:07 mriedem: is there a chance to have multiattach on the list of high priority items?
21:08:31 i won't commit to that right now, we discuss nova priorities in the last session on thursday of the design summit
21:09:01 mriedem: At least that will be after the Nova-Cinder session.
21:09:05 right
21:09:07 I know, but I guess I need to start the campaign way before that session
21:09:30 it's going to depend on how clear the plan is after the nova/cinder session i think
21:09:40 if it's all up in the air still, it's going to be hard to say it's a priority
21:09:53 and things like swap volume with multiattach volumes scare me
21:10:02 all the weird edge cases with multiattach
21:10:13 plus we don't test any of that today
21:10:28 honestly,
21:10:30 you mean swap volume, etc in general?
21:10:45 before multiattach is a priority, i think volume migration needs to be tested in the multinode job in the gate
21:10:53 b/c that would also test swap_volume since it's part of the flow
21:11:10 and then we could see if multiattach blows any of that up
21:11:21 mriedem: +1
21:11:25 ok that sounds reasonable also it would be good to test it anyway
21:11:38 i think i asked this earlier in the week, but does cinder allow migration of multiattach volumes?
21:11:52 mriedem: I don't believe we block it.
21:12:02 mriedem: But I also don't believe we test that at the moment.
21:12:10 volume migration? probably not
21:12:22 i think the only thing that uses both nodes in the multinode job is nova's live migration
21:12:23 hemna_, scottda: Are you aware of any testing added for that?
21:12:31 wait
21:12:34 No, I'm not aware of any testing.
21:12:37 do you mean nova live-migration ?
21:12:40 or cinder migrate ?
21:12:43 cinder migrate
21:12:51 which triggers swap volume in nova
21:13:01 neither are tested in the gate
21:13:07 :(
21:13:08 and those + multiattach scare me
21:13:17 well that seems to be a problem in general then
21:13:18 especially with some of the issues i found with swap volume this week
21:13:20 yeah
21:13:31 hemna_: +1
21:13:33 so, there are some testing gaps to get us to multiattach
21:13:46 We're kind of putting on the roof before the foundation is done, in some respects.
21:13:48 i need to add that to the etherpad
21:13:58 I'm not sure why that would prevent nova from setting multiattach to high priority
21:14:01 Not to be negative.
21:14:07 let's try to figure out dependencies as well
21:14:55 we might not want to put on the roof first
21:15:13 It keeps us dry while we're digging a deeper hole.
21:15:18 :)
21:15:29 We certainly know how to dig holes.
21:15:42 hemna_: multiattach is a whole new set of problems if it goes wrong, and i'd like to be tight on testing of existing functionality before introducing new complexity
21:15:58 smcginnis: lol :)
21:16:20 especially given how hard it's been to get a clear picture on the design
21:16:33 I'm not saying it should be forced to land in nova
21:16:45 just saying it should be a candidate for high priority
21:16:47 i'm not knocking the team, or not trying to, it just worries me
21:16:51 it's a candidate yes
21:16:55 ok cool
21:17:05 we've had a lot of people working on it for literally years now
21:17:07 :(
21:17:25 yes, but as far as i recall, the major nova effort wasn't until mitaka
21:17:32 *nova effort from the nova team
21:17:49 anyway, it is what it is
21:17:59 we would need high prio to get review attention, I don't think we will flood Gerrit with too much new code by N-3
21:18:02 i'd like to see it happen, just thinking through gaps we have today before we add this layer in
21:18:08 well I wouldn't agree with that, but whatevs.
21:18:14 One thing in the critical path is jgriffith's code...
21:18:17 it's always the same thing in Nova.
21:18:20 lots of patches get in
21:18:24 nobody reviews or tests
21:18:28 deadlines come up and go
21:18:32 and oops...you are too late.
21:18:37 thank you next release.
21:18:46 2+ years later. here we are.
21:18:57 but at this point, it's probably not a bad thing :)
21:19:14 scottda: yeah, it's on my list to look at from Nova perspective as well
21:19:15 yeah, i'm familiar with that problem
21:19:20 so
21:19:38 it seems a prerequisite is getting some gate tests in the mix
21:19:42 scottda: maybe next Friday we could have a small code sprint on the contributors' meetup part as opposed to just talking :)
21:19:47 ildikov: We just merged microversion support in the cinderclient, so you (or someone) could start to POC using that...
21:20:05 I thought we did have some nova live migration tests at some point ?
21:20:08 scottda: yeap, exactly
21:20:13 ildikov: Sounds good.
21:21:24 scottda: I will look at that patch, I'll have some time on the plane tomorrow to do so
21:22:07 Anyone know how difficult it will be to test cinder migration using current infra tools?
21:22:09 hemna_: we do in the multinode job in the gate
21:22:24 hemna_: it's non-voting b/c of some race bugs (we think due to old ass libvirt/qemu in those jobs)
21:22:26 Have we thought about it?
21:22:27 i.e. using multinode in the gate.
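For illustration, a rough sketch of what such a multinode gate test could look like, loosely modeled on tempest's live-migration test. The client objects, their method names, and the backends fixture are stand-ins for whatever the real tempest plumbing would provide, not actual tempest APIs:

```python
# Hypothetical sketch of a gate test for cinder volume migration in the
# multinode job. Client helpers here are illustrative stand-ins.

def test_volume_migration(volumes_client, admin_volumes_client, backends):
    """Create a volume, migrate it to the other c-vol backend, verify."""
    volume = volumes_client.create_volume(size=1)
    volumes_client.wait_for_status(volume['id'], 'available')

    # Find where the scheduler placed it, then pick the other backend,
    # mirroring how the live-migration test finds the two computes.
    src = admin_volumes_client.show_volume(
        volume['id'])['os-vol-host-attr:host']
    dst = next(b for b in backends if b != src)

    # Trigger cinder migrate; doing this with an attached volume would
    # also exercise nova's swap_volume as part of the flow.
    admin_volumes_client.migrate_volume(volume['id'], host=dst)
    volumes_client.wait_for_status(volume['id'], 'available')

    migrated = admin_volumes_client.show_volume(volume['id'])
    assert migrated['os-vol-host-attr:host'] == dst
```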
21:22:41 scottda: i'd say look at the live migration tests in tempest
21:22:49 it's not too complicated, there is logic for finding the two computes
21:22:56 spawning the instance on one and then live migrating to the other
21:23:04 i'd expect volume migration to be similar
21:23:06 mriedem: OK, Cinder has testing on the agenda for the Summit. We can look at this.
21:23:31 https://github.com/openstack/tempest/blob/master/tempest/api/compute/admin/test_live_migration.py
21:23:33 scottda: cool, that sounds good
21:24:31 I put it on Cinder agenda for testing session.
21:24:59 scottda: I'll try to join that slot
21:25:08 is jdg's option 2 fleshed out as far as the flow is concerned?
21:25:14 so the scenario in question is cinder migrate
21:25:16 looks like some of the notes in the etherpad are stale
21:26:30 mriedem: I don't think things are fleshed out. It would be nice to get a flow diagram for Alt#2 similar to what hemna_ has done for the current flow.
21:26:35 stale as in it seems like we came up with different solutions for some of those items last week
21:26:53 e.g. nova not neeting to store the attachment_id in the bdm.connection_info column
21:26:58 *needing
21:27:00 anyone have the etherpad url handy ?
21:27:10 https://etherpad.openstack.org/p/cinder-nova-api-changes
21:27:18 i don't remember the details though, we have the meeting logs if we need to go back
21:27:24 thanks
21:28:05 here are the 4/13 meeting logs http://eavesdrop.openstack.org/irclogs/%23openstack-meeting-cp/%23openstack-meeting-cp.2016-04-13.log.html#t2016-04-13T21:00:06
21:28:21 i feel like we came to some aha solution type things at the end which aren't fed back into the etherpad
21:28:46 mriedem: we ended up in a discussion about unversioned things
21:29:13 yeah, like os-initialize_connection passing the attachment_id, and nova passing the connector with the instance uuid
21:29:13 I'm not sure whether it was the connection_info itself or another bunch of data
21:29:20 Sorry, I have to drop off. Will catch up later.
21:29:24 later
21:29:28 I think we were going to stuff the attachment_id into the connection_info
21:29:36 for nova to save at detach time
21:29:39 *os-initialize_connection putting the attachment_id in the connection_info dict response i mean
21:29:56 and we were going to save the connector at initialize_connection time.
21:29:58 but we were going to microversion that or not? i thought yes
21:30:12 I was unsure about that
21:30:16 me too :)
21:30:32 technically i don't think we *need* to because nova can check if it's there and if not, determine cinder isn't new enough
21:30:38 the other part is the fact that initialize_connection is called at times to simply fetch the connection_info
21:30:41 during live migration time
21:30:48 Wouldn't Nova want that microversioned? To get a different payload in the connection_info?
21:30:50 and it can't create a new attachment entry then.
21:30:54 but, that happens down in the compute - if nova is going to fail on that, it'd be nice to fail fast in nova-api based on a version
21:31:13 scottda: generally yes we want api changes versioned
21:31:29 We really need a design spec for this....
21:31:32 well
21:31:36 we aren't changing the API here
21:31:39 because nova-api could see it's asked to attach a multiattach volume, check the cinder version to see if it can handle it, and if not fail with a 400 before casting to compute
21:31:48 it's adding additional data to a dict in response
21:31:58 there is no new param in the call
21:32:05 or not a new response either.
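To make the debate concrete, a minimal sketch of the unversioned variant being discussed: cinder adds attachment_id to the connection_info dict that os-initialize_connection returns, and nova feature-detects by key presence. The volume_api object and call shape are illustrative, not the actual nova code:

```python
# Sketch of key-presence feature detection, assuming a hypothetical
# volume_api wrapper; not actual nova code.

def initialize_and_detect(volume_api, context, volume_id, connector):
    connection_info = volume_api.initialize_connection(
        context, volume_id, connector)

    # Key present: this cinder is new enough and created the attachment.
    # Key absent: older cinder, so fall back to the mitaka-era lookup
    # (or fail fast in nova-api if multiattach was requested).
    attachment_id = connection_info.get('attachment_id')
    return connection_info, attachment_id
```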
21:32:05 hemna_: well, that's still changing the api response
21:32:21 so....
21:32:22 i realize we've fudged around this forever in the past
21:32:22 I dunno
21:32:29 and that's how it works with neutron port details too
21:32:33 cinder drivers put 'new' things into connection_info at times
21:32:36 i know
21:32:42 that qualifies as a 'change in response' ?
21:32:43 which is why it's not like new badness
21:32:56 I don't know what the right thing is here.
21:32:58 it's just continued badness
21:33:05 we don't version the connection_info as an object
21:33:10 long-term connection_info should be a versioned object in os-brick, imo
21:33:16 every cinder driver has different things it puts in there.
21:33:17 for later
21:33:19 like we're doing with vifs in os-vif
21:33:41 yeah, anyway, this is kind of a minor detail, i think it's a matter of UX on the nova side
21:33:45 we fail in the api or we fail in the compute
21:33:50 failing fast in the api is better
21:34:04 So, it might not be necessary, but it might be useful (for fail-fast, at least) to microversion it. It's not a big deal to bump the version.
21:34:11 scottda: right
21:34:35 cinder api could strip the attachment_id key out of the connection_info in the API if it's not at that microversion
21:34:56 the other part of this is that detach needs the attachment_id
21:34:58 well, if nova isn't requesting that version
21:35:08 that could/should be in the same patch, which requires a microversion
21:35:20 agree it should be the same microversion
21:35:20 Look, I hate to bog down in process, and I don't want this to slip, but there are others who aren't present who'll need to know all these design decisions. We're probably better off writing a spec for all of this, I think.
21:35:41 scottda: this might feed into a spec after the summit,
21:35:47 but these are questions that feed into that
21:36:01 sure, probably not until after the summit
21:36:06 we don't have much time between now and the summit anyway
21:36:13 right
21:36:19 unless someone magic wands the tests into the gate right away :)
21:36:33 jungleboyj has minions, make them write those tests :)
21:36:56 just open a bug and mark it low-hanging-fruit, some intern will write them :)
21:36:57 hemna_: detach is already using the attachment_id
21:37:34 ildikov, wait, didn't we already look at that last week?
21:37:38 * hemna_ is confused
21:37:41 cinderclient(context).volumes.detach(volume_id, attachment_id)
21:37:52 ildikov: got that into nova in mitaka
21:38:13 https://github.com/openstack/cinder/blob/master/cinder/api/contrib/volume_actions.py#L151
21:38:14 in newton api nova passes the attachment id down to the compute, on a mitaka compute it looks up the attachment id
21:38:23 maybe I'm thinking terminate_connection
21:38:30 hemna_: we might have mentioned it, the point is that those patches made it into Mitaka that pass the attachment_id to Cinder at detach time
21:38:42 yah I'm thinking terminate
21:38:43 sorry
21:39:01 ok, and that wasn't the thing that was reverted in cinder https://review.openstack.org/#/c/275316/
21:39:03 mriedem: Since when have I had Minions?
21:39:09 that was passing host and instance uuid to attach
21:39:19 jungleboyj: your glorious army of minions
21:39:23 will solve all of our problems
21:39:33 mriedem: that was the host_name, which got reverted
21:39:35 mriedem: They are my esteemed co-workers.
:-)
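For reference, a sketch of the lookup-then-detach pattern described above, which nova has had since mitaka: resolve the attachment_id for the (volume, instance) pair from the volume's attachment list, then pass it to the detach call quoted in the log. The helper shape and dict keys are illustrative:

```python
# Sketch of the mitaka-era detach path: look up the attachment_id,
# then pass it so cinder removes only that one attachment.
# 'cinder' is a python-cinderclient client; helper shape is illustrative.

def detach_volume(cinder, volume_id, instance_uuid):
    volume = cinder.volumes.get(volume_id)
    attachment_id = None
    for attachment in volume.attachments:
        if attachment.get('server_id') == instance_uuid:
            attachment_id = attachment.get('attachment_id')
            break
    # With multiattach, the attachment_id is what lets cinder delete
    # just this attachment instead of guessing among several.
    cinder.volumes.detach(volume_id, attachment_id)
```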
21:40:29 hemna_: ildikov: scottda: last week we were talking about a cinder api that needed the attachment_id and i suggested that we could stash that in the connector dict passed to cinder, was that terminate_connection?
21:40:30 jungleboyj: tell them they are cute and we like them :)
21:40:36 it's icky, and unversioned
21:40:43 mriedem, that could have been
21:40:54 it's kinda icky to put that into the connector though.
21:41:03 ildikov: You can tell them in Austin. They are all coming with me. :-)
21:41:13 the connector is supposed to describe the host initiator information for the entire system.
21:41:37 yeah, but it's the same idea as putting the attachment_id in the connection_info that comes back from init_conn
21:41:37 it seems like a hack to put that in there to me.
21:41:45 it's a hack both ways :)
21:41:48 jungleboyj: ok Gru, will do ;)
21:41:51 yah, it's a total hack :(
21:41:56 honestly, the thing to probably do,
21:42:06 is not put attachment_id in connection_info or instance uuid in connector,
21:42:13 mriedem, +1
21:42:15 but microversion those APIs and add them to the response body
21:42:17 ildikov: *Laughing* Hey now!
21:42:22 and request body
21:42:24 mriedem, +1 I like that way better.
21:42:27 respectively
21:42:28 it's clean.
21:42:39 scottda: catch all that? ^
21:42:45 jungleboyj: lol :)
21:42:52 I did, and it sounds good to me.
21:43:08 i do wonder,
21:43:29 if nova isn't requesting that microversion, will cinder still create the attachment entry in init-connection?
21:43:40 mriedem, yes
21:43:49 do we want to store attachment_id in Nova or still not?
21:43:51 will it be able to clean up later?
21:43:54 mriedem, according to jdg's POC patch
21:43:55 in terminate_connection?
21:44:15 terminate_connection shouldn't really touch the attachment entry
21:44:19 only in detach()
21:44:38 ok, and we pass the attachment_id there
21:44:42 yah
21:44:48 which we lookup from the volume id and instance uuid in nova
21:45:02 which makes me wonder why initialize_connection needs to return the attachment id for nova?
21:45:10 i get so confused here
21:45:18 it is confusing
21:45:22 so
21:45:25 the reason is
21:45:28 nova could store it and then it wouldn't need to look it up i guess, but we already have that lookup code since mitaka
21:45:29 for multi-attach
21:45:41 it would get stored in the nova bdm
21:45:49 and nova would use that value directly
21:45:57 instead of looking it up again
21:46:08 still, that's just an optimization right?
21:46:25 I like the Cinder is the ultimate source of truth thing better still
21:46:31 I'm still confused as to how initialize_connection is going to know if it's a call for a new attach, or a fetch for connection_info
21:46:46 I need to go to IRC-on-the-smart-phone mode for a few minutes...limited typing ability...
21:46:56 hemna_: I think that's the key question here
21:47:01 re: live migration.
21:47:18 hemna_: which is why you wanted to create the attachment in reserve/
21:47:19 ?
21:47:29 mriedem, yes
21:47:45 hemna_: and it still looks weird to call initialize_connection only for fetching already existing connection info
21:47:45 i guess jgriffith would have to answer that
21:47:57 anyway, I'm not going to argue that anymore
21:48:20 ildikov, it seems weird, but it's completely necessary
21:48:36 well, if nova is calling initialize_connection to get the connection_info, and that creates a new attachment in cinder, but it's actually the same volume/instance, won't that be bad?
21:48:44 because the connection_info for volume X on host A will be different on Host B
21:49:01 mriedem, yes
21:49:11 hemna_: I understand the need for the info, I just wanted to highlight that with this naming structure and behaviour it's confusing
21:49:13 that's the problem I've raised a few times with creating the attachment in initialize_connection
21:49:21 but was told that it's not a problem.
21:49:27 * hemna_ shrugs
21:49:30 and why wasn't it a problem?
21:49:36 is something cleaning that up later?
21:49:38 magic
21:49:41 ref counting?
21:49:45 the POC ignores the problem really.
21:50:06 and ignores that we are trying to do this for multi-attach
21:50:09 if necessary, initialize_connection could take a new parameter, attach:boolean
21:50:24 as part of the microversion that returns the attachment id
21:50:39 ildikov, yah initialize_connection the name was fine prior to live migration :)
21:50:39 so nova passes attach=True on attach, cinder creates new attachment in the db,
21:50:57 if just getting the conn info, nova passes attach=False and cinder doesn't create a new attachment
21:51:02 so
21:51:05 hemna_: I kinda thought it's "legacy" :)
21:51:24 or is this bleeding into update_attachment territory?
21:51:26 the other way to do this, which requires even more nova changes is that live migration uses multi-attach
21:51:35 for each attachment of the volume
21:51:55 or maybe we reorganise things slightly and have a call that actually just asks for info
21:52:10 that may open up other problems though.
21:52:19 hemna_: not all volume drivers support multiattach right?
21:52:26 mriedem, correct
21:52:42 hemna_: you mean attaching the same volume to the same instance twice?
21:52:44 lots of confusion and grey area here.
21:52:54 ildikov, that too, but just on a different host.
21:53:07 the unique key is really host + instance_uuid + volume_id
21:53:12 that would probably totally bork the constraints checking in nova for the bdm
21:53:15 hemna_: yeap, exactly
21:53:15 yeah
21:53:27 nova is just checking volume id and instance uuid today
21:53:50 and for live migration, it simply calls initialize_connection on the destination host
21:54:05 which forces cinder to export a volume to a new host
21:54:14 and then it removes the source attachment
21:54:28 it kinda backdoors the 'attach' workflow on the destination compute host
21:54:38 by bypassing reserve
21:54:58 b/c it's already in-use
21:55:02 yup
21:55:09 and multi-attach isn't baked in
21:55:10 etc.
21:55:24 w/ multi-attach you can attach in-use volumes
21:55:24 so, can we sort out the ref counting issue with creating a new attachment on each call to initialize_connection before the summit?
21:55:26 and use the whole workflow
21:55:48 if detach cleans those up, maybe that takes care of it, i'm not sure
21:56:13 detach = delete all volume attachments for this volume + instance
21:56:14 idk
21:56:31 detach takes the attachment id though...
21:56:34 so it's going to just delete that one
21:56:57 it's supposed to delete just that one
21:57:14 by ref counting issue you mean the same host issue?
21:57:25 so I was just looking at the POC
21:57:34 and it creates a brand new attachment on every initialize_connection call
21:57:46 we'll have borked stuffs in the volume_attachment table during live-migration time.
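A minimal sketch of the attach:boolean idea floated above, as a guard on the cinder side so that pure connection_info fetches don't leave orphaned rows; the db helper, connector keys, and signature are illustrative stand-ins, not the actual cinder volume API code:

```python
# Sketch of guarding attachment creation behind an explicit attach flag.
# All names here are illustrative.

def initialize_connection(db, driver, context, volume, connector,
                          attach=True):
    connection_info = driver.initialize_connection(volume, connector)

    if attach:
        # Real attach: record a new volume_attachment row and hand its
        # id back to nova in the response.
        attachment = db.volume_attach(context, {
            'volume_id': volume['id'],
            'instance_uuid': connector.get('instance_uuid'),
            'attached_host': connector.get('host'),
        })
        connection_info['attachment_id'] = attachment['id']
    # attach=False: a pure connection_info fetch (live migration, resize,
    # swap refresh) creates no row, so nothing is orphaned afterwards.
    return connection_info
```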
21:57:57 it's not just live migration,
21:58:10 there is code in nova compute manager that calls initialize_connection to refresh the connection info, let me find that quick
21:58:24 mriedem, yah I think that's called during live migration
21:59:09 driver_block_device.refresh_conn_infos() virt/block_device.py
21:59:12 that guy
21:59:19 https://drive.google.com/a/hemna.com/file/d/0B1Kp6K43HLHydU1wWFVIN29tc2s/view
21:59:21 :)
21:59:25 at least live migration and resize
21:59:33 resize too ?
21:59:41 i'm talking about https://github.com/openstack/nova/blob/4ad414f3b1216393301ef268a64e61ca1a3d5be9/nova/compute/manager.py#L1833
22:00:15 and swap_volume https://github.com/openstack/nova/blob/4ad414f3b1216393301ef268a64e61ca1a3d5be9/nova/compute/manager.py#L4886
22:00:18 yah that calls refresh_conn_infos if true
22:00:47 and swap volume calls it outright for the new volume
22:01:10 I 'think' that's ok though
22:01:16 so I don't see where we need a new update attachment API...
22:01:18 because cinder calls nova right?
22:01:31 hemna_: for swap volume yeah, but you can also do swap volume w/o cinder
22:01:33 and I think swap volume already marks the attachment as detached?
22:01:36 I dunno
22:01:43 I'd have to follow that mess of code as well
22:01:50 mriedem, oh ok
22:01:52 w/o cinder.
22:01:55 this is why i was asking for gate testing on ciner migration :)
22:01:59 cinder
22:02:00 damn
22:02:09 ok I am now scared even more now.
22:02:17 now you know why i'm scared
22:02:18 this is all feeling more and more like a giant hack
22:02:36 I dunno, I kinda think reserve should return attachment_id
22:02:38 and nova keeps it
22:02:45 and uses it for calls to cinder
22:03:07 does that solve anything
22:03:19 well, it'd be a single ref count at least
22:03:25 if nova calls initialize_connection(....., attachment_id)
22:03:43 Cinder wouldn't create a new attachment if it gets it passed in.
22:03:46 it simply looks it up
22:04:04 and updates the connector if needed and saves it into the attachment table
22:04:19 along those lines,
22:04:24 if attach volume fails in nova, it calls: self.volume_api.unreserve_volume(context, bdm.volume_id)
22:04:32 which could pass the attachment_id to delete it
22:04:48 yah
22:04:51 as would detach
22:05:02 i mean, nova calls detach with the attachment id already so it would delete it then
22:05:05 in the case of nova doing swap volume, it should be a new attachment really
22:05:11 because it's a different volume
22:05:13 yes
22:05:36 the old attachment should be marked as detached, and the new volume attachment gets its new attachment table entry and attachment_id.
22:05:46 but, nova doesn't call reserve currently
22:05:50 for swap
22:05:56 ok, so this is making more sense to me than option #2 now, but would need to know why creating a new attachment entry on every call to initialize_connection in option 2 wasn't a problem
22:06:09 it was a problem
22:06:13 it was overlooked IMHO
22:06:19 I tried raising it several times.
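A sketch of the reserve-returns-attachment_id flow described in the discussion above, from nova's side: one attachment row, created at reserve time, threaded through every later call. These signatures are the proposal being discussed, not anything the cinder API exposes today:

```python
# Sketch of the proposed single-attachment flow; all signatures are
# the proposal under discussion, not current cinder APIs.

def attach_flow(volume_api, context, volume_id, connector, bdm):
    # Proposed: reserve creates the attachment entry and returns its id.
    attachment_id = volume_api.reserve_volume(context, volume_id)
    bdm.attachment_id = attachment_id  # kept in the nova BDM

    try:
        # Passing the id tells cinder to update the existing entry
        # (e.g. save the connector) rather than create a second one.
        return volume_api.initialize_connection(
            context, volume_id, connector, attachment_id=attachment_id)
    except Exception:
        # On failure, unreserve can delete exactly this attachment.
        volume_api.unreserve_volume(context, volume_id,
                                    attachment_id=attachment_id)
        raise
```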
22:06:31 hemna_: nova reserves the new volume in the api for swap
22:06:32 self.volume_api.reserve_volume(context, new_volume['id'])
22:06:43 ah ok that's good
22:06:47 and if swap fails
22:06:48 self.volume_api.roll_detaching(context, old_volume['id'])
22:06:48 self.volume_api.unreserve_volume(context, new_volume['id'])
22:06:54 and then it gets into the https://github.com/openstack/nova/blob/4ad414f3b1216393301ef268a64e61ca1a3d5be9/nova/compute/manager.py#L4898
22:06:56 so we should be covered there
22:07:25 yeah, nova-api does the reserve call to cinder, then casts to nova compute to do the actual detach/attach of the old/new volumes
22:07:56 I see the finally clause there
22:08:03 it nukes the old attachment it looks like
22:08:39 I presume that migrate_volume_completion takes care of the correct state on the volume in that case.
22:08:58 gawd this is confusing
22:10:08 ildikov, I think we burned through our time here?
22:10:11 :) now you want gate tests too?
22:10:17 mriedem, yes
22:10:23 muwahahaha
22:10:25 Yeah, I gotta go myself.
22:10:29 ok, me too
22:10:34 that panic feeling sets in....
22:10:44 Maybe more talk in IRC, else see you all in Austing
22:10:50 hemna_: I still have more than 5 hours till the taxi picks me up, so it's fine :)
22:10:58 damn! I keep adding a 'g' to Austin
22:11:08 ildikov, :)
22:11:22 ok, bye everyone
22:11:26 later
22:11:43 C U in Austin
22:11:46 ok I have to run to a meeting...
22:12:07 thanks guys, let's sort these things out next week
22:12:15 safe travels
22:12:21 #endmeeting