17:00:06 #startmeeting cinder-nova-api-changes
17:00:06 Meeting started Mon Sep 26 17:00:06 2016 UTC and is due to finish in 60 minutes. The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:10 The meeting name has been set to 'cinder_nova_api_changes'
17:00:19 scottda ildikov DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis xyang1
17:00:29 hi
17:00:36 scottda: hi :)
17:00:39 heyas
17:01:04 hi
17:01:08 scottda: I started to put a few items on the etherpad from the last meeting: #link https://etherpad.openstack.org/p/cinder-nova-api-changes
17:01:29 I also saw jgriffith uploaded new versions of his patch last week
17:02:30 last time we weren't 100% sure about the cross-project session at the Summit
17:02:47 are we in agreement that we plan to have one?
17:03:29 o/
17:03:44 It sounds like a good idea to me. We've got API changes, multi-attach, and privsep issues.
17:04:01 scottda: coolio
17:04:16 scottda: that's a pretty long list of things for 40 minutes
17:04:16 our PTLs are in charge of the rooms and slots available, right?
17:04:17 fyi, I won't be at the summit
17:04:20 But I reckon it needs buy-in from smcginnis and mriedem, both of whom aren't here
17:04:35 ildikov: Yes, it's in the hands of the PTLs
17:04:42 I guess it will not be that tough to find overlap
17:04:54 hemna: oh no, I'm sorry :(
17:05:12 scottda: ok, I'll ping them about this
17:05:29 jgriffith: Yes, long list, but I think that's why we need the session. I'm not sure we'll actually solve problems, but I think there are lotsa folks who don't know what's going on here.
17:05:35 #action ildikov to ping smcginnis and mriedem about a cross-project session at the summit
17:06:08 hemna: Let's make sure someone attending understands the privsep issues and can advocate for what we need to do.
17:06:27 scottda, ok sounds good
17:09:18 what do we want to discuss before the Summit regarding the API changes?
17:09:42 it would be great to have some decisions beforehand and try to address the less trivial items during the Summit
17:10:02 like edge cases in Nova and how to handle those without overcomplicating the new Cinder API, etc.
17:10:12 save the connector!
17:10:42 I didn't have the chance to read johnthetubaguy's initial patch yet: #link https://review.openstack.org/#/c/373203/
17:10:59 hemna: +1, I added that to the etherpad once more to ensure we do that
17:11:10 :)
17:11:17 so I kinda had quite a few open questions in there
17:11:27 jgriffith: did you have a chance to check what johnthetubaguy started to sketch? ^^
17:11:38 ildikov: I did, and it looked AWESOME
17:11:49 basically I read jgriffith's spec and added some guesswork where I wasn't sure :)
17:12:03 jgriffith: I'm glad to hear :)
17:12:12 ildikov: johnthetubaguy I'm *hoping* to push an update to the Nova patch later today, and then use that to work with johnthetubaguy to flesh out some remaining details
17:12:43 jgriffith: johnthetubaguy and all: can we go through the sure and then the unsure items?
17:13:25 ildikov: How to do that? On the etherpad?
17:13:27 in the sense that if we start with the obvious items and move towards the less obvious ones here, we won't spend the session in Barcelona getting through only 20% of the material
17:13:39 yeah, I think that's a good plan
17:13:48 do what we can async, so we have a smaller list in BCN
17:13:56 scottda: we have a few meetings left before the Summit and we can use the etherpad in parallel as well
17:14:11 johnthetubaguy: ildikov +1
17:14:25 honestly, I'm hoping we're just happy and merge this before Barcelona :)
17:14:26 ildikov: Sounds good. Perhaps list things on the etherpad and give feedback, then discuss in detail at these meetings
17:14:45 so a quick one, possibly
17:14:49 what is unreserve for?
17:14:51 jgriffith: ++ It would be nice to get some momentum, else we're going to run out of time quickly.
17:14:56 johnthetubaguy: heeh :)
17:15:09 johnthetubaguy: the idea behind it was to notify cinder you're going to let it go
17:15:18 ah, so a pre-remove call?
17:15:20 johnthetubaguy: mark it as detaching to avoid races
17:15:36 johnthetubaguy: exactly... truly the exact opposite of reserve
17:15:47 johnthetubaguy: it was initially JUST a db update/toggle
17:15:53 so if we call reserve, but attach fails, what do we do, call unreserve then remove?
17:16:21 johnthetubaguy: so that's what's kinda goofy
17:16:36 johnthetubaguy: we don't always do that and probably don't necessarily need to
17:16:48 johnthetubaguy: we probably *should*, just for consistency and clarity
17:16:48 scottda: yeap, that sounds good
17:17:05 johnthetubaguy: there's a problem now with cinder objects though...
17:17:13 jgriffith: if we can even merge things before the Summit that would be amazing!!! :)
17:17:33 johnthetubaguy: it has an expected status of 'attaching', so if something failed and went to error-attaching or error* we're sunk
17:17:46 skipping unreserve bypasses that problem
17:17:58 yikes
17:18:21 johnthetubaguy: easy to fix at least, but a minor problem with that particular call
17:18:22 I've always thought we should have an idempotent attach we could call for cleanup... It should work regardless of the state
17:18:26 so if you skip create_attachment, you skip unreserve?
17:18:42 so that's a good point...
17:19:01 johnthetubaguy: yeah, or if you fail create_attachment
17:19:17 Sorry, that should be an idempotent "detach" for cleanup
17:19:18 ah, so one question about create_attachment, when do I call that?
17:19:24 I guess that gives me the connection_info?
17:19:34 so I call that before calling brick
17:19:49 yes
17:19:54 johnthetubaguy: I'm reworking that, so the flow would be: reserve ---> create_attachment
17:19:55 done
17:19:56 you call that before calling brick connect_volume
17:20:11 cool, that makes sense
17:20:23 hemna: johnthetubaguy well that's sort of a gray area with what we've designed in brick though
17:20:27 so yeah, I guess it's understanding what I do to clean up each step, should something crazy happen
17:20:34 uh, oh
17:20:46 hemna: johnthetubaguy because there are *things* that some drivers apparently need to do between the initialize and attach on the cinder side
17:21:02 oh, so there is something post attach?
17:21:06 huh?
17:21:22 johnthetubaguy: cleanup with the new code is easier IMHO... if create_attachment fails, then we just call remove_attachment and clean up on cinder's side
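
[A minimal sketch of the attach flow discussed above: reserve the volume, create the attachment to get connection_info, then hand that to os-brick, with the cleanup-per-step behaviour the group converges on. The cinder client object and its reserve/create_attachment/unreserve/remove_attachment methods are illustrative names only, not the final API.]

```python
def attach_volume(cinder, brick_connector, volume_id, host_uuid, instance_uuid):
    # reserve marks the volume 'attaching' on the Cinder side to avoid races
    cinder.reserve(volume_id)
    try:
        # create_attachment returns connection_info (and an attachment id)
        # and is called before os-brick's connect_volume
        attachment = cinder.create_attachment(volume_id, host_uuid, instance_uuid)
    except Exception:
        # create_attachment failed: unreserve so the volume doesn't get
        # stuck in 'attaching' (the unreserve gap jgriffith describes above)
        cinder.unreserve(volume_id)
        raise
    try:
        brick_connector.connect_volume(attachment['connection_info'])
    except Exception:
        # os-brick failed after the Cinder side succeeded: remove the
        # attachment so Cinder can clean up its side
        cinder.remove_attachment(attachment['id'])
        raise
    return attachment['id']
```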
17:21:44 johnthetubaguy: not post attach, post initialize_connection
17:21:53 jgriffith: true
17:22:10 so remember we had the multi-phase mess: reserve --> initialize --> attach
17:22:22 right, we totally skip all that
17:22:34 johnthetubaguy: going to just reserve --> create-attachment means there are some slight changes to what we need
17:23:01 johnthetubaguy: but I'm also working to add an "update-attachment" for those special cases (like boot from volume)
17:23:21 and live-migration
17:23:32 sorry, I am a bit lost, what's that special case about boot from volume?
17:24:00 that makes sense for live migration
17:24:12 PS I thought live-migration would be just another call to create_attachment, with a new connection_info, just it's a different host, and same instance_uuid
17:24:29 johnthetubaguy, but you still need to update the older attachment
17:24:34 basically removing it
17:24:46 johnthetubaguy: so when you do a boot from volume, the attach info isn't actually available yet
17:24:56 hemna: we need to remove_attachment for that old one, just like if we were disconnecting
17:24:57 johnthetubaguy: it's easier for me to explain via code... just a sec
17:25:17 jgriffith: cause we can't reserve a volume we haven't created yet?
17:26:28 * johnthetubaguy is not sure this is coming across, but I actually love discovering how all this *actually* works, totally geeking out
17:27:02 johnthetubaguy: https://github.com/openstack/nova/blob/master/nova/virt/block_device.py#L264
17:28:46 jgriffith: oh...
17:28:52 johnthetubaguy: so in that case the actual attach on the Nova side is not done until later
17:29:27 jgriffith: although, something tells me we should probably just call create right away there
17:29:27 johnthetubaguy: and we may not have all of the info required for the actual attachment record in Cinder, so I stub things out, then do an update later if needed
17:29:51 johnthetubaguy: I refactored some things last night though and may not need to worry about that any more... TBD :)
17:30:13 yeah, I think we should just call reserve in the boot case, until we land on the host, then call create_attachment
17:30:21 at least that's what my "nova gut" tells me
17:30:40 johnthetubaguy: might be a good idea
17:30:57 johnthetubaguy: I'll for sure have a look at that
17:31:11 I think the BDM is fine not having the connection info till later, probably means delete will need updates though, but in a good way
17:31:31 so... that was a big rat hole there
17:31:35 where were we before?
17:31:40 johnthetubaguy: :)
17:32:01 ah, live-migrate
17:32:20 so I have an idea here
17:32:24 create_attachment
17:32:38 :)
17:32:40 let's say that takes volume uuid, host uuid and instance uuid
17:33:00 live-migrate and multi-attach are the only cases where we have more than one attachment
17:33:15 you only have two attachments for the same volume uuid, and only in the live-migrate case
17:33:39 that's a long way of saying, for live-migrate, we call create for the new host, and remove for the old host
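
[A sketch of the live-migration flow johnthetubaguy lands on above, under the same hypothetical client as the earlier sketch: create an attachment for the destination host (two attachments exist for the volume during the window), then remove the source host's attachment just like a normal disconnect. migrate_fn is a placeholder for the actual migration work, and the rollback-on-failure behaviour is an assumption, not something stated in the discussion.]

```python
def live_migrate_volume(cinder, volume_id, instance_uuid,
                        src_attachment_id, dest_host_uuid, migrate_fn):
    # same volume_id and instance_uuid, new host: a second attachment row
    new_attachment = cinder.create_attachment(
        volume_id, dest_host_uuid, instance_uuid)
    try:
        migrate_fn(new_attachment['connection_info'])
    except Exception:
        # migration failed: drop the destination attachment and keep the
        # source attachment intact (assumed rollback behaviour)
        cinder.remove_attachment(new_attachment['id'])
        raise
    # migration done: remove the old attachment, as if disconnecting
    cinder.remove_attachment(src_attachment_id)
    return new_attachment['id']
```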
17:33:42 Won't you for multi-attach as well?
17:33:59 yeah, multi-attach is a different instance-uuid
17:34:06 Have 2 attachments for the same volume_id?
17:34:13 scottda, yah
17:34:17 johnthetubaguy: scottda FWIW, that's what I do currently
17:34:17 for live migration
17:34:18 right, you have N uuids, only two hosts per each uuid
17:34:52 * johnthetubaguy needs a whiteboard, and drawing skills
17:34:58 jgriffith: cool
17:35:10 so I think this means I can update my spec
17:35:20 * error handling for create_attach failures
17:35:27 * live-migrate flow and multi-attach notes
17:35:36 * fix unreserve
17:35:42 johnthetubaguy: I'll sign up to document those out
17:35:51 the live-migrate and multi-attach, that is
17:36:05 johnthetubaguy: they'll hopefully end up being the same solution for both problems
17:36:05 yeah, would be good to get that into the Cinder spec
17:36:11 yeah, I think it should be
17:36:23 so there is one bit that's missing
17:36:26 arguments for the API calls
17:36:33 I just need to figure out how to fix the shelved instance stuff then I'll get those tested out and documented
17:36:35 johnthetubaguy: Do you want to tackle shelve_offload as well? Should be a bit simpler than live migration.
17:36:51 snap! jgriffith already is on it.
17:36:58 scottda: I had forgotten about that...
17:37:06 johnthetubaguy: I haven't :(
17:37:13 johnthetubaguy: although I wish I had
17:37:17 so shelve, can we have a host_uuid of None?
17:37:21 johnthetubaguy: I think I've got it working
17:37:23 for the attachment
17:37:30 treat the host change like migrate
17:37:36 johnthetubaguy: yes, what I do currently is stub out the info and then update it
17:37:36 crud... migrate
17:37:49 jgriffith: perfect
17:38:02 johnthetubaguy: the trick for me has been figuring out how to work with the bdm object efficiently :)
17:38:03 migrate I think is the same as live-migrate, if we do it correctly
17:38:21 jgriffith: I fear that's not really possible
17:38:21 johnthetubaguy: hopefully all of those just leverage the multi-attach capability
17:38:28 jgriffith: +1
17:38:36 LOL
17:39:04 johnthetubaguy: well, even ignoring efficiency, still trying to untangle how some of it works
17:39:06 seriously though, I think we need to get the BDM into the instance object, so we don't have to keep fetching the damned thing every 5 seconds
17:39:25 but let's do that later!
17:39:37 oh, yeah, arguments
17:39:50 johnthetubaguy: that would be awesome, but yeah... later :)
17:40:06 johnthetubaguy: arguments are subject to change :)
17:40:06 so remove_attachment, that takes attachment_uuid
17:40:18 I am more thinking logically
17:40:26 rather than REST-e-ly
17:40:57 it feels like reserve creates the attachment_uuid
17:41:04 johnthetubaguy: so I'm currently stuffing the attachment-id into the bdm record
17:41:26 so only reserve takes volume_uuid, host_uuid, instance_uuid
17:41:28 johnthetubaguy: nahh... it doesn't, but do you want it to? Currently create_attachment does that
17:41:32 the others take attachment uuid?
17:41:39 reserve just takes a volume-id, that's it
17:41:54 oh... interesting
17:41:58 create_attachment returns a bunch of connector info AND an attachment-id
17:42:03 debugging-wise, it's probably good to know who did that
17:42:09 from the operator point of view
17:42:10 that way there's no confusion about what we want to do
17:42:24 and it solves the multi-attach problem without a bunch of crazy guessing or inferring
17:42:40 Caller told me to remove, so that's what I do
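
[An in-memory stand-in illustrating the argument shapes just agreed: reserve takes only the volume id; create_attachment takes volume, host and instance uuids and returns connector info plus an attachment id; remove_attachment takes just the attachment id, so Cinder never has to guess which attachment the caller means. Hypothetical code, not the Cinder implementation.]

```python
import uuid


class AttachmentAPISketch:
    def __init__(self):
        self._attachments = {}  # attachment_id -> record, in-memory stand-in

    def reserve(self, volume_id):
        # just a volume-id, nothing else: flips the volume to 'attaching'
        pass

    def create_attachment(self, volume_id, host_uuid, instance_uuid):
        attachment_id = str(uuid.uuid4())
        record = {
            'id': attachment_id,
            'volume_id': volume_id,
            'host_uuid': host_uuid,
            'instance_uuid': instance_uuid,
            'connection_info': {},  # a real backend would fill this in
        }
        self._attachments[attachment_id] = record
        return record

    def remove_attachment(self, attachment_id):
        # the caller told us exactly which attachment to remove
        self._attachments.pop(attachment_id, None)
```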
17:42:46 multi-attach still calls reserve, right?
17:42:57 if they have multiple attachments to the same node it still works and I don't care
17:43:09 johnthetubaguy: yes, but for that to work we need to update reserve
17:43:34 ah, true, but that's all non-driver stuff I guess?
17:43:36 johnthetubaguy: and probably include a "what" we're attaching to
17:43:47 yeah, that's what I was thinking above I guess
17:43:54 I've been tabling the multi-attach support for a follow-up
17:44:06 at least the "true" multi-attach
17:44:11 ah, so crazy idea....
17:44:27 ditch create_attachment
17:44:33 call it initialize_connection
17:44:54 oh, wait... that's nuts
17:45:57 create_attachment (now reserve), initialize_attachment (now create), initialize_detachment (now unreserve), remove_attachment
17:46:10 so you get the attachment_id back from the first one
17:46:39 all the others are actions on the attachment
17:46:39 I think hemna was advocating that for a while.
17:46:53 so it turns out I am +1 on hemna's suggestion
17:47:00 phew
17:47:22 basically because of the arguments each of these things need, and how it should fit into a REST API
17:47:31 it's attachment CRUD now
17:49:04 oh dear, jgriffith has gone really quiet
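
[A naming sketch of the attachment-CRUD shape converged on above; each docstring notes which older call the new name replaces, per johnthetubaguy's 17:45:57 message. These are the names from the discussion, not a confirmed API, and the connector parameter is an assumption.]

```python
class AttachmentCRUDSketch:
    def create_attachment(self, volume_id, host_uuid, instance_uuid):
        """Was 'reserve': creates the attachment and returns its id."""

    def initialize_attachment(self, attachment_id, connector):
        """Was 'create_attachment': returns connection_info for os-brick."""

    def initialize_detachment(self, attachment_id):
        """Was 'unreserve': flags the attachment as detaching (race guard)."""

    def remove_attachment(self, attachment_id):
        """Deletes the attachment once os-brick has disconnected."""
```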
17:50:48 so os-brick and remove
17:50:53 shoot
17:51:06 does unreserve give me the info to call os-brick, or is that remove_attachment?
17:51:42 the bdm has the connection_info currently for os-brick remove_volume
17:52:12 I believe unreserve is just to set the cinder-side state for races
17:52:30 nova pulls the current connection_info out of the bdm, then calls os-brick to remove_volume
17:52:37 then nova calls cinder to terminate_connection
17:52:58 oh, I thought terminate_connection had died
17:53:09 anyways, sorry, back up a sec
17:53:14 I am thinking about multi-attach
17:53:15 sorry
17:53:21 I was covering what currently exists afaik
17:53:26 ah, no worries
17:53:31 I am thinking about the future
17:53:34 ok
17:53:51 so is nova not going to store the connection_info in the bdm?
17:53:59 wondering who decides if we need to call remove
17:54:27 hemna: I was wondering if we could stop needing that, I guess
17:54:36 hrmm
17:55:07 I'm assuming you are thinking of the issue we had with calling or not calling os-brick to remove_volume
17:55:12 it would be great if we called cinder, and it told us: nothing for os-brick to do here, or send os-brick this stuff
17:55:18 yeah, that's the one
17:55:32 depending on whether the volume has multiple attachments or a single shared attachment on a single n-cpu host
17:55:33 hemna: I think jgriffith planned a new API call that tells whether it's ok to call remove or not
17:55:34 well, that logout thing
17:55:55 ildikov: yeah, I remember something about that at one point
17:56:18 that does sound familiar
17:56:30 johnthetubaguy, so if we don't store the connection_info in the bdm
17:56:33 then it has to ask cinder for it
17:56:44 and cinder isn't storing it currently
17:57:10 I'd guess nova would have to at least retain the attachment_id to know which attachment it's talking about.
17:57:16 yeap, but as we plan to have Cinder as the ultimate source of truth that should be ok
17:57:24 yeah, I think we can do better cleanup for deployers if Cinder knows that, even if Nova screws up
17:57:35 johnthetubaguy: +1
17:57:37 ildikov: +1
17:57:46 I don't see why we couldn't store that connection_info in the attachments table as well as the connector
17:57:46 :)
17:57:59 also if we can get BDM a bit less messy somehow that would be great too
17:58:05 but that was just a sidenote
17:58:14 ildikov: yeah, that's true
17:58:16 live migration will create a brand new attachment row, with its new host and connector, and would also need to save the connection_info then as well
17:58:27 nova just updates the attachment_id and it's done
17:58:36 hemna: it's the same for every new attachment, I believe
17:58:58 I mean each attachment has its own connection info I think
17:58:59 so
17:59:19 the trick then is the case where nova needs to detach the old volume for the source
17:59:27 nova would have to save both attachment_ids
17:59:29 for that small window
17:59:35 For historical reference #link https://etherpad.openstack.org/p/cinder-nova-api-changes L#178
17:59:39 right, but that's when Nova knows we have two attachments now
17:59:40 which is akin to this hack: https://review.openstack.org/#/c/312773/
17:59:44 yeah
17:59:46 That's what we decided on this issue, I believe
18:00:16 so we are going through the same problems with neutron
18:00:24 I think Nova just needs to know about both connections
18:00:41 so, afraid my hunger is taking over, and we are getting towards the end of time
18:00:47 I do like the idea of nova not saving much of anything but the attachment_ids associated with the volumes it has
18:00:57 hemna: +1
18:00:57 but I feel like we made progress here
18:01:01 it's really future-safe
18:01:09 hemna: yeah, that feels right
18:01:20 yeap, our time is up; I will try to get a summary out of today's good chat and put it on the etherpad
18:01:20 we give you the control to evolve
18:01:32 yah that sounds good.
18:01:36 johnthetubaguy: +1
18:01:53 johnthetubaguy: did we touch on anything today that should go to your spec?
18:01:58 so it feels like whatever we call unreserve should return something to tell os-brick what it needs to do
18:02:19 ildikov: I think there will be more questions, but that's a huge step forward for me
18:02:31 I just don't know what those questions will be yet :)
18:02:32 unreserve could fetch the connection_info from the attachment, if we have the attachment_id passed in to unreserve
18:02:51 yeah, +1
18:02:58 johnthetubaguy: no doubt about that, I was just wondering about putting some info into your patch as opposed to the etherpad
18:03:04 johnthetubaguy: just to make some progress :)
18:03:15 this seems much cleaner and more explicit.
18:03:15 so I can have an action to rework that tomorrow morning
18:03:23 I am just dropping some comments on there now
18:03:25 hemna: in theory that should work fine
18:03:39 johnthetubaguy: great!
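
[A sketch of the detach direction settled on here: Nova keeps only the attachment_id in the BDM, and the unreserve-like call (named begin_detach purely for illustration) returns the stored connection_info, or nothing if os-brick has no work to do, e.g. a shared attachment on the same host. All call names are hypothetical.]

```python
def detach_volume(cinder, brick_connector, attachment_id):
    # Cinder, as the source of truth, looks up the connection_info it
    # stored for this attachment and decides whether os-brick is needed
    info = cinder.begin_detach(attachment_id)  # the 'unreserve' of the new flow
    if info.get('connection_info'):
        brick_connector.disconnect_volume(info['connection_info'])
    # otherwise nothing for os-brick to do (shared-attachment case); either
    # way, finally delete the attachment record on the Cinder side
    cinder.remove_attachment(attachment_id)
```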
18:03:58 I wonder if someone could create some more nifty diagrams for these new flows.....
18:04:11 #action johnthetubaguy to update his spec in Nova with some details of the discussion from this meeting
18:04:23 scottda: yeah, that's a good point, something visual
18:04:33 let's see how that spec goes
18:04:51 scottda: hemna is our visualisation expert
18:05:02 doh!
18:05:07 Oh yeah, he is, isn't he? :)
18:05:18 And he did such a great job on the last diagrams...
18:06:09 he seriously did, those are very nice diagrams!
18:06:18 *sigh*
18:06:42 hemna: but if you can share your experience then I guess someone can take that over so you don't need to suffer again
18:06:59 coolness, hopefully with the text straight, we can have a think about those
18:07:12 I have to remember what I used to create those
18:07:14 it would be great before the Summit
18:07:32 it could be a great way to update people more quickly I think
18:07:59 sorry, I must run now
18:08:07 will try to get that updated ASAP
18:08:20 johnthetubaguy: thanks!!!
18:08:36 I think we covered a lot today, great discussion!
18:08:58 Yup, thanks ildikov
18:09:01 so I will update the etherpad, johnthetubaguy will update his spec, and let's move forward in the review
18:09:12 and also on the etherpad with the items in question
18:09:21 and talk to you next week at this time at the latest!
18:09:27 thank you all!
18:09:50 #endmeeting