17:01:01 <ildikov> #startmeeting cinder-nova-api-changes
17:01:02 <openstack> Meeting started Mon Nov 21 17:01:01 2016 UTC and is due to finish in 60 minutes.  The chair is ildikov. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:01:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:01:05 <openstack> The meeting name has been set to 'cinder_nova_api_changes'
17:01:18 <ildikov> scottda ildikov DuncanT ameade cFouts johnthetubaguy jaypipes takashin alaski e0ne jgriffith tbarron andrearosa hemna erlon mriedem gouthamr ebalduf patrickeast smcginnis diablo_rojo gsilvis  xyang1 raj_singh lyarwood
17:01:29 <mriedem> o/
17:01:45 <scottda> hi
17:02:07 <ildikov> mriedem: I would offer that I ping you to do homework this week, but Thanksgiving kills my joy of annoying people...
17:02:07 <johnthetubaguy> o/
17:02:59 <ildikov> I know that smcginnis and patrickeast are traveling this week
17:03:10 <mriedem> my todo this week is to review johnthetubaguy's updates to https://review.openstack.org/#/c/373203/ which i didn't get to last week because of the nova spec freeze
17:03:21 <mriedem> but i'm also straddling some internal things i need to work on, and it's a short week
17:03:40 <ildikov> mriedem: the Cinder side is updated too
17:04:00 <ildikov> mriedem: code #link https://review.openstack.org/#/c/387712/
17:04:22 <ildikov> mriedem: spec #link https://review.openstack.org/#/c/361512/
17:04:26 <hemna> yough
17:05:05 <ildikov> there is a comment from jaypipes regarding the reserve part, which creates an empty attachment and also changes the volume state to 'reserved'
17:05:14 <johnthetubaguy> I am curious if they are both pointing at the same thing
17:05:31 <ildikov> johnthetubaguy: that's just an illusion :)
17:05:58 <mriedem> i'd kind of hoped we wouldn't have reserve, and we'd just have nova-api call create_attachment which would create the empty attachment and set the volume status to 'attaching'
17:06:13 <hemna> I think the reserve should return the attachment uuid
17:06:15 <mriedem> then update_attachment in nova-compute when we have the connector details from os-brick
17:06:18 <hemna> that's what I asked for a while back
17:06:29 <hemna> I think it just makes the entire flow more explicit and sane
17:06:51 <mriedem> either way i think we agree to create the attachment from nova-api
17:06:57 <hemna> mriedem, the problem with that is the create_attachment has to call the cinder backend
17:06:59 <hemna> which is slow
17:07:10 <johnthetubaguy> so my spec calls reserve create attachment
17:07:10 <ildikov> mriedem: the current reserve does this, except for the name of the volume state
17:07:14 <mriedem> hemna: even if nova doesn't provide a connector?
17:07:26 <johnthetubaguy> right, connector is later
17:07:41 <johnthetubaguy> (optionally, later)
17:07:42 <hemna> why would it call it w/o a connector?
17:07:43 <mriedem> i'm cool with leaving a reserve call in
17:07:52 <ildikov> johnthetubaguy: the Cinder API proposal is not finalized yet completely, so if the functionality is the same I think we can wait with updating the name in your spec
17:07:54 <johnthetubaguy> it would do that to reserve the volume from other people
17:07:56 <hemna> just to reserve?
17:07:59 <mriedem> it's just a bit odd that reserve will create an attachment too
17:08:07 <mriedem> and we'll also have a POST attachment
17:08:07 <ildikov> johnthetubaguy: but basically the answer is yes :)
17:08:14 <hemna> well it's 'odd' either way
17:08:28 <hemna> calling create_attachment, which doesn't create an attachment but is used to reserve
17:08:30 <mriedem> yes it's odd either way i agree
17:08:36 <hemna> kinda....hidden API
17:08:40 <mriedem> it would create an attachment record in the db
17:08:42 <ildikov> mriedem: I know, it's attachment-reserve atm though
17:08:51 <hemna> I prefer having a named API for the capability
17:08:55 <hemna> that returns the attachment uuid
17:09:03 <mriedem> sure that's fine
17:09:10 <mriedem> os-reserve creates an empty attachment and returns the id
17:09:14 <mriedem> s/id/uuid/
17:09:19 <mriedem> nova stores the attachment uuid in the bdm
17:09:20 <ildikov> mriedem: reserve only asks for a volume_id and an instance_id IIRC
17:09:21 <hemna> yuh
17:09:23 <johnthetubaguy> if you look at build instance, and how retry works, it might change your mind about reserve being separate from attachment uuid, at least that's what pushed me the way I suggested in the current Nova spec
17:09:30 <mriedem> then nova-compute uses the bdm.attachment_id to update the attachment later with the connector
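A minimal sketch of the flow mriedem lays out in the last few lines, with a stub standing in for the real Cinder client (all names here — `CinderStub`, `reserve`, `update_attachment` — are hypothetical illustrations, not the finalized API):

```python
import uuid


class CinderStub:
    """Stand-in for the proposed Cinder attachment API (hypothetical names)."""

    def __init__(self):
        self.attachments = {}

    def reserve(self, volume_id, instance_id):
        # os-reserve: create an empty attachment, mark it 'reserved',
        # and return the new attachment's uuid.
        attachment_id = str(uuid.uuid4())
        self.attachments[attachment_id] = {
            'volume_id': volume_id,
            'instance_id': instance_id,
            'connector': None,
            'status': 'reserved',
        }
        return attachment_id

    def update_attachment(self, attachment_id, connector):
        # Called from nova-compute once os-brick knows the host connector.
        attachment = self.attachments[attachment_id]
        attachment['connector'] = connector
        attachment['status'] = 'attached'
        return attachment


cinder = CinderStub()

# nova-api: reserve the volume and store the attachment uuid in the BDM.
bdm = {'attachment_id': cinder.reserve('vol-1', 'inst-1')}

# nova-compute: update the attachment with the host connector later.
attachment = cinder.update_attachment(
    bdm['attachment_id'], {'host': 'compute-1', 'initiator': 'iqn.example'})
print(attachment['status'])  # attached
```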
17:10:16 <mriedem> johnthetubaguy: with bfv wouldn't we still reserve the volume in the api?
17:10:41 <mriedem> in the case that a volume_id is provided for server create i mean
17:10:54 <mriedem> i'm trying to think of it like with ports
17:11:09 <ildikov> I would guess that reserving the volume or moving it to 'attaching' state or smth is needed in the API anyhow
17:11:29 <mriedem> we don't do that today for bfv i don't think, but i think that's wrong
17:11:51 <johnthetubaguy> mriedem: so attaching a volume during boot, I like to think about that separately from bfv, yes you reserve in the API, and then the compute
17:12:01 <ildikov> mriedem: no it's not there today in BFV
17:12:10 <johnthetubaguy> mriedem: yeah, its broken today, but thats what we should do
17:12:11 <ildikov> mriedem: I have a patch for that though on review
17:12:38 <ildikov> mriedem: I have a few questions on the review to ensure we're on the same page, so if you can take a look that would be pretty great too :)
17:13:33 <ildikov> remove check_attach patch #link https://review.openstack.org/#/c/335358/
17:15:30 <johnthetubaguy> created a new section in here: https://etherpad.openstack.org/p/ocata-nova-priorities-tracking
17:16:37 <ildikov> johnthetubaguy: coolio, thanks!
17:17:27 <ildikov> so I guess we agree that having smth to call from the API that locks the volume and returns an attachment_id is what we would like to have
17:17:34 <ildikov> is my understanding correct?
17:18:08 <mriedem> i think so
17:18:17 <scottda> That's my understanding
17:18:35 <johnthetubaguy> the cinder spec seems to include reserve already
17:19:13 <ildikov> johnthetubaguy: it does, but the functionality of that attachment-reserve is what I typed in above
17:19:37 <ildikov> so it seems that we kind of have the functionality, but don't like the naming
17:19:51 * ildikov maybe oversimplifies the situation
17:21:39 <johnthetubaguy> actually, I am not sure if reserve works for the retry build instance on a different host
17:22:09 <johnthetubaguy> we want to keep the volume attached to the instance, but we need to do a detach on the old host, so we end up with two attachments
17:22:29 <johnthetubaguy> I am not sharing enough context there probably
17:22:46 <johnthetubaguy> so instance is building on host 1
17:22:47 <johnthetubaguy> volume 1 is attached to host 1
17:22:48 <johnthetubaguy> it fails, boo
17:22:55 <ildikov> is retry build equivalent to evacuate?
17:23:07 <johnthetubaguy> close, but not quite
17:23:17 <johnthetubaguy> we can actually clean up host 1, and host 1 is still alive
17:23:30 <johnthetubaguy> but we do need to keep it reserved during the process
17:24:10 <ildikov> johnthetubaguy: isn't it like live migrate?
17:24:19 <johnthetubaguy> yeah, basically the same
17:24:28 <ildikov> where we have two attachments for the same instance?
17:24:47 <johnthetubaguy> yeah
17:24:52 <mriedem> johnthetubaguy: we don't reserve from the compute
17:25:02 <mriedem> just like with ports, we bind on each host
17:25:10 <ildikov> mriedem: +1
17:25:14 <mriedem> and cleanup any garbage on retry
17:25:18 <johnthetubaguy> ah, so maybe build is different
17:25:26 <johnthetubaguy> ah, oops, it is
17:25:31 <mriedem> note that today we don't retry on BFV failures with attachment errors
17:25:32 <johnthetubaguy> we disconnect host 1
17:25:38 <johnthetubaguy> then go to host 2
17:25:59 <johnthetubaguy> so we are not at host 2 at the time we want to clean up host 1
17:26:01 <mriedem> https://review.openstack.org/#/c/246505/ attempts to retry when bfv fails attachment on host 1
17:26:33 <mriedem> the author has basically abandoned ^
17:26:45 <johnthetubaguy> I really mean when there is a failure for some other reason, we need to sort out the volume attachments
17:27:09 <mriedem> i guess it depends on what we need to cleanup
17:27:15 <mriedem> i think of it like,
17:27:24 <mriedem> today we update the port's host binding details,
17:27:32 <mriedem> on failure we don't clean that up
17:27:34 <ildikov> mriedem: I think the idea was to handle what we can with a new attachment
17:27:39 <mriedem> we probably should...but we don't today,
17:27:49 <ildikov> mriedem: and move old ones to error state if needed maybe
17:27:53 <mriedem> with volumes i think of that like we update the vol attachment with the connector from that host
17:28:06 <mriedem> ildikov: having 3 vol attachments in error state sucks
17:28:12 <mriedem> who cleans those up? what does it mean for the volume?
17:28:31 <ildikov> I'm fine with just deleting the attachments
17:28:38 <mriedem> well,
17:28:45 <mriedem> i think we have a single vol attachment per vol and instance
17:28:49 <mriedem> and the host changes
17:28:52 <mriedem> per the connector
17:28:57 <ildikov> I think there is at least one part of johnthetubaguy's patch where we want to move it to error state and clean up later I guess
17:28:58 <mriedem> that's what we have with ports
17:29:08 <johnthetubaguy> ildikov: that's evacuate I think
17:29:20 <mriedem> i could see that for evacuate
17:29:26 <ildikov> johnthetubaguy: yeap, I think so too
17:29:26 <mriedem> cleanup the error attachments when the compute comes back
17:29:33 <mriedem> evacuate is a special sort of terrible
17:29:35 <johnthetubaguy> mriedem: I forgot, we are missing reserve on ports today, technically
17:29:41 <mriedem> johnthetubaguy: yeah i know
17:29:54 <ildikov> mriedem: we certainly have an agreement there
17:29:55 <johnthetubaguy> mriedem: they do have a plan to fix that already
17:29:57 <mriedem> johnthetubaguy: but we don't reserve for BFV today either, so it's similar
17:30:03 <johnthetubaguy> yeah, its similar
17:30:26 <ildikov> I hope we fix that soon, I mean the BFV reserve case
17:31:02 <ildikov> it would be great to clean up check_attach before introducing a new Cinder flow IMHO
17:32:05 <johnthetubaguy> so I think this is how I described the retry build in the spec: https://review.openstack.org/#/c/373203/13/specs/ocata/approved/cinder-new-attach-apis.rst@237
17:32:12 <johnthetubaguy> it's basically the same thing we are describing
17:32:20 <johnthetubaguy> I just wanted to avoid a window where the volume is not reserved
17:32:47 <johnthetubaguy> oops, wrong line
17:32:47 <johnthetubaguy> https://review.openstack.org/#/c/373203/13/specs/ocata/approved/cinder-new-attach-apis.rst@273
17:35:08 <ildikov> johnthetubaguy: is this a case when we only reserved the volume and nothing else happened due to any reason before retry?
17:35:45 <johnthetubaguy> I was meaning when the volume is fully attached and working on host 1, but we still need to retry the build and move to host 2
17:35:58 <ildikov> johnthetubaguy: in any case the attachment-create call is now create_and_or_update practically
17:36:26 <ildikov> johnthetubaguy: so if you call it from the compute after having a connector it reserves the volume and does everything else as well
17:36:42 <johnthetubaguy> I don't know what the new host will be, when host 1 is being disconnected, I think...
17:37:01 <johnthetubaguy> so I am calling reserve on an already attached volume, basically
17:37:09 <johnthetubaguy> but with a matching instance_uuid
17:37:23 <johnthetubaguy> to make sure it stays reserved when host 1 is disconnected
17:37:36 <johnthetubaguy> I am just wondering if that will work with the current definition of reserve
17:38:08 <ildikov> I need to double check, it might
17:39:34 <ildikov> so basically we need to call something that ensures the volume is locked, when we know retry is needed
17:39:51 <johnthetubaguy> I think, yes
17:40:06 <johnthetubaguy> right now it looks like we just skip any cleanup, but I could be wrong
17:40:17 <ildikov> as we need the scheduler to figure out the host first so we can have the connector, but it takes some time and we don't want to leave the volume available for that window
17:40:21 <mriedem> left a comment in the review in that section
17:41:04 <mriedem> since i haven't read the full spec since it was last updated i might be missing context,
17:41:14 <johnthetubaguy> mriedem: yeah, I was just digging, I couldn't see any cleanup on reschedule either
17:41:20 <mriedem> but really wanted at a high level that the volume is reserved (locked) in n-api
17:41:30 <mriedem> and the computes update the connector per compute, per retry
17:41:44 <mriedem> johnthetubaguy: there isn't cleanup b/c we don't retry for bfv failures
17:42:00 <mriedem> the only thing we cleanup is calling os-detach and os-terminate_connection
17:42:46 <johnthetubaguy> so... its the update the connector bit
17:43:01 <johnthetubaguy> we can't really do that, I thought, as that would break all our locking around shared connections
17:43:17 <johnthetubaguy> ildikov: did we get anywhere on deciding if they actually exist?
17:43:46 <ildikov> so with live migrate we create a new attachment, with evacuate we do too and mark the old one as in error state, and for the rest we would like to just update the existing attachment?
17:44:18 <ildikov> johnthetubaguy: they exist as I saw from the discussions on the Cinder channel last week after the meeting
17:44:28 <johnthetubaguy> OK
17:44:50 <johnthetubaguy> I would prefer to always create a new attachment, and delete the old one, just so the detach logic is always the same
17:44:56 <johnthetubaguy> but maybe I am overthinking it
17:45:07 <ildikov> johnthetubaguy: I think there are still differences in how those connections are handled, but we don't have patrickeast around this time to ask as he had one of the examples
17:45:18 <mriedem> johnthetubaguy: so then nova-api and nova-compute are creating attachments?
17:45:25 <mriedem> i don't really like that
17:45:35 <ildikov> johnthetubaguy: not having exceptions is what I favor too
17:45:45 <johnthetubaguy> oh, true, they are in this model
17:46:00 <johnthetubaguy> but that would be with the server token + user token, once we get that fix in
17:46:13 <mriedem> not sure how that matters
17:46:17 <mriedem> it actually makes it more complicated
17:46:27 <mriedem> b/c then the user listing attachments on a volume doesn't see the ones that nova created
17:46:30 <mriedem> and can't clean them up
17:47:04 <johnthetubaguy> its just an RBAC thing, not an owner thing, but thats a separate conversation I guess
17:47:41 <johnthetubaguy> I don't mean RBAC, I really mean expiry in this context
17:48:02 <johnthetubaguy> so neutron port bindings have some stuff that will need a service token (setting the host), but lets back out of that
17:48:17 <johnthetubaguy> I get its simpler to just update the connector
17:48:35 <johnthetubaguy> rather than create empty, delete old one, etc
17:48:46 <johnthetubaguy> I just like disconnect always deleting the attachment
17:48:54 <johnthetubaguy> that seemed simpler
17:49:13 <johnthetubaguy> I guess they are logically the same for the cases that matter
17:49:41 <johnthetubaguy> ah, no...
17:49:53 <mriedem> disconnect is cleanup on the host via os-brick though right?
17:49:54 <johnthetubaguy> shared volume, you need to delete (or update) the attachment with the lock held
17:50:00 <johnthetubaguy> shared host connection I mean
17:50:26 <ildikov> mriedem: I believe yes, at least that's my understanding
17:50:27 <johnthetubaguy> the lock being local to host 1
17:50:41 <mriedem> it's fine to cleanup from a failed connection with the lock held,
17:50:50 <mriedem> i don't see how that messes with the cardinality of how many attachments we have
17:51:21 <johnthetubaguy> I am talking about a VM that fails after the volume is attached, and has to re try the build on a new host
17:51:45 <johnthetubaguy> although you said thats not a thing, I am sure I saw that happen... but maybe thats some crazy local patch we had
17:51:47 <mriedem> sure, before we reschedule why can't we disconnect (os-brick) from host and remove the connector from the attachment?
17:52:20 <johnthetubaguy> remove connector works, rather than delete, but then it is a special case in the volume_api
17:52:47 <ildikov> can we update as opposed to remove?
17:52:48 <mriedem> this is what we cleanup on a failed attach today https://github.com/openstack/nova/blob/a58c7e5173103755f5a60f0a3ecede77e662ada3/nova/virt/block_device.py#L299
17:53:46 <mriedem> and for vol attach we unreserve in the compute manager here https://github.com/openstack/nova/blob/a58c7e5173103755f5a60f0a3ecede77e662ada3/nova/compute/manager.py#L4731
17:53:52 <mriedem> b/c we reserved in the api
17:54:01 <mriedem> (for attach vol, not BFV)
17:54:33 <mriedem> wherever we end up, i just don't like n-api creating an attachment, n-cpu updating it, maybe deleting it and creating a new attachment on reschedule,
17:54:48 <johnthetubaguy> in my spec, unreserve doesn't exist, that just happens when you delete an attachment
17:54:52 <mriedem> i'd rather see n-api reserve the volume (no attachment created), n-cpu handles CRUD ops on attachments,
17:55:00 <mriedem> or n-api creates attachment, n-cpu updates it, that's it
17:55:21 <mriedem> yeah maybe n-cpu deletes the attachment at the end if things fail
17:55:24 <mriedem> like unreserve
17:55:43 <ildikov> except live migrate I guess where we need the two attachments?
17:56:09 <mriedem> conductor would create the 2 attachments for live migration i'd think
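The live-migration case mriedem mentions here could look roughly like this: two attachments for the same volume and instance coexisting during the migration window, with the source-side one deleted once the migration succeeds (all helper names in this sketch are hypothetical, not the actual API):

```python
# Hypothetical illustration of the two-attachment live-migration window.
volume = {'id': 'vol-1', 'attachments': []}


def create_attachment(volume, instance_id, connector=None):
    attachment = {
        'id': 'attach-{}'.format(len(volume['attachments']) + 1),
        'instance_id': instance_id,
        'connector': connector,
    }
    volume['attachments'].append(attachment)
    return attachment


# Source host attachment exists before the migration starts.
old = create_attachment(volume, 'inst-1', {'host': 'src'})

# Conductor creates a second attachment for the destination host.
new = create_attachment(volume, 'inst-1', {'host': 'dest'})
assert len(volume['attachments']) == 2  # both exist during the migration

# After the migration succeeds, the source attachment is deleted.
volume['attachments'].remove(old)
assert [a['connector']['host'] for a in volume['attachments']] == ['dest']
```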
17:56:39 <mriedem> anyway, i'm sure it's in john's spec and i just need to sit down with it for awhile
17:56:54 <johnthetubaguy> so I guess I don't understand why we would want reserve/unreserve as separate calls, I much prefer them just being an attachment that's in a specific state. there must be something I am missing here
17:57:11 <mriedem> johnthetubaguy: i agree with you there
17:57:23 <mriedem> but it does bake extra logic into the cinder api,
17:57:32 <mriedem> so i'm guessing that's why cinder people are opposed to that
17:57:37 <mriedem> and by cinder people i mean hemna
17:58:11 <mriedem> johnthetubaguy: just to confirm, create_attachment w/o a host/connector == reserve for you right?
17:58:18 <johnthetubaguy> mriedem: yeah
17:58:19 <ildikov> in the plans we don't have unreserve
17:58:32 <mriedem> and update_attachment + connector == attached
17:58:39 <mriedem> i.e. initialize_connection
17:58:40 <johnthetubaguy> yeah
17:58:47 <mriedem> ok we're on the same page then
17:59:06 <mriedem> ildikov: unreserve == delete attachment i think
17:59:38 <ildikov> mriedem: yeap, I think delete kind of contains that too in this sense
17:59:43 <johnthetubaguy> mriedem: yeah, in my head delete = unreserve + terminate connection (on cinder side, if required)
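The old-to-new correspondence the participants just confirmed, written out explicitly (this is their working mapping from the discussion, not a finalized interface; the new-API names are the hypothetical ones used above):

```python
# Working mapping of old Cinder attach calls to the proposed attachment API,
# as agreed in the discussion above (not a finalized interface).
OLD_TO_NEW = {
    'os-reserve': 'create_attachment (no connector) -> returns attachment uuid',
    'os-initialize_connection': 'update_attachment (with connector) -> attached',
    'os-unreserve': 'delete_attachment',
    'os-terminate_connection + os-detach': 'delete_attachment',
}

for old, new in OLD_TO_NEW.items():
    print('{:40s} -> {}'.format(old, new))
```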
18:00:05 <ildikov> mriedem: as the current reserve creates the empty attachment it kind of covers what you described above
18:00:10 <mriedem> i hope the spec has a mapping of old to new world things :)
18:00:22 <mriedem> so we can talk the same language for people not involved in this for the last year
18:00:22 <johnthetubaguy> my one tries to do that
18:00:45 <mriedem> we're out of time
18:00:49 <johnthetubaguy> https://review.openstack.org/#/c/373203/13/specs/ocata/approved/cinder-new-attach-apis.rst@86
18:00:55 <ildikov> mriedem: we might want to think about what the volume state is called there though...
18:01:11 <mriedem> johnthetubaguy: tops
18:01:23 <mriedem> is that the correct british thing to say?
18:01:28 <johnthetubaguy> :)
18:01:37 <ildikov> LOL :)
18:02:03 <johnthetubaguy> we say lots of random rubbish to mean the same thing
18:02:18 <ildikov> johnthetubaguy: mriedem: should we move the discussion to the Nova spec regarding the edge cases?
18:02:19 <mriedem> that's english in general i believe
18:02:24 <mriedem> ildikov: yes
18:02:32 <johnthetubaguy> yeah, I am good with that
18:02:34 <ildikov> johnthetubaguy: mriedem: and we can argue on the Cinder API on the Cinder spec
18:03:04 <johnthetubaguy> that sounds logical, I need to look at that update
18:04:22 <ildikov> mriedem: if you can get to the remove check_attach patch before Thanksgiving that would be great as I can update the code while you're enjoying turkey :)
18:05:32 <ildikov> johnthetubaguy: cool, I'm trying my best to keep it in sync with the code and discussions
18:06:03 <johnthetubaguy> ildikov: awesome, thank you!
18:06:10 <ildikov> johnthetubaguy: we would like to get the Cinder part merged as soon as possible, so if you can raise any concerns on the spec or the API part of the code that would be great
18:06:47 <johnthetubaguy> honestly, its hard to agree to the API before understanding if it covers all the cases we need
18:06:47 <johnthetubaguy> having said that, it feels like we are really quite close now
18:07:42 <ildikov> yeap, and if we can keep things simple like mriedem described, creating and updating the attachment where needed, then we should be good to go
18:08:13 <ildikov> like having the basic API calls on the Cinder side and playing Lego with them in Nova as we need sounds like the way to go I think
18:08:21 <mriedem> as long as the cinder api is close and uses microversions we can tweak it over time too
18:08:38 <johnthetubaguy> being able to reset the host connection is probably the simplification bit thats useful
18:08:41 <johnthetubaguy> mriedem: good point
18:09:18 <ildikov> mriedem: it does, scottda is our person on the microversions part when we get there
18:09:22 <mriedem> anyway, times up, i've got to move on here, i'll try to review all things under the sun and not sleep the rest of this week
18:09:40 <ildikov> mriedem: thanks in advance
18:09:54 <ildikov> mriedem: have a nice Thanksgiving!
18:10:02 <mriedem> thanks
18:11:21 <ildikov> all right, see you all next week!
18:11:31 <ildikov> and on the reviews in the meantime :)
18:11:38 <ildikov> thanks and have a nice rest of the day!
18:11:52 <ildikov> #endmeeting