14:00:11 <edleafe> #startmeeting nova_scheduler
14:00:12 <openstack> Meeting started Mon Oct  9 14:00:11 2017 UTC and is due to finish in 60 minutes.  The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:15 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:20 <efried> \o
14:00:22 <cdent> o/
14:00:22 <edleafe> #link Meeting Agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler
14:00:23 <alex_xu> o/
14:01:01 <edleafe> Let's wait a minute for more people to arrive
14:01:03 <mriedem> o/
14:02:21 <jaypipes> hola
14:02:30 * jaypipes caffeinated
14:03:01 <edleafe> #topic Specs
14:03:33 <edleafe> Lots merged last week
14:03:42 <edleafe> 3 left that I know of:
14:03:43 <edleafe> #link Granular Resource Request Syntax https://review.openstack.org/510244
14:03:46 <edleafe> #link Add spec for symmetric GET and PUT of allocations https://review.openstack.org/#/c/508164/
14:03:49 <edleafe> #link Support traits in the Ironic driver https://review.openstack.org/#/c/507052/
14:03:58 <edleafe> Any comments on these?
14:04:17 <efried> That first one I just added on Friday.  It's introducing the numbered-group syntax in GET /allocation_candidates
14:04:27 <jaypipes> I'll review each this morning. the symmetric one is a no brainer I think...
14:04:41 <mriedem> https://review.openstack.org/#/c/507052/ is approved, just won't merge because of the depends-on
14:04:56 <jaypipes> the Granular one was just submitted by efried on Friday. it's the one about requesting multiple distinct "subrequests" of resources/traits
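For illustration only (not taken from the spec text): a rough sketch of the numbered-group syntax jaypipes and efried describe, where each numeric suffix groups resources and traits that must be satisfied by a single resource provider. The parameter spellings here are assumptions; the review will settle the real ones.

    # Hypothetical GET /allocation_candidates query using numbered groups.
    from urllib.parse import urlencode

    params = {
        # un-numbered group: may be satisfied across the provider tree
        'resources': 'VCPU:2,MEMORY_MB:2048',
        # group 1: one VF that must come from a provider with this trait
        'resources1': 'SRIOV_NET_VF:1',
        'required1': 'CUSTOM_PHYSNET_PUBLIC',
        # group 2: a second, distinct VF on a different physnet
        'resources2': 'SRIOV_NET_VF:1',
        'required2': 'CUSTOM_PHYSNET_PRIVATE',
    }
    url = '/allocation_candidates?' + urlencode(params)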
14:04:59 <cdent> there’s a mild issue in the symmetric one that I’ve just discovered while doing the implementation:
14:05:14 * bauzas waves
14:05:19 <cdent> when we GET there’s no project_id and user_id in the response, but we require that on the PUT. Do we care?
14:05:39 <cdent> I’ll highlight it when I commit, and we can discuss it on the review.
14:05:41 <jaypipes> cdent: probably should be made consistent...
14:06:56 <cdent> an aspect of making it consistent is that it kind of assumes they will stay the same, which may be too big an assumption
14:07:12 <cdent> it’s easy to adjust whatever we decide
14:07:16 <jaypipes> cdent: agreed
14:07:24 <mriedem> getting the info about the current project/user is fine,
14:07:35 <mriedem> doesn't mean the PUT has to be the same, but i don't know of a case where they wouldn't be the same
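For illustration (not from the meeting): the asymmetry cdent is describing, sketched as the two payloads with placeholder UUIDs. The field layout follows the pre-change API only roughly and is not authoritative.

    # What GET /allocations/{consumer_uuid} returns today (roughly):
    get_response = {
        'allocations': {
            'RP_UUID': {'resources': {'VCPU': 2, 'MEMORY_MB': 1024}},
        },
    }

    # What PUT /allocations/{consumer_uuid} requires (roughly):
    put_body = {
        'allocations': [
            {'resource_provider': {'uuid': 'RP_UUID'},
             'resources': {'VCPU': 2, 'MEMORY_MB': 1024}},
        ],
        'project_id': 'PROJECT_UUID',   # required on PUT...
        'user_id': 'USER_UUID',         # ...but absent from the GET response
    }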
14:08:47 <edleafe> #topic Reviews
14:08:56 <edleafe> #link Nested RP series starting with: https://review.openstack.org/#/c/470575/
14:08:59 <edleafe> There was one question attached to this in the agenda:
14:09:01 <edleafe> Debate: should the root_provider_uuid be reported in the GET /resource_providers response?
14:09:16 <efried> So my vote is yes.
14:09:21 <edleafe> Someone had concerns about this a while back - anyone remember why?
14:09:31 <efried> edleafe Heh, jaypipes said it was you :)
14:09:44 <edleafe> efried: yeah, I think he's mis-remembering
14:10:07 <jaypipes> very possible
14:10:18 <efried> Okay.  My take is that I want to be able to look at a given RP and get the whole tree for that RP in one step.
14:10:31 <efried> With parent but not root, I have to walk the whole tree up.
14:10:42 <jaypipes> edleafe, efried: I can certainly add it back in if the group votes for that. not a difficult change at all.
14:10:47 <efried> With the root ID, I can just call with ?tree=root and I'm done.
14:11:08 <edleafe> makes sense to me
14:11:14 <jaypipes> efried: well, you could also just do ?tree=<some_provider_uuid> and the backend can query on root.
14:11:25 <jaypipes> efried: meaning there's no reason to expose the attribute.
14:12:01 <efried> if it works that way, I'd be cool with that.  But I still think there's reason to have the root.
14:12:16 <jaypipes> like I said, I'm cool putting it in
14:12:19 <efried> Thinking about the scheduler using it to figure out e.g. the compute host.
14:12:28 <jaypipes> ack
14:12:32 <jaypipes> ok, let's vote...
14:12:43 <jaypipes> #vote expose the root
14:12:47 <jaypipes> sounds kinky.
14:12:55 <bauzas> startvote FTW
14:13:16 <edleafe> Simpler: anyone opposed?
14:13:20 <jaypipes> well, let's do it this way... does anyone NOT want to expose the root?
14:13:21 <cdent> It was likely me that was opposed originally because it seemed an unnecessary detail and I was trying to limit the growth of attributes in the representation
14:13:25 <jaypipes> edleafe: heh, jinx
14:13:26 <edleafe> jinx
14:13:29 <jaypipes> lol
14:13:44 <bauzas> seriously? I dunno
14:13:54 <cdent> but at this stage, given the extent of hairiness that nested is looking like it is going to become, I don’t reckon it matters
14:13:57 <bauzas> use a coin ?
14:14:00 <cdent> there’s going to be a lot of hair
14:14:03 <cdent> so I’d say go for it
14:14:18 <jaypipes> bauzas: what say you?
14:14:20 <bauzas> I don't think it hurts
14:14:24 <edleafe> I don't hear anyone saying no, so...
14:14:28 <edleafe> #agreed Add root provider uuid to GET /resource_providers
14:14:33 <jaypipes> dansmith, mriedem: any thoughts?
14:14:45 <bauzas> jaypipes: I meant we should flip a coin
14:14:49 <bauzas> for deciding
14:14:53 <bauzas> but meh
14:15:08 <dansmith> I'd have to read back
14:15:15 <bauzas> just a silly, untranslatable (and unbearable) French attempt at a joke
14:15:23 <jaypipes> bauzas: :)
14:15:32 <edleafe> jaypipes: anything else on the nested RP series to discuss now?
14:15:36 <mriedem> so we're talking about exposing something when we don't have a use case to use it?
14:15:39 <mriedem> or a need to use it yet?
14:15:44 <bauzas> I think the spec is pretty rock solid
14:16:07 <bauzas> mriedem: we have one approved spec that would use nested RPs
14:16:09 <jaypipes> mriedem: no, there's definitely a use case for it.
14:16:33 <bauzas> oh, the root UUID ?
14:16:35 <bauzas> well, meh
14:16:41 <jaypipes> mriedem: it's something that *could* be derived by the caller though. in other words, it just makes life a little easier for the scheduler code.
14:16:52 <bauzas> lemme say something terrible
14:17:07 <bauzas> just pass a parameter telling whether we should return it
14:17:09 <bauzas> tadaaaaaaa
14:17:40 <dansmith> um
14:17:41 <bauzas> so, honestly, I don't care and like I said, it doesn't hurt
14:17:49 <mriedem> given i don't have context on how the scheduler code is going to look with or without it, i can't really say
14:17:56 <mriedem> if it makes the scheduler client code better, then sure, throw it in
14:18:04 <bauzas> it's not a performance problem, right?
14:18:10 <dansmith> I don't understand why we wouldn't if we have the data
14:18:10 <bauzas> so, should we really care of that?
14:18:37 <mriedem> yeah, less rebuilding of the tree client-side is the way to go
14:18:41 <jaypipes> bauzas: no, nothing perf related
14:19:05 <jaypipes> ok, it's settled then, let's move on.
14:19:09 <efried> I'll update the review.
14:19:15 <jaypipes> danke
14:19:15 <edleafe> jaypipes: again, anything else on the nested RP series to discuss now?
14:19:24 <bauzas> jaypipes: yeah I know, so honestly not a big deal if we leak it
14:19:31 <jaypipes> edleafe: just to note that I'm rebasing the n-r-p series on the no-orm-resource-providers HEAD
14:19:47 <edleafe> jaypipes: got it
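A sketch of why exposing the root helps, under two assumptions: a keystoneauth1 Adapter for placement (see the ksa series later in the agenda), and a tree-query parameter spelled in_tree here purely as a placeholder, since the spec will pick the real name.

    def get_provider_tree(placement, rp_uuid):
        """Return every provider in the same tree as rp_uuid, plus the root(s)."""
        resp = placement.get('/resource_providers', params={'in_tree': rp_uuid})
        providers = resp.json()['resource_providers']
        # With root_provider_uuid in each record, the scheduler can identify
        # the compute host without walking parent_provider_uuid links upward.
        root_uuids = {p['root_provider_uuid'] for p in providers}
        return providers, root_uuids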
14:19:53 <edleafe> Next up:
14:19:57 <edleafe> #link Add traits to GET /allocation_candidates https://review.openstack.org/479776
14:20:23 <edleafe> alex_xu is back this week, so we should see some activity there
14:20:36 <alex_xu> yea, i
14:20:42 <alex_xu> 'm working on it
14:20:56 * alex_xu isn
14:21:00 <alex_xu> ...
14:21:26 <alex_xu> new keyboard layout...
14:21:31 <cdent> :)
14:21:32 <edleafe> alex_xu: same thing with
14:21:33 <edleafe> #link Add traits to get RPs with shared https://review.openstack.org/478464/
14:21:35 <efried> Use a Dvorak keyboard.  The ' is nowhere near the <Enter> key.
14:21:36 <edleafe> ?
14:22:50 <mriedem> i thought we were deferring shared support from queens?
14:22:56 <mriedem> why bother with api changes?
14:23:20 <mriedem> because when we start working on what the client needs for that support, we might need to change the api
14:23:36 <mriedem> or, is this totally not that and i should shut up?
14:24:03 * bauzas bbiab (kids)
14:24:12 <mriedem> yeah nevermind, this isn't what i thought it was
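For illustration (parameter spelling as proposed in the traits spec above, subject to review): the change adds a required list of trait names alongside resources in GET /allocation_candidates, so only providers exposing those traits come back as candidates.

    from urllib.parse import urlencode

    query = urlencode({
        'resources': 'VCPU:1,MEMORY_MB:512,DISK_GB:10',
        'required': 'HW_CPU_X86_AVX2,STORAGE_DISK_SSD',
    })
    url = '/allocation_candidates?' + query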
14:24:53 <edleafe> moving on
14:24:55 <edleafe> #link Allow _set_allocations to delete allocations https://review.openstack.org/#/c/501051/
14:25:05 <edleafe> cdent: anything going on with that?
14:25:19 <cdent> it’s just waiting for people to review it pretty much
14:25:53 <cdent> it’s a precursor to doing POST /allocations
14:26:05 <edleafe> Good segue
14:26:06 <edleafe> #link WIP - POST /allocations for >1 consumer https://review.openstack.org/#/c/500073/
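A sketch of the payload shape the WIP is aiming at (illustrative only; the review will settle the real format): allocations for several consumers written in one atomic request, with an empty allocations dict meaning "remove this consumer's allocations" -- which is why the _set_allocations deletion patch above is a precursor.

    payload = {
        'INSTANCE_UUID': {
            'project_id': 'PROJECT_UUID',
            'user_id': 'USER_UUID',
            'allocations': {
                'COMPUTE_RP_UUID': {'resources': {'VCPU': 1, 'MEMORY_MB': 512}},
            },
        },
        'MIGRATION_UUID': {
            'project_id': 'PROJECT_UUID',
            'user_id': 'USER_UUID',
            'allocations': {},   # cleared in the same atomic request
        },
    }
    # POST /allocations with this body succeeds or fails as a unit.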
14:26:46 <edleafe> next up
14:26:47 <edleafe> #link Use ksa adapter for placement https://review.openstack.org/#/c/492247/
14:27:10 <edleafe> efried: any comments on these? They look pretty straightforward to me
14:27:39 <efried> The base of that series is getting final reviews from mriedem at this point.
14:27:55 <efried> That patch itself should indeed be pretty straightforward.
14:28:21 <efried> And the rest of the stuff in that series doesn't have anything to do with placement/scheduler.
14:28:29 <mriedem> got the tab open
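A minimal sketch of the keystoneauth1 idiom the series moves toward, assuming a standard ksa [placement] config section; the actual option handling is whatever the review implements.

    from keystoneauth1 import adapter as ksa_adapter
    from keystoneauth1 import loading as ks_loading

    def get_placement_adapter(conf):
        auth = ks_loading.load_auth_from_conf_options(conf, 'placement')
        sess = ks_loading.load_session_from_conf_options(
            conf, 'placement', auth=auth)
        return ksa_adapter.Adapter(session=sess, service_type='placement')

    # placement = get_placement_adapter(CONF)
    # resp = placement.get('/resource_providers', raise_exc=False)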
14:28:44 <edleafe> next up
14:28:45 <edleafe> #link Migration allocation fixes: series starting with https://review.openstack.org/#/c/498950/
14:29:01 <edleafe> That series is moving along
14:29:54 <edleafe> Final review on the agenda:
14:29:56 <edleafe> #link Alternate hosts: series starting with https://review.openstack.org/#/c/486215/
14:30:17 <edleafe> I have to add versioning to the allocation_request in the Selection object
14:30:20 <edleafe> :(
14:30:31 <mriedem> jesus does that bottom change still have the s/failure/error/ comment?!
14:31:02 <edleafe> mriedem: what comment?
14:31:21 <mriedem> nvm
14:31:42 <edleafe> ok
14:32:05 <edleafe> I also need suggestions for naming the parameter added to the select_destinations() RPC call
14:32:27 <edleafe> This tells the scheduler to return the selection objects and alternates
14:32:39 <edleafe> I called it 'modern_flag' as a placeholder
14:32:46 <edleafe> let the bikeshedding begin!
14:33:18 <edleafe> Please add your thoughts to the review
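A rough sketch of the versioning edleafe mentions; the field names are placeholders, and the RPC flag name is precisely the thing up for bikeshedding. The idea is that the JSON-serialized allocation_request must carry the placement API version it was generated against so the conductor knows how to PUT it back.

    from nova.objects import base, fields

    @base.NovaObjectRegistry.register
    class Selection(base.NovaObject):
        VERSION = '1.0'
        fields = {
            'service_host': fields.StringField(),
            'nodename': fields.StringField(),
            'cell_uuid': fields.UUIDField(),
            # the chosen allocation request, serialized as JSON...
            'allocation_request': fields.StringField(nullable=True),
            # ...plus the placement microversion it was built against
            'allocation_request_version': fields.StringField(nullable=True),
        }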
14:33:22 <edleafe> Moving on
14:33:23 <edleafe> #topic Bugs
14:33:28 <edleafe> 2 new ones:
14:33:36 <edleafe> #link https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:33:44 <edleafe> #link placement server needs to retry allocations, server-side https://bugs.launchpad.net/nova/+bug/1719933
14:33:45 <openstack> Launchpad bug 1719933 in OpenStack Compute (nova) "placement server needs to retry allocations, server-side" [Medium,In progress] - Assigned to Jay Pipes (jaypipes)
14:34:05 <edleafe> This was uncovered by mriedem trying to start 1000 servers at once
14:34:24 <mriedem> which wouldn't have fixed the ultimately reason why i was hitting that, but yeah
14:34:25 <jaypipes> edleafe: yeah, I'm on that
14:34:31 <mriedem> *ultimate
14:34:40 <edleafe> cool
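The general shape of a server-side fix for that bug, sketched with stand-in names (ConcurrentUpdateDetected and the write callable are placeholders, not the actual patch): retry the allocation write inside placement when a concurrent update bumps a provider generation, rather than bouncing the conflict back to the scheduler.

    import random
    import time

    class ConcurrentUpdateDetected(Exception):
        """Stand-in for placement's generation-conflict exception."""

    def retry_on_conflict(write_allocations, max_retries=5):
        """Call write_allocations(), retrying briefly on generation conflicts."""
        for attempt in range(max_retries):
            try:
                return write_allocations()
            except ConcurrentUpdateDetected:
                if attempt == max_retries - 1:
                    raise
                # jittered backoff before re-reading generations and retrying
                time.sleep(random.uniform(0.0, 0.1) * (attempt + 1))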
14:34:44 <edleafe> The other is:
14:34:45 <edleafe> #link Evacuate cleanup fails at _delete_allocation_for_moved_instance https://bugs.launchpad.net/nova/+bug/1721652
14:34:46 <openstack> Launchpad bug 1721652 in OpenStack Compute (nova) pike "Evacuate cleanup fails at _delete_allocation_for_moved_instance" [High,Confirmed]
14:34:59 <mriedem> gibi has started a recreate for ^
14:35:18 <mriedem> https://review.openstack.org/#/c/510176/
14:36:32 <edleafe> #link Functional test for bug 1721652https://review.openstack.org/#/c/510176/
14:36:32 <openstack> bug 1721652 in OpenStack Compute (nova) pike "Evacuate cleanup fails at _delete_allocation_for_moved_instance" [High,Confirmed] https://launchpad.net/bugs/1721652
14:36:41 <edleafe> #undo
14:36:42 <openstack> Removing item from minutes: #link https://review.openstack.org/#/c/510176/
14:36:50 <edleafe> #link Functional test for bug 1721652 https://review.openstack.org/#/c/510176/
14:37:22 <edleafe> Anything else for bugs?
14:38:23 * cdent watches the pretty tumbleweeds
14:38:30 <mriedem> mr gorbachev, tear down this meeting
14:38:37 <edleafe> nope
14:38:39 <edleafe> #topic Open Discussion
14:38:55 <edleafe> Getting allocations into virt (e.g. new param to spawn). Some discussion here:
14:38:58 <edleafe> #link Getting allocations into virt http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-10-04.log.html#t2017-10-04T13:49:18-2
14:39:38 <edleafe> efried: wanna lead this?
14:40:05 <efried> Some alternatives that were (briefly) discussed:
14:40:17 <efried> Adding allocations to the request spec
14:40:24 <efried> Or elsewhere in the instance object.
14:40:47 <efried> IIRC, those were rejected because of general resistance to glomming more gorp onto those things.
14:41:04 <edleafe> Yeah, glomming gorp is bad
14:41:36 <efried> The drive for this is for virt to be able to understand comprehensively what has been requested of it.
14:41:58 <mriedem> right now it's just going to be passing shit through flavor extra specs isn't it?
14:42:00 <mriedem> unless we change that?
14:42:33 <efried> Which is limited
14:42:33 <mriedem> doesn't alex_xu have a spec for specifying traits in a flavor
14:42:38 <efried> yes
14:43:12 <edleafe> efried: so you want something more specific, no?
14:43:18 <mriedem> we agreed to limited support for stuff like vgpus in queens
14:43:23 <alex_xu> I guess efried is talking about the specific resources allocated to the instance?
14:43:28 <mriedem> what are you needing? like a complex data structure?
14:43:35 <edleafe> E.g., not just a VF, but a VF on a particular PF?
14:43:38 <efried> Right; flavor extra specs tells us what was requested generically; the allocations will tell us specific RPs etc.
14:43:50 <efried> edleafe just so.
14:44:02 <efried> mriedem Not any more complex than the allocations object :)
14:44:19 <mriedem> so, as a user, i want not only a VF, but the 3rd VF on the 4th PF?
14:44:23 * edleafe remembers when we were making clouds...
14:44:41 <efried> mriedem Or possibly just "a VF on the 4th PF".  But yeah, that's the general idea.
14:44:49 <mriedem> ew
14:44:51 <efried> Because placement is going to have allocated inventory out of a specific RP.
14:44:58 <mriedem> do we need this for queens?
14:45:06 <efried> If spawn doesn't have any way to know which one, how does it know where to take the VF from?
14:45:19 <mriedem> it's random isn't it?
14:45:27 <efried> What's random?
14:45:38 <mriedem> the PF
14:45:39 <efried> Certainly not which PF the VF comes from.
14:45:43 <efried> No, not at all.
14:45:46 <mriedem> or is that all whitelist magic?
14:45:50 <efried> Could be based on traits, inventory, etc.
14:45:54 <efried> Not even thinking about whitelist.
14:46:29 <efried> Placement narrowed it down, scheduler picked one.  Virt needs to know which one.
14:46:48 <mriedem> can't virt ask placement for the one that was picked?
14:46:59 <efried> Yes, could.
14:47:06 <efried> But I got the impression we didn't want virt to talk to placement.
14:47:18 <mriedem> it already does for reporting inventory
14:47:27 <efried> bauzas and dansmith both expressed that
14:47:31 <efried> Not directly
14:47:49 <mriedem> virt == compute service == contains the rt that reports inventory
14:47:55 <mriedem> in my head anyway
14:48:00 <dansmith> virt != compute service:)
14:48:06 <efried> "virt driver" then.
14:48:06 <dansmith> virt should not talk directly to placement, IMHO
14:48:09 <dansmith> compute should
14:48:15 <mriedem> ok,
14:48:22 <mriedem> so compute manager asks placement for the allocations for a given request,
14:48:28 <mriedem> builds those into some fancy pants object,
14:48:32 <mriedem> and passes that to the virt drive
14:48:33 <mriedem> *driver
14:48:44 <mriedem> ?
14:49:07 <efried> Didn't scheduler already give that allocations object to compute manager?
14:49:09 <dansmith> compute should provide the allocations to virt when needed, yeah
14:49:15 <mriedem> just like the neutron network API asks neutron for ports, builds network_info and passes that to spawn
14:49:30 <mriedem> efried: no
14:49:53 <efried> So it'll have to ask placement for that allocations object.  Okay.
14:49:56 <mriedem> so this essentially sounds like the same thing we do for bdms and ports
14:50:08 <mriedem> so in _build_resources you yield another new thing
14:50:13 <mriedem> and pass that to driver.spawn
14:50:17 <efried> And yeah, I guess we could funnel it into a pythonic nova object (which may eventually be an os-placement object)
14:50:29 <efried> right
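A sketch of the flow just agreed on, with placeholder names (get_allocations_for_consumer and the allocations kwarg on spawn are assumptions, not existing interfaces): the compute manager fetches the instance's allocations from placement via the report client and hands them to the virt driver alongside network_info and block_device_info.

    def get_allocations_for_spawn(reportclient, context, instance):
        """Fetch the instance's placement allocations to pass to driver.spawn.

        reportclient.get_allocations_for_consumer is a placeholder for
        whatever the scheduler report client actually exposes.
        """
        # keyed by resource provider UUID: {'resources': {'VGPU': 1, ...}}
        return reportclient.get_allocations_for_consumer(
            context, instance.uuid) or {}

    # compute manager, roughly:
    #   allocations = get_allocations_for_spawn(self.reportclient, context, instance)
    #   self.driver.spawn(context, instance, image_meta, injected_files,
    #                     admin_password, allocations=allocations,
    #                     network_info=network_info,
    #                     block_device_info=block_device_info)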
14:50:32 <mriedem> oo we're already talking about new libraries?!
14:50:40 <mriedem> :P
14:50:41 <efried> When we split placement out into its own thing?
14:50:49 <efried> Sorry, don't mean to muddy the waters.
14:51:13 <mriedem> ok so in the Slime release...
14:51:36 <mriedem> anyway, i think you get the general idea of what the compute would do, yeah?
14:52:00 <efried> You're saying this isn't something we want to do in Queens?
14:52:00 <mriedem> is there a specific bp that is going to need this?
14:52:15 <mriedem> there are things we can want to do, and things we can actually get done
14:52:34 <efried> Well, I don't see how e.g. the vGPU thing is going to work without it.
14:52:37 <mriedem> i'm trying to figure out what we actually need to get done so we can focus on those first
14:52:52 <efried> Unless we bridge the gap by having the virt driver ask placement for the allocations.
14:53:08 <mriedem> is there any poc up yet for that?
14:53:19 <mriedem> maybe the xen team hasn't gotten that far?
14:53:21 <efried> For vGPU?
14:53:23 <mriedem> yeah
14:53:41 <dansmith> bauzas was going to be working on this
14:53:42 <mriedem> anyway, maybe it will be needed, but i'd check with the other people working on this too
14:53:43 <efried> Wasn't there a big stack with mdev in libvirt?
14:53:52 <dansmith> providing the allocation to virt so we could do that
14:53:53 <dansmith> however,
14:53:54 <mriedem> the totally separate effort?
14:54:00 <dansmith> we can use the flavor for right now and move on
14:54:33 <efried> dansmith And accept that the virt driver may pick a different PF than that from which placement allocated the inventory?
14:54:50 <efried> And have the virt driver duplicate the logic to check for traits?
14:54:51 <dansmith> efried: placement isn't picking PFs right now
14:54:51 <mriedem> efried: so how about you follow up with the xen team and see what they had in mind for this,
14:55:25 <efried> placement is picking specific RPs.  Depending how the RPs are modeled, those could be PFs.  Just using PFs as a general example.
14:55:30 <dansmith> efried: it's just picking "has a vgpu" which means virt can easily grab the first free one and do that thing
14:55:40 <efried> Unless traits.
14:55:55 <dansmith> efried: we don't have nrps, which means it's not picking traits
14:56:06 <dansmith> er, picking PFs,
14:56:08 <efried> All of that is landing in Queens, early.
14:56:13 <efried> at least in theory.
14:56:15 <dansmith> but also means no multiples, so traits are irrelevant
14:56:28 <efried> Also hopefully landing in Queens.
14:56:35 * bauzas is back
14:56:41 <dansmith> efried: yeah, in theory and we're working on it, but we can easily land a flavor-based thing right now and have that as a backup if we don't get NRPs or something else blocks us
14:56:43 <dansmith> it's trivial
14:57:07 <edleafe> 3 minutes to go
14:57:17 <efried> Let me ask it this way: does putting allocations in a spawn param need a blueprint?
14:57:20 <dansmith> if we linearize everything, something is definitely going to miss queens
14:57:26 <dansmith> efried: not IMHO
14:58:08 <efried> Cool.  Then if someone gets the bandwidth to propose a patch, and it doesn't seem too heinous, it could happen.
14:58:09 <dansmith> efried: the thing I'm worried about is that if we go the allocation route,
14:58:27 <dansmith> you have to build a big matrix of rp_uuids to actual devices and figure out how to do all that accounting before you can do the basic thing
14:58:39 <dansmith> however, if we just assume one set of identical gpus per node with flavor right now,
14:58:45 <dansmith> you can get basic support in place
14:59:08 <dansmith> if we rabbit-hole on this after NRPs are done, we could likely miss queens and bauzas will be taken to the gallows
14:59:10 <efried> dansmith Sure, fair point.  That matrix of RP UUIDs to devices is something that's going to have to happen.
14:59:17 <dansmith> efried: totes
14:59:35 <dansmith> efried: but let's not hamstring any sort of support on that when we can do the easy thing right now
14:59:54 <efried> Sure
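What "use the flavor for right now" looks like in practice, following the resources-in-extra-specs syntax from alex_xu's flavor spec (spelling as proposed there, subject to review): the flavor carries the generic vGPU request and the virt driver simply hands out a free vGPU of the one local type, without needing to know which specific provider placement picked.

    extra_specs = {
        'resources:VGPU': '1',
        # optionally, a trait the chosen provider must expose:
        'trait:CUSTOM_NVIDIA_GRID': 'required',
    }
    # flavor.extra_specs.update(extra_specs)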
15:00:01 <edleafe> OK, thanks everyone! Continue the discussion in -nova
15:00:01 <edleafe> #endmeeting