14:00:19 <edleafe> #startmeeting nova_scheduler
14:00:20 <openstack> Meeting started Mon Jan 29 14:00:19 2018 UTC and is due to finish in 60 minutes.  The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:24 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:46 <efried> @/
14:00:49 <alex_xu> o/
14:00:53 <ttsiouts> o/
14:00:56 <takashin> o/
14:01:04 <edleafe> Looks like efried didn't comb his hair
14:01:24 * efried is a conditioner commercial
14:01:26 <bauzas> \o
14:01:28 <cdent> i'm only vaguely here
14:02:03 <bauzas> can we please try to make this meeting short? :)
14:02:09 <bauzas> I have a ton of things to catch up :)
14:02:28 <edleafe> bauzas: sounds good
14:02:31 <edleafe> let's start then
14:02:38 <edleafe> #topic Reviews
14:02:53 <edleafe> #link Provider Tree series, starting with: https://review.openstack.org/#/c/533808/
14:02:56 <edleafe> Most of the bottom patches are +W'd
14:02:59 <edleafe> #link First provider tree patch in progress is https://review.openstack.org/#/c/537648/
14:03:02 <edleafe> Anything to add, efried?
14:03:24 <mriedem> o/
14:03:26 <efried> The weekend saw two of eight merge there.  The remaining six are In The Gate.
14:03:36 <efried> <ominous cadence>
14:03:58 <mriedem> are a bunch failing on functional job timeouts?
14:03:58 <edleafe> ok
14:04:00 <efried> I'll be cleaning up the rest of the series over the next couple of days, but of course those pieces are Rocky-bound.
14:04:14 <efried> mriedem Infra has declared no rechecks until they've finished sleuthing.
14:04:21 <mriedem> ooo...
14:04:27 <bauzas> it should no longer be called "the gate", rather "the wall"
14:04:33 <mriedem> was there something in the ML?
14:04:39 <mriedem> or just a channel message?
14:04:39 <edleafe> "the pit"
14:04:40 <efried> mriedem IRC broadcast
14:04:42 <bauzas> mriedem: see IRC topics
14:04:54 <efried> "...of despair"
14:04:54 <mriedem> ah
14:05:08 <edleafe> next up:
14:05:09 <edleafe> #link Nested RP traits selection: https://review.openstack.org/#/c/531899/
14:05:12 <edleafe> Not much recent action on this, as the series is postponed until Rocky
14:05:14 <bauzas> and https://wiki.openstack.org/wiki/Infrastructure_Status
14:05:31 <bauzas> says that we had an infra cloud provider outage, hence the very large delay
14:05:48 <edleafe> #link Singular request group traits, starting with: https://review.openstack.org/#/c/536085/
14:05:51 <edleafe> This series is also +W'd, although a cleanup patch at the end needs work
14:05:55 <bauzas> the traits API thing has been merged, right?
14:06:17 <edleafe> bauzas: which thing?
14:06:32 <alex_xu> yes
14:06:45 <bauzas> the API side
14:06:51 <alex_xu> bauzas: the traits support in allocation candidates API is merged
14:07:24 <alex_xu> the patch for supporting traits in flavor extra specs is still in the gate pipeline
14:07:24 <mriedem> https://review.openstack.org/#/c/536085/ is left
14:07:29 <mriedem> to tie things together
14:07:40 <alex_xu> yes
14:07:43 <bauzas> yeah ok, we're on the same page
14:07:46 <mriedem> alex_xu: did you see my request for a functional test?
14:07:48 <bauzas> I was meaning that
14:08:05 <alex_xu> mriedem: the patch to bump the timeout?
14:08:14 <mriedem> alex_xu: no,
14:08:17 <mriedem> on the traits stuff,
14:08:34 <mriedem> i think we should have a functional test with 2 computes and one decorated with a trait that we use via flavor extra spec to see everything works as expected
14:08:35 <mriedem> end to end
14:08:50 <bauzas> +1
14:09:03 <alex_xu> mriedem: yea, I will add one, gibi_ wants that also
14:09:16 <mriedem> thanks. i can't remember which patch i mentioned that in.
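(A minimal sketch of the mechanism under discussion, for context: how a flavor extra spec like "trait:CUSTOM_GOLD=required" feeds into the required= query parameter of placement's GET /allocation_candidates. The helper names here are illustrative, not nova's actual functions.)

```python
# Hedged sketch: translate flavor extra specs of the form
# "trait:<NAME>": "required" into the query string for
# GET /allocation_candidates. Function names are hypothetical.

def parse_trait_specs(extra_specs):
    """Collect required trait names from flavor extra specs."""
    required = []
    for key, value in extra_specs.items():
        if key.startswith('trait:') and value == 'required':
            required.append(key[len('trait:'):])
    return sorted(required)


def allocation_candidates_query(resources, extra_specs):
    """Build the query string for GET /allocation_candidates."""
    parts = ['resources=' + ','.join(
        '%s:%d' % (rc, amt) for rc, amt in sorted(resources.items()))]
    traits = parse_trait_specs(extra_specs)
    if traits:
        parts.append('required=' + ','.join(traits))
    return '&'.join(parts)
```

The functional test mriedem asks for would exercise this end to end: two computes, one decorated with the trait, and a boot request whose flavor carries the trait:* extra spec, verifying the instance lands on the decorated host.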
14:09:27 <edleafe> Moving on...
14:09:28 <edleafe> #link Granular resource requests, starting with: https://review.openstack.org/#/c/517757/
14:09:31 <edleafe> Still WIP; to be completed in Rocky
14:09:52 <edleafe> Next:
14:09:53 <edleafe> #link Remove microversion fallback: https://review.openstack.org/#/c/528794/
14:09:56 <edleafe> Simple enough; caught in Zuul hell
14:10:03 <edleafe> Needs some +2s, also
14:10:30 <edleafe> Last on the agenda:
14:10:31 <edleafe> #link Use alternate hosts for resize: https://review.openstack.org/#/c/537614/
14:10:34 <edleafe> The main patch finally merged! All that's left is this follow-up unit test
14:10:39 <bauzas> honestly, I think we need to identify what we can still land for Rocky and what can be postponed
14:10:50 <bauzas> I know mriedem made an etherpad for that
14:10:56 <edleafe> s/Rocky/Queens
14:11:05 <bauzas> because I don't know when the gate problems will be fixed
14:11:10 <bauzas> heh, oops, yes
14:11:29 <bauzas> we can't just recheck the entire world
14:11:40 <mriedem> https://etherpad.openstack.org/p/nova-queens-blueprint-status is what i'm still tracking for queens
14:11:40 <edleafe> We're OpenStack - we can do anything!
14:11:52 <bauzas> (and I say that as someone who also has one important change blocked in the gate)
14:12:30 <bauzas> mriedem: yup, I was talking of that etherpad
14:12:55 <edleafe> Is there anything on that etherpad we need to discuss? Seems straightforward to me
14:13:05 <mriedem> no i don't think so,
14:13:13 <mriedem> i think all the approved NRP stuff is merged,
14:13:16 <mriedem> so the rest goes to rocky
14:13:24 <mriedem> i'll probably put a procedural -2 on the bottom change in the series
14:13:44 <mriedem> oh nvm i see a bunch is approved now
14:13:45 <efried> The bottom unapproved one, you mean
14:14:16 <efried> https://review.openstack.org/#/c/537648/ is the first unapproved in the series.
14:14:20 <mriedem> yeah
14:14:53 <mriedem> question
14:14:59 <mriedem> is there an 'end' patch in this series yet
14:14:59 <mriedem> ?
14:15:09 <mriedem> like, when do we say a user can use this stuff
14:15:23 <efried> https://review.openstack.org/#/c/520246/
14:15:31 <efried> if by "user" you mean "virt driver"
14:15:50 <efried> And "can use" in the sense that they can use update_provider_tree, but they can't make use of nested providers therein.
14:15:58 <mriedem> i mean,
14:16:04 <mriedem> an API user or operator,
14:16:12 <mriedem> modeling server create requests that rely on NRP
14:17:15 <efried> Actual hierarchical provider models will require 1) the above (https://review.openstack.org/#/c/520246/), 2) Jay's series on NRP in alloc cands, and 3) a virt driver impl of update_provider_tree.
14:17:16 <bauzas> mriedem: the theory is that a good customer use case would be the VGPU stuff
14:17:20 <efried> None of those things are making Q.
14:17:31 <efried> But are very close.
14:17:48 <efried> Well, somewhat close.  At least already started.
14:17:49 <bauzas> mriedem: ie. xen folks providing a nested inventory so that a server boot asking for VGPU in the flavor would use that
14:18:13 <bauzas> that's the only use case I'm aware of in the foreseeable future
14:18:23 <mriedem> vgpu and xen reminds me, there are open xen vgpu patches, so what's the support statement for the xen driver wrt gpu in queens?
14:18:25 <bauzas> because PCI things are way behind that in terms of schedule
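(A small illustrative sketch of the nested-provider shape bauzas describes: the compute node as root resource provider with a physical GPU as a child provider holding VGPU inventory. The classes here are stand-ins, not placement's actual data model or nova's ProviderTree API.)

```python
# Hedged sketch of a nested resource provider tree. A real placement
# deployment stores this server-side; this just shows the shape.

class Provider:
    """Illustrative resource provider with optional parent/children."""
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.inventory = {}  # resource class -> total
        if parent is not None:
            parent.children.append(self)


def tree_inventory(root):
    """Sum inventory over a provider and all of its descendants."""
    totals = dict(root.inventory)
    for child in root.children:
        for rc, total in tree_inventory(child).items():
            totals[rc] = totals.get(rc, 0) + total
    return totals


# Compute node root, with a pGPU child exposing VGPU inventory,
# roughly matching the xen VGPU use case mentioned above.
compute = Provider('compute1')
compute.inventory = {'VCPU': 16, 'MEMORY_MB': 32768}
pgpu = Provider('compute1_pGPU_0', parent=compute)
pgpu.inventory = {'VGPU': 4}
```

A server boot asking for VGPU in the flavor would then allocate against the child provider while VCPU and MEMORY_MB come from the root.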
14:18:33 <mriedem> https://review.openstack.org/#/q/topic:bp/add-support-for-vgpu+status:open
14:19:03 <mriedem> does the xen driver have the same basic support for vgpu as the libvirt driver? and the remaining patches are future work for using NRP?
14:19:12 <bauzas> mriedem: AFAICT, they started to play with nested RPs
14:19:15 <efried> 1) is a few lines of test code from being ready.  2) is fairly far along.  3) gpu stuff has been written based on WIPs of #1, but is obviously blocked by that, and is still pretty nascent afaik
14:19:26 <efried> yes
14:19:40 <bauzas> mriedem: yes, basic feature parity, except they haven't tested all the server actions AFAICS
14:19:58 <mriedem> ok thanks both; bauzas we should have a patch for a feature support matrix entry for vgpu i think
14:20:11 <bauzas> mriedem: yup, I saw your comment and I agree
14:20:12 <mriedem> but now i'm derailing this meeting and will stop
14:20:25 <bauzas> +1
14:20:43 <bauzas> (and I asked for a quick meeting - pretty unfair if I chat too much)
14:21:15 <edleafe> bauzas: heh
14:21:30 <edleafe> Any other reviews to discuss?
14:22:06 <edleafe> ok then
14:22:06 <edleafe> #topic Bugs
14:22:07 <edleafe> #link Placement bugs: https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:22:11 <edleafe> Nothing new this week; anyone have anything to say about bugs?
14:22:38 <mriedem> we got'em
14:22:49 <mriedem> just keep an eye out for new bugs each day until RC1
14:22:51 <bauzas> just that I'll do some triage
14:23:09 <edleafe> Sounds good
14:23:11 <edleafe> #topic Open Discussion
14:23:16 <mriedem> ooo i have something
14:23:21 <edleafe> go for it
14:23:23 <mriedem> http://lists.openstack.org/pipermail/openstack-dev/2018-January/126653.html
14:23:25 <mriedem> #link http://lists.openstack.org/pipermail/openstack-dev/2018-January/126653.html
14:23:33 <mriedem> just a simple thing i POC'ed over the weekend,
14:23:40 <mriedem> to put the driver.capabilities as traits on the compute node RP
14:24:04 <mriedem> i think this would be useful for scheduling things that depend on driver capabilities, like multiattach and tagged attach,
14:24:14 <mriedem> which are things today that if we pick the wrong compute, we fail and don't reschedule
14:24:31 <mriedem> and also doesn't require a full blown capabilities API
14:24:33 <bauzas> mriedem: I saw your email but I was waiting for you to be awake
14:24:52 <bauzas> mriedem: because it looks like I wasn't thinking of the same thing as you when I was looking at a "capabilities API"
14:25:08 <efried> First blush I like the idea.
14:25:12 <bauzas> mriedem: in your email, you mention exposing the CPU features and other nasty bits, right?
14:25:35 <efried> IMO that should be done anyway, but is unrelated to driver capabilities.
14:25:46 <bauzas> oh, sorry
14:25:49 <bauzas> misunderstood
14:25:54 <cdent> efried: yeah, my thought too
14:26:02 <mriedem> bauzas: no
14:26:03 <bauzas> the proposal is to use traits for exposing what you can do for a compute
14:26:13 <bauzas> correct?
14:26:21 <mriedem> i've always wanted a way to take what's in the driver.capabilities dict,
14:26:25 <mriedem> and expose that out of the rest api
14:26:30 <mriedem> this is an easy way to do that
14:26:55 <efried> In the sense of "has image cache" or "supports quiesce".  Not in the sense of "HW_CPU_X86_AVX".
14:27:01 <mriedem> correct
14:27:09 <bauzas> gotcha
14:27:29 <mriedem> we could have stored these along with the compute_nodes table and written a capabilities subresource API on os-hypervisors,
14:27:43 <mriedem> but a new API just seems like it would get stuck in committee
14:28:09 <mriedem> anyway, it was just an idea, thrown out there for discussion
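(A hedged sketch of the idea mriedem floats above: report truthy entries from a virt driver's capabilities dict as traits on the compute node resource provider. The CUSTOM_ prefix follows placement's custom-trait naming rules; the exact trait names nova would settle on are an assumption here, not a merged implementation.)

```python
# Hedged sketch: map driver.capabilities booleans such as
# "supports_multiattach" to custom trait names. Placement custom
# traits must match CUSTOM_[A-Z0-9_]+, hence the prefix/uppercase.

def capabilities_to_traits(capabilities):
    """Return the set of trait names for enabled capabilities."""
    traits = set()
    for cap, enabled in capabilities.items():
        if enabled:
            traits.add('CUSTOM_' + cap.upper())
    return traits
```

The scheduler could then require these traits for requests that depend on a capability (e.g. multiattach or tagged attach), avoiding the pick-wrong-compute-then-fail-without-reschedule problem described above.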
14:28:10 <efried> I mean, I get that there could be some confusion, because we're mixing virt driver traits with hardware traits on the same resource provider.
14:28:28 <mriedem> not all traits on a RP are going to be hw traits, are they?
14:28:33 <mriedem> that's a bit limited in scope
14:28:50 <mriedem> ages ago we talked about hypervisor version as a trait
14:29:26 <cdent> presumably an rp should expose any traits which are useful in selection
14:29:32 <efried> Yeah, I'm not saying we shouldn't do it.  I'm just saying muddling the distinction between "the box" and "the virt driver"...
14:29:37 <cdent> traits are cheap and designed to be cheap
14:30:11 <efried> We *could* make a dummy RP with no inventory that represents the virt driver.  Put it... really anywhere in the tree.
14:30:21 <efried> But to cdent's point, RPs are more expensive than traits.
14:30:30 <mriedem> yeah that seems overly complicated to me
14:30:34 <efried> Agree.
14:30:35 <edleafe> efried: yeah, I was just going to say that
14:30:54 <efried> reductio ad absurdum and all that.
14:31:06 <mriedem> soon we'll have an SDD for modeling this all
14:31:07 <edleafe> and "supports foo" is better than "version 1.23"
14:31:14 <efried> ++
14:31:39 <bauzas> sorry, was on another discussion
14:31:42 * mriedem gets lost down memory lane http://docs.oasis-open.org/sdd/v1.0/os/sdd-spec-v1.0-os.html
14:31:48 <bauzas> but I think it's a doable way
14:32:05 * cdent demerits mriedem for making a reference to oasis
14:32:06 <mriedem> i'll throw it in the PTG etherpad and we can discuss there, and move on here
14:32:23 <bauzas> what's next to talk about? TOSCA support?
14:32:32 <mriedem> mmm https://www.dmtf.org/standards/cim
14:32:46 <edleafe> OK, anything else for opens?
14:33:26 <edleafe> OK, everyone - back to whatever it was you were doing before.
14:33:29 <edleafe> #endmeeting