14:02:25 <bauzas> #startmeeting nova_scheduler
14:02:25 <openstack> Meeting started Mon Feb 22 14:02:25 2016 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:29 <openstack> The meeting name has been set to 'nova_scheduler'
14:02:31 <bauzas> #chair edleafe
14:02:32 <openstack> Current chairs: bauzas edleafe
14:02:33 <bauzas> nah
14:02:45 <bauzas> #chair cdent
14:02:46 <openstack> Current chairs: bauzas cdent edleafe
14:02:52 <bauzas> because he named me
14:02:57 <cdent> heh
14:02:58 <edleafe> chairs al around!
14:03:01 <edleafe> all
14:03:07 <bauzas> supp' ?
14:03:18 <edleafe> is there an agenda prepared?
14:03:27 <bauzas> heh, guessing n0ano's one
14:03:32 <bauzas> bugs, features and open
14:03:34 <bauzas> so
14:03:44 <bauzas> #topic bugs (because we don't like'em)
14:03:53 <bauzas> so?
14:04:01 <bauzas> anything to notice ?
14:04:22 <edleafe> none that I recall
14:04:25 <bauzas> AFAIK, there was some initiative from n0ano to find some Intel/RAX folks to help us
14:04:32 <bauzas> but I haven't heard more than that
14:04:51 * bauzas is checking the bug list
14:05:29 <bauzas> #info https://bugs.launchpad.net/nova/+bugs?field.tag=scheduler
14:05:35 <Yingxin> I'll try to fix 1523450, 1523459, 1523506, 1515870(1517770)
14:05:59 <bauzas> ok, the last triaged bug is 16 years old
14:06:02 <bauzas> oops
14:06:08 <bauzas> s/years/days :D
14:06:26 <bauzas> Yingxin: cool, ping us anytime if you need further help or guidance
14:06:32 <Yingxin> https://bugs.launchpad.net/nova/+bug/1523506 I don't know whether it is actually a bug to fix.
14:06:32 <openstack> Launchpad bug 1523506 in OpenStack Compute (nova) "hosts within two availability zones" [Undecided,Incomplete] - Assigned to Yingxin (cyx1231st)
14:07:04 <bauzas> Yingxin: okay, I'll look into that one
14:07:06 <edleafe> Do we have more detail on what is needed for https://bugs.launchpad.net/nova/+bug/1431291 ?
14:07:06 <openstack> Launchpad bug 1431291 in OpenStack Compute (nova) "Scheduler Failures are no longer logged with enough detail for a site admin to do problem determination" [High,Incomplete] - Assigned to Pranav Salunke (dguitarbite)
14:07:06 <jaypipes> Yingxin: good evening!
14:07:18 <bauzas> Yingxin: but I fixed most of the races like 1.5yrs ago
14:07:30 <Yingxin> jaypipes: good evening~
14:07:51 <Yingxin> bauzas: I think I've found another one :P
14:07:57 <bauzas> Yingxin: ping me tomorrow morning EU if you wish and I'll triage https://bugs.launchpad.net/nova/+bug/1523506
14:07:57 <openstack> Launchpad bug 1523506 in OpenStack Compute (nova) "hosts within two availability zones" [Undecided,Incomplete] - Assigned to Yingxin (cyx1231st)
14:08:13 <bauzas> Yingxin: interesting, but I doubt :p
14:08:18 <Yingxin> bauzas: ok
14:08:43 <bauzas> edleafe: well, that one is Incomplete, so... :D
14:09:07 <_gryf> i've been working on that one: https://bugs.launchpad.net/nova/+bug/1442024 wasn't able to reproduce it, the scenario and all the steps i've performed are in a comment. no one has complained so far
14:09:07 <openstack> Launchpad bug 1442024 in OpenStack Compute (nova) "AvailabilityZoneFilter does not filter when doing live migration" [Medium,Invalid] - Assigned to Roman Dobosz (roman-dobosz)
14:09:17 <bauzas> edleafe: see https://bugs.launchpad.net/nova/+bug/1431291/comments/22
14:09:17 <openstack> Launchpad bug 1431291 in OpenStack Compute (nova) "Scheduler Failures are no longer logged with enough detail for a site admin to do problem determination" [High,Incomplete] - Assigned to Pranav Salunke (dguitarbite)
14:09:27 <edleafe> bauzas: exactly. What would it take to give ops a good enough understanding?
14:09:52 <bauzas> edleafe: my point is that we need actionable items and that bug report doesn't have any
14:09:57 <edleafe> We enhanced the logging - what else are they asking for?
14:09:58 <bauzas> so, it's incomplete
14:10:09 <bauzas> yeah, we can leave that one rest in peace IMHO
14:10:17 <bauzas> it's an invalid, you don't need to care about it
14:10:28 <edleafe> But it was a High, too
14:10:40 <bauzas> and ? :)
14:10:54 <bauzas> anywaty
14:11:06 <bauzas> _gryf: cool, thanks for helpiugn
14:11:09 <bauzas> moving on ?
14:11:18 * bauzas has fat fingers today
14:11:46 * edleafe gets bauzas a keyboard with bigger keys
14:11:54 <bauzas> I'll ask for AZERTY
14:11:58 <bauzas> anyway
14:12:11 <bauzas> #topic features and blueprints (because we like'em)
14:12:30 <bauzas> so big thread here
14:12:42 <bauzas> who shoots first?
14:13:13 <jaypipes> bauzas: i can provide a quick update on resource-providers progress.
14:13:18 <bauzas> \o/
14:13:22 <bauzas> jaypipes: shoot
14:13:45 <bauzas> btw. lemme put your ML report here
14:14:11 <bauzas> #info http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
14:14:21 <bauzas> jaypipes: you got the mic
14:14:23 <jaypipes> bauzas: I pushed up a new revision of the generic-resource-pools blueprint that changes the expected schema slightly (removes the resource_pools table and adds a couple fields to the resource_providers table).
14:14:37 <bauzas> jaypipes: yeah saw that one, it's in my pipe
14:15:01 <jaypipes> bauzas: It also had changes to remove the external_id field and forces the use of --aggregate-uuid option in nova resource-pool-create
14:15:04 <bauzas> cdent: I guess you're modifying your series to match with that ?
14:15:12 <cdent> bauzas: yes
14:15:19 <jaypipes> this was based on discussions with superdan, alaski and cdent on Friday
14:15:20 <bauzas> jaypipes: I saw, I began to mark some notes but not uploaded yet
14:15:28 <cdent> first patch is up (to adjust the models)
14:15:50 <bauzas> jaypipes: actually, lemme see if my notes are for the current PS or not
14:16:16 <bauzas> oh, I uploaded them
14:16:21 <bauzas> https://review.openstack.org/#/c/253187/11/specs/mitaka/approved/generic-resource-pools.rst
14:16:23 <bauzas> oops
14:16:24 <bauzas> #link https://review.openstack.org/#/c/253187/11/specs/mitaka/approved/generic-resource-pools.rst
14:16:33 <jaypipes> dstepanenko continues his work on the pci-generate-stats blueprint. reviews welcome on that: https://review.openstack.org/#/q/topic:bp/pci-stats-generate,n,z
14:16:59 <_gryf> is the bp about resource pools (and implementation) at risk due to feature freeze?
14:17:20 <jaypipes> _gryf: no. I believe we will be able to complete that one.
14:17:41 <bauzas> jaypipes: I had some concerns about the increase of complexity that BP was having
14:17:43 <jaypipes> _gryf: the compute-node-inventory one is slightly at risk but we're trying our best to get most of that pushed.
14:17:53 <bauzas> it introduces a REST API
14:17:55 <jaypipes> bauzas: it actually has *less* complexity than before.
14:18:01 <bauzas> which I agree
14:18:03 <jaypipes> bauzas: yes.
14:18:16 <_gryf> jaypipes, ok, cool. if you require any help on that, just ping me on irc.
14:18:52 <bauzas> jaypipes: so for example, I was pointing out https://review.openstack.org/#/c/253187/11/specs/mitaka/approved/generic-resource-pools.rst@245
14:19:04 <bauzas> (just discovered that we can tag a specific line in a review, woot)
14:20:22 <bauzas> jaypipes: tbc, while I'm a big fan of your series, I just feel those need to be very described about what are the impact for the existing
14:20:25 <jaypipes> bauzas: so, your comment there... we *already* pull all aggregate information in the call to select_destinations().
14:20:42 <bauzas> indeed, but then we filter out
14:20:55 <bauzas> which is per-host
14:21:11 <jaypipes> I don't understand your point.
14:21:16 <bauzas> so, I need to make sure that what you want to modify is the only dummy ComputeNode.get_all() call
14:21:56 <jaypipes> bauzas: I'm not prescribing anything there other than a long-term use case to be satisfied that isn't at all what the scheduler currently supports.
14:23:41 <bauzas> jaypipes: sorry if I'm unclear or misunderstood something, I just want to understand what will change and what will stay :)
14:24:41 <jaypipes> bauzas: and I'd like to get some of these blueprints approved this cycle... I am struggling to add the level of detail you are asking for in all 6 of the blueprints in this series.
14:25:06 <jaypipes> bauzas: there comes a point when we need to be able to amend a blueprint after agreeing on the direction.
14:25:27 <bauzas> jaypipes: that's a good point
14:25:36 <edleafe> jaypipes: agreed
14:26:34 <jaypipes> bauzas: and I understand your concern around making any changes that require a refactoring of the filter scheduler.
14:27:24 <bauzas> jaypipes: again, I'm liking your direction, I'm just somehow struggling with operators impact - but we can figure that out later
14:27:51 <johnthetubaguy> we always need to think about the aim of the process, if there are details that are best delayed till you see the code, then thats fine
14:28:02 <bauzas> and cdent's patch series is worth reviewing to see the impacts
14:28:12 <jaypipes> bauzas: you are describing concerns about something that is marked as a future use-case that isn't currently supported by Nova. so the impact to operators is non-existent.
14:28:17 <bauzas> johnthetubaguy: agreed
14:29:25 <johnthetubaguy> so there are some upgrade worries around not sorting out the future use case, but at some point we need to just make some forward progress, and fix things as we go
14:29:32 <bauzas> jaypipes: okay, it seems we can discuss that offline and see how we can match
14:29:49 <jaypipes> so I'd like to address the comments from cdent and bauzas in the next revision and get that pushed ASAP (as in less than an hour). And at that point I'd like an up/down vote on it, if we could manage that.
14:30:07 <johnthetubaguy> jaypipes: sounds like a good plan
14:30:21 <cdent> ++
14:30:41 <johnthetubaguy> not to derail things, do we want to delay the scheduler API to newton at this point?
14:31:27 <jaypipes> johnthetubaguy: no.
14:31:28 <cdent> I think dansmith has some opinions on that johnthetubaguy
14:31:46 <bauzas> jaypipes: so creating a new endpoint by end of M-3 ?
14:31:54 <jaypipes> johnthetubaguy: or at least if "scheduler API" means "support for the resource-pools stuff"
14:31:59 <jaypipes> bauzas: yes
14:32:29 <bauzas> well
14:33:15 <bauzas> okay, it seems that we have a plan, moving on then ?
14:34:45 <bauzas> jaypipes: so, given that FF is in 2 weeks, it means that I need somehow to find more review time than the expected one for the next 2 weeks :)
14:35:00 <bauzas> but if you feel that's doable, then okay
14:35:26 <johnthetubaguy> so its normally at this point I -2 all blueprints that don't currently have all their code up for review, for context
14:35:40 <johnthetubaguy> but I want us to make progress here, and we should keep trying for that
14:35:44 <johnthetubaguy> so lets see what we can do
14:36:05 <bauzas> ++
14:36:18 <edleafe> I have cycles available to help out, too
14:36:30 <johnthetubaguy> the only reason I bring up the API, is I think we could get that first bit done
14:36:42 <jaypipes> johnthetubaguy: ++
14:36:43 <johnthetubaguy> but adding the API seems like a mountain too far at this point
14:37:00 <jaypipes> johnthetubaguy: I guess I disagree.
14:37:27 <jaypipes> johnthetubaguy: plus, there's zero benefit to this work if there's no REST API that things can use to create shared pools of resources.
14:38:00 <johnthetubaguy> the benefit is we can add the rest API on top without having to implement the underneath bits
14:38:30 <johnthetubaguy> honestly, I do wonder about a nova-manage hack to let folks test out the new thing, while we agree a REST API
14:39:12 <dansmith> jaypipes: the rest api bit is really only required for the things we said are newton anyway right?
14:39:41 <johnthetubaguy> I think the shared storage stuff, kinda needs it, unless you slurp into the DB via a back door
14:39:50 <jaypipes> right.
14:40:01 <jaypipes> you need some way of adding those records.
14:40:16 <dansmith> right, but in mitaka, our only providers are internal -- compute nodes
14:40:23 <johnthetubaguy> but I would rather have the back door via nova-manage than a quickly written API, and marking those calls as experimental, will be removed, etc, etc
14:40:57 <jaypipes> if that's what you want, that's fine.
14:41:17 <dansmith> johnthetubaguy: I don't even think we need that
14:41:17 <johnthetubaguy> so lets step back, if we get only internal providers sorted for mitaka, we have made a massive step forward, compared to what it looked like two months ago
14:41:20 <jaypipes> I'm just a little weary from the analysis paralysis that's happened so far.
14:43:28 <bauzas> folks, that's a very important conversation, and I feel we need to reach an agreement, but could we move that offline ?
14:43:41 <bauzas> we're 15 mins away from the end of that meeting
14:43:48 <johnthetubaguy> so this is release critical right, how do we keep moving forward on this work
14:44:20 <johnthetubaguy> cdent: do we have any blockers to get compute's using the resource provider concept in mitaka, at this point?
14:44:37 <dansmith> johnthetubaguy: just the actual work and review.. nothing major in the way, IMHO
14:44:48 <johnthetubaguy> if we drop the API, could we get in the supporting infrastructure for pools, even if its not useable?
14:45:03 <bauzas> I can see some in-flights patches
14:45:19 <bauzas> sec, pointing out the series
14:45:28 <cdent> johnthetubaguy: I've been targeting resource-pools as my goal, not entirely certain on the status of compute providers without doing some digging
14:45:36 <dansmith> johnthetubaguy: the generic pool stuff we want for mitaka is nearly merged, and the rest is out for newton, AFAIK
14:45:45 <bauzas> #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/generic-resource-pools
14:46:07 <bauzas> ^ that is the generic-rp implementation patches
14:46:37 <johnthetubaguy> OK, so if we cut off the API bits off the top, where are we?
14:46:39 <jaypipes> bauzas: that's only a small part.
14:46:42 <johnthetubaguy> mostly agreed?
14:47:21 <bauzas> https://review.openstack.org/#/q/project:openstack/nova+branch:master+topic:bp/resource-providers seems Implemented to me
14:47:25 <bauzas> cdent: jaypipes: right?
14:47:30 <jaypipes> johnthetubaguy: we still need the resource-tracker pieces, a nova-manage tool to add a resource pool, the work to change the scheduler to look at the resource pool inventory instead of the compute node's out-of-whack view of the shared resources.
14:48:11 <dansmith> jaypipes: that's all stuff for newton, yes?
14:48:29 <jaypipes> dansmith: I was really hoping to have it in mitaka :(
14:48:54 <dansmith> jaypipes: last week on the hangout we said that was newton stuff... I'm confused
14:49:18 <dansmith> I feel like there's no _way_ that is all happening in mitaka
14:49:21 <johnthetubaguy> so that was our main disagreement post midcycle, I guess, I thought we agreed a different set of things, seems maybe not
14:49:25 <jaypipes> dansmith: I don't remember that decision. I was referring to resource-providers-allocations and resource-providers-scheduler blueprints being in Newton.
14:50:14 <dansmith> well, I have a hard time using those specs as terms in a discussion.. so many specs makes it confusing.. so I've been talking in terms of actual work items, so maybe that's the problem
14:50:14 <jaypipes> dansmith: if the three steps above are not done in Mitaka, there's no value at all to any of the patches, since nothing will be fixed w.r.t. how shared resources are tracked.
14:50:27 <dansmith> that's not true
14:50:42 <johnthetubaguy> so we just agreed we should make progress where we can
14:50:44 <dansmith> like I said before, the value is getting the online migrations of compute uuids, compute inventory records created, etc
14:50:49 <dansmith> so that in newton when we go to actually use them,
14:50:51 <johnthetubaguy> even if thats not end user visible
14:51:00 <dansmith> mitaka computes are already doing that and we don't need a dependency
14:51:09 <johnthetubaguy> dansmith: ah, right, the migrations, that is very visible
14:51:22 <johnthetubaguy> yeah, having the migrations completed, will make a big difference in terms of complexity
14:51:36 <dansmith> right, that is what I've been shooting hard for
14:51:37 <johnthetubaguy> like it's maybe half the complexity
14:51:45 <jaypipes> but shared resources will still be totally broken in mitaka. ok...
14:51:51 <johnthetubaguy> cdent: does this make sense from where you stand?
14:52:12 <johnthetubaguy> getting those migrations in place, that is
14:52:22 <dansmith> jaypipes: right, I don't think we're going to make any actual resource tracking improvement in mitaka.. there's just no time
14:52:38 <dansmith> but if we don
14:52:43 <dansmith> don't do this bit in mitaka,
14:52:44 <johnthetubaguy> we have like one week left of a working gate, at this point
14:52:52 <cdent> johnthetubaguy: so, I was hoping to get a bit further, but I agree that getting migrations and models in place before the end of the cycle is the critical part
14:52:53 <bauzas> +1 for iterating fast on the compute stuff so we could avoid online migrations
14:52:54 <jaypipes> dansmith: but you *do* think we should get the compute-node-inventory blueprint completed in mitaka?
14:52:55 <dansmith> we won't be able to reasonably make the improvement in newton either, I expect
14:53:44 <dansmith> jaypipes: again, I can't keep track of the blueprints :) .. I think we need compute nodes recording their inventories in the new place in mitaka, yes, but it won't be read by anything (right?) until newton
14:53:47 <jaypipes> fuck I hate 6 month releases :(
14:54:03 <bauzas> dansmith: ++
14:54:24 <johnthetubaguy> jaypipes: if it helps the operators at the meet up hate them just as much, but the other way around
14:54:53 <johnthetubaguy> we do release every commit though, but lets not go into that hole
14:54:59 <johnthetubaguy> so for mitaka...
14:55:14 <jaypipes> operators will always ask for stability and features at the same time, though.
14:55:20 <johnthetubaguy> get the DB in the right shape to accept the data we want to put in there for newton?
14:55:27 <jaypipes> sure
14:55:49 <bauzas> ++
14:56:02 <bauzas> because we're 5 mins away, I'll take my chair cap and cut
14:56:02 <johnthetubaguy> I think the wibble there is do we add the resource pools bits as well, and since they don't need migrations (?) its not really an issue?
14:56:04 <jaypipes> can we at least merge the generic-resource-pools blueprint though? that adds some necessary fields to the resource_providers table that will be needed.
14:56:30 <johnthetubaguy> jaypipes: I think it would be easier to merge if the API stuff were separate
14:56:31 <bauzas> jaypipes: sure, just put a new rev and I'll vote on it
14:57:00 <johnthetubaguy> I basically agree with the rest of that, at least
14:57:06 <bauzas> +1
14:57:14 <bauzas> #topic open questions
14:57:17 <bauzas> 3 mins left
14:57:18 <johnthetubaguy> the API I just feel like I haven't fully understood it yet
14:57:25 <jaypipes> johnthetubaguy: sigh, ok, yet another blueprint... let me separate the two resource_providers columns into yet another blueprint and then separate out the API bits into yet another blueprint.
14:57:27 <bauzas> anyone for anything ?
14:58:11 <jaypipes> and I still need to split resource-providers-scheduler blueprint into two so that bauzas and I can argue about whether the compute node owns its inventory of resources on a separate blueprint.
14:58:22 <bauzas> jaypipes: I appreciate that :-)
14:58:26 <jaypipes> so that will make 9 separate blueprints for this. awesome.
14:58:49 <johnthetubaguy> so half of those could be combined, but they are separate now, but lets take that offline
14:58:53 <bauzas> ++
14:58:55 * jaypipes goes to get food before he gets more grumpy.
15:00:14 <bauzas> okay, nothing raised, bye folks, we can continue the convo in -nova
15:00:17 <bauzas> #endmeeting