14:00:19 <edleafe> #startmeeting nova_scheduler
14:00:20 <openstack> Meeting started Mon Oct  3 14:00:19 2016 UTC and is due to finish in 60 minutes.  The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:23 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:26 <edleafe> Who's around?
14:00:28 <takashin> o/
14:00:29 <_gryf> o/
14:00:37 <rpodolyaka> o/
14:00:37 <Yingxin> o/
14:00:42 <alex_xu> o/
14:00:43 <cdent> o/
14:01:03 * johnthetubaguy lurks
14:01:42 <jaypipes> o/
14:01:43 <edleafe> Let's wait another minute
14:02:31 <edleafe> #topic Specs and reviews
14:02:37 <edleafe> Nothing on the agenda
14:02:54 <edleafe> Anyone have something that needs special attention?
14:03:02 <bauzas> \o
14:03:19 <alex_xu> the traits API's spec and poc is ready
14:03:27 <cdent> just a reminder about the existence of the aggregates reviews: https://review.openstack.org/#/c/362863/
14:03:42 <Yingxin> yup, traits
14:03:48 <edleafe> #link https://review.openstack.org/#/c/362863/
14:04:03 <edleafe> alex_xu: care to add a link to the start of those series?
14:04:11 <Yingxin> start from https://review.openstack.org/#/c/376198
14:04:18 <edleafe> thanks Yingxin
14:04:48 <edleafe> Anything else?
14:04:52 <Yingxin> and the spec https://review.openstack.org/#/c/345138
14:05:07 <edleafe> #link https://review.openstack.org/#/c/376198
14:05:07 <alex_xu> Yingxin: thanks
14:05:15 <edleafe> #link https://review.openstack.org/#/c/345138
14:05:18 <alex_xu> a little trouble with my browser..
14:05:21 <bauzas> I need to write the (hopefully) specless blueprint about the left usage of RequestSpec
14:05:45 <edleafe> Yingxin: you can add them as #link entries so that people can find them in the meeting minutes
14:06:08 <edleafe> bauzas: ah, good - that leads us to...
14:06:12 <edleafe> #topic Opens
14:06:21 <Yingxin> edleafe: ok sorry
14:06:24 <edleafe> bauzas: care to summarize what's left?
14:06:28 <edleafe> Yingxin: no worries
14:06:45 <bauzas> edleafe: well, are you guys knowing how the RequestSpec object is used ?
14:07:01 <edleafe> bauzas: probably not as well as you
14:07:08 <edleafe> :)
14:07:10 <bauzas> I first wrote the implementation for creating a RequestSpec object in the conductor and passing it to the scheduler
14:07:20 <bauzas> that was the first item
14:07:44 <bauzas> then, alaski helped me to persist that object in DB so that we could use it for cells v2 needs
14:08:06 * alaski peeks in
14:08:32 <bauzas> then, I wrote a 3rd item where all the API move operations check whether the RequestSpec already exists, and if so, pass it to the conductor
14:09:03 <bauzas> plus, I wrote a special DB migration for making sure that all the instances are now having a related RequestSpec object
14:09:32 <bauzas> that means that now, either you boot and then you go to the conductor that writes a RequestSpec object and passes it to the scheduler
14:10:07 <bauzas> or, you do another API operation (resize, migrate, live-mig, etc.) and then you're getting it straight from the API
14:10:09 <bauzas> buuuut
14:10:15 <bauzas> there are a lot of things left to do
14:10:44 <jaypipes> bauzas: short list of those remaining items would be useful.
14:10:58 <bauzas> jaypipes: yeah, hence me saying I need to write a bp
14:11:06 <bauzas> but edleafe wanted to know about :)
14:11:32 <edleafe> So, here's my concern: bauzas is pretty much the only one working on this in enough detail to understand what needs to be changed, and what the effects will be
14:11:51 <edleafe> What I'd like to see is some knowledge distribution
14:12:21 <edleafe> Maybe have bauzas act as oversight, and have others do the work
14:12:33 <edleafe> The RequestSpec czar, if you will
14:12:46 <bauzas> so, what's missing is basically to make sure our computes are getting our spec object instead of getting the legacy dicts, so that they could reschedule by passing back that object
14:12:58 <edleafe> This way he's not the only one who might feel confident enough to make updates to it in the future
14:13:28 <bauzas> and then we could cleanup our RPC interfaces by removing all ugly request_spec/filter_props dictionaries and all our code conditionals where we test whether it's a dict or a nova object
14:13:29 <edleafe> The same concern goes for cdent and the placement API/engine
14:13:34 <cdent> +1
14:13:38 <bauzas> that's it for me
14:13:55 <bauzas> edleafe: the only problem I see with that is that I like coding too :p
14:14:06 <edleafe> bauzas: fair enough :)
14:14:14 <edleafe> bauzas: you can do _some_ of it
14:14:20 <bauzas> but I could help with the placement series, like I promised :)
14:14:35 <edleafe> but I'd feel better if more people understood it in depth
14:14:45 <bauzas> either way, I think it's more a problem about how we can, as a team, share our work
14:14:53 <edleafe> bauzas: 'zactly
14:15:11 <bauzas> edleafe: but like I pointed to you, we're having very different backgrounds
14:15:20 <bauzas> so, that's not trivial
14:15:47 <edleafe> bauzas: That's true, but it seems like the technical debt problem
14:16:00 <edleafe> Not enough time to address current debt, so it keeps increasing
14:16:02 <dave-johnston> o/
14:16:18 <edleafe> Maybe since Ocata is a short cycle, this might be the best time to address it
14:17:05 * bauzas shrugs
14:17:16 <bauzas> I mean, sure that would be cool
14:17:29 <edleafe> Anyone else have opinions on this one way or the other?
14:18:20 <cdent> Other than "pay more attention to tech debt and share workload more" are you proposing something more specific edleafe ?
14:18:42 <cdent> because if it is mostly what I've quoted, then yeah, of course we should do that
14:19:00 <edleafe> cdent: yes, something like http://blog.leafe.com/pair-development/
14:19:24 <cdent> yeah, as I've said before I'm pretty keen on that
14:19:25 <_gryf> I agree, working together on certain topic is a better way for understanding code underneath than doing reviews only
14:19:37 <edleafe> But of course there are other options that we should consider
14:20:29 <Yingxin> we can also work on a shift :)
14:20:50 <edleafe> So maybe have bauzas write the BP, and then we can consider how to divide that up so as to spread the knowledge better?
14:21:01 <bauzas> oh man, work for me :p
14:21:03 <edleafe> Yingxin: yes, good point
14:21:15 <bauzas> (and not, work*s* for me :p )
14:21:50 <edleafe> Yingxin: but it does help to have enough timezone overlap for those times that a hangout or similar might be needed
14:22:07 <edleafe> Boss Bauzas :)
14:22:21 <Yingxin> edleafe: yes, indeed
14:23:05 <edleafe> This is certainly something we can discuss at the summit, too.
14:23:17 <edleafe> OK, let's move on
14:23:26 <edleafe> Placement DB Spec
14:23:55 <edleafe> cdent rpodolyaka and I are trying to hammer this one out
14:23:57 <cdent> gist is that Matt has asked that there be a spec for the optional placement db to avoid confusion
14:24:24 <cdent> there's been some noodling at
14:24:26 <cdent> #link https://etherpad.openstack.org/p/placement-optional-db-spec
14:24:36 <cdent> some of which includes "should we even bother"
14:24:42 <rpodolyaka> ++
14:25:02 <edleafe> cdent: "Should we even bother? I mean, we're all going to die anyway"
14:25:05 <edleafe> :)
14:25:08 <rpodolyaka> I just went over the changes that actually made it to Newton and looks like "self healing" should migrate most of the things
14:25:17 <rpodolyaka> *except for aggregates
14:25:36 <edleafe> rpodolyaka: and the initial db/table creation
14:25:40 <cdent> rpodolyaka: it should, but it might stall adding new consumers for too long
14:25:49 <rpodolyaka> edleafe: sure
14:25:52 <cdent> if I remember the concerns correctly
14:26:00 <cdent> we probably need to inquire with dansmith
14:26:24 <rpodolyaka> my understanding was that we report resources state on start in nova-compute and in a periodic task that is executed like every minute
14:26:46 <edleafe> rpodolyaka: but does that account for allocations, too?
14:26:49 <cdent> but in case people were wondering: we're trying to use this as an example of "working on stuff together"
14:26:52 <bauzas> not sure I understand what the problem is
14:27:14 <edleafe> cdent: yes, exactly
14:27:27 <rpodolyaka> edleafe: it turns out we also create allocations based on instances usages. this is done in update_resources_...() in RT, so this should be covered as well
14:27:50 <edleafe> rpodolyaka: ok, thanks. I need to dig into that code path deeper
14:28:01 <rpodolyaka> yeah, I want to give it a try on devstack
14:28:10 <johnthetubaguy> so just looking at the etherpad, is this about the DB no longer being optional? or did I get that back to front?
14:28:24 <edleafe> johnthetubaguy: you got it right
14:28:37 <bauzas> wait, what ?
14:28:44 <cdent> this is still about it being optional
14:29:02 <edleafe> optional in Ocata??
14:29:02 <johnthetubaguy> but about the need for some stuff to be migrated if you want it to be separate?
14:29:02 <cdent> but that in ocata the cost of that optional-ness is different than it would have been if we had managed to get it out in newton
14:29:15 <bauzas> cdent: not sure I understand the same point
14:29:25 <bauzas> cdent: I still think the placement DB should be optional
14:29:45 <cdent> bauzas: yes, that's the plan
14:29:57 <bauzas> just that we need to consider what was written in the API DB could possibly be now written in a separate DB
14:30:05 <johnthetubaguy> so I think what I am saying, is I don't understand the problem that etherpad is trying to solve right now, I am missing context somewhere
14:30:14 <cdent> but if people choose to use it, then there will need to be some kind of migration and what we're discussing (on the etherpad) is how much migration will be required and how to do it
14:30:15 <edleafe> Hmmm, I thought that the optional in Newton was to smooth the transition to non-optional in Ocata
14:30:26 <bauzas> edleafe: not at all
14:30:36 <edleafe> that's disappointing
14:30:49 <bauzas> edleafe: the point about that being optional is that we share the same schema
14:30:52 <edleafe> So is there a plan for when it will be non-optional?
14:30:57 <johnthetubaguy> I thought optional was so you could avoid a nasty migrate, if you were willing to plan ahead, but yeah
14:31:17 <cdent> johnthetubaguy: that's right, both edleafe and bauzas are a bit out of sync
14:31:28 <bauzas> edleafe: so operators wanting to use it heavily could have the benefit of a separate DB without getting the PITA for us to maintain a 3rd schema and all the tooling associated
14:32:22 <johnthetubaguy> so I guess some of the good news here, is we have hit the confusion already, so thats good news
14:32:24 <bauzas> cdent: specs are good for clarifying the intent, I would say :)
14:32:28 <edleafe> I get the 'same schema' concept
14:33:01 <cdent> johnthetubaguy: the reason I've suggested that perhaps we shouldn't even bother with the optional database is now that a migration of some kind is going to be required anyway, may as well just wait and do a big one when placement is extracted, if it ever is
14:33:09 <edleafe> But I'm still wondering if the plan is to always have an option to keep it in the API DB
14:33:37 <cdent> edleafe: that's kind of the flip side of what I just said
14:34:04 <edleafe> cdent:  the "if it ever is" part?
14:34:31 <bauzas> edleafe: cdent: the migration we're talking about are about inventories, right ?
14:34:50 <edleafe> inventories, allocations, and aggregates
14:34:54 <bauzas> because allocations are written using the periodic update or within claims, so that's not really an issue
14:35:01 <bauzas> aggregates aren't a thing yet
14:35:11 <cdent> edleafe: as in: what's the actual goal here, where will the boundary lie, how permeable will it be? will it ever migrate? will there always be an option to stay in api db, etc.
14:35:13 <bauzas> so that leaves inventories and allocations
14:35:31 <bauzas> plus the fact that nobody is actually calculating the resource usage
14:35:42 <bauzas> (yet)
14:36:05 <cdent> I'd like to think that we could hash this out on the etherpad and in email in a more...considered... fashion because we're pretty much just throwing words at one another right now and not really making any sense
14:36:05 <bauzas> we rushed in Newton because we wanted our Newton computes to be able to send allocations and inventories
14:36:22 <edleafe> cdent: My understanding was that the separation will happen, with the question being when
14:36:26 <cdent> In large part because we don't actually have consensus on the goals. Making a spec without a big picture goal is _useless_
14:36:49 <cdent> edleafe: me too, but bauzas apparently disagrees
14:37:16 <edleafe> The rush to get stuff in newton was specifically so that the change in Ocata would be possible
14:37:22 * jaypipes reading back, sorry had emergency call
14:37:37 <edleafe> Otherwise, we'd have to wait until Pike to do it
14:38:41 <edleafe> Let's timebox this until 14:45 because we have other items on the agenda we need to get to
14:38:50 <cdent> yes please
14:38:51 <edleafe> We can always continue on -nova
14:39:10 * edleafe is waiting for jaypipes to catch up
14:40:36 <edleafe> Well, let's cover the other stuff. We can circle back to jaypipes after
14:40:46 <edleafe> Placement leftovers:
14:40:50 <edleafe> #link http://lists.openstack.org/pipermail/openstack-dev/2016-October/104900.html
14:41:09 <edleafe> cdent?
14:41:29 <cdent> thanks, I simply wanted to draw people's attention to that email
14:41:44 <cdent> that's a list of things I thought of that are loose ends on the placement work already done
14:41:55 <cdent> stuff that we should let slip away lest we create more cruft and debt
14:41:58 <cdent> shouldn't!
14:41:59 <cdent> :)
14:42:08 <edleafe> Freudian slip there
14:42:30 <jaypipes> bauzas: what are you talking about with "nobody is calculating the resource usage yet"?
14:42:39 <cdent> As I say in the message I'd like feedback on what matters, and people who want to work on it with me and others.
14:42:49 <edleafe> #action Everyone read cdent's email at http://lists.openstack.org/pipermail/openstack-dev/2016-October/104900.html and provide feedback
14:42:51 <bauzas> jaypipes: I mean that we're not using the placement API yet for doing our filtering
14:43:00 <jaypipes> bauzas: ah.
14:43:03 <jaypipes> bauzas: yes
14:43:15 <bauzas> ie. we don't have a client actually consuming those placement information
14:43:26 <bauzas> we have clients that provide resource information
14:43:43 <jaypipes> bauzas: well, we do. the scheduler reporting client is consuming the allocation and inventory information.
14:43:57 <bauzas> that's correct
14:44:40 <bauzas> jaypipes: okay, I think it's gonna be tough discussing it there, I would suggest to draw something out later either way
14:45:02 <jaypipes> bauzas: k
14:45:02 <bauzas> I'm just not super convinced that we should make it non-optional if we share the same models
14:45:39 <bauzas> and providing a separate model for placement seems a hard call for Ocata given the shorter cycle
14:45:54 <jaypipes> bauzas: the optional-ness of the placement DB was more about the tooling around db sync and migrations, no? It wasn't about whether we wanted a separate placement DB...
14:46:04 <jaypipes> dansmith: is ^ your recollection?
14:46:09 <bauzas> jaypipes: I agree
14:46:17 <bauzas> hence me wanting to stick with the plan
14:46:18 <edleafe> jaypipes: that was my understanding, too
14:46:53 <cdent> bauzas: can you restate what you agree with please?
14:46:57 <jaypipes> bauzas: I guess I need a refresher on the specifics of the current plan, then. I'll go read that etherpad.
14:47:09 <dansmith> bauzas: yeah, I don't want them to be forced to have another db until it's actually split, but we should provide all the machinery to make it possible
14:47:12 <dansmith> er, jaypipes ^
14:47:49 <jaypipes> dansmith: yes, totes. but the definition of "all the machinery" needs to be specified.
14:47:50 <cdent> I wonder if the confusion is from the etherpad pointing out that placement _api_ is non optional, which is not the same as placement _db_ being optional or not
14:47:59 <dansmith> jaypipes: agreed
14:48:11 <cdent> there's code here: https://review.openstack.org/#/c/362766/ it already works
14:48:24 <cdent> (in local devstacks)
14:48:40 <cdent> we stalled that code for newton because we didn't want to add to the confusion
14:48:46 <jaypipes> dansmith: for instance, do we want to provide machinery to do a migrate/sync on a separate placement DB connection, or do we continue the strategy from Newton of "if you're a big provider and want to proactively do X, then run these commands..."
14:49:31 <edleafe> cdent: my confusion came from the changes between Austin and the midcycle, where the DB strategy seemed to change. I thought the resolution was to make it temporary short-term, and mandatory after that
14:49:40 <dansmith> jaypipes: I thought you were going to just write a script for people that wanted to?
14:49:58 <edleafe> cdent: I don't recall always being optional as, well, an option
14:50:12 * bauzas needs to drop-off for doing some homework to my elder
14:50:23 <edleafe> ok, thanks bauzas
14:50:31 <jaypipes> dansmith: that would certainly be fine. just need to write down that that is indeed our strategy/plan for Ocata.
14:50:48 <dansmith> cool
14:51:04 <cdent> Can jaypipes, dansmith and bauzas hash out what the end game is, on that etherpad so we don't lose the details?
14:51:06 <edleafe> Yeah, I get the feeling that decisions are being made and not spread publicly very well
14:51:14 <jaypipes> cdent: yep
14:51:19 <cdent> awesome thanks jaypipes
14:51:32 <edleafe> yes, let's move on now
14:51:36 <edleafe> Next up are 2 cold migration specs from takashin:
14:51:36 <edleafe> #link https://review.openstack.org/#/c/334286/
14:51:36 <edleafe> #link https://review.openstack.org/#/c/334725/
14:51:48 <takashin> Could you review them?
14:51:50 <edleafe> takashin: do you have any comments about them?
14:52:15 <takashin> They are about cold migration.
14:53:01 <edleafe> takashin: ok, thanks. I just wanted to make sure that you didn't have any extra concerns about them
14:53:18 <takashin> they are my specs.
14:54:09 <edleafe> ok, thanks
14:54:14 <edleafe> Anything else for opens?
14:54:32 <alex_xu> jaypipes: edleafe so let us hangout for traits API after the meeting?
14:55:08 <edleafe> I'm free
14:55:15 <edleafe> jaypipes?
14:55:53 <alex_xu> edleafe: cool, thanks
14:56:48 <edleafe> alex_xu: let's wait until jaypipes is available. I know it's pretty late for you
14:56:57 <alex_xu> edleafe: yeah
14:57:02 <edleafe> alex_xu: we'll continue in -nova
14:57:05 <cdent> alex_xu, edleafe If you guys can keep me informed on when that is, I'll try to make it
14:57:12 <alex_xu> edleafe: yea, cool
14:57:15 <alex_xu> cdent: ok, got it
14:57:17 <edleafe> cdent: it was supposed to be now
14:57:32 <edleafe> but with no jaypipes it might have to wait
14:57:41 <dave-johnston> ]
14:57:46 <edleafe> Anyway, let's continue in -nova
14:57:50 * cdent nods
14:57:53 <edleafe> #endmeeting