14:00:43 #startmeeting nova_scheduler
14:00:44 Meeting started Mon Mar 26 14:00:43 2018 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:45 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:47 The meeting name has been set to 'nova_scheduler'
14:00:50 o/
14:00:52 o/
14:00:52 \o
14:00:52 @/
14:00:53 o/
14:01:03 geez, everyone's in a rush today!
14:01:16 * bauzas notes that he'll be only available for the next 20 mins
14:01:19 actually more like ö/ today (new haircut)
14:01:40 * bauzas greets daylight saving
14:01:49 oh hai
14:01:50 eff dst
14:02:20 #link Agenda: https://wiki.openstack.org/wiki/Meetings/NovaScheduler
14:02:47 I like DST - I hate having to go back to standard time in the winter
14:03:12 Fine, then let's stick to that one. But this switching back and forth is for the birds.
14:03:19 o/
14:03:21 * bauzas likes DST because for 6 months, my oven has the right time
14:03:44 #topic Specs
14:04:21 I have a big question on NUMA thingy
14:04:23 Here's the current list from the agenda:
14:04:25 #link VMware: place instances on resource pool (using update_provider_tree) https://review.openstack.org/#/c/549067/
14:04:28 #link Provide error codes for placement API https://review.openstack.org/#/c/418393/
14:04:31 #link Mirror nova host aggregates to placement API https://review.openstack.org/#/c/545057/
14:04:34 #link Network bandwidth resource provider https://review.openstack.org/#/c/502306/
14:04:37 #link Default Allocation Ratios https://review.openstack.org/#/c/552105/
14:04:40 #link Proposes NUMA topology with RPs https://review.openstack.org/#/c/552924/
14:04:43 #link Spec for isolating configuration of placement database https://review.openstack.org/#/c/552927/
14:04:46 #link Account for host agg allocation ratio in placement https://review.openstack.org/#/c/544683/
14:04:49 #link Spec on preemptible servers https://review.openstack.org/#/c/438640/
14:04:52 go for it bauzas
14:05:13 bauzas: we only accept small questions on NUMA things. big questions are right out.
14:05:25 looks like we have an agreement on providing resources like CPU and RAM on NUMA nodes directly if the operator wants to
14:05:37 bauzas: Also, I request that you split your question into granular request groups.
14:05:49 appropriately namespaced
14:05:53 ++
14:06:12 crazy question: could the operator want to tell which specific resource classes it wants NUMA-based?
14:06:30 like, could I as an operator care about VCPU but not MEMORY_MB?
14:06:30 bauzas: not following you... could you elaborate?
14:06:43 bauzas: sure, I see no reason why not.
14:06:57 okay, so it would be defined per class, roger.
14:07:08 jaypipes: wouldn't that be done in libvirt?
14:07:21 bauzas: you mean have the dedicated CPU and shared CPU (VCPU) tracked as inventory on each NUMA node and have MEMORY_MB tracked as inventory on the root compute node provider, yes?
14:07:21 IOW, either account for NUMA, or don't
14:07:28 jaypipes: exactly
14:07:33 jaypipes: or the contrary
14:07:52 seems reasonable to me.
14:07:56 like, memory-bound applications could care about MEMORY allocations
14:08:09 but wouldn't care about CPUs
14:08:10 well, let's keep in mind that memory pages != MEMORY_MB...
14:08:13 I mean, doable. I won't speak to reasonable.
14:08:39 jaypipes: memory pages are a different feature, I just wrote an example for it in my to-be-uploaded spec
14:08:45 bauzas: memory pages are not the same thing as MEMORY_MB... memory pages are atomic units of a thing. MEMORY_MB's atomic "unit" is just a MB of RAM somewhere.
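
The shape jaypipes and bauzas converge on above, with VCPU tracked on per-NUMA-node child providers and MEMORY_MB left on the root compute node provider, would look roughly like this in a driver's update_provider_tree() method. A minimal sketch assuming nova's ProviderTree interface; the per-class toggle (numa_classes) and the topology helper are hypothetical, invented here for illustration:

    # Hedged sketch: ProviderTree methods per the update-provider-tree work;
    # _get_host_numa_topology() and the per-class toggle are hypothetical.
    def update_provider_tree(self, provider_tree, nodename):
        # Which resource classes the operator wants NUMA-tracked; the
        # toggle itself is not an agreed-upon conf option.
        numa_classes = {'VCPU'}
        host = self._get_host_numa_topology()  # virt-specific discovery
        for cell in host.cells:
            child = '%s_NUMA%d' % (nodename, cell.id)
            if not provider_tree.exists(child):
                provider_tree.new_child(child, nodename)
            if 'VCPU' in numa_classes:
                provider_tree.update_inventory(
                    child, {'VCPU': {'total': len(cell.cpus)}})
        # MEMORY_MB was not picked as NUMA-based, so it stays on the root.
        provider_tree.update_inventory(
            nodename, {'MEMORY_MB': {'total': host.memory_mb}})
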
14:08:48 bauzas: IMO, we haven't yet landed on a workable way for the op to represent a desire for NUMA affinity.
14:08:56 bauzas: ok, cool. just wanted to clarify that
14:08:58 So at this point, sky's the limit.
14:09:17 efried: yup, that's in my spec
14:09:25 ack
14:09:36 * efried has a pretty big backlog of spec reviews to catch up on.
14:09:46 okay, so I'll provide a revision for https://review.openstack.org/#/c/552924/ about optionally-NUMA-able resource classes
14:09:53 s/efried/everyone
14:10:02 yeah I'm underwater too
14:10:10 anyway, I'm done with questions
14:10:19 thanks
14:10:30 Any other questions / discussion about specs?
14:10:31 I want to make sure edleafe gets his question answered.
14:10:48 about libvirt being responsible for tracking NUMA
14:11:18 well, libvirt "owns" resources on the hypervisor, no?
14:11:24 edleafe: currently, the "tracking" of NUMA resources isn't really done in libvirt, but rather it's done in the nova/virt/hardware.py module looking at CONF options like vcpu_pin_set.
14:12:06 jaypipes: yeah, but with the move to placement, I thought libvirt would be authoritative on all resources for a compute node
14:12:16 edleafe: as well as flavor/image properties to do "allocations" of virtual NUMA topologies on top of the host NUMA topology.
14:12:17 jaypipes: edleafe: what I'd like is to get a consensus on a model for NUMA resources that virt drivers would implement
14:12:22 s/thought/hoped/ ?
14:12:43 cdent: no, it was what I've gleaned from the various discussions
14:12:47 if Xen starts to implement NUMA topologies, all cool with me provided they model things the same way
14:13:04 edleafe: the virt driver will be, yes, but there will still need to be a way for the ops to signal to libvirt how (or even if) to utilize NUMA-related topology on the host.
14:13:11 the virt driver is responsible for generating the tree, but ideally trees wouldn't be virt-specific
14:13:16 jaypipes: zactly
14:13:34 jaypipes: that has to be addressed in https://review.openstack.org/#/c/552924/
14:13:41 hence my question about tunability
14:13:46 right, no disagreement from me :)
14:13:47 jaypipes: it was the "do some NUMA, but not all" that seemed to not align
14:13:58 let's be clear
14:14:22 if we go for a config-managed way to provide resources thru the virt driver, that doesn't mean that conf opt will be in the libvirt namespace
14:14:55 bauzas: agreed.
14:14:57 actually, I was thinking of a way for libvirt to pass the NUMA topology to a non-virt-specific module that would translate it into placement RPs
14:15:21 after all, a NUMA topology is just an architecture
14:16:04 but I'm overengineering the implementation
14:16:24 keep in mind a specific module out of virt.libvirt that libvirt would consume for generating the tree
14:16:44 bauzas: my concern is just that: making the solution overly complex in order to solve a 2% edge case
14:16:45 and that specific module would have config options related to it
14:17:00 edleafe: configurability is 2%, I agree
14:17:01 Assuming that this is an edge case, and not what most deployments would need
14:17:27 edleafe: having a ubiquitous interface for describing NUMA resources across all drivers isn't a 2% concern
14:17:58 But trying to make it ubiquitous might be a 98% effort.
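
A rough sketch of the interface bauzas is proposing: a module living outside nova.virt.libvirt that any driver feeds a generic topology, so the same architecture always yields the same tree. Every name here (NUMACell, build_numa_providers) is invented for illustration; no such module exists:

    import collections

    # Hypothetical virt-agnostic topology description a driver would emit.
    NUMACell = collections.namedtuple('NUMACell', ['id', 'cpus', 'memory_mb'])

    def build_numa_providers(provider_tree, nodename, cells, numa_classes):
        """Translate a generic NUMA topology into placement providers.

        Living outside the libvirt driver is the point: hyper-v, xenapi,
        etc. calling this with the same cells get the same tree shape.
        """
        for cell in cells:
            child = '%s_NUMA%d' % (nodename, cell.id)
            if not provider_tree.exists(child):
                provider_tree.new_child(child, nodename)
            inv = {}
            if 'VCPU' in numa_classes:
                inv['VCPU'] = {'total': len(cell.cpus)}
            if 'MEMORY_MB' in numa_classes:
                inv['MEMORY_MB'] = {'total': cell.memory_mb}
            if inv:
                provider_tree.update_inventory(child, inv)
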
14:18:04 edleafe: hahahahah you said "edge"
14:18:12 bauzas: ok, it just felt like we started going down the rabbit hole on this
14:18:30 jaypipes: edge case, not edge computing :)
14:18:30 edleafe: my biggest fear is that we go down a too-libvirt-specific path
14:18:59 +++
14:19:00 edleafe: keeping the code responsible for generating the tree out of libvirt is crucial to me, if we want that to be non-virt-specific
14:19:09 Oh, I see mriedem has added another spec:
14:19:10 #link Complex (Anti)-Affinity Policies https://review.openstack.org/#/c/546925/
14:19:12 bauzas: there's no disagreement with you.
14:19:23 edleafe: you can ignore it for now, it needs an update
14:19:38 mriedem: sure, just including it for completeness
14:19:47 #action everyone ignore mriedem
14:20:25 jaypipes: thanks, I don't feel I had an argument, just wanted to clarify the 2% concern
14:21:25 I'll have to leave soon-ish
14:21:29 So, I actually think we should move all the numa stuff into the virt drivers and not try to genericize
14:21:30 moving on?
14:21:37 hah
14:21:51 because there's not very much overlap
14:21:52 cdent: any reason to the contrary, mr. vmware? :p
14:22:14 I take offense at being slapped with mr vmware. They pay me, but that's about it.
14:22:25 cdent: the problem is that as of now, only libvirt provides NUMA features, right?
14:22:38 cdent: oh sorry, I apologize if you feel offended
14:22:51 If we want ProviderTrees to be "true", then that needs to happen in the driver
14:23:05 it wasn't the intent, rather an explicit way of considering your opinion as good because you have other virt drivers in mind
14:23:26 I tend to leave the employer at the door to IRC
14:23:43 You have a door on your IRC?
14:23:44 but I understand where you were coming from, so offense untaken.
14:24:10 If I can paraphrase what I think cdent is saying (or at least convey my own opinion couched in terms of agreeing with cdent :)
14:24:29 sorry folks, I'll have to drop in a very few and don't want to drop mic
14:24:29 efried: paraphrase away!
14:24:35 We want the virt driver to be responsible for modeling, and the operator can do *something* in the flavor that represents the NUMA topology in a generic way (i.e. designating affinities, not specific NUMA nodes). But beyond that, there's no involvement of the scheduler, conductor, etc. other than the usual translating of flavor-ness to placement-ness etc.
14:24:55 cdent: so, could you please explain why you consider the definition of the tree to be virt-specific?
14:25:41 but on the other hand - and here's where I think I'm entering bauzas-land - we'd like to be able to "advise" the modeling such that the op experience is as similar as possible for whatever virt driver.
14:25:51 my main worries come from the fact that there are high chances a tree could differ between a libvirt implementation and, let's say, a hyper-v one
14:26:13 Is that a bad thing? We want the tree to represent the truth, yes?
14:26:13 bauzas: Perhaps cdent is saying that's going to be unavoidable, and we should butt out and let it happen.
14:26:35 is it unavoidable?
14:26:44 cdent: from a placement perspective, I feel it could be a pain if the trees differ when explaining the same architecture
14:26:44 cdent: ++
14:26:47 we try to have a consistent compute REST API across the various virt drivers, right?
14:27:02 If the architectures are different, then what is represented should be different.
14:27:05 i.e. we don't want to add more things like 'agent builds' that are only implemented by one virt driver anymore
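
One conceivable shape for the "*something* in the flavor" efried describes, borrowing the granular request group syntax from the in-progress granular-resource-requests work. Whether NUMA affinity will really be spelled this way is exactly what is unsettled, so treat this as a hedged illustration only:

    # Hypothetical flavor extra specs: each numbered group must be
    # satisfied by a single provider, so the groups land on NUMA node
    # providers without the operator naming specific nodes.
    extra_specs = {
        'resources1:VCPU': '2',
        'resources1:MEMORY_MB': '2048',
        'resources2:VCPU': '2',
        'resources2:MEMORY_MB': '2048',
        'group_policy': 'isolate',  # keep the groups on separate providers
    }
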
14:27:08 bauzas: from a placement perspective, placement doesn't really care :)
14:27:21 cdent: I don't disagree with that
14:27:29 cdent: if architectures are different
14:27:41 So if hyperv "sees" something different from libvirt, it should be different in the provider tree(s)
14:27:43 cdent: but if the same architecture, then placement should see the same thing
14:27:52 cdent: again, I don't disagree with that
14:28:13 It's the op experience we'd like to try to smooth. But yeah, not at the cost of wedging e.g. a square libvirt peg into a round hyperv hole.
14:28:14 bauzas: placement is charged with determining the allocation requests against resource providers that meet the required resource and trait constraints. It's not in charge of determining whether the structure of the resource providers being created by its clients is "correct" or not.
14:28:27 anyway, I need to leave
14:28:38 I'll pound that concern
14:28:56 actually, generating a new module and describing an interface is a bit of work for me
14:29:25 so, if the consensus is "meh, let's wait for other virt drivers to implement their own NUMA features", I'm fine keeping the description in libvirt for the moment
14:29:30 less work, yeepeee
14:29:43 * bauzas rushes out
14:29:47 ok, bauzas is gone, so let's start gossiping about him
14:29:51 :)
14:30:34 Anything else on specs?
14:30:38 ok, next topic?
14:30:43 regarding #link Network bandwidth resource provider https://review.openstack.org/#/c/502306/
14:30:50 Starting this Friday I will disappear for two weeks (honeymoon) so I'd like to settle all the remaining craziness in the spec this week if possible.
14:30:59 The related neutron spec work will continue without interruption.
14:31:01 woot, congrats
14:31:02 WAIT! WHAT!?
14:31:22 cdent: thanks :)
14:31:28 indeed, congrats giblet! :)
14:31:34 * edleafe likes the way gibi snuck that in
14:31:50 so I will bug you guys this week for reviews
14:31:55 gibi: agreed, then, on finalizing the spec.
14:31:59 gibi: np
14:32:43 reviews will be your wedding present
14:32:48 hahah
14:32:53 edleafe: :)
14:33:03 I think we can move to the next topic :)
14:33:21 #topic Reviews
14:33:44 Here's a dump of what is on the agenda:
14:33:46 #link Update Provider Tree https://review.openstack.org/#/q/topic:bp/update-provider-tree
14:33:49 #link Request Filters https://review.openstack.org/#/q/topic:bp/placement-req-filter
14:33:52 #link Nested providers in allocation candidates https://review.openstack.org/#/q/topic:bp/nested-resource-providers https://review.openstack.org/#/q/topic:bp/nested-resource-providers-allocation-candidates
14:33:56 #link Mirror nova host aggregates to placement https://review.openstack.org/#/q/topic:bp/placement-mirror-host-aggregates
14:33:59 #link Forbidden Traits https://review.openstack.org/#/q/topic:bp/placement-forbidden-traits
14:34:02 #link Consumer Generations: just started; no patches posted yet.
14:34:05 #link Extraction https://review.openstack.org/#/q/topic:bp/placement-extract
14:34:21 Anyone have a question/concern about any of these?
14:34:56 nope.
14:35:27 I have a query about reviewing in general in a world of runways: Is _all_ of placement non-runway-oriented? How are we wanting that to work?
14:35:52 cdent: good question. no idea.
14:35:58 My understanding was that runways were for non-priority things
14:36:08 It was originally
14:36:11 Y'know, to give them some attention
14:36:27 efried: ...and it changed?
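
An aside on jaypipes's point above about what placement is charged with: the request side is agnostic to how providers were structured. A minimal sketch against the real GET /allocation_candidates API; the endpoint and token are placeholders:

    import requests

    # Placeholder endpoint/token. 'required' traits need placement
    # microversion 1.17 or later, hence the API version header.
    resp = requests.get(
        'http://placement.example.com/allocation_candidates',
        params={'resources': 'VCPU:2,MEMORY_MB:2048',
                'required': 'HW_CPU_X86_AVX2'},
        headers={'X-Auth-Token': 'ADMIN_TOKEN',
                 'OpenStack-API-Version': 'placement 1.17'},
    )
    # Placement returns candidate allocation requests plus provider
    # summaries; it never judges whether the tree structure is "correct".
    for candidate in resp.json()['allocation_requests']:
        print(candidate['allocations'])
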
14:36:33 but I think we're shifting from that to making it more of a generalized queue for "this is ready, let's get focus on it"
14:36:36 What I'm trying to figure out is which of the placement stuff needs to go in the queue, and which doesn't
14:36:59 If it's all of it, that's cool (and I think I'd actually prefer that, for the sake of one rule to bind them)
14:37:01 right; IMO, KISS and queue stuff up for a runway when it meets the other criteria.
14:37:34 There seemed to be some agreement on that, though we never, like, voted or anything.
14:37:36 everything goes into runways
14:37:47 blueprints go into runways, so a placement blueprint would get a slot for a while
14:37:49 are we doing the runways stuff already or were we waiting for the spec review day (tomorrow) to be done?
14:37:50 like anything else
14:37:59 starts the day after spec day
14:38:00 jaypipes: starting after tomorrow
14:38:03 k
14:38:24 I have no issues with dealing with placement like anything else.
14:39:28 cool
14:39:51 So... nothing more regarding reviews?
14:40:10 i guess i have something
14:40:20 remember the ironic flavor / trait migration?
14:40:26 yeah
14:40:39 https://review.openstack.org/#/c/527541/ i've been pushing that since queens, and meant to backport it to pike
14:40:45 kind of want to know if i should just abandon that
14:41:13 i don't think we should do a hard drop of that migration code in the driver until nova-status supports it,
14:41:19 but there doesn't seem to be much interest
14:42:00 mriedem: I just wasn't aware of that. I can review it today. seems important to me.
14:42:02 huh, I thought that merged a while ago
14:42:06 suspect it just fell off the radar
14:42:19 put it on a runway!
14:42:37 it's not a bp
14:42:39 but thanks
14:43:55 Any other reviews to discuss?
14:45:07 Guess not
14:45:08 #topic Bugs
14:45:12 #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement&orderby=-id
14:45:38 there is a bug re placement we've talked about in gibi's bw provider spec,
14:45:45 but I'm not sure we actually opened a bug to track it,
14:45:59 it has to do with cleaning up allocations when deleting an instance while the compute service it's running on is down
14:46:19 if no one remembers opening a bug for that i can circle back
14:46:24 mriedem: I haven't opened it
14:46:50 that's something different from 'local delete'?
14:47:01 cdent: local delete + decommission compute host
14:47:06 https://bugs.launchpad.net/nova/+bug/1679750
14:47:07 Launchpad bug 1679750 in OpenStack Compute (nova) pike "Allocations are not cleaned up in placement for instance 'local delete' case" [Medium,Confirmed]
14:47:20 that's the one
14:47:25 * mriedem thanks his past self
14:47:31 your past self is clever
14:49:28 we done with bugs?
14:49:31 Anything more on bugs?
14:49:36 * cdent blinks
14:50:03 #topic Open discussion
14:50:24 So... what's on your collective minds?
14:50:30 This is not a priority, but I'd like to get some feedback on https://review.openstack.org/#/c/552927/ , which is the spec for an optional db setting in placement.
14:50:53 It allows ... options ... when working with placement that are handy
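
The knob cdent's spec proposes (details may still change in review) would look something like this in nova.conf; leaving it unset keeps today's behavior of using the existing api database:

    [placement_database]
    # Optional: point placement's tables at their own database, which
    # eases the eventual extraction of placement from nova.
    connection = mysql+pymysql://placement:secret@db.example.com/placement
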
14:51:11 But if you don't do anything different, it all works the same as it does now
14:51:56 If it makes the eventual separation of placement from nova go more smoothly for operators, ++
14:52:29 there's some discussion in the spec about how it is but one of several options
14:52:42 but also some of why I think it is the better one
14:54:32 * bauzas is back for 6 mins, yay
14:55:34 cdent: ok, I'll re-review
14:55:43 thanks
14:55:50 Anything else for today?
14:56:23 OK, thanks everyone!
14:56:25 #endmeeting
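
A footnote to the Bugs topic above: the cleanup that goes missing in bug 1679750 is, on the placement side, a single call to the real DELETE /allocations/{consumer_uuid} API. A hedged sketch with placeholder endpoint and token:

    import requests

    def delete_allocations(consumer_uuid, token):
        """Drop every allocation held by a consumer (e.g. a deleted
        instance whose compute service is down); 204 means success."""
        resp = requests.delete(
            'http://placement.example.com/allocations/%s' % consumer_uuid,
            headers={'X-Auth-Token': token},
        )
        resp.raise_for_status()

    # e.g. after a "local delete" of an instance on a downed compute:
    # delete_allocations(instance_uuid, ADMIN_TOKEN)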