14:00:08 #startmeeting nova_scheduler
14:00:12 Meeting started Mon Mar 12 14:00:08 2018 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:15 The meeting name has been set to 'nova_scheduler'
14:00:28 o/
14:00:29 o/
14:00:31 o/
14:00:39 Good UGT morning! Especially to those recently switched to DST
14:00:53 o/
14:01:29 * bauzas resurrects from his house
14:01:54 DST got me up 1 hr early :)
14:02:25 arvindn05: I usually need lots of coffee to do that :)
14:03:27 Small crowd today - guess we'll blame DST for that
14:03:46 Let's try to make this quick
14:03:51 #topic Specs
14:04:07 There are a bunch of 'em
14:04:18 I've listed them in the agenda:
14:04:26 #link Meeting Agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:04:29 * gibi just inserted an extra one to the end of that list
14:04:31 o/
14:04:41 I also need to add another one
14:04:46 for NUMA
14:04:49 Please do!
14:05:12 was distracted trying to explain cache allocation support to non-openstack person...
14:05:28 Rather than go through each, I wanted to have a central list of those we need to review
14:05:38 edleafe: you can remove the second one. it was just some ideas I had after the PTG and is long-term thinking, not for Rocky.
14:05:42 And then only discuss the ones that have questions
14:05:54 jaypipes: ack
14:06:16 I should give credit - I compiled the list of things (and review topics) from cdent
14:06:20 * efried waves late (IRC client disconnected without warning)
14:06:32 the placement email is wonderful
14:06:38 ++
14:06:40 saves me a lot of time :)
14:07:15 So... anything in those specs we need to discuss?
14:08:17 https://review.openstack.org/#/c/541507/ - latest comment from john garbutt
14:08:34 not specifically on the traits, but perhaps later for the opens, some general things came up
14:10:00 (my comment is semi-related to arvindn05's spec too)
14:10:13 I need to review those specs, that's it :)
14:10:40 the concern seems to be user can add traits on glance image which can request "chargeable" traits without the need to go through flavor
14:10:44 I have it open for re-review, too
14:12:34 we could solve it by adding a configuration option to force image traits with flavor traits and not be a union....but wanted to get thoughts
14:12:48 arvindn05: it doesn't seem to be fresh in anyone's mind at the moment
14:13:01 Let's review and leave comments / thoughts there
14:13:04 ++
14:13:19 ahh..k...i will leave the comments there....not major so should be able to resolve via comments
14:13:33 Any other specs to discuss? Noting cdent's traits thing for Open
14:14:11 edleafe: it's fresh in my mind. just not sure it's something we want to tackle at this point. frankly, certain image metadata has *always* been a "chargeable" trait -- for instance all the NUMA crap -- and it's not like we have a special solution to that other than the protected properties.
14:15:07 this is just one of those problems with having billing based on a tightly-coupled concept like flavors.
14:15:49 * edleafe shakes his fist at Rackspace
14:16:12 not just RAX :) AWS, GCE, Azure all do billing this way...
14:16:35 jaypipes: but it was RAX that insisted it be that way in Nova
14:16:42 ...
14:16:47 edleafe: sure, but that's ancient history ;)
14:16:57 Well, I'm an ancient guy
14:17:01 hehe
14:17:20 anyway, I'm happy to move on past this and just say protected image properties are, well, protected...
14:17:20 OK, let's move on then
14:17:22 fortunately, image metadata changed a lot since 1.5 years
14:17:24 #topic Reviews
14:17:33 now they are objectified
14:17:41 bauzas: actually, it's barely changed in 7 years.
14:18:05 bauzas: at least, what's coming from glance I mean.
14:18:14 I listed the "main themes" of development that cdent had in his email in the agenda
14:18:14 yeah from the glance standpoint, I agree
14:18:26 but from a nova standpoint, that's much better
14:18:39 We can go through them quickly
14:18:42 * jaypipes has one patch to discuss -- more about resource tracking..
14:18:58 efried: I assume Update Provider Tree work hasn't moved much since PTG?
14:19:23 edleafe: needs a rebase.
14:19:25 edleafe: I haven't had a chance to check up on it yet.
14:19:38 okay, I'll try to get to that today, unless someone else wants to volunteer.
14:19:49 efried: cool, just checking
14:20:19 efried: I would, but I'm going to be busy on the mirroring aggs thing
14:21:02 jaypipes: yeah, I had you to talk about the mirroring work
14:21:17 jaypipes: anything to discuss? Or still being worked on?
14:21:35 edleafe: still need to finalize the spec. haven't started on the code yet.
14:21:46 jaypipes: gotcha
14:22:01 next theme: Request Filters
14:22:08 dansmith: around?
14:22:16 yeah
14:22:35 I'm just starting the API change for agg filtering of A/Cs
14:22:52 dansmith: besides reviews on your series, anything you need to bring up?
14:23:10 my series is blocked on getting that api added to placement,
14:23:20 which edleafe is gonna do
14:23:27 so I haven't really been doing much with it until that happens
14:23:34 k
14:23:48 dansmith: will have something by late today or tomorrow, I hope
14:23:57 sweet, thanks
14:24:11 * edleafe hopes he doesn't keep getting pulled into meetings
14:24:39 Next: the tantalizingly-titled "Forbidden Traits"
14:24:48 heh
14:24:49 cdent?
14:24:56 * jaypipes still needs to finalize that spec review.
14:25:01 just a question, why "forbidden" and not "avoided" ?
14:25:08 * bauzas loves to rathole during meetings
14:25:19 I was waiting on the spec to resolve before starting on an implementation
14:25:20 call me pedantic
14:25:22 bauzas: dunno - I wanted "negative"
14:25:31 bauzas: avoided is not the right word, IMHO
14:25:31 forbidden and avoided mean different things
14:25:41 heh, no worries
14:25:43 ditto negative
14:25:44 negative would be okay, although I personally like forbidden
14:25:56 anyway, it was just a pun
14:26:08 won't comment on the spec for that :)
14:26:09 I try to avoid rat-holing on semantics, but I'm not forbidden from doing so.
14:26:10 it's just adding a boolean NOT
14:26:27 edleafe: NOT any or NOT all?
14:26:29 shit, I created a discussion just about a pun
14:26:43 kill me
14:26:58 jaypipes: since required is all, forbidden should be all too, no?
14:27:11 edleafe: I would actually think it should be NOT any...
14:27:26 edleafe: i.e. "if the provider has any of these traits, filter it out"
14:27:52 jaypipes: yeah, that's what I thought I was saying
14:28:00 IOW, all are forbidden
14:28:20 * cdent gets out his semantics workbook
14:28:27 anyway
14:28:43 let's call it "forbidden" and just add a nice documentation explaining what's for
14:29:04 It was noted in the placement email that Consumer Generations has not been started, so the last main theme to discuss is Extraction
14:29:35 I split up the resource provider object move into three patches as requested by dansmith
14:30:04 including above those are some other extraction related patches
14:30:15 Inventories, Allocations, RPs ?
14:30:51 bauzas: no... resource class fields, nova/objects/resource_provider.py move, and oslo.versionedobjects basing.
14:30:59 #link move rp objs to placement https://review.openstack.org/#/c/540049/
14:31:18 bauzas: this isn't about splitting out the objects inside resource_provider.py but rather removing it from the nova/objects directory.
14:31:18 ack
14:31:28 thanks
14:31:30 np
14:31:34 splitting up the file based on object type would be nice to do eventually but is not currently on the radar (as far as I know)
14:31:42 ack
14:31:53 * bauzas nods (about not being a priority)
14:32:10 yeah, the file is big but not unworkable
14:32:38 Any other reviews anyone wants to discuss
14:32:40 ?
14:33:02 yes
14:33:21 this one: https://review.openstack.org/#/c/532924/
14:34:28 hah
14:34:59 it's related to the allocations discussion
14:35:04 with all respect to maciej, the mess of default values and upgrade steps in there is too dangerous to mess with like that, IMHO.
14:35:42 jaypipes: well, we should maybe add him to the convo we had about alloc ratios
14:35:46 and the spec itself
14:36:07 the fact that people would like to change the ratio default value is understandable
14:36:24 the whole block of cruft in the ComputeNode._from_db_obj() to deal with "Liberty computes" is going to lead to a situation where it will be impossible to tell whether or not an allocation ratio on an aggregate should be applied to a resource provider or not.
14:36:30 but maybe those operators would prefer to rather discuss about the next spec;)
14:37:19 like, I don't remember whether OVH said they were okay with deprecating the possibility to set ratios thru nova.conf
14:37:28 but that folk was at the PTG
14:37:32 and if the default allocation ratios are changed (back) to 16.0/1.5/1.0 from the current 0.0 values, there will be no conceivable way to determine what to set the dang values to.
14:38:07 bauzas: at the PTG, dansmith offered up a solution to use new configuration options (default_xxx_allocation_ratios) that would only be used for initializing a new compute node's allocation ratios
14:38:12 jaypipes: I'll comment the patch and tell him to look at https://review.openstack.org/#/c/544683/
14:38:24 yeah I remember
14:38:42 bauzas: and I think dansmith is correct to say that we will need a new option and not try to jury-rig this crap any more than it already has been
14:38:51 what I'm trying to explain is that instead of him working on his sole patch, I'd be interested in him chiming on the spec :)
14:39:00 we need *something* other than just this
14:39:02 bauzas: so if you don't mind, I'm going to -2 that particular patch.
14:39:29 I'm fine with that
14:39:32 bauzas: that's cool with me, of course.
14:39:44 again, just make sure that he knows https://review.openstack.org/#/c/544683/
14:39:51 if you -2 it
14:40:14 dansmith: did we decide at the PTG who might be responsible for doing this? you didn't volunteer for that right? just want to make sure I'm not stepping on something you wanted to work on.
14:40:27 the spec needs an update
14:40:31 at least
14:40:40 with what we agreed at the PTG
14:40:43 no, I guess I didn't think it got much traction, so I don't know that anyone is on the hook for it
14:40:52 bauzas: yeah, I'm not even talking about the agg allocation ratio spec (yet).
14:40:57 I'll *try* to help
14:41:14 once I'm done with shitty paperworking
14:41:31 dansmith: k, I will work with bauzas and maciej on this then. we definitely need to solve this default allocation ratio (for the compute node, not agg) before proceeding with anything else on the aggregate side.
14:41:50 bauzas: heh.. yeah, I need to expense report stuff today as well.
14:41:54 ack
14:42:12 * edleafe just realized he has expense report crap too
14:42:31 jaypipes: see my note on #nova about the paperwork thingy
14:43:01 heh.
ok, back to you edleafe
14:43:11 jaypipes: thank you kind sir
14:43:16 #topic Bugs
14:43:19 #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement&orderby=-id
14:43:27 No new bugs this week
14:43:32 sec
14:43:36 about bugs
14:43:41 go for it
14:43:42 I haven't triaged for 3 weeks
14:43:53 now we have a shit number of untriaged and new bugs
14:44:27 so, disclaimer: that's not because you had no new placement bugs that there couldn't be placement bugs :p
14:44:48 there is a new bug this week: https://bugs.launchpad.net/nova/+bug/1751692
14:44:49 Launchpad bug 1751692 in OpenStack Compute (nova) "os_region_name an unnecessary required option for placement " [Undecided,Confirmed]
14:44:54 if folks feel enough brave to face the beast and fight it
14:44:55 ...
14:44:55 which I think needs some deciding
14:45:19 cdent: oh, that was from a few weeks ago
14:45:34 I thought it would have been discussed last week
14:45:36 cdent: we used that option for a signal
14:45:46 its the wrong option
14:45:48 bauzas: signal for what?
14:45:57 edleafe: yeah, not new, rather unchanged
14:46:10 jaypipes: for finding whether nova.conf was set for placement
14:46:57 I don't exactly remember why the others were not possible to use, but AFAIK os_region_name was like the rare only ones that were good for signaling
14:47:09 * edleafe is very calendar-literal
14:47:29 * jaypipes doesn't remember the patch that added this signal or discussing it
14:47:35 the idea was that you had to amend nova.conf to amend placement things before restarting nova-compute in Ocata
14:47:55 now we are post that, maybe that defensive approach becomes useless
14:48:07 it was more for an upgrade perspective
14:48:19 like, you have to make things in order to have things to works
14:48:42 bauzas: right, but just trying to connect to the placement service would provide the same signal, no?
14:48:44 os_region_name should be deprecated at this point, since the ksa adapter work.
14:49:13 jaypipes: IIRC, we weren't hard-failing when I introduced that
14:49:23 ie. the RT was just working "the old way"
14:49:33 now, it's indeed a blocking error
14:49:35 efried: that's the exact point of the bug
14:49:44 hence my point, just kill the conditional I guess
14:50:00 but the ideal would be to test
14:50:42 my point being that configuring access to placement is a prerequisite
14:51:03 if you haven't done that in Rocky, you should be already in serious trouble
14:52:25 So are we agreed?
14:52:33 I guess
14:52:37 * bauzas just adding a bug comment
14:53:00 Let's move on, then
14:53:07 #topic Open Discussion
14:53:22 cdent: did you have something about traits?
14:53:47 my two comments on the end of https://review.openstack.org/#/c/541507/ and https://review.openstack.org/#/c/497733/ are related to the general topic of "what's authoritative about traits" and "do we ever need to block them being reported"
14:54:01 neither comments is directly related to the reviews, just inspired by them
14:55:15 that reminds me some good efried's point about removing traits if the virt driver tells nothing
14:55:33 yes, that's part of it
14:56:14 so, I think there is an agreement that if the driver reports nothing, then remove the trait
14:56:26 trait(s) even
14:57:01 since traits are on RPs, then the RP should be authoritative. For compute, that would be the virt driver, right?
14:57:10 I'm not sure a CPU feature could just suddenly disappear, but from a conceptual PoV, I just feel we need to keep the driver responsible for passing the traits accordingly
14:57:14 That's how I feel edleafe
14:57:19 yeah that
14:57:21 But there was disagreement at the PTG.
14:57:23 now, there is a trap
14:57:38 ironic can have some intermittent issues
14:57:46 where it could report dumb
14:58:12 That sounds like an ironic bug (title: Ironic reports dumb)
14:58:14 I think we basically said that that special case was just an exception handling
14:58:45 bauzas: if the virt driver reports nothing, but an admin has set a trait on the compute node, we delete what the admin set? I don't think that's correct.
14:58:48 during the ironic/nova discussion at the super loudy and noisy breakfast room
14:59:02 jaypipes: hah, sec
14:59:06 that's another concern
14:59:37 here, I'm just talking of the fact that if the driver says "I have foo" and later says "I have bar", then we should add bar and remove foo
14:59:39 nearly out of time, continue in #openstack-nova?
14:59:54 jaypipes: IMO, what that means is that "admin sets a trait" needs to take some form that ultimately means "virt driver sets that trait".
14:59:55 FTR, I have the exact same concern about allocation ratios being set externally by an admin and being overwritten by the compute node resource tracker every periodic interval
15:00:15 jaypipes: Which may or may not be what your spec suggests.
15:00:24 I think at the PTG, we agreed on the fact that we can merge operator's defined traits with driver's defined traits
15:00:34 on an additive way
15:00:35 That way lies madness ^
15:00:42 Moving to -nova. Thanks everyone!
15:00:44 #endmeeting
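
Editor's note on the Open Discussion item (14:59:37 to 15:00:34): the "additive" merge of operator-defined and driver-reported traits can be sketched as below. This is only an illustration of the behaviour described in the log, not code from nova or placement, and the proposal was contested in the meeting. The function name `reconcile_traits`, its arguments, and the trait name `CUSTOM_GOLD` are hypothetical; `foo` and `bar` are the placeholder traits used in the discussion.

```python
def reconcile_traits(previous_driver, current_driver, operator_defined):
    """Hypothetical sketch: compute the trait set for one resource provider.

    Returns (desired, removed), where `desired` is the set to store and
    `removed` is what the driver used to report but no longer does.
    """
    previous_driver = set(previous_driver)
    current_driver = set(current_driver)
    operator_defined = set(operator_defined)

    # Additive merge: whatever the driver reports now, plus operator traits.
    desired = current_driver | operator_defined

    # Driver traits that disappeared and were not also set by the operator.
    removed = previous_driver - desired
    return desired, removed


# Driver said "I have foo" and later says "I have bar"; the operator set
# CUSTOM_GOLD (an assumed name) on the same provider.
desired, removed = reconcile_traits({"foo"}, {"bar"}, {"CUSTOM_GOLD"})
```

Under these assumptions `bar` is added, `foo` is removed, and the operator's trait survives a driver that reports nothing, which is the case raised at 14:58:45. The sketch requires the caller to track which traits came from the driver and which from the operator; the log does not say how or whether that tracking would be done.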