14:00:13 #startmeeting nova_scheduler
14:00:14 Meeting started Mon Feb 5 14:00:13 2018 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:14 #link Agenda for this meeting https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:00:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:18 The meeting name has been set to 'nova_scheduler'
14:00:22 Good UGT morning!
14:00:26 Who's here?
14:00:43 o/
14:01:02 @/
14:01:15 comb your hair, efried!
14:01:26 oh hi
14:03:23 Huh, looks like it will be a quick meeting
14:03:24 #topic Specs & Reviews
14:03:40 Since Feature Freeze is past, not much to say about these changes, so I'll just paste 'em here for posterity, and if anyone has anything to say about them, go for it.
14:03:52 #link Provider Tree series, starting with: https://review.openstack.org/#/c/537648/
14:03:55 #link Nested RP traits selection: https://review.openstack.org/#/c/531899/
14:03:58 #link Granular resource requests review: https://review.openstack.org/#/c/517757/
14:04:01 #link Remove microversion fallback: https://review.openstack.org/#/c/528794/
14:04:04 #link Use alternate hosts for resize: https://review.openstack.org/#/c/537614/
14:04:09 Anyone want to discuss any of these?
14:04:25 I just edited to include spec links
14:05:08 efried: gotcha
14:05:21 We can be getting a jump on reviewing ready code.
14:05:26 even if we can't merge it.
14:05:38 true, true
14:06:01 o/ sorry for being late
14:06:15 hey there jaypipes_!
14:06:43 #topic Bugs
14:06:43 #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:06:52 A few new ones this week
14:06:53 efried seems to have found issues with generation consistency
14:07:29 I think I have fixes proposed for most of the ones I opened.
14:07:36 There's still at least one that's going to need some discussion.
14:07:45 Not sure if we want to do that here or wait til the PTG
14:08:27 Unless it's a simple thing, I'd prefer to wait for PTG so that we can get several eyes on it
14:08:34 I made a spec for the aggregates one: https://review.openstack.org/#/c/540447/ and including some editorializing in the alternatives section
14:08:41 It's simple. It's not easy.
14:08:51 * efried clicks
14:09:26 #link Add generation support in aggregate association https://review.openstack.org/#/c/540447/
14:10:26 efried: it sounds like we need a general cleanup of how we handle generations
14:10:38 Yes
14:10:53 BRB - gotta move my car
14:10:55 edleafe: cleanup?
14:11:23 More of a sweep to make sure we've got all the holes filled - on both sides (server & report client)
14:11:29 efried: is it cleanup or just making sure we are consistent in exposing the generation when changing things related to the provider?
14:11:33 gotcha
14:12:10 jaypipes: On the client side, there's a couple reviews out there for bugs.
14:13:04 one of them you've already been reviewing.
14:13:26 Then there's this one: https://review.openstack.org/#/c/539712/
14:13:37 * edleafe is back
14:14:00 efried: k
14:14:15 hm, maybe that's the only one related to generation conflicts so far. The rest we need to discuss eventually.
14:14:35 Which kind of leads us to...
14:14:36 #topic Open discussion
14:14:38 One thing on the agenda:
14:14:47 Should allocations fail if there is a generation mis-match for a provider, even if the provider still has sufficient available inventory?
14:14:50 #link https://bugs.launchpad.net/nova/+bug/1719933
14:14:51 Launchpad bug 1719933 in OpenStack Compute (nova) "placement server needs to retry allocations, server-side" [Medium,Triaged] - Assigned to Jay Pipes (jaypipes)
14:14:52 Could we perhaps increment the generation on a successful allocation even if the generation is older?
14:15:13 We don't need to answer this today
14:15:28 edleafe: it should fail and let the caller decide.
14:15:30 But it came up in discussion last week, and I didn't want it to get lost
14:15:31 This is related to https://bugs.launchpad.net/nova/+bug/1746373
14:15:32 Launchpad bug 1746373 in OpenStack Compute (nova) "Placement APIs with missing conflict detection" [Undecided,New] - Assigned to Eric Fried (efried)
14:16:04 jaypipes: why should allocations fail, though?
14:16:09 jaypipes: but if there is sufficient inventory, why should it fail?
14:16:14 Sorry, maybe I missed something. Does bug 1719933 imply that we're using generations in allocation APIs? Cause afaict, we're not.
14:16:15 bug 1719933 in OpenStack Compute (nova) "placement server needs to retry allocations, server-side" [Medium,Triaged] https://launchpad.net/bugs/1719933 - Assigned to Jay Pipes (jaypipes)
14:16:20 * bauzas waves super late
14:16:24 efried: they are used only server-side
14:16:54 cdent, edleafe: it should "fail" in so much as a 409 Conflict is returned and allows the caller to retry if it wants. i.e. what we already do in the claim_resources() process.
14:16:57 Swhat I thought. So we're always incrementing based on what's in the db, not what comes in on the request, right?
14:17:10 Or does the allocation_request contain a generation (and I missed it)?
14:17:32 at the api level it acquires a generation, and then can fail on a concurrent update
14:17:32 I think that this would be part of an overall discussion of how generations are used/exposed/etc. for PTG
14:18:12 ack
14:18:25 Right now the doc for PUT /allocations/{c} 1.12- says generation is ignored.
14:18:26 jaypipes: sure, but we talked (you even TODO'd) about allowing a few server-side retries, and it was in the discussion of how that was actually more complex than we initially thought that we realized these generation issues
14:18:48 cdent: ok
14:19:03 and at the moment I can't think of any reason why we would want to take a generation, it ought to just work (if there's room)
14:19:28 yeah, if we can retry on generation mismatch, what is having a generation involved getting for us?
14:19:59 Food for thought, and we have 3 weeks to digest until PTG
14:20:00 ...and doesn't figure in the POST /allocations API at all (at least based on the doc).
14:22:07 edleafe: did you make an explicit entry about this on the ptg etherpad (I think you said you did, but I can't remember for sure)?
14:22:10 efried: the generation is part of the RP part of the Allocation, no?
14:22:21 nope
14:22:36 In PUT, it's there but ignored. In POST, it's not there.
14:22:39 of the Allocation object, it is
14:22:42 cdent: not yet. I figured let's mention it here, and if it wasn't clarified, we'd PTG it
14:23:10 cdent: I do not see it.
14:23:28 efried: ask me after the meeting and I'll point it out/explain it
14:23:29 Anyway, I believe I did add this to the etherpad; let me go find the line no.
14:23:54 L61
14:24:34 Added
14:24:54 edleafe: Duplicate
14:24:57 It's kind of a dupe
14:24:59 jinx
14:25:06 at least move it next to the other one.
14:25:16 (just to note: "more than one thread managing allocations for a single consumer" is the crux of that biscuit, and something I thought we were disallowing, so is the real root of the topic)
14:25:18 There is a little bit of non-overlap.
14:26:49 So it seems to me that we need to, as a group, define the areas where generation is critical to ensure data integrity, and where it isn't.
14:27:00 And then make sure that the code matches that
14:27:55 just so
14:28:32 OK, then, anything else for Open Discussion?
14:28:49 yeah, couple things
14:29:03 I just wrote a blog post summarizing placement api changes:
14:29:08 #link queens summary https://anticdent.org/placement-queens-summary.html
14:29:17 Ah, yes, thanks for that
14:29:20 that's _only_ api changes. can do more if people ask
14:29:39 I'm 2/3 of the way through my blog post on Alternate Hosts
14:29:49 And last week I wrote a different posting, which I would appreciate some feedback on, especially from jaypipes (if only from a long term history point of view)
14:29:57 cdent: I like the idea to just tell about API changes
14:29:58 #link placement extraction https://anticdent.org/placement-extraction.html
14:30:34 I'd like to think that _someday_ we can do the extraction, but without some shared discussion on the topic, it will never get traction
14:30:51 and we'll always be saying "after X"
14:30:52 _someday_ is a reasonable target
14:30:58 And I think the longer we wait to do that, the more difficult it will be
14:31:10 do we still consider that a necessary thing?
14:31:32 I do
14:31:34 I mean, if other projects can use openstackclient or directly call the Placement API, why is that such a big deal?
14:31:42 bauzas: it's necessary for the health of openstack, even if there are no technical reasons
14:31:43 from a technical perspective I mean
14:31:46 (but there are technical reasons)
14:32:03 OpenStack could all be in Nova :)
14:32:11 cdent: do you feel placement was given lower priority than other features in Nova?
14:32:32 bauzas: if you haven't had a chance to read the posting, it covers various things
14:32:43 bauzas: no, actually, I think that placement takes _too much_ priority in nova
14:32:53 and that nova would benefit by having placement not in it
14:32:57 cdent: that's certainly a good point
14:33:08 cdent: too much traction
14:33:19 Development of placement is currently being guided entirely by nova, and therefore (despite intentions to the contrary) primarily for the benefit of nova.
14:33:23 cdent: but the thing is, I feel we're still on a firedrill
14:33:41 an artificial firedrill
14:33:44 and the interface between Nova and Placement isn't fully sold
14:34:00 that interface shouldn't matter to extraction, should it?
14:34:04 for example, consider efried's point on generic request expression
14:34:16 cdent: well, a single repo eases things, right?
14:34:19 no
14:34:20 no
14:34:20 no
14:34:32 totally disagree with that
14:34:45 honestly, I was interacting way less with placement this cycle than in the others
14:34:55 (this :) )
14:34:55 so I need to catch up
14:35:24 but still, I'm a bit concerned about how difficult it can be for things to coordinate if we split
14:35:26 anyway
14:35:32 let's not overdiscuss that now
14:35:45 it's pretty serious meat for discussions at the PTG :-)
14:36:04 tbc, I'm not opposed to the split
14:36:05 bauzas: if you've got comments on the blog posting, before the ptg, if you could leave them there that would help move things forward before we get to the ptg
14:36:18 cdent: I'll certainly read the blogpost
14:36:32 if this coming ptg is anything like the last one, speculative discussion about what we _might_ do will be shut down by people who want to talk only about stuff we are definitely going to do
14:36:44 so we need to discuss as much as possible speculatively beforehand
14:36:53 good point
14:36:54 bauzas: comments at least as important as reading
14:37:14 edleafe: yeah, not just about extraction, but most topics
14:37:39 roger this
14:37:42 "If it isn't going to be completed in this cycle, why should we talk about it?"
14:38:18 Anyway, let's continue the discussion on the blog post
14:38:26 ✔
14:38:28 Anything else before we wrap things up?
14:38:36 no sir
14:38:45 Going once...
14:39:02 Going once...
14:39:10 oops - twice!
14:39:24 OK, thanks everyone!
14:39:24 #endmeeting