14:04:00 #startmeeting nova scheduler
14:04:01 Meeting started Mon Jun 4 14:04:00 2018 UTC and is due to finish in 60 minutes. The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:02 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:06 noo
14:04:06 The meeting name has been set to 'nova_scheduler'
14:04:16 howdy, it works \o/
14:04:19 o/
14:04:26 #chair efried bauzas edleafe jaypipes
14:04:26 s//_/
14:04:29 Current chairs: bauzas cdent edleafe efried jaypipes
14:04:41 ō/ again
14:04:46 it fixes it automagically as I recall
14:05:05 I wasn't expecting to have "nova scheduler" be transformed into "nova_scheduler", I was rather thinking it was creating a "nova" one
14:05:11 anyway, gtk
14:05:15 #link agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:05:25 I'm there for 20 mins
14:05:45 please thank the daylight summer time
14:06:00 As there hasn't been a recent placement update email (I'll try to do one this coming Friday) I'm not sure of the state of specs and reviews but:
14:06:04 #topic specs and reviews
14:06:13 any that people would like to bring up?
14:06:20 nrp-in-alloc-cands series needs second +2s
14:06:29 current bottom: https://review.openstack.org/#/c/567113/
14:06:46 will be exciting to see that live
14:06:48 melissaml is +1, so it's clearly ready.
14:07:03 I can honestly approve the spec, not the series
14:07:08 because SQL, dudes
14:07:23 I looked at it, I understand it
14:07:40 but how can I know whether we have problems with that except just trusting ?
14:07:46 The second +2 is really on jaypipes
14:07:51 maybe jaypipes is the best person to +W
14:07:53 very close reading of the tests?
14:07:53 yeah that
14:07:54 yes
14:07:58 o/
14:08:15 any other reviews in a similar state?
14:08:23 im ready to response questions on that series and hope that jay have a second look on that
14:08:33 I'd like to know the state of consumer gen series.
14:08:41 Whether that's ready for a final look yet.
14:08:51 my own series is blocked by https://review.openstack.org/#/c/557065/
14:08:55 I beg here for reviews
14:08:58 current bottom of consumer gen: https://review.openstack.org/#/c/567678/9
14:09:34 bauzas: Have we given up on using yaml right out of the gate on that one?
14:09:37 I have a merge conflict on my implementation but since I need to update my series because of a spec's revision, please at least review the spec first
14:09:43 efried: indeed
14:09:44 tetsuro, efried, bauzas: I don't believe we should proceed with anything in n-r-p until we settle on a design for the in-place upgrade of the compute nodes.
14:10:08 jaypipes: that's a reasonable assumption, at least for new resource allocations
14:10:34 jaypipes: I think the problem is not existing with resource classes that don't need to be migrated
14:10:36 jaypipes: That's not a necessary blocker, esp. for drivers that haven't yet modeled resources that will need to move.
14:10:43 exactly
14:10:43 what efried said
14:10:56 also
14:11:01 one thing we need to keep in mind
14:11:04 Concrete example: I want to start modeling GPUs in powervm.
14:11:07 we support rolling upgrade
14:11:22 so we still need to support N-1 computes
14:11:34 bauzas, efried: guys, if the drivers begin modeling resources with trees, that's when stuff will blow up, right? so we need to put all those series on hold until we figure out a path forward for the "healing" process stuff.
14:12:05 jaypipes: No, only in cases where *existing* resources are being *moved* from the compute node RP to subtrees
14:12:15 jaypipes: does it include the case when Neutron will start report nested RPs?
14:12:21 efried: right. so, NUMA, VGPU, etc.
14:12:35 jaypipes: PowerVM doesn't have GPUs yet - been deliberately waiting for this reason.
14:12:56 again, one thing to keep in mind is that even if we merge compute patches for NRP in Rocky, scheduler and placement still have to handle Queens RPs
14:13:04 So as soon as we have nrp-in-alloc-cands, we can start - we don't have to wait for the migration stuff. Because we won't have migration yet.
14:13:44 efried: here comes the bluntness everyone hates me for apparently... but I really don't prioritize PowerVM's driver stuff over getting the fundamentals correct for the whole rolling upgrade problem that bauzas brought up on ML.
14:13:47 efried: tbc, nrp-in-alloc-cands work with rolling upgrades because we check microversions
14:14:22 I'm giving PowerVM as an example of why nrp shouldn't be blocked on the upgrade issue.
14:14:24 ok, let's not freak out
14:14:30 (like the GitHub thing :p )
14:14:40 we're still end of Rocky-2
14:15:04 Not asking anyone to prioritize PowerVM. Just asking to prioritize nrp like we've been asking to prioritize it since the beginning of the cycle.
14:15:08 so we can settle down a solution for rolling upgrades *and* merge bits on nrp in time for Rocky hopefully
14:15:42 I was just trying to identify the upgrade impact
14:15:55 hence me saying that some things shouldn't be blocked
14:15:59 efried: I'm just saying if we're going to focus on n-r-p things, it should be anything that will enable the fixes for rolling upgrade.
14:16:00 but looks like we're diverting
14:16:11 jaypipes: that sounds reasonable
14:16:23 in terms of upstream effort on prioritization
14:16:35 gosh, I have gloves
14:16:46 efried: if that's tetsuro's n-r-p with allocs series, fine, but I don't *think* that series will have anything to do with rolling upgrades will it? I mean, we've been discussing a completely new HTTP endpoint for handling these mass migrations.
14:17:06 jaypipes: tetsuro is proposing a new microversion for that
14:17:15 bauzas: for what?
14:17:22 jaypipes: for returning child resources
14:17:23 sorry, I didn't follow that last question, jaypipes
14:17:37 so I guess we can pretend in Rocky to not know about nested RPs
14:17:46 agh, no.
14:18:00 inventories will report the new model, but then scheduler will only speak old greek
14:18:06 bauzas: right, but that (while a very useful addition for the scheduler to make use of n-r-p) isn't relevant to the mass migration API.
14:18:12 that's one way to tackle this
14:18:34 jaypipes: agreed, it's orthogonal, good point
14:18:44 jaypipes: but I guess we somehow needs to address that too
14:18:54 ie. we have two upgrade concerns
14:18:59 OK, what patches *currently up* would enable any of the mass migration stuffs? anything?
14:19:12 1/ is the allocations and inventories of resource classes that are now attached to a child RP
14:19:29 no, we haven't started that yet jaypipes, as far as I'm aware.
14:19:35 2/ is the fact that scheduler will have to handle the fact we have both Queens and Rocky RPs
14:19:53 AFAICT, we should -W the XenAPI VGPU series as well as the multi-gpu-type series until the mass migration stuff is resolved. Would bauzas, efried, cdent you agree with that?
14:19:54 last I recall was last friday's (or maybe thursday's) discussion of the etherpad
14:19:58 (hope people follow me)
14:20:14 jaypipes: can't disagree
14:20:24 jaypipes: Agree we should hold up series impacted by upgrade issue, yes.
14:20:29 it's sad but it's necessary
14:20:43 bauzas, efried: specifically those two series, temporarily, yes?
14:20:54 we should _not_, however, block tetsuro's stuff, right?
14:20:59 bauzas, efried: are there any *other* patch series currently up that should be -W'd for the same reason?
14:21:06 I haven't thought through thoroughly, but that's a qualified agreement that it's those two series that need to be held.
14:21:20 cdent: block, no. de-prioritize (slightly) to deal with the upgrade-y stuff, maybe.
14:21:21 cdent: Correct, we should *not* block nrp-in-alloc-cands - we should get that reviewed and merged asap.
14:21:43 efried: ack. I can review it today.
14:21:48 jaypipes: my own series wasn't yet providing nested inventories yet
14:21:50 I'd like to see it merged asap as well, just so it is out of the way, but still usable
14:22:00 efried: just want to point out the upgrade stuff should take precedence/priority as much as possible.
14:22:11 jaypipes: Cool; if it helps, I can propose spec and/or code for the upgrade stuff so you're clear to review tetsuro's series.
14:22:41 I'm +2 on the bottom few patches, and mostly up to speed on the rest
14:22:46 efried: well, if you can whip up a spec that would be cool, yes. we still need to settle differences on the various proposals in that etherpad, though.
14:23:02 yes. Let's talk about that some more in a bit.
14:23:05 efried: you're +2 on tetsuro's series' bottom patches?
14:23:07 can we put the etherpad link here ?
14:23:17 jaypipes: yes
14:23:20 and snaaaap, I need to go babysitting
14:23:24 https://etherpad.openstack.org/p/placement-migrate-operations
14:23:26 #link https://etherpad.openstack.org/p/placement-migrate-operations
14:23:49 I'd actually appreciate a hangout session on the ^^ for higher-bandwidth communication.
14:24:26 Cool, are folks free after this? cdent jaypipes, anyone else?
14:24:29 would anyone have time to discuss the above on a hangout?
14:24:46 efried: I'd like at least cdent and edleafe on the hangout if possible.
14:24:53 I'm free 2.5 hours from now
14:24:57 our API experts-in-resident
14:25:30 cdent: heh, unfortunately I have a call from 2.5 hours from now to 3.5 hours from now.
14:25:32 jaypipes: I can do the hangout right around 3pm UTC
14:25:45 uhu
14:25:51 anyway, I need to leave now
14:25:51 perhaps we can schedule a hangout for tomorrow morning (EST) and do some brainstorming today on the etherpad?
14:25:57 I could maybe do 1 hour from now if it was only .5 hour long?
14:26:01 find a slot and I'll see how I can sneak into
14:26:13 or tomorrow morning works too
14:26:24 (I'm still pst, but operating early)
14:26:35 reminder to folks on the etherpad... please set your name in the participants color box thing.
14:26:46 currently on the table: 1530 UTC for half an hour, or 1700 UTC
14:26:57 cdent: oh, crap, forgot about you being in PST...
14:27:07 edleafe: either of those better for you?
14:27:18 jaypipes: no worries, I'm waking up early
14:27:19 efried: I think edleafe may be commuting to the office?
14:27:24 cdent: ack
14:27:41 * jaypipes needs to head to his daily standup meeting in 2 minutes.
14:27:57 * alex_xu only wants to wait the conclusion
14:27:58 let's work out a when in #openstack-placement
14:28:08 k
14:28:13 for having one tomorrow morning
14:28:18 with
14:28:25 #action review https://etherpad.openstack.org/p/placement-migrate-operations today
14:28:33 ++
14:28:38 wfm
14:28:59 ok, unfortunately I need to run to another meeting :(
14:29:09 I got that we're focusing on upgrade issue. After this I should go, but will have a look on that etherpad tomorrow
14:29:17 #action review tetsuro's n-r-p alloc cands patch series as soon as possible
14:29:22 okay, moving on, any other reviews to discuss?
14:29:39 * jaypipes goes ethereal
14:29:45 #topic bugs
14:29:54 #link bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement&orderby=-id
14:30:26 32 of them, nothing super new
14:30:37 anybody want to talk about a bug?
14:31:06 #topic opens
14:31:14 anyone on anything?
14:31:49 If not, then our primary next steps are to look at the migration etherpad and to get tetsuro's nrp in allocation-candidates reviewed and merged
14:31:56 cool?
14:32:24 cool
14:32:30 cool
14:32:36 ++
14:32:38 thanks gibi, for not leaving me hanging
14:32:54 thanks for coming everyone
14:32:57 #endmeeting