14:00:24 #startmeeting nova-scheduler
14:00:25 Meeting started Mon Mar 4 14:00:24 2019 UTC and is due to finish in 60 minutes. The chair is efried. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:28 The meeting name has been set to 'nova_scheduler'
14:00:32 o/
14:00:48 \o
14:00:59 o/
14:01:09 \o
14:01:54 o/
14:02:00 o/
14:02:03 #link agenda (just updated, please refresh) https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:02:28 \o
14:02:45 #topic last meeting
14:02:45 #link last minutes http://eavesdrop.openstack.org/meetings/nova_scheduler/2019/nova_scheduler.2019-02-25-14.00.html
14:03:26 old business:
14:03:31 #link alloc cands in_tree series starting at https://review.openstack.org/#/c/638929/
14:03:31 merged \o/
14:03:36 huzzah
14:04:00 #link the OVO-ectomy https://review.openstack.org/#/q/topic:cd/less-ovo+(status:open+OR+status:merged)
14:04:00 merged \o/
14:04:23 the rest we'll get to in new bizniss.
14:04:29 anything else from last week?
14:04:50 #topic specs and review
14:04:50 #link latest pupdate http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003385.html
14:04:50 pupdates are back \o/
14:05:09 #link blueprint tracking https://etherpad.openstack.org/p/nova-stein-blueprint-status
14:05:21 #link libvirt reshaper https://review.openstack.org/#/c/599208/
14:05:21 bauzas, what news?
14:05:46 efried: I just provided a comment
14:06:00 given the concern of alex_xu
14:06:15 but tbc, the series is done
14:06:21 we need reviews
14:06:44 o right, alex_xu would you please have a look?
14:06:59 Since you're here - if you still have concerns, we can discuss them in #opens
14:07:00 yea, sorry just saw the reply
14:07:17 or, uh, now I guess would also be fine.
14:07:42 no, my reading is slow, please go through the other items first
14:07:55 sure thing.
14:08:07 #link DRY/clean up placement objects starting at https://review.openstack.org/#/c/637325/
14:08:41 This is getting tall. Much work is being done here. jaypipes has a -1 on the bottom, cdent has responded (TLDR: what jaypipes suggests is happening, just further up the chain)
14:08:43 I rebased that this morning, more than I initially intended
14:08:50 heh, I noticed
14:09:11 my local tree had something already merged in it and I got confused
14:09:39 white belt mistake
14:09:53 * cdent has never liked rebasing
14:10:17 anyway, most of those 15 reviews are pretty easy, more eyes would be most welcome.
14:10:37 14 (/me can't count)
14:10:38 yeah, if we could get more of that merged, it would make the tallness less of a challenge
14:11:01 #topic Extraction
14:11:01 #link Extraction https://etherpad.openstack.org/p/placement-extract-stein-5
14:11:09 whoops, I put a link in the wrong place on the agenda...
14:11:15 #link Placement is set up to use storyboard https://review.openstack.org/#/c/639445/
14:11:18 ^ is merged
14:11:27 #link storyboard: https://storyboard.openstack.org/#!/project_group/placement
14:11:27 what does this mean, that we've officially got a storyboard presence?
14:11:32 thanks cdent
14:11:48 I've never used storyboard before, will need to find the storyboard-for-dummies reference
14:11:51 yes, but we haven't yet solidified plans of how to use it
14:12:02 ditto
14:12:15 I think they have some test projects too where it is possible to play around
14:12:42 related to storyboard: I marked as "fix committed" a bunch of bugs that were committed but hadn't been auto updated. there are a few left I'm not sure about:
14:13:18 which I suppose we can talk about during the bugs topic
14:13:33 https://storyboard-dev.openstack.org/ is the sandbox for storyboard
14:13:41 for playing around in
14:13:46 Thanks SotK
14:13:56 to make it clear, you want to use storyboard for what ?
blueprints only or bugs ?
14:13:58 #link storyboard sandbox https://storyboard-dev.openstack.org/
14:14:20 bauzas: nothing yet, because we haven't yet made a plan and it seemed weird to try to make a plan at this stage in the cycle
14:14:26 why?
14:14:26 ok
14:14:41 why to who ?
14:14:44 me or cdent?
14:15:01 why shouldn't we try to figure out how we're going to use sb?
14:15:05 me : because I'd like to know what to do when you want to provide some feature for placement
14:15:22 that's it
14:15:24 what are the options on the table?
14:15:34 we should figure it out, but we don't need to rush to change anything because we don't want to confuse end users who might be wanting to report bugs _now_
14:15:44 and are not party to our meanderings
14:15:58 for train related features I would think that creating stories in storyboard will be the way to go
14:16:00 Some time at the PTG for SB discussion/education would be a help
14:16:09 agreed with edleafe
14:16:11 What does it take to get, um, "subscribed" to new things that are put into sb against our project?
14:16:19 or wait for the new PTL to decide :)
14:16:28 but since we're not in a hurry in that direction, and people are focused on feature freeze and other things, I assumed we could put the SB planning on the back burner
14:16:53 efried: go to preferences under your name
14:17:03 where there are email settings you can choose
14:17:10 Yeah, let's not do StoryBoard 101 now
14:17:24 figured it out.
14:17:32 It was fairly intuitive, once I had realized I needed to log in.
14:17:35 so
14:17:57 It's probably fairly obvious, having decided to use this thing at all, what should happen in the Train cycle.
14:18:10 feel free to come and ask questions in #storyboard when you start trying to figure things out
14:18:38 Namely: new features and long-term wishlist items should get a, um, "story". And bugs should be reported there too.
14:18:49 I think it's the interim between now and then that's not cut and dried
14:18:56 is that a reasonable assessment?
14:19:10 yes
14:20:01 Be prepared for a "moving to git from svn" brain reshuffling. The workflow is not 1:1 to launchpad
14:20:16 so it seems to me as though we do need to figure out that interim, precisely so that people are not confused about where and how to report bugs
14:20:18 I think for "us" we can manage to use both launchpad and storyboard for the rest of the cycle, and in train it will be only storyboard. But for "them" I think asking people to switch not on a cycle boundary is weird
14:20:42 then this seems like an ideal transition period.
14:20:48 Send an email announcing we're migrating to SB
14:20:55 and that we'll continue accepting lp bugs until train opens
14:21:10 * bauzas just loves the SB acronym
14:21:21 but at that point we'll shut down lp (which I guess we won't/can't really do, but we can say it) and all bugs need to be reported in sb.
14:21:37 AFAIK, there are some tools for migrating bugs from LP to SB
14:21:42 and meanwhile train specs should be opened in sb exclusively.
14:21:53 efried: sorry, was etherpadding about cyborg... yes, will try to get to that series today.
14:21:56 bauzas: I think that's the part we're planning to wait on, until train officially opens.
14:21:59 right: the main stickler here is that people will _always_ continue reporting bugs in launchpad, so we can't simply forget about it because nova isn't getting shut down
14:22:00 jaypipes: ack, thx
14:22:07 you should poke some folks in #storyboard that'd be happy to help you, incl. but not exhaustively fungi (IIRC)
14:22:26 definitely not exhaustively ;)
14:22:37 yup, I get it cdent. When we officially cut over, we would I guess start marking placement bugs as Invalid in lp, requesting they be opened in sb instead.
14:22:51 something like that
14:22:51 there is an upgrade path
14:22:59 again, you should get info first
14:23:11 I really don't think we need to over analyze this. there is a very small number of bugs that get reported on placement itself
14:23:12 but some projects already made the leap
14:23:13 and in the interim, we would respond to all such bugs with a message like, "Okay, we're dealing with this here now, but on such-and-such a date we're transitioning..."
14:23:20 we just need to pay some attention and it will be fine
14:23:37 Okay, but in terms of immediate action, IMO somebody should send an email.
14:23:52 to which audience?
14:24:00 because we already did to "us"
14:24:02 ML
14:24:03 again, before taking any action, just consider the migration path *first* :)
14:24:27 efried: I mean which audience within the ML
14:24:31 unless you only wanna features
14:24:33 I can volunteer for that, though it'll have to be exclusively "what" since I have no idea on the "how".
14:24:41 [placement] ought to suffice, no?
14:24:57 because we only care about people opening bugs/features against placement
14:25:00 efried: I'm asking if you mean end users/operators or devs? Because that changes the pitch of the message
14:25:12 it does?
14:25:12 In any case:
14:25:14 Lemme draft something
14:25:18 and I'll show it to you
14:25:21 and we can go from there
14:25:22 #action: cdent send a message about storyboard
14:25:24 i'll do it
14:25:25 I don't think this is rocket science
14:25:26 okay.
14:25:36 moving on
14:25:46 It's not rocket science is what I've been trying to say for the last 10 minutes
14:25:52 anything else extraction?
14:26:07 some of the puppet stuff merged, which is cool
14:27:14 other than that, I think extraction is pretty much done for this cycle
14:27:26 it's next cycle when nova deletes placement that things get exciting again
14:28:18 cool
14:28:30 #topic bugs
14:28:30 #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:28:42 cdent: you wanted to talk about some possibly-done bugs needing to be marked in lp?
14:28:58 yeah, one sec
14:29:06 #link https://bugs.launchpad.net/nova/+bug/1696830
14:29:07 Launchpad bug 1696830 in OpenStack Compute (nova) "nova-placement-api default config files is too strict" [Low,In progress] - Assigned to Corey Bryant (corey.bryant)
14:29:15 #link https://bugs.launchpad.net/nova/+bug/1778591
14:29:15 Launchpad bug 1778591 in OpenStack Compute (nova) "GET /allocations/{uuid} on a consumer with no allocations provides no generation" [Medium,In progress]
14:29:21 are either done or stuck, I'm not sure which
14:30:04 I dropped https://goo.gl/vzGGDQ from 17 to 6 earlier today
14:30:19 mostly because we no longer have auto reporting of "fix committed"
14:30:25 \o/
14:31:40 I assume we have a similar problem with bugs not yet marked as in progress:
14:31:51 #link https://goo.gl/TgiPXb
14:31:52 but i haven't checked that and probably won't get to it today
14:32:45 so what needs to be done to move things along? Or is it necessary to move things along with any urgency?
14:33:30 no immediate urgency, but if any of those are "done" or "dead" they can be made to not cloud the storyboard waters
14:33:51 and just as a matter of principle a bug that hasn't had attention in a while isn't really real
14:35:07 okay.
14:35:20 #topic opens
14:35:20 (From last last week) Format of this meeting - new PTL to query ML
14:35:20 (From last last week) Placement team logistics at PTG - new PTL to query ML
14:35:20 #link post-extraction plans for placement specs/bps/stories http://lists.openstack.org/pipermail/openstack-discuss/2019-February/003102.html
14:35:40 libvirt reshaper fup - alex_xu, ready to discuss?
14:35:51 yea
14:36:05 bauzas: ^
14:36:19 yup
14:36:21 may i ask why we have this middle step https://review.openstack.org/#/c/599208/18? before we have multiple types
14:36:42 because if not, we need to reshape once more
14:36:57 Perhaps alex_xu is suggesting we go straight to multiple type support
14:37:03 the original PS was only providing a single inventory per *type
14:37:18 alex_xu: I think we agreed to do it in stages just to keep the surface area manageable for stein.
14:37:19 but for NUMA awareness, that couldn't make it
14:37:19 oh...that is what i thought when I read the spec
14:37:34 so, we decided to not just take only one aspect at once
14:37:50 but rather model the resources as the best for both affinity and type check
14:38:07 even if both affinity and type check features aren't there yet
14:38:22 - just to prevent extra reshapes that are costly -
14:38:41 HTH ?
14:39:12 yea, i see the reason now
14:39:41 but in the end, when we have numa and multiple type, then we will put the same type in the same RP, right?
14:39:59 I don't think so.
14:40:05 I think we would still split along pGPU lines.
14:40:18 what is the reason behind that ?
14:40:40 because it's still important to be able to map inventory units (placement) to PCI addresses (libvirt/linux)
14:41:06 and a given pGPU has a given number of VGPU units available
14:41:17 that's assuming we keep the traits of all the pGPUs the same, which isn't necessarily going to happen
14:41:36 and if we were to change the config of a pGPU that had been consolidated with another of the previously-same type, we would have to reshape.
14:41:38 yup, correct
14:41:47 whereas if we keep them separate, we can just do the change.
14:41:48 yea, agree with that, at least a VM doesn't need a lot of vGPUs
14:41:56 we will still provide inventories per physical GPU
14:42:12 then we simplify the code a little since the RP maps to the PCI address
14:42:19 if two pGPUs share the same type, that's good, but that's still two RPs
14:42:31 alex_xu: there is a spec on it
14:42:38 that basically says that yes
14:42:46 cool, thanks, i got it
14:43:36 alex_xu: fyi https://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/vgpu-rocky.html
14:43:40 Do any of the other reviewers who have participated along the way want the opportunity to give it a nod before we +W it?
14:43:55 honestly, except you and alex_xu... :p
14:44:09 and matt co-authored, so he can't really argue
14:44:17 but at the edge case, like when each of two pGPUs only has 1 VGPU left, and a VM is asking for 2 VGPUs?
14:44:19 mriedem jaypipes cdent artom melwitt gibi
14:44:52 alex_xu: If requested in the same request group, fail. If requested separately, that request can be fulfilled.
14:44:52 or can a pGPU actually provide enough VGPUs? isn't that the case to worry about?
14:45:11 efried: but that requires a different flavor
14:45:24 efried: I followed a bit.
and I'm for representing separate pGPUs as separate RPs
14:45:56 I think the implementation is what we said it should be, and in general a different "physical" thing should be a different rp
14:46:00 if I were designing a flavor, and it had multiple vgpus in it, and I wanted to assure most-likely-to-fit, I would request resources1:VGPU=1&resources2:VGPU=1&...&resourcesN:VGPU=1&group_policy=none
14:46:31 at some point somebody will ask for the numa awareness of the vgpus and then we need to know which vgpu is coming from which pGPU, which is close to which numa node
14:46:32 efried: I'd like the opportunity.
14:46:35 please
14:46:39 if I were being lazy and didn't care so much about most-likely-to-fit, I would request resources:VGPU=N
14:46:56 oh....request groups can be used like that
14:46:56 but that would bounce in scenarios of high saturation as alex_xu mentions
14:47:21 jaypipes: ack. So alex_xu if you approve, please don't +W
14:47:28 you lost me
14:47:39 I was dragged away for two mins
14:47:56 no, I will leave that to jaypipes, I'm not following this series enough
14:48:08 okay
14:48:11 bauzas: I think we're clear.
14:48:15 are we drawing something ?
14:48:21 a conclusion?
14:48:24 yes, I'm clean :)
14:48:27 just to make it clear, the reshape is just modeling resources
14:48:32 * efried rolls out his Jump To Conclusions Mat.
14:48:34 not how you query them :)
14:48:53 given you can't allocate more than a single vGPU atm, it's a no-brainer :)
14:49:02 for the record, I've reviewed that series a number of times now. I think half my reviews are sitting in draft state because various others bring up identical points I was raising.
14:49:11 just would like to do a once-over on it
14:49:17 for peace of mind
14:49:20 that would be helpful in any case jaypipes, thank you.
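[Editor's note] The two request styles efried describes above correspond to placement's granular request-group syntax on `GET /allocation_candidates` (introduced around placement microversion 1.25). A minimal sketch, assuming hypothetical helper names (`vgpu_query_spread` and `vgpu_query_single` are invented for illustration):

```python
# Sketch (helper names invented) of the two ways to ask placement for
# N vGPUs via GET /allocation_candidates, per the discussion above.

def vgpu_query_spread(n):
    """Numbered request groups with group_policy=none: each VGPU unit
    may be satisfied by a different pGPU provider, so the request is
    most likely to fit when individual pGPUs are nearly saturated."""
    groups = "&".join("resources{}:VGPU=1".format(i) for i in range(1, n + 1))
    return groups + "&group_policy=none"


def vgpu_query_single(n):
    """One unnumbered group: all N VGPU units must come from a single
    resource provider, which can fail under high saturation even when
    N units are free across several pGPUs."""
    return "resources:VGPU={}".format(n)


# e.g. vgpu_query_spread(2) -> "resources1:VGPU=1&resources2:VGPU=1&group_policy=none"
```

This is why the spread form "bounces" less: with one pGPU per RP, each numbered group can land on a different provider.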
14:49:22 tbh, the reshape change is hairy
14:49:45 hence why I'm very reluctant to have more than a single reshape :)
14:50:48 yeah, it's not that it's a resource drain on the system or anything; it's just pretty complicated to code up. And a special-purpose chunk of code that's going to get run at most once in a deployment and then never used again.
14:51:14 so good to minimize how many reshapers we write overall.
14:51:30 anything else on this topic before we move on?
14:51:57 okay, other topics for open discussion before we close?
14:52:45 Thanks everyone.
14:52:45 o/
14:52:45 #endmeeting
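[Editor's note] On why the reshape change is "hairy": a reshape moves inventories and their existing allocations together, atomically, in a single `POST /reshaper` request. The sketch below shows the general shape of such a payload for the VGPU case discussed above; all identifiers and generation values are invented placeholders, and the authoritative schema is the placement API reference, not this sketch.

```python
# Hedged sketch of a reshape payload: VGPU inventory moves from the
# compute-node root RP to a per-pGPU child RP, and the existing
# allocation moves with it, all in one atomic POST /reshaper request.
# All uuid/id/generation values below are invented placeholders.
CN_UUID = "compute-node-rp-uuid"
PGPU_UUID = "pgpu-child-rp-uuid"
INSTANCE_UUID = "instance-consumer-uuid"

reshape_payload = {
    "inventories": {
        # Root RP: VGPU inventory removed (empty inventories dict).
        CN_UUID: {"resource_provider_generation": 5, "inventories": {}},
        # New child RP per pGPU: now carries the VGPU inventory.
        PGPU_UUID: {
            "resource_provider_generation": 0,
            "inventories": {"VGPU": {"total": 16}},
        },
    },
    "allocations": {
        # Existing consumers must be moved in the same request, which is
        # much of what makes writing a reshaper complicated.
        INSTANCE_UUID: {
            "allocations": {PGPU_UUID: {"resources": {"VGPU": 1}}},
            "project_id": "project-id",
            "user_id": "user-id",
            "consumer_generation": 1,
        },
    },
}
```

Because this runs at most once per deployment and then never again, keeping the number of reshapes to one, as bauzas argues, avoids writing and testing more of this special-purpose code.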