14:00:14 <melwitt> #startmeeting nova
14:00:14 <openstack> Meeting started Thu Sep  6 14:00:14 2018 UTC and is due to finish in 60 minutes.  The chair is melwitt. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:18 <openstack> The meeting name has been set to 'nova'
14:00:18 <cdent> o/
14:00:27 <melwitt> hello everybody
14:00:28 <dansmith> o.
14:00:32 <mriedem> o/
14:00:40 <takashin> o/
14:00:46 <tetsuro> o/
14:00:49 <edleafe> \o
14:00:56 <melwitt> let's get started
14:01:06 <melwitt> #topic Release News
14:01:15 <melwitt> #link Stein release schedule: https://wiki.openstack.org/wiki/Nova/Stein_Release_Schedule
14:01:40 <efried> ō/
14:01:41 <melwitt> final rocky release was last thursday. we're still working on bugs and backporting them to stable/rocky
14:02:00 <melwitt> so now, we kick off the stein cycle with the PTG next week
14:02:39 <melwitt> that's all I have for release news. anyone have anything else?
14:03:00 <melwitt> #topic Bugs (stuck/critical)
14:03:17 <melwitt> we have one bug in the critical link
14:03:26 <melwitt> https://bugs.launchpad.net/nova/+bug/1790701
14:03:26 <openstack> Launchpad bug 1790701 in OpenStack Compute (nova) "online_data_migrations fail in rocky+" [Critical,In progress] - Assigned to Matt Riedemann (mriedem)
14:03:49 * bauzas waves late
14:03:55 <mriedem> need https://review.openstack.org/#/c/599744/ approved
14:04:04 <dansmith> I have that open now
14:04:07 <mriedem> with that and another fix already merged,
14:04:11 <mriedem> we have nova-status passing in devstack https://review.openstack.org/#/c/599847/
14:04:14 <mriedem> for fresh install
14:04:18 <bauzas> I'm on the patch
14:04:20 <mriedem> something i should have added long ago
14:04:32 <mriedem> i'll start backports after the meeting
14:04:45 <melwitt> ok, coolness
14:05:06 <melwitt> thanks
14:05:14 <melwitt> #link 51 new untriaged bugs (up 1 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
14:05:22 <melwitt> #link 11 untagged untriaged bugs (up 1 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
14:05:39 <melwitt> not too big of an increase from last week in bugs, thanks to all who have been helping with triage
14:05:47 <melwitt> #link bug triage how-to: https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
14:05:52 <melwitt> #help need help with bug triage
14:06:09 <melwitt> Gate status
14:06:10 <melwitt> #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
14:06:11 <melwitt> gate has seemed OK
14:06:22 <melwitt> 3rd party CI
14:06:27 <melwitt> #link 3rd party CI status http://ci-watch.tintri.com/project?project=nova&time=7+days
14:06:46 <melwitt> anything else for bugs or gate status or 3rd party CI?
14:06:52 <mriedem> 3rd party ci needs https://review.openstack.org/#/c/599672/
14:07:05 <mriedem> that was the 0.0 allocation ratio thing killing the non-libvirt ci jobs
14:07:13 <melwitt> right, ok. will review
14:07:22 <melwitt> thanks
14:07:43 <melwitt> #topic Reminders
14:07:51 <melwitt> #link Stein Subteam Patches n Bugs: https://etherpad.openstack.org/p/stein-nova-subteam-tracking
14:07:56 <melwitt> #link Stein PTG planning: https://etherpad.openstack.org/p/nova-ptg-stein
14:08:12 <melwitt> I've updated the etherpad with a schedule ^
14:08:39 <melwitt> the cyborg team is going to talk about placement integration stuff on monday from 2pm - 3pm at the cyborg room
14:08:53 <melwitt> they'd like for interested folks from the nova team to join
14:09:08 <cdent> there's a blazar one on tuesday at 10am (I think)
14:09:12 <efried> yes
14:09:25 <melwitt> ok, will add a note about that on the schedule
14:09:28 <mriedem> and mfing edge at 4pm on tuesday
14:09:53 <melwitt> edge is having an all day thing on tuesday, I
14:09:58 <mriedem> right,
14:10:00 <melwitt> will add a note about 4pm being nova time
14:10:03 <mriedem> but their nova-specific stuff starts around 4
14:10:06 <mriedem> already done
14:10:09 <melwitt> thanks
14:10:49 <melwitt> we have the rocky retro first thing on wednesday
14:10:54 <melwitt> #link Rocky retrospective for the PTG: https://etherpad.openstack.org/p/nova-rocky-retrospective
14:11:09 <melwitt> there's almost nothing on the etherpad, so I expect it to be short
14:11:28 <melwitt> but we'll at least talk about runways and any changes we'd like to make to the spec freeze date this time
14:11:39 <melwitt> and kick off runways for stein accordingly
14:11:46 <efried> oo, I just thought of this, when we do the retrospective *next* time, we get to call it the stein whine
14:11:53 * efried crawls back into hole
14:12:00 <melwitt> that's something to look forward to
14:12:17 <melwitt> ok, that's all I have for reminders. anyone else have anything to add for reminders?
14:12:36 <melwitt> #topic Stable branch status
14:12:58 <melwitt> #link stable/rocky: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/rocky,n,z
14:13:10 * melwitt needs to review
14:13:18 <melwitt> #link stable/queens: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens,n,z
14:13:23 <melwitt> #link stable/pike: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/pike,n,z
14:13:28 <melwitt> #link stable/ocata: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/ocata,n,z
14:13:53 <melwitt> #help please help with stable reviews
14:14:04 <melwitt> there are a lot of reviews
14:14:22 <melwitt> anything else for stable branch status?
14:14:36 <melwitt> #topic Subteam Highlights
14:14:51 <melwitt> we didn't have a cells v2 meeting yesterday. anything you'd like to mention here dansmith?
14:14:59 <dansmith> not really,
14:15:15 <dansmith> several of us have been out here and there, like surya this week
14:15:38 <dansmith> I think we got the flag for reverting to the old skip behavior all nailed down (not sure if it merged yet or not)
14:15:52 <dansmith> and we've been iterating on the proper down-cell stuff, which has been a little slow with the people outages
14:15:53 <mriedem> i haven't looked at that yet
14:15:56 <dansmith> but otherwise going pretty well
14:16:01 <dansmith> mriedem: yeah, would be good to get your ack on that
14:16:10 <mriedem> given i asked for it...
14:16:11 <mriedem> yeah
14:16:22 <melwitt> cool, thank you
14:16:42 <melwitt> scheduler, efried?
14:16:46 <efried> No sched meeting this week due to labor day (though in retrospect it would have been polite of me to send an email to that effect).
14:16:46 <efried> But I would like to have a brief update on placement extraction.
14:16:52 <efried> As of yesterday we've merged the forty-some patches to get the extracted repository to the point of gating/voting unit/func/pep, which is a great milestone.
14:17:25 <cdent> it was in honor of efried's 42nd birthday
14:17:36 <efried> And with a couple of pending patches as deps, I think cdent has gotten devstack working, as proven by placecat etc. cdent, care to unmuddle that?
14:18:03 <cdent> I got tempest working against https://review.openstack.org/#/c/600162/
14:18:07 <cdent> but not grenade of course
14:18:29 <cdent> and placecat is my docker driven test suite for placement, the container now uses openstack/placement instead of openstack/nova as its source
14:19:59 <melwitt> cool, glad things are going well
14:20:35 <melwitt> I think gibi isn't around, no notes left for notifications team
14:20:58 <melwitt> and I think gmann isn't around, no notes left for API team
14:21:09 <melwitt> anything else for subteams before we move on?
14:21:44 <melwitt> #topic Stuck Reviews
14:21:56 <melwitt> no items in the agenda. does anyone in the room have anything for stuck reviews?
14:22:36 <melwitt> #topic Open discussion
14:22:57 <mriedem> if it's not on the agenda,
14:23:03 <mriedem> cern is going to have a specless bp request for https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes
14:23:12 <mriedem> to support extending in-use rbd volumes
14:23:23 <mriedem> the os-brick code isn't merged yet
14:23:26 <melwitt> ok, yeah not in the agenda
14:23:40 <mriedem> and i've said on the nova change that i want to see the ceph job passing with the volume extend tempest test on that nova change first
14:23:40 <melwitt> ok
14:24:07 <melwitt> sounds like a good plan to me
14:24:25 <efried> Something I'd like to put in folks' noggins:
14:24:25 <efried> Do we ultimately see *all* device passthrough eventually going through cyborg, or just accelerators?
14:24:51 <efried> Looking at the long-term plan for torching the existing pci passthrough code
14:25:35 <bauzas> efried: no
14:25:40 <bauzas> efried: please
14:25:52 <melwitt> mriedem: looks like a parity thing for that blueprint, so I'm +1 on approving
14:26:07 <bauzas> efried: cyborg is a management API for accelerators, but please don't purge the capabilities that nova has to manage a set of devices out of it
14:26:16 <melwitt> anyone else have opinions about the approval of specless blueprint https://blueprints.launchpad.net/nova/+spec/extend-in-use-rbd-volumes ?
14:26:46 <mriedem> melwitt: it might be premature until the actual brick change is approved and ceph testing is green
14:26:50 <mriedem> i was just bringing it up as an fyi
14:26:54 <bauzas> I was about to say the same
14:26:56 <efried> sorry for the cross-talk, lemme know when you're done
14:27:01 <bauzas> it requires a new osbrick version
14:27:10 <bauzas> os-brick even
14:27:30 <melwitt> mriedem: ok. so we'll wait to approve until after that. sorry, I thought you were asking to approve now
14:27:35 <bauzas> but if that's straightforward in nova, I'm not opposed to the specless-y
14:28:21 <melwitt> k. cool. I think we're done with that then
14:28:59 <melwitt> efried: go ahead, sorry about that
14:29:09 <efried> thanks.
14:29:37 <dansmith> my opinion is that cyborg isn't far enough along to have enough confidence in it to replace things like basic device attach,
14:29:38 <efried> So I know cyborg is going to get involved in doing the discovery and reporting (to placement) of accelerator inventory.
14:29:48 <dansmith> especially with SRIOV type things that need some network attention
14:30:19 <dansmith> I would kindof expect that the PCI attach functionality in nova is how we end up attaching accelerators under the covers anyway, perhaps without the same level of whitelisting nonsense
14:30:21 <bauzas> yeah I think it's premature
14:30:28 <efried> well, I agree with that for sure. We're not going to be able to replace the whole pci subsystem all at once.
14:30:33 <dansmith> but until cyborg becomes a much more mature thing, I'm not really in favor of replacing anything with it,
14:30:39 <efried> but we can take one of two paths wrt cyborg
14:30:40 <dansmith> and only trying to enable what new things it might bring
14:30:59 <bauzas> I'm still a bit concerned
14:31:11 <efried> we can either make the effort to embrace it and thus help it mature, pulling in pieces as they become available/usable
14:31:17 <bauzas> if we say this way, then we should have said to leave vGPUs out of the nova radar
14:31:32 <efried> or we can go our own way and then do a second, bigger, more painful integration later when we consider cyborg "mature".
14:31:34 <bauzas> the most crucial thing is not what we have, but how we support it
14:31:53 <dansmith> efried: you mean "if/when"
14:31:56 <mriedem> we've said no to fpga directly in nova for years,
14:32:01 <dansmith> because the if part is the important bit to me
14:32:02 <mriedem> cyborg is the path to fpga in nova
14:32:08 <mriedem> so let's see that happen first
14:32:12 <dansmith> mriedem: exactly
14:32:21 <mriedem> before spending a bunch of time retrofitting what we already have
14:32:28 <bauzas> oh yeah
14:32:36 <efried> chicken/egg, self-fulfilling prophecy, and all that.
14:32:50 <efried> I.e. if we take path A, cyborg is more likely to be a long-term success.
14:32:51 <cdent> If I'm understanding efried correctly, the concern here is about architecture over the long term
14:33:05 <bauzas> and we could potentially improve the PCI functionality without really pulling it out of nova
14:33:08 <cdent> if there's a chance that cyborg will become more generic it needs to start out that way sooner
14:33:14 <efried> yes, that ^
14:33:28 <bauzas> I'm not opposed to having the same feature be done in two different ways
14:33:54 <efried> Look, it actually makes my life easier if we say we're going to ignore cyborg for a couple of cycles and start rolling our own placement-based device passthrough, per kosamara's spec as written.
14:34:05 <bauzas> after all, we've been wondering for 4 cycles now how cyborg will interact with nova
14:34:52 <bauzas> yeah, and I think it's not a big deal for placement, right?
14:34:53 <efried> well, not really, only since Dublin has it been more than a haze
14:35:14 <bauzas> I've been hearing about cyborg since Barcelona
14:35:39 <bauzas> it's just that we had a chat with them since Dublin, yeah
14:35:48 <efried> but I'm trying to consider what's best long term, and whether we have a duty^Wresponsibility^Wopportunity to help raise the project and help it mature.
14:36:14 <bauzas> but we can also try to avoid overguessing what the future could be, and let people engage with us
14:36:30 <bauzas> for example, blazar is way older than placement
14:36:38 <bauzas> but at the end, they will use it
14:36:59 <bauzas> I don't see a problem having both efforts
14:37:19 <melwitt> I guess I'm not sure how cyborg being generic enough is related to which thing we integrate first
14:37:29 <efried> so I believe it was dansmith who asked the question on kosamara's spec, lemme find that...
14:37:47 <melwitt> the fpga thing will be the first step and if that works well, we could consider moving other passthrough to it right?
14:37:49 <cdent> a question standing here is "do we have a chance to collaborate rather than duplicate effort"
14:38:07 <dansmith> melwitt: cyborg is not generic enough today, as defined/planned I think
14:38:14 <efried> dansmith: https://review.openstack.org/#/c/591037/ PS5: "I would have expected a lot of the stuff described here to be in scope for cyborg. Not that we should exclude all that from nova necessarily, but I think that it's probably worth calling out how this intersects (or not) cyborg's intended scope."
14:38:15 <dansmith> melwitt: they're asking if we should encourage them to *be* generic enough
14:38:20 <melwitt> dansmith: oh, ok
14:38:46 <melwitt> was just looking at their wiki again, "various types of accelerators such as GPU, FPGA, ASIC, NP, SoCs, NVMe/NOF SSDs, ODP, DPDK/SPDK and so on" so I thought that sounded generic
14:38:47 <efried> well, when I asked Sundar this question, his reaction was yes.
14:39:09 <mriedem> this is premature - given how slow things move, they should opt to be generic if possible,
14:39:15 <mriedem> but not at the expense of actually getting shit done
14:39:22 <efried> right
14:39:30 <mriedem> i don't think we have any duty to raise that project
14:39:34 <mriedem> we can collaborate, sure
14:39:40 <mriedem> but it's not my top priority by any means
14:39:44 <efried> My point being that that affects how we proceed in nova with device passthrough and making existing pci code diaf
14:39:44 <dansmith> neither mine,
14:39:47 <mriedem> and i expect it's not the priority for others
14:39:54 <efried> it is mine, actually.
14:39:57 <dansmith> but I think that efried is asking because he wants to know whether to push on the nova-centric generic device approach,
14:40:00 <dansmith> or go push in cyborg
14:40:02 <efried> correct
14:40:07 <efried> thanks dansmith, nail on head
14:40:37 <mriedem> i'm likely not going to be involved in that either way, at least not in stein, so doesn't matter to me personally
14:40:54 <mriedem> obviously decomp is best if possible,
14:40:59 <mriedem> but that might take a couple of years
14:41:09 <dansmith> decomp like "let that corpse rot" ?
14:41:16 <efried> vay
14:42:39 <efried> Okay, so dansmith if the response in that review is, "this may or may not be in scope for cyborg long-term, but we're going to do it this way until that project matures more"...
14:42:40 <mriedem> punt to ptg?
14:42:41 <efried> that wfy?
14:43:06 <dansmith> mriedem: yeah, I've typed out several responses and deleted them all because I can't articulate my feelings on the matter
14:43:11 <dansmith> so maybe ptg
14:43:12 <efried> Yeah, definitely going to discuss some at ptg, but wanted to get a couple of gears turning in y'all's heads.
14:43:26 <dansmith> I guess the bottom line is:
14:43:42 <dansmith> I don't have a lot of faith in cyborg becoming a useful generic device service as it is today
14:43:54 <dansmith> so if I cared about generic devices a lot, I probably wouldn't put my eggs in that basket
14:44:23 <dansmith> but, since I don't care so much, putting them over there keeps them out of the way in nova
14:44:24 <dansmith> so..? :)
14:44:37 <efried> okay
14:44:38 <efried> so
14:45:00 <efried> I'm going to be pushing hard for at least a small piece (full GPUs) of generic placement-based device passthrough in Stein.
14:45:25 <efried> And obviously will be asking people like those present here to review things in that space.
14:45:41 <efried> so wanted to get pre-buy-in for which approach to take short-term (stein)
14:45:44 <efried> which I think I have now
14:45:44 <dansmith> I think fleshing out GPUs in nova, which we already have, makes sense
14:45:45 <efried> so
14:45:46 <efried> thanks.
14:45:56 <efried> well, distinguishing VGPU from GPU in this case dansmith
14:46:09 <efried> Those are going to be very different things.
14:46:09 <dansmith> oh,
14:46:26 <dansmith> you want a GPU-specific PCI passthrough replacement?
14:46:30 <efried> The full-GPU passthrough thing is going to actually subsume some of the functionality you can currently do with [pci]*
14:46:35 <efried> yes exactly
14:46:43 <efried> GPU first
14:46:55 * dansmith looks for his spoon
14:46:56 <efried> or possibly any "full card"
14:47:53 <efried> I see being able to use either mechanism (legacy [pci]passthrough_whitelist/alias or The New Thing) for multiple releases
14:48:03 <efried> until we have full parity and can start ripping out the legacy thing
14:48:09 <efried> if we try to do it all at once, fail
14:48:19 <mriedem> we should make a list of the ginormous tasks we think we're going to take on in stein - at the ptg of course
14:48:28 <mriedem> b/c i remember a lot of wailing about not having shared storage support yet
14:48:48 <mriedem> cross-cell cold migrate is going to be my albatross
14:48:56 <dansmith> or a plan for numa
14:49:12 <mriedem> or just being able to upgrade to stein with placement working :)
14:49:20 <dansmith> yeah
14:49:20 <melwitt> yeah, I want to get shared storage squared away. being that it looks like it's close too
14:49:27 <mriedem> or instance ownership transfers
14:49:28 <mriedem> etc
14:49:33 <mriedem> lots of big proposals on the plate right now
14:49:45 <mriedem> we're gonna need to weigh this stuff
14:50:09 <cdent> should be plenty of scales in colorado
14:50:20 <mriedem> b/c of fat coloradoans?
14:50:27 <cdent> weeeeeeeed
14:50:28 <dansmith> cows
14:50:30 <mriedem> oh right
14:50:31 <mriedem> heh
14:50:36 <mriedem> *rimshot*
14:50:55 <melwitt> ok, are we done? :)
14:50:57 <dansmith> cdent: post-legalization, it's not that big a deal to make sure the dime bag is no larger than it should be :)
14:51:49 <efried> I think I'm done
14:51:53 * cdent avoids going off into too much weed jargon
14:52:13 <melwitt> ok, let's call it. thanks everyone
14:52:17 <melwitt> #endmeeting