14:00:10 <efried> #startmeeting nova
14:00:11 <openstack> Meeting started Thu Aug 22 14:00:10 2019 UTC and is due to finish in 60 minutes.  The chair is efried. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:15 <openstack> The meeting name has been set to 'nova'
14:00:23 <mriedem> o/
14:00:24 <takashin> o/
14:00:25 <gibi> o/
14:00:29 <yonglihe> o/
14:00:49 <dansmith> o/
14:00:51 <stephenfin> o/
14:01:33 <alex_xu> \o
14:02:18 <aspiers> o/
14:02:26 <efried> #link agenda https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
14:03:27 <efried> #topic Last meeting
14:03:27 <efried> #link Minutes from last (*2) meeting: http://eavesdrop.openstack.org/meetings/nova/2019/nova.2019-08-08-14.01.html
14:03:57 <efried> Last week it was just me & takashin, and we didn't have anything to discuss, so we didn't bother going through the agenda. Stats below will be for 2 weeks' worth of bugs.
14:04:07 <efried> anything to bring up from 2 weeks ago?
14:05:15 <mriedem> i was younger and happier
14:05:28 <mriedem> full of vim and vigor
14:05:28 <efried> there there
14:05:51 <efried> #topic Release News
14:05:59 <efried> three weaks to feature freeze
14:06:02 <efried> heh
14:06:09 <efried> sean-k-mooney moment there
14:06:37 <efried> #link Train release schedule https://wiki.openstack.org/wiki/Nova/Train_Release_Schedule
14:07:00 <efried> As if you didn't know, #action everyone do lots of reviews
14:07:05 <aspiers> What are the chances of SEV landing before the freeze? It's currently 2nd in the runway queue
14:07:29 <efried> so I had been tempted to compose a list of blueprints I think are "close" and could just stand a quick final look to be pushed through
14:07:40 <efried> but thought that would potentially subvert the runways process
14:07:47 <efried> how do others feel?
14:08:39 <gibi> I assume that if something is in the runway queue it is ready, so you actually have a list already
14:09:01 <gibi> efried: or you want to filter that list down a bit?
14:09:03 <efried> right, but there are things that are "really close" that are waaay down in the queue
14:09:08 <mriedem> there are at least 3 api changes that are all conflicting for the next microversion,
14:09:22 <mriedem> so somehow serializing those would be nice, and i think they all have the same owner (brinzhang)
14:09:35 <alex_xu> yonglihe has one very close
14:09:37 <mriedem> i think the unshelve + az one is going to be next given it's in a runway right now and it's had active review
14:09:42 <mriedem> oh right that's 4
14:10:02 <yonglihe> @alex_xu, thanks, i'm just thinking about how to attract some review force.
14:10:23 <mriedem> efried: really close aka forbidden aggregates?
14:10:43 <efried> yeah, that's one that was at the top of my brain
14:11:04 <mriedem> some stuff has been reviewed outside of the runway slot, like pmu, but i had looked for that one as well since i knew the spec was pretty simple
14:11:19 <efried> yeah, not being in a runway slot doesn't mean it can't be reviewed of course
14:11:26 <efried> I think we have several blueprints that are counting on that ^ :P
14:11:27 <mriedem> i think for the really close stuff,
14:11:30 <mriedem> people just need prodding
14:11:47 <efried> "ad hoc prod" vs "efried writes up something more formal"
14:11:50 <mriedem> so if for forbidden aggregates that means poking dansmith, i guess poke on
14:12:26 <mriedem> efried: writing something up / etherpad doesn't mean people will look w/o being poked
14:12:29 <mriedem> from experience,
14:12:41 <mriedem> it's good for project management either way so if it clears your head go nuts
14:12:44 <efried> true story.
14:12:50 <mriedem> i did it in the past for my own mental health
14:12:50 <efried> public shaming can be effective, though
14:13:46 <yonglihe> 😂
14:14:01 <efried> okay, I'll see whether "digital pillory" floats to the top of my to-do list this week.
14:14:18 <efried> #topic Bugs (stuck/critical)
14:14:18 <efried> No Critical bugs
14:14:18 <efried> #link 69 new untriaged bugs (+2 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
14:14:18 <efried> #link 3 untagged untriaged bugs (+2 since the last meeting): https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
14:14:49 <mriedem> i haven't really done anything with this one https://bugs.launchpad.net/nova/+bug/1839800
14:14:50 <openstack> Launchpad bug 1839800 in OpenStack Compute (nova) "instance need to trigger onupdate when nwinfo update" [Undecided,New]
14:14:52 <mriedem> since it's kind of opinion,
14:14:56 <mriedem> but i can see the logic
14:14:59 <mriedem> in case others care to weigh in
14:16:10 <gibi> feels similar to this https://bugs.launchpad.net/nova/+bug/1704928
14:16:11 <openstack> Launchpad bug 1704928 in OpenStack Compute (nova) "updated_at field is set on the instance only after it is scheduled" [Medium,In progress] - Assigned to Balazs Gibizer (balazs-gibizer)
14:16:39 <mriedem> gibi: that's a deep cut
14:17:04 <gibi> I know, this is why it is not progressing
14:18:16 <sean-k-mooney> i'm not sure if the updated_at field should be updated for this
14:18:23 <mriedem> you never replied to my nack :)
14:18:27 <mriedem> anyway,
14:18:33 <mriedem> let's talk on reviews and in bugs or -nova later
14:18:38 <gibi> mriedem: I had no idea what to do with it :)
14:18:49 <mriedem> gibi: but you have the cape man!
14:19:13 * gibi needs to think about who to pass that cape to
14:20:00 <efried> Totally.
14:20:00 <efried> moving on...
14:20:00 <efried> #topic Gate status
14:20:00 <efried> #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
14:20:00 <efried> #link 3rd party CI status (seems to be back in action) http://ciwatch.mmedvede.net/project?project=nova
14:20:17 <efried> seeing a lot of that innodb thing lately :(
14:20:39 <efried> also grenade things have seemed quite brittle for the past week or two.
14:20:40 <mriedem> yeah, that's on a single provider so i'm not sure what's up with that
14:20:58 <mriedem> grenade as in ssh fails?
14:21:00 <mriedem> on the old side?
14:21:05 <mriedem> if so, there is a devstack patch to stein for that
14:21:16 <efried> I'm not sure, I haven't dug into a lot of them
14:21:25 <mriedem> https://review.opendev.org/#/c/676760/
14:21:31 <sean-k-mooney> mriedem: the patch to enable memcache?
14:21:49 <mriedem> yes
14:21:51 <efried> After you mentioned a bug number a few days ago, I started trying to find that same issue in the subsequent failures, but either it was something different or I was looking in the wrong place.
14:21:54 <efried> so I basically gave up.
14:22:02 <efried> and just started blind rechecking.
14:22:30 <efried> mriedem: does that need a devstack core or a stable core?
14:22:40 <mriedem> devstack core
14:22:43 <efried> or is devstack-stable its own thing?
14:22:45 <mriedem> so gmann
14:22:47 <mriedem> no
14:22:51 <efried> okay
14:22:53 <mriedem> devstack is special, there is no stable team
14:23:04 <efried> I knew there was something unusual there
14:24:37 <efried> Could frickler approve it?
14:24:48 <mriedem> oh i'm sure, or ianw
14:24:52 <mriedem> i pinged gmann in -qa
14:24:57 <mriedem> or clarkb
14:25:00 <mriedem> or sdague!
14:25:13 <efried> I was about to say, let's resurrect that guy ^
14:26:01 <efried> As for the innodb thing, that's on limestone-regionone? Who owns that?
14:26:01 <mriedem> moving on?
14:26:12 <mriedem> you'd have to ask in -infra
14:26:14 <mriedem> i forget the name
14:26:48 <efried> okay, pinging in -infra
14:26:49 <efried> moving on.
14:27:00 <efried> #topic Reminders
14:27:01 <efried> any?
14:27:26 <dansmith> wear your seatbelts, kids
14:27:42 <mriedem> my car dings if i don't so i do
14:27:55 <efried> #action kids to wear seatbelts
14:27:55 <efried> #topic Stable branch status
14:27:55 <efried> #link stable/stein: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/stein
14:27:55 <efried> #link stable/rocky: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/rocky
14:27:55 <efried> #link stable/queens: https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/queens
14:27:56 <mriedem> big gubment safety standards
14:28:19 <efried> mriedem: stable nooz?
14:28:21 <mriedem> stable reviews are piling up again,
14:28:24 * alex_xu can keep his body floating on the seat
14:28:43 <mriedem> we have a regression on stable that needs reviews, sec
14:29:05 <mriedem> https://review.opendev.org/#/q/topic:bug/1839560+(status:open+OR+status:merged)
14:29:11 <mriedem> lyarwood must be on vacation?
14:29:20 <sean-k-mooney> he is back
14:29:26 <mriedem> maybe dansmith can hit those stein ones
14:29:41 <dansmith> my dance card is getting pretty full today
14:29:49 <dansmith> but remind me
14:30:20 <sean-k-mooney> mriedem: did TheJulia comment on whether the backport fixed their issue
14:30:23 <mriedem> https://review.opendev.org/#/q/topic:bug/1839560+branch:stable/stein
14:30:27 <mriedem> sean-k-mooney: haven't heard
14:30:42 <TheJulia> mriedem: zuul won't let us trigger the job, just says merge error
14:30:49 <TheJulia> so... *shrug*
14:31:01 <mriedem> TheJulia: link me the ironic patch in -nova
14:31:48 <efried> anything else stable?
14:31:54 <mriedem> no
14:32:00 <efried> #topic Sub/related team Highlights
14:32:00 <efried> Placement (cdent)
14:32:00 <efried> #link latest pupdate http://lists.openstack.org/pipermail/openstack-discuss/2019-August/008537.html
14:32:06 <efried> cdent is on vacation this week
14:32:16 <efried> summary:
14:32:16 <efried> Consumer Types: "nice to have" but not critical for Train.
14:32:16 <efried> same_subtree discoveries - docs needed
14:32:16 <efried> osc-placement needs attention
14:32:38 <mriedem> tetsuro and i have been reviewing mel's osc-placement series
14:32:42 <mriedem> for the aggregate inventory thing
14:32:43 <mriedem> should land soon
14:32:46 <efried> ++
14:33:16 <efried> API (gmann)
14:33:16 <efried> This week updates: http://lists.openstack.org/pipermail/openstack-discuss/2019-August/008669.html
14:34:18 <efried> comments, questions, concerns?
14:34:47 <efried> #topic Stuck Reviews
14:34:47 <efried> any?
14:35:33 <efried> #topic Review status page
14:35:33 <efried> #link http://status.openstack.org/reviews/#nova
14:35:33 <efried> Count: 459 (+2); Top score: 1415 (+42)
14:35:33 <efried> #help Pick a patch near the top, shepherd it to closure
14:35:44 <efried> #topic Open discussion
14:35:55 <efried> one item on the agenda
14:35:57 <efried> #link generic resource management for VPMEM, VGPU, and beyond http://lists.openstack.org/pipermail/openstack-discuss/2019-August/008625.html
14:36:06 <efried> alex_xu: care to take the mic?
14:36:23 <mriedem> i'll say i know i haven't replied yet, still need to digest that
14:36:37 <dansmith> and I'm recusing myself
14:36:53 <alex_xu> yea, I summarized the xml way's and the db way's pros/cons for the resource claim
14:37:10 <sean-k-mooney> most of my comments in the mail are about future use, but i like the direction
14:37:53 <alex_xu> here is the new plan https://etherpad.openstack.org/p/vpmems-non-virt-driver-specific-new
14:38:06 <efried> TL;DR:
14:38:06 <efried> - generic `resources` object stored on the Instance and (old/new) on MigrationContext
14:38:06 <efried> - virt driver update_provider_tree responsible for populating it if/as necessary
14:38:06 <efried> - RT uses it (without introspecting the hyp-specific bits) to claim individual resources on the platform.
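A minimal sketch of the data shape summarized above, using plain Python dataclasses rather than nova's actual versioned objects; every name here (Resource, InstanceResources, claim, the field names) is a hypothetical illustration of the idea, not code from the proposal:

    # Hypothetical illustration of the generic resource-claiming idea; not
    # the actual nova objects or resource tracker code from the proposal.
    from dataclasses import dataclass, field
    from typing import Optional


    @dataclass
    class Resource:
        provider_uuid: str      # placement resource provider it belongs to
        resource_class: str     # e.g. a custom class for vpmem namespaces
        identifier: str         # stable id of the concrete device on the host
        # Opaque, hypervisor-specific metadata: the resource tracker stores it
        # and hands it back to the virt driver, but never introspects it.
        metadata: Optional[dict] = None


    @dataclass
    class InstanceResources:
        # Stored on the Instance, with old/new copies on the MigrationContext
        # so move operations can track claims on both hosts.
        resources: list = field(default_factory=list)


    def claim(free_on_host: list, resource_class: str) -> Resource:
        """Toy version of the RT-side claim: pick any free resource of the
        requested class without looking at its opaque metadata."""
        for res in free_on_host:
            if res.resource_class == resource_class:
                free_on_host.remove(res)
                return res
        raise ValueError('no free %s left on this host' % resource_class)

The split this tries to show matches the TL;DR: the virt driver reports what exists (including the opaque metadata), and the resource tracker picks which concrete unit an instance gets without looking inside that metadata.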
14:38:15 <alex_xu> efried: thanks
14:38:43 <mriedem> sounds like ERT
14:38:50 <mriedem> amiright folks
14:38:53 <dansmith> yup
14:39:05 <efried> I don't know what that is
14:39:11 <dansmith> the previous dumpster fire
14:39:22 <alex_xu> also like the pci manager, but without host-side persistence
14:39:22 <mriedem> json blob for out of tree hp public cloud things pre-placement
14:39:27 <mriedem> that killed about 3 years of effort
14:39:38 <dansmith> and squashed many souls
14:39:49 <mriedem> phil day hasn't been seen since
14:40:01 <sean-k-mooney> ERT (extensible resource tracker)
14:40:23 <alex_xu> oh, at least, we aren't going to make any extension
14:40:26 <sean-k-mooney> https://specs.openstack.org/openstack/nova-specs/specs/juno/implemented/extensible-resource-tracking.html
14:40:59 <sean-k-mooney> alex_xu: right your proposal does not leave the RT/virt driver
14:41:41 <mriedem> i thought we've wanted to move away from the pci manager in favor of placement, but in the case of vgpu and vpmem we're just using placement inventory to tell us how many of a device we have on a host, not which devices are allocated, which is what the pci manager does for pci devices (inventory and allocation tracking)
14:42:05 <efried> right, this would (eventually) allow us to do away with the PCI manager
14:42:24 <efried> effectively, move the "select which device" logic into the virt driver and get rid of the rest
14:42:43 <efried> scratch that, "select which device" still in RT.
14:42:50 <efried> so yeah, get rid of PCI manager.
14:43:06 <sean-k-mooney> yes. but there are other gaps to fill first before we can
14:43:26 <mriedem> i know i mentioned to alex_xu that i thought it could be possible to get the vpmem allocation info from the hypervisor and only use persistence with the migration context during a move operation (resize) to ease that issue with same host resize, but i still need to read through the ML thread
14:43:29 <sean-k-mooney> e.g. modeling pci devices in placement, and passing allocation candidates to the weighers
14:43:40 <sean-k-mooney> this is a step on that path
14:44:24 <efried> sean-k-mooney: yes. It provides a clean divide at the virt driver boundary, which IMO is the biggest architectural hurdle we were going to need to overcome.
14:44:29 <mriedem> anyway, not going to solve this here
14:44:45 <alex_xu> yea, I think the goal is totally different
14:44:45 <alex_xu> ERT had plugins for the RT and for the host manager, that was an extension of the whole nova-scheduler
14:44:45 <alex_xu> that isn't what we want
14:44:46 <alex_xu> we want something to manage the resource assignment on the compute node, since placement only knows how many resources we have, but doesn't know which specific resource can be assigned
14:45:00 <alex_xu> oops, my network slow, just bump a lot of messages...
14:45:03 <efried> Fair enough. Sounds like dansmith is abstaining, so I think basically we're asking for mriedem's buy-in on this direction.
14:45:30 <mriedem> alex_xu: to be clear, my ERT comment was mostly about storing generic json blobs for resource tracking purposes
14:45:38 <mriedem> not that anyone is proposing plugins
14:46:13 <mriedem> what if i defer to dansmith? then we hit an infinite loop?
14:46:16 <alex_xu> mriedem: that json blob is a versioned object dump, so it's under version control, and only read/written by the virt driver.
14:46:28 <dansmith> heh
14:46:43 <alex_xu> :)
14:46:47 <mriedem> alex_xu: for now, until someone wants to add a weigher
14:46:50 <mriedem> but i digress
14:47:08 <sean-k-mooney> we can continue on ml i think
14:47:28 <efried> alex_xu: Did you say code was forthcoming soon?
14:47:59 <sean-k-mooney> but i think a placement-aware (allocation candidate aware) weigher would mitigate the need to look at the json blob
14:48:09 <alex_xu> yes, luyao has already verified the code, and she is working on unit tests and refining it. I think we can bring the code up in two or three days
14:48:32 <efried> mriedem: would that help ^ or is it really just a matter of carving out time to read the ML?
14:48:50 <alex_xu> she already has most of the code, since our initial proposal was about the db
14:48:53 <mriedem> reading wip code isn't going to help me, i'll procrastinate more on that
14:49:00 <efried> ack
14:49:04 <mriedem> i just need to read the ML
14:49:11 <efried> Okay, /me adds to pester list
14:49:13 <mriedem> i also know that CI will never happen for any of this
14:49:33 <mriedem> so i'm on the fence about just saying, "do whatever you want, it won't be tested anyway"
14:49:50 <efried> You mean CI jobs with real vpmems behind them?
14:50:00 <mriedem> i.e. if anyone ever uses it and it doesn't work, i guess we hope they report it and someone fixes it
14:50:01 <mriedem> efried: yes
14:50:10 <alex_xu> we promised to have ci for vpmem, and rui is working on it
14:50:11 <mriedem> even a periodic job
14:50:24 <alex_xu> and good progress
14:50:53 <efried> that's good to hear
14:50:57 <efried> okay, let's move on.
14:51:00 <efried> any other open topics?
14:51:15 <mriedem> there is an open question about a regression in stein for ironic rebalance,
14:51:17 <mriedem> and lost allocations,
14:51:32 <mriedem> so i need to find someone to test and verify that, but it's probably better to just ask on the ML
14:51:38 <mriedem> dtantsur might be able to do it
14:52:08 <mriedem> so action for me to ask about testing that on the ML
14:52:27 <mriedem> no idea if ironic has a CI job that does a rebalance with an allocated node,
14:52:30 <mriedem> if so we could just check there
14:54:45 <efried> Okay.
14:54:45 <efried> Thanks all.
14:54:45 <efried> o/
14:54:45 <efried> #endmeeting