14:00:04 <melwitt> #startmeeting nova
14:00:06 <openstack> Meeting started Thu May  3 14:00:04 2018 UTC and is due to finish in 60 minutes.  The chair is melwitt. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:11 <openstack> The meeting name has been set to 'nova'
14:00:32 <takashin> o/
14:00:32 <melwitt> hi everyone
14:00:33 <mriedem> o/
14:00:33 <gibi> o/
14:00:36 <dansmith> o/
14:00:38 <tetsuro> o/
14:01:03 <efried> ō/
14:01:14 <melwitt> #topic Release News
14:01:20 <melwitt> #link Rocky release schedule: https://wiki.openstack.org/wiki/Nova/Rocky_Release_Schedule
14:01:34 <melwitt> we have the summit coming up in a few weeks
14:02:06 <melwitt> we have some nova-related topics approved for the forum
14:02:12 <tssurya_> o/
14:02:23 <melwitt> the forum schedule has been sent to the dev ML but I don't have a link handy
14:02:23 <edleafe> \o
14:02:52 <melwitt> but do check that out to see what we have coming up at the forum
14:02:54 <melwitt> #link Rocky review runways: https://etherpad.openstack.org/p/nova-runways-rocky
14:03:03 <melwitt> current runways:
14:03:11 <melwitt> #link runway #1: XenAPI: Support a new image handler for non-FS based SRs [END DATE: 2018-05-11] series starting at https://review.openstack.org/#/c/497201
14:03:13 <patchbot> patch 497201 - nova - XenAPI: deprecate the config for image handler cla...
14:03:26 <melwitt> #link runway #2: Add z/VM driver [END DATE: 2018-05-15] spec amendment needed at https://review.openstack.org/562154 and implementation starting at https://review.openstack.org/523387
14:03:27 <patchbot> patch 562154 - nova-specs - Add additional information for z/VM spec.
14:03:28 <patchbot> patch 523387 - nova - z/VM Driver: Initial change set of z/VM driver
14:03:34 <melwitt> #link runway #3: Local disk serial numbers [END DATE: 2018-05-16] series starting at https://review.openstack.org/526346
14:03:34 <patchbot> patch 526346 - nova - Give volume DriverBlockDevice classes a common prefix
14:04:21 <melwitt> please take some time to review the patches in runways as a priority ^
14:04:46 <melwitt> anything else for release news or runways?
14:04:49 * bauzas waves
14:05:42 <melwitt> #topic Bugs (stuck/critical)
14:05:55 <melwitt> no critical bugs
14:06:04 <melwitt> #link 31 new untriaged bugs (unchanged since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
14:06:27 <melwitt> the untriaged bug count hasn't gone up, so thanks to everyone who's been helping with triage
14:06:38 <melwitt> #link 3 untagged untriaged bugs: https://bugs.launchpad.net/nova/+bugs?field.tag=-*&field.status%3Alist=NEW
14:06:45 <melwitt> #link bug triage how-to: https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
14:07:02 <melwitt> please lend a helping hand with bug triage if you can. this is a good how-to guide ^
14:07:03 * johnthetubaguy nods at the XenAPI runway item
14:07:27 <melwitt> yep, that's a good one for you :)
14:07:37 <melwitt> Gate status:
14:07:42 <melwitt> #link check queue gate status http://status.openstack.org/elastic-recheck/index.html
14:08:05 <melwitt> gate's been ... good I think
14:08:14 <melwitt> 3rd party CI:
14:08:16 <mriedem> except ceph!
14:08:21 <bauzas> yup, I saw
14:08:26 <bauzas> some timeouts too
14:08:52 <melwitt> yes, except ceph. but there's a patch for that; a cinder API change caused a call that used to work to fail
14:09:13 <mriedem> oh that merged
14:09:16 <mriedem> the cinder one
14:09:27 <melwitt> it actually affects more than ceph: because of it, any delete of a boot-from-volume (BFV) instance would fail to detach the volume
14:09:31 <melwitt> oh, good
14:09:51 <edmondsw> powervm CI was recently updated to use a queens undercloud, and is now experiencing connection issues. Working to get that fixed
14:10:05 <melwitt> thanks for the heads up edmondsw
14:10:13 <melwitt> #link 3rd party CI status http://ci-watch.tintri.com/project?project=nova&time=7+days
14:10:32 <melwitt> virtuozzo CI has been broken for a while and now it's returning a 404 on its test result links
14:11:26 <melwitt> I sent a message to the dev ML asking if anyone from virtuozzo could reply, but nothing so far
14:11:55 <melwitt> anything else for bugs, gate status, or third party CI?
14:12:23 <esberglu> PowerVM CI has been hitting connection issues from jenkins to nodepool nodes
14:12:39 <esberglu> https://wiki.jenkins.io/display/JENKINS/Remoting+issue
14:12:44 <edmondsw> if anyone has ideas on that...
14:12:49 <edmondsw> we're all ears :)
14:12:58 <esberglu> Causing pretty much all runs to fail at the moment
14:13:04 <mriedem> ask in -infra
14:13:09 <edmondsw> yeah
14:14:10 <melwitt> #topic Reminders
14:14:19 <melwitt> #link Rocky Review Priorities https://etherpad.openstack.org/p/rocky-nova-priorities-tracking
14:14:34 <melwitt> the subteam and bug etherpad ^ is there
14:14:50 <melwitt> does anyone have any other reminders to highlight?
14:16:04 <melwitt> #topic Stable branch status
14:16:11 <melwitt> #link stable/queens: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens,n,z
14:16:26 <melwitt> there have been a lot of backports proposed
14:16:46 <melwitt> stable cores, please take a look at some reviews when you can
14:16:53 <melwitt> #link stable/pike: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/pike,n,z
14:17:05 <melwitt> same for pike, the list is growing
14:17:11 <melwitt> #link stable/ocata: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/ocata,n,z
14:17:23 <melwitt> and ocata doesn't have too many
14:17:38 <melwitt> anything else for stable branch status?
14:17:53 <mriedem> ocata should taper off
14:17:56 <melwitt> oh, also we released ocata 15.1.1 recently
14:18:31 <melwitt> #link ocata 15.1.1 released on 2018-05-02 https://review.openstack.org/564044
14:18:32 <patchbot> patch 564044 - releases - nova: release ocata 15.1.1 (MERGED)
14:18:52 <melwitt> #topic Subteam Highlights
14:18:56 <dansmith> no
14:19:01 <melwitt> lol
14:19:24 <melwitt> yeah, so for cells v2 we skipped the meeting again because we didn't need to have a meeting
14:19:36 <dansmith> because we're efficient like that
14:19:48 <tssurya_> :P
14:19:53 <mriedem> i'm following up on a bug with tssurya in -nova
14:20:01 <mriedem> sounds like cern is upgrading and putting out fires, recap to come
14:20:22 <mriedem> plus a talk in vancouver about their cells v2 upgrade
14:20:23 <mriedem> right?
14:20:27 <tssurya_> Yes
14:20:36 <melwitt> yep. CERN peeps are in the middle of an upgrade to multi-cell cells v2
14:20:42 <melwitt> exciting stuff
14:20:45 <mriedem> https://www.openstack.org/summit/vancouver-2018/summit-schedule/events/20667/moving-from-cellsv1-to-cellsv2-at-cern
14:20:49 <tssurya_> Totally :)
14:21:14 <melwitt> really looking forward to that talk
14:21:23 <dansmith> be careful what you wish for
14:21:30 <dansmith> might be a bunch of "the nova team sucks because" talk
14:21:32 <tssurya_> exactly!
14:21:37 <melwitt> heh
14:21:43 <dansmith> "nova cellsv2 ate the higgs boson data"
14:21:43 <melwitt> :(
14:21:43 <dansmith> etc
14:21:49 <tssurya_> Hehe ,
14:22:15 <melwitt> okay, anything else for cells?
14:22:45 * bauzas needs to disappear for babytaxiing
14:22:49 <melwitt> edleafe: scheduler?
14:22:54 <edleafe> Clarified the status of Nested Resource Providers for the Cyborg team. They were under the impression that NRP was functional since jaypipes's patch for handling resource requests involving nested providers had merged.
14:22:58 <edleafe> #link Resource Requests for Nested Providers https://review.openstack.org/#/c/554529/
14:22:59 <patchbot> patch 554529 - nova - placement: resource requests for nested providers (MERGED)
14:23:01 <edleafe> Discussed the two approaches I had proposed for the consumer generation issue. No decision was made, which eventually didn't matter, since jaypipes ended up redoing the entire series on his own.
14:23:05 <edleafe> Discussed the bug reported by the CERN folks that the association_refresh should be configurable. The current hard-coded interval of 300 seconds was a huge drag on their deployment. When they lengthened the interval, they saw a "big time improvement".
14:23:09 <edleafe> We discussed whether it was OK to backport a config option to Queens. Decided that it would be ok as long as the default for the option didn't change behavior.
14:23:12 <edleafe> That's it.
14:23:32 <mriedem> that change is merged on master,
14:23:34 <mriedem> needs a backport
14:23:50 <mriedem> oh nvm https://review.openstack.org/#/c/565526/
14:23:50 <patchbot> patch 565526 - nova - Make association_refresh configurable
14:24:20 <tssurya_> mriedem: will put up the backport soon
14:24:41 <melwitt> okay, so needs another +2
14:24:48 <mriedem> yeah i'll review it after the meeting
14:24:54 <melwitt> coolness
14:24:56 <tssurya_> Thanks!
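For context, the option discussed above looks roughly like the following in nova.conf, assuming it landed as [compute]/resource_provider_association_refresh per the linked change (the exact name and default should be checked against the merged patch):

    # nova.conf on a compute node (illustrative value)
    [compute]
    # How often, in seconds, the resource tracker re-fetches aggregate and
    # trait associations from placement. The previously hard-coded value
    # was 300; CERN reported a big improvement after lengthening it.
    resource_provider_association_refresh = 900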
14:25:09 <melwitt> edleafe: cool. I was thinking about the cyborg stuff yesterday and was wondering where things are. is it that once NRP is functional, they can start the implementation on their side?
14:25:48 <edleafe> melwitt: they are starting the implementation already. We just had to explain that we aren't able to return nested RPs yet
14:26:03 <edleafe> and that we probably won't be able to in Rocky
14:26:13 <mriedem> return from what? GET /allocation_candidates?
14:26:20 <edleafe> yes
14:26:26 <mriedem> isn't that tetsuro's spec?
14:26:31 <edleafe> yes again
14:26:40 <mriedem> so why won't we be able to in rocky?
14:26:43 <mriedem> just too much other stuff?
14:27:09 <edleafe> that was the status I got - it *might* make it in, but that it isn't a sure thing
14:27:22 <jaypipes> tetsuro's patches are coming along nicely...
14:27:24 <melwitt> okay, so is returning nested RPs required for them to complete the implementation on their side? I'm trying to get an idea of when we can expect the FPGA stuff to be live
14:27:51 <mriedem> they need ^ to schedule in nova properly
14:27:56 <edleafe> melwitt: they can code to the specs now
14:28:11 <efried> Yeah, the series is up and ready for review
14:28:16 <efried> I've got +2s on the first three or four patches.
14:28:18 <edleafe> it just won't work until the rest is merged
14:28:20 <melwitt> okay, so nothing is blocking them and once they're done, then the last piece will be the nova integration part
14:28:26 <efried> The final patch (with the microversion) is proposed now.
14:28:31 <jaypipes> melwitt: cyborg still needs to implement their update_provider_tree() work. we still need to implement the scheduler side of things (which is the GET /allocation_candidates stuff tetsuro is working on and the granular request stuff efried is working on, both of which have active reviews for patches)
14:29:01 <edleafe> jaypipes: yes, the point was that they can start working on it. We aren't blocking them
14:29:06 <efried> I wasn't sure about it on Monday, because that one wasn't there, but now that it's up, I'm confident we can make Rocky with this stuff.
14:29:14 <jaypipes> edleafe: correct, no disagreement from me on that
14:29:19 <edleafe> kewl
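As a rough illustration of the update_provider_tree() work jaypipes mentions, a driver-side sketch might look like the following; the FPGA resource class, trait, and provider names are hypothetical, and this is not the actual Cyborg integration:

    # Illustrative only: modeled on nova's ComputeDriver.update_provider_tree()
    # interface. CUSTOM_FPGA, CUSTOM_FPGA_INTEL_ARRIA10 and the child provider
    # name are made up for this sketch.
    from nova.virt import driver

    class FPGAReportingDriver(driver.ComputeDriver):
        def update_provider_tree(self, provider_tree, nodename, allocations=None):
            """Report an FPGA as a nested child provider of the compute node."""
            fpga_rp = '%s_fpga_0' % nodename
            if not provider_tree.exists(fpga_rp):
                # Create the nested provider under the compute node root.
                provider_tree.new_child(fpga_rp, nodename)
            provider_tree.update_inventory(fpga_rp, {
                'CUSTOM_FPGA': {
                    'total': 1,
                    'reserved': 0,
                    'min_unit': 1,
                    'max_unit': 1,
                    'step_size': 1,
                    'allocation_ratio': 1.0,
                },
            })
            provider_tree.update_traits(fpga_rp, ['CUSTOM_FPGA_INTEL_ARRIA10'])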
14:29:21 <mriedem> so cyborg puts fpga traits or providers in the tree, and then you can request that via flavor extra specs using granular request syntax right?
14:29:43 <efried> yup
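A sketch of what that flavor-side request might look like once the granular syntax lands; the flavor name, FPGA resource class, and trait name are made up, and the granular placement query shown in the comment was still under review at the time:

    # Hypothetical FPGA flavor using the granular request syntax
    openstack flavor set fpga.small \
      --property resources1:CUSTOM_FPGA=1 \
      --property trait1:CUSTOM_FPGA_INTEL_ARRIA10=required
    # At schedule time this would map to a granular placement query along
    # the lines of:
    #   GET /allocation_candidates?resources=VCPU:1,MEMORY_MB:2048,DISK_GB:20
    #       &resources1=CUSTOM_FPGA:1&required1=CUSTOM_FPGA_INTEL_ARRIA10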
14:30:10 <melwitt> all sounds good, thanks y'all
14:30:36 <melwitt> last on subteams, notifications, there are no notes from gibi in the agenda
14:30:47 <gibi> No notification meeting this week due to public holiday
14:30:56 <gibi> There will be meeting next week and I will try to send status mail as well.
14:31:05 <melwitt> ah, right. I thought I knew that and then forgot
14:31:10 <melwitt> cool gibi
14:31:26 <melwitt> anything else for subteams?
14:31:50 <melwitt> #topic Stuck Reviews
14:32:02 <melwitt> there's one thing in the agenda, the rebuild thing again
14:32:21 <melwitt> http://lists.openstack.org/pipermail/openstack-dev/2018-April/129726.html
14:32:42 <efried> I thought I saw chatter last night, we're going with the "check allocations to find our RPs and compare those traits" approach, yes?
14:32:51 <melwitt> there's been more replies to the ML thread
14:33:22 <melwitt> yeah, I mean, I don't think there's any other "right" way to handle it. that's just MHO
14:34:10 <melwitt> the other option was to just punt in the API and only allow old image-traits == new image-traits on a rebuild, or rather, reject any rebuild request whose image-traits did not exist in the original image
14:34:52 <melwitt> and arvind isn't here
14:35:09 <dansmith> is this really stuck?
14:35:12 <dansmith> by the definition?
14:35:16 <mriedem> yeah i think it is
14:35:29 <dansmith> it's been a while since I looked,
14:35:35 <dansmith> but what is the blocking disagreement?
14:35:48 <mriedem> how to handle rebuild with a new image that has required traits
14:35:54 <melwitt> whether or not to ask placement to verify the image-traits on a rebuild request
14:35:56 <mriedem> it's in that ML thread
14:36:05 <dansmith> I've been ignoring it
14:36:27 <dansmith> the question is whether or not we block rebuild to same host if the traits have changed?
14:36:28 <melwitt> my opinion is, we have to ask placement if we want to check whether the image-traits are okay
14:37:05 <johnthetubaguy> should it just match live-migrate kind of things, where the host is specified?
14:37:13 <melwitt> and the debate is whether we should check them with placement or just do a flavor compare, old vs new and only allow same traits. and not ask placement anything
14:37:23 <johnthetubaguy> ah
14:37:39 <dansmith> just comparing flavors is way too naive I think
14:37:40 <mriedem> s/flavor/image/
14:37:48 <melwitt> image, sorry
14:37:49 <dansmith> that'll work in a contrived functional test,
14:37:56 <dansmith> but not in real life I think
14:38:19 <dansmith> melwitt: yeah I know, same deal.. "without asking placement" I mean
14:38:38 <mriedem> right the current proposal in the spec from arvind is the api checks if the new image for rebuild has traits which are a subset of the original image traits used to create the instance
14:38:45 <mriedem> which might or might not work with the current state of the host,
14:38:53 <mriedem> assuming no external service has changed the traits on the compute node provider
14:38:58 <dansmith> I can't imagine that working the way you expect in practice
14:39:31 <mriedem> i think this is basically the same thing as running the ImagePropertiesFilter on rebuild for a new image
14:39:33 <melwitt> and it artificially limits a rebuild: if the compute host has traits A and B, you originally booted with trait A, and you want to rebuild with trait B, it says NO
14:39:35 <mriedem> i don't really see how this is different
14:39:50 <melwitt> yup, agreed, I see it the same way
14:39:56 <dansmith> I shall commentificate upon the threadage and reviewage
14:40:03 <melwitt> thanks dansmith
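For reference, a minimal sketch of the "check allocations to find our RPs and compare those traits" approach, written against the placement REST API; the helper and its simplification (pooling traits across all of the instance's providers) are illustrative, not the actual proposed implementation:

    # A rough sketch of the approach discussed above. placement_get is an
    # assumed helper that issues authenticated GETs against the placement
    # API and returns the decoded JSON body.
    def rebuild_image_traits_ok(placement_get, instance_uuid, required_traits):
        """Return True if every trait required by the new rebuild image is
        already present on the resource providers the instance has
        allocations against (i.e. the same-host rebuild can proceed)."""
        # Find the providers the instance is currently consuming from.
        allocs = placement_get(
            '/allocations/%s' % instance_uuid)['allocations']
        provider_traits = set()
        for rp_uuid in allocs:
            # Collect the traits on each of those providers.
            body = placement_get('/resource_providers/%s/traits' % rp_uuid)
            provider_traits.update(body['traits'])
        # Simplification: require the new image's traits to be a subset of
        # the combined trait set, rather than matching per provider.
        return set(required_traits) <= provider_traits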
14:40:34 <melwitt> okay, any other comments on that before we move to open discussion?
14:41:00 <melwitt> #topic Open discussion
14:41:05 <efried> Do we have a bug open for the func test failure in test_parallel_evacuate_with_server_group (AssertionError: 'host3' == 'host3')?  Been seeing this pretty frequently, seems like a race.
14:41:15 <melwitt> we do
14:41:29 <melwitt> cfriesen had some ideas about it. lemme see if I can find it
14:41:44 <efried> Not that I plan to debug the thing.  Just want to be able to start saying "recheck bug #xxxx" instead of "recheck" :)
14:41:49 <melwitt> #link https://bugs.launchpad.net/nova/+bug/1763181
14:41:50 <openstack> Launchpad bug 1763181 in OpenStack Compute (nova) "test_parallel_evacuate_with_server_group intermittently fails" [Medium,Confirmed]
14:41:55 <efried> Thanks melwitt
14:42:11 <mriedem> yes it's been open for a couple of weeks
14:42:22 <mriedem> http://status.openstack.org/elastic-recheck/#1763181
14:42:24 <melwitt> yeah, so from what cfriesen mentioned on there, it sounds like it's a race in the actual code, not just the test
14:42:30 <melwitt> which sucks
14:42:39 <melwitt> because that'll be harder to fix
14:43:16 <melwitt> so if anyone wants to help with the solution to that, it would be really cool ;)
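Since the bug is already tracked by elastic-recheck, "recheck bug 1763181" style rechecks work by matching a registered query file; the query text below is only an illustration of the general shape, not the actual query registered for this bug:

    # queries/1763181.yaml in elastic-recheck (illustrative shape only)
    query: >-
      message:"test_parallel_evacuate_with_server_group" AND
      message:"FAILED" AND
      tags:"console"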
14:43:28 <melwitt> anything else for open discussion?
14:44:20 <melwitt> okay, we'll wrap up. thanks everybody
14:44:23 <melwitt> #endmeeting