15:00:08 #startmeeting neutron_dvr
15:00:08 Meeting started Wed Dec 9 15:00:08 2015 UTC and is due to finish in 60 minutes. The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:10 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:13 The meeting name has been set to 'neutron_dvr'
15:00:13 hi swami
15:00:21 #chair Swami
15:00:24 Current chairs: Swami haleyb
15:00:56 o/
15:01:04 obondarev: hi
15:01:07 Swami: i had a complete system failure earlier this morning, so if I disappear it's all yours...
15:01:18 haleyb: no problem
15:01:20 * regXboi wanders in late
15:01:21 hi
15:01:45 I have to leave early today, so maybe we can wind up the meeting early
15:01:54 I'm in favor of that
15:02:02 sure
15:02:19 #topic Announcements
15:02:28 haleyb: I have edited the meeting wiki with bugs and then sorted the bugs based on category
15:03:03 ok, we might have stepped on each other as oleg and myself also edited today, i'll look
15:03:27 yeah
15:03:29 haleyb: yes my changes overlapped yours and then I fixed it.
15:03:36 I didn't have any particular announcements, just that reviews have continued to merge, which is goodness
15:03:50 #topic Bugs
15:03:53 haleyb: is the gate, zuul happy?
15:03:54 * carl_baldwin wanders in
15:04:12 .
15:04:14 Ok we have at least 4 new bugs for this week that were filed.
15:04:29 Swami: get is getting better
15:04:34 s/get/gate
15:04:36 #link https://bugs.launchpad.net/neutron/+bug/1524291
15:04:36 Launchpad bug 1524291 in neutron "check_ports_on_host_and_subnet() duplicates check_ports_exist_on_l3agent()" [Low,In progress] - Assigned to Oleg Bondarev (obondarev)
15:04:55 preparing fix for it
15:04:58 This is a low category and probably a cleanup and I don't think we need more discussion on this.
15:05:06 obondarev: thanks
15:05:14 just noticed while working on dvr scheduling refactoring
15:05:19 next one
15:05:23 #link https://bugs.launchpad.net/neutron/+bug/1524020
15:05:23 Launchpad bug 1524020 in neutron "DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur" [Medium,Confirmed] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:05:52 This bug was filed by me, again this is related to reducing the arp calls that are initiated from the server for all updates.
15:05:58 I have a patch up for review.
15:06:01 I've left a couple og suggestions for that on in review
15:06:04 please feel free to review it.
15:06:12 of*
15:06:19 obondarev: I will take a look at it. Thanks for your ongoing review comments.
15:06:34 just for the benefit of the audience here are the patch details.
15:06:37 Swami: thanks for being patient
15:06:45 i didn't see a review link in the bug
15:06:56 #link https://review.openstack.org/#/c/253685/
15:07:26 haleyb: sometimes I have seen if you push the patch first and then assign the bug id, it is not getting populated in the launchpad.
15:07:45 The next one is
15:07:49 #link https://bugs.launchpad.net/neutron/+bug/1522824
15:07:49 Launchpad bug 1522824 in neutron "DVR multinode job: test_shelve_instance failure due to SSHTimeout" [High,In progress] - Assigned to Oleg Bondarev (obondarev)
15:08:11 this one is one of the culprits for multinode job failures
15:08:11 There is a patch that oleg pushed in for this issue.
15:08:27 obondarev: yes I have seen this failure more
15:08:43 there is one with 'resize' test as well
15:08:51 obondarev: also the other failure i have noticed is the "volume-boot-pattern". Just to consider after.
15:08:54 let's see if it's the same root cause
15:09:02 obondarev: thanks
15:09:07 we need to ping an ML2 core on that review
15:09:22 #link https://review.openstack.org/#/c/253569/
15:09:50 just added kevinbenton there
15:09:51 obondarev: you have mentioned in your comment, that we don't need any event update from nova and neutron should handle itself.
15:10:42 Swami: correct
15:10:52 obondarev: going forward if we need any nova handshake is that possible with neutron or neutron should just work on the basis of the port status. The reason I ask this question is for the live migration.
15:11:25 Swami: live migration is a bit harder
15:11:39 Swami: for this bug we don't need anything on nova side
15:11:39 obondarev: ok we can discuss about it later.
15:12:04 That's all for the new bugs this week.
15:12:13 But i wanted to discuss about a new feature patch.
15:12:16 #link https://review.openstack.org/#/c/143169/
15:12:25 This is the DVR SNAT HA patch.
15:13:03 This has been pending for a while and fitoduarte has addressed all merge conflicts on this patch.
15:13:19 Can we have the cores' attention on this patch?
15:13:54 Also a couple of days back amuller mentioned about this patch and other dependent patches for the DVR, L3 HA to be working smoother.
15:13:57 IIRC, this had +2s on an earlier revision
15:14:04 the issue was amuller wanted to review
15:14:06 regXboi: yes
15:14:08 I will review again, but assuming we need amuller has final say
15:14:13 I'll review it, it seems to be affecting scheduling refactoring as well
15:14:38 maintaining that patch is too hard right now and we should probably close the loop on it.
15:14:40 * carl_baldwin will look again
15:14:52 carl_baldwin: obondarev: haleyb: thanks
15:15:56 I have a patch for addressing the allowed address pair with FIP on DVR.
15:16:00 #link https://review.openstack.org/#/c/254439/
15:16:14 This is currently WIP and I would like to get some early feedback on this.
15:16:39 here are the bug details on this.
15:16:43 #link https://bugs.launchpad.net/neutron/+bug/1445255
15:16:43 Launchpad bug 1445255 in neutron "DVR FloatingIP to unbound port does not work" [Low,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:17:39 am I still visible?
15:17:52 yes on the visibility
15:17:55 * regXboi reading
15:18:51 next one in the list
15:18:55 #link https://bugs.launchpad.net/neutron/+bug/1521815
15:18:55 Launchpad bug 1521815 in neutron "DVR functional tests failing intermittently" [Low,Confirmed] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:19:18 This bug can't be reproduced and it kind of settled itself in the gate.
15:19:56 We are not sure what caused this issue in the gate. So let us keep an eye on this and move forward. regXboi has lowered the severity on this bug.
15:20:37 I think that's all we have for bugs today.
15:20:38 I've got nothing to add that isn't in the bug discussion
15:20:55 If you guys have anything else to discuss we can discuss, else we can move on.
15:21:00 let's move on
15:21:10 ok, that's all I have for bugs.
15:21:21 #topic gate-failures
15:21:42 How is gate failure looking with respect to dvr?
15:22:19 so the check pipelines have settled out after the metering hangup
15:22:29 regXboi: yes true
15:22:49 I'm concerned by a new dvr multinode failure here https://review.openstack.org/#/c/250075/
15:22:51 single node DVR can probably go back to voting - it is tracking neutron-full pretty well
15:23:00 which is probably related to the patch
15:23:00 * regXboi looks
15:23:27 need to investigate before merging
15:23:28 obondarev: is it causing failures in gate?
15:23:40 obondarev: I'm -1 that patch because of that
15:23:43 obondarev: I think we agreed that if it breaks the gate we can revert it.
15:24:08 We've not merged it yet, so we have time to investigate
15:24:11 at this point we don't want to introduce more issues in gate.
15:24:17 I've thrown a -1 on that patch
15:24:19 Swami: right
15:24:37 *because* that dvr-multinode failure isn't one I can just say "oh that's independent"
15:24:40 regXboi: fair enough
15:24:48 obondarev: if you have more data points please update it.
15:24:51 regXboi: right
15:25:05 Swami: I don't have any ATM
15:25:11 obondarev: ok
15:25:18 will try to find time to investigate
15:25:30 so right now I'm thinking we can ask for dvr to be voting again
15:25:36 and maybe multinode-full
15:25:39 ok I do have a debug patch to address the fip failures in the gate, but I am seeing another issue where the pings are not getting a response.
15:25:46 but multinode-dvr is still too high in its failure rate
15:25:48 regXboi: Let's get a patch up to make single node dvr voting. It might take a few more days of stability but at least we'll have it queued up.
15:25:48 regXboi: so the DVR multinode still seems double the regular multinode - 25%
15:26:09 haleyb: yes - that's what I mean by "still too high"
15:26:22 If everyone is in agreement then I can push a voting job patch just for the single node.
15:26:45 Swami: if you have time today, please spin the patch - I'm stuck in meetings all day :(
15:26:53 In the case of multinode we should first fix the live migration problem otherwise the tests will fail.
15:27:06 multinode-dvr still needs to be nv
15:27:15 regXboi: I will be in a meeting as well the entire day.
15:27:24 Swami: +1 on the singlenode, let me know if i can help
15:27:29 Swami, carl_baldwin: ok, I'll spin a patch here
15:27:30 regXboi: but i will try to push one.
15:27:46 heh - ok - you spin the singlenode dvr
15:27:51 I'll spin the multinode-full
15:28:00 because I'd like to have that queued up as well
15:28:01 regXboi: ok fine, I will spin it up.
15:28:10 ok, let us move on to the next item.
15:28:11 Swami: which live migration problem are you referring to?
15:28:36 obondarev: the block migration issue that is breaking with ssh.
15:29:01 Swami: ok
15:29:02 #topic performance-scalability
15:29:27 obondarev: anything to add in here?
15:29:33 I've put an initial WIP patch for dvr scheduling refactoring
15:29:46 #link https://review.openstack.org/#/c/254837/
15:30:01 continue working on it
15:30:15 obondarev: thanks
15:30:18 ++
15:30:30 #topic open-discussion
15:30:41 anything else we need to discuss here?
15:30:42 it is sad that https://review.openstack.org/#/c/143169 had to deal with all the complexity of dvr scheduling
15:30:57 which should be cleaned up soon
15:31:07 obondarev: yes I agree.
15:31:18 obondarev: that is the reason it took a while.
15:31:36 one of the reasons I guess :)
15:31:47 obondarev: agreed
15:31:58 ok, that's all I have today.
15:32:02 Swami: i found the infra change that disabled the dvr job, https://review.openstack.org/#/c/223173/ i can send a change to un-do that
15:32:13 haleyb: sure, that would work.
15:32:52 #action haleyb will re-enable dvr job
15:33:03 haleyb: Thanks
15:33:07 ok, multinode-full voting patch is up for review: https://review.openstack.org/#/c/212058/
15:33:27 you're faster than me
15:33:28 I admit to having an old patch set to rebase :)
15:33:40 regXboi: you might have had it all prepared.
15:33:45 regXboi: good work.
15:34:00 look at the number, and then look at the patch history
15:34:06 it's been hanging around for months
15:34:19 do we have anything else to discuss here?
15:34:27 if not we can end this meeting.
15:34:31 no, I'm good
15:34:43 Thanks for your attendance and see you all next week.
15:34:57 ok, thanks everyone, keep up the great work :)
15:35:02 #endmeeting