14:02:52 #startmeeting neutron_l3
14:02:53 Meeting started Wed Mar 6 14:02:52 2019 UTC and is due to finish in 60 minutes. The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:54 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:56 The meeting name has been set to 'neutron_l3'
14:03:12 hi
14:03:15 hi
14:03:24 hi
14:03:29 hi
14:03:33 hi
14:03:38 sorry for the delay. dog was not very cooperative during our walk. lots of stops to sniff
14:03:52 LOL
14:04:33 o/
14:04:37 o/
14:04:38 #topic Announcements
14:04:41 so a dog being a dog?
14:04:51 more or less
14:04:58 LOL
14:05:25 We all know this is the week of the Stein-3 milestone
14:05:40 so we are at the end of the cycle
14:06:44 I see messages in my inbox about PTL non candidacies and candidacies.... so we are in PTL candidacy season
14:07:06 any other announcements from the team?
14:07:37 ok, let's move on
14:07:47 ahhh, forgot
14:07:59 #chair liuyulong
14:08:00 Current chairs: liuyulong mlavalle
14:08:26 liuyulong: just let me know when you are ready to run this meeting....
14:08:38 #topic Bugs
14:09:01 Lots of bugs I'm working on now, : )
14:09:14 First bug for today is https://bugs.launchpad.net/neutron/+bug/1818334
14:09:15 Launchpad bug 1818334 in neutron "Functional test test_concurrent_create_port_forwarding_update_port is failing" [Medium,Confirmed]
14:09:32 we discussed this one yesterday during the CI meeting
14:09:42 liuyulong: are you working on it?
14:09:58 not yet
14:09:58 slaweq: should we increase its priority?
14:10:31 mlavalle: I don't think so, this one isn't happening very often IIRC
14:10:40 but let me check in logstash
14:10:59 yeah, I got a little more sense of urgency yesterday
14:11:39 liuyulong: so you are planning to work on it. Can I assign it to you?
14:11:55 looks like it failed 5 times in last 7 days
14:11:59 slaweq, yes, 20 times last 30 days.
14:12:06 mlavalle, OK
14:12:40 IMO we have much more important issues currently, but we can set it to high as it impacts our gates
14:12:51 +1
14:13:11 ok
14:13:45 The functional and fullstack tests are not very friendly to us.
14:13:47 liuyulong: are you dragon889 in launchpad?
14:13:59 mlavalle, yes
14:14:26 ok assigned to you
14:14:40 it would be good if you can get to it asap
14:15:26 Next one is https://bugs.launchpad.net/neutron/+bug/1818614
14:15:27 Launchpad bug 1818614 in neutron "Various L3HA functional tests fails often" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
14:15:45 I started looking into this today
14:16:03 slaweq: you are working on this one. Any comments?
14:16:06 for now I only pushed patch https://review.openstack.org/#/c/641127/
14:16:19 to add journal.log in functional tests logs
14:16:46 what I found in the logs I checked is that keepalived wasn't spawned for the router, like http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%20%5C%22line%20690%2C%20in%20wait_until_true%5C%22
14:16:51 sorry wrong link
14:17:03 http://logs.openstack.org/74/640874/2/check/neutron-functional/37e3040/logs/dsvm-functional-logs/neutron.tests.functional.agent.l3.extensions.test_port_forwarding_extension.TestL3AgentFipPortForwardingExtensionDVR.test_dvr_ha_router_failover_without_gw.txt.gz#_2019-03-05_23_36_44_978
14:17:06 this one is good
14:17:21 but I don't know why keepalived is not starting
14:17:36 I couldn't of course reproduce this issue locally :/
14:17:51 but I will continue to work on it and will update launchpad when I find something
14:17:54 looks like radvd didn't start either
14:18:13 haleyb: but radvd isn't always necessary I think
14:18:29 and IIRC it is like that in "good" runs too
14:18:44 ah, just noticed same error
14:20:18 Has anyone noticed/checked whether there is infra updating/upgrading?
14:20:31 I was looking e.g.
on keepalived version
14:20:37 it's been the same for a very long time
14:20:46 so that is not the case here
14:20:52 and it would reflect in other kinds of tests
14:20:55 wouldn't it?
14:21:00 like scenario
14:21:06 probably
14:21:24 but also please note that (almost) all scenario jobs are already running on Bionic
14:21:36 and functional tests are still legacy jobs and running on Xenial
14:21:41 ahhhh
14:21:44 maybe there is some difference there
14:21:56 I'm not sure
14:22:06 yeah, there may be some difference
14:22:25 yep, so I will continue to work on it, probably this evening and tomorrow
14:22:52 Has the dsvm-functional instance, I mean the virtual machine for devstack, changed?
14:23:41 liuyulong: I'm not 100% sure, maybe some packages were changed
14:24:09 I will try to compare with a job from e.g. 2 weeks ago
14:24:17 but I'm not sure that this will help
14:24:47 slaweq, cool, thanks
14:25:43 Next one is https://bugs.launchpad.net/neutron/+bug/1818015
14:25:44 Launchpad bug 1818015 in neutron "VLAN manager removed external port mapping when it was still in use" [Critical,New]
14:26:04 This one was classified as critical last week by bug deputy
14:26:25 but we don't see it in our jobs
14:26:48 and we don't have any other reports about it
14:26:52 am I right?
14:27:46 submitter indicates he cannot reproduce
14:28:06 so I will lower priority to medium and respond with some questions
14:28:09 makes sense?
14:28:24 makes sense
14:28:58 yes, we should get some l3 and ovs agent logs from the time when this happened for them
14:29:07 maybe You can even mark it as incomplete for now?
14:29:37 slaweq: yes, that's what I'll do
14:29:42 mlavalle++
14:29:59 Next one is https://bugs.launchpad.net/neutron/+bug/1795870
14:30:01 Launchpad bug 1795870 in neutron "Trunk scenario test test_trunk_subport_lifecycle fails from time to time" [High,In progress] - Assigned to Miguel Lavalle (minsel)
14:30:36 For this one I have two patches proposed as a fix. We know they work.
This is the latest run: http://logs.openstack.org/10/636710/5/check/neutron-tempest-plugin-dvr-multinode-scenario/24e0ec4/testr_results.html.gz
14:31:54 mlavalle: do You have links to those patches? or should I look for them in gerrit?
14:32:04 https://review.openstack.org/#/c/636710/ is one
14:32:22 https://review.openstack.org/#/c/639375/ is other
14:32:22 https://review.openstack.org/#/c/639375
14:32:28 is the other
14:32:46 thank You both haleyb and mlavalle :)
14:32:51 there are actually 4 in the bug, maybe the first two can be abandoned?
14:33:15 haleyb: yes, I'll do that. the other two were really tests
14:33:51 now I need to create a plausible test where:
14:33:58 1) a process is spawned
14:34:04 mlavalle: please check functional tests in those patches - it looks like the failures might be related
14:34:27 slaweq: I know the functional test I created didn't work
14:34:42 no, but some of the existing tests are failing
14:35:18 I will do that
14:35:26 thx
14:35:45 really what I am trying to get at is to ask for suggestions on the best way to test this
14:36:05 we don't have tests for kill filters in our tree, do we?
14:36:30 nope AFAIK
14:37:14 do you think a functional test is the best approach?
14:37:44 so do You want to spawn a process and then simply try to kill it?
14:38:11 yes, but that process has to call setproctitle
14:39:08 to change its name
14:39:25 its command^^^^
14:39:39 maybe we can add/change somehow existing L3 fullstack tests and check if after removing router e.g. keepalived processes are killed
14:39:51 (that's only an idea, I don't know if it's a good one)
14:40:29 ok, I'll keep digging
14:40:44 let's move on
14:40:58 any other bugs we should discuss today?
14:41:10 there were two new ones
14:41:18 https://bugs.launchpad.net/neutron/+bug/1818805
14:41:19 Launchpad bug 1818805 in neutron "Conntrack rules in the qrouter are not deleted when a fip is removed with dvr" [Undecided,New]
14:41:32 haleyb: was faster than me with them :)
14:41:36 thx haleyb
14:41:51 i have not triaged yet, but can take a look
14:42:03 ok, thanks
14:42:21 https://bugs.launchpad.net/neutron/+bug/1818824
14:42:23 Launchpad bug 1818824 in neutron "When a fip is added to a vm with dvr, previous connections loss the connectivity" [Undecided,New]
14:42:52 it's related in that it's conntrack w/DVR, so maybe there is a regression there on matching connections
14:43:08 this one is "interesting" because it's different behaviour for dvr and non-dvr routers
14:44:17 so I guess it's a bug in the dvr implementation, as existing connections shouldn't be broken IMO, but maybe it's a "feature" and the bug is in the non-DVR solution :)
14:44:47 "feature", yes :)
14:45:29 :)
14:46:06 are we saying this is not a bug?
14:46:24 I wanted to ask You as more experienced L3 experts :)
14:46:53 what is the expected behaviour, because it should be the same for each implementation IMO
14:46:57 no, just joking, i haven't fully looked at the second, but it's a difference in behavior between centralized/dvr so probably a bug
14:47:07 jinx
14:47:25 slaweq probably doesn't get that
14:47:32 LOL
14:47:37 I got it :)
14:47:43 so will you continue looking at it?
14:48:31 i can look at both as i've got a dvr setup running and should be easy to see the first i hope
14:48:49 How to transmit the 'previous connection' conntrack state from network node to compute node?
14:50:32 yeah, we can't do that. i hadn't read it completely but now see that's the difference
14:50:45 Centralized floating IPs may not have such an issue.
: )
14:50:56 i don't think connections using the default snat IP should continue once a floating IP is assigned
14:51:18 I mean dvr_no_external with centralized floating IPs.
14:52:07 right, but then we have different outcomes depending on deployment
14:53:33 ok, let's move on
14:53:42 any other bugs?
14:54:22 ok
14:54:29 #topic On demand agenda
14:54:43 I have one additional topic
14:55:34 in our downstream CI (Verizonmedia) we are seeing this unit test failing frequently: https://github.com/openstack/neutron/blob/master/neutron/tests/unit/scheduler/test_dhcp_agent_scheduler.py#L524
14:56:06 do any of you remember seeing this failure upstream?
14:56:27 nope
14:56:35 at least I don't remember
14:56:50 seems new and different to me
14:56:51 yeah me neither
14:57:10 not particularly, but there were some changes in the dhcp agent regarding the network list building i thought, if it's related
14:57:29 oh, that's the scheduler, never mind
14:57:50 haleyb: any quick pointers where to look?
14:58:45 mlavalle: i think they were agent changes, so maybe not related to this
14:58:56 ok cool. thanks
14:59:06 any other topics we should discuss today?
14:59:44 ok, thanks for attending
14:59:54 have a nice rest of your day
14:59:57 o/
14:59:58 #endmeeting