15:00:20 <mlavalle> #startmeeting neutron_l3
15:00:21 <openstack> Meeting started Thu Aug 31 15:00:20 2017 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:25 <openstack> The meeting name has been set to 'neutron_l3'
15:00:32 <Swami> hi
15:00:38 <haleyb> hi
15:01:14 <mlavalle> #chair haleyb Swami
15:01:14 <openstack> Current chairs: Swami haleyb mlavalle
15:01:31 <mlavalle> #topic Announcements
15:01:53 <mlavalle> Pike is being released this week
15:02:31 <mlavalle> so we will start the whole thing again and move to Queens now :-)
15:03:32 <mlavalle> We are a little more than one week away from the PTG
15:04:41 <mlavalle> I arrive Sunday 10th at around 2:30pm. Staying at the event hotel (Renaissance) until Saturday morning
15:05:12 <Swami> mlavalle: I am arriving late in the evening on Sunday and will be flying back on friday evening.
15:05:25 <haleyb> i'll be there Sunday as well
15:05:42 <mlavalle> Any other announcements from the team?
15:06:30 <Swami> nothing from me.
15:07:10 <mlavalle> I had a quick chat a couple of weeks ago with carl_baldwin. He is going to join us for dinner Monday or Tuesday
15:07:26 <Swami> mlavalle: great!
15:07:42 <mlavalle> He'll drive all the way from Fort Collins
15:07:55 <mlavalle> ok, moving on
15:08:00 <mlavalle> #topic Bugs
15:08:14 <mlavalle> Swami: go ahead, as usual
15:08:17 <Swami> mlavalle: thanks
15:08:34 <Swami> We had a critical bug in the grenade job that was filed yesterday.
15:08:54 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1713927
15:08:55 <openstack> Launchpad bug 1713927 in neutron "gate-grenade-dsvm-neutron-dvr-multinode-ubuntu-xenial fails constantly" [Critical,Confirmed]
15:09:39 <Swami> In case not everyone is aware of the problem: it seems the server is somehow providing a floatingip that is not bound to a host.
15:10:03 <Swami> The agent configures the floatingip without checking for the host.
15:10:22 <Swami> So the floatingip is residing on two different nodes.
15:11:00 <Swami> This happens only when a neutron-server is restarted after some time interval. At that time when a full-sync happens, we sneak in this duplicate floatingip.
15:11:20 <haleyb> Swami: kevin did put a reproducer in the bug this morning, and i had a discussion with kuba earlier, think we might have a workaround
15:11:51 <Swami> haleyb: I did see that jakub had posted a patch.
15:12:11 <haleyb> yes, but we also think we need this revert https://review.openstack.org/#/c/499263/
15:12:11 <Swami> haleyb: Yes, I also read kevin's reproduction steps.
15:12:32 <Swami> haleyb: yesterday I was trying to reproduce by restarting the agent. I could not reproduce it.
15:13:23 <Swami> haleyb: As I mentioned there are two problems.
15:13:45 <Swami> haleyb: mlavalle: From the server side, the notification should be only sent to the host that is hosting it.
15:14:13 <Swami> haleyb: mlavalle: on the client side it should check for host or dest_host, before configuring the floatingip.
15:14:53 <Swami> haleyb: mlavalle: While I have not figured out what is happening on the server side yet, on the client side we may have a better solution.
15:15:00 <haleyb> Swami: i think with the revert and jakub's change the agent side would be fixed, fixing the check queue
15:15:12 <Swami> haleyb: mlavalle: Yes I agree.
15:15:32 <mlavalle> so the host fix we merged last week was really a symptom?
15:15:59 <Swami> haleyb: mlavalle: But still, the check that we have under 'get_external_device_interface_name'
15:16:11 <Swami> haleyb: mlavalle: may not be the right place.
15:16:25 <Swami> haleyb: mlavalle: we should have this check before configuring the floatingip.
15:17:12 <Swami> #link https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_local_router.py#L103
15:17:31 <Swami> we should have this check at this line, so that no floatingips are configured.
15:17:45 <Swami> and only floatingips that are associated with that host will be configured.
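To make the agent-side check Swami describes concrete, here is a minimal sketch of filtering floating IPs by host before configuring them. The function name and the 'host'/'dest_host' dict keys are illustrative assumptions for this sketch, not the actual dvr_local_router.py code:

```python
# Sketch of the proposed agent-side guard: configure a floating IP only
# if it is bound to this agent's host. An unbound FIP (host is None) is
# skipped instead of being plumbed on every node, which is what produced
# the duplicate FIP across two nodes. Names are illustrative, not the
# real neutron API.

def filter_floating_ips(fips, agent_host):
    """Return only the floating IPs that belong on this host."""
    result = []
    for fip in fips:
        host = fip.get('host')
        dest_host = fip.get('dest_host')
        # Keep the FIP only when this agent's host matches either the
        # bound host or the migration destination host.
        if agent_host in (host, dest_host):
            result.append(fip)
    return result
```

With a filter like this applied before the configuration step, a full sync after a neutron-server restart would no longer install an unbound floating IP on the wrong node.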
15:18:31 <Swami> mlavalle: To your question: yes, it is a symptom that we saw last week, which was sending floatingips without a host.
15:18:44 <haleyb> but the agent shouldn't have sent, right?
15:18:47 <mlavalle> makes sense, thanks
15:19:20 <Swami> haleyb: Yes, you mean the 'server' shouldn't have sent in the first place.
15:20:11 <haleyb> Swami: right, sent by accident, or because 'host' not set
15:20:40 <Swami> haleyb: Yes.
15:21:27 <Swami> haleyb: mlavalle: The place where we check for the 'host' condition in the server has three different conditions, so we might have a leak in one of those checks.
15:22:23 <haleyb> Swami: so let's do the revert in stable/pike and/or master until we can fix correctly
15:22:37 <Swami> haleyb: mlavalle: Probably the fast approach is to fix the agent side to handle the fips based on the host and then we can take a look at the server side.
15:23:27 <Swami> haleyb: mlavalle: I think we should fix the agent first and no need to revert at this point.
15:24:18 <haleyb> Swami: i think without the pike revert we can't land anything, as it's now broken so grenade in master can't pass (from what I understood)
15:25:22 <haleyb> we can ask jakub in #neutron after meeting, but that's what i understood
15:25:38 <Swami> haleyb: Ok, if that is the case then we can decide on reverting.
15:27:17 <Swami> haleyb: mlavalle: So the plan is: let us check with jakub and see how his patch fares.
15:28:03 <Swami> haleyb: mlavalle: If it works and the grenade job allows it to merge, then we can go ahead and push that patch and need not revert. Otherwise we should revert.
15:28:05 <haleyb> grenade job is now doing pike->queens as of yesterday, so that changed things
15:28:31 <Swami> haleyb: what is that? I don't get it.
15:28:50 <mlavalle> yeah, the grenade equation changed
15:28:55 <haleyb> grenade configures "old" version, then upgrades to "new" version
15:29:06 <haleyb> so if pike is broken, then we can't upgrade
15:29:40 <haleyb> i think the revert fixes pike, then jakub's change fixes master enough to make progress
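For context, the grenade flow haleyb describes can be sketched roughly as follows; the stage names are illustrative, not the real job scripts:

```shell
#!/bin/sh
# Simplified sketch of the grenade upgrade job: deploy the "old"
# release, verify it, upgrade in place, verify again. If the old side
# (stable/pike) is broken, the job fails before the upgrade, so no
# master/queens change can pass until pike is fixed.
set -e

deploy_old() { echo "deploy stable/pike"; }
test_old()   { echo "smoke-test pike"; }
upgrade()    { echo "upgrade pike -> queens"; }
test_new()   { echo "smoke-test queens"; }

deploy_old
test_old
upgrade
test_new
```

This is why the revert has to land in stable/pike first: jakub's change on master alone cannot make the job pass.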
15:30:08 <Swami> haleyb: ok makes sense.
15:31:22 <Swami> haleyb: mean while I will try to see what is happening on the server side logic.
15:32:08 <Swami> Let us move on.
15:32:46 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1712913
15:32:48 <openstack> Launchpad bug 1712913 in neutron "Update DVR router port cause error with QoS rules" [Undecided,In progress] - Assigned to Slawek Kaplonski (slaweq)
15:33:00 <Swami> This is another bug that was filed against DVR.
15:33:40 <Swami> But by looking at the bug description and the logs, it seems that it might also happen with legacy routers.
15:34:16 <Swami> The issue is with the ovs_agent. There is a patch currently up for review.
15:34:39 <Swami> #link https://review.openstack.org/#/c/498598/
15:35:12 <Swami> Please take a look at the patch after we fix this critical issue and the gate is happy.
15:35:32 <mlavalle> ok
15:35:48 <Swami> The next bug in the list is
15:35:50 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1712795
15:35:52 <openstack> Launchpad bug 1712795 in neutron "Fail to startup neutron-l3-agent" [Undecided,New]
15:36:49 <Swami> This bug seems to be incomplete. I don't see any issue in neutron-l3-agent processing routers. I have asked a couple of questions about the branch and how it can be reproduced, but have not received any information yet.
15:36:57 <Swami> Until then we can mark it as incomplete.
15:37:34 <Swami> There are two other bugs that were filed against the multinode failure.
15:37:56 <Swami> Since we have already discussed the multinode failure, we can discuss next week where we stand.
15:38:05 <Swami> mlavalle: back to you.
15:38:18 <mlavalle> Swami: Thanks
15:38:26 <mlavalle> Good discussion
15:38:48 <mlavalle> I don't have any major bugs to discuss
15:38:56 <mlavalle> #topic Open Agenda
15:39:33 <mlavalle> Since we have that critical DVR bug breaking the gate, let's move to Open Agenda
15:40:07 <mlavalle> Any other topics we should discuss today?
15:40:34 <Swami> mlavalle: No I will keep working on the fix.
15:40:58 <haleyb> https://www.eventbrite.com/e/drinks-with-rdo-at-the-ptg-tickets-37396477872
15:41:05 <haleyb> RDO release party at PTG :)
15:41:23 <haleyb> i think that's open to all
15:41:51 <mlavalle> But I think you need to register. I will right now
15:42:13 <mlavalle> ok guys, thanks for attending
15:42:21 <mlavalle> #endmeeting