15:00:20 <tidwellr> #startmeeting neutron_l3
15:00:21 <mlavalle> o/
15:00:24 <haleyb> hi
15:00:25 <openstack> Meeting started Thu Aug  4 15:00:20 2016 UTC and is due to finish in 60 minutes.  The chair is tidwellr. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:28 <openstack> The meeting name has been set to 'neutron_l3'
15:00:42 <tidwellr> #chair mlavalle carl_baldwin
15:00:42 <openstack> Current chairs: carl_baldwin mlavalle tidwellr
15:01:19 <carl_baldwin> o/
15:01:44 <tidwellr> carl_baldwin is here, we do announcements! j/k
15:01:46 <tidwellr> #topic Announcements
15:03:34 <tidwellr> mid-cycle is coming 17th-19th
15:03:41 <tidwellr> #link https://etherpad.openstack.org/p/newton-neutron-midcycle
15:04:32 <tidwellr> anybody have a sense for what some of the hot topics might be?
15:05:12 <carl_baldwin> There have been some ML posts. There are lots of topics.
15:05:39 <carl_baldwin> Etherpad has a lot of info on it.
15:05:59 <haleyb> https://etherpad.openstack.org/p/newton-neutron-midcycle-workitems
15:06:00 <tidwellr> I asked because I didn't see much other than travel info in the etherpad
15:06:10 <haleyb> tidwellr: ^^
15:06:11 <tidwellr> ah, that's the one I was looking for
15:06:32 <carl_baldwin> You're right. I might have been thinking of another page.
15:06:34 <john-davidge> ah, haleyb beat me to it
15:06:47 <haleyb> and seeing the inside of pubs is a work item for some
15:06:49 <carl_baldwin> Yep, that page.
15:07:22 <carl_baldwin> The two should be cross-linked.
15:07:51 <mlavalle> haleyb: you not going?
15:08:48 <haleyb> mlavalle: no, on vacation with family that week, although i'll try to be online for some
15:09:13 <mlavalle> :-(
15:09:34 <mlavalle> Enjoy the vacation though!
15:09:48 <haleyb> yes, i tried to have the vacation moved to IE
15:10:39 <tidwellr> any more announcements?
15:11:35 <tidwellr> alright, moving on
15:11:41 <tidwellr> #topic Bugs
15:12:20 <tidwellr> looking at the agenda https://etherpad.openstack.org/p/neutron-l3-subteam, it looks like we should go over potential backports
15:12:27 <HenryG> I would like to highlight one bug
15:12:42 <HenryG> Bug 1562878
15:12:42 <openstack> bug 1562878 in neutron "L3 HA: Unable to complete operation on subnet" [High,Confirmed] https://launchpad.net/bugs/1562878 - Assigned to Ann Taraday (akamyshnikova)
15:13:23 * carl_baldwin looks...
15:14:03 <HenryG> It has been occurring regularly in the check/gate since July 24.
15:15:02 <carl_baldwin> HenryG: Should it be critical?
15:15:27 <carl_baldwin> Most gate failures are treated as critical. Especially if they're occurring regularly.
15:15:50 <HenryG> carl_baldwin: the recurrence is low enough that a recheck usually passes
15:16:06 <HenryG> But I am leaning towards critical
15:16:48 <carl_baldwin> HenryG: Do you have a logstash query URL you've been using?
15:17:15 <HenryG> I put it in comment #5
15:17:49 <HenryG> Remove the build_queue:"gate" option to see all the occurrences
15:17:58 <haleyb> jschwarz: is ^^ on your radar?  I know you've been looking at HA issues
15:19:03 <carl_baldwin> HenryG: Ah, I see it.
15:21:08 <carl_baldwin> HenryG: That looks like a lot of occurrences. But, many are in the same run.
15:21:15 <jschwarz> reading
15:21:23 <jschwarz> haleyb, yes, it's on my radar
15:21:45 <jschwarz> haleyb, it's in my queue (which is a bit full atm), but Ann should be back next week and I hope to cooperate with her on this
15:21:45 <carl_baldwin> I'm scratching my head over how retrying DBConnectionError will cause it.
15:21:54 <HenryG> carl_baldwin: I suck at logstash queries
15:22:34 <carl_baldwin> HenryG: Me too.
15:22:51 <jschwarz> also, I got this reproduced locally on a simple 2-node devstack deployment
15:23:07 <jschwarz> so that should give us a better understanding on why it's happening
15:23:13 <HenryG> carl_baldwin: I doubt the DBConnectionError patch causes it, I just located the last merge that went in before the bug started showing up.
15:24:05 <carl_baldwin> HenryG: Do you think maybe you just hit the limit of what logstash keeps around?
15:24:57 <carl_baldwin> I think it only keeps about 7 days of data.
15:25:20 <HenryG> carl_baldwin: It allows me to select 30 days from the drop-down.
15:26:34 <HenryG> Anyway, I think the jschwarz option is the better way to track this down.
15:26:52 * jschwarz lols @ "the jschwarz option"
15:26:57 <carl_baldwin> jschwarz: Is it reliably reproducible?
15:27:30 <jschwarz> carl_baldwin, I remember running a bunch of rally tests (5ish?) and it happened a few times (2-3)
15:27:48 <carl_baldwin> HenryG: I think all my queries get clamped at around 7 days even when selecting the 30 day option.
15:28:08 <carl_baldwin> jschwarz: Sounds good.
15:28:15 * mlavalle has never been able to get a 30 days query
15:28:26 <carl_baldwin> Should we keep it assigned to Ann?
15:28:31 <HenryG> carl_baldwin: then I have been on many wild goose chases :(
15:28:51 <carl_baldwin> HenryG: I've been on those.
15:28:56 <jschwarz> carl_baldwin, I think so - she'll come back next week and I'll discuss this bug with her and see if I should take it or not
15:29:10 <carl_baldwin> ok
15:29:33 <carl_baldwin> Let's move on.
15:29:59 <tidwellr> alright
15:30:16 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1604370
15:30:16 <openstack> Launchpad bug 1604370 in neutron "functional: test_legacy_router_ns_rebuild is unstable" [High,Fix released] - Assigned to Terry Wilson (otherwiseguy)
15:30:55 <tidwellr> looks like we can pull this off the agenda
15:31:01 <mlavalle> it seems the fix is released
15:31:32 <tidwellr> moving on
15:31:35 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1596075
15:31:35 <openstack> Launchpad bug 1596075 in neutron "Neutron confused about overlapping subnet creation" [High,In progress] - Assigned to Kevin Benton (kevinbenton)
15:31:51 <mlavalle> I spent some time this morning tracking this one
15:32:16 <mlavalle> It is a complicated affair, involving the quota engine, db retries and Galera
15:32:36 <mlavalle> 2 patchsets have been merged in relationship to it
15:32:49 <mlavalle> and kevinbenton is working on another 2 fixes:
15:33:08 <mlavalle> #link https://review.openstack.org/#/c/339226/
15:33:32 <mlavalle> #link https://review.openstack.org/#/c/346289/
15:34:22 <mlavalle> I'll keep tracking it
15:34:37 <tidwellr> mlavalle: thanks for staying on top of this
15:35:06 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1599329
15:35:06 <openstack> Launchpad bug 1599329 in neutron "Potential regression on handing over DHCP addresses to VMs" [High,In progress]
15:36:08 <carl_baldwin> In watch mode.
15:36:15 <tidwellr> yeah, looks like we're just waiting to see if this strikes again
15:36:47 <tidwellr> can we take https://bugs.launchpad.net/neutron/+bug/1605277 off the agenda?
15:36:47 <openstack> Launchpad bug 1605277 in neutron "[IPAM] 'Internal' ipam driver does not allow to delete all pools on subnet update" [High,Fix released] - Assigned to Carl Baldwin (carl-baldwin)
15:37:11 <carl_baldwin> Yes
15:37:28 <tidwellr> cool, done
15:37:45 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1603162
15:37:45 <openstack> Launchpad bug 1603162 in neutron "Pluggable IPAM rollback fails with reference driver" [High,In progress] - Assigned to Carl Baldwin (carl-baldwin)
15:38:04 <tidwellr> carl_baldwin: any luck with this one?
15:38:31 <carl_baldwin> I've had some discussion on the ML with kevinbenton . We have some ideas.
15:38:40 <carl_baldwin> I've got to get on this one quickly.
15:39:02 <carl_baldwin> I'll be working on it today.
15:39:04 <tidwellr> this is a blocker for the cutover to pluggable IPAM, right?
15:40:06 <carl_baldwin> Yes.
15:40:10 <tidwellr> ok
15:40:18 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1608406
15:40:18 <openstack> Launchpad bug 1608406 in neutron "BGP: DVR fip host routes query including legacy/HA fip routes" [Undecided,In progress] - Assigned to LIU Yulong (dragon889)
15:40:31 <tidwellr> just wanted to call this out as it has backport potential
15:40:53 <tidwellr> this is my worst BGP nightmare come true
15:41:20 <carl_baldwin> :(
15:42:00 <tidwellr> we're sending the wrong next-hop for a FIP
15:42:42 <haleyb> tidwellr: just for mitaka, right?  just updating bug tags
15:42:57 <tidwellr> both newton and mitaka
15:43:26 <haleyb> for backport i meant :)
15:43:32 <mlavalle> lol
15:43:44 <tidwellr> oh, right
15:43:53 <tidwellr> yes, mitaka for backport :)
15:44:03 <tidwellr> ugh, it's been a morning......
15:44:20 <tidwellr> I was able to reproduce locally, attaching a legacy router and a distributed router to the same external network results in a FIP on the legacy router also being announced as accessible via one of the FIP gateways
15:44:40 <tidwellr> I'm helping chase this down
15:45:07 * mlavalle appreciates that it is early in tidwellr time zone and he still shows up
15:46:10 <tidwellr> any more bugs to discuss?
15:46:33 <jschwarz> I have one but that can be left for the Open Discussion if there's time
15:46:34 <mlavalle> We have https://bugs.launchpad.net/neutron/+bug/1609540, filed by a certain carl_baldwin
15:46:34 <openstack> Launchpad bug 1609540 in neutron "Deleting csnat port fails due to no fixed ips" [Critical,In progress] - Assigned to Kevin Benton (kevinbenton)
15:46:49 <carl_baldwin> It is in watch mode.
15:47:09 <carl_baldwin> I'm going to keep an eye on it and hopefully reduce the severity soon.
15:47:36 <mlavalle> Thanks!
15:48:42 <mlavalle> I'll put it in the etherpad anyway
15:48:55 <tidwellr> alright, we don't have much time to dive into routed networks, FWaaS, RFEs, etc.
15:49:03 <mfranc213> hello, i'm filling in for njohnston again this week.  just 3 things:
15:49:09 <mfranc213> the handle_router method was split into add_router and update_router in the L3 Agent Extension Manager patch (https://review.openstack.org/#/c/339246/10..11/neutron/agent/l3/l3_agent_extension.py)
15:49:18 <tidwellr> I'm thinking we move to open discussion
15:49:20 <mfranc213> so sorry
15:49:48 <tidwellr> mfranc213: no worries, go for it
15:50:01 <mfranc213> next patchset for the FWaaS L3 agent extension was issued yesterday (Refactor FWaaS' L3 agent extension)
15:50:06 <mfranc213> work on the FWaaS plugin is proceeding and i believe we will get a patchset pushed in the next couple of days.
15:50:08 <tidwellr> #topic Open Discussion
15:50:09 <mfranc213> that's it!
15:50:33 <steve_ruan> https://review.openstack.org/#/c/337662/ need 1 more "+2"
15:50:37 <tidwellr> mfranc213: thanks for the update
15:50:45 <steve_ruan> anyone can help?
15:51:24 <carl_baldwin> haleyb: could you take a look ^
15:51:45 <carl_baldwin> mfranc213: I'll take a look at that change.
15:51:48 <john-davidge> A couple of updates on service subnets. Brian noticed a possible problem with the deletion logic for https://review.openstack.org/#/c/337851/ and pushed a fix yesterday. It should be good to go. And a WIP for the follow-up patch is here https://review.openstack.org/#/c/350613/ - It's almost ready for review.
15:51:49 <haleyb> i'm on it
15:51:49 <tidwellr> steve_ruan: thanks for making some noise about that one
15:51:54 <mfranc213> thank you carl_baldwin
15:52:26 <jschwarz> I filed https://bugs.launchpad.net/neutron/+bug/1609738 which deals with a weird state HA routers can get into while being created/updated. The solution I have in mind involves refactoring update_router_db in l3_hamode_db.py
15:52:26 <openstack> Launchpad bug 1609738 in neutron "l3-ha: a router can be stuck in the ALLOCATING state" [Undecided,New] - Assigned to John Schwarz (jschwarz)
15:52:52 <haleyb> john-davidge: and the OSC patch https://review.openstack.org/#/c/342976/ just got a +2
15:53:05 <jschwarz> such that modifying admin_state_up will unschedule/schedule the router (as opposed to the current ha attribute change which does this one)
15:53:16 <john-davidge> haleyb: woop \o/
15:53:16 <jschwarz> I need more opinions though on this matter
15:53:39 <mlavalle> john-davidge: I took a look at https://review.openstack.org/#/c/350613
15:54:30 <john-davidge> mlavalle: Yes, thanks for the review. I've already incorporated it into the next patch. Should help with performance at scale
15:56:55 <mlavalle> jschwarz: is there a patchset up for review or should we comment in the bug?
15:57:08 <jschwarz> mlavalle, comments on the bug will be much appreciated
15:57:27 <jschwarz> it's quite a refactor and I started working on it today and it broke a few things :<
15:58:02 <mlavalle> yeap, that's what big refactors do
15:58:07 <jschwarz> XD
15:59:46 <mlavalle> it seems bugs left us exhausted today :-)
16:00:13 <tidwellr> mlavalle: indeed
16:00:25 <tidwellr> thanks everyone!
16:00:31 <tidwellr> #endmeeting