15:02:31 <carl_baldwin> #startmeeting neutron_l3
15:02:31 <openstack> Meeting started Thu Oct 16 15:02:31 2014 UTC and is due to finish in 60 minutes.  The chair is carl_baldwin. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:32 <seizadi> Hi
15:02:34 <openstack> The meeting name has been set to 'neutron_l3'
15:02:37 <carl_baldwin> Rajeev: hi
15:02:49 <carl_baldwin> #topic Announcements
15:03:12 <carl_baldwin> #link https://wiki.openstack.org/wiki/Meetings/Neutron-L3-Subteam
15:03:22 <carl_baldwin> I don’t think I have any announcements.  Anyone else?
15:04:04 <carl_baldwin> #topic l3-high-availability
15:04:15 <carl_baldwin> safchain: hi.  Anything new?
15:05:13 <carl_baldwin> #undo
15:05:14 <openstack> Removing item from minutes: <ircmeeting.items.Topic object at 0x3188550>
15:05:32 <carl_baldwin> Actually, I thought of an announcement.  It just hit me that Juno final should be today.
15:05:44 <carl_baldwin> #topic bgp-dynamic-routing
15:05:46 <carl_baldwin> devvesa: hi
15:06:23 <devvesa> hi
15:07:32 <carl_baldwin> I think we’re almost there with the blueprint.
15:08:45 <carl_baldwin> The question remaining is whether the dr agent is associated with a network or a router.  I’m still thinking about it.
15:09:07 <devvesa> me too. I understand your point, it can become annoying to add all the routers
15:09:27 <matrohon> me too :) an alternative could be to associate it to a l3 agent?
15:09:29 <devvesa> but... it can be the use case where you don't want to advertise all?
15:09:34 <carl_baldwin> I can see advantages and disadvantages to both strategies.  I’m wondering if we need to hear some more opinions.
15:09:49 <devvesa> matrohon: the alternative is to associate the router to the dr_agent
15:10:12 <devvesa> so: a network with all its routers or a single router one by one?
15:11:14 <carl_baldwin> matrohon: I’m not sure an l3 agent fits.  they can be associated with many routers.  There can be many l3 agents associated with an external network.  In some cases, there can be more than one external network associated to an l3 agent.
15:11:25 <ryu25> i see attaching to a router as more intuitive to the users, but providing a way to do 'bulk attach' by specifying external network sounds useful.  Perhaps allow both?
15:11:58 <carl_baldwin> ryu25: I was just going to suggest maybe both too.  We could start with routers initially maybe.
15:12:28 <carl_baldwin> That complicates the db model a little bit…  It is something to think about.
15:13:36 <carl_baldwin> Would we model that with two types of associations (two tables) or one table that can handle different types of association?
15:13:40 <carl_baldwin> devvesa: ^
15:14:09 <devvesa> yes, why not?
15:14:30 <devvesa> another thing to talk about is if we forget the discovery or not
15:14:45 <devvesa> I removed any discover reference in the spec, but you seemed concerned about it
15:14:50 <carl_baldwin> haleyb: hi
15:14:56 <ryu25> right...  the db model also feels more natural if the router and dynamic routing are directly attached.  If the only reason for going with external network is to provide a way to associate it with multiple routers, then it sounds like it should be thought of as an API level enhancement
15:14:58 <haleyb> carl_baldwin: hi
15:15:28 <carl_baldwin> devvesa: Are you thinking of putting that off to another blueprint?  How important is the discover to your use case?
15:15:42 <devvesa> it is not important, we just need to advertise a full range
15:16:19 <devvesa> let's avoid the discovery but try to do things in a way that won't make it difficult to add in the future
15:16:25 <ryu25> agreed that advertisement is more useful than discovery
15:16:25 <devvesa> what do you think?
15:16:26 <carl_baldwin> Then I think leaving it out of this blueprint is fine.  I’d like to leave the door open to addressing it later.
15:17:06 <devvesa> ok
15:17:06 <carl_baldwin> It does make sense to address advertisement first and discovery later if there is demand for it.
15:18:35 <carl_baldwin> So, have we decided to implement an association with a router now and follow on later with an association to a network?  I’m happy with that.
15:18:59 <devvesa> I'm happy too
15:19:30 <carl_baldwin> I think I’m leaning toward adding a property to the association to indicate if it is a router or a network.  What do you think?
15:19:48 <ryu25> sounds good to me.  In the future, tenants will want to add BGP to their own routers to advertise routes over VPN too
15:20:38 <carl_baldwin> ryu25: Yes, I think so.  I still see an admin involved in that.
15:21:14 <carl_baldwin> devvesa: okay.  Anything else?  I hope we can get this blueprint in soon.
15:21:18 <devvesa> Ok, I'll take that into account in the next spec review
15:21:32 <carl_baldwin> devvesa: great.  Thanks for your work and patience.
15:21:37 <devvesa> me too! thanks carl!
15:21:58 <carl_baldwin> thanks ryu25 and matrohon for your input too.
15:22:09 <carl_baldwin> #topic L3 Agent Refactoring
15:22:11 <ryu25> carl_baldwin:  anytime!
15:22:20 <devvesa> please matrohon, review the spec, we want to do something useful for VPN guys
15:22:38 <carl_baldwin> devvesa: matrohon: +1
15:23:00 <matrohon> devvesa : I'll send a potential design for IPVPN attachment
15:23:16 <devvesa> looking forward to seeing it
15:23:31 <matrohon> you can have a first look here :  https://docs.google.com/drawings/d/1NN4tDgnZlBRr8ZUf5-6zzUcnDOUkWSnSiPm8LuuAkoQ/edit
15:23:40 <carl_baldwin> I have to apologize that I have not posted the spec for the refactoring of the L3 agent.  I wrote it as promised but I need a sign off from our legal team to post it.  I’m still trying to get that.
15:24:12 <yamahata> hi, I have uploaded WIP patch for l3 agent.
15:24:24 <yamahata> https://review.openstack.org/#/c/128846/
15:25:00 <yamahata> That's quite incomplete, but enough to show the idea/direction
15:25:35 <Swami> Will this be part of the L3 agent refactoring effort or will this be a separate patch
15:25:54 <carl_baldwin> yamahata: do you have a blueprint?  This is a pretty big change and looks like it has a big chance of conflicting with other efforts.
15:25:58 <yamahata> I expect it will be a part of l3 agent refactoring.
15:26:31 <matrohon> yamahata : this leads to a modular l3 agent?
15:26:31 <Swami> yamahata: then you may need to add in your idea to the same blueprint that carl was mentioning
15:26:40 <yamahata> carl_baldwin: no blueprint.
15:27:02 <yamahata> carl_baldwin: so I'd like to discuss before going further.
15:27:09 <matrohon> is modular l3 agent the overall direction? (sorry for the newbie question:)
15:27:21 <carl_baldwin> yamahata: I see value in this.  Could you hold off on it for just a bit while we try to bring our efforts together?
15:27:41 <yamahata> carl_baldwin: Sure.
15:27:51 <carl_baldwin> matrohon: We would like to get closer to a modular agent.  But that is a bit longer term.
15:28:20 <matrohon> carl_baldwin : thanks
15:28:44 <carl_baldwin> yamahata: I like the enthusiasm.  I don’t want to squash it.  I will have a look at what you have proposed and try to work it in to the overall effort.
15:29:07 <carl_baldwin> matrohon: We won’t get there in one step.  However, the L3 should get more modular as the work progresses.
15:29:30 <yamahata> carl_baldwin: so far I have only heard "refactoring of the l3 agent". Can you please elaborate on what kind of refactoring?
15:29:46 <yamahata> My motivation is routervm.
15:30:08 <matrohon> carl_baldwin : fine : will try to help since I think i'll need it for BGPVPN
15:30:19 <carl_baldwin> yamahata: I was hoping to have the blueprint up to answer these questions.  Sigh.  Let me try to give you the gist of it.
15:30:29 <yamahata> So I'd like to split out device specific logic and the logic of polling/syncing
15:31:20 <carl_baldwin> I’d like to start by adding an encapsulation for a router to remove all of the router stuff from l3_agent.py.  This will leave the agent class to handle RPC updates and queuing them to workers.
15:31:30 <carl_baldwin> This sounds similar to your goal.
15:31:55 <yamahata> carl_baldwin: yes, sounds similar.
15:32:24 <carl_baldwin> Another early step will be to encapsulate a namespace and create a manager that will handle the clean up of stale ones.
15:33:09 <Swami> I like the idea. +1
15:33:19 <carl_baldwin> Then, I’d like to encapsulate plugging ports, especially the external gateway port since there is so much logic around that for DVR vs legacy routers.  All of the floating ip namespace handling would go under this encapsulation.
15:33:56 <mrsmith> great ideas carl_baldwin
15:34:08 <carl_baldwin> I’m thinking of maybe a driver model to handle the differences between  DVR and legacy connection to the external network.
15:35:02 <carl_baldwin> I think with those initial steps, the L3 agent will be significantly less tangled up in itself than it is now.
15:35:25 <carl_baldwin> It should enable further work to clean it up even more.
15:36:07 <carl_baldwin> I’ll try to get the sign-off to post the blueprint and will add you all as reviewers.
15:37:13 <carl_baldwin> #topic neutron-ovs-dvr
15:37:26 <Swami> carl_baldwin: hi
15:37:29 <carl_baldwin> Swami: mrsmith: Rajeev: anything since yesterday?
15:37:45 <Swami> We got another bug filed regarding the floating IP status update
15:37:48 <carl_baldwin> Swami: I did not get a chance yesterday to look at the locking issues much.  I have that on my plan for today.
15:37:54 <Swami> We are looking into it right now.
15:38:08 <carl_baldwin> Swami: Ah yes, the new tempest test was added for that.
15:38:13 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1381617
15:38:34 <Swami> It seems to be a timing issue, but we are looking into it to confirm.
15:38:51 <Swami> We had some headway on the lockwait issue.
15:39:10 <carl_baldwin> Swami: headway since yesterday?
15:39:14 <carl_baldwin> What have you found?
15:39:38 <Swami> On the lockwait it seems that a port is deleted by "gateway_clear".
15:39:53 <Swami> The router_interface_delete is trying to delete the same port.
15:40:30 <Swami> When it goes to the delete_port, it calls "db.get_locked_port_and_binding".
15:40:51 <Swami> In a  normal scenario, this function returns the port that was deleted and then it logs a message and quits.
15:41:03 <Swami> But in this case this function does not return.
15:41:16 <Swami> the query to find out the ports that are locked fails.
15:42:44 <Rajeev> On another note: Saw the re-occurrence of the race condition in l3 processing floating ips.
15:43:04 <Rajeev> filed defect 1381238
15:43:15 <Rajeev> #link https://bugs.launchpad.net/neutron/+bug/1381238
15:43:23 <carl_baldwin> Swami: So, let me see if I have this right...
15:43:31 <Swami> This is the query that fails: port = (session.query(models_v2.Port).enable_eagerloads(False).filter_by(id=port_id).with_lockmode('update').one())
15:44:00 <carl_baldwin> delete_port tries to get_locked_port_and_binding and it cannot because gateway_clear has already obtained the lock to delete the port?
15:44:37 <carl_baldwin> Rajeev: Thanks for the link.
15:44:44 <Swami> Yes, in this case gateway_clear has already deleted the port; it is not clear whether it released the lock after deleting the port.
15:45:14 <Swami> The log message shows that the port has already been deleted.
15:45:57 <Swami> I have updated the launchpad bug with the neutron-server log and I have mentioned the port-id that was causing this problem. Take a look at it.
15:46:26 <carl_baldwin> Swami: okay.  I couldn’t yesterday but will probably have some time today.
15:46:35 <Swami> carl_baldwin: thanks
15:46:47 <carl_baldwin> Swami: thank you
15:47:31 * carl_baldwin goes to look at bug 1381238
15:48:31 <carl_baldwin> Rajeev: How is this bug going?
15:49:00 <Rajeev> carl_baldwin: have a review up with possible fix
15:49:19 <Rajeev> #link https://review.openstack.org/#/c/128131/
15:50:22 <carl_baldwin> Rajeev: Could you link the review to the bug?  Somehow it did not get linked.
15:50:42 <Rajeev> sure, didn't realize that.
15:50:56 <carl_baldwin> ^ It did not get linked because the first PS didn’t mention the bug in the commit msg.  I’ve had that happen.
15:51:23 <carl_baldwin> Swami: mrsmith: Rajeev:  Anything else?
15:51:30 <Rajeev> I see, how do I link it now ?
15:51:52 <Swami> carl_baldwin: I don't think I have anything more to add. We are still focussed on bugs and backlog items.
15:51:59 <carl_baldwin> Swami: Thanks.
15:52:06 <Rajeev> On bug #1381617
15:52:06 <carl_baldwin> Rajeev, just paste a link in a comment.
15:53:10 <Rajeev> from the logs it appears the test is checking the status of the floating ips too quickly.
15:53:43 <Rajeev> It would be a timing issue with the test as Swami mentioned earlier.
15:53:54 <mrsmith> dvr takes a little longer to setup FIPs since we have extra ns, port, etc
15:54:00 <Rajeev> #link https://bugs.launchpad.net/neutron/+bug/1381617
15:54:10 <Rajeev> mrsmith: yes
15:55:23 <Rajeev> no more updates from me, except: take a look at review https://review.openstack.org/#/c/128131/
15:57:28 <carl_baldwin> Thanks all.  We’re about out of time.
15:57:34 <carl_baldwin> Keep up the good work.
15:57:43 <carl_baldwin> #endmeeting