09:00:19 <oanson> #startmeeting Dragonflow
09:00:20 <openstack> Meeting started Mon Nov 14 09:00:19 2016 UTC and is due to finish in 60 minutes.  The chair is oanson. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:23 <openstack> The meeting name has been set to 'dragonflow'
09:00:27 <oanson> Hello.
09:00:34 <dimak> Good morning
09:00:35 <oanson> Welcome to Dragonflow weekly.
09:00:37 <lihi> Hi
09:00:39 <rajivk> Hi
09:00:45 <oanson> Reminder, the agenda is posted here: https://wiki.openstack.org/wiki/Meetings/Dragonflow
09:00:53 <xiaohhui> hello
09:01:03 <oanson> I doubt we will cover everything, but here's hoping
09:01:24 <oanson> #info yuli_s dimak lihi rajivk xiaohhui In meeting
09:02:01 <oanson> All right. Let's start.
09:02:07 <oanson> #topic Ocata Roadmap
09:02:18 <oanson> There are some specs that were uploaded this week
09:02:39 <oanson> I think they need to be merged by Thursday to make the first milestone
09:03:16 <oanson> Let's start with IPv6
09:03:29 <oanson> #link IPv6 spec https://review.openstack.org/#/c/396226/
09:04:00 <oanson> lihi, I take it you fixed the previous comments on the spec?
09:04:11 <oanson> Anyone, if you have any open questions about it, this is the time
09:04:12 <yuval> hey all :)
09:04:17 <lihi> Yes. I've uploaded an update to the spec a few minutes ago.
09:05:02 <oanson> All right. No questions are good questions :)
09:05:05 <oanson> Next up is SFC.
09:05:18 <oanson> #link Service Function Chaining (SFC) spec https://review.openstack.org/#/c/396226/
09:05:25 <oanson> #link Service Function Chaining (SFC) spec https://review.openstack.org/#/c/394498/
09:05:34 <oanson> Wrong copy-paste buffer.
09:05:40 <dimak> I'm waiting for more comments on the spec
09:06:07 <oanson> All right. I should be able to go over all the specs today.
09:06:12 <dimak> I've addressed all the comments and uploaded a new patchset yesterday
09:06:38 <oanson> Yes. It looks like there are no new comments yet.
09:06:40 <dimak> I'd love to see the lbaas spec to see if I can use it for load balancing between SFs
09:07:00 <xiaohhui> is there a lbaas spec?
09:07:08 <oanson> I received a draft of the LB spec.
09:07:39 <oanson> The author will upload it today.
09:07:51 <xiaohhui> OK
09:08:02 <dimak> other than that, I hope the spec is extensive enough
09:09:03 <oanson> Looks like no one has any complaints :)
09:09:28 <dimak> that's all for me then
09:09:45 <oanson> Next up is chassis health/service health report
09:10:32 <xiaohhui> I have replied to the comments in the spec and hope rajivk can give some input based on that
09:10:35 <oanson> #link Is chassis alive support https://review.openstack.org/#/c/385719/
09:10:58 <oanson> xiaohhui, rajivk, maybe you'd like to take this chance for the discussion?
09:11:05 <xiaohhui> Sure
09:11:10 <oanson> We may be able to close any loose ends quickly in this format
09:11:30 <rajivk> xiaohhui, can you share your thought?
09:11:44 <rajivk> I could not check my mail on Gmail today
09:11:54 <rajivk> it is banned in my organization.
09:12:21 <rajivk> What do you think, should we do it together or separately?
09:12:23 <xiaohhui> I added comments at https://review.openstack.org/#/c/385719/7/doc/source/specs/support_check_chassis_alive.rst@39
09:12:39 <xiaohhui> to add my thought about service monitor
09:13:13 <oanson> xiaohhui, I see you mostly recommend using Neutron's ProcessManager?
09:13:39 <xiaohhui> Yeah, I think that will be simpler
09:13:55 <oanson> Can we use that for the DF local controller too?
09:14:25 <xiaohhui> I don't think so, the df local controller should act as the controller to other services.
09:14:27 <oanson> I also saw it has some accuracy issues. Is that something you ran into? (Or was that only in my setup)
09:14:28 <rajivk> what do you think about each service reporting its own status?
09:15:14 <xiaohhui> oanson: what do you mean by accuracy issues?
09:15:44 <oanson> Sometimes it shows agents that have failed as still alive
09:15:44 <xiaohhui> rajivk: I think that is ok too, but what's the purpose for that?
09:15:53 <oanson> And shows agents that are still alive as failed
09:16:12 <xiaohhui> You mean the neutron agent status, right?
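The agent-status flapping oanson describes comes down to a staleness threshold: Neutron marks an agent dead once its last heartbeat is older than agent_down_time, so a late report (or clock drift) flips the status either way. A minimal sketch of the rule, assuming Neutron's documented defaults (report_interval=30s, agent_down_time=75s); this shows the pattern, not Neutron's actual code:

    # Sketch of the heartbeat staleness rule, not Neutron's implementation.
    # An agent is "down" once its last heartbeat is older than agent_down_time;
    # with report_interval close to agent_down_time, one missed or late report
    # is enough to make a live agent look failed (and vice versa on recovery).
    from datetime import datetime, timedelta

    REPORT_INTERVAL = timedelta(seconds=30)   # default report_interval
    AGENT_DOWN_TIME = timedelta(seconds=75)   # default agent_down_time

    def is_agent_down(heartbeat_timestamp, now=None):
        now = now or datetime.utcnow()
        return now - heartbeat_timestamp > AGENT_DOWN_TIME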
09:16:43 <oanson> With Neutron's ProcessManager, my understanding is that each process sends a heartbeat, and you have a monitor reporting that heartbeats are missing. xiaohhui, is that correct?
09:17:56 <xiaohhui> Neutron's ProcessManager runs in one process, and in that process the monitored process is checked; if it dies, it will be restarted
09:18:20 <xiaohhui> For example the metadata-ns-proxy in neutron-l3-agent
09:18:55 <xiaohhui> neutron-l3-agent manages  metadata-ns-proxy by using ProcessManager, to keep it always alive.
09:19:50 <oanson> And metadata-ns-proxy updates the Neutron database with a heartbeat_timestamp? This way the l3 agent knows the ns-proxy service is alive?
09:20:23 <xiaohhui> no, the l3 agent checks metadata-ns-proxy's aliveness by checking its pid file
09:20:56 <xiaohhui> no heartbeat is required for metadata-ns-proxy
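For readers unfamiliar with it, the pid-file check xiaohhui describes boils down to the pattern below. The class and helper names here are illustrative, not the real API; Neutron's ProcessManager (neutron/agent/linux/external_process.py) has a richer interface, but the liveness test plus respawn-on-failure is the essence of "keep it always alive":

    import os
    import subprocess

    class SimpleProcessManager(object):
        """Illustrative stand-in for the ProcessManager pattern."""

        def __init__(self, pid_file, cmd):
            self.pid_file = pid_file
            self.cmd = cmd                  # e.g. ['neutron-ns-metadata-proxy', ...]

        @property
        def pid(self):
            try:
                with open(self.pid_file) as f:
                    return int(f.read().strip())
            except (IOError, ValueError):
                return None

        @property
        def active(self):
            pid = self.pid
            if pid is None:
                return False
            try:
                os.kill(pid, 0)             # signal 0: existence check only
                return True
            except OSError:
                return False

        def ensure_running(self):
            # Called periodically by the owning agent: respawn if dead.
            if not self.active:
                subprocess.Popen(self.cmd)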
09:21:22 <oanson> I'm a little confused.
09:21:50 <oanson> You recommend that the local controller verifies the health of other services (metadata, l3-agent, publisher) by checking their PIDs.
09:22:03 <oanson> And reporting at intervals to the database for itself?
09:22:51 <oanson> And have the publisher on the Neutron server be called via the ProcessManager? Where health-check is managed automatically?
09:23:07 <xiaohhui> yeah, that is the model of neutron agents for now. Agents report to the neutron db periodically, while sub-processes are managed by ProcessManager
09:23:36 <oanson> I see.
09:24:18 <xiaohhui> the publisher can be managed by the neutron server, as it only needs to work with the neutron server (it may not even need to live with the neutron server)
09:24:43 <oanson> We could maintain that model. Have the local controller monitor other df services. When it reports to the database, it will report the health of all services. The only issue is that these services are started outside of the controller.
09:25:14 <oanson> The publisher has to live with the neutron server, since they communicate over IPC. Currently, it is started externally (but obviously this can be changed)
09:25:38 <oanson> I think it is started externally to avoid the q-svc forking issue.
09:25:38 <xiaohhui> If we decide to go that way, some changes need to be made to how the processes are started
09:26:10 <oanson> Yes.
09:26:38 <oanson> Good thing this is the start of the cycle :)
09:27:16 <oanson> The metadata service can be started from the DF metadata app.
09:27:16 <xiaohhui> I am thinking of adding more information to the chassis; we can definitely add the status of the sub-processes to the chassis too.
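To make that concrete, a hypothetical sketch of what a chassis report carrying sub-process status could look like; nb_api.update_chassis, the service names, and the field names are assumptions for illustration, not the current Dragonflow NB API:

    import time

    # Services the local controller would watch via a ProcessManager-style check.
    MONITORED_SERVICES = ('df-metadata-proxy', 'df-publisher', 'df-l3-agent')

    def report_chassis_health(nb_api, chassis_id, managers):
        """managers maps a service name to an object with an `active` property."""
        service_status = {name: mgr.active for name, mgr in managers.items()
                          if name in MONITORED_SERVICES}
        # Hypothetical NB API call: one periodic write covers both the chassis
        # heartbeat and the health of everything the controller monitors.
        nb_api.update_chassis(chassis_id,
                              heartbeat_timestamp=time.time(),
                              services=service_status)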
09:27:32 <oanson> We can write a centralised snat app to maintain df-l3-agent.
09:27:55 <xiaohhui> yeah, so that if the user doesn't add the metadata app, the metadata proxy service doesn't need to be started
09:28:12 <oanson> I need to review the Neutron code, but I'm sure we can find the correct place to add the publisher service startup code
09:28:25 <oanson> xiaohhui, yes. Same for the l3 agent.
09:28:26 <xiaohhui> we can wait for the distributed snat and then remove df-l3-agent, right?
09:28:34 <oanson> That's the plan
09:28:55 <oanson> But with this change, centralised snat can still be used, as a DF app.
09:29:14 <oanson> Which is more in-line with our global design
09:29:40 <xiaohhui> I guess that is something that can be implemented.
09:29:52 <oanson> Sounds like we're in agreement. rajivk, any comments?
09:30:15 <rajivk> I am still going through your discussion
09:30:21 <rajivk> :)
09:30:26 <rajivk> You can continue
09:30:29 <oanson> All right. We can take a second to breathe :)
09:30:47 <xiaohhui> We can continue our discussion in the spec, if we miss something here.
09:30:54 <oanson> Sounds like a plan
09:30:56 <rajivk> No, it is ok, you carry on with the meeting.
09:31:14 <rajivk> I will need extra effort to understand what you mean :)
09:31:38 <oanson> rajivk, no worries. We're here for any follow-ups.
09:31:47 <rajivk> oanson, thanks :)
09:32:02 <oanson> I'll skip ahead for a bit:
09:32:15 <oanson> LBaaS, as I said, the spec should be up today.
09:32:39 <oanson> I also received a draft for sNAT. I can't promise it will be up today, but I hope by tomorrow.
09:32:44 <irenaber> oanson, to support V2 apis?
09:32:50 <oanson> Yes
09:32:50 <irenaber> for LBaaS
09:33:08 <oanson> V1 has been removed in Newton, so supporting it is a bit...
09:34:10 <oanson> Let's move on to Tap as a service.
09:34:32 <oanson> #link Tap as a service (TAPaaS) spec https://review.openstack.org/#/c/396307/
09:35:20 <oanson> yamamoto added some comments about the TAP location.
09:35:37 <oanson> yuli_s, would you like to discuss the two TODO items?
09:36:08 <oanson> no. 1 (TAP destination port on different compute nodes) should be easy to tackle.
09:36:13 <yuli_s> yes, sure
09:36:28 <oanson> We already have everything in place. Just need to update the correct parameters, and send to the table that handles tunneling.
09:36:46 <yuli_s> I got a comment from Yamamoto
09:37:05 <yuli_s> that the packets should be intercepted after the SG
09:37:05 <oanson> Yes. He mentions that security groups should be taken into account.
09:37:14 <oanson> Could you please verify and update the spec?
09:37:28 <yuli_s> so, I need to update the spec with his comments
09:37:33 <oanson> Yes
09:38:13 <yuli_s> I will review the rules and release a new version of the spec today
09:38:35 <oanson> Great.
09:38:44 <oanson> About TAP of a TAP. Do you have any thoughts?
09:38:46 <yuli_s> regarding the second item
09:38:52 <yuli_s> yes, tap of the tap
09:39:36 <oanson> Any thoughts?
09:39:44 <yuli_s> i think I need additional time
09:39:47 <yuli_s> to research this
09:39:54 <oanson> Sure.
09:40:11 <yuli_s> because dimak suggested that it can create a loop
09:40:21 <oanson> Sure. Worst comes to worst, we can leave it as a limitation in Ocata.
09:40:35 <yuli_s> ok, great
09:41:01 <yuli_s> I would like to get the opinion of all the experts
09:41:01 <oanson> But that will have to be written in the spec. And if you can make it after all, that would be better
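On the tap-of-a-tap question, the loop dimak warns about can be caught with a plain cycle check over the existing tap flows before accepting a new one. A rough sketch under an assumed input shape (a list of (source_port, destination_port) pairs); this is just the idea, not spec text:

    def creates_tap_loop(tap_flows, new_source, new_destination):
        """Return True if adding (new_source -> new_destination) closes a loop.

        tap_flows is assumed to be a list of (source_port, destination_port)
        pairs, where traffic seen on source_port is mirrored to destination_port.
        """
        edges = {}
        for src, dst in list(tap_flows) + [(new_source, new_destination)]:
            edges.setdefault(src, set()).add(dst)

        seen = set()
        stack = [new_destination]
        while stack:
            port = stack.pop()
            if port == new_source:
                return True      # mirrored traffic would reach the tapped port again
            if port in seen:
                continue
            seen.add(port)
            stack.extend(edges.get(port, ()))
        return False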
09:41:29 <oanson> Also don't forget to address dimak's comments about router ports and dhcp ports.
09:41:29 <yuli_s> if the target VM (the one that should receive the intercepted traffic)
09:41:38 <yuli_s> has a security group in place
09:41:59 <yuli_s> should we filter that traffic of other hosts as well?
09:43:15 <yuli_s> ok, I will write this in the spec and we will see other people's opinion
09:43:20 <oanson> yuli_s, could you explain a bit more?
09:43:26 <oanson> I'm not sure I understand the question
09:44:23 <yuli_s> as I understand the comments from Yamamoto, it is not clear if we need to filter traffic twice
09:44:35 <yuli_s> once when the traffic leaves the source VM
09:44:47 <yuli_s> and again when it is received by the target VM
09:45:11 <yuli_s> (I am talking here about tapped traffic)
09:46:09 <yuli_s> I will write to Yamamoto to get his comments,
09:46:12 <oanson> I think it would be best to understand how TAPaaS should work with SG
09:46:12 <yuli_s> lets continue
09:46:31 <oanson> The above is already an implementation question, but I would like to understand the requirement.
09:46:39 <oanson> yuli_s, please do. I think this needs clearing up.
09:46:52 <oanson> All right, this is all for roadmap.
09:47:00 <oanson> Is there a spec, blueprint, or feature I missed?
09:47:27 <oanson> Remember, to meet the Ocata milestones, we need these merged by the end of the week.
09:47:53 <oanson> I think we're in good shape!
09:48:09 <oanson> #topic bugs
09:48:38 <oanson> Status hasn't changed since last week. There is still one High priority bug with no progress
09:48:45 <oanson> Bug 1638151
09:48:45 <openstack> bug 1638151 in DragonFlow "Router schedule error in L3 router plugin as there are multi-external network" [High,New] https://launchpad.net/bugs/1638151 - Assigned to rajiv (rajiv-kumar)
09:49:21 <xiaohhui> I think I can take a look at it from the neutron side,
09:49:46 <xiaohhui> On second thought, I think it might be a neutron issue, but I need more investigation
09:50:00 <oanson> Currently it's assigned to rajivk, so make sure you guys aren't stepping on each other's toes
09:50:14 <rajivk> I will abandon it
09:50:20 <xiaohhui> sorry, I didn't notice that it has owner
09:50:28 <rajivk> I know, it is high priority
09:50:35 <oanson> xiaohhui, that's all right. That's why we are discussing it :)
09:50:47 <rajivk> and xiaohhui can do it better than me
09:51:04 <oanson> My question is, is it really high priority? Are there any functional issues, except the thrown error?
09:51:22 <oanson> (I asked this also on the bug)
09:51:42 <xiaohhui> if we aren't eager to support multi-external network, I think it is ok
09:53:10 <oanson> All right. Then I'll reduce it to medium
09:53:31 <oanson> I think everything else is under control. I want to concentrate on specs this week
09:53:33 <rajivk> I have removed myself from assignee
09:53:52 <oanson> Anything else for bugs?
09:54:05 <oanson> #topic Open Discussion
09:54:11 <oanson> The floor is open for the taking
09:54:45 <xiaohhui> If you guys have extra time
09:54:58 <xiaohhui> I would like you to review this https://review.openstack.org/#/c/339975/
09:55:31 <oanson> I'll try and get to it today/tomorrow
09:55:34 <xiaohhui> it is in merge conflict now, but it will be updated soon
09:55:39 <xiaohhui> thanks
09:55:49 <oanson> It's not in merge conflict...
09:56:18 <xiaohhui> my eyes go wrong...
09:56:31 <oanson> You mean patch: "add active detection app for allowed address pairs"
09:56:33 <oanson> ?
09:56:48 <xiaohhui> yeah
09:57:12 <oanson> Sure. I'll look at it today/tomorrow :)
09:57:17 <xiaohhui> it is in merge conflict, I gave it another glance...
09:57:23 <xiaohhui> anyway, thanks
09:57:45 <oanson> Sure.
09:57:47 <oanson> Anything else?
09:57:56 <xiaohhui> nothing from me...
09:58:14 <oanson> Great. Then we can go to eat :)
09:58:22 <oanson> Thanks everyone for coming!
09:58:30 <oanson> #endmeeting