09:00:19 #startmeeting Dragonflow
09:00:20 Meeting started Mon Nov 14 09:00:19 2016 UTC and is due to finish in 60 minutes. The chair is oanson. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:23 The meeting name has been set to 'dragonflow'
09:00:27 Hello.
09:00:34 Good morning
09:00:35 Welcome to Dragonflow weekly.
09:00:37 Hi
09:00:39 Hi
09:00:45 Reminder, the agenda is posted here: https://wiki.openstack.org/wiki/Meetings/Dragonflow
09:00:53 hello
09:01:03 I doubt we will cover everything, but here's hoping
09:01:24 #info yuli_s dimak lihi rajivk xiaohhui In meeting
09:02:01 All right. Let's start.
09:02:07 #topic Ocata Roadmap
09:02:18 There are some specs that were uploaded this week
09:02:39 I think they need to be merged by Thursday to make the first milestone
09:03:16 Let's start with IPv6
09:03:29 #link IPv6 spec https://review.openstack.org/#/c/396226/
09:04:00 lihi, I take it you fixed the previous comments on the spec?
09:04:11 Anyone, if you have any open questions about it, this is the time
09:04:12 hey all :)
09:04:17 Yes. I've uploaded an update to the spec a few minutes ago.
09:05:02 All right. No questions are good questions :)
09:05:05 Next up is SFC.
09:05:18 #link Service Function Chaining (SFC) spec https://review.openstack.org/#/c/396226/
09:05:25 #link Service Function Chaining (SFC) spec https://review.openstack.org/#/c/394498/
09:05:34 Wrong copy-paste buffer.
09:05:40 I'm waiting for more comments on the spec
09:06:07 All right. I should be able to go over all the specs today.
09:06:12 I've addressed all the comments and uploaded a new patchset yesterday
09:06:38 Yes. It looks like there are no new comments yet.
09:06:40 I'd love to see the lbaas spec to see if I can use it for load balancing between SFs
09:07:00 is there a lbaas spec?
09:07:08 I received a draft of the LB spec.
09:07:39 The author will upload it today.
09:07:51 OK
09:08:02 other than that, I hope the spec is extensive enough
09:09:03 Looks like no one has any complaints :)
09:09:28 that's all for me then
09:09:45 Next up is chassis health/service health report
09:10:32 I have replied to the comments in the spec and hope rajivk can give some input based on that
09:10:35 #link Is chassis alive support https://review.openstack.org/#/c/385719/
09:10:58 xiaohhui, rajivk, maybe you'd like to take this chance for the discussion?
09:11:05 Sure
09:11:10 We may be able to close any loose ends quickly in this format
09:11:30 xiaohhui, can you share your thoughts?
09:11:44 I could not check my mail today on gmail
09:11:54 it is banned in my organization.
09:12:21 What do you think, should we do it together or separately?
09:12:23 I added comments at https://review.openstack.org/#/c/385719/7/doc/source/specs/support_check_chassis_alive.rst@39
09:12:39 to add my thoughts about the service monitor
09:13:13 xiaohhui, I see you mostly recommend using Neutron's ProcessManager?
09:13:39 Yeah, I think that will be simpler
09:13:55 Can we use that for the DF local controller too?
09:14:25 I don't think so, the df local controller should act as the controller for the other services.
09:14:27 I also saw it has some accuracy issues. Is that something you ran into? (Or was that only in my setup)
09:14:28 what do you think about each service reporting its own status?
09:15:14 oanson: what do you mean by accuracy issues?
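For context on the accuracy question above: the agent-liveness model being referred to derives "alive" from a heartbeat timestamp that each service writes periodically and a timeout that the reader compares it against. A minimal stdlib sketch of that model; the interval, timeout, and names are assumptions for illustration, not Neutron's or Dragonflow's actual schema:

    import time

    REPORT_INTERVAL = 30   # assumed seconds between heartbeats
    DOWN_TIME = 75         # assumed staleness threshold before a service is "down"

    heartbeats = {}        # service name -> last heartbeat timestamp

    def report(service_name, now=None):
        """Called by the monitored service itself, once per REPORT_INTERVAL."""
        heartbeats[service_name] = time.time() if now is None else now

    def is_alive(service_name, now=None):
        """Called by whoever consumes the status (e.g. the Neutron server)."""
        now = time.time() if now is None else now
        last = heartbeats.get(service_name)
        return last is not None and (now - last) < DOWN_TIME

    # The accuracy issues mentioned here fall out of this model: a hung service
    # is still reported alive until DOWN_TIME expires, and a healthy service
    # that reports late by more than DOWN_TIME is reported dead.
    report('df-local-controller')
    print(is_alive('df-local-controller'))   # True, even if the service hangs right now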
09:15:44 Sometimes it shows agents which have failed as still alive
09:15:44 rajivk: I think that is ok too, but what's the purpose of that?
09:15:53 And shows agents that are still alive as failed
09:16:12 You mean the neutron agent status, right?
09:16:43 With Neutron's ProcessManager, my understanding is that each process sends a heartbeat, and you have a monitor reporting that heartbeats are missing. xiaohhui, is that correct?
09:17:56 Neutron's ProcessManager runs in one process, and in that process the monitored process is checked; if it dies, the monitored process will be restarted
09:18:20 For example the metadata-ns-proxy in neutron-l3-agent
09:18:55 neutron-l3-agent manages metadata-ns-proxy by using ProcessManager, to keep it always alive.
09:19:50 And metadata-ns-proxy updates the Neutron database with a heartbeat_timestamp? This way the l3 agent knows the ns-proxy service is alive?
09:20:23 no, the l3 agent checks metadata-ns-proxy's aliveness by checking its pid file
09:20:56 no heartbeat is required for metadata-ns-proxy
09:21:22 I'm a little confused.
09:21:50 You recommend that the local controller verifies the health of other services (metadata, l3-agent, publisher) by checking their PIDs.
09:22:03 And reporting at intervals to the database for itself?
09:22:51 And have the publisher on the Neutron server be called via the ProcessManager? Where the health-check is managed automatically?
09:23:07 yeah, that is the model of neutron agents for now. agents report to the neutron db periodically, however, sub-processes will be managed by ProcessManager
09:23:36 I see.
09:24:18 the publisher can be managed by the neutron server, as it only needs to work with the neutron server (it may not even need to live with the neutron server)
09:24:43 We could maintain that model. Have the local controller monitor other df services. When it reports to the database, it will report the health of all services. The only issue is that these services are started outside of the controller.
09:25:14 The publisher has to live with the neutron server, since they communicate over IPC. Currently, it is started externally (But obviously this can be changed)
09:25:38 I think it is started externally to avoid the q-svc forking issue.
09:25:38 If we decide to go that way, some changes need to be made to how the processes are started
09:26:10 Yes.
09:26:38 Good thing this is the start of the cycle :)
09:27:16 The metadata service can be started from the DF metadata app.
09:27:16 I am thinking of adding more information to the chassis; we can definitely add the status of sub-processes to the chassis too.
09:27:32 We can write a centralised snat app to maintain df-l3-agent.
09:27:55 yeah, so that if the user doesn't add the metadata app, the metadata proxy service doesn't need to be started
09:28:12 I need to review the Neutron code, but I'm sure we can find the correct place to add the publisher service startup code
09:28:25 xiaohhui, yes. Same for the l3 agent.
09:28:26 we can wait for the distributed snat and then remove df-l3-agent, right?
09:28:34 That's the plan
09:28:55 But with this change, centralised snat can still be used, as a DF app.
09:29:14 Which is more in-line with our global design
09:29:40 I guess that is something that can be implemented.
09:29:52 Sounds like we're in agreement. rajivk, any comments?
09:30:15 I am still going through your discussion
09:30:21 :)
09:30:26 You can continue
09:30:29 All right. We can take a second to breathe :)
09:30:47 We can continue our discussion in the spec, if we miss something here.
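A minimal sketch of the model agreed on above, for illustration only: the local controller checks the co-located services through their pid files (the same mechanism described for metadata-ns-proxy) and folds the results into its own periodic report. It is stdlib-only and does not use the real ProcessManager API; the file paths, service names, and report fields are assumptions:

    import errno
    import os
    import time

    # Hypothetical pid-file locations; a real deployment would get these
    # from configuration rather than hard-coded constants.
    PID_FILES = {
        'df-metadata-proxy': '/var/run/dragonflow/df-metadata-proxy.pid',
        'df-l3-agent': '/var/run/dragonflow/df-l3-agent.pid',
        'df-publisher': '/var/run/dragonflow/df-publisher.pid',
    }

    def _pid_alive(pid_file):
        """Pid-file liveness check: read the pid and probe it with signal 0
        (no signal is actually delivered)."""
        try:
            with open(pid_file) as f:
                pid = int(f.read().strip())
            os.kill(pid, 0)
            return True
        except (OSError, IOError, ValueError) as e:
            # EPERM means the process exists but we may not signal it.
            return getattr(e, 'errno', None) == errno.EPERM

    def build_chassis_report():
        """What the local controller could write to the DB each interval:
        its own heartbeat plus the health of the services it monitors."""
        return {
            'timestamp': time.time(),
            'services': {name: _pid_alive(path)
                         for name, path in PID_FILES.items()},
        }

    if __name__ == '__main__':
        print(build_chassis_report())

The open point from the discussion (these services are started outside the controller) is orthogonal to the check itself: whatever starts them only needs to drop a pid file where the local controller can find it.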
09:30:54 Sounds like a plan
09:30:56 No, it is ok, you carry on with the meeting.
09:31:14 I will need extra effort to understand what you mean :)
09:31:38 rajivk, no worries. We're here for any follow-ups.
09:31:47 oanson, thanks :)
09:32:02 I'll skip ahead for a bit:
09:32:15 LBaaS, as I said, the spec should be up today.
09:32:39 I also received a draft for sNAT. I can't promise it will be up today, but I hope by tomorrow.
09:32:44 oanson, to support the V2 APIs?
09:32:50 Yes
09:32:50 for LBaaS
09:33:08 V1 has been removed in Newton, so supporting it is a bit...
09:34:10 Let's move on to Tap as a service.
09:34:32 #link Tap as a service (TAPaaS) spec https://review.openstack.org/#/c/396307/
09:35:20 yamamoto added some comments about the TAP location.
09:35:37 yuli_s, would you like to discuss the two TODO items?
09:36:08 No. 1 (TAP destination port on different compute nodes) should be easy to tackle.
09:36:13 yes, sure
09:36:28 We already have everything in place. Just need to update the correct parameters, and send to the table that handles tunneling.
09:36:46 I got comments from Yamamoto
09:37:05 that the packets should be intercepted after the SG
09:37:05 Yes. He mentions that security groups should be taken into account.
09:37:14 Could you please verify and update the spec?
09:37:28 so, I need to update the spec with his comments
09:37:33 Yes
09:38:13 I will review the rules and release a new version of the spec today
09:38:35 Great.
09:38:44 About TAP of a TAP. Do you have any thoughts?
09:38:46 regarding the second item
09:38:52 yes, tap of the tap
09:39:36 Any thoughts?
09:39:44 I think I need additional time
09:39:47 to research this
09:39:54 Sure.
09:40:11 because dimak suggested that it can create a loop
09:40:21 Sure. Worst comes to worst, we can leave it as a limitation in Ocata.
09:40:35 ok, great
09:41:01 I would like to get the opinion of all the experts
09:41:01 But that will have to be written in the spec. And if you can make it after all, that would be better
09:41:29 Also don't forget to treat dimak's comments about router ports and dhcp ports.
09:41:29 if the target VM (the one that should receive the intercepted traffic)
09:41:38 has a security group in place
09:41:59 should we filter that traffic from other hosts as well?
09:43:15 ok, I will write this in the spec and we will see other people's opinions
09:43:20 yuli_s, could you explain a bit more?
09:43:26 I'm not sure I understand the question
09:44:23 as I understand the comments from Yamamoto, it is not clear if we need to filter traffic 2 times
09:44:35 once when the traffic leaves the source vm
09:44:47 and another when it is received by the target VM
09:45:11 (I am talking here about tapped traffic)
09:46:09 I will write to Yamamoto to get his comments,
09:46:12 I think it would be best to understand how TAPaaS should work with SG
09:46:12 let's continue
09:46:31 The above is already an implementation question, but I would like to understand the requirement.
09:46:39 yuli_s, please do. I think this needs clearing up.
09:46:52 All right, this is all for the roadmap.
09:47:00 Is there a spec, blueprint, or feature I missed?
09:47:27 Remember, to meet the Ocata milestones, we need these merged by the end of the week.
09:47:53 I think we're in good shape!
09:48:09 #topic bugs
09:48:38 Status hasn't changed since last week. There is still one High priority bug with no progress
09:48:45 Bug 1638151
09:48:45 bug 1638151 in DragonFlow "Router schedule error in L3 router plugin as there are multi-external network" [High,New] https://launchpad.net/bugs/1638151 - Assigned to rajiv (rajiv-kumar)
09:49:21 I think I can take a look at it from the neutron side,
09:49:46 After another thought, I think it might be a neutron issue, but I need more investigation
09:50:00 Currently it's assigned to rajivk, so make sure you guys aren't stepping on each others' toes
09:50:14 I will abandon it
09:50:20 sorry, I didn't notice that it has an owner
09:50:28 I know, it is high priority
09:50:35 xiaohhui, that's all right. That's why we are discussing it :)
09:50:47 and xiaohhui can do it better than me
09:51:04 My question is, is it really high priority? Are there any functional issues, except the thrown error?
09:51:22 (I asked this also on the bug)
09:51:42 if we aren't eager to support multi-external networks, I think it is ok
09:53:10 All right. Then I'll reduce it to medium
09:53:31 I think everything else is under control. I want to concentrate on specs this week
09:53:33 I have removed myself from assignee
09:53:52 Anything else for bugs?
09:54:05 #topic Open Discussion
09:54:11 The floor is for the taking
09:54:45 If you guys have extra time
09:54:58 I would like you to review this https://review.openstack.org/#/c/339975/
09:55:31 I'll try and get to it today/tomorrow
09:55:34 it is in merge conflict now, but it will be updated soon
09:55:39 thanks
09:55:49 It's not in merge conflict...
09:56:18 my eyes went wrong...
09:56:31 You mean patch: "add active detection app for allowed address pairs"
09:56:33 ?
09:56:48 yeah
09:57:12 Sure. I'll look at it today/tomorrow :)
09:57:17 it is in merge conflict, I gave it another glance...
09:57:23 anyway, thanks
09:57:45 Sure.
09:57:47 Anything else?
09:57:56 nothing from me...
09:58:14 Great. Then we can go to eat :)
09:58:22 Thanks everyone for coming!
09:58:30 #endmeeting