09:03:46 #startmeeting Dragonflow
09:03:47 Meeting started Mon Jul 11 09:03:46 2016 UTC and is due to finish in 60 minutes. The chair is oanson. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:03:48 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:03:50 The meeting name has been set to 'dragonflow'
09:04:39 #info gsagie, nick-ma, DuanKebo are in the meeting.
09:04:48 Is anyone else here for the Dragonflow meeting?
09:05:14 All right, then let's get started
09:05:19 #topic Roadmap
09:05:48 Reminder, the Newton roadmap link is here: https://etherpad.openstack.org/p/dragonflow-newton
09:05:51 #link https://etherpad.openstack.org/p/dragonflow-newton
09:06:18 I'll start with my update about packaging and distribution: Sadly I wasn't able to make much progress.
09:06:54 I am in China this week for openstack days china and collaborating with DuanKebo and team, so I probably won't have it ready for next week either.
09:08:18 I'm waiting a second for the others to join. Then we can talk about DB synchronisation, ML2, and VLAN.
09:08:27 If anyone wants to take the floor, it's available
09:09:58 hujie, wangyongben, liuhaixia, are you here?
09:11:55 hujie says he committed a patch. The link is: https://review.openstack.org/336377
09:12:01 He'd be happy for reviews.
09:12:54 liuhaixia also made the ML2 patch smaller. It contains only L2 for ML2. It is available here: https://review.openstack.org/#/c/334798/
09:13:32 Patches for vxlan, vlan, and flat networks will come later. vlan and flat will depend on vxlan, so they may come after it is merged (or at least passes a few review cycles)
09:14:44 wangyongben uploaded the L3 plugin to here: https://review.openstack.org/#/c/316785/ . It also needs reviews.
09:14:48 hshan is online, so it's not only omer himself having the meeting :)
09:15:21 #link https://review.openstack.org/#/c/316785/
09:15:26 #link https://review.openstack.org/#/c/334798/
09:15:32 #link https://review.openstack.org/336377
09:15:57 Any other updates someone wants to share?
09:16:04 not here
09:16:25 All right.
09:16:51 #topic Barcelona Summit
09:17:07 The call for presentations ends this week
09:17:28 If you want to submit a talk, you need to prepare your title and abstract as soon as possible
09:17:53 It is highly recommended to go if possible. We can meet and attract new people to work on dragonflow.
09:18:02 I guess the talks might be interesting as well :)
09:18:44 Does anyone have anything to add on this topic?
09:19:15 I was thinking of the talks, not sure about which topics can cover the whole 40 minutes.
09:19:35 in the last summit, we did one for general introduction.
09:20:40 nick-ma: Each of the talks I'm submitting is composed of 3 smaller titles that I tied together.
09:22:17 nick-ma: I think oshidoshi may have a few ideas that are still missing submitters.
09:22:41 i think we can take this offline
09:22:47 yes.
09:22:49 agreed.
09:23:05 Anything else here?
09:23:29 #topic Performance Testing
09:23:51 yuli_s: I understand you have done some work here?
09:24:10 not sure he is here
09:24:17 ahh yes
09:24:20 he is
09:24:43 His user is here
09:24:56 yuli_s is working on implementing end-to-end environments
09:25:04 to test both control and data plane performance
09:25:16 so instead of only testing the DB backends we can test it all
09:25:42 How is he setting up the environments?
09:25:43 he is checking ansible/vagrant now to do the env installation
09:25:50 with your project
09:26:13 or with openstack-ansible
09:26:51 That's great. I'll sit with him and collaborate when I get back.
09:27:13 The best solution would be to use both - my project to set up VMs using vagrant, and openstack-ansible for installation.
09:27:29 This will push the deployment project forwards as well.
09:27:44 If we want to do the large scale test, we need lots of servers
09:28:08 DuanKebo: openstack-ansible should be usable on both virtual and physical servers
09:28:16 can be simulated?
09:28:32 However, we need these physical servers somewhere.
09:28:46 In the discussion last time, we said we will have 20 servers for redis db cluster, not for the local controller server, is that enough?
09:29:06 for simulation
09:29:21 hujie: I think it's enough for a start. Enough to test the deployment and get initial results.
09:29:33 good :)
09:29:54 But the best thing would be as many servers as possible to run a real test, and be able to prove we scale to as many nodes as we say we scale to.
09:30:14 I'm not sure, but i think simulated ways may be more viable for large scale test.
09:31:19 thinking about how to test DC with 10,000 servers.
09:31:46 :-)
09:31:48 DuanKebo: That's where ansible comes in - it should allow us to provision the servers from a single location.
09:32:32 the deployment is not bottleneck for DC with 10,000 servers.
09:32:52 hujie: So where's the bottleneck?
09:33:12 problem is we can deploy so many servers.
09:33:25 so you have 10,000 real servers in single location for testbed?
09:33:28 * can not
09:33:51 DuanKebo, hujie, the deployment is or is not the bottleneck?
09:34:09 isn't
09:34:12 If we have a deployment solution, is there anything else that's blocking us (provided we have the servers)
09:34:20 oanson: i think what hujie and DuanKebo are saying is that the deployment tool is not a problem
09:34:24 its only physical resources
09:34:41 the test method and the control and data plane performance is more important I think :)
09:35:02 This is something yuli_s and Shlomo_N are working on.
09:35:04 yes, the deployment tool is good itself and for dragonflow :)
09:35:13 hujie: for data plane performance we dont need many servers
09:35:14 Shlomo_N has a patch here: https://review.openstack.org/#/c/304470/
09:35:25 for control plane testing we will do simulated tests
09:35:41 we will of course try to squeeze as many "local controllers" as possible in one server
09:35:49 yuli_s has a control plane test here: https://review.openstack.org/#/c/309948/
09:35:54 we don't have so many servers. So we have to calculate the control plane and data plane traffic, and simulate it.
09:35:55 usually its best to run them 1 per core so we wont get into scheduling problems
09:36:26 Yes, Gal
09:36:34 DuanKebo: lets start with what we have
09:36:39 we dont need to start with 10,000
09:36:56 how many do we have?
09:37:03 I think for DC with 10,000 servers, deployment is necessary and good thing, but it's not what we are truly worried about :)
09:37:03 including your servers
09:37:05 obviously. As I said, the 20 servers is definitely enough for a start
09:37:07 DuanKebo: If we have 20 servers?
09:37:22 Yes, we have
09:37:32 around how many cores each?
09:37:40 12, 14?
09:37:56 16 maybe
09:38:09 ok so thats 320 local controllers
09:38:21 that we can run on them
09:38:23 gsagie: Some of them should be compute nodes
09:38:26 lets say less
09:38:34 If we want to test end-to-end and not just control plane
09:38:35 so we have around 200
09:38:44 Some should also be database.
09:39:18 I think perhaps we should construct a testing plan, and then decide how many nodes of each we need.
09:39:31 Yes, I worry about db and pub/sub performance
09:39:36 We can say that we have 320 simulated servers for now, but I don't think it's that important.
09:39:50 Let's get the testing framework and deployment finished first.
09:40:16 if the computing resource is containers we can save much resource.
09:40:20 Omer, we are working on the test plan
09:40:28 OK, we can move on to do the deployment work and design the test plan :)
09:40:30 I'll step on deployment it next week. I'll also help yuli_s to get the testing framework finished as soon as possible.
09:40:45 nick-ma: I think that's the direction yuli_s is taking.
09:40:58 the problem with containers is OVS
09:41:06 DuanKebo: Great. I'd be happy to see it after the meeting.
09:41:08 we will need to do work to "simulate" the kernel module
09:41:14 since it will be shared
09:41:27 gsagie: Can be worked around, maybe with an OVS bridge per container.
09:41:40 It's not ready yes.^o^
09:41:46 not ready yet
09:42:09 DuanKebo: Sorry. I'm enthusiastic :)
09:42:27 gsagie: Can the OVS+container be worked around with an OVS bridge per container?
09:42:28 ok, get it.
09:42:53 oanson: we need to think about it, because we will need something that creates the ports and connects them to each container
09:43:00 and simulates the OVSDB events
09:43:09 so it can work but it needs some work
09:43:46 gsagie: So we'll do the work :)
09:43:53 maybe we need kuryr to do this ?
09:43:57 it might be faster to just use user space OVS part and not use the kernel module as we are only checking the control plane
09:44:06 yuli_s is supposed to check it
09:44:07 soon
09:44:22 DuanKebo: we will need kuryr
09:44:27 for this
09:44:36 if we use Docker
09:44:36 gsagie: I thought we're checking end to end.
09:45:31 oanson: "end-to-end" as in using Dragonflow controller full code, but we can still ignore the part that installs flows to OVS given that can help
09:45:44 but again yuli_s will check it
09:45:52 and hopefully provide some insights next week
09:46:23 All right, let's wait till next week. We'll also figure out what end-to-end means once the test plan is ready :)
09:46:34 Anything else in this topic?
09:46:57 #topic Bugsw
09:46:58 #topic Bugs
09:46:59 the meaning is not end-to-end as in test the entire system but test the controller instead of just the DB
09:47:04 which is what was done till now
09:47:35 All right.
09:48:13 the fullstack is failing due to arpresponder.test.simple.response. but it really works in my local machine.
09:48:33 nick-ma: Yes. It works locally for me as well.
09:48:33 also the ryu 4.4
09:49:03 ryu breaks secgroup.
09:49:06 About ryu
09:49:20 yamamoto and iwamoto say that the change should be reverted in ryu
09:49:39 They brought this up in their mailing list, but I didn't see any progress yet.
09:49:53 i suggest we first revert to 4.3 if possible. and wait for the upstream change
09:50:11 We can't revert, because we are locked to the openstack requirements
09:50:20 got it.
09:50:24 I thought of aligning ourselves, and reverting if the upstream changes.
09:50:34 It's more work for us, but at least our code will work.
09:50:45 and I don't know if and when the upstream will change.
09:50:56 that is the problem
09:51:29 I will try and catch Yamamoto offline and discuss it. See what he thinks, what our chances are, and what is the expected change in ryu.
09:51:45 because any change will either break ryu 4.4, or keep <=4.3 broken.
09:52:00 Due to the API changes they have made.
09:53:07 ok.
09:53:18 nick-ma: I see you opened a bug about the ARP responder test.
09:53:48 i am trying to figure out why. we didn't change any code related
09:53:52 to it
09:54:16 I suspect the gate test will have to be debugged.
09:54:22 yes.
09:55:15 Any other items here?
09:55:25 by the way, suggest we clean up the bug list of launchpad, lots of them are out of date.
09:56:05 Yes. Each bug should be verified, and marked invalid if appropriate.
09:56:38 Yes, nick
09:56:59 i went through most of them, but lots of them are assigned, so not sure about the progress
09:57:07 I suggest that everyone go over the bugs they are assigned, and verify them.
09:57:39 Bugs that haven't been updated in a while, or owners that have been inactive for a while should be pinged.
09:57:40 afaik, lots of them are invalid due to testing
09:59:28 I see also that there are many undecided importance bugs.
09:59:41 yuli_s, as our bugmaster: Will you rate them?
09:59:56 He might still not be here.
10:01:26 All right. That's our time.
10:01:32 Thanks everyone.
10:01:34 #endmeeting