09:01:07 <oanson> #startmeeting Dragonflow
09:01:08 <openstack> Meeting started Mon Sep  5 09:01:07 2016 UTC and is due to finish in 60 minutes.  The chair is oanson. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:01:09 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:01:12 <openstack> The meeting name has been set to 'dragonflow'
09:01:23 <oanson> All right, who is here for the Dragonflow meeting?
09:01:27 <oshidoshi> o/
09:01:47 <lihi> o/
09:02:37 <yuli_s1> o/
09:02:39 <oanson> I was kinda sorta hoping for more people :)
09:02:51 <oshidoshi> let's give them a couple more
09:05:03 <oanson> DuanKebo, hi.
09:05:20 <DuanKebo> hi
09:07:31 <oanson> Well, let's get started
09:07:52 <oanson> #topic Roadmap
09:08:03 <oanson> Security groups is giving us a lot of grief.
09:08:12 <oanson> It hasn't been working on the gate for a while.
09:08:23 <oanson> DuanKebo, I understand your team is taking this?
09:08:50 <DuanKebo> It's due to the kernel module for ct
09:09:04 <oanson> What version has to be installed?
09:09:28 <DuanKebo> as far as i know, it was introduced by the dpdk patch.
09:09:50 <DuanKebo> I think we can optimize the install of ovs
09:10:08 <oanson> DuanKebo, that would be great. How?
09:10:19 <DuanKebo> ovs version 2.5 is ok,
09:10:34 <oanson> Do you know what kernel version and what modules are needed?
09:11:11 <DuanKebo> if the right version of ovs is installed, we don't need to uninstall and install it again.
09:11:28 <oanson> Because in my environment I have upgraded to the latest kernel (4.7.2) and it still fails.
09:11:42 <oshidoshi> so, this is a patch to the dragonflow devstack?
09:12:18 <oanson> Additionally, in ML2 (I'll go into more details in a second) it failed before the DPDK patch was merged.
09:12:20 <DuanKebo> i'm using ubuntu 14.04
09:13:13 <nick-ma> hi
09:13:17 <oanson> nick-ma, Hi
09:13:43 <DuanKebo> Hi
09:13:49 <oanson> DuanKebo, have you tried locally reverting the dpdk patch and seeing if it solves the issue?
09:14:16 <DuanKebo> the failure is caused by the conntrack module not being loaded
09:14:27 <nick-ma> why does the dpdk patch affect sg?
09:14:35 <nick-ma> sorry, i don't get it
09:14:44 <oshidoshi> nick-ma, +1 - neither do i
09:14:54 <DuanKebo> one second, yuanwei has the details.
09:15:17 <oanson> DuanKebo, yuanwei, what's the name of the kernel module that needs to be loaded?
09:15:42 <yuanwei> Hello
09:15:57 <yuanwei> wait a sec
09:17:11 <yuanwei> I found errors in vswitch log
09:17:27 <oanson> yuanwei, yes
09:17:46 <nick-ma> i tried several combinations of kernel and ovs locally, but they all failed. any temp solutions?
09:18:16 <oshidoshi> oanson, i think ovs uses libnetfilter_conntrack
09:18:24 <yuanwei> 2016-08-31T07:47:38.381Z|00011|ofproto_dpif|INFO|system@ovs-system: Datapath does not support ct_state
09:19:15 <yuanwei> I think an old datapath was in the kernel, not the new one
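A minimal sketch, assuming the vswitch log keeps the line format pasted above, of how the log could be scanned for conntrack fields the datapath rejects. The function name `missing_ct_features` and the regular expression are hypothetical and are not part of Dragonflow or OVS.

```python
import re

# Matches the tail of log lines such as:
#   ...|ofproto_dpif|INFO|system@ovs-system: Datapath does not support ct_state
# The pattern is an assumption based only on the line pasted in this meeting.
CT_UNSUPPORTED = re.compile(r"Datapath does not support (ct_\w+)")


def missing_ct_features(log_lines):
    """Return the sorted, de-duplicated ct_* fields reported as unsupported."""
    found = set()
    for line in log_lines:
        match = CT_UNSUPPORTED.search(line)
        if match:
            found.add(match.group(1))
    return sorted(found)
```

Passing the lines of the vswitch log to `missing_ct_features` gives an empty list when no such message is present, which could serve as a quick check after reinstalling OVS.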
09:19:48 <oanson> yuanwei, the dpdk patch should only be in effect if it is activated. How did it make a difference?
09:21:48 <oanson> I would like to move on.
09:21:59 <yuanwei> I found this patch deleted some code
09:22:03 <oanson> DuanKebo, yuanwei, please open a bug about SG, and put all the information you have in there
09:22:18 <DuanKebo> a temp solution is to remove ovs and reinstall it manually
09:22:23 <oanson> We will keep the discussion on the bug, and see if we can find a solution.
09:22:34 <oanson> DuanKebo, noted. Please add that to the bug report.
09:23:01 <oanson> #action DuanKebo,yuanwei open bug report on SG and update it with cause and workaround
09:23:10 <oanson> About ML2
09:23:18 <oanson> I have tried setting it up and testing it on my environment.
09:23:28 <oanson> It appears to work, but the fullstack tests fail there as well
09:23:41 <oanson> I also saw Li Ma tried to have the gate test it, and there are many errors there too.
09:24:00 <oanson> In my environment, specifically, some issues (but not all) were related to SG, so I will try again with the workaround
09:24:15 <oanson> DuanKebo, can you guys take ownership on this and iron out the ML2 bugs?
09:24:38 <oanson> I want to reach a state where ML2 is completely working before the summit
09:24:52 <DuanKebo> We have the same puzzles.
09:25:24 <oanson> I think for now we should add another gate task for ml2 fullstack.
09:25:28 <DuanKebo> the ml2 plugin runs successfully in our local environment.
09:25:44 <oanson> i.e. have both core plugin and ml2 run on the gate.
09:25:49 <DuanKebo> but no problem, we can take it.
09:25:58 <oshidoshi> oanson, perhaps try to uninstall OVS and re-install OVS 2.5
09:25:59 <oanson> This way we can keep tabs on how well ml2 is doing as well.
09:26:34 <yuanwei> https://review.openstack.org/#/c/348169/15/devstack/ovs_setup.sh@116  @MaLi those changes may cause the error I mentioned
09:26:41 <oanson> oshidoshi, will try that too
09:27:08 <oshidoshi> oanson, if that works out, the devstack can be updated to do the same (uninstall and reinstall)
09:27:23 <oanson> oshidoshi, the devstack already does that.
09:27:30 <oanson> The location may have to be changed
09:27:36 <nick-ma> you mean the removing of module
09:27:46 <yuanwei> yes
09:28:06 <nick-ma> yuanwei: ok, but these lines always fail in my local env.
09:28:37 <nick-ma> i think we need a more stable solution to reload the kernel module.
09:29:09 <oanson> yes
09:29:12 <yuanwei> ok, I agree, that is only a temp fix
09:30:37 <DuanKebo> in fact, for system like ubuntu 14.04, we can just use the default ovs, it's already 2.5.0
09:30:58 <oanson> DuanKebo, I think you need to install the cloud:mitaka repository for that.
09:31:03 <oanson> And in Fedora, that's what we do
09:31:22 <oanson> I think we just need to make sure the conntrack module is loaded before openvswitch
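A minimal sketch of a check for the condition described above, assuming the conntrack module is named `nf_conntrack` and that both modules appear in `/proc/modules` (neither name was confirmed in the meeting). It only verifies that both are loaded; it does not prove the load order, and it would report False for support compiled directly into the kernel. The function names are hypothetical.

```python
def loaded_modules(proc_modules_text):
    """Return module names from /proc/modules-style text (first field per line)."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


def conntrack_ready(proc_modules_text,
                    ct_module="nf_conntrack",   # assumed module name
                    ovs_module="openvswitch"):  # assumed module name
    """True when both the conntrack and the OVS modules are listed as loaded."""
    names = set(loaded_modules(proc_modules_text))
    return ct_module in names and ovs_module in names
```

On a Linux host this could be called as `conntrack_ready(open("/proc/modules").read())` before starting the vswitch.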
09:32:21 <oanson> QoS and VLAN reviews are online.
09:32:32 <oanson> Non-cores, please review them.
09:33:00 <oanson> There are a lot of reviews, and only two cores, so we need all the help we can get
09:33:23 <oanson> Especially the long ones, e.g. qos, vlan, extra routes
09:33:45 <oanson> DuanKebo, about extra routes - what about adding routing policy api to Neutron? Is that still interesting?
09:34:23 <DuanKebo> Yes, before we run it stably in df
09:34:40 <oanson> all right. I'll work something up.
09:35:09 <yuanwei> Please also review this patch https://review.openstack.org/#/c/339975/, that is the first patch for the allowed address pairs module
09:35:27 <nick-ma> ok
09:35:40 <oanson> yuanwei, this patch has 2 downvotes on it.
09:35:46 <yuanwei> I will update this patch today
09:35:57 <oanson> All right. Then we will review it.
09:36:07 <yuanwei> thanks;)
09:36:25 <oanson> Any other road-map issues?
09:36:27 <DuanKebo> oanson, you need to spend more time reviewing too ^o^
09:36:54 <oanson> DuanKebo, as I mentioned, there are too many reviews for the number of reviewers.
09:37:17 <oanson> But I am doing my best :)
09:37:45 <oanson> Any other roadmap issues?
09:37:56 <nick-ma> i will also try to spend more time on reviewing.
09:38:12 <oanson> nick-ma, I think you're the only one who reviews too much :)
09:38:42 <oanson> #topic Bugs
09:39:01 <yuli_s1> sec.
09:39:14 <oanson> There is bug 1619101
09:39:14 <openstack> bug 1619101 in DragonFlow "secgroup ofperror flooding in the fullstack ci" [Critical,New] https://launchpad.net/bugs/1619101
09:39:28 <yuli_s1> yes, i wanted to talk about it
09:39:38 <oanson> It is critical, and it is SG related. Yuanwei, can I assign it to you?
09:40:57 <DuanKebo> is this the same problem, related to conntrack?
09:41:02 <oanson> Is bug 1571551 still experienced by anyone?
09:41:02 <openstack> bug 1571551 in DragonFlow "Kernel module vport_geneve.ko fails to load on ubuntu" [High,New] https://launchpad.net/bugs/1571551
09:41:11 <oanson> DuanKebo, yes, that's what I thought as well
09:41:24 <oanson> That's why I am assigning it to you and yuanwei .
09:41:55 <oanson> All right, I am bumping 1571551 down to medium. If anyone runs into it, please bump it back up to high.
09:41:56 <DuanKebo> OK
09:42:04 <nick-ma> ok
09:42:13 <yuli_s1> we need https://bugs.launchpad.net/dragonflow/+bug/1571551
09:42:13 <openstack> Launchpad bug 1571551 in DragonFlow "Kernel module vport_geneve.ko fails to load on ubuntu" [Medium,New]
09:42:32 <DuanKebo> we can bypass it by completely reinstalling ovs
09:42:47 <DuanKebo> before it is solved
09:43:04 <oanson> DuanKebo, yes, but that's a workaround. On the other hand, this is devstack environment, so it's acceptable.
09:43:28 <oanson> Bug 1480672 , which is on Gal
09:43:28 <openstack> bug 1480672 in DragonFlow "Add Neighbour Discovery handling in local controller" [Medium,Triaged] https://launchpad.net/bugs/1480672 - Assigned to Gal Sagie (gal-sagie)
09:43:33 <oanson> lihi, I understand you are working on it?
09:43:52 <DuanKebo> yes, omer, we still need to work on the bug.
09:43:52 <lihi> yes, I'm on it
09:44:12 <oanson> DuanKebo, I am just organising to know who works on what.
09:44:24 <oanson> You don't have to solve it today.
09:44:26 <oanson> Tomorrow...
09:44:35 <oanson> lihi, I am assigning it to you, then.
09:44:55 <yuli_s1> we need another owner for : https://bugs.launchpad.net/dragonflow/+bug/1614334
09:44:55 <openstack> Launchpad bug 1614334 in DragonFlow "Fail to Install dragonflow" [Medium,New]
09:45:01 <lihi> OK. Just don't forget to review it on time
09:46:07 <DuanKebo> yuli, if no one,  I can take it.
09:46:13 <oanson> DuanKebo, about bug 1614334, did you have git installed when this happened?
09:46:13 <openstack> bug 1614334 in DragonFlow "Fail to Install dragonflow" [Medium,New] https://launchpad.net/bugs/1614334
09:46:31 <yuli_s1> DuanKebo, it is possible that this bug happened because of old libraries
09:46:51 <oanson> yuli_s1, no.
09:47:10 <oanson> I'll take it. I was just looking at versioning issues this morning, so I may recognise something :)
09:47:15 <DuanKebo> I think anyone else who doesn't use git may face this problem
09:47:41 <DuanKebo> if i git clone the code, it can install successfully.
09:47:55 <oanson> DuanKebo, yes.
09:48:11 <DuanKebo> yuli, not because of old libs.
09:48:16 <oanson> That's what the error message says - it needs access to the remote repository, which it doesn't have with a tarball.
09:48:27 <DuanKebo> as i have said, when using git clone, it works.
09:48:39 <oanson> Yes
09:48:50 <oanson> All right, I'll see what I can find about it.
09:48:50 <yuli_s1> yesterday i found another issue with performance degradation when creating hundreds of subnets,
09:49:19 <oanson> yuli_s1, did you open a bug report?
09:49:22 <yuli_s1> i have not reported it, i will report it and take it to myself.
09:49:38 <oanson> Very good.
09:49:49 <nick-ma> control plane performance?
09:49:50 <oanson> Anything else?
09:49:59 <oanson> #topic Performance
09:50:14 <oanson> yuli_s1, any new exciting results?
09:50:20 <yuli_s1> i was working on performance tests
09:50:42 <yuli_s1> I have built the following test
09:51:03 <yuli_s1> 30+ real servers running full devstack installation
09:51:29 <yuli_s1> +129 additional df-controllers (each one works with its own br-int)
09:51:36 <yuli_s1> (on each server)
09:52:00 <yuli_s1> in my test I am doing the following
09:52:09 <yuli_s1> i create a network and 256 subnets for it
09:52:37 <yuli_s1> currently it takes almost 200 secs to finish the whole test
09:53:08 <yuli_s1> yesterday I was checking
09:53:18 <DuanKebo> According to your test, do you think currently, can dragonflow support 4000+ compute nodes? @yuli?
09:53:23 <yuli_s1> number of rules added to each br-int
09:53:48 <yuli_s1> when running a script that printed the number of rules
09:54:03 <yuli_s1> sometimes the counter went down and then up
09:54:24 <yuli_s1> i suppose, when receiving an update for a network, we recreate dhcp related rules
09:54:40 <yuli_s1> which can be a reason for performance degradation
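A minimal sketch of a rule counter like the one described above. This is not the script used in the test, which was not shown in the meeting. It assumes that every flow line in the output of `ovs-ofctl dump-flows` contains a `cookie=` field and that the reply header line does not. The function names are hypothetical.

```python
import subprocess


def count_flows(dump_text):
    """Count flow entries in `ovs-ofctl dump-flows` output.

    Assumption: each flow line contains 'cookie=', the header line does not.
    """
    return sum(1 for line in dump_text.splitlines() if "cookie=" in line)


def flows_on_bridge(bridge="br-int"):
    """Run ovs-ofctl against a bridge and return its current flow count."""
    result = subprocess.run(
        ["ovs-ofctl", "dump-flows", bridge],
        check=True,
        capture_output=True,
        text=True,
    )
    return count_flows(result.stdout)
```

Sampling `flows_on_bridge()` in a loop while subnets are created would show whether the count dips and recovers, as reported above.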
09:55:08 <oshidoshi> DuanKebo, I think we can support it, but I think we need to run another test first, to measure full-system update when creating ports - it is the more generic use case
09:55:11 <DuanKebo> yuli, can you report a bug about this?
09:55:18 <yuli_s1> sure, will do it
09:55:40 <DuanKebo> what is full-system update?
09:55:41 <yuli_s1> DuanKebo, yes
09:55:41 <yuli_s1> sure
09:56:09 <oanson> yuli_s1, 256 subnets per server? Or just 1 network with 256 subnets for the entire test?
09:56:20 <yuli_s1> 130 df-controllers per server
09:56:33 <oanson> But how many subnets?
09:56:36 <yuli_s1> in test I create 1 network with 256 subnets
09:56:37 <oanson> And was this core or ML2?
09:56:46 <oanson> yuli_s1, I see.
09:56:51 <yuli_s1> sec.
09:57:10 <yuli_s1> not ml2
09:58:00 <oanson> yuli_s1, I think the test with ML2 is also important - since this will be our standard next version
09:58:05 <oanson> (hopefully)
09:58:15 <yuli_s1> sure, I will test it too !
09:58:20 <oanson> All right. That's all the time we have
09:58:33 <oanson> Lucky no one added any items to the open discussion :)
09:58:38 <DuanKebo> we'd better test it first. @yuli
09:58:53 <yuli_s1> ok
09:58:55 <oanson> Anything else before we finish?
09:58:55 <DuanKebo> if you have some problem testing it, we can help you
09:59:01 <yuli_s1> thanks !
09:59:19 <yuli_s1> DuanKebo, I will try to push the scripts to repository
09:59:38 <oanson> yuli_s1, that's a great idea!
09:59:44 <yuli_s1> so you will be able to see the code
09:59:46 <nick-ma> launchpad blueprints need to be updated.
09:59:56 <oanson> nick-ma, yes.
10:00:05 <oanson> I will go over them for next meeting.
10:00:09 <DuanKebo> ok, we will review it.
10:00:11 <oanson> I think we need more than 2 minutes for them :)
10:00:30 <oanson> #action oanson Review and summarise blueprints.
10:00:37 <oanson> All right. That's our time.
10:00:40 <oanson> Thanks everyone.
10:00:49 <oanson> #endmeeting