09:00:19 <gsagie> #startmeeting dragonflow
09:00:20 <openstack> Meeting started Mon Feb 29 09:00:19 2016 UTC and is due to finish in 60 minutes.  The chair is gsagie. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:23 <openstack> The meeting name has been set to 'dragonflow'
09:00:27 <gsagie> Hello to all the dragons :) and the flows..
09:00:30 <gampel> Hi
09:00:36 <gsagie> let this show begin!
09:00:45 <gsagie> who is here for dragonflow meeting?
09:00:45 <matrohon> hi
09:00:45 <Shlomo_N> hi
09:00:48 <nick-ma> hi
09:00:51 <matrohon> hi
09:00:58 <gsagie> matrohon: welcome :) first time i see you
09:01:12 <matrohon> gsagie, thanks :)
09:01:20 <gampel> welcome
09:01:28 <yuli_s_> Hello
09:01:44 <matrohon> I hope there will be room for open discussion today!
09:02:02 <gsagie> #info gampel, matrohon, nick-ma, oanson, Shlomo_N, yuli_s, gsagie, DuanKebo in meeting
09:02:20 <DuanKebo> Hi
09:02:24 <gsagie> matrohon: sure, we have a tight schedule but we are here for the open discussion so we will make some time
09:02:40 <gsagie> #topic design summit
09:02:44 <nick-ma> we also can move to dragonflow room for discussion.
09:03:01 <gsagie> so, Dragonflow was approved as big-tent project
09:03:05 <matrohon> gsagie, nick-ma fine!
09:03:24 <Shlomo_N> Congrats...!
09:03:26 <gsagie> and this means we need to request rooms for design summit sessions, we have a few topics and gampel
09:03:33 <gsagie> need to send an email to request how many rooms
09:03:37 <gampel> we need to decide  how many work session  we want in the summit
09:03:37 <gsagie> how many sessions
09:04:07 <gsagie> gampel: i think we also want 1-2 fishbowl sessions to hopefully have a broader community discussion regarding the general roadmap and user requests/questions
09:04:28 <gampel> I think that we need at least 4 one-hour sessions
09:04:34 <gsagie> so i was thinking something in the area of 1 fishbowl session and 5 1 hour sessions
09:04:35 <gsagie> yeah
09:04:50 <gampel> yes, sounds good
09:05:02 <gsagie> gampel: ok, lets also start an etherpad to prioritize the points we want to discuss
09:05:05 <gsagie> so anyone can add
09:05:14 <gsagie> and publish to mailing list maybe
09:05:28 <gampel> road map for N  is one
09:05:42 <nick-ma> we can list our session schedule on etherpad for priority.
09:05:44 <nick-ma> yes
09:05:46 <gsagie> #action gampel return design session numbers we need (1 fishbowl, 5x1 hour)
09:05:59 <gsagie> #action gampel start etherpad for design summit topics
09:06:12 <gsagie> yeah lets continue on that etherpad, we still have time
09:06:13 <gampel> I will create the etherpad and send it to the mailing list
09:06:18 <gsagie> ok great!
09:06:20 <gsagie> thanks
09:06:22 <nick-ma> thanks
09:06:25 <gsagie> #topic testing
09:06:41 <gsagie> Ok, Shlomo_N, yuli_s, please update us on this front
09:06:49 <Shlomo_N> ok
09:06:53 <gsagie> We are starting to create scale and API tests
09:07:18 <Shlomo_N> First, the data plane performance testing spec was merged into the DragonFlow git repository, you can find it here:
09:07:22 <Shlomo_N> https://github.com/openstack/dragonflow/blob/master/doc/source/specs/performance_testing.rst
09:07:37 <gsagie> #link https://github.com/openstack/dragonflow/blob/master/doc/source/specs/performance_testing.rst
09:07:46 <Shlomo_N> 10x gsagie
09:07:50 <Shlomo_N> I have finished the data plane performance testing for DVR and
09:07:55 <Shlomo_N> DragonFlow in multi-node environment
09:08:03 <gsagie> Shlomo: good job, lets wait with publishing the results
09:08:12 <gsagie> until we have full picture of Dragonflow with security groups
09:08:22 <Shlomo_N> sure
09:08:23 <gsagie> but after that we can start pushing things to openstack-performance-docs
09:08:27 <gsagie> ok thanks
09:08:36 <gsagie> We need to see how we can automate this
09:08:43 <gsagie> so we can start applying this per patch
09:09:03 <gsagie> yuli_s also published API/control plane testing spec
09:09:03 <Shlomo_N> I am starting to work on the automation part
09:09:16 <gsagie> #link https://review.openstack.org/#/c/282873/
09:09:32 <gsagie> so please everyone review and share comments/ideas
09:09:50 <gsagie> we also would like to work on this with the community yuli_s and define shared standards
09:09:53 <DuanKebo> Can we also test the perf of the neutron reference design?
09:10:02 <gsagie> DuanKebo: thats what we do
09:10:09 <DuanKebo> and compare dragonflow's to it.
09:10:12 <gsagie> DuanKebo: Shlomo_N did a comparison and will send results
09:10:19 <yuli_s_> the doc covers more single box tests
09:10:20 <DuanKebo> Great!
09:10:20 <gsagie> we need to go over them and verify
09:10:21 <gampel> Good job Yuli , are you going to do it with rally ?
09:10:29 <yuli_s_> so, it will be extended as we go
09:10:46 <gsagie> yeah as me and yuli_s talked about the goal is to reach rally
09:10:50 <yuli_s_> and work with bigger test lab
09:11:07 <gsagie> for automation, but get initial numbers first as well so we will have an idea what to expect
09:11:48 <gampel> It will be important to simulate DB clients as well to test
09:12:08 <gampel> DC scale 1000+ compute nodes
09:12:14 <gsagie> gampel: yes good point, after API we will want to test the DB backend as well
09:12:36 <gsagie> i think its mentioned in the document
09:13:10 <gsagie> yuli_s: i say lets first get initial numbers and then start investigating Rally for this, i can help you with that
09:13:23 <gsagie> hi vikram! :)
09:13:26 <yuli_s_> gsagie: sure !
09:13:28 <gsagie> ok lets move to the next topic
09:13:28 <vikram_> hi gal ;)
09:13:33 <gsagie> #topic DB consistency
09:13:40 <gsagie> nick-ma: :)
09:13:42 <gampel> welcome @ vikram_
09:13:49 <nick-ma> the current code is in the review.
09:13:50 <gsagie> hi raofei welcome
09:13:54 <raofei> hi
09:13:55 <vikram_> gampel; thanks
09:13:57 <gsagie> #info raofei, vikram are in meeting as well
09:14:18 <gsagie> nick-ma: good job, i did see we had some exceptions in neutron server, will re check
09:14:19 <nick-ma> i'm also working on the testing. there are some errors on updating subnet in fullstack.
09:14:36 <gsagie> yeah i saw some tests are sometimes failing
09:14:39 <nick-ma> i updated the code.
09:14:47 <nick-ma> yes. but for manual testing and rally, it works.
09:14:54 <nick-ma> so i need to figure out why.
09:15:02 <yuli_s_> the DB consistency architecture is great, I think we might have some rejections from neutron team that we want to add a new table
09:15:02 <gsagie> #link DB consistency review https://review.openstack.org/#/c/282290/
09:15:17 <yuli_s_> to sync data in neutron db
09:15:31 <gsagie> nick-ma: okie great
09:15:35 <gsagie> yuli_s: thats not a problem
09:15:42 <gampel> it is an additional table only accessed from our plugin i do not see a problem
09:15:58 <yuli_s_> gsagie: i hope too
09:16:12 <gsagie> nick-ma: okie let us know if you need any help with investigating these tests, i saw some fails that relate to floating ip as well
09:16:13 <nick-ma> yes. there's no problem for adding a new table for a plugin.
09:16:27 <gsagie> anything else for DB consistency?
09:17:16 <gsagie> #topic pub-sub
09:17:20 <gsagie> okie..
09:17:21 <nick-ma> there's some discussion on getting rid of neutron db and just use nosql for persistent storage. i'll write my thoughts on it. we can discuss it later.
09:17:53 <gsagie> nick-ma: okie, we had the same thoughts but this requires much more work and needs to be synced with the Neutron community
09:18:08 <gsagie> so its definitely not for this release
09:18:13 <gampel> yes this is a big change and the biggest problem there is that the DB is only eventually consistent
09:18:22 <yuli_s_> for me it is not clear why we need to use ZeroMQ together with redis
09:18:28 <nick-ma> yes, you are right.
09:18:32 <gsagie> gampel: the pub-sub stage is yours, please publish us some status :)
09:18:34 <DuanKebo> redis pub/sub driver is under test now.
09:18:45 <gampel> Yes pub-sub status
09:18:52 <DuanKebo> @yuli_s Yes
09:19:00 <gampel> we have the ZMQ merged
09:19:15 <DuanKebo> We are trying not to use ZeroMQ in the redis driver
09:19:35 <gampel> and @omer is working on separating the publisher into a different service
09:19:39 <gsagie> DuanKebo: gampel found some interesting posts describing that ZeroMQ performance is much better than Redis
09:19:44 <gsagie> we do need to verify all of it
09:19:55 <gampel> Currently we only support Neutron server Publishers
09:20:05 <gsagie> so it might prove more efficient to use redis as DB backend and ZeroMQ as the pub/sub
09:20:07 <yuli_s_> gsagie: in the control performance doc we are covering this
09:20:08 <DuanKebo> Yes, we will investigate it.
09:20:17 <gampel> for this release we do not support controller to controller
09:20:41 <gampel> the chassis are handled from the neutron server side
09:21:05 <gampel> ZMQ and Redis do not work together, they are two drivers
09:21:18 <gsagie> DuanKebo, gampel: any open points regarding Redis pub/sub?
09:21:19 <gampel> we can use any DB with any Pub sub driver
09:21:31 <gsagie> any problems we need to discuss?
09:21:31 <oanson> In case we'll want DFcontroller-DFcontroller, we can use the database + a table monitor on the controller.
09:21:45 <DuanKebo> Currently, we are using redis for db and pub/sub
09:21:57 <gampel> From the discussion today we understand  that redis does not need to bind a local socket for the publisher
09:22:01 <DuanKebo> But open to other opinions
09:22:11 <gampel> and can run publisher per Neutron server process
09:22:16 <DuanKebo> open to other options.
09:22:48 <gsagie> okie, then the implementation doesnt need to use the pub/sub service
09:22:57 <DuanKebo> Yes, it supports multiple neutron servers on one host.
09:23:04 <oanson> DuanKebo, by separating the pub/sub and DB drivers, we allow flexibility. In configuration, we can select to do both with redis, or DB with redis and pub/sub with ZMQ.
09:23:05 <gampel> what I suggest is that we test the performance of the different pub/sub drivers and then we can compare them all
09:23:36 <gsagie> DuanKebo, gampel: okie then, lets continue with Redis implementation and compare results to Redis + ZeroMQ or only Redis
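The point made above by gampel and oanson (any DB driver can be combined with any pub/sub driver, chosen in configuration) can be sketched as follows. This is only an illustration under stated assumptions: the class names, the registry dictionaries and `load_backend` are hypothetical and are not Dragonflow's actual driver API.

```python
from dataclasses import dataclass


# Hypothetical placeholder drivers, for illustration only.
class RedisDbDriver:
    name = "redis"


class EtcdDbDriver:
    name = "etcd"


class RedisPubSubDriver:
    name = "redis"


class ZmqPubSubDriver:
    name = "zmq"


# Two independent registries: one for DB drivers, one for pub/sub drivers.
DB_DRIVERS = {"redis": RedisDbDriver, "etcd": EtcdDbDriver}
PUBSUB_DRIVERS = {"redis": RedisPubSubDriver, "zmq": ZmqPubSubDriver}


@dataclass
class Backend:
    db: object
    pubsub: object


def load_backend(db_name, pubsub_name):
    """Pick the DB driver and the pub/sub driver independently."""
    try:
        db_cls = DB_DRIVERS[db_name]
        pubsub_cls = PUBSUB_DRIVERS[pubsub_name]
    except KeyError as exc:
        raise ValueError(f"unknown driver: {exc.args[0]}") from exc
    return Backend(db=db_cls(), pubsub=pubsub_cls())


# Both configurations mentioned in the meeting are expressible.
redis_only = load_backend("redis", "redis")
redis_with_zmq = load_backend("redis", "zmq")
```

Keeping the two registries separate is what makes a like-for-like comparison of Redis-only against Redis + ZeroMQ possible without touching the DB layer.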
09:23:36 <yuli_s_> yes, we can start the actual control plane test from here
09:23:51 <gsagie> DuanKebo: is there any time frame for Redis to be completed?
09:23:55 <gampel> For M Cycle i propose to focus only on Neutron server publishers
09:23:56 <gsagie> are we close?
09:24:20 <DuanKebo> We are doing the test.
09:24:36 <DuanKebo> I think  it can be finished this week
09:24:43 <gsagie> the test or the code?
09:24:52 <DuanKebo> the test
09:25:12 <nick-ma> we need to build gate check for each db backend.
09:25:23 <gampel> yes this is a good idea
09:25:33 <gsagie> DuanKebo: okie, so please update us with the results
09:25:39 <gsagie> when you have them
09:25:41 <DuanKebo> OK, np
09:25:50 <gampel> I will add it to the list and register a bug
09:26:04 <oanson> Gate tests for each possible configuration are very expensive performance-wise, and may take a lot of time. I think this should be considered.
09:26:05 <gsagie> But if the effort of completing Redis is minimal, i think we should do it with the pub/sub as well
09:26:22 <yuli_s_> DuanKebo: check this out https://github.com/openstack/performance-docs/blob/master/doc/source/test_plans/mq/plan.rst
09:26:42 <gsagie> lets talk about gate tests after..
09:27:14 <nick-ma> yes, very expensive. :-)
09:27:45 <gsagie> okie
09:27:59 <gsagie> #action DuanKebo publish results of pub/sub testing
09:28:03 <gampel> I think in this test they are not using pub/sub sockets
09:28:19 <gsagie> #action DuanKebo try to assess the time it takes to finish Redis for both pub/sub and DB
09:28:52 <gsagie> gampel:  we need to test this end to end, if the effort of adding this now is small i think we need to test this end to end
09:29:21 <gampel> @DuanKebo the tests are mainly for the MQ part and not for the Pub/Sub
09:29:30 <gsagie> There are some advantages to selective proactive distribution with redis
09:29:34 <gsagie> we need to explore
09:29:52 <gsagie> so lets see if we can complete the work and then do the testing
09:30:11 <gsagie> DuanKebo: what do you think?
09:30:15 <gampel> Yes i think that the controller path testing will be the key for the selection
09:30:50 <yuli_s_> DuanKebo I will be happy to help with the test
09:31:08 <gsagie> okie, lets continue talking about this after
09:31:15 <gsagie> #topic security groups and port security
09:31:22 <gsagie> dingbo here?
09:31:28 <gsagie> raofei: can you update on this?
09:31:33 <gampel> we will add the reliability later but I agree we should start testing as soon as possible
09:31:49 <raofei> security group is done by Yuanwei now
09:32:08 <raofei> The code is almost complete, it is being tested now
09:32:45 <gsagie> okie, do you know when he will upload it upstream?
09:32:51 <gsagie> so we can all review
09:33:05 <gampel> maybe ask him to upload with WIP
09:33:23 <gampel> so we could speed the review cycle
09:33:26 <raofei> Yes, I think so. @duankebo, when can yuanwei commit the latest code?
09:34:06 <gsagie> i think he is disconnected
09:34:16 <DuanKebo> online again
09:34:19 <gsagie> ahh ok
09:34:23 <gsagie> welcome back :)
09:34:38 <DuanKebo> I will confirm it with yuanwei
09:34:51 <DuanKebo> I think it may be this week
09:34:59 <gsagie> DuanKebo: please do because we are approaching end of release
09:35:06 <gsagie> doing the testing without security groups is pointless
09:35:35 <gsagie> #action DuanKebo check security groups status and upload for review to upstream
09:35:40 <gsagie> #topic distributed DNAT
09:35:47 <DuanKebo> yes. as far as i know most of the code has been finished.
09:35:48 <gsagie> raofei: ...:)
09:36:09 <raofei> it's in the coding phase.
09:36:22 <gsagie> only the application is missing right?
09:36:35 <raofei> I think the coding and testing will be completed  this week.
09:36:36 <raofei> yes.
09:36:46 <gsagie> raofei: ok good job
09:36:50 <gampel> great :)
09:36:56 <raofei> of course, the plugin also needs some changes.
09:37:00 <gsagie> #action raofei upload distributed dnat upstream for review
09:37:02 <DuanKebo> @raofei Hujie is doing the integration
09:37:23 <gsagie> #topic selective proactive
09:37:44 <gsagie> DuanKebo: i saw the code, i need to see about my comments from last patch but over all it looks good
09:38:10 <gsagie> need to make sure the OVSDB monitor patch updates the correct queue and it seems we have this covered and we only then need the support of the DB/pub sub
09:38:11 <DuanKebo> Yes i have seen the comments.
09:38:21 <DuanKebo> and will upload a patch today.
09:38:32 <DuanKebo> it is under testing also.
09:38:33 <gsagie> oanson: you have a bug on you to move the local cache to be tenant aware, will you find time to work on this?
09:38:48 <oanson> Sure
09:38:49 <gsagie> DuanKebo: ok great job, looks good to me
09:39:04 <gsagie> #action oanson continue gsagie patch for Tenant aware cache
09:39:16 <gsagie> we can then create better searches per tenant in the cache
09:39:26 <gsagie> DuanKebo: any open issue regarding that?
09:39:40 <gsagie> #link tenant cache https://review.openstack.org/#/c/277176/
09:39:41 <DuanKebo> @gsagie I need nb-api to support query by topic
09:39:49 <gampel> You mean adding additional indexes  ?
09:40:15 <gsagie> gampel: no, we talked about that to make the cache structure by tenants
09:40:15 <DuanKebo> Some apis are missing this parameter. But we can discuss this after the IRC
09:40:31 <gsagie> so you have a list/dict of tenants and so on
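The tenant-structured cache gsagie describes (a dict of tenants, each holding that tenant's objects) could look roughly like the sketch below. It is an assumption-laden illustration: `TenantCache` and its method names are hypothetical and do not come from the Dragonflow patch under review.

```python
from collections import defaultdict


class TenantCache:
    """Hypothetical local cache keyed first by tenant, then by object id."""

    def __init__(self):
        # tenant_id -> {object_id -> object}
        self._by_tenant = defaultdict(dict)

    def update(self, tenant_id, obj_id, value):
        self._by_tenant[tenant_id][obj_id] = value

    def get(self, tenant_id, obj_id):
        # .get() avoids creating empty entries for unknown tenants.
        return self._by_tenant.get(tenant_id, {}).get(obj_id)

    def list_for_tenant(self, tenant_id):
        # Only one tenant's objects are touched, not the whole cache.
        return list(self._by_tenant.get(tenant_id, {}).values())

    def delete(self, tenant_id, obj_id):
        objects = self._by_tenant.get(tenant_id)
        if objects is not None:
            objects.pop(obj_id, None)
            if not objects:
                # Drop the tenant once it has no cached objects left.
                del self._by_tenant[tenant_id]


cache = TenantCache()
cache.update("tenant-a", "port-1", {"ip": "10.0.0.5"})
cache.update("tenant-a", "port-2", {"ip": "10.0.0.6"})
cache.update("tenant-b", "port-3", {"ip": "10.1.0.5"})
tenant_a_ports = cache.list_for_tenant("tenant-a")
```

A per-tenant lookup then scans only that tenant's dictionary, which is the reason given in the meeting for expecting lookups to get faster rather than slower.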
09:40:31 <gampel> I see, we just need to make sure we do not slow down the other queries
09:40:49 <gsagie> DuanKebo: ok, look at this patch: https://review.openstack.org/#/c/284178/
09:40:56 <gsagie> it has everything you need i think
09:40:59 <gsagie> we need to merge it soon
09:41:11 <DuanKebo> OK
09:41:15 <gsagie> gampel: maybe you can review and merge
09:41:26 <gampel> OK
09:41:33 <gampel> i will
09:41:37 <gsagie> gampel: its not going to slow things only make them faster
09:41:54 <gsagie> both for L3 apps and for DB/controller
09:42:04 <gsagie> especially when using L3 reactive
09:42:27 <gsagie> i do have some work to also fix some things in the L3 proactive app
09:42:46 <gsagie> #action gsagie decrease flow number in L3 proactive app, one flow per router interface
09:43:02 <gsagie> for controller reliability, we need to review this code
09:43:09 <gsagie> #topic controller reliability
09:43:17 <gsagie> DuanKebo: is there any open issue for that?
09:43:25 <DuanKebo> no
09:43:58 <DuanKebo> This work is delayed.
09:44:19 <gsagie> okie great, gampel, Li-Ma please review this patch when you have time (and everyone else of course)
09:44:19 <DuanKebo> Heshan (the guy in charge of this) has been assigned other work
09:44:40 <gsagie> DuanKebo: ok :(
09:44:44 <gampel> I will review it today
09:45:11 <gampel> so we need someone to pick up this work
09:45:18 <gsagie> DuanKebo: its probably not the most urgent job, but do you want someone else to continue this work?
09:45:18 <nick-ma> okie
09:45:20 <gsagie> i can work on it
09:45:29 <DuanKebo> No, Heshan will come back soon
09:45:29 <gsagie> or you have anyone else?
09:45:39 <gsagie> ok
09:45:41 <DuanKebo> He can continue this work.
09:46:00 <gsagie> #info Heshan come back :) we need you!
09:46:11 <gsagie> #topic open discussion
09:46:18 <gsagie> matrohon: stage is yours :)
09:46:28 <matrohon> gsagie, thanks
09:46:29 <gsagie> then we can talk about CI jobs
09:47:07 <matrohon> as nick-ma said earlier, we are considering getting rid of neutron db, and rely on a nosql db instead
09:47:40 <nick-ma> haven't decided yet. there are lots of tradeoffs and discussion on it.
09:47:41 <matrohon> this kind of work takes place in the following context :
09:47:51 <matrohon> #link https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/7342
09:47:52 <gsagie> matrohon: you work with nick-ma?
09:48:03 <gampel> Did you evaluate  how much work it is ?
09:48:14 <matrohon> gsagie, no, I'm working on distributed cloud
09:48:51 <matrohon> some experiments have already been done by replacing the neutron db backend with redis
09:49:02 <gsagie> matrohon: such a model can really help us, so its an interesting thing that we actually also considered
09:49:04 <nick-ma> i will go to that topic and we can discuss it during the summit.
09:49:20 <matrohon> nick-ma, +1
09:49:36 <gampel> we considered it as well, the main issues are the eventual consistency of the DB and the workload
09:49:41 <gsagie> we would all like to participate, i think gampel and I would also love to be there
09:49:49 <matrohon> there are chances that a dedicated WG gets created in Openstack
09:50:32 <nick-ma> neutron xxx-list may mislead end-users when db backend is not ACID, due to eventual consistency.
09:50:33 <matrohon> gampel, yep, I'm not a nosql expert, but I understand that it's a challenging topic
09:50:45 <nick-ma> yes.
09:51:03 <gampel> and the query speed
09:51:06 <gampel> for read
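nick-ma's concern that a list call may mislead users on an eventually consistent backend can be illustrated with a toy model. This is a deliberately simplified sketch: `EventuallyConsistentStore` is hypothetical and does not model any particular NoSQL database; it only shows a write being acknowledged before readers can see it.

```python
class EventuallyConsistentStore:
    """Toy store: writes go to a primary, reads are served from a replica."""

    def __init__(self):
        self._primary = {}
        self._replica = {}

    def create(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self._primary[key] = value

    def list(self):
        # Reads come from the replica, which may lag behind the primary.
        return sorted(self._replica)

    def replicate(self):
        # Stand-in for asynchronous replication catching up.
        self._replica = dict(self._primary)


store = EventuallyConsistentStore()
store.create("port-1", {"status": "ACTIVE"})
before = store.list()   # the new port is not visible yet
store.replicate()
after = store.list()    # visible once replication has caught up
```

In this model a user who creates a port and lists ports immediately afterwards would not see it, which is the kind of surprise an ACID backend avoids.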
09:51:15 <gsagie> matrohon: its nice to know about this effort, how can we help?
09:51:26 <matrohon> it seems your group already gave it a lot of thought, it would be awesome to share that in the WG
09:51:37 <nick-ma> for that dedicated WG? I'm interested in it.
09:51:44 <gsagie> matrohon: ok would love that, how do we do it?
09:51:47 <gampel> yes this is a great effort  that we would like to join
09:52:07 <matrohon> gsagie, we'll probably set up a dedicated meeting during the next summit, I'll let you know
09:52:30 <gsagie> matrohon: ok that will be great, thanks for sharing and hope to meet you in the summit
09:52:42 <nick-ma> awesome.
09:53:21 <gsagie> we might also use one Dragonflow design session to discuss it, as we explored this area quite a lot
09:53:27 <gampel> Yes, let us know, and if there is an active discussion about it can you please send us a link
09:53:38 <matrohon> do you already have some materials about changes to be made in upstream project to achieve the distributed goal
09:53:40 <matrohon> ?
09:53:54 <gsagie> #link https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/7342
09:53:57 <gampel> yes this is a good idea DB consistency and DB alternative
09:54:03 <gsagie> matrohon: nothing written
09:54:24 <gsagie> matrohon: but we know others that might be interested with this as well
09:54:30 <nick-ma> you can review the spec of db consistency. we discussed a lot during review.
09:54:41 <nick-ma> matrohon: .
09:54:43 <matrohon> gsagie, great!
09:54:57 <matrohon> nick-ma, I'll do thanks
09:55:07 <gsagie> okie, so lets continue talk about this and feel free to drop to our channel #openstack-dragonflow if you have any more questions
09:55:23 <matrohon> gsagie, ok
09:55:30 <gsagie> and if you guys start meeting before the summit, would love to join
09:55:46 <gsagie> for the CI, i think we need to decide on our default setup first
09:56:15 <gsagie> as i mentioned to nick-ma in one of the reviews, i think he will also want to run CI for zookeeper
09:56:20 <gampel> For the CI I feel it is very important to test the main used DB drivers
09:56:54 <gsagie> gampel: i agree, so maybe after Redis is implemented we can add this as well
09:56:59 <gampel> in the CI so it seem that zookeeper, Redis, etcd
09:57:05 <gsagie> yep
09:57:19 <gampel> do we have a limit on the number of CI ?
09:57:22 <gampel> jobs
09:57:31 <gsagie> at least for the fullstack tests, dont know if we need to do it for tempest
09:57:47 <gsagie> gampel: i am not aware of such restriction, but i dont think it will be a problem to add 2 more jobs
09:57:52 <nick-ma> if we set up all of them, it will take long time on verification.
09:57:53 <gampel> yes i agree for the fullstack
09:58:07 <gsagie> nick-ma: i think its done in parallel so not sure about that
09:58:10 <oanson> Do we want such a job only for zookeeper, redis, etcd?
09:58:15 <nick-ma> ok
09:58:18 <oanson> Or any other db driver we add?
09:58:30 <nick-ma> fullstack only? or adding tempest api testing?
09:58:30 <gampel> i think it is enough  for now
09:58:34 <gsagie> oanson: lets decide that as we go, right now i dont see a reason to add anything else
09:58:49 <gampel> RamCloud is not used yet
09:58:51 <gsagie> nick-ma: what do you suggest?
09:58:51 <oanson> All right
09:59:17 <gsagie> i think fullstack is enough, as it grows it verifies the important things
09:59:29 <gsagie> tempest is more about the controller logic and less about the DB itself
09:59:37 <nick-ma> priority? before or after summit?
09:59:39 <gsagie> maybe add some more DB specific tests to the fullstack
09:59:46 <gampel> yes I agree only fullstack and maybe rally later
10:00:00 <gsagie> nick-ma: zookeeper depends on you :)
10:00:05 <gsagie> i can add the one for Redis
10:00:07 <yuli_s_> gsagie: meta service without q-dhcp
10:00:10 <yuli_s_> :)
10:00:43 <gsagie> yuli_s: yeah, we need to work on this as well :)
10:00:45 <nick-ma> ok
10:00:49 <gampel> this is a feature we do not support yet
10:00:49 <gsagie> lets take this offline as our time is done
10:00:54 <gsagie> thanks everyone for attending!
10:00:59 <gampel> thanks
10:01:02 <gsagie> and see you next week
10:01:07 <gsagie> #endmeeting