09:00:19 #startmeeting dragonflow
09:00:20 Meeting started Mon Feb 29 09:00:19 2016 UTC and is due to finish in 60 minutes. The chair is gsagie. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:23 The meeting name has been set to 'dragonflow'
09:00:27 Hello to all the dragons :) and the flows..
09:00:30 Hi
09:00:36 let this show begin!
09:00:45 who is here for the dragonflow meeting?
09:00:45 hi
09:00:45 hi
09:00:48 hi
09:00:51 hi
09:00:58 matrohon: welcome :) first time i see you
09:01:12 gsagie, thanks :)
09:01:20 welcome
09:01:28 Hello
09:01:44 I hope there will be room for open discussion today!
09:02:02 #info gampel, matrohon, nick-ma, oanson, Shlomo_N, yuli_s, gsagie, DuanKebo in meeting
09:02:20 Hi
09:02:24 matrohon: sure, we have a tight schedule but we are here for the open discussion so we will make some time
09:02:40 #topic design summit
09:02:44 we can also move to the dragonflow room for discussion.
09:03:01 so, Dragonflow was approved as a big-tent project
09:03:05 gsagie, nick-ma fine!
09:03:24 Congrats...!
09:03:26 and this means we need to request rooms for design summit sessions, we have a few topics and gampel
09:03:33 needs to send an email to request how many rooms
09:03:37 we need to decide how many work sessions we want in the summit
09:03:37 how many sessions
09:04:07 gampel: i think we also want 1-2 fishbowl sessions to hopefully have a broader community discussion regarding general roadmap and user requests/questions
09:04:28 I think that we need at least 4 one-hour sessions
09:04:34 so i was thinking something in the area of 1 fishbowl session and 5 one-hour sessions
09:04:35 yeah
09:04:50 yes sounds good
09:05:02 gampel: ok, lets also start an etherpad to prioritize the points we want to discuss
09:05:05 so anyone can add
09:05:14 and publish to mailing list maybe
09:05:28 road map for N is one
09:05:42 we can list our session schedule on etherpad for priority.
09:05:44 yes
09:05:46 #action gampel return design session numbers we need (1 fishbowl, 5x1 hour)
09:05:59 #action gampel start etherpad for design summit topics
09:06:12 yeah lets continue on that etherpad, we still have time
09:06:13 I will create the etherpad and send it to the mailing list
09:06:18 ok great!
09:06:20 thanks
09:06:22 thanks
09:06:25 #topic testing
09:06:41 Ok, Shlomo_N, yuli_s, please update us on this front
09:06:49 ok
09:06:53 We are starting to create scale and API tests
09:07:18 First, the data plane performance testing spec was merged into the DragonFlow git repository, you can find it here:
09:07:22 https://github.com/openstack/dragonflow/blob/master/doc/source/specs/performance_testing.rst
09:07:37 #link https://github.com/openstack/dragonflow/blob/master/doc/source/specs/performance_testing.rst
09:07:46 10x gsagie
09:07:50 I have finished the data plane performance testing for DVR and
09:07:55 DragonFlow in a multi-node environment
09:08:03 Shlomo: good job, lets wait with publishing the results
09:08:12 until we have the full picture of Dragonflow with security groups
09:08:22 sure
09:08:23 but after that we can start pushing things to openstack-performance-docs
09:08:27 ok thanks
09:08:36 We need to see how we can automate this
09:08:43 so we can start applying this per patch
09:09:03 yuli_s also published the API/control plane testing spec
09:09:03 I am starting to work on the automation part
09:09:16 #link https://review.openstack.org/#/c/282873/
09:09:32 so please everyone review and share comments/ideas
09:09:50 yuli_s, we would also like to work on this with the community and define shared standards
09:09:53 Can we also test the perf of the neutron reference design
09:10:02 DuanKebo: thats what we do
09:10:09 and compare dragonflow's to it.
09:10:12 DuanKebo: Shlomo_N did a comparison and will send results
09:10:19 the doc covers more single box tests
09:10:20 Great!
09:10:20 we need to go over them and verify
09:10:21 Good job Yuli, are you going to do it with rally?
09:10:29 so, it will be extended as we go
09:10:46 yeah, as yuli_s and i discussed, the goal is to reach rally
09:10:50 and work with a bigger test lab
09:11:07 for automation, but get initial numbers first as well so we will have an idea what to expect
09:11:48 It will be important to simulate DB clients as well to test
09:12:08 DC scale, 1000+ compute nodes
09:12:14 gampel: yes good point, after API we will want to test the DB backend as well
09:12:36 i think its mentioned in the document
09:13:10 yuli_s: i say lets first get initial numbers and then start investigating Rally for this, i can help you with that
09:13:23 hi vikram! :)
09:13:26 gsagie: sure!
09:13:28 ok lets move to the next topic
09:13:28 hi gal ;)
09:13:33 #topic DB consistency
09:13:40 nick-ma: :)
09:13:42 welcome @vikram_
09:13:49 the current code is in review.
09:13:50 hi raofei welcome
09:13:54 hi
09:13:55 gampel: thanks
09:13:57 #info raofei, vikram are in meeting as well
09:14:18 nick-ma: good job, i did see we had some exceptions in neutron server, will re-check
09:14:19 i'm also working on the testing. there are some errors on updating subnet in fullstack.
09:14:36 yeah i saw some tests are sometimes failing
09:14:39 i updated the code.
09:14:47 yes. but for manual testing and rally, it works.
09:14:54 so i need to figure out why.
09:15:02 the DB consistency architecture is great, I think we might get some objections from the neutron team because we want to add a new table
09:15:02 #link DB consistency review https://review.openstack.org/#/c/282290/
09:15:17 to sync data in neutron db
09:15:31 nick-ma: okie great
09:15:35 yuli_s: thats not a problem
09:15:42 it is an additional table only accessed from our plugin, i do not see a problem
09:15:58 gsagie: i hope so too
09:16:12 nick-ma: okie let us know if you need any help with investigating these tests, i saw some failures that relate to floating ip as well
09:16:13 yes. there's no problem with adding a new table for a plugin.
09:16:27 anything else for DB consistency?
09:17:16 #topic pub-sub
09:17:20 okie..
09:17:21 there's some discussion on getting rid of neutron db and just using nosql for persistent storage. i'll write my thoughts on it. we can discuss it later.
09:17:53 nick-ma: okie, we had the same thoughts but this requires much more work and needs to be synced with the Neutron community
09:18:08 so its definitely not for this release
09:18:13 yes this is a big change and the biggest problem there is that the DB is only eventually consistent
09:18:22 for me it is not clear why we need to use ZeroMQ together with redis
09:18:28 yes, you are right.
09:18:32 gampel: the pub-sub stage is yours, please give us some status :)
09:18:34 redis pub/sub driver is under test now.
09:18:45 Yes pub-sub status
09:18:52 @yuli_s Yes
09:19:00 we have the ZMQ driver merged
09:19:15 We are trying not to use ZeroMQ in the redis driver
09:19:35 and @omer is working on separating the publisher into a different service
09:19:39 DuanKebo: gampel found some interesting posts describing that ZeroMQ performance is much better than Redis
09:19:44 we do need to verify all of it
09:19:55 Currently we only support Neutron server publishers
09:20:05 so it might prove more efficient to use redis as DB backend and ZeroMQ as the pub/sub
09:20:07 gsagie: in the control performance doc we are covering this
09:20:08 Yes, we will investigate it.
09:20:17 for this release we do not support controller to controller
09:20:41 the chassis are handled from the neutron server side
09:21:05 ZMQ and Redis do not work together, they are two separate drivers
09:21:18 DuanKebo, gampel: any open points regarding Redis pub/sub?
09:21:19 we can use any DB with any pub/sub driver
09:21:31 any problems we need to discuss?
09:21:31 In case we'll want DFcontroller-DFcontroller, we can use the database + a table monitor on the controller.
09:21:45 Currently, we are using redis for db and pub/sub
09:21:57 From the discussion today we understand that redis does not need to bind a local socket for the publisher
09:22:01 But open to other opinions
09:22:11 and can run a publisher per Neutron server process
09:22:16 open to other options.
09:22:48 okie, then the implementation doesnt need to use the pub/sub service
09:22:57 Yes, it supports multiple neutron servers on one host.
09:23:04 DuanKebo, by separating the pub/sub and DB drivers, we allow flexibility. In configuration, we can select to do both with redis, or DB with redis and pub/sub with ZMQ.
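
The driver split described above (any DB driver combined with any pub/sub driver, selected in configuration) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Dragonflow's actual code: the class names, the `db_driver` / `pubsub_driver` configuration keys, the `create_port` helper and the in-memory stand-ins are all hypothetical, and no real Redis or ZeroMQ client is used.

```python
from typing import Any, Callable, Dict, List, Tuple


class InMemoryDb:
    """Hypothetical stand-in for a real DB driver (Redis, etcd, ZooKeeper)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tables: Dict[str, Dict[str, Any]] = {}

    def set_key(self, table: str, key: str, value: Any) -> None:
        self._tables.setdefault(table, {})[key] = value

    def get_key(self, table: str, key: str) -> Any:
        return self._tables[table][key]


class InProcessPubSub:
    """Hypothetical stand-in for a real pub/sub driver (Redis, ZeroMQ)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: Dict[str, List[Callable[[Tuple], None]]] = {}

    def subscribe(self, topic: str, callback: Callable[[Tuple], None]) -> None:
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic: str, event: Tuple) -> None:
        # Only subscribers of this topic (for example a tenant) are notified.
        for callback in self._subscribers.get(topic, []):
            callback(event)


# Two independent registries: the DB choice does not constrain the pub/sub choice.
DB_DRIVERS = {name: InMemoryDb for name in ("redis", "etcd", "zookeeper")}
PUBSUB_DRIVERS = {name: InProcessPubSub for name in ("redis", "zmq")}


def load_drivers(config: Dict[str, str]) -> Tuple[InMemoryDb, InProcessPubSub]:
    """Instantiate one driver of each kind from the (assumed) config keys."""
    db = DB_DRIVERS[config["db_driver"]](config["db_driver"])
    pubsub = PUBSUB_DRIVERS[config["pubsub_driver"]](config["pubsub_driver"])
    return db, pubsub


def create_port(db, pubsub, tenant_id: str, port_id: str, port: dict) -> None:
    """Persist the object first, then notify subscribers of that tenant."""
    db.set_key("lport", port_id, port)
    pubsub.publish(tenant_id, ("lport", "create", port_id))


if __name__ == "__main__":
    # DB with redis, pub/sub with ZMQ: one of the combinations mentioned above.
    db, pubsub = load_drivers({"db_driver": "redis", "pubsub_driver": "zmq"})
    received: List[Tuple] = []
    pubsub.subscribe("tenant-a", received.append)
    create_port(db, pubsub, "tenant-a", "port-1", {"ip": "10.0.0.5"})
    print(db.name, pubsub.name, received)
```

Because the caller only sees the two small interfaces, switching to redis for both roles is a configuration change, which is what makes a side-by-side performance comparison of the pub/sub options practical.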
09:23:05 what I suggest is that we test the performance of the different pub/sub drivers and then we can compare them all
09:23:36 DuanKebo, gampel: okie then, lets continue with the Redis implementation and compare results to Redis + ZeroMQ or only Redis
09:23:36 yes, we can start the actual control plane test from here
09:23:51 DuanKebo: is there any time frame for Redis to be completed?
09:23:55 For the M cycle i propose to focus only on Neutron server publishers
09:23:56 are we close?
09:24:20 We are doing the test.
09:24:36 I think it can be finished this week
09:24:43 the test or the code?
09:24:52 the test
09:25:12 we need to build a gate check for each db backend.
09:25:23 yes this is a good idea
09:25:33 DuanKebo: okie, so please update us with the results
09:25:39 when you have them
09:25:41 OK, np
09:25:50 I will add it to the list and register a bug
09:26:04 Gate-tests for each possible configuration are very expensive performance-wise, and may take a lot of time. I think this should be considered.
09:26:05 But if the effort of completing Redis is minimal, i think we should do it with the pub/sub as well
09:26:22 DuanKebo: check this out https://github.com/openstack/performance-docs/blob/master/doc/source/test_plans/mq/plan.rst
09:26:42 lets talk about gate tests after..
09:27:14 yes, very expensive. :-)
09:27:45 okie
09:27:59 #action DuanKebo publish results of pub/sub testing
09:28:03 I think in this test they are not using pub/sub sockets
09:28:19 #action DuanKebo try to assess the time it takes to finish Redis for both pub/sub and DB
09:28:52 gampel: we need to test this end to end, if the effort of adding this now is small i think we need to test this end to end
09:29:21 @DuanKebo the tests are mainly for the MQ part and not for the pub/sub
09:29:30 There are some advantages to selective proactive distribution with redis
09:29:34 we need to explore
09:29:52 so lets see if we can complete the work and then do the testing
09:30:11 DuanKebo: what do you think?
09:30:15 Yes i think that the controller path testing will be the key for the selection
09:30:50 DuanKebo I will be happy to help with the test
09:31:08 okie, lets continue talking about this after
09:31:15 #topic security groups and port security
09:31:22 dingbo here?
09:31:28 raofei: can you update on this?
09:31:33 we will add the reliability later but I agree we should start testing as soon as possible
09:31:49 security group is done by Yuanwei now
09:32:08 The code is almost completed, now it's in testing
09:32:45 okie, do you know when he will upload it upstream?
09:32:51 so we can all review
09:33:05 maybe ask him to upload with WIP
09:33:23 so we could speed up the review cycle
09:33:26 Yes, I think so. @duankebo, when can yuanwei commit the latest code?
09:34:06 i think he is disconnected
09:34:16 online again
09:34:19 ahh ok
09:34:23 welcome back :)
09:34:38 I will confirm it with yuanwei
09:34:51 I think it may be this week
09:34:59 DuanKebo: please do because we are approaching end of release
09:35:06 doing the testing without security groups is pointless
09:35:35 #action DuanKebo check security groups status and upload for review to upstream
09:35:40 #topic distributed DNAT
09:35:47 yes. as far as i know most of the code has been finished.
09:35:48 raofei: ...:)
09:36:09 it's in coding phase.
09:36:22 only the application is missing right?
09:36:35 I think the coding and testing will be completed this week.
09:36:36 yes.
09:36:46 raofei: ok good job
09:36:50 great :)
09:36:56 of course, the plugin also needs some changes.
09:37:00 #action raofei upload distributed dnat upstream for review
09:37:02 @raofei Hujie is doing the integration
09:37:23 #topic selective proactive
09:37:44 DuanKebo: i saw the code, i need to check on my comments from the last patch but overall it looks good
09:38:10 need to make sure the OVSDB monitor patch updates the correct queue and it seems we have this covered and we only then need the support of the DB/pub sub
09:38:11 Yes i have seen the comments.
09:38:21 and will upload a patch today.
09:38:32 it is under testing also.
09:38:33 oanson: you have a bug assigned to you to make the local cache tenant aware, will you find time to work on this?
09:38:48 Sure
09:38:49 DuanKebo: ok great job, looks good to me
09:39:04 #action oanson continue gsagie patch for Tenant aware cache
09:39:16 we can then create better searches per tenant in cache
09:39:26 DuanKebo: any open issue regarding that?
09:39:40 #link tenant cache https://review.openstack.org/#/c/277176/
09:39:41 @gsagie I need nb-api to support query by topic
09:39:49 You mean adding additional indexes?
09:40:15 gampel: no, we talked about making the cache structure per tenant
09:40:28 Some APIs miss this param. But we can discuss this after the IRC meeting
09:40:31 so you have a list/dict of tenants and so on
09:40:46 I see, we just need to make sure we do not slow the other queries
09:40:49 DuanKebo: ok, look at this patch: https://review.openstack.org/#/c/284178/
09:40:56 it has everything you need i think
09:40:59 we need to merge it soon
09:41:11 OK
09:41:15 gampel: maybe you can review and merge
09:41:26 OK
09:41:33 i will
09:41:37 gampel: its not going to slow things, only make them faster
09:41:54 both for L3 apps and for DB/controller
09:42:04 especially when using L3 reactive
09:42:27 i do have some work to also fix some things in the L3 proactive app
09:42:46 #action gsagie decrease flow number in L3 proactive app, one flow per router interface
09:43:02 for controller reliability, we need to review this code
09:43:09 #topic controller reliability
09:43:17 DuanKebo: is there any open issue for that?
09:43:25 no
09:43:58 This work is delayed.
09:44:19 okie great, gampel, Li-Ma please review this patch when you have time (and everyone else of course)
09:44:34 Heshan (the guy in charge of this) has been assigned other work
09:44:40 DuanKebo: ok :(
09:44:44 I will review it today
09:45:11 so we need someone to pick up this work
09:45:18 DuanKebo: its probably not the most urgent job, but do you want someone else to continue this work?
09:45:18 okie
09:45:20 i can work on it
09:45:29 No, Heshan will come back soon
09:45:29 or do you have anyone else?
09:45:39 ok
09:45:41 He can continue this work.
09:46:00 #info Heshan come back :) we need you!
09:46:11 #topic open discussion
09:46:18 matrohon: stage is yours :)
09:46:28 gsagie, thanks
09:46:29 then we can talk about CI jobs
09:47:07 as nick-ma said earlier, we are considering getting rid of neutron db, and relying on a nosql db instead
09:47:40 haven't decided yet. there are lots of tradeoffs and discussion on it.
09:47:41 this kind of work takes place in the following context:
09:47:51 #link https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/7342
09:47:52 matrohon: you work with nick-ma?
09:48:03 Did you evaluate how much work it is?
09:48:14 gsagie, no, I'm working on distributed cloud
09:48:51 some experiments have already been done by replacing the neutron db backend with redis
09:49:02 matrohon: such a model can really help us, so its an interesting thing that we actually also considered
09:49:04 i will go to that talk and we can discuss it during the summit.
09:49:20 nick-ma, +1
09:49:36 we considered it as well, the main issues are the eventual consistency of the DB and the work load
09:49:41 we would all like to participate, i think gampel and I would also love to be there
09:49:49 there are chances that a dedicated WG gets created in OpenStack
09:50:32 neutron xxx-list may mislead end-users when the db backend is not ACID, due to eventual consistency.
09:50:33 gampel, yep, I'm not a nosql expert, but I understand that it's a challenging topic
09:50:45 yes.
09:51:03 and the query speed
09:51:06 for read
09:51:15 matrohon: its nice to know about this effort, how can we help?
09:51:26 it seems your group already gave it a lot of thought, it would be awesome to share it in the WG
09:51:37 for that dedicated WG? I'm interested in it.
09:51:44 matrohon: ok would love that, how do we do it?
09:51:47 yes this is a great effort that we would like to join
09:52:07 gsagie, we'll probably set up a dedicated meeting during the next summit, I'll let you know
09:52:30 matrohon: ok that will be great, thanks for sharing and hope to meet you at the summit
09:52:42 awesome.
09:53:21 we might also use one Dragonflow design session to discuss it, as we explored this area quite a lot
09:53:27 Yes let us know, and if there is an active discussion about it can you please send us a link
09:53:38 do you already have some materials about changes to be made in the upstream project to achieve the distributed goal
09:53:40 ?
09:53:54 #link https://www.openstack.org/summit/austin-2016/vote-for-speakers/presentation/7342
09:53:57 yes this is a good idea, DB consistency and DB alternatives
09:54:03 matrohon: nothing written
09:54:24 matrohon: but we know others that might be interested in this as well
09:54:30 you can review the spec of db consistency. we discussed a lot during review.
09:54:41 matrohon: .
09:54:43 gsagie, great!
09:54:57 nick-ma, I'll do that, thanks
09:55:07 okie, so lets continue talking about this and feel free to drop into our channel #openstack-dragonflow if you have any more questions
09:55:23 gsagie, ok
09:55:30 and if you guys start meeting before the summit, we would love to join
09:55:46 for the CI, i think we need to decide on our default setup first
09:56:15 as i mentioned to nick-ma in one of the reviews, i think he will also want to run CI for zookeeper
09:56:20 For the CI I feel it is very important to test the main DB drivers in use
09:56:54 gampel: i agree, so maybe after Redis is implemented we can add this as well
09:56:59 in the CI, so it seems that means zookeeper, Redis, etcd
09:57:05 yep
09:57:19 do we have a limit on the number of CI
09:57:22 jobs?
09:57:31 at least for the fullstack tests, dont know if we need to do it for tempest
09:57:47 gampel: i am not aware of such a restriction, but i dont think it will be a problem to add 2 more jobs
09:57:52 if we set up all of them, verification will take a long time.
09:57:53 yes i agree for the fullstack
09:58:07 nick-ma: i think its done in parallel so not sure about that
09:58:10 Do we want such a job only for zookeeper, redis, etcd?
09:58:15 ok
09:58:18 Or any other db driver we add?
09:58:30 fullstack only? or adding tempest api testing?
09:58:30 i think it is enough for now
09:58:34 oanson: lets decide that as we go, right now i dont see a reason to add anything else
09:58:49 RamCloud is not used yet
09:58:51 nick-ma: what do you suggest?
09:58:51 All right
09:59:17 i think fullstack is enough, as it grows it verifies the important things
09:59:29 tempest is more about the controller logic and less about the DB itself
09:59:37 priority? before or after summit?
09:59:39 maybe add some more DB specific tests to the fullstack
09:59:46 yes I agree, only fullstack and maybe rally later
10:00:00 nick-ma: zookeeper depends on you :)
10:00:05 i can add the one for Redis
10:00:07 gsagie: meta service without q-dhcp
10:00:10 :)
10:00:43 yuli_s: yeah, we need to work on this as well :)
10:00:45 ok
10:00:49 this is a feature we do not support yet
10:00:49 lets take this offline as our time is up
10:00:54 thanks everyone for attending!
10:00:59 thanks
10:01:02 and see you next week
10:01:07 #endmeeting