08:59:28 #startmeeting dragonflow
08:59:29 Meeting started Mon Mar 21 08:59:28 2016 UTC and is due to finish in 60 minutes. The chair is gsagie. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:59:30 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
08:59:32 The meeting name has been set to 'dragonflow'
08:59:37 hi
08:59:40 hi
08:59:44 Hello everyone
08:59:58 ok, we have gampel and DuanKebo for the meeting, anyone else?
09:00:13 hi
09:00:17 Hello
09:00:18 hi nick-ma
09:00:23 Good morning!
09:00:39 #info gampel, DuanKebo, nick-ma, gsagie in meeting
09:00:58 hi
09:01:02 Hi
09:01:06 hi
09:01:14 #info Shlomo_N, oanson, dingboopt in meeting as well
09:01:14 hi
09:01:24 #info vikram_ in meeting :)
09:01:34 good to see you here Vikram!
09:01:43 gsagie: thanks ;)
09:01:43 hi vikram_ welcome :)
09:01:46 ok, lets start
09:01:51 #topic redis driver
09:02:10 #link https://review.openstack.org/#/c/274340/
09:02:17 who can update on the progress there?
09:02:19 I am ok with the patch just missing devstack
09:02:21 we have any open issues?
09:02:31 install script
09:03:19 okie, DuanKebo the team is working on adding this so we can merge Redis?
09:03:23 feipeng is working on the redis script
09:03:25 any more open issues?
09:03:31 okie great
09:03:42 after that i believe we can merge it according to gampel
09:03:43 Hope he can complete it this week.
09:03:44 DuanKebo: will you add it in another patch or in this one
09:03:47 and its self contained so its good
09:04:03 we'll add it to another patch
09:04:06 We can merge it and do it in another patch as well thats a good point
09:04:15 okie, so lets review and try to merge the implementation
09:04:19 Ok so lets all try to review it today
09:04:33 #link https://review.openstack.org/#/c/286028/
09:04:47 yes, we can merge the implementation firstly
09:04:58 #action gsagie, gampel, nick-ma, oanson review Redis DB and try to merge it hopefully
09:05:07 ok
09:05:09 feipeng: good work
09:05:20 good work feipeng
09:05:25 #topic security groups and port security
09:05:32 dingboopt, please update :)
09:05:50 some import feature is added to this patch
09:05:51 #link https://review.openstack.org/#/c/280538/ security groups app
09:06:06 and fullstack test code is under development
09:06:13 will upload soon
09:06:24 ok good, i think we can merge this as well after the tests
09:06:29 its also relatively self contained
09:06:30 dingboopt, you need reserve some flow tables for qos and port security
09:06:34 yes
09:06:54 since we use constants for tables, it will be easy to just play with them
09:06:59 so no worry there
09:07:20 we can think however on how to make the "pipeline" configurable in such a way that we wont need to change the apps
09:07:23 all the time for it
09:07:29 like dingboopt had to change the L2 App
09:07:40 maybe make it a bit better, i will think of how to do this
09:07:52 anything else on this front?
09:08:01 I agree this is a good point
09:08:03 We also need to do the work for port security right?
09:08:07 yes
09:08:13 whats the status on that front ?
09:08:19 I will start to implement the plugin side
09:08:21 gsagie, it's a good idea, i like it
09:08:21 soon
09:08:24 :)
09:08:24 #action gsagie think how to model pipeline configuration without keep changing apps for table names/numbers
09:08:51 okie thanks dingboopt, will add action item to track this
09:08:57 ok
09:09:02 #action dingboopt implement port security part
09:09:15 anything on this topic from your end DuanKebo?
09:09:39 we are review sg code now
09:09:43 btw, is raofei here?
09:09:48 no suggestion now
09:09:51 ok great, thanks
09:10:17 #topic distributed DNAT
09:10:21 I just notified raofei
09:10:25 we need raofei for the DNAT status
09:10:29 yeah
09:10:37 thanks dingboopt
09:10:53 lets change topics and then come back to this
09:10:58 #topic controller reliability
09:11:09 He said he has a problem to log in irc cloud
09:11:16 gampel: you had some comments about this?
09:11:24 So he will talk to you in other time
09:11:30 raofei
09:11:33 we need to first approve the spec
09:12:02 dingboopt: ok thanks, maybe it will be best if you can ask him to send a "status" report about DNAT, if he has anything that blocks him or code is complete
09:12:09 it look in a good state to me lets try to all review it today , i think it is the last spec for this release
09:12:33 #link https://review.openstack.org/#/c/274334/ controller reliability spec
09:12:37 heshan: are you here
09:12:46 We are also testing the prototype of this patch.
09:13:34 the patch currently conflict with the L3 app, unless he already did the split
09:13:45 and find out it depends on the startup sequence of different parts
09:13:46 of the cookie id
09:13:59 heshan has sent a mail to you, gsagie.
09:14:02 and in that case he also needs to update the L3 app to work (unless we will convert it to not use the cookie id for now)
09:14:06 i am not sure what he means by local cookie and global one
09:14:26 DuanKebo: i will take a look
09:15:04 DuanKebo: do you want to talk about the startup sequence now? or you prefer we continue it after meeting
09:15:22 we can talk about it later.
09:15:30 maybe we should have a meeting with heshan to better review this patch
09:15:40 gampel, heshan is not here.
09:15:54 he can explain it to you after the meeting by email.
09:16:02 Ok
09:16:19 okie no problem, i think i see it in my head how the sequence should work and maybe i can help if i understand the questions you facing with
09:16:24 but we can continue this later
09:16:33 i will look at the email, havent seen it yet
09:17:03 Do you think we need a meeting to talk about the spec?
09:17:09 #action gsagie, gampel meeting with heshan and DuanKebo about controller reliability
09:17:17 DuanKebo: depends on you guys
09:17:30 OK
09:17:44 if you need help with it, just need to keep in mind that also needs to convert L3 application to also work
09:17:51 if you base the code on cookie id
09:17:56 yes
09:18:14 for the sequence, if you need help let me know and we can schedule a meeting
09:18:21 i will look at the email anyway today
09:18:34 ohh hi hshan :)
09:18:41 the issue is that the openflow app load first before our app
09:18:48 hi, all
09:18:53 and in some cases we could miss events
09:18:59 i am not sure what he means by local cookie and global one
09:19:10 will you please explain it to Eran?
09:19:12 gampel: which events are you talking about ?
09:19:18 okay
09:19:26 and how this depends on the controller reliability
09:19:45 openflow features and others
09:20:14 according to hshan we need to change the seq of loading our application
09:20:39 the way i see it is "simple", the controller go up, find the last ID, change the ID to new one (and every new flow configured with this ID) then do the sync process for all current data, install all flows
09:20:46 and then just deletes all the old flows with the old cookie
09:21:04 hshan: am i in the right direction?
09:21:10 global cookies are set for all flows, while local cookies are set for a group of flows.
09:21:10 for example: aging cookie is global cookie, security group cookie is local cookie.
09:21:12 yes
09:21:13 sorry
09:21:14 yes
09:21:43 please go on gsagie
09:21:48 gsagie:yes i agree i am talking about the bug he submitted https://bugs.launchpad.net/dragonflow/+bug/1558415
09:21:50 Launchpad bug 1558415 in DragonFlow "the problem of modules start sequence for dragonflow" [High,New] - Assigned to hujie (mike-hu)
09:22:47 okie, just read it now
09:22:58 local cookies are for the applications ?
09:23:17 gampel: i think so :)
09:23:22 hshan ?
09:23:35 yes
09:23:40 hshan: i would suggest changing the name
09:23:50 such as l3_proactive and security group
09:24:17 hshan: which openflow messages are you worried about?
09:24:20 that we are going to miss
09:24:40 because since we are using OVSDB monitor, the port up are not really important for us
09:25:12 actually, if ovs_monitor and 'reliability' are not merged in, there are no problem
09:25:35 hshan: ovs_monitor is already merge
09:25:51 hshan: for reliability, what problems are you afraid of ?
09:26:09 because reliability needs to do the process:'get old cookie, gen new cookie' before all other apps to flush flows
09:26:24 so we need to block the other apps
09:27:00 hshan: why the apps flush flows? we need to remove this part now i think
09:27:19 the apps needs to flush flows only when OVSDB monitor send a port up event
09:27:20 and when ovsdb_monitor is introduced, all south event will be send from ovsdb monitor
09:27:24 and then flush according to this port/tenant
09:27:26 i agree
09:27:32 triggered by packetin msg?
09:27:39 no
09:27:48 let the reliability process clean the old flows
09:27:57 ovsdb monitor is triggered by openflow feature reply
09:28:00 we currently trigger by port up in the applications, but this is something that now needs to be removed
09:28:26 because we need to make sure the datapath is ready, when reliability do the sync work
09:28:56 we need to agree that we will use only one mechanism OVSDB and remove the openflow port up flow
09:29:07 I agree
09:29:09 hshan, DuanKebo: i agree i think the problem right now is because we didnt yet remove the part in the applications that listen to openflow port up events and call all the apps
09:29:18 but with the OVSDB monitor we dont need this part
09:29:36 yes
09:29:57 but when the controller is up, the port events from OVSDB are waiting in the queue, so you can do everything before the controller start reading the events
09:30:08 so you can change the cookie first
09:30:21 right?
09:30:24 but the packetin msg may also matter
09:30:30 a little different
09:30:54 DuanKebo: yes i agree
09:31:10 DuanKebo: but currently only for DHCP
09:31:21 i think we should ignore packet in until we have the port info
09:31:21 gampel: DHCP doesnt install flows reactively right?
09:31:22 yes
09:31:27 we plan not to send the port event to the event queue until reliability is done
09:31:36 No only on port and subnet update
09:31:43 so long as we don't use reactive L3
09:31:45 okie, so we currently have no problem
09:32:04 DuanKebo: yep, but if changing the order is simple then maybe you should do it so we dont block ourself
09:32:05 but i am now adding DOS attack prevention that will install block
09:32:22 is there a problem you see with changing the order?
09:32:28 of startup
09:32:43 maybe the easier thing would be to not read the last cookie
09:32:45 I do not see a problem but we should test
09:32:45 from OVS
09:32:52 maybe we can save it in the chassis table instead
09:33:01 and read it on startup and then we have less problems
09:33:16 Or save it locally in the node, e.g. a file
09:33:18 and no need for "CANARY" flow to get it
09:33:48 i think so
09:33:51 but either way i dont see a problem, we can just run CLI and read it either way (i prefer we dont but..)
09:33:57 gsagie: that will introduce another problem, we have to keep the consistency between DB and OVS
09:34:01 heshan can think about gal's advice.
09:34:29 okie
09:34:32 yes, I had thought about that
09:34:35 gampel: is there any spec about the DOS attack prevetion?
09:34:43 not yet
09:34:44 *prevention
09:35:41 hshan: lets keep talking after meeting if you need, i dont see a consistent problem with saving it in the DB, only the local controller write/read this data but maybe i miss something, and local file is also a possibility but lets continue after this meeting
09:35:50 can we save it in OVSDB ?
09:35:57 and i think we can come up with a good sequence
09:36:22 gampel: thats possible
09:36:36 gampel: hope we dont have to change schema for that, but i think there are optional params
09:36:43 that we can use in the bridge table
09:36:44 for example
09:36:47 for br-int
09:36:48 one more thing
09:37:01 then we do not have the consistency problem
09:37:23 gampel: there is no consistency problem either way (i dont see it)
09:37:38 its a cookie per controller
09:37:44 and only the controller read/write it
09:37:59 DuanKebo: yes?
09:38:04 on receiving feature reply message, apps will install some default flow entries. So we need to generate the new cookie before app installing these flow entries.
09:38:23 DuanKebo: good point
09:38:38 otherwise it will be cleared.
09:38:46 DuanKebo: but we can install these flows on demand instead
09:38:54 move them from the place they are now
09:39:00 Duankebo: those cookies should be after reliability sync
09:39:03 and just call this function explicitly
09:39:10 yes, then we need to modify apps
09:39:22 DuanKebo: yep agreed on this part, good point
09:40:07 okie, lets continue after the meeting but i think we have solutions
09:40:12 those framework flow cookies should be trigger by ovsdb_monitor too
09:40:13 lets think about this a bit more
09:40:19 I think that we should take this feature to another specific meeting about this
09:40:28 ok
09:40:31 okay
09:40:36 hshan: yes, we can trigger it by controller
09:40:44 #topic selective proactive and ovsdb monitor
09:40:54 ok, not a lot to talk about as this is merged already to master
09:40:59 gampel: any points on that?
09:41:12 still need to work on some exceptions and test code
09:41:13 DuanKebo: i still see some exceptions
09:41:31 I'll add more ut cases and fullstack cases for this feature.
09:41:38 and we need to test it some more all the possible flows
09:41:57 Will you please send them to me gampel.
09:41:58 shlomo: did you open a bug about the exceptions ?
09:42:09 Or tell me where you see them.
09:42:16 yeah, we need to add bugs and triage them to DuanKebo
09:42:18 Ok will do
09:42:19 if we see on this
09:42:46 Everyone can join Dragonflow launchpad and join the bugs team to triage and edit bugs
09:42:52 Shlomo_N will be back in a minute
09:43:00 #topic bugs
09:43:15 yuli_s, oanson: got something for us on this? :)
09:43:18 I am the bug master today
09:43:33 i'm here
09:43:39 sure
09:43:45 I have reviewed opened unassigned bugs
09:44:06 we need a somebody to take charge for the RethinkDB related bugs
09:44:13 we have the gate kernel version problem with the ovs compile
09:44:17 i will take it
09:44:32 i will take the RethinkDB
09:44:32 https://bugs.launchpad.net/dragonflow/+bug/1527217
09:44:34 Launchpad bug 1527217 in DragonFlow "RethinkDB installation only works in Ubuntu" [Medium,New]
09:44:34 https://bugs.launchpad.net/dragonflow/+bug/1527970
09:44:35 https://bugs.launchpad.net/dragonflow/+bug/1530877
09:44:36 https://bugs.launchpad.net/dragonflow/+bug/1530288
09:44:36 Launchpad bug 1527970 in DragonFlow "rejoin-stack doesn't support RethinkDB" [Medium,In progress]
09:44:37 Launchpad bug 1530877 in DragonFlow "RethinkDB and RAMCloud installations as a service" [Medium,New] - Assigned to Eran Gampel (eran-gampel)
09:44:38 Launchpad bug 1530288 in DragonFlow "RethinkDB isn't deleting DB tables/entries after unstack" [Low,New]
09:44:55 we have few more interesting bugs
09:44:57 okie, RethinkDB or lower priority right now
09:45:04 stack.sh isn't working after using VxLAN https://bugs.launchpad.net/dragonflow/+bug/1555001
09:45:05 Launchpad bug 1555001 in DragonFlow "stack.sh isn't working after using VxLAN" [Medium,New]
09:45:12 It's not possible to assign a FIP for a VM on the default private network https://bugs.launchpad.net/dragonflow/+bug/1548725
09:45:13 Launchpad bug 1548725 in DragonFlow "It's not possible to assign a FIP for a VM on the default private network" [High,New]
09:45:24 VM connected to two different networks - https://bugs.launchpad.net/dragonflow/+bug/1557412
09:45:25 Launchpad bug 1557412 in DragonFlow "VM connected to two different networks" [High,New]
09:45:31 yes, i have to test everything locally. is there any temp solution for ovs compilation error?
09:45:47 omer was working on that
09:45:57 nick-ma: i think we can bring the code that install OVS
09:46:02 back to plugin.sh
09:46:12 it will be great if you can look at the bugs
09:46:14 instead of using Neutron compile_ovs code
09:46:31 gsagie, that won't help
09:46:37 oanson: why not?
09:46:38 omer any way is moving that code to create ovs services
09:46:44 The compilation is broken due to the kernel bump to 3.13.0-83
09:46:54 It worked on the previous version.
09:46:59 (3.13.0-79)
09:47:03 oanson: so you cant compile OVS on that kernel ?
09:47:05 yes, it's the image problem in infra.
09:47:06 can we install rpm that we make instead of compile
09:47:08 Yes
09:47:41 great
09:47:46 maybe we need to send this to OVS mailing list
09:47:46 gampel, we can try. But there's a danger that the API difference may break things
09:47:53 i think its really strange
09:47:59 gsagie, already done.
09:48:08 maybe we should take another version of OVS
09:48:08 i checked ovs. someone did that. still no reply from them.
09:48:08 http://openvswitch.org/pipermail/dev/2016-March/067987.html
09:48:15 maybe from another branch instead of using the master OVS
09:48:20 there is a OVS 2.5 branch
09:48:25 tag
09:49:18 we can try it.
09:49:32 we have another open issue with the IP spoofing
09:49:43 Gal has created a spec
09:49:50 and we have a bug for it
09:49:53 yuli_s: yes, dingboopt is working on it
09:49:57 https://bugs.launchpad.net/dragonflow/+bug/1536868
09:49:58 Launchpad bug 1536868 in DragonFlow "Dragonflow should prevent IP spoof" [Low,New]
09:50:12 ok
09:50:30 shlomo: did you open a BUG about the exceptions you got ?
09:50:53 yes, sure
09:51:11 please share
09:51:26 but I have closed it
09:51:47 Once I have ran git pull, it was disappeared
09:51:56 ok
09:52:25 ok, i think we are going to enter an intensive testing phase soon
09:52:29 after everything gets merged
09:52:35 #topic open discussion
09:52:41 @omer whats the status of moving the OVS daemons into services ?
09:52:46 nick-ma: anything to share? i saw the DB consistency got merged
09:53:07 yes, i'm already in the intensive testing phase, :-)
09:53:07 gampel, I am testing the modifications in the review
09:53:18 nick-ma: cool, good job :)
09:53:26 Once they pass, I will update the patch
09:53:27 and submit several reviews for some small bugs.
09:53:30 please review them.
09:53:31 :) i think we should all go into that mode
09:53:37 nick-ma, sometimes devstack can't startup
09:53:55 due to table dflockedobjects not found
09:54:18 weird.
09:54:27 i did not get it
09:54:47 I didn't get it as well
09:55:00 Neither did I.
09:55:41 i want to talk about degradation test & rally
09:55:47 sure
09:55:53 Duankebo: maybe you can describe it in detail via email or submit a bug for it, i will investigate it.
09:55:54 DuanKebo: maybe send logs to nick-ma
09:56:03 yes.
09:56:07 OK
09:56:11 the devstacklog
09:56:17 I have the stack trace
09:56:40 yuli_s: we have 1-2 minutes, anything new?
09:56:45 the degradation bugs was fixed in neutron: https://review.openstack.org/#/c/293976/
09:56:52 hope it will be merged soon
09:57:04 good job yuli
09:57:11 I opened a bug / suggestion in rally
09:57:17 to add degradation tests
09:57:20 https://bugs.launchpad.net/rally/+bug/1558416
09:57:20 Launchpad bug 1558416 in Rally "Wishlist: performance degradation test" [Undecided,New]
09:57:43 hope they will pick it up and build an infrastructure
09:57:51 to run the degradation tests in rally
09:58:11 yuli_s: about rally test, i replied the email to you. do you need me to work on the rally plugin? or do you have other plans?
09:58:49 nick-ma, lets wait few days
09:59:04 yuli_s, sure.
09:59:06 otherwise I will take it
09:59:22 and probably with your recommendation build something
09:59:42 thanks everyone
10:00:01 thanks everyone and see you all next week :)
10:00:09 thanks all, see you.
10:00:10 lets continue other discussions in #openstack-dragonflow
10:00:11 thanks !
10:00:13 #endmeeting
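
Note on the controller reliability topic: the restart sequence gsagie outlined at 09:20:39 (find the last cookie ID, switch to a new one, sync and install all current flows, then delete the flows that still carry the old cookie) can be illustrated with a small simulation. This is only a rough sketch under stated assumptions, not Dragonflow code: FakeSwitch, Flow and restart_sync are hypothetical names, the flow table is a plain in-memory set rather than OVS, and picking the new cookie as old + 1 is an arbitrary choice made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    match: str   # simplified match description, e.g. "in_port=1"
    cookie: int  # generation ID stamped on the flow


@dataclass
class FakeSwitch:
    """Hypothetical in-memory stand-in for the switch flow table."""
    flows: set = field(default_factory=set)

    def install(self, match, cookie):
        # Re-installing an existing match replaces the older copy.
        self.flows = {f for f in self.flows if f.match != match}
        self.flows.add(Flow(match, cookie))

    def delete_by_cookie(self, cookie):
        self.flows = {f for f in self.flows if f.cookie != cookie}


def restart_sync(switch, desired_matches, old_cookie):
    """Rotate the cookie, reinstall current state, then drop stale flows."""
    new_cookie = old_cookie + 1          # assumption: any value != old works
    for match in desired_matches:        # sync step: install all current flows
        switch.install(match, new_cookie)
    switch.delete_by_cookie(old_cookie)  # whatever was not refreshed is stale
    return new_cookie


switch = FakeSwitch()
switch.install("in_port=1", cookie=7)
switch.install("in_port=2", cookie=7)  # port 2 went away while the controller was down

new_cookie = restart_sync(switch, ["in_port=1", "in_port=3"], old_cookie=7)
remaining = sorted((f.match, f.cookie) for f in switch.flows)
print(new_cookie, remaining)
```

In this toy model the stale in_port=2 flow is removed because it was never re-stamped. It also shows DuanKebo's point from 09:38:04: anything installed with the old cookie before the rotation, such as default flows set up on the feature reply, would be deleted in the last step, so the new cookie has to be generated first.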