14:04:01 #startmeeting kuryr
14:04:02 Meeting started Mon Nov 6 14:04:01 2017 UTC and is due to finish in 60 minutes. The chair is irenab. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:05 The meeting name has been set to 'kuryr'
14:04:14 o/
14:04:22 o/
14:04:36 hi guys
14:04:54 anyone else for the weekly?
14:04:59 o/
14:05:52 o/
14:06:07 I guess dmellado and apuimedo won't join since they are at the OS summit
14:06:25 #topic kuryr
14:07:06 Anyone have updates related to general kuryr?
14:07:43 moving on then
14:07:54 #topic kuryr-libnetwork
14:08:22 anything to discuss related to kuryr-libnetwork?
14:08:56 moving on
14:09:06 #topic kuryr-kubernetes
14:09:43 Shall I start with the CNI daemon status?
14:09:50 dulek, yes, please
14:10:14 https://review.openstack.org/#/c/515186/ - the initial patch has +2 from apuimedo and I think irenab is testing it.
14:10:37 I'm working on support for running kuryr-daemon containerized. I'll be pushing a patch soon.
14:10:40 dulek, correct, will finalize it today
14:10:46 And I'll need to update and rebase the documentation patch.
14:11:33 And that will be it! I'll need to do a few lower-priority follow-up patches that fix corner cases and bugs that are now becoming visible when using the CNI daemon.
14:11:42 dulek, any plans about the gate for the CNI split?
14:12:08 irenab: That's a very good question. How about I try to fix our tempest gates first?
14:12:16 Currently those are constantly failing.
14:12:26 dulek, :-), totally agree on priorities
14:12:41 Once we have the gates functional it'll be easier to add it. :)
14:13:01 but we need to have the gate to make sure it's stable enough to switch the CNI split to default in devstack
14:13:20 irenab: I totally agree, thank you for reminding me of that.
14:13:39 and then probably deprecate the original support
14:14:01 dulek, thank you for the update
14:14:14 anyone who can test the patch, please do so
14:14:24 #link https://review.openstack.org/#/c/515186/
14:14:38 sure, I'm happy to test the follow-up one (mixing containerized and split)
14:14:48 I've already tested the CNI split and it works fine!
14:14:56 ltomasbo, perfect
14:15:21 ltomasbo, any update on the stuff you are working on?
14:15:29 this is ready for reviews: https://review.openstack.org/#/c/510157/
14:16:01 and I'm working on an OOM problem at ODL when using it with kuryr
14:16:17 OOM?
14:16:23 out of memory
14:16:36 on the ODL or kuryr side?
14:16:49 in ODL mostly
14:17:03 but deploying with devstack increases the chances
14:17:18 as the Java memory is limited to 512MB (instead of 2GB)
14:17:35 it should be fine from the kuryr side
14:18:04 I'm also digging into some problems (possibly on the kuryr side, but more probably in docker/kubernetes)
14:18:13 regarding containers taking a long time to boot up
14:18:32 #action irenab apuimedo review the patch https://review.openstack.org/#/c/510157/
14:18:39 a long time until the first one is up, in the nested case
14:18:52 ltomasbo, including the image load, or when the image is local?
14:18:59 but I still didn't find the culprit (though I know I'm affected by a couple of bugs)
14:19:10 irenab, once the image is already there
14:19:18 and the ports are present in the pool
14:19:35 so, it should be faster
14:19:49 and it takes more than a minute for the first container when booting 100 at once
14:19:58 interesting, any idea where the time is spent?
14:20:22 digging a bit, it seems I hit this: https://bugzilla.redhat.com/show_bug.cgi?id=1425278
14:20:23 bugzilla.redhat.com bug 1425278 in docker "'SELinux: mount invalid. Same superblock, different security settings for (dev mqueue, type mqueue)' error message in logs" [Urgent,New] - Assigned to dwalsh
14:20:24 I wonder if the CNI split may improve it, or if it is on the controller side
14:20:35 and this https://bugzilla.redhat.com/show_bug.cgi?id=1267291
14:20:35 bugzilla.redhat.com bug 1267291 in openvswitch "[Openvswitch] balance-tcp bond mode causing issues with OSP Deployments" [High,Closed: currentrelease] - Assigned to nyechiel
14:21:01 and I disabled os_vif.plug to test if that was also adding some time, but it was not
14:21:53 ltomasbo: Ah, commenting out os_vif.plug created an issue for me in OVS in the baremetal case.
14:22:08 ltomasbo, please report the issue as a kuryr bug
14:22:22 dulek, not an issue for the nested case
14:22:30 as the plug basically does a 'pass'
14:22:40 I removed it just to avoid the privsep thing
14:22:47 but it is not helping, so I set it back
14:22:51 ltomasbo: Okay, I would need to dig more to understand that. :P
14:23:14 irenab, I'm not sure it is a kuryr bug, I need to dig a bit more to figure out what to report...
14:23:32 irenab, and the OOM came up on my way while debugging...
14:23:55 as soon as I understand a bit more about the issue, I'll open a bug!
14:24:00 ltomasbo, I wonder if this happens only for the bulk case or also when spawning a single or, let's say, 2 Pods
14:24:22 irenab, it is somehow proportional to the amount of pods being created
14:24:32 I have 3 worker VMs
14:24:49 and if I create 3 containers (one on each VM) it takes around 5-8 seconds to start the first one
14:25:03 if I create 30, it takes around 20-30 seconds to start the first one
14:25:13 and if it is 100, it takes around 70 seconds
14:25:27 so, my bet is on something we do for each container
14:25:42 and are you sure it's on the kuryr side?
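As context for the ODL out-of-memory issue ltomasbo describes above (devstack leaves the Java heap at 512MB instead of 2GB): OpenDaylight runs inside an Apache Karaf container, whose heap limits can be raised through Karaf's standard `setenv` variables. A minimal sketch, assuming a devstack-style install; the install path is illustrative, only `JAVA_MIN_MEM`/`JAVA_MAX_MEM` are standard Karaf names:

```shell
# Raise the OpenDaylight (Karaf) JVM heap from the devstack default of
# 512MB to 2GB. The /opt/stack/opendaylight path is an assumption;
# adjust it to wherever the ODL distribution was unpacked.
cat >> /opt/stack/opendaylight/bin/setenv <<'EOF'
export JAVA_MIN_MEM=512m   # initial heap (-Xms)
export JAVA_MAX_MEM=2048m  # maximum heap (-Xmx)
EOF

# Restart Karaf so the new limits take effect.
/opt/stack/opendaylight/bin/stop
/opt/stack/opendaylight/bin/start
```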
14:25:54 but I'm not sure if it is at the controller (getting the subnet information) or on the CNI side
14:26:06 irenab, I'm not sure about that
14:26:13 it may not even be on the kuryr side
14:26:41 I'll dig more during this week and let you know if I find it
14:26:44 I wonder if there is some scale impact in the case of native k8s
14:26:50 ltomasbo, thanks!
14:27:18 irenab, it could be on k8s, yes
14:27:38 but we haven't seen that in the scale testing we did a couple of months ago
14:27:52 #action ltomasbo to investigate the case with a large number of containers and update on findings
14:27:54 perhaps it is related to the OOM that I was hitting
14:28:24 the scale test was with OVS and you see the issue with ODL?
14:28:29 so, it may well be ODL
14:28:30 ODL
14:28:42 the scale test was done with OVN
14:28:51 I will try to see if I can run a similar test with Dragonflow
14:28:52 and I'm doing it with ODL
14:29:04 irenab, it would be great to test that
14:29:32 #action irenab try to run a scale test for kuryr+dragonflow, nested
14:29:44 I can help you recreate my env if you need help (it was a devstack-based multinode deployment)
14:29:58 with 4 VMs (1 master + 3 workers)
14:30:22 ltomasbo, would appreciate your help. I guess you have some heat stack for that, right?
14:30:41 yep, I'm using kuryr_heat_pike to create the VMs
14:31:07 ltomasbo, I will sync with you offline to get the details
14:31:10 and then an ansible-based script to install openshift on top of the VMs
14:31:12 sure!
14:31:13 thanks!
14:31:28 ltomasbo, thank you for the update
14:31:40 that's all from my side
14:31:49 leyal, would you like to update regarding the network policy progress?
14:32:02 yes
14:32:26 please go ahead
14:32:38 I created (with a lot of help from irenab) a draft of the detailed design for supporting network-policy, and would be happy for reviews on that.
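The boot-time numbers ltomasbo quotes (5-8s for 3 pods, 20-30s for 30, ~70s for 100) amount to measuring the gap between batch creation and the first pod reaching Running. A small sketch of that measurement, assuming the timestamps have already been pulled from the Kubernetes API; the function name is hypothetical, not part of kuryr:

```python
from datetime import datetime

ISO = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format used by the Kubernetes API

def time_to_first_pod(created_at, running_at):
    """Seconds between the earliest pod creation and the first pod
    reaching Running, given lists of ISO-8601 timestamp strings."""
    start = min(datetime.strptime(t, ISO) for t in created_at)
    first_up = min(datetime.strptime(t, ISO) for t in running_at)
    return (first_up - start).total_seconds()

# Example mirroring the 3-pod case from the discussion: the first
# pod comes up ~6 seconds after the batch was created.
created = ["2017-11-06T14:24:00Z"] * 3
running = ["2017-11-06T14:24:06Z",
           "2017-11-06T14:24:07Z",
           "2017-11-06T14:24:09Z"]
print(time_to_first_pod(created, running))  # → 6.0
```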
14:32:55 https://docs.google.com/document/d/1GShzI4DemoraZdjnpZe9ug1GI9xgl3JcIyjnllTtQN4/edit?usp=sharing
14:33:07 #link https://docs.google.com/document/d/1GShzI4DemoraZdjnpZe9ug1GI9xgl3JcIyjnllTtQN4/edit?usp=sharing
14:33:39 leyal, any specific issues/questions you would like to discuss now?
14:33:58 Hope to upload a patch with the spec soon.
14:34:49 Let's discuss in the draft/spec (when it's ready).
14:35:02 great! I'll read it and try to provide some feedback
14:35:28 the gdoc has very detailed information regarding the Network Policy support, so anyone who has some spare cycles please take a look before leyal uploads the rst
14:35:51 ltomasbo, thanks!
14:35:53 ltomasbo, thanks
14:36:24 anyone else on kuryr-kubernetes topics?
14:36:59 I can update about my progress with the openshift route
14:37:14 yboaron, go ahead
14:38:12 I started to work on integrating openshift route support with kuryr-k8s, I will share a design doc for review in the next few days
14:38:42 yboaron, is the openshift route like an Ingress controller, or something else?
14:38:57 irenab, right
14:39:35 yboaron, is there any launchpad bp for this?
14:40:30 I'll open one; at a very high level, kuryr should translate route objects into lbaas L7-policy/pool resources
14:41:10 yboaron, great, looking forward to seeing the details
14:41:29 that's it, I will open a bp and will share a design doc soon
14:41:57 I plan to fix https://bugs.launchpad.net/kuryr-kubernetes/+bug/1723938
14:41:58 Launchpad bug 1723938 in kuryr-kubernetes "Cannot access service of LoadBalancer type" [High,New] - Assigned to Irena Berezovsky (irenab)
14:42:25 hope to get it fixed by next week
14:42:25 irenab, is that just a security group configuration?
14:43:21 yes, but it seems it needs to be done upon service creation and not in advance as with the other security group configuration
14:44:19 the fix should be quite trivial. And the funny thing is that it works without the fix with the reference neutron implementation
14:44:27 ohh, true
14:44:38 now I remember
14:44:59 did you find out why it works with default ml2/ovs?
14:45:03 is it a bug?
14:45:11 irenab, same solution for ha-proxy and octavia?
14:45:52 octavia sets proper SGs, so the additional SG configuration will be required only for HA Proxy
14:46:37 ltomasbo, I think I checked, but do not remember...
14:46:49 xD
14:47:00 same here... maybe you even already mentioned it on the kuryr channel...
14:47:01 the issue is only when a FIP is assigned for the VIP
14:47:23 ltomasbo, I will check, maybe the details are saved :-)
14:47:56 anything else for k8s support?
14:48:38 #topic open discussion
14:49:29 Well, looks like all of us are pretty occupied with k8s support :-)
14:50:16 xD
14:50:23 if no one has a topic to discuss, I think we can close the meeting
14:50:50 thanks everyone for joining
14:51:02 #endmeeting
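As a rough illustration of the route-to-LBaaS mapping yboaron outlines (a Route's host/path become L7 match rules on a policy that redirects traffic to a pool): the sketch below is hypothetical, not kuryr's actual design; only the rule/action constants (`HOST_NAME`, `PATH`, `REDIRECT_TO_POOL`, etc.) follow the neutron-lbaas/octavia L7 API, everything else is illustrative.

```python
def route_to_l7(route, pool_id):
    """Translate an OpenShift Route-like dict into an LBaaS v2
    L7 policy plus its match rules (hypothetical sketch)."""
    spec = route["spec"]
    policy = {
        "name": route["metadata"]["name"],
        "action": "REDIRECT_TO_POOL",  # send matching requests to the pool
        "redirect_pool_id": pool_id,
    }
    # A Route always carries a host; match it exactly.
    rules = [{"type": "HOST_NAME", "compare_type": "EQUAL_TO",
              "value": spec["host"]}]
    # Path-based routing is optional in a Route.
    if spec.get("path"):
        rules.append({"type": "PATH", "compare_type": "STARTS_WITH",
                      "value": spec["path"]})
    return policy, rules

route = {"metadata": {"name": "my-route"},
         "spec": {"host": "www.example.com", "path": "/api"}}
policy, rules = route_to_l7(route, "pool-uuid")
print(policy["action"], len(rules))  # → REDIRECT_TO_POOL 2
```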