14:07:39 <irenab> #startmeeting kuryr
14:07:40 <openstack> Meeting started Mon Nov 13 14:07:39 2017 UTC and is due to finish in 60 minutes. The chair is irenab. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:07:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:07:43 <openstack> The meeting name has been set to 'kuryr'
14:07:57 <irenab> hi, who is here for kuryr meeting?
14:08:19 <yboaron> Hi
14:09:31 <dulek> o/
14:09:43 <irenab> yboaron, dulek hi
14:09:44 <leyal> o/
14:09:53 <ltomasbo> o/
14:10:28 <irenab> #info yboaron dulek leyal ltomasbo irenab in a meeting
14:10:36 <irenab> let's start
14:11:04 <irenab> apuimedo asked me to chair the meeting today, he is sick and still recovering from the trip to OS summit
14:11:39 <irenab> I suggest we skip kuryr and kuryr-libnetwork topics, unless you have something to update
14:12:16 <irenab> #topic kuryr-kubernetes
14:12:35 <irenab> dulek: do you want to update regarding CNI split adventure?
14:12:57 <dulek> irenab: Sure!
14:13:15 <dulek> Sooo… ltomasbo found bugs that I was fixing most of the last week.
14:13:45 <dulek> Patches are ready again, bugs were related to an incorrect assumption we had in Kuryr, that's why it took me the whole week.
14:13:46 <ltomasbo> well, I would say I hit the bug, but dulek found it
14:14:28 <dulek> So this is the main patch: https://review.openstack.org/#/c/515186/, it's dependent on one more bugfix.
14:14:28 <irenab> dulek: the missing container_id check?
14:15:03 <dulek> irenab: Yes, but also some pyroute2 internal timeout issues.
14:15:23 <dulek> I hope apuimedo will be well tomorrow and we'll be able to start merging the patches. :)
14:15:26 <irenab> dulek: please share the link
14:15:46 <irenab> #link https://review.openstack.org/#/c/515186/
14:16:06 <irenab> #action apuimedo irenab to review https://review.openstack.org/#/c/515186/
14:16:33 <dulek> irenab: Hm, I don't have a bug created for this pyroute2 issue as I think it only manifests in the CNI daemon case - so the fix for this one is included. ;)
14:17:01 <irenab> dulek: so the same patch, right?
14:17:05 <dulek> Yes.
14:17:11 <irenab> great
14:17:37 <irenab> good that you and ltomasbo ran the scale scenario
14:17:46 <dulek> irenab: https://review.openstack.org/#/c/515186/11/kuryr_kubernetes/cni/daemon/service.py@194 - if you're interested in what the fix looks like.
14:18:09 <dulek> Yeah, thanks ltomasbo for running the tests!
14:18:24 <ltomasbo> np, glad to help!
14:18:36 <irenab> dulek: so the fix is to increase the timeout?
14:18:50 <ltomasbo> and it was not that big scale, just a bunch of pods in a single server (around 20-30)
14:19:29 <dulek> irenab: More like to make it configurable. My kernel on a VM was just not fast enough to complete all the pyroute2 operations in the default 5 seconds.
14:19:52 <irenab> I wonder if the timeout shouldn’t be dynamically adjustable
14:20:13 <dulek> irenab: In case a timeout is hit - kubelet will retry, so with bug 1731485 fixed we'll be fine anyway.
14:20:13 <openstack> bug 1731485 in kuryr-kubernetes "Kuryr ignores CNI_CONTAINERID when serving requests" [High,In progress] https://launchpad.net/bugs/1731485 - Assigned to Michal Dulko (michal-dulko-f)
14:20:21 <irenab> dulek: probably need to add some advisory on how to set up the timeout
14:21:06 <dulek> irenab: To be honest - I don't really know how to do that myself. I was just trying a few values, 30 seconds were enough for ~50 pods being created.
14:21:29 <dulek> irenab: Plus this timeout doesn't mean we'll always be waiting that long. It's just the maximum wait time.
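dulek's point at 14:21:29, that a timeout is only an upper bound on the wait, can be illustrated with a small polling sketch. This is not code from the patch under review or from pyroute2: `wait_until`, its parameters, and the simulated 50 ms delay are all hypothetical names and values chosen for illustration.

```python
import time


def wait_until(condition, timeout=30.0, interval=0.01):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Hypothetical helper, not Kuryr or pyroute2 code. `timeout` is only an
    upper bound: the call returns as soon as the condition holds.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f s" % timeout)
        time.sleep(interval)


# Simulate an operation that completes after roughly 50 ms.
ready_at = time.monotonic() + 0.05
start = time.monotonic()
wait_until(lambda: time.monotonic() >= ready_at, timeout=30.0)
elapsed = time.monotonic() - start
print("waited %.2f s of a 30 s budget" % elapsed)
```

With a pattern like this, raising the limit from 5 to 30 seconds costs nothing on a fast host, and only a slow host ever uses the extra budget.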
14:22:10 <irenab> dulek: I am just not sure we should expose some operator facing configs if we are not sure how to set them …
14:22:51 <dulek> irenab: I think I can add some documentation describing relationships between timeouts as we have multiple options now.
14:23:26 <irenab> good enough, let's keep a reasonable default and document the relationship to enable tuning
14:23:37 <dulek> Will do!
14:23:46 <irenab> dulek: thanks a lot
14:24:30 <irenab> #action dulek to document the CNI daemon config timeout
14:25:01 <irenab> dulek: anything else you would like to raise?
14:25:11 <dulek> Nope, that's it, thanks!
14:25:40 <irenab> ltomasbo: anything from your side?
14:25:56 <ltomasbo> I have a patch about adding a readiness probe to the kuryr controller
14:26:03 <ltomasbo> when running in containerized mode
14:26:15 <ltomasbo> https://review.openstack.org/#/c/518502/
14:26:30 <ltomasbo> dulek, already provided some reviews
14:26:43 <irenab> ltomasbo: this should prevent the kuryr controller pod from being considered active till pools are populated?
14:26:53 <ltomasbo> it basically makes the controller not ready until the precreated ports have been loaded
14:27:09 <irenab> #link https://review.openstack.org/#/c/518502/
14:27:28 <ltomasbo> irenab, yep, until the existing ports are loaded into their respective pools
14:27:41 <irenab> ltomasbo: any issue you want to discuss?
14:27:57 <irenab> #action everyone please review https://review.openstack.org/#/c/518502/
14:28:06 <ltomasbo> I tested it and seems to work find
14:28:11 <ltomasbo> *fine
14:28:45 <dulek> ltomasbo: One question… Besides pod being not ready - does it affect if it accepts requests?
14:28:56 <irenab> ltomasbo: I wonder if there is supposed to be a single readiness probe per container or can there be a few?
14:29:35 <ltomasbo> irenab, that I don't know, but probably there may be more than one, need to check that
14:29:49 <ltomasbo> irenab, are you thinking of another test to be added?
14:30:17 <irenab> ltomasbo: yes, maybe one will be needed once Network Policies are supported
14:30:29 <ltomasbo> dulek, I assume it should not accept requests, but didn't check actually
14:30:50 <irenab> dulek: this is probably up to k8s to manage
14:31:17 <ltomasbo> irenab, worst case we can create a script to be executed that returns a given value, and that could include several checks
14:31:20 <dulek> ltomasbo: If it's before Watcher is started, then it won't.
14:31:30 <irenab> ltomasbo: +1
14:32:05 <ltomasbo> dulek, good question, I will double check that
14:32:07 <dulek> irenab: k8s can manage that if e.g. the Pod is added to a Service. But kuryr-controller has no API.
14:32:48 <irenab> there is the one for the tool to populate pools
14:32:50 <dulek> ltomasbo: It would be good to block annotating VIFs until we have all info recovered. But that's most likely already done. :)
14:33:24 <irenab> dulek: do you suggest internal controller state of being active?
14:33:46 <irenab> maybe a good idea
14:34:00 <ltomasbo> dulek, that was the intention with the readiness probe, but I just added the check, so probably not at the right place... :/
14:34:12 <dulek> irenab: Yes, or more specifically - not to start any Watcher before all ports are recovered.
14:34:29 <yboaron> ltomasbo, is it probe per pod/resource or a global one ?
14:34:33 <dulek> ltomasbo: Yeah, so if that was the intention, then k8s will not manage that for you on its own.
14:34:35 <yboaron> resource
14:34:56 <irenab> dulek: yes, moving to Active will ‘open’ the controller to external world
14:35:43 <irenab> yboaron: Pod (which is k8s Controller container)
14:35:51 <ltomasbo> irenab, dulek: I thought that was managed by kubernetes, if it was not ready, it will not receive requests
14:35:57 <dulek> irenab: Oh? So until it's ready the pod will have no network connectivity? Then it will be unable to complete the recovery.
14:36:04 <ltomasbo> yboaron, yep, the kuryr-controller pod
14:36:19 <dulek> ltomasbo: Yes, but what requests? Controller watches on k8s API, doesn't do much more.
14:36:48 <ltomasbo> dulek, controller calls neutron to get/create the ports
14:37:02 <irenab> I think it can still work, but will allocate the ports via neutron
14:37:42 <dulek> ltomasbo: Sure thing, but how can k8s block that? Let me try to find this in k8s docs.
14:38:02 <irenab> ltomasbo: what was your intention with the readiness probe?
14:38:16 <ltomasbo> dulek, irenab: "A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services."
14:38:48 <dulek> ltomasbo: Exactly. And we're not using Services for kuryr-controller, as it doesn't have an API.
14:39:12 <irenab> ltomasbo: so what is it supposed to serve?
14:39:14 <dulek> ltomasbo: Where by Service I mean this: https://kubernetes.io/docs/concepts/services-networking/service/ - it's just a load balancer.
14:39:25 <ltomasbo> dulek, aren't we?
14:39:45 <ltomasbo> don't we have a service (lbaas) for the K8S API?
14:39:48 <irenab> ltomasbo: but k8s controller just watches the events, it does not serve any API requests
14:40:06 <dulek> irenab: Right! :)
14:40:39 <irenab> ltomasbo: I wonder if it is related to any further work of adding HA to the k8s controller
14:40:53 <ltomasbo> dulek, irenab: yep, but if kubernetes sets the pod as not ready, it will not perform the API actions regarding the kuryr-controller (the needed annotations)
14:41:35 <dulek> ltomasbo: Can you check that? If it's true, then I'm totally wrong and should apologize.
14:41:59 <ltomasbo> no, most probably I'm wrong, I did not check that
14:42:09 <ltomasbo> I just assumed kubernetes was taking care of that
14:42:16 <ltomasbo> I will double check
14:42:25 <ltomasbo> thanks for pointing that out!
14:42:33 <irenab> ltomasbo: so you suggest that k8s will ignore the changes applied by k8s controller on k8s API ?
14:42:44 <irenab> till it is ready?
14:43:09 <ltomasbo> that is what I understood from this: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes
14:43:27 <irenab> #action ltomasbo check and update regarding readiness probe for k8s controller
14:44:07 <irenab> ltomasbo: thanks, let's follow up on this after the check
14:44:31 <irenab> leyal: do you want to update regarding Network Policy support?
14:44:42 <leyal> yes ,
14:44:50 <leyal> I uploaded a spec -
14:45:00 <irenab> link?
14:45:02 <leyal> https://review.openstack.org/#/c/519319/
14:45:18 <irenab> #link https://review.openstack.org/#/c/519319/ Network Policy Spec
14:46:00 <irenab> everyone please review the spec
14:46:22 <irenab> leyal: anything you want to raise now?
14:47:05 <leyal> let's do the discussion on the patch ..
14:47:22 <irenab> sure, thank you for the update
14:47:42 <leyal> it contains a lot of details..
14:48:00 <irenab> anything else on kuryr-kubernetes?
14:49:21 <irenab> #topic open discussion
14:49:47 <irenab> any other issue/topic to discuss?
14:50:21 <irenab> I hope next time we will have dmellado and apuimedo to share the inputs from the summit
14:51:10 <irenab> Well, I think we can close the meeting a few minutes earlier
14:51:28 <irenab> thanks everyone for joining
14:51:37 <irenab> #endmeeting
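
The readiness-probe thread (14:25 to 14:43) ended with an open action item, so the following is only a sketch of the ordering dulek suggested at 14:34:12: recover the existing ports first, then report ready, then start watching. It is not kuryr-kubernetes code; the `Controller` class, `recover_ports`, `is_ready`, and the port names are hypothetical.

```python
import threading


class Controller:
    """Hypothetical stand-in for a controller; not kuryr-kubernetes code."""

    def __init__(self, recover_ports):
        self._recover_ports = recover_ports  # callable that loads existing ports
        self._ready = threading.Event()
        self.watching = False
        self.pools = []

    def is_ready(self):
        # What a readiness endpoint or exec probe could report.
        return self._ready.is_set()

    def start(self):
        # Recover state first; no events are handled during this step.
        self.pools = self._recover_ports()
        self._ready.set()
        # Only now begin watching the API for events.
        self.watching = True


controller = Controller(recover_ports=lambda: ["port-a", "port-b"])
before = (controller.is_ready(), controller.watching)
controller.start()
after = (controller.is_ready(), controller.watching)
print(before, after)
```

In this arrangement the probe only reports state. The gating itself lives inside the process, which matters if, as dulek argued, a not-ready pod is only cut off from Service traffic that this controller does not receive anyway.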