14:07:39 <irenab> #startmeeting kuryr
14:07:40 <openstack> Meeting started Mon Nov 13 14:07:39 2017 UTC and is due to finish in 60 minutes.  The chair is irenab. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:07:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:07:43 <openstack> The meeting name has been set to 'kuryr'
14:07:57 <irenab> hi, who is here for kuryr meeting?
14:08:19 <yboaron> Hi
14:09:31 <dulek> o/
14:09:43 <irenab> yboaron, dulek hi
14:09:44 <leyal> o/
14:09:53 <ltomasbo> o/
14:10:28 <irenab> #info yboaron dulek leyal ltomasbo irenab in a meeting
14:10:36 <irenab> let's start
14:11:04 <irenab> apuimedo asked me to chair the meeting today, he is sick and still recovering from the trip to OS summit
14:11:39 <irenab> I suggest we skip kuryr and kuryr-libnetwork topics, unless you have something to update
14:12:16 <irenab> #topic kuryr-kubernetes
14:12:35 <irenab> dulek: do you want to update regarding CNI split adventure?
14:12:57 <dulek> irenab: Sure!
14:13:15 <dulek> Sooo… ltomasbo found bugs that I was fixing most of the last week.
14:13:45 <dulek> Patches are ready again, bugs were related to an incorrect assumption we had in Kuryr, that's why it took me the whole week.
14:13:46 <ltomasbo> well, I would say I hit the bug, but dulek found it
14:14:28 <dulek> So this is the main patch: https://review.openstack.org/#/c/515186/, it's dependent on one more bugfix.
14:14:28 <irenab> dulek: the missing container_id check?
14:15:03 <dulek> irenab: Yes, but also some pyroute2 internal timeout issues.
14:15:23 <dulek> I hope apuimedo will be well tomorrow and we'll be able to start merging the patches. :)
14:15:26 <irenab> dulek: please share the link
14:15:46 <irenab> #link https://review.openstack.org/#/c/515186/
14:16:06 <irenab> #action apuimedo irenab to review the https://review.openstack.org/#/c/515186/
14:16:33 <dulek> irenab: Hm, I don't have a bug created for this pyroute2 issue as I think it only manifests in CNI daemon case - so fix for this one is included.. ;)
14:17:01 <irenab> dulek: so the same patch, right?
14:17:05 <dulek> Yes.
14:17:11 <irenab> great
14:17:37 <irenab> good that you and ltomasbo ran scale scenario
14:17:46 <dulek> irenab: https://review.openstack.org/#/c/515186/11/kuryr_kubernetes/cni/daemon/service.py@194 - if you're interested in what the fix looks like.
14:18:09 <dulek> Yeah, thanks ltomasbo for running the tests!
14:18:24 <ltomasbo> np, glad to help!
14:18:36 <irenab> dulek: so the fix is to increase timeout?
14:18:50 <ltomasbo> and it was not that big scale, just a bunch of pods in a single server (around 20-30)
14:19:29 <dulek> irenab: More like to make it configurable. My kernel on a VM was just not fast enough to complete all the pyroute2 operations in default 5 seconds.
14:19:52 <irenab> I wonder if the timeout shouldn’t be dynamically adjustable
14:20:13 <dulek> irenab: In case a timeout will be hit - kubelet will retry, so with bug 1731485 fixed we'll be fine anyway.
14:20:13 <openstack> bug 1731485 in kuryr-kubernetes "Kuryr ignores CNI_CONTAINERID when serving requests" [High,In progress] https://launchpad.net/bugs/1731485 - Assigned to Michal Dulko (michal-dulko-f)
14:20:21 <irenab> dulek: probably need to add some advisory on how to set up the timeout
14:21:06 <dulek> irenab: To be honest - I don't really know how to do that myself. I was just trying a few values, 30 seconds were enough for ~50 pods being created.
14:21:29 <dulek> irenab: Plus this timeout doesn't mean we'll be always waiting that much. It's just the maximum wait time.
14:22:10 <irenab> dulek: I am just not sure we should expose some operator facing configs if we are not sure how to set them …
14:22:51 <dulek> irenab: I think I can add some documentation describing relationships between timeouts as we have multiple options now.
14:23:26 <irenab> good enough, let's keep a reasonable default and document the relationship to enable tuning
14:23:37 <dulek> Will do!
14:23:46 <irenab> dulek: thanks a lot
14:24:30 <irenab> #action dulek to document the CNI daemon config timeout
14:25:01 <irenab> dulek: anything else you would like to raise?
14:25:11 <dulek> Nope, that's it, thanks!
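The point made above, that a configurable timeout is only the maximum wait and not a fixed delay, can be illustrated with a minimal sketch. This is not the kuryr-kubernetes or pyroute2 code; `wait_until`, `predicate`, and `interval` are hypothetical names chosen for illustration.

```python
import time


def wait_until(predicate, timeout, interval=0.01):
    """Poll `predicate` until it returns True or `timeout` seconds elapse.

    Hypothetical helper: returns True as soon as the condition holds,
    so a large timeout costs nothing when the operation is fast.
    Returns False only if the deadline passes first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # A condition that is already true returns immediately,
    # even with a 30 second upper bound.
    print(wait_until(lambda: True, timeout=30))
    # A condition that never holds gives up once the deadline passes.
    print(wait_until(lambda: False, timeout=0.05))
```

With this shape, raising the timeout only changes how long a slow host is tolerated before the caller (here, kubelet) retries.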
14:25:40 <irenab> ltomasbo: anything from your side?
14:25:56 <ltomasbo> I have a patch about adding a readiness probe to the kuryr controller
14:26:03 <ltomasbo> when running in containerized mode
14:26:15 <ltomasbo> https://review.openstack.org/#/c/518502/
14:26:30 <ltomasbo> dulek, already provided some reviews
14:26:43 <irenab> ltomasbo: this should prevent the kuryr controller pod from being considered active until pools are populated?
14:26:53 <ltomasbo> it basically makes the controller not ready until the precreated ports have been loaded
14:27:09 <irenab> #link https://review.openstack.org/#/c/518502/
14:27:28 <ltomasbo> irenab, yep, until the existing ports are loaded into their respective pools
14:27:41 <irenab> ltomasbo: any issue you want to discuss?
14:27:57 <irenab> #action everyone please review https://review.openstack.org/#/c/518502/
14:28:06 <ltomasbo> I tested it and it seems to work fine
14:28:45 <dulek> ltomasbo: One question… Besides pod being not ready - does it affect if it accepts requests?
14:28:56 <irenab> ltomasbo: I wonder if there is supposed to be a single readiness probe per container, or can there be several?
14:29:35 <ltomasbo> irenab, that I don't know, but probably there may be more than one, need to check that
14:29:49 <ltomasbo> irenab, are you thinking of another test to be added?
14:30:17 <irenab> ltomasbo: yes, maybe one will be needed once Network Policies are supported
14:30:29 <ltomasbo> dulek, I assume it should not accept requests, but didn't check actually
14:30:50 <irenab> dulek: this is probably up to k8s to manage
14:31:17 <ltomasbo> irenab, worst case we can create a script that is executed and returns a given value, and that could include several checks
14:31:20 <dulek> ltomasbo: If it's before Watcher is started, then it won't.
14:31:30 <irenab> ltomasbo: +1
14:32:05 <ltomasbo> dulek, good question, I will double check that
14:32:07 <dulek> irenab: k8s can manage that if e.g. the Pod is added to a Service. But kuryr-controller has no API.
14:32:48 <irenab> there is the one for the tool to populate pools
14:32:50 <dulek> ltomasbo: It would be good to block annotating VIFs until we have all info recovered. But that's most likely already done. :)
14:33:24 <irenab> dulek: do you suggest internal controller state of being active?
14:33:46 <irenab> maybe a good idea
14:34:00 <ltomasbo> dulek, that was the intention with the readiness probe, but I just added the check, so probably not at the right place... :/
14:34:12 <dulek> irenab: Yes, or more specifically - not to start any Watcher before all ports are recovered.
14:34:29 <yboaron> ltomasbo, is it probe per pod/resource or a global one ?
14:34:33 <dulek> ltomasbo: Yeah, so if that was the intention, then k8s will not manage that for you on its own.
14:34:35 <yboaron> resource
14:34:56 <irenab> dulek: yes, moving to Active will ‘open’ the controller to external world
14:35:43 <irenab> yboaron: Pod (which is k8s Controller container)
14:35:51 <ltomasbo> irenab, dulek: I thought that was managed by kubernetes, if it was not ready, it will not receive requests
14:35:57 <dulek> irenab: Oh? So until it's ready pod will have no network connectivity? Then it will be unable to complete the recovery.
14:36:04 <ltomasbo> yboaron, yep, the kuryr-controller pod
14:36:19 <dulek> ltomasbo: Yes, but what requests? Controller watches on k8s API, doesn't do much more.
14:36:48 <ltomasbo> dulek, controller calls neutron to get/create the ports
14:37:02 <irenab> I think it can still work, but will allocate the ports via neutron
14:37:42 <dulek> ltomasbo: Sure thing, but how can k8s block that? Let me try to find this in k8s docs.
14:38:02 <irenab> ltomasbo: what was your intention with the readiness probe?
14:38:16 <ltomasbo> dulek, irenab: "A pod with containers reporting that they are not ready does not receive traffic through Kubernetes Services."
14:38:48 <dulek> ltomasbo: Exactly. And we're not using Services for kuryr-controller, as it doesn't have an API.
14:39:12 <irenab> ltomasbo: so what is it supposed to serve?
14:39:14 <dulek> ltomasbo: Where by Service I mean this: https://kubernetes.io/docs/concepts/services-networking/service/ - it's just a load balancer.
14:39:25 <ltomasbo> dulek, aren't we?
14:39:45 <ltomasbo> don't we have a service (lbaas) for the K8S API?
14:39:48 <irenab> ltomasbo: but k8s controller just watches the events, it does not serve any API requests
14:40:06 <dulek> irenab: Right! :)
14:40:39 <irenab> ltomasbo: I wonder if it is related to any further work of adding HA to the k8s controller
14:40:53 <ltomasbo> dulek, irenab: yep, but if kubernetes sets the pod as not ready, it will not perform the API actions regarding the kuryr-controller (the needed annotations)
14:41:35 <dulek> ltomasbo: Can you check that? If it's true, then I'm totally wrong and should apologize.
14:41:59 <ltomasbo> no, most probably I'm wrong, I did not check that
14:42:09 <ltomasbo> I just assumed kubernetes was taking care of that
14:42:16 <ltomasbo> I will double check
14:42:25 <ltomasbo> thanks for pointing that out!
14:42:33 <irenab> ltomasbo: so you suggest that k8s will ignore the changes applied by k8s controller on k8s API ?
14:42:44 <irenab> till it is ready?
14:43:09 <ltomasbo> that is what I understood from this: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/#define-readiness-probes
14:43:27 <irenab> #action ltomasbo check and update regarding readiness probe for k8s controller
14:44:07 <irenab> ltomasbo: thanks, let's follow up on this after the check
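The suggestion discussed above, not starting any Watcher before all ports are recovered and letting the readiness check report that same internal state, could look roughly like the sketch below. It is an assumption-laden illustration, not the kuryr-kubernetes implementation: `Controller`, `recover_ports`, `start_watchers`, and `is_ready` are hypothetical names.

```python
import threading


class Controller:
    """Hypothetical controller that gates watchers on port recovery."""

    def __init__(self, recover_ports, start_watchers):
        # Both callables are placeholders for the real recovery
        # and watcher start-up logic.
        self._recover_ports = recover_ports
        self._start_watchers = start_watchers
        self._ready = threading.Event()

    def is_ready(self):
        # What a readiness probe handler could report.
        return self._ready.is_set()

    def run(self):
        # 1. Load pre-created ports into their pools first.
        self._recover_ports()
        # 2. Only then mark the controller ready...
        self._ready.set()
        # 3. ...and start handling Kubernetes events.
        self._start_watchers()


if __name__ == "__main__":
    ctrl = Controller(
        recover_ports=lambda: print("recovering ports"),
        start_watchers=lambda: print("starting watchers"),
    )
    ctrl.run()
    print(ctrl.is_ready())
```

The design choice here is that the ordering is enforced inside the controller, so it does not depend on Kubernetes withholding anything from a pod that is not ready.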
14:44:31 <irenab> leyal: do you want to update regarding Network Policy support?
14:44:42 <leyal> yes,
14:44:50 <leyal> I uploaded a spec -
14:45:00 <irenab> link?
14:45:02 <leyal> https://review.openstack.org/#/c/519319/
14:45:18 <irenab> #link https://review.openstack.org/#/c/519319/ Network Policy Spec
14:46:00 <irenab> everyone please review the spec
14:46:22 <irenab> leyal: anything you want to raise now?
14:47:05 <leyal> let's do the discussion on the patch ..
14:47:22 <irenab> sure, thank you for the update
14:47:42 <leyal> it contains a lot of details..
14:48:00 <irenab> anything else on kuryr-kubernetes?
14:49:21 <irenab> #topic open discussion
14:49:47 <irenab> any other issue/topic to discuss?
14:50:21 <irenab> I hope next time we will have dmellado and apuimedo to share the inputs from the summit
14:51:10 <irenab> Well, I think we can close the meeting a few minutes earlier
14:51:28 <irenab> thanks everyone for joining
14:51:37 <irenab> #endmeeting