14:02:40 <apuimedo> #startmeeting k8s-kuryr part II
14:02:41 <openstack> Meeting started Wed Mar 23 14:02:40 2016 UTC and is due to finish in 60 minutes.  The chair is apuimedo. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:42 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:44 <openstack> The meeting name has been set to 'k8s_kuryr_part_ii'
14:03:03 <apuimedo> Welcome to another kubernetes integration meeting
14:03:11 <apuimedo> who is here for it?
14:03:18 <baohua> :) morning & evening
14:03:19 <mspreitz> o/
14:03:30 <baohua> o/
14:04:27 <apuimedo> gsagie: fawadkhaliq salv-orl_: ping
14:04:58 <apuimedo> let's hope we get at least 5 people :-)
14:05:09 <baohua> sure, let's wait for a while
14:05:18 <mspreitz> BTW, banix told me he will join but about 20 min late
14:06:04 <apuimedo> baohua: so maybe my mind synced with banix when I told you I thought the meeting was in 30mins :P
14:06:18 <baohua> haha, maybe :)
14:06:33 <apuimedo> while we wait
14:06:54 <apuimedo> tfukushima and I have been working on the prototype based on python 3.4 asyncio
14:07:01 <apuimedo> for the api watcher
14:07:05 <apuimedo> (we call it raven)
14:07:21 <apuimedo> it watches the pod and the service endpoints
14:07:59 <apuimedo> and already adds data for the direct cni plugin to retrieve and use for plugging the container into the neutron provider
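(A minimal sketch of the kind of watcher described above, assuming aiohttp and Python 3.4-style coroutines; the API paths, the handler and the endpoint handling are illustrative, not Raven's actual code:)

```python
# Illustrative only: the general shape of an asyncio-based k8s API watcher.
import asyncio
import json

import aiohttp  # assumed available; Raven's real dependencies may differ

K8S_API = "http://127.0.0.1:8080"
WATCH_PATHS = ("/api/v1/pods", "/api/v1/endpoints")


@asyncio.coroutine
def watch(session, path, handler):
    """Stream one watch endpoint and hand every event to the handler."""
    resp = yield from session.get(K8S_API + path, params={"watch": "true"})
    try:
        while True:
            line = yield from resp.content.readline()
            if not line:
                break
            handler(json.loads(line.decode("utf-8")))
    finally:
        resp.close()


def handle_event(event):
    # Here Raven would translate ADDED/MODIFIED/DELETED events into Neutron
    # resources and expose the result for the CNI driver to pick up; this
    # sketch only prints them.
    print(event["type"], event["object"]["metadata"].get("name"))


def main():
    loop = asyncio.get_event_loop()
    session = aiohttp.ClientSession(loop=loop)
    try:
        loop.run_until_complete(asyncio.gather(
            *[watch(session, path, handle_event) for path in WATCH_PATHS]))
    finally:
        session.close()
        loop.close()


if __name__ == "__main__":
    main()
```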
14:08:31 <gsagie> here
14:08:35 <baohua> great, and sorry i may have lost some context: is it watching the k8s resources for changes and triggering our plugin?
14:21:50 <apuimedo> ovs
14:21:51 <gsagie> "hard pinging" heh
14:21:57 <baohua> it works!
14:21:59 <apuimedo> banix: lurking, eh?
14:22:01 <baohua> hi banix
14:22:02 <mspreitz> gsagie: my design goal is to keep the iptables-based kube-proxy working, as we use Neutron.
14:22:07 <banix> they threw a rock at me
14:22:17 <gsagie> mspreitz: but which Neutron.. OVN ?
14:22:17 <banix> hi, sorry for being late
14:22:20 <apuimedo> banix: do you have to carry it up a hill then?
14:22:21 <banix> just joined
14:22:25 <apuimedo> gsagie: ovs
14:22:30 <banix> :)
14:22:38 <gsagie> apuimedo: so the reference implementation? ok thanks
14:22:43 <mspreitz> gsagie: I think my design does not depend on what Neutron plugin / mechanism-drivers are being used.
14:22:45 <apuimedo> with quite a few other vendors it would not work with kube-proxy
14:23:29 <mspreitz> apuimedo: *which* kube proxy?
14:23:31 <gsagie> apuimedo, mspreitz: you need to somehow count on some sort of service chaining then, and have the kube-proxy as a Neutron port
14:23:37 <apuimedo> or it would require a bit of hacking around to make the k8s pod network available to the hosts (which is not a default in some neutron vendors)
14:23:58 <gsagie> mspreitz: or you have another plan?
14:24:05 <apuimedo> gsagie: yes. that is an option I considered. Not for kube-proxy. Rather for dns
14:24:20 <apuimedo> to put a port for the skydns in the neutron network
14:24:23 <mspreitz> woa, slow down, too many parallel threads and side topics.
14:24:33 <apuimedo> mspreitz: you are right
14:24:43 <apuimedo> let's start properly now that we have enough people
14:24:47 <mspreitz> My aim is to support the iptables-based kube-proxy, not the userspace based one.
14:24:58 <apuimedo> which topic do you want to start with?
14:25:11 <apuimedo> let's do updates first
14:25:15 <apuimedo> #topic updates
14:25:30 <apuimedo> mspreitz: can you give us some update on your work?
14:25:42 <gsagie> mspreitz: but this can also be done in a namespace
14:25:48 <mspreitz> I added the requested examples to my devref and slightly updated the design.  I also have some open questions posted in the new k8s place for that.
14:25:56 <apuimedo> gsagie: we'll get to this later
14:26:00 <gsagie> okie
14:26:13 <apuimedo> mspreitz: you mean the google docs?
14:26:25 <apuimedo> if so, please link it here
14:26:33 <mspreitz> There is a k8s issue opened for discussing the network policy design: https://github.com/kubernetes/kubernetes/issues/22469
14:26:55 <apuimedo> #link https://github.com/kubernetes/kubernetes/issues/22469
14:27:00 <apuimedo> thanks mspreitz
14:27:29 <apuimedo> #link https://review.openstack.org/#/c/290172/
14:27:50 <mspreitz> The main design work has deferred the questions of access from outside the cluster, but I could not avoid it in doing a plausible guestbook example.
14:28:44 <baohua> is the problem on the lb side?
14:28:49 <apuimedo> I see that you added the examples, thanks
14:29:10 <mspreitz> The first problem with access from the outside is that the design so far has no vocabulary for talking about external clients.
14:29:12 <apuimedo> for those wondering from the last meeting
14:29:16 <apuimedo> https://review.openstack.org/#/c/290172/5..10/doc/source/devref/k8s_t1.rst
14:29:23 <apuimedo> #link https://review.openstack.org/#/c/290172/5..10/doc/source/devref/k8s_t1.rst
14:29:32 <baohua> thanks for the links
14:29:37 <apuimedo> this is the diff against the version that we spoke about the last time
14:30:53 <apuimedo> mspreitz: well, I guess they do assume that if your service defines externalIP or uses loadbalancer, it is externally accessible
14:31:26 <mspreitz> apuimedo: actually, that is contrary to the design approach that has been taken so far...
14:31:46 <mspreitz> Note, for example, that there is explicitly no usage of existing "Service" instances.
14:31:50 <apuimedo> mspreitz: you mean that by default it is not, and you'd have to add it
14:32:02 <baohua> i think if we can access through the node ip, then external access should work with lb support.
14:32:26 <mspreitz> I mean that the design approach has been one of orthogonality, the network policy has to say what is intended, no implicit anything except health checking.
14:32:58 <apuimedo> mspreitz: oh, sure. I was talking about what people may be doing provisionally, while we lack the vocabulary :P
14:33:13 <mspreitz> I have not been thinking much about node IP, since my context is a public shared service that will not be offering node IP as an option.
14:33:53 <apuimedo> only node port then?
14:34:01 <baohua> oh, sure, that's the case
14:34:30 <mspreitz> I am focused on the case of network policies allowing connections to a pod's 'cluster IP' address.
14:35:05 <baohua> for external clients?
14:35:15 <apuimedo> mspreitz: do we have any news on the policy language front?
14:35:27 <apuimedo> (from the k8s-sig-network side)
14:36:03 <mspreitz> I am including the problem of external clients.  Clearly they have to have an IP route to the cluster IP addresses.  As do the pod hosts (minions, nodes), for health checking.
14:36:28 <mspreitz> Configuring external clients has to be beyond the scope of this code, but it has to be something that can be done relatively easily.
14:37:10 <mspreitz> My thought is to put the k8s pods on Neutron tenant networks connected to a Neutron router connected to an "external network" (in Neutron terms).
14:37:26 <apuimedo> mspreitz: well, for what it's worth
14:37:30 <apuimedo> what we do is
14:37:34 <mspreitz> That establishes a path, and naturally all the right routing table entries have to be in the right places.
14:37:45 <apuimedo> Pods -> tenant network -> router <- cluster ip network
14:38:06 <apuimedo> and the cluster ip network is where we put LBs that go into the pods (we do not use kube-proxy)
14:38:15 <mspreitz> IIRC, "cluster IP" is the kind of address a pod gets, not a host.
14:38:41 <apuimedo> cluster IP is the IP that brings you to a replica of a pod
14:38:54 <apuimedo> in one host it takes you to one replica, in another, to another
14:39:09 <apuimedo> that's why we made it the VIP of the load balancer that we put in front of the pods
14:39:24 <mspreitz> If I understand the terminology correctly, an RC manages several "replicas", each of which is a "pod".
14:39:34 <apuimedo> then, for external access, the router is connected to a neutron external net
14:39:44 <apuimedo> and we can assign FIPs to the VIPs of the load balancers
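(A hypothetical sketch of the mapping just described: the service's cluster IP becomes the VIP of a Neutron LBaaS v1 pool over the pod IPs, with an optional floating IP for external access. It assumes the pod and cluster-ip subnets already hang off a router with a gateway on the external network; the IDs, credentials and helper below are placeholders, not Kuryr's actual code:)

```python
# Illustrative only: map a k8s service onto Neutron LBaaS v1 as described
# above.  All IDs are placeholders.
from neutronclient.v2_0 import client as neutron_client

neutron = neutron_client.Client(username="admin", password="secret",
                                tenant_name="k8s",
                                auth_url="http://controller:5000/v2.0")

CLUSTER_IP_SUBNET = "cluster-ip-subnet-id"   # subnet covering SERVICE_IP_RANGE
EXTERNAL_NET = "external-net-id"             # Neutron external network


def expose_service(name, cluster_ip, port, pod_endpoints, external=False):
    """Create a pool over the pod IPs and a VIP at the service's cluster IP."""
    pool = neutron.create_pool({"pool": {
        "name": name, "protocol": "TCP", "lb_method": "ROUND_ROBIN",
        "subnet_id": CLUSTER_IP_SUBNET}})["pool"]
    for pod_ip, pod_port in pod_endpoints:
        neutron.create_member({"member": {
            "pool_id": pool["id"], "address": pod_ip,
            "protocol_port": pod_port}})
    vip = neutron.create_vip({"vip": {
        "name": name, "protocol": "TCP", "protocol_port": port,
        "subnet_id": CLUSTER_IP_SUBNET, "pool_id": pool["id"],
        "address": cluster_ip}})["vip"]
    if external:
        # A floating IP on the VIP's port exposes the service outside.
        neutron.create_floatingip({"floatingip": {
            "floating_network_id": EXTERNAL_NET, "port_id": vip["port_id"]}})
    return vip
```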
14:39:51 <apuimedo> mspreitz: that is right
14:40:13 <apuimedo> so, cluster ip -> pod_x
14:40:32 <apuimedo> where pod_x may be any of the pods that are replicas
14:40:48 <apuimedo> and kube-proxy handles that with its iptables fiddling
14:40:58 <mspreitz> apuimedo: are you saying that a given cluster IP is had by several pods (one on each of several hosts)?
14:41:01 <apuimedo> (managing the cluster ip as a sort of a vip)
14:41:31 <apuimedo> mspreitz: that's what I saw looking at the iptables
14:41:35 <apuimedo> of a deployment
14:41:48 <apuimedo> generally, it was redirecting the cluster ip to a pod in the same host
14:41:53 <mspreitz> apuimedo: are you trying to report a fact about kubernetes?
14:41:59 <apuimedo> I don't know what they do if there is no replica in that specific host
14:42:09 <apuimedo> mspreitz: I'm just stating what I saw
14:42:23 <mspreitz> was this using the userspace kube-proxy or the iptables-based one?
14:42:29 <apuimedo> in hopes that it gives some context as to why we use the cluster ips as VIPs of neutron LBs
14:42:45 <baohua> clusterip is virtual, only meaningful with kube-proxy rules to do the translation to real address
14:43:01 <mspreitz> slow down, let apuimedo answer
14:43:02 <apuimedo> iptables based one IIRC. But I can't confirm it. tfukushima set it up, I only looked around
14:43:36 <mspreitz> apuimedo: I think you may have confused "cluster IP" and "service IP".  The service IPs are virtual, the cluster IPs are real; each cluster IP is had by just one pod.
14:43:46 <apuimedo> but what I saw was, cluster ip only defined in iptables redirects
14:44:14 <mspreitz> apuimedo: so the virtual IP addrs you saw were NOT the addrs that each pod sees itself as having, right?
14:44:18 <banix> i think that is the service ip
14:44:24 <apuimedo> mspreitz: right
14:44:44 <mspreitz> apuimedo: you are saying "cluster IP" where you mean what is actually called "service IP".
14:45:09 <apuimedo> mspreitz: maybe. I've been known to confuse names in the past. I'll try to check it
14:45:11 <baohua> sorry, mspreitz, i think we only have the clusterIP term.
14:45:24 <apuimedo> I only recall cluster ip
14:45:26 <baohua> service ip was the past
14:45:30 <mspreitz> "service IP" and "cluster IP" are distinct concepts
14:45:57 <mspreitz> baohua: if you list services, does each one have an IP address?
14:46:27 <baohua> yes, can u give some link to the concept doc? I've only seen the clusterIP
14:46:45 <apuimedo> mspreitz: https://coreos.com/kubernetes/docs/latest/getting-started.html
14:46:57 <apuimedo> look at the "service_ip_range"
14:47:10 <apuimedo> "Each service will be assigned a cluster IP out of this range"
14:47:25 <mspreitz> baohua: I can cite http://kubernetes.io/docs/user-guide/#concept-guide but it is not 1.2
14:47:30 <mspreitz> apuimedo: exactly...
14:47:32 <apuimedo> which is distinct from POD_NETWORK
14:47:39 <mspreitz> exactly.
14:47:44 <apuimedo> pods don't get cluster_ips
14:47:54 <apuimedo> cluster ips are for services
14:47:56 <mspreitz> that POD_NETWORK thing configures the range for cluster IP addresses.
14:48:12 <apuimedo> nope
14:48:19 <apuimedo> SERVICE_IP_RANGE=10.3.0.0/24
14:48:25 <mspreitz> o gosh, I see the verbiage there
14:48:25 <apuimedo> is for cluster ip addresses
14:48:39 <baohua> yes, that's what i think
14:48:42 <mspreitz> fine, so let's say "pod IP" for the kind of address that a pod gets.
14:48:51 <apuimedo> k8s naming conventions are giving everybody a headache :-)
14:49:03 <apuimedo> right
14:49:09 <baohua> there're 3 types of ip: node, pod and clusterIP
14:49:21 <baohua> node is for the physical server, pod is for pod, and clusterIP for service
14:49:30 <apuimedo> right
14:49:34 <mspreitz> could we please say "service IP" for those virtual ones?
14:49:43 <baohua> oh, no, pls
14:49:53 <baohua> as this term was used in the old releases
14:49:57 <baohua> very confusing
14:49:58 <mspreitz> Anyway, back to the design.
14:50:06 <apuimedo> mspreitz: it will end up being more confusing, as people will check the current reference
14:50:07 <baohua> pls unify to clusterIP
14:50:16 <apuimedo> cluster ip is for the service
14:50:21 <mspreitz> My aim is to support the iptables-based kube-proxy
14:50:33 <apuimedo> and that's why we map it to a VIP of Neutron LBs when not using kube-proxy
14:50:50 <apuimedo> mspreitz: for supporting iptables-based-proxy
14:51:11 <apuimedo> can you refresh my memory on what it does with the VIP (since I'm not sure whether I looked at the userspace one or not)
14:51:16 <apuimedo> with the one I looked at
14:51:38 <apuimedo> It should be enough that the host can route into the neutron network that we use as POD_NETWORK
14:52:00 <apuimedo> (of course, depending on vendor, but probably with ovs it would work)
14:52:22 <mspreitz> apuimedo: we may have another terminology confusion.  When you say "kube proxy", do you mean specifically the userspace one and NOT the  iptables-based one?
14:53:06 <apuimedo> mspreitz: I mean the one I experienced (I only saw tons of iptables redirect rules, so I assume that it was the iptables-based one)
14:53:18 <apuimedo> that's why I asked what you see in your deployments
14:53:26 <apuimedo> how does it map the cluster ip to a pod
14:53:39 <mspreitz> apuimedo: the userspace based kube-proxy also uses iptables entries.
14:53:48 <apuimedo> to be able to know if my understanding is from one or the other :P
14:54:54 <mspreitz> In the userspace proxy, on each pod host, for each service, there is an iptables rule mapping dest=service & port=serviceport to dest=localhost & port=localaliasofthatservice
14:55:07 <mspreitz> There is a userspace process listening there and doing the loadbalancing.
14:55:55 <apuimedo> and in the new one?
14:56:09 <mspreitz> In the iptables-based kube-proxy, on each host, for each service, there is an iptables rule matching dest=service & port=serviceport and jumping to a set of rules that stochastically choose among the service's endpoints.
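(A rough, hypothetical illustration of the rules just described, using the kube-proxy style KUBE-SERVICES / KUBE-SVC / KUBE-SEP chain names with iptables statistic-mode probabilities; these are not rules dumped from a real deployment:)

```python
# Illustrative only: the shape of the per-service / per-endpoint rules the
# iptables-based kube-proxy installs.  Each endpoint chain is picked with
# probability 1/remaining, so the last one gets an unconditional jump.


def kube_proxy_style_rules(service_ip, port, endpoints):
    """Return iptables-style rule strings for one service (for illustration)."""
    rules = ["-A KUBE-SERVICES -d %s/32 -p tcp --dport %d -j KUBE-SVC-X"
             % (service_ip, port)]
    remaining = len(endpoints)
    for i, (ep_ip, ep_port) in enumerate(endpoints):
        chain = "KUBE-SEP-%d" % i
        if remaining > 1:
            rules.append("-A KUBE-SVC-X -m statistic --mode random "
                         "--probability %.5f -j %s" % (1.0 / remaining, chain))
        else:
            rules.append("-A KUBE-SVC-X -j %s" % chain)
        rules.append("-A %s -p tcp -j DNAT --to-destination %s:%d"
                     % (chain, ep_ip, ep_port))
        remaining -= 1
    return rules


if __name__ == "__main__":
    for rule in kube_proxy_style_rules("10.3.0.10", 80,
                                       [("10.2.1.4", 8080),
                                        ("10.2.2.7", 8080)]):
        print(rule)
```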
14:56:33 <apuimedo> mspreitz: and you want to support the latter, right?
14:56:39 <mspreitz> apuimedo: right
14:56:49 <mspreitz> Because it does not transform the client IP address.
14:57:05 <mspreitz> So the translation from network policy statement to security group rules is pretty direct.
14:57:21 <apuimedo> mspreitz: it would be very useful if you could post examples of the chains it uses for choosing to the ML and/or the devref
14:57:40 <mspreitz> OK, I'll find a way to do that.
14:57:42 <apuimedo> so that we can figure out how best to support this new kube-proxy backend
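(And, for the "pretty direct" translation mentioned a few lines up, a hedged sketch of turning one allow-ingress entry of a network policy into a Neutron security group rule; the policy shape and helper are purely illustrative, since the policy language itself was still under discussion in the k8s issue linked earlier:)

```python
# Illustrative only: one allowed (cidr, port) pair from a network policy
# becomes one ingress rule on the target pods' security group.
def policy_ingress_to_sg_rule(neutron, sg_id, allowed_cidr, port,
                              protocol="tcp"):
    return neutron.create_security_group_rule({"security_group_rule": {
        "security_group_id": sg_id,
        "direction": "ingress",
        "protocol": protocol,
        "port_range_min": port,
        "port_range_max": port,
        "remote_ip_prefix": allowed_cidr,   # e.g. the allowed clients' subnet
    }})
```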
14:58:26 <apuimedo> #topic others
14:58:32 <apuimedo> any other topic in the last two minutes?
14:59:04 <apuimedo> it may feel like we didn't get a lot decided, but I am very happy to have converged on our understanding of concepts and vocabulary
14:59:23 <banix> good
14:59:34 <mspreitz> I am not sure we are accurately converged on "cluster IP"
14:59:55 <apuimedo> mspreitz: you'll get to love the new name with time
15:00:25 <apuimedo> it's like a shoe that is a bit rough but with time gets familiar and you keep using it out of habit
15:00:43 <banix> looks like this is the new name for what used to be called service ip. if i understand it correctly.
15:00:43 <apuimedo> I'll never like the overloading of the name "namespaces" though
15:00:50 <apuimedo> banix: that's right
15:01:14 <apuimedo> I don't know what it is nowadays with people changing names and ips all the time
15:01:15 <baohua> +1
15:01:25 <apuimedo> anything else, then?
15:01:28 <gsagie> heh
15:01:31 <mspreitz> not from me
15:01:35 <gsagie> nope
15:01:38 <baohua> nope
15:01:42 <apuimedo> let's meet next week?
15:01:54 <gsagie> sure
15:01:57 <baohua> sure, see u then
15:01:59 <banix> 14:30 UTC?
15:02:10 <banix> if on Wednesday
15:02:11 <mspreitz> OK with me
15:02:11 <apuimedo> #info next week 14:30utc k8s-kuryr meeting
15:02:18 <apuimedo> #endmeeting