18:00:07 <daneyon> #startmeeting container-networking
18:00:09 <openstack> Meeting started Thu Sep 24 18:00:07 2015 UTC and is due to finish in 60 minutes.  The chair is daneyon. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:12 <openstack> The meeting name has been set to 'container_networking'
18:00:16 <daneyon> Agenda
18:00:20 <daneyon> #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda
18:00:39 <daneyon> I'll give everyone a minute to review the agenda.
18:00:57 <daneyon> #topic roll call
18:01:01 <daneyon> o/
18:01:08 <hongbin_> o/
18:01:12 <eghobo> o/
18:01:14 <s3wong> o/
18:01:16 <gangil1> o/
18:01:19 <Tango> o/
18:02:16 <daneyon> Thank you hongbin s3wong gangil1 Tango for attending.
18:02:25 <daneyon> #topic Discuss discovery changes required for implementing Flannel in Swarm
18:02:45 <daneyon> This topic was discussed over irc yesterday and over the ML last week.
18:03:09 <daneyon> I just want to make sure everyone understands the issue with discovery for swarm.
18:03:34 <daneyon> Would you like me to provide a quick overview of the issue or does everyone understand it?
18:04:31 <gangil1> daneyon: I haven't read about it, so would go through it first and then ping you if I have any doubts.
18:04:40 <adrian_otto> o/
18:04:55 <daneyon> adrian_otto thanks for joining.
18:05:23 <daneyon> adrian_otto we are at topic: Discuss discovery changes required for implementing Flannel in Swarm
18:05:59 <daneyon> I wanted to take this time to make sure everyone understands the discovery issue and agree on the solution.
18:06:53 <daneyon> as part of this patch, swarm public discovery is removed.
18:06:54 <daneyon> #link https://review.openstack.org/#/c/224367/
18:07:38 <daneyon> instead swarm will use etcd for bootstrapping a swarm cluster.
18:08:18 <Tango> If later we find that we need another method of discovery for something else, would we run into the same situation?
18:08:28 <daneyon> keep in mind that flannel requires etcd. flannel uses etcd for shared config among the flannel daemons that run across nodes
18:08:51 <daneyon> Tango that is a good point
18:09:08 <daneyon> It seems like consul and etcd are the discovery kings
18:09:21 <daneyon> I filed a bp to address discovery from a bigger picture
18:09:34 <daneyon> make discovery more pluggable, configurable, etc..
18:09:45 <Tango> that would be good.
18:10:08 <daneyon> that is outside of my current focus of implementing the container network model across all bay types
18:10:56 <Tango> I think it's reasonable for now
18:11:57 <adrian_otto> agreed
18:11:58 <daneyon> hongbin_ eghobo or adrian_otto do you have any questions or concerns regarding discovery?
18:12:06 <hongbin_> no
18:12:08 <daneyon> adrian_otto thx
18:12:30 <daneyon> i'll wait 1 minute before moving to our next topic.
18:12:44 <daneyon> hongbin_ thanks for the feedback.
18:12:58 <eghobo> I will comment at review, I am still thinking
18:13:13 <daneyon> eghobo that makes sense. thanks.
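For reference, the etcd-based bootstrap discussed in this topic can be sketched as below. This is a minimal illustration, not code from the patch under review: the `etcd://host1,host2/prefix` form is the discovery URL format accepted by standalone Docker Swarm, while the helper name `etcd_discovery_url`, the default prefix `swarm`, and the example address are hypothetical.

```python
def etcd_discovery_url(etcd_hosts, prefix="swarm"):
    """Build an etcd:// discovery URL from a list of 'host:port' strings.

    Hypothetical helper: the name and the default prefix are assumptions
    made for illustration only.
    """
    if not etcd_hosts:
        raise ValueError("at least one etcd endpoint is required")
    # Multiple etcd endpoints are comma-separated; the path after the
    # endpoints is the key prefix under which nodes register themselves.
    return "etcd://{}/{}".format(",".join(etcd_hosts), prefix.strip("/"))


if __name__ == "__main__":
    # Example address only; every node in the bay would be given the same URL.
    print(etcd_discovery_url(["10.0.0.5:2379"]))
```

Because the URL is derived from the bay's own etcd endpoints, no external public discovery service is needed to form the cluster.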
18:13:18 <daneyon> #topic Review Swarm patch
18:13:35 <daneyon> I'll take a few minutes to cover the main points of the patch.
18:14:26 <daneyon> 1. Implements flannel for swarm bay types. We can now have containers run across multiple nodes and they can communicate with one another using flannel's overlay (UDP or VXLAN) network.
18:14:53 <daneyon> I have tested this multiple times using native tools
18:15:35 <daneyon> Does anyone have time to test the patch?
18:15:51 <daneyon> Back to the patch review
18:16:10 <hongbin_> I will if I find some time
18:16:25 <eghobo> daneyon: but can the user still run without flannel?
18:16:26 <daneyon> I removed swarm public discovery. Instead swarm uses etcd to bootstrap the swarm cluster.
18:16:34 <Tango> Can you post the link?
18:16:43 <daneyon> I implemented etcd for swarm.
18:17:21 <daneyon> eghobo no, flannel is the default network-driver if one is not specified at the baymodel creation.
18:17:43 <eghobo> hmm, not sure I agree with it
18:18:01 <daneyon> w/o flannel docker does not have the ability to communicate across hosts until libnetwork is implemented and you use the native overlay driver or a libnetwork remote driver
18:18:13 <eghobo> many people run Swarm and Mesos without special network
18:18:25 <daneyon> We will eventually add libnetwork as a magnum network-driver, but libnetwork is only supported in docker experimental.
18:18:41 <eghobo> only Kub is very strict at network side
18:18:58 <daneyon> swarm patch
18:19:00 <daneyon> #link https://review.openstack.org/#/c/224367/
18:20:03 <Tango> Thanks
18:20:09 <eghobo> daneyon: if my nodes can communicate docker/swarm will communicate
18:20:30 <eghobo> of course without any isolation
18:21:09 <daneyon> eghobo before this patch, containers within a swarm bay type could not communicate across nodes unless you exposed the container port to the host. this is because the swarm bay type was using docker legacy networking (docker bridge).
18:22:16 <Tango> So we are really enhancing current docker networking?
18:22:25 <eghobo> aha, got what you mean now, thx
18:22:41 <daneyon> eghobo swarm containers can not directly communicate with one another unless you do either 1. expose the container port to the host or 2. Use libnetwork overlay or remote driver that supports multi-host.
18:23:17 <adrian_otto> daneyon, I had inline comments with questions. None were answered.
18:23:29 <adrian_otto> I was particularly interested in all the repeated code
18:24:37 <daneyon> supporting native (i.e. not exposing ports to hosts) container-to-container communication is part of the Magnum Container Network Model. This is also where Docker Swarm is heading.
18:25:10 <daneyon> adrian_otto I submitted my latest patch just before this meeting. I plan to go back and address everyone's review comments.
18:25:13 <adrian_otto> also why did you take out ExecStartPost?
18:25:17 <eghobo> daneyon: flannel is a value add, no question. my concern is that Docker folks show Swarm without networking everywhere, and users will get a different experience with Magnum
18:25:52 <Tango> +1
18:25:57 <hongbin_> We could set different default network_driver per bay type
18:26:13 <hongbin_> k8s default to flannel, swarm default to something else
18:26:43 <eghobo> hongbin_: +1
18:27:17 <Tango> If user develops something on Magnum Swarm cluster and uses the networking capability here, when they bring their containers elsewhere, they may not work
18:27:24 <daneyon> eghobo libnetwork will be the preferred networking method for Docker Swarm. When it gets out of experimental, we will go through the process of adding the driver. Swarm will be the first bay type. Unless there is an issue with the community, libnetwork will be the default net driver for swarm bay types
18:27:33 <adrian_otto> you should be able to specify --network-driver=none
18:27:43 <daneyon> hongbin_ beat me to the punch ;-)
18:27:47 <adrian_otto> and get the current setup with no networking for swarm if that's what you want
18:28:39 <Tango> Fair enough. We should write this up in a user guide so it's clear.
18:29:03 <hongbin_> yes, --network-driver=none makes sense I think
18:29:09 <eghobo> daneyon: I agree about libnetwork but it's been experimental too long from my point of view ;)
18:29:31 <eghobo> we have users who want the service now ;)
18:29:35 <daneyon> Tango after the cnm gets delivered to all 3 bay types, I will work on the docs
18:30:34 <daneyon> eghobo then it comes down to what do we do with magnum? Do we want magnum to be stable or on the bleeding edge? It's my understanding that production ready was a top goal set by adrian_otto
18:30:35 <adrian_otto> I repeated my questions on patch set 4 and voted on the patch again.
18:30:56 <adrian_otto> production ready is key
18:30:57 <daneyon> If that's not the case, then let's use docker experimental instead and we can add libnetwork.
18:31:16 <daneyon> adrian_otto thx. I'll def address your comments.
18:31:18 <adrian_otto> we can offer optional features that use newer things, but we need to have the basics covered first.
18:32:27 <daneyon> adrian_otto each bay type requires a network driver. each bay type has a default network driver of flannel. This can and will change over time.
18:33:20 <daneyon> I am +1 for using stable releases of docker and other tools instead of experimental.
18:33:26 <eghobo> daneyon: does it mean you are against the --network-driver=none idea?
18:33:35 <daneyon> users want to start deploying containerized apps in OS
18:33:58 <daneyon> if we provide a low quality service, then users will write off the magnum project.
18:34:21 <daneyon> eghobo what does network-driver none do?
18:34:44 <eghobo> nothing
18:35:09 <daneyon> if we provide network-driver none to a k8s bay type, the result is a broken bay.
18:35:13 <eghobo> and Mesos/Swarm users are comfortable with this model
18:35:46 <daneyon> k8s will not work using legacy docker bridging.
18:35:49 <eghobo> it's just for Mesos and Swarm, we must have driver for Kub
18:36:58 <daneyon> eghobo if we default to flannel as the net-driver, users can still expose host ports, etc..
18:37:54 <eghobo> correct, and I think it's a good option for advanced users
18:38:04 <daneyon> swarm is headed in the direction of multi-host networking, so I think we are getting in front of this. I foresee all coe's using multi-host networking.
18:38:30 <adrian_otto> daneyon, why do you think that swarm bays are broken without flannel?
18:38:31 <daneyon> If the community prefers to have an option for none, then it can be implemented.
18:39:40 <hongbin_> adrian_otto: I guess daneyon means k8s bay are broken without flannel
18:40:15 <daneyon> adrian_otto swarm bays work w/o flannel. Container-to-container communication with our current swarm requires either 1. Expose the container port to the host. or 2. Support a separate network provider such as flannel, libnetwork, etc..
18:41:10 <adrian_otto> right. you can use docker with -p to expose the container port to the host
18:41:38 <daneyon> hongbin_ yes. k8s requires flannel. Swarm does not currently require a multi-host network provider, but signs indicate that one will be required in the future
18:41:48 <adrian_otto> the DEFAULT network driver per bay should be one that does not violate the principle of least surprise
18:41:52 <daneyon> i can't comment on mesos b/c i don't know enough about that bay type yet.
18:42:15 <daneyon> adrian_otto correct
18:42:20 <adrian_otto> so for now the default for swarm could actually be "none", and the user could enable networking by changing it to "flannel"
18:42:35 <adrian_otto> the default on k8s bays, should be "flannel"
18:42:43 <daneyon> adrian_otto ok
18:42:54 <adrian_otto> and if you think you know what you are doing you could set it to "none" and do something more exotic perhaps
18:42:57 <hongbin_> I guess the default for mesos is none as well
18:43:05 <adrian_otto> it might be a way to simplify using k8s built-in features
18:43:13 <adrian_otto> hongbin_: yes
18:43:56 <adrian_otto> my apologies, but I have to depart here in a minute so I can not stay for the end.
18:44:09 <daneyon> I'll refactor the patch so flannel is not the swarm default
18:44:14 <eghobo> adrian_otto: kub doesn't have built-in features, we need flannel
18:44:42 <adrian_otto> eghobo: that is true today, but that's likely to change
18:45:05 <adrian_otto> I'm not arguing for disabling flannel for k8s bays. that's not the important point
18:45:15 <eghobo> ok, it looks like you know more than us ;)
18:45:21 <adrian_otto> I'm more interested in providing a close-to-native experience for each COE
18:45:36 <daneyon> adrian_otto that makes sense
18:45:58 <daneyon> I'll update the patch
18:46:06 <adrian_otto> and as each COE evolves, we can follow the prevailing direction each heads
18:46:20 <daneyon> adrian_otto agreed
18:46:54 * adrian_otto waves
18:46:57 <adrian_otto> catch you next time
18:47:47 <daneyon> so then do we even need to implement flannel for swarm and mesos. I thought the original charter was to provide a native multi-host container networking solution for all bay types.
18:48:18 <Tango> I think it's a good option to have
18:48:32 <hongbin_> I think it is good to have, just not set it as default
18:48:50 <eghobo> +1, it's value add
18:48:51 <Tango> especially if libnetwork is heading that way
18:49:04 <daneyon> #action danehans to look into changing the default network-driver for swarm to none.
18:49:57 <daneyon> ok, then I think we agree on supporting flannel in swarm, but not using it for the default net driver
18:50:10 <daneyon> unless there are questions, let's move on.
18:50:18 <daneyon> and thanks for the good discussion.
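The outcome of this topic (flannel stays available everywhere, but is the default only for k8s) can be sketched as a per-bay-type default table. This is an illustration under assumptions, not Magnum's implementation: the bay type keys, the function name `resolve_network_driver`, and the set of supported drivers are hypothetical. Rejecting `none` for k8s follows the remarks that k8s does not work on legacy docker bridging.

```python
# Hypothetical defaults reflecting the agreement reached in the meeting.
DEFAULT_NETWORK_DRIVER = {
    "k8s": "flannel",   # k8s needs a multi-host network provider
    "swarm": "none",    # close-to-native experience for Swarm users
    "mesos": "none",    # suggested in the meeting to default to none as well
}

# Assumption: only these two drivers exist at the time of the discussion.
SUPPORTED_DRIVERS = {"flannel", "none"}

# Bay types assumed to be broken without a network driver.
REQUIRES_DRIVER = {"k8s"}


def resolve_network_driver(bay_type, requested=None):
    """Return the driver to use, falling back to the bay type's default."""
    if bay_type not in DEFAULT_NETWORK_DRIVER:
        raise ValueError("unknown bay type: %s" % bay_type)
    driver = requested if requested is not None else DEFAULT_NETWORK_DRIVER[bay_type]
    if driver not in SUPPORTED_DRIVERS:
        raise ValueError("unsupported network driver: %s" % driver)
    if driver == "none" and bay_type in REQUIRES_DRIVER:
        raise ValueError("%s bays require a network driver" % bay_type)
    return driver


if __name__ == "__main__":
    for bay in ("k8s", "swarm", "mesos"):
        print(bay, resolve_network_driver(bay))
```

If advanced users should be allowed to pick `none` on k8s, as adrian_otto suggested, the last check would become a warning rather than an error.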
18:50:26 <daneyon> #topic Review Action Items
18:50:30 * daneyon everyone who votes on the kuryr design spec to continue tracking the spec to completion by voting.
18:50:40 <daneyon> I see everyone voted.
18:50:57 <daneyon> thanks again for taking the time to review the kuryr spec and cast your vote.
18:51:02 * daneyon danehans to continue coordinating with gsagie on a combined kuryr/magnum design summit session.
18:51:14 <daneyon> I have not discussed this with gsagie
18:51:22 <daneyon> I will move this one forward.
18:51:31 <daneyon> #action danehans to continue coordinating with gsagie on a combined kuryr/magnum design summit session.
18:51:39 <daneyon> #topic Open Discussion
18:51:59 <daneyon> anyone have a topic to discuss?
18:52:17 <daneyon> ok
18:52:36 <daneyon> then I will close out our meeting.
18:52:45 <daneyon> thanks again for the great discussion
18:52:56 <daneyon> #endmeeting