21:00:07 #startmeeting containers
21:00:08 Meeting started Tue Sep 25 21:00:07 2018 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:12 The meeting name has been set to 'containers'
21:00:19 #topic Roll Call
21:00:23 o/
21:00:24 o/
21:00:25 o/
21:00:29 hello
21:00:54 jim is otw
21:01:45 it seems that flwang is not here
21:01:48 colin-: cool
21:02:02 agenda:
21:02:04 #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2018-09-25_2100_UTC
21:02:21 o/
21:02:34 #topic Stories/Tasks
21:02:37 imdigitaljim: hello
21:02:59 I have put 4 items in the agenda
21:03:26 The 1st one is merged in rocky: Fix cluster update command https://storyboard.openstack.org/#!/story/1722573 Patch in review: https://review.openstack.org/#/c/600806/ \o/
21:03:51 very nice
21:04:07 no more broken stacks because one char changed in the templates
21:04:27 And actually I want to mention the 4th one:
21:04:42 scale cluster as admin or other user in the same project
21:04:49 #link https://storyboard.openstack.org/#!/story/2002648
21:05:01 We have discussed this before,
21:05:23 and I think our only option is to pass the public key as a string.
21:05:38 plus the patch from imdigitaljim to not pass a keypair at all
21:05:57 yeah, this story won't be an issue for us
21:05:59 imdigitaljim: cbrumm you are not using keypairs at all
21:06:00 ?
21:06:04 correct
21:06:11 only sssd?
21:06:19 yeah
21:06:24 keypair is less secure as well
21:06:31 since if anyone gets access to said key
21:06:32 does this make sense to go upstream?
21:06:43 it is a ds right?
21:06:59 it's fine to support it but we should consider the option for without
21:07:05 we could have a recipe with some common bits
21:07:21 without sssd?
21:07:49 yeah that would be good, an option that works as you need it to and an option that will not worry about it at all for usages like sssd
21:07:59 yup
21:08:15 I've noticed this issue occur in other cases too btw
21:08:22 like?
21:08:49 not with keys but just policy control flow
21:09:04 oh, right
21:09:38 we have a current issue where admin/owner A creates a cluster in tenant A for user B; user B cannot create a config file (using CLI/API) for that cluster because they are neither admin nor owner
21:09:59 and user B belongs to tenant A as well
21:10:20 that is fixable in the policy file
21:10:20 we would like any user of tenant A to be able to generate a config for clusters of tenant A
21:10:25 not in its current state
21:10:46 it's an API enforcement issue where our issue sits
21:10:46 we have it, without any other change
21:10:53 one sec
21:11:01 maybe share the policy, perhaps we're missing something :D
21:12:03 "certificate:create": "rule:admin_or_owner or rule:cluster_user",
21:12:06 "certificate:get": "rule:admin_or_owner or rule:cluster_user",
21:12:31 what is your cluster_user rule
21:12:32 "admin_or_user": "is_admin:True or user_id:%(user_id)s",
21:12:32 "cluster_user": "user_id:%(trustee_user_id)s",
21:12:37 that's what we have
21:13:19 also: "admin_or_owner": "is_admin:True or project_id:%(project_id)s",
21:13:38 o/
21:13:49 hey canori02
21:14:12 yeah that's what we have, I think there's a condition that doesn't get met somewhere and it fails the policy
21:14:21 I'll have to find it, sorry, it was a couple of weeks ago
21:14:51 imdigitaljim: that is our policy, works for brtknr too
21:15:17 yeah, I'd like for it to work too :)
21:15:51 imdigitaljim: I'll double check in devstack too
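
For reference, here is a minimal sketch of the policy fragment quoted piecemeal above, assembled into a single override file. The file path and the restart step are assumptions about a typical deployment, not something stated in the meeting; with admin_or_owner matching on project_id, any user in the cluster's project should pass the certificate checks that config generation relies on.

    # Sketch only: the rules quoted above, gathered into one policy override file.
    # Path and service name vary by deployment; rules not listed here keep their
    # defaults.
    cat <<'EOF' > /etc/magnum/policy.json
    {
        "admin_or_owner": "is_admin:True or project_id:%(project_id)s",
        "admin_or_user": "is_admin:True or user_id:%(user_id)s",
        "cluster_user": "user_id:%(trustee_user_id)s",
        "certificate:create": "rule:admin_or_owner or rule:cluster_user",
        "certificate:get": "rule:admin_or_owner or rule:cluster_user"
    }
    EOF
    systemctl restart magnum-api   # pick up the new policy
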
21:16:14 ok, I have two more
21:16:41 This patch requires a first pass: [k8s] Add vulnerability scanner https://review.openstack.org/#/c/598142
21:16:59 it was done by an intern at CERN in the past months
21:17:20 it is a scanner to scan all images in a running cluster
21:17:35 combined with a clair server
21:17:52 You can have a look and give some input
21:17:58 oh excellent
21:18:51 The first iteration works only for public images; in subsequent steps we can enhance it to work for private registries too
21:19:04 great!
21:19:16 yeah that could be really useful
21:19:24 looks good on everything but I'll have some comments for the shell file
21:20:03 nice :) The last item, from me and ttsiouts, is about nodegroups. Nodegroups patches: https://review.openstack.org/#/q/status:open+project:openstack/magnum+branch:master+topic:magnum_nodegroups
21:20:26 yeah can we discuss that
21:20:32 I'm not sure what that's about / what its purpose is
21:20:33 We need to dig up the spec and bring it up to date, but these patches are a kickstart
21:20:34 imdigitaljim: I'm drafting a spec for this
21:20:35 I couldn't follow
21:20:39 oh ok great thanks
21:21:16 cool
21:21:33 I'll try to have it upstream asap
21:21:34 atm the clusters are homogeneous, one AZ, one flavor
21:21:54 oh is it for cluster node groups
21:21:57 I understand
21:22:00 strigazi: is this to provide the option to support different types of minions in the cluster?
21:22:03 distinctly
21:22:06 yes
21:22:08 I think I have some other thoughts too for the WIP
21:22:08 neat
21:22:41 From our side,
21:23:02 the idea is to have a minimum of two groups of nodes
21:23:11 one for master, one for minion
21:23:25 and then add as you go, like in GKE
21:23:35 in GKE they call them nodepools
21:24:27 we don't have a strong opinion on the master nodegroups, but I think it is the most straightforward option atm
21:24:38 imdigitaljim: do you have some quick input
21:24:51 we can take the details in the spec
21:25:02 yeah, a couple of questions
21:25:24 so, is this intended to be runtime nodegroups or determined at creation time?
21:26:06 the first two nodegroups will be created at creation time and then the user will add more
21:26:22 like now
21:26:28 when you create a cluster
21:26:50 the heat stack has two resource groups, one for masters, one for minions
21:27:02 this can be the minimum
21:27:20 could you add a nodegroup to a cluster that was created without it?
21:27:23 at a later time?
21:27:36 then you call POST cluster/UUID/nodegroups and you add more
21:27:45 interesting
21:28:17 for this design I was thinking something more clever with leveraging heat more
21:28:18 https://docs.openstack.org/heat/latest/template_guide/hot_spec.html
21:28:23 colin-: it could be possible, but I'm not sure what the benefit is, IMO, for this use case
21:28:30 if we update the minimum heat we could have a repeat for the # of pools
21:28:42 imdigitaljim: this is what we want to do ^^
21:28:58 so like 1-N pools, and provide the data through the template (for now)
21:28:59 imdigitaljim: not many stacks
21:29:14 pools/resourcegroups
21:29:30 a shallow nested stack
21:29:52 yeah
21:29:58 so where do all these controllers come into play
21:30:09 I don't see why these would be necessary to accomplish node pools
21:30:19 colin-: for this use case we could have the concept of external groups or smth
21:30:41 ok
21:31:17 imdigitaljim: in the end it would be one stack. But end users that don't know about heat need a way to express this
21:31:32 imdigitaljim: we need a route in the api
21:32:26 imdigitaljim: otherwise we need to do CRUD operations on a field or many fields in the cluster
21:33:00 have a nodegroup field that describes those pools/groups
21:33:34 oh is this part for the feedback for api/cli/ on what exists?
21:34:00 feedback/usage via cli/api?
21:34:39 I think I got the question and I'll say yes
21:34:43 :)
21:34:53 let me sit on it a little longer
21:35:01 and maybe if you can answer those questions from ricardo
21:35:10 ok
21:35:11 but if it's that then I can better judge the PR :)
21:35:34 but I do think I understand what these PRs are now
21:35:46 :)
21:36:31 yeah
21:36:33 now I see
21:36:35 cool beans
21:36:37 looks about right
21:36:43 I'll keep following it
21:36:47 thanks for clarifying!
21:36:53 :)
21:37:01 ttsiouts++
21:37:11 :)
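
The "POST cluster/UUID/nodegroups" flow above is still at the spec and WIP-patch stage, so the sketch below only illustrates what adding a pool to an existing cluster could look like; the payload fields and environment variables are assumptions, not the final API contract.

    # Sketch only: the proposed nodegroup route discussed above. Field names
    # (name, flavor_id, node_count) are illustrative; the real contract will come
    # out of the spec and the patches under review.
    curl -s -X POST \
      -H "X-Auth-Token: $OS_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"name": "gpu-pool", "flavor_id": "m1.xlarge", "node_count": 3}' \
      "$MAGNUM_URL/v1/clusters/$CLUSTER_UUID/nodegroups"
    # On the Heat side, each pool would map to its own ResourceGroup in a shallow
    # nested stack (1-N pools, one stack overall), as discussed above.
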
21:37:37 oh, I would like to add two more things
21:37:46 one is for imdigitaljim
21:38:09 Do you have experience with rebooting cluster nodes?
21:38:32 yeah
21:38:35 somewhat
21:38:42 our experience is pretty unpleasant with flannel
21:38:58 we've played a lot with killing and creating minions, rebooting is generally fine too
21:39:24 ^
21:39:38 and also killing LBs and recovering
21:39:39 with the current model of flannel, 30% of the nodes lose network
21:40:09 I hope that the self-hosted flannel works better
21:40:18 yeah, I feel like it would
21:40:44 I think you guys are doing the right thing switching to a self-hosted flannel imho
21:40:47 cbrumm: imdigitaljim your experience is with calico hosted on k8s, right?
21:40:49 or join us with calico
21:40:54 yeah
21:40:55 yeah
21:41:05 did you guys already consider that strigazi?
21:41:08 we're using latest calico 3.3.9
21:41:09 must have at some point
21:41:30 calico has "just worked" for us
21:41:33 we stuck with what we know, no other reason so far
21:41:45 understood
21:41:46 cbrumm+1
21:42:00 but we must give it a go
21:42:09 it's nice not to deal with any layer 2 matters, I have to say
21:42:21 we also have tungsten waiting in the corner and we kind of wait for it
21:42:22 been a relief for me personally from an operator perspective to use calico only
21:42:45 colin-: you use calico for vms too?
21:43:27 colin is with us
21:43:28 as much as imdigitaljim and cbrumm do :)
21:43:57 oh, right :)
21:44:17 it is close to midnight here, sorry :)
21:45:26 the last thing is for people interested in Fedora CoreOS
21:46:06 I promised the FCOS team to try systemd portable services for kubelet and dockerd/containerd
21:46:38 But I didn't have time so far; anyone who wants to help is more than welcome
21:46:50 I'm fetching the pointer
21:47:30 not sure we'll have time to try it out
21:47:35 not sure we can aid with that yet but keep a finger on them for a minimal image ;)
21:47:36 might, but our timeline is tight
21:47:41 #link https://github.com/systemd/systemd/blob/master/docs/PORTABLE_SERVICES.md
21:48:01 strigazi: I'll catch up on the literature
21:48:33 The goal is to run the kubelet as a portable systemd service
21:49:08 oh I see
21:49:13 I just wanted to share it with you
21:49:16 it's super similar to the atomic install model already
21:49:18 yeah
21:49:21 I'll read up some more
21:49:24 maybe canori01 is interested too
21:50:00 imdigitaljim: and should work in many distros (? or !)
21:50:13 yeah
21:50:35 it's the same pattern/benefits as containers
21:50:49 just rebranded / slightly different
21:51:29 plus maintained by the systemd team
21:51:47 I think this is the right thing to look into
21:51:48 I can see kubelet being done fairly easily
21:51:57 but dockerd/containerd would be much more complicated
21:52:06 We'll all want to make sure it works well, but it's the correct starting place
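
Since nobody in the meeting has tried this yet, the following is only a rough sketch of what running the kubelet as a portable service could look like, following the PORTABLE_SERVICES document linked above; the image name and its contents are hypothetical.

    # Sketch only, untested: attach a (hypothetical) portable image that ships the
    # kubelet binary plus a kubelet.service unit, then drive it like any other
    # systemd service.
    portablectl attach /var/lib/portables/kubelet_1.11.raw
    portablectl list                  # confirm the image is attached
    systemctl start kubelet.service   # unit copied out of the portable image
    # detach again when done experimenting
    systemctl stop kubelet.service
    portablectl detach /var/lib/portables/kubelet_1.11.raw
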
21:52:41 imdigitaljim: would it though? we managed to run dockerd in a syscontainer already
21:52:55 let's see
21:53:07 perhaps
21:53:20 maybe I'm thinking of something more complicated
21:53:23 and not this context
21:53:28 but I'll check it out
21:53:33 do you have the dockerd in a syscontainer?
21:53:40 does it look like the dind project?
21:53:58 yes, for swarm, but we look to use it for k8s too
21:54:25 imdigitaljim: no, not like dind
21:54:45 imdigitaljim: https://gitlab.cern.ch/cloud/docker-ce-centos/
21:55:35 oh ok
21:55:37 cool
21:55:40 and this works for you already?
21:56:02 yes
21:56:28 for swarm, for a year or so
21:56:46 I just personally don't have intimate knowledge of the dockerd requirements but if you've got it already it should be cake!
21:56:54 for k8s we didn't put a lot of effort in, but for some tests it was fine
21:57:27 imdigitaljim: the only corner case can be mounting weird dirs on the host
21:57:30 yeah
21:57:40 that's where my complexities were concerned
21:57:44 imdigitaljim: our mount points are pretty much standard
21:57:48 weird dirs/weird mounts
21:58:20 ./weird permissions
21:58:21 interesting idea, would be curious to see how it's implemented for k8s and how kubelet reacts
21:58:30 imdigitaljim: we have tested mounting cinder volumes too
21:59:00 anyways yeah, we'll keep an eye on it and catch up
21:59:40 imdigitaljim: colin-: if dockerd and kubelet share the proper bind mounts it "Just Works"
22:00:02 nice
22:00:24 good to remember that does still happen in real life :)
22:00:29 (sometimes)
22:00:34 :)
22:00:34 'proper' :P
22:00:42 is the complexity
22:00:43 but yeah
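
To make the "proper bind mounts" point concrete: when the kubelet itself runs confined (system container or portable service), the host paths that dockerd and the kubelet both touch have to be shared into it, otherwise things like the cinder volume mounts mentioned above break. The paths below are the usual set and the image reference is hypothetical; treat this as a sketch, not an exhaustive reference.

    # Sketch only: typical bind mounts for a containerised kubelet so dockerd and
    # kubelet see the same mounts. /var/lib/kubelet needs shared propagation so
    # volume mounts made by one side (e.g. cinder) are visible to the other.
    docker run -d --name kubelet \
      --privileged --net=host --pid=host \
      -v /etc/kubernetes:/etc/kubernetes:ro \
      -v /etc/ssl/certs:/etc/ssl/certs:ro \
      -v /var/lib/kubelet:/var/lib/kubelet:rshared \
      -v /var/lib/docker:/var/lib/docker \
      -v /var/run:/var/run \
      -v /sys:/sys:ro \
      -v /dev:/dev \
      example/kubelet:v1.11 \
      kubelet --kubeconfig=/etc/kubernetes/kubelet.conf
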
22:01:00 need to go, bye everyone
22:01:07 we are an hour in
22:01:12 cbrumm: thanks
22:01:48 let's wrap then
22:02:04 Thanks for joining the meeting everyone
22:02:10 ttyl!
22:02:24 bye!
22:02:36 #endmeeting