20:59:26 #startmeeting containers
20:59:27 Meeting started Tue Feb 26 20:59:26 2019 UTC and is due to finish in 60 minutes. The chair is strigazi. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:59:28 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:59:30 The meeting name has been set to 'containers'
20:59:34 #topic Roll Call
20:59:36 o/
20:59:40 o/
20:59:43 o/
20:59:43 o/
21:00:00 o/
21:00:43 #topic Stories/Tasks
21:00:55 1. openstack autoscaler
21:01:16 #link https://github.com/kubernetes/autoscaler/pull/1690
21:01:28 by far the highest number of comments in the repo
21:01:55 was glad to get a chance to review, any thoughts on how far off a merge is?
21:02:14 schaney: are you Scott?
21:02:18 will we want to wait on the magnum resize API?
21:02:24 flwang: yes, that's me
21:02:28 I think it is close if we show agreement to them
21:02:31 schaney: welcome, thanks for joining us
21:02:39 from the openstack and magnum side
21:02:41 :)
21:02:58 From the openstack PoV it should be ok,
21:03:04 schaney: we also need work in gophercloud for the new api, so i'm not sure if we can wait
21:03:13 strigazi: ^
21:03:20 since they (the CA maintainers) are ok to have two
21:03:52 What difference does it make to us?
21:03:53 strigazi: yep, we can get the current one in, and propose the new magnum_manager
21:03:56 any thoughts on moving away from the API polling once the magnum implementation is complete?
21:04:03 no difference for Magnum
21:04:08 if we agree on the design, implementation and direction
21:04:25 schaney: we can do that too
21:04:46 awesome
21:05:06 we can leave that as a third step?
21:05:07 strigazi: can we just rename the current pr to openstack_magnum_manager.go
21:05:14 First, merge it like it is now
21:05:20 and refactor it once we have the new api
21:05:23 2nd, add the resize api
21:05:33 and then remove polling
21:06:02 schaney: makes sense?
21:06:22 all this would happen in this cycle
21:06:22 sounds good, it will be easier to start tackling specific areas once it's out there
21:06:30 FWIW, i don't mind getting the current PR in with its current status, and as long as the new /resize api is ready, we can decide how to do it in the CA
21:07:14 if there are no more objections to the current implementation, we can push the CA team to merge
21:08:07 it's not so much an objection, but how will the cluster API stuff affect this?
21:08:12 Is the ip vs id vs uuid thing clear?
21:08:22 brtknr: not at all
21:08:38 the cluster api will be very different
21:08:48 I am not actually clear on how your templates create the IP mapping
21:08:54 like google has two implementations, one for gce and one for gke
21:09:10 schaney: are you talking about this https://review.openstack.org/639053 ?
21:09:37 schaney: flwang bear with me for the explanation, also this change needs more comments in the commit message ^^
21:10:02 in heat, a resource group creates a stack with depth two
21:10:33 the first nested stack, kube_minions, has a ref_map output
21:10:39 which goes like this:
21:10:45 0:
21:10:51 1:
21:10:52 and so on
21:11:29 These indices are the minion-INDEX numbers
21:11:48 and the indices in the ResourceGroup
21:12:12 A RG supports removal_policies
21:12:49 which means you can pass a list of indices as a param, and heat will remove these resources from the RG
21:13:12 I am not clear on what is using the change made in https://review.openstack.org/639053 atm
21:13:26 additionally, heat will track which indices have been removed and won't create them again
21:13:31 brtknr: bear with me
21:13:40 so,
21:14:00 in the first implementation of removal policies in the k8s templates
21:14:27 the IP was used as an id in this list:
21:14:36 0: private-ip-1
21:14:42 1: private-ip-2
21:14:55 (or zero based :))
21:15:25 then it was changed with this commit:
21:15:54 https://github.com/openstack/magnum/commit/3ca2eb30369a00240a92c254c95bea6c7a60fee1
21:16:07 and the ref_map became like this:
21:16:26 0: stack_id_of_nested_stack_0_depth_2
21:16:32 1: stack_id_of_nested_stack_1_depth_2
21:16:59 and the above patch broke the removal policy of the resource group
21:17:23 meaning, if you passed a list of ips to the removal policy after the above patch
21:17:50 heat wouldn't understand which index in the RG that ip belonged to
21:18:01 that is why it didn't work for flwang and schaney
21:18:14 gotcha
21:18:30 flwang now proposes a change
21:18:37 to make the ref_map:
21:18:51 0: nova_server_uuid_0
21:18:55 1: nova_server_uuid_1
21:19:11 you can inspect this map in a current cluster like this:
21:19:43 sorry i'm late
21:19:48 openstack stack list --nested | grep | grep kube_minions
21:20:03 and then show the nested stack of depth 1
21:20:07 you will see the list
21:20:14 you will see the ref_map
21:20:15 eg:
21:21:41 `openstack stack list --nested` is a nice trick!
21:21:43 TIL
21:22:04 http://paste.openstack.org/show/746304/
21:22:22 this is with the IP ^^
21:22:23 i've always done `openstack stack resource list k8s-stack --nested-depth=4`
21:23:17 o/
21:23:23 http://paste.openstack.org/show/746305/
21:23:40 this is with the stack_id
21:23:49 \o
21:23:51 check uuid b4e8a1ec-0b76-48cb-b486-2a5459ea45d4
21:24:01 in the ref_map and in the list of stacks
21:24:24 i like the new change to uuid =)
21:24:54 imdigitaljim: yep, uuid is more reliable than ip for some cases
21:24:55 after said change, we will see the nova uuid there
21:25:24 so, in heat we can pass either the server uuid or the index
21:25:36 then heat will store the removed ids here:
21:26:24 http://paste.openstack.org/show/746306/
21:26:33 makes sense?
21:27:36 sounds good to me
21:27:52 yep! the confusion on my end was the "output" relationship to the removal policy member
21:28:30 and the nested stack vs the resource representing the stack
21:28:37 makes sense now though
21:29:03 I spent a full morning with thomas on this
21:29:51 do you need https://review.openstack.org/639053 for resize to work?
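To make the mechanics above easier to follow, here is a minimal HOT sketch of the pattern strigazi describes. It is illustrative only, not the verbatim magnum templates: the `minions_to_remove` parameter name and the `kube-minion` resource name are assumptions, while `kube_minions`, `removal_policies` and the `OS::stack_id` output are the pieces discussed in the meeting.

```yaml
# Minimal sketch of the pattern discussed above; names are illustrative.
heat_template_version: 2016-04-08

parameters:
  number_of_minions:
    type: number
  minions_to_remove:
    # RG member indices or refs that heat should delete on the next update
    type: comma_delimited_list
    default: []

resources:
  kube_minions:
    type: OS::Heat::ResourceGroup
    properties:
      count: {get_param: number_of_minions}
      removal_policies:
        # heat removes exactly these members and never reuses their indices
        - resource_list: {get_param: minions_to_remove}
      resource_def:
        # each member is a nested stack (depth 2), one per minion
        type: kubeminion.yaml

# kubeminion.yaml (the nested template) decides what the parent's ref_map
# holds for each index through an OS::stack_id output: first the private IP,
# then the nested stack id (commit 3ca2eb30), and with review 639053 the
# nova server uuid, e.g.:
#
#   outputs:
#     OS::stack_id:
#       value: {get_resource: kube-minion}   # illustrative resource name
```

With the nova uuid in the ref_map, whoever asks for a removal (the user via the resize API, or the cluster autoscaler via the ProviderID it already knows) can pass that uuid straight into removal_policies instead of reverse-mapping it through the nested stacks.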
21:29:59 brtknr: yes
21:30:12 brtknr: to work by giving nova uuids
21:30:26 as it doesn't seem linked on gerrit as a dependency
21:30:29 in https://github.com/openstack/magnum/commit/3ca2eb30369a00240a92c254c95bea6c7a60fee1 the name for the key is OS::stack_id, does that need to change, or will that be confusing if we use it for something else?
21:31:01 jakeyip: i don't think we have an option there
21:31:13 it needs to be explained well
21:31:15 yes, probably better to call it nova_uuid?
21:31:26 brtknr: because i assume https://review.openstack.org/639053 will be merged very soon
21:31:27 or OS::nova_uuid?
21:31:38 but the resize patch may take a bit longer, sorry for the confusion
21:31:46 brtknr: nova_uuid or any other name doesn't work
21:32:04 not sure if OS::nova_uuid makes sense to heat
21:32:09 (to me it does)
21:32:39 oh okay! i didn't realise it was a component of heat
21:33:04 brtknr: needs to be double checked
21:33:29 the important part is that the ref_map I mentioned before has the correct content
21:33:43 https://docs.openstack.org/heat/rocky/template_guide/composition.html
21:33:53 sounds like we're stuck with OS::stack_id
21:34:29 yeap
21:34:36 we should move on and discuss details on the patch
21:34:46 with comments in the code, it should be ok
21:35:18 strigazi: i will update the patch based on the above discussion
21:35:24 schaney and colleagues, flwang brtknr we have agreement right?
21:35:43 +1
21:35:53 imdigitaljim: colin- eandersson_ ^^
21:36:18 on the UUID portion?
21:36:23 yeah UUID will work well for us
21:36:24 yes
21:36:26 lgtm
21:36:29 thanks for the clarity, uuid looks good and works for us
21:36:51 \o/
21:37:15 I don't have objections but I would like to read the patch first, I am a bit confused whether stack_id is the same as nova_uuid or whether you can get one from the other
21:37:33 jakeyip: they are different
21:38:04 but that stack_id logically corresponds to that nova_uuid
21:38:19 if we can derive nova_uuid from stack_id should we do that instead?
21:38:25 sounds good
21:39:06 jakeyip: well, it is the other way round
21:39:20 jakeyip: derive the stack_id from the nova_uuid
21:39:36 well, the stack contains the nova server
21:39:48 so it makes sense to use the stack anyway
21:39:52 jakeyip: the CA or the user will know which server they want to remove
21:40:00 I thought the stack will have a nova_server with the uuid
21:40:25 imdigitaljim: that is correct, but the user or the CA will know the uuid
21:40:50 you mean the nova uuid?
21:40:54 strigazi:^
21:40:56 yes
21:41:30 eg when you do kubectl describe node you see the nova_uuid of the server
21:42:03 yeah but it's in the autoscaler, i'm missing why the user's knowledge even matters
21:42:14 jakeyip: also to be clear, the nova_uuid won't replace the stack uuid, the stack will still have its uuid
21:42:47 either way, i'm happy with the approach
21:42:50 yes. but can whichever code just look for the stack id and the OS::Nova::Server uuid of that stack?
21:42:51 good choices
21:43:05 imdigitaljim: for example, for the resize API there are cases where a user wants to get rid of a node
21:43:23 oh yeah you're right, under `ProviderID: openstack:///231ba791-91ec-4540-a580-3ef493e36055`
21:43:23 ah fair point
21:43:25 good call
21:44:37 jakeyip: can you imagine the user frustration? additionally, in the autoscaler
21:44:55 the CA wants to remove nova_server with uuid A.
21:45:45 then the CA needs to call heat and list all nested stacks to find which stack this server belongs to
21:46:05 it saves a gnarly reverse lookup =)
21:46:15 certificate authority?
21:46:24 that's just code? I am more worried about hijacking a variable that used to mean one thing and making it mean another
21:46:25 sorry, not following the thread of convo
21:46:28 colin-: cluster autoscaler?
21:46:29 maybe it can be done with a single list? or with an extra map we maintain as a stack output
21:46:32 oh
21:46:42 that might get tricky :)
21:46:49 CAS maybe haha
21:46:52 colin-: I got used to it xD
21:47:23 should we move to the next topic?
21:47:51 strigazi: i'd like to know about the rolling upgrade work and the design of the resize/upgrade api
21:48:08 could you include both server_id and stack_id in the output and use that as a reference point?
21:48:09 is that a thing?
21:48:20 imdigitaljim: I don't think so
21:48:30 agree with flwang, maybe we discuss this straight after the meeting
21:48:37 i'm wondering if we should use POST instead of PATCH for actions and if we should follow the actions api design like nova/cinder/etc
21:48:39 would be interesting to test
21:48:47 if the heat update can target either one
21:48:58 just like if ip AND stackid are present
21:49:14 because then it would be a trivial problem
21:49:32 it is a map, so I don't think so
21:50:11 flwang: what do you mean?
21:50:41 flwang: is there a review with this topic?
21:50:53 strigazi: is the key it looks for OS::stack_id?
21:51:00 resize != upgrade
21:51:01 (i'm new to some of this convo)
21:51:06 https://storyboard.openstack.org/#!/story/2005054
21:51:42 imdigitaljim: the key is stack_id
21:51:51 kk
21:51:53 thanks
21:52:00 flwang: can you explain the PATCH vs POST question more
21:52:09 strigazi: nova/cinder is using an api like /action and including the action name in the post body
21:52:32 flwang: we can do that
21:52:49 so two points, 1. should we use POST instead of PATCH 2. should we follow the post body like nova/cinder, etc
21:52:55 flwang: in practice, we won't see any difference
21:53:14 pointer on 2.?
21:53:29 strigazi: i know, but i think we should make openstack like a building, instead of building blocks with different designs
21:53:33 well
21:53:38 following a more restful paradigm
21:53:45 patch is practically the only appropriate option
21:53:45 https://fullstack-developer.academy/restful-api-design-post-vs-put-vs-patch/
21:53:47 strigazi: https://developer.openstack.org/api-ref/compute/?expanded=rebuild-server-rebuild-action-detail,resume-suspended-server-resume-action-detail
21:53:47 something like this
21:54:11 imdigitaljim: openstack does have some guidelines about api design
21:54:33 but the thing i'm discussing is a bit different from the method
21:55:09 flwang: pardon my ignorance, what is the difference between this and PATCH at https://developer.openstack.org/api-ref/container-infrastructure-management/?expanded=update-information-of-cluster-detail#update-information-of-cluster
21:55:57 jakeyip: we're going to add a new /actions api for upgrade and resize
21:56:00 imdigitaljim: flwang I agree with flwang, we can follow a similar pattern to other projects.
21:56:20 i also agree with following similar patterns as other projects
21:56:26 just making sure we understand them =)
21:56:53 imdigitaljim: thanks, and yes, we're aware of the http method differences
21:57:04 flwang: for resize it will be something in addition to the original PATCH function?
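For context on the nova/cinder "actions" pattern flwang points at: the action name is the top-level key of the body posted to an actions URL, and the HTTP method stays POST while the action name carries the semantics. A magnum resize expressed that way could look roughly like the sketch below; the route and field names are purely hypothetical, since the real design is what the resize patch and story 2005054 are still settling.

```yaml
# Hypothetical request following the nova/cinder actions pattern; the route
# and the field names are assumptions, not the agreed magnum API.
#
#   POST /v1/clusters/<cluster-uuid>/actions/resize
#
resize:
  node_count: 4
  nodes_to_remove:
    # nova server uuid, i.e. what kubectl shows as the node's ProviderID
    - 231ba791-91ec-4540-a580-3ef493e36055
```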
21:57:21 and here, upgrade/resize are really not a normal update of the resource (the cluster here)
21:57:23 personally I prefer patch, but for the data model we have there is no real difference, at least IMO
21:57:25 flwang: although nova seems to use PUT for update rather than PATCH or POST
21:57:47 in both the resize and upgrade cases, we're doing node replacement, delete, add new, etc
21:58:03 brtknr: yep, but that's a history issue i think
21:58:31 brtknr: also put is used for properties/metadata only
21:58:50 when we say PATCH, it's more like a normal partial update of the resource
21:59:02 but those actions are really beyond that
21:59:41 I might add that they are "to infinity and beyond"
21:59:56 strigazi: haha, buzz lightyear fans here
22:00:57 hmm i'd vote for PATCH but there is not much precedent in other openstack projects... i wonder why
22:00:58 I feel POST is good. PUT/PATCH is more restrictive. It's much easier to refactor POST into PATCH/PUT later if it makes sense, but not the other way round
22:01:28 since we don't have a concrete idea of how it is going to look, let us go with POST for now
22:01:39 yep, we can discuss on the patch
22:01:53 we're running out of time
22:01:57 strigazi: ^
22:02:04 yes,
22:02:29 just very quickly, brtknr can you mention the kubelet/node-ip thing?
22:02:48 post makes sense for these scaling operations
22:02:55 but maybe patch if versions are updated or anything?
22:03:22 strigazi: yes, it's been bugging me for weeks, my minion InternalIP keeps flipping between the ip addresses it has been assigned on 3 different interfaces...
22:03:56 I have a special setup where each node has 3 interfaces, 1 for the provider network, 1 high throughput and 1 high latency
22:03:57 I think we can drop the node-ip, since the certificate has only one ip
22:04:19 however assigning node-ip is not working
22:04:25 whose poor app gets the high latency card XD?
22:04:41 colin-: low latency :P
22:04:43 sorry, understand we're short on time
22:05:12 i have applied the --node-ip arg to kubelet and the ip doesn't stick, the ips still keep changing
22:05:47 the consequence of this is that pods running on those minions become unavailable for the duration that the ip is on a different interface
22:06:22 my temporary workaround is changing the order that kube-apiserver resolves the host to Hostname,InternalIP,ExternalIP
22:06:28 brtknr: I thought it might be simpler :) we can discuss it tomorrow or in storyboard/mailing list?
22:06:35 random question
22:06:36 It was InternalIP,Hostname,ExternalIP
22:06:39 https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
22:06:48 --address 0.0.0.0
22:06:53 do you bind a specific address?
22:07:06 imdigitaljim: yes, i bound it to the node IP
22:07:10 gotcha
22:07:15 it was already bound to the node ip
22:07:19 by default
22:07:21 just curious how that is all done with multi-interface
22:07:50 personally curious how kube-proxy or similar would handle such a setup and rule/translation enforcement etc
22:07:54 is there any reason why we can't do Hostname,InternalIP,ExternalIP ordering by default
22:07:58 https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/
22:08:01 same with kube-proxy?
22:08:08 do you do stuff differently here?
22:09:06 I haven't touched the kube-proxy settings because I couldn't find them
22:09:30 Feilong Wang proposed openstack/magnum master: [WIP] Support /actions/resize API https://review.openstack.org/638572
22:09:30 --bind-address 0.0.0.0 Default: 0.0.0.0
22:09:32 maybe?
22:09:32 brtknr: /etc/kubernetes/proxy
22:09:36 check this out?
22:09:54 in magnum it has the default
22:10:01 which is all interfaces?
22:10:04 yes
22:10:07 shouldn't it be node only here?
22:10:15 for proxy?
22:10:30 i guess what he's doing with his interfaces
22:10:41 oh okay, i'll try adding --bind-address=NODE_IP
22:10:55 to /etc/kubernetes/proxy
22:11:12 i'm just curious, i don't have a solution
22:11:17 failing that, i'd try imdigitaljim's suggestion of wildcarding it
22:11:20 but maybe worth a shot
22:11:20 just for troubleshooting
22:11:33 wildcarding?
22:11:34 oh, that may be the default, my mistake
22:11:50 0.0.0.0/0
22:12:35 colin-: how would that help?
22:12:46 according to the docs, 0.0.0.0 is already the default
22:12:48 https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/
22:12:58 for --bind-address
22:12:59 brtknr: colin- imdigitaljim let's end the meeting and just continue?
22:13:15 sure, this is not going to be resolved very easily :)
22:13:30 thanks
22:13:43 yeah i'm just throwing out ideas
22:13:50 maybe a few things to think about/try
22:14:16 maybe to get brtknr unstuck
22:14:42 flwang: brtknr jakeyip schaney colin- eandersson imdigitaljim thanks for joining and for the discussion on the autoscaler.
22:14:54 yeah thanks for clearing up some stuff =)
22:14:58 looking forward to the merge
22:15:11 :)
22:15:19 #endmeeting
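A footnote on the kube-proxy bind address discussed at the end: the knob in question is the --bind-address flag carried in /etc/kubernetes/proxy on the nodes. Expressed as a KubeProxyConfiguration file it would look like the sketch below; this is only to illustrate the setting, magnum at this point configures kube-proxy through flags rather than this file, and 10.0.0.5 is a placeholder for the node IP.

```yaml
# Illustration of the bind-address setting only; magnum sets kube-proxy flags
# in /etc/kubernetes/proxy rather than shipping this file.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
bindAddress: 10.0.0.5   # the node IP; the default 0.0.0.0 binds every interface
```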