16:01:02 <adrian_otto> #startmeeting containers
16:01:03 <openstack> Meeting started Tue Oct 18 16:01:02 2016 UTC and is due to finish in 60 minutes.  The chair is adrian_otto. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:07 <openstack> The meeting name has been set to 'containers'
16:01:27 <adrian_otto> #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2016-10-18_1600_UTC Our Agenda
16:01:30 <adrian_otto> #topic Roll Call
16:01:31 <strigazi> Spyros Trigazis o/
16:01:32 <Drago> o/
16:01:32 <adrian_otto> Adrian Otto
16:01:36 <randallburt> o/
16:01:36 <muralia> o/
16:01:37 <jvgrant> Jaycen Grant
16:01:50 <tonanhngo> Ton Ngo
16:01:57 <Fengshengqin1> Fengshengqin
16:02:23 <adrian_otto> hello strigazi Drago randallburt muralia jvgrant tonanhngo Fengshengqin1
16:02:40 <muralia> Happy Tuesday everyone
16:03:28 <hongbin> o/
16:03:35 <vijendar1> o/
16:03:47 <adrian_otto> hello hongbin and vijendar1
16:04:08 <adrian_otto> let's begin
16:04:14 <adrian_otto> #topic Announcements
16:04:24 <adrian_otto> 1) Reminder: There will be no team meeting next week on 2016-10-25 because that is the week of the OpenStack Summit in Barcelona.
16:04:33 <swatson_> forgot to check in o/
16:04:40 <adrian_otto> Our next meeting will be on 2016-11-01 at 1600 UTC.
16:04:44 <adrian_otto> welcome swatson_
16:05:01 <adrian_otto> 2) Reminder: Please review our NodeGroup spec draft to prepare for our Summit discussion on this topic:
16:05:07 <adrian_otto> #link https://review.openstack.org/352734 [WIP] Add NodeGroup specification
16:05:35 <adrian_otto> this is a key topic of collaboration for our upcoming summit, and we are planning to share our plans with the community, so we'd like to be well prepared.
16:05:49 <adrian_otto> any other announcements from team members?
16:06:31 <swatson_> ^ the client-side stuff for the ID blueprint is almost ready
16:06:39 <swatson_> https://review.openstack.org/#/c/383930/
16:07:03 <adrian_otto> oh, great, swatson_!
16:07:21 <swatson_> adrian_otto: Thanks! More eyes on it would be good :)
16:07:35 <adrian_otto> #topic Review Action Items
16:07:37 <adrian_otto> 1) adrian_otto follow up with Kuryr PTL to arrange a joint session
16:07:43 <adrian_otto> Status: Follow-up complete, coordination is in progress
16:07:56 <adrian_otto> I'll follow up with an ML message with further details
16:08:27 <tonanhngo> They have allocated their first session on Thursday for Magnum
16:08:31 <adrian_otto> #action adrian_otto to email Magnum/Kuryr teams with details about joint meetup plans prior to summit.
16:08:44 <adrian_otto> then my action will be really easy!
16:08:50 <adrian_otto> thanks tonanhngo
16:08:56 <adrian_otto> 2) adrian_otto to remove BM Blueprint from Essential BP Review on team agenda
16:09:00 <adrian_otto> Status: COMPLETE
16:09:10 <adrian_otto> 3) adrian_otto to remove Docs Blueprint from Essential BP Review on team agenda
16:09:11 <adrian_otto> Status: COMPLETE
16:09:18 <adrian_otto> 4) strigazi to create a magnum-specs repo
16:09:44 <strigazi> I was sick from Thursday, I'll do it this week
16:09:58 <adrian_otto> oh, are you feeling better yet?
16:10:03 <strigazi> yeap
16:10:09 <adrian_otto> oh, good.
16:10:25 <adrian_otto> I'll carry that one forward
16:10:30 <strigazi> thanks
16:10:34 <adrian_otto> #action strigazi to create a magnum-specs repo
16:10:42 <adrian_otto> #topic Blueprints/Bugs/Reviews/Ideas
16:10:53 <adrian_otto> Do any team members have remarks for this section?
16:10:57 <Drago> adrian_otto: Are we going to discuss the NodeGroup spec during this meeting?
16:11:11 <adrian_otto> Drago: yes, let's cover that now
16:11:20 <Drago> Thanks
16:11:44 <adrian_otto> #link https://review.openstack.org/352734 [WIP] Add NodeGroup specification
16:12:06 <strigazi> Add information about how Heat stacks are handled for Clusters and NodeGroups
16:12:23 <strigazi> I think we can discuss this one.
16:12:34 <jvgrant> +1, this will be a significant change
16:13:02 <Drago> +1, my current thought is to have a stack for the Cluster, and 1 stack per NodeGroup
16:13:06 <adrian_otto> Our plan is to use a series of nested stacks, correct?
16:13:24 <strigazi> I had a private discussion with Drago on that, we were on a good path. Drago do you want to explain?
16:13:26 <randallburt> Drago:  same question ^ as nested stacks or completely separate?
16:13:36 <Drago> *Nested* stacks are going to be difficult to manage, because nested means they're inside another template
16:13:43 <strigazi> Also
16:13:59 <strigazi> nested stack resources are our main heat bottleneck
16:14:32 <randallburt> iirc, the heat team is working on that. TripleO has similar issues
16:14:36 <strigazi> This part is not parallel and scales linearly
16:14:48 <adrian_otto> that's a bottleneck when creating a cluster (or creating a node group in the future), correct strigazi ?
16:15:18 <Drago> If we have a Cluster stack and a stack per nodegroup, it would not exacerbate the problem
16:15:38 <strigazi> yes, when heat validates all nested stack *resources* after creating the masters and before creating the workers
16:15:40 <Drago> I am not sure if the stacks being completely independent would help or not
16:16:18 <tonanhngo> Drago: How would resources be referenced across the separate stacks?
16:16:23 <randallburt> if you're running heat with a single engine and worker, you won't see much difference either way
16:16:56 <strigazi> randallburt, ^^ what do you mean?
16:17:03 <adrian_otto> randallburt: what if the heat service has been scaled across numerous hosts?
16:17:29 <randallburt> then there should be no major difference between nested or multiple adrian_otto
16:17:44 <adrian_otto> the question I think we are after is if magnum can arrange heat's work so that it scales better than linear in terms of work completed by number of heat engines.
16:17:56 <randallburt> the validation itself would take longer with nested though
16:18:11 <Drago> tonanhngo: The properties set for the current nested stacks (in the top level template e.g. kubecluster.yaml) would be converted to outputs. Then magnum could pick them up and pass them as parameters to the nodegroup's stacks
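[Editor's sketch] The hand-off Drago describes (cluster stack outputs picked up by magnum and passed as parameters to each nodegroup stack) might look roughly like the following. This is a minimal illustration under stated assumptions: the function name, the output keys, and the nodegroup attributes are all hypothetical and are not Magnum or Heat APIs.

```python
# Hypothetical sketch: none of these names come from Magnum or Heat.
# It only illustrates "cluster stack outputs become nodegroup stack parameters".

def nodegroup_parameters(cluster_outputs, nodegroup_attrs):
    """Build the parameter set for one nodegroup stack.

    Shared values (for example network IDs) come from the cluster stack's
    outputs; nodegroup-specific attributes are layered on top.
    """
    params = dict(cluster_outputs)   # copy so the shared outputs are not mutated
    params.update(nodegroup_attrs)   # nodegroup attributes win on key conflicts
    return params


# Assumed example values, purely illustrative.
cluster_outputs = {"fixed_network": "net-1", "fixed_subnet": "subnet-1"}
nodegroups = {
    "master": {"role": "master", "node_count": 1},
    "worker": {"role": "worker", "node_count": 3},
}

# One parameter set per nodegroup, i.e. one stack per nodegroup.
stack_params = {
    name: nodegroup_parameters(cluster_outputs, attrs)
    for name, attrs in nodegroups.items()
}
```

In this arrangement the synchronization randallburt mentions later would fall to the caller: the cluster stack has to finish and expose its outputs before any nodegroup parameter set can be built.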
16:18:11 <strigazi> exactly
16:18:12 <adrian_otto> and if there is no meaningful difference might the separate stacks be harder to manage than nested ones?
16:18:15 <hongbin> randallburt: how about stack-update (updating a big stack with nest stacks VS update specific stack)
16:18:39 <randallburt> adrian_otto:  the logic would certainly be more complex I would imagine
16:19:05 <hongbin> i think the pros of multiple stacks is that we can update specific stack instead of always updating the big stack
16:19:22 <hongbin> this is common for life-cycle operations i guess
16:19:23 <Drago> adrian_otto: If the stacks are nested, you have to mess with the top-level template body directly
16:19:31 <randallburt> hongbin:  again, the bottleneck will be in the initial "what needs doing" step and if you can do that on a separate stack that would be faster from a magnum standpoint
16:19:39 <adrian_otto> Drago: ok, got it. I see the desire to federate that clearly now.
16:19:51 <randallburt> the actual orchestration after that isn't much different
16:20:57 <randallburt> so yeah, you get an easier implementation with using nested stacks but you could get faster and more fine-grained control using separate, but you'd have to do the synchronization of those stacks on your end
16:21:09 <adrian_otto> so our nodegroup resources will all be identified with the related cluster resource, so it will be trivial to act on a full set of them.
16:21:13 <randallburt> potentially, anyway
16:22:06 <Drago> I think it makes sense to have separate stacks because it means that magnum resources to stacks is 1:1
16:22:09 <adrian_otto> in the current design, I think a NodeGroup can only belong to one cluster at a time, and we don't have plans to make it so you can orphan and adopt them into other clusters, correct?
16:22:10 <strigazi> In the case of AZs I think it makes sense to have different stacks
16:22:24 <Drago> adrian_otto: Correct
16:22:26 <jvgrant> adrian_otto: correct
16:22:30 <strigazi> correct
16:22:35 <adrian_otto> in that case, let's cover what happens when you delete a cluster
16:22:37 <randallburt> strigazi:  you can put nested stacks in different AZ's though
16:22:44 <randallburt> there's a resource for that
16:22:48 <adrian_otto> one option is to automatically delete all associated node groups
16:23:12 <adrian_otto> another option is to raise an exception unless there are no associated nodegroups.
16:23:31 <adrian_otto> or some combination thereof, like a flag to delete that indicates you want a recursive deletion of all nodegroups.
16:24:08 <Drago> The current behavior is (obviously) the former, so we could stick with that
16:24:20 <strigazi> in the spec when creating a NG you must specify the cluster. also the cluster will hold the resources for networks
16:24:28 <randallburt> I would assume that if you are deleting at the cluster level, you want the whole cluster gone. If you want to reduce availability of a particular group then scale it to 0
16:24:39 <Drago> And if it turns out we want the latter, it can be implemented later
16:24:55 <adrian_otto> agreed. I'd like us to record that intent in the spec.
16:25:07 <jvgrant> currently the idea is a nodegroup can't exist without a cluster it is connected to
16:25:09 <Drago> adrian_otto: Which intent?
16:25:19 <adrian_otto> recursive delete
16:25:59 <strigazi> you mean rolling?
16:26:18 <Drago> NodeGroups will also be manageable, so you can add or delete nodegroups from a cluster
16:26:35 <adrian_otto> strigazi: no. yes. I was imagining a future state where a nodegroup might be transitioned from one cluster to another.
16:26:46 <adrian_otto> but I think that's too complex for a first iteration
16:27:15 <tonanhngo> How about cluster-update?  Would the change be applied recursively from the cluster, or separately per node group?
16:27:33 <adrian_otto> having separate heat stacks per node group opens the door to that possibility though.
16:27:43 <hongbin> tonanhngo: i think it depends on which attribute you want to update
16:28:06 <jvgrant> i would think any operation on cluster would be applied recursively as it makes since for the operation
16:28:18 <jvgrant> sense
16:28:57 <randallburt> you could always make it explicit in the api cluster-wide and group specific operations, then punt to the driver to sort it
16:29:24 <Drago> Is there a use-case for recursively updating something in all the nodegroups? There is also nodegroup-update, so we could have cluster-update only apply to cluster attributes and then enable specifying nodegroup attributes to do recursive updates if we need it
16:29:41 <randallburt> ^^ +1
16:29:58 <Drago> The only thing updatable right now is node-count, and I don't know why you'd want that recursive
16:30:21 <jvgrant> true, currently there is not something that needs this
16:30:27 <randallburt> and it would be easy enough to automate if you did want to do it to all groups in the cluster
16:30:29 <tonanhngo> it could get complicated for user to manage, figuring out which attribute to update at the cluster level and which for nodegroup level.
16:30:31 <jvgrant> though that might change with the lifecycle operations
16:30:38 <adrian_otto> that one should not be handled recursively.
16:31:18 <randallburt> jvgrant:  you'd just have cluster-wide ones and group-specific ones in the api I would think
16:32:00 <Drago> I could see an argument for the convenience of doing a cluster update and being able to specify --master-node-count or --node-count and having it apply correctly IFF there's 1 master NG and 1 minion NG
16:32:01 <strigazi> At this point our use cases are for cluster wide operations
16:32:22 <jvgrant> randallburt: correct, i don't think we have the use case for needing the recursive yet. so we should not worry about it for now
16:33:05 <jvgrant> Drago: that could be handled in the CLI to just find and send the command to the correct nodegroup
16:33:14 <Drago> jvgrant: Sure sure :)
16:33:38 <adrian_otto> another way to handle that is to do node-group-update and have an optional cluster_id to use instead of nodegroup name. In that case it could do what you asked for, for all the nodegroups in the cluster.
16:34:03 <adrian_otto> or allow a list of nodegroups to act on
16:34:11 <adrian_otto> both are optimizations we can defer.
16:34:44 <Drago> adrian_otto: Note that cluster_id is optional in the spec currently, because you can specify a NG_id that uniquely identifies the NG
16:35:02 <Drago> In nodegroup-update and many other nodegroup commands
16:35:17 <strigazi> -1 To orphan NGs
16:35:28 <strigazi> an NG must have a cluster IMO
16:35:49 <Drago> strigazi: Yes, the optional cluster ID is just because you already have a NG ID, not because it's orphanable
16:36:08 <randallburt> strigazi:  +1
16:36:12 <jvgrant> strigazi: +1
16:36:19 <swatson_> strigazi +1
16:36:34 <adrian_otto> +1 agreed.
16:36:50 <strigazi> I think we can agree on the one stack per NG and iterate on that
16:37:07 <adrian_otto> sounds good to me.
16:37:13 <Drago> I saw that tonanhngo had some concerns about ClusterTemplates, around how attributes work with them. I know that there was some confusion around this for others as well
16:37:38 <hongbin> agree, one stack per NG is more elegant approach
16:38:31 <jvgrant> Templates are true templates now. They have the same attributes as what they are a template of. Not separate objects with additional attributes
16:39:14 <Drago> Currently, the intent of v2 ClusterTemplates is to act as a prototype, so you can create a cluster out of them. I don't think it's necessary to set required attributes on it if you don't want to, because it only needs to be enforced upon cluster create. Keypair is a great example of this
16:40:28 <tonanhngo> Drago:  then when you use a template to create a cluster, magnum would validate and complain if a required attribute is missing?
16:40:37 <Drago> tonanhngo: Correct
16:40:48 <strigazi> +1
16:40:59 <tonanhngo> ok, I guess that's different from the current model
16:41:07 <Drago> tonanhngo: It definitely is
16:41:25 <tonanhngo> but it's reasonable as long as we document it so users are not surprised
16:42:09 <tonanhngo> You can also just override the attribute set in the template at cluster creation time, right?
16:42:10 <Drago> And I think that's where part of the confusion is from. Attributes from the CT are copied over and the created Cluster has *no* link to the CT it was created from. I wanted to make that clear too
16:42:19 <Drago> tonanhngo: Yes
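[Editor's sketch] The v2 ClusterTemplate semantics discussed here (attributes copied at create time, overrides allowed, required attributes enforced only on cluster create, no link back to the template) could be illustrated as below. This is a sketch under assumptions: `create_cluster`, the `REQUIRED` tuple, and the attribute names are hypothetical and do not reflect Magnum's actual implementation.

```python
import copy

# Assumption: which attributes are required is illustrative only;
# keypair is the example mentioned in the discussion.
REQUIRED = ("keypair",)


def create_cluster(template, overrides=None):
    """Create a cluster dict from a template acting as a prototype.

    The template's attributes are copied, so the resulting cluster keeps
    no reference to the template. Overrides replace template values, and
    required attributes are validated only here, at create time.
    """
    cluster = copy.deepcopy(template)
    cluster.update(overrides or {})
    missing = [key for key in REQUIRED if cluster.get(key) is None]
    if missing:
        raise ValueError("missing required attributes: " + ", ".join(missing))
    return cluster


# A template may leave a required attribute unset.
template = {"image": "fedora-atomic", "keypair": None}

# Supplying the keypair at create time satisfies validation.
cluster = create_cluster(template, {"keypair": "my-key"})
```

Calling `create_cluster(template)` without an override would raise `ValueError` in this sketch, which matches the "validate and complain" behavior tonanhngo asks about.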
16:42:58 <strigazi> A use case I want to have is: create a cluster with 4 NGs with one command from a public CT or/and NG
16:43:13 <Drago> tonanhngo: *Yes, unless we have the lockout feature we discussed at the midcycle
16:43:28 <strigazi> 2 master NGs and 2 worker NGs
16:44:05 <strigazi> I'll comment for it in the spec
16:44:15 <jvgrant> if the CT and NGT are already there then yes you can do that in one command
16:44:25 <Drago> strigazi: According to the current spec, you can do that. You can add NodeGroupTemplates to a ClusterTemplate, then when you create a cluster from that CT, it creates the NodeGroups as well
16:44:40 <tonanhngo> This is a distinct use case, we should describe it in the use case section:  flexibility in using the template
16:44:56 <jvgrant> and you can override specific values on each if you need to(example: give each a different keypair other than the one in the template)
16:44:58 <strigazi> I mean for public CT and NGT
16:45:13 <Drago> strigazi: It should not make any difference that it's public
16:45:21 <strigazi> ok
16:45:24 <strigazi> good
16:45:33 <jvgrant> i can add that example to the spec
16:45:46 <strigazi> I want to bring up one more thing before the summit
16:45:51 <tonanhngo> +1, that would clarify the intention
16:45:57 <adrian_otto> have we considered a use-case where the user wants the master and worker nodes collapsed onto a single node in a single nodegroup?
16:46:30 <strigazi> heat-agent VS magnum-daemon on cluster nodes
16:46:44 <Drago> adrian_otto: You could use labels when creating the master nodegroup and the driver could select the template that has the combined architecture
16:47:09 <adrian_otto> I like that
16:47:41 <hongbin> As the topology become complex, i think we could consider to support a declarative file to create a cluster, like k8s
16:47:44 <Drago> adrian_otto: A Cluster with only "master" nodes is valid
16:47:53 <Drago> hongbin: Or heat? :) randallburt
16:48:14 <randallburt> hongbin:  at what point then do you just say "use heat if the user is that opinionated"?
16:48:24 <randallburt> lol Drago :D
16:48:32 <adrian_otto> ok, given our limited time today, are there any major concerns with the nodegroup spec that we should plan to address prior to the summit?
16:48:43 <Drago> randallburt: I think hongbin means a file that would have a similar form to what you'd specify on the command line
16:48:52 <randallburt> Drago:  ah, ok
16:48:55 <hongbin> yes
16:48:56 <adrian_otto> or are we happy sharing the current direction with the community, in accordance with our discussion today?
16:49:31 <jvgrant> it seems like we are in agreement on most of the large points of this change
16:49:37 <tonanhngo> +1, maybe with some revision based on the discussion today
16:50:05 <adrian_otto> okay, so let's wrap this up for now, and if something surfaces, do our best to work it out on the ML
16:50:37 <adrian_otto> I had an item on the agenda to discuss the auth plugin
16:50:47 <Drago> I think that if we were to come up with our own magnum agent, it would turn out to be just like the heat agent
16:51:02 <Drago> strigazi, hongbin
16:51:05 <muralia> +1
16:51:23 <hongbin> well, the heat agent is polling based, and magnum agents could be push based
16:51:54 <Drago> That requires your nodes to be on the same network as magnum. I don't think that's good from a security perspective, and part of the reason the heat agent is polling
16:52:01 <strigazi> I don't know what I prefer yet, I just want to clarify
16:52:21 <adrian_otto> hongbin. there are options for the heat agent configuration that would allow for a reversal of that
16:52:27 <strigazi> Drago, nodes talk to magnum-api already
16:52:35 <hongbin> i don't like the performance of polling, it just takes a while for a lifecycle operation to get executed
16:52:49 <Drago> strigazi: Yes, but nodes can be behind NAT, with no way for magnum to contact them first
16:52:50 <randallburt> yeah, assuming Magnum will have direct network access to the cluster nodes isn't always the case
16:52:57 <adrian_otto> in that case using the zaqar driver could help
16:53:22 <randallburt> and IIRC, os-*-config are generic enough that I think an operator could write a push-based plugin for them if they wanted
16:53:39 <Drago> Another nice thing about using the heat agent is that it allows the operator to use whatever signal transport they want because it's a heat configuration, not magnum
16:53:59 <adrian_otto> #topic Open Discussion
16:54:01 <strigazi> randallburt, not sure how can this be done
16:54:15 <randallburt> right, so in all cases, I think the os-*-config agents work and for reference we can use the zaqar plugin that heat already supports
16:54:43 <Drago> hongbin: I know you had a concern about the whole stack being updated. We can architect around that. Currently, the softwaredeployment resources can be pulled to the top level. When we move to 1 stack per nodegroup, it won't be a problem anymore regardless
16:55:05 <randallburt> strigazi:  you could plug into os-refresh-config to be listening rather than polling, I think. It's been a while since I've looked tbh
16:55:14 <hongbin> then, how about extensibility, for example, allows operators to provide a plugin that takes actions before/after a lifecycle operation
16:55:19 <strigazi> randallburt ok
16:55:48 <randallburt> hongbin:  Operators can customize their templates to do so
16:55:50 <Drago> adrian_otto, hongbin, strigazi, randallburt: Continue this after the meeting so that we can get to adrian_otto's item?
16:55:55 <randallburt> hongbin:  the mechanism is the same
16:55:56 <hongbin> Drago: ok, then the stack-update seems fine
16:56:01 <adrian_otto> ok Drago
16:56:25 <randallburt> sure
16:56:36 <hongbin> randallburt: yes, i would argue a better option is to override a interface instead of customize the template
16:56:52 <randallburt> hongbin:  then you gut Magnum in a way
16:57:19 <hongbin> randallburt: :)
16:57:29 <hongbin> that is my point, i am not strong at it
16:57:36 <randallburt> hongbin:  on the other hand, I was going to propose moving all lifecycle and monitoring things into the driver interface so that would work too and you could do it either way depending on operator expertise
16:58:30 <Drago> adrian_otto?
16:58:33 <adrian_otto> we can cover the auth plugin discussion after we adjourn as well. What the team needs to know is that we will continue to refine that work, and seek agreement on the best place for the various components to live.
16:58:33 <hongbin> true, for example, in trove, there are hooks for operators to do pre-* and post-* operators
16:59:07 <randallburt> hongbin:  yep, and you can do this with resources and things in a Heat template, but I see your point about not *having* to do so
16:59:14 <adrian_otto> Thanks everyone for attending today. Our next meeting will be at 2016-11-01 at 1600 UTC.
16:59:31 <adrian_otto> #endmeeting