09:00:35 <jakeyip> #startmeeting magnum
09:00:35 <opendevmeet> Meeting started Wed Feb 28 09:00:35 2024 UTC and is due to finish in 60 minutes.  The chair is jakeyip. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:35 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:35 <opendevmeet> The meeting name has been set to 'magnum'
09:00:45 <jakeyip> #link https://etherpad.opendev.org/p/magnum-weekly-meeting
09:00:49 <jakeyip> #topic Roll Call
09:00:52 <jakeyip> o/
09:00:53 <mnasiadka> o/
09:00:55 <dalees> o/
09:00:59 <opendevreview> Merged openstack/magnum-tempest-plugin master: CI: Wait for pods to exit ContainerCreating state  https://review.opendev.org/c/openstack/magnum-tempest-plugin/+/908310
09:01:01 <opendevreview> Merged openstack/magnum-tempest-plugin master: Add pods description in logs  https://review.opendev.org/c/openstack/magnum-tempest-plugin/+/909444
09:01:14 <jakeyip> opendevreview came for meeting :)
09:01:50 <jakeyip> alright let's crack on
09:02:08 <jakeyip> #topic Feature Freeze
09:02:40 <jakeyip> Feature Freeze is this week, any patches that need to go in that I haven't put on the agenda? please add and ping
09:03:12 <jakeyip> anything else to discuss for feature freeze?
09:03:23 <dalees> I still hope to revisit control plane resizing this week. Just finished up with other work. https://review.opendev.org/c/openstack/magnum/+/906086
09:03:50 <dalees> but maybe it will miss the freeze
09:04:13 <mnasiadka> I think it would be nice to get Calico in
09:04:20 <mnasiadka> but that's a chain of patches
09:05:05 <jakeyip> dalees: ah that one. yeah you had a question about validation, I found a good place. take a look at the comments and see if it works?
09:05:39 <mnasiadka> #link https://review.opendev.org/q/topic:%22calico-helm%22
09:05:43 <dalees> on the topic of calico, sorry i've not uploaded the static manifest patchset for that yet. I prefer it over the helm chart and operator as it's easier to mirror.
09:05:53 <jakeyip> mnasiadka: calico is on the agenda :)
09:06:11 <dalees> jakeyip: yes, thank you! that's what i need to revisit
09:06:32 <jakeyip> let's skip to calico since we are on it already :)
09:06:40 <mnasiadka> dalees: there's one more image to mirror, I can post some docs to it - or ideally we could have a tool that lists images you need to mirror
09:07:24 <opendevreview> Michal Nasiadka proposed openstack/magnum master: Removing Tiller support  https://review.opendev.org/c/openstack/magnum/+/908414
09:08:08 <jakeyip> ok seems like we don't have enough info to make decision one way or another
09:08:36 <jakeyip> we don't use calico so I need to eval dalees to decide.
09:09:24 <jakeyip> dalees: on that topic - so you will still be using your patches if we get mnasiadka's helm version in?
09:09:26 <dalees> mnasiadka: not sure that's true; it's pulling helm charts from the internet and `curl` from internet urls. That adds dependencies to cluster spin up.
09:10:44 <dalees> jakeyip: yes, so while i'm wary of the extra internet dependencies, we'll just continue to use manifests with our carried patches for the remaining lifetime of our Heat clusters.
09:11:16 <mnasiadka> dalees: interesting, I can have a look - but Helm chart gives us support for multiple versions without need to update manifests
09:11:21 <dalees> so i'm not sure i need to add requirements to these calico changes that don't need to affect me.
09:11:45 <jakeyip> dalees: I am curious if we merge this, how much will it affect you carrying those patches?
09:12:00 <jakeyip> I'm concerned if there are deployments carrying their own calico like you, and this is a breaking change for them
09:13:57 <dalees> i think we'd need to disable this new script and instead include the static calico manifest that matches `calico_tag`, yeah.
09:14:13 <mnasiadka> I'm fine with carrying that one downstream with us, as long as we'll get rid of Tiller ;-)
09:14:55 <jakeyip> mnasiadka: I think we may have to do that for now, until we have more time to evaluate Calico better
09:15:00 <dalees> no issues removing tiller, and installing helm in the proposed way :)
09:15:29 <mnasiadka> goodie
09:15:32 <jakeyip> I am not in a good position to do it as we don't carry patches to handle calico, so I'm blind
09:15:56 <jakeyip> probably dalees and mnasiadka collaborate more on that next cycle? I will help.
09:16:13 <mnasiadka> next cycle we should rather deprecate Heat driver ;-)
09:16:48 <jakeyip> well if we can get magnum-capi-helm in, then heat still needs to stay at least 1 cycle :P
09:17:18 <jakeyip> I like your ambition mnasiadka :P
09:18:22 <jakeyip> so Tiller will go, that's decided.
09:18:42 <jakeyip> Helm move? https://review.opendev.org/c/openstack/magnum/+/908423
09:18:59 <mnasiadka> and hopefully https://review.opendev.org/c/openstack/magnum/+/908423 as well
09:19:14 <opendevreview> Michal Nasiadka proposed openstack/magnum master: Move Helm client install to separate script  https://review.opendev.org/c/openstack/magnum/+/908423
09:19:15 <opendevreview> Michal Nasiadka proposed openstack/magnum master: Calico deployment with Tigera Operator  https://review.opendev.org/c/openstack/magnum/+/908501
09:19:26 <jakeyip> #topic Calico
09:19:42 <jakeyip> #agreed Remove Tiller this cycle
09:19:55 <mnasiadka> dalees: will you submit the manifest update so users get something working out of the box?
09:21:04 <dalees> mnasiadka: yeah, cool; If my comments block the helm installed calico then I owe that this week
09:21:48 <dalees> it'd be nice to have either and default to manifest, but not sure how right now.
09:22:13 <mnasiadka> we could have a label calico_use_helm
09:22:15 <mnasiadka> or something like that
09:22:23 <mnasiadka> and default to false
09:22:44 <mnasiadka> if you update the manifest in a separate change - I can rebase the Helm one and add a label
09:23:15 <jakeyip> there's no way to make the current https://review.opendev.org/c/openstack/magnum/+/908501/14/magnum/drivers/common/templates/kubernetes/fragments/calico-service.sh work at all?
09:24:08 <mnasiadka> there is, we need to update the manifest to newer version
09:24:23 <jakeyip> e.g. certain calico_tag, etc?
09:24:35 <dalees> jakeyip: I think that one is old; so I'll propose a newer manifest to replace and lock the manifest to a certain `calico_tag`.
09:24:38 <mnasiadka> problem is, it won't support the current default calico_tag I think
09:25:18 <mnasiadka> and we have the Heat data limit - so we can't really have two big manifests
09:25:36 <jakeyip> dalees: cool
09:26:07 <dalees> yeah, we can only embed ~2 calico manifests. the Helm install really helps there
09:27:36 <mnasiadka> I think the easiest way forward is to bump the calico_tag default to something that works with 1.28 or whatever we claim is supported in docs
09:27:39 <mnasiadka> or tested
09:28:08 <mnasiadka> second thing is, I think we need a note in the docs that we don't test/support EOL versions of Kubernetes, so if it doesn't work - then it doesn't work
09:29:15 <mnasiadka> but I'll leave the manifest update to dalees ;-)
09:30:08 <jakeyip> mnasiadka: yeah I put something like that in the reno for occm change
09:31:11 <jakeyip> anyway the change with reno has been +w :)
09:31:36 <jakeyip> so calico last bumped to 3.21.2 in yoga. https://review.opendev.org/c/openstack/magnum/+/779378
09:31:47 <jakeyip> at that time we were testing with k8s v1.23
09:31:53 <jakeyip> so I guess it must be newer k8s broke it
09:32:30 <dalees> oh that's not too old. I'll have a try with it and add a new version at least, anyway.
09:33:03 <dalees> maybe it is too old, but it's not as old as i thought ;)
09:33:06 <jakeyip> it may have been one of those breaking changes 1.21 -> 1.23 -> 1.25
09:33:33 <mnasiadka> #link https://docs.tigera.io/calico/latest/getting-started/kubernetes/requirements#kubernetes-requirements
09:33:50 <mnasiadka> here they claim they only really test latest versions
09:34:49 <jakeyip> yeah it's prob not worth debugging and if we can replace the manifest with something that works with k8s v1.27 it will be the easiest
09:35:01 <mnasiadka> +1
09:35:07 <dalees> yep +1
09:36:13 <jakeyip> #agreed Push Calico Helm to next cycle
09:36:51 <jakeyip> #action dalees to send up newer calico manifest
09:37:00 <jakeyip> possibly will miss feature freeze ^
09:37:24 <jakeyip> let's get on to next, can't worry about that too much for now
09:37:50 <jakeyip> #topic Beta Drivers
09:37:52 <jakeyip> Beta drivers https://review.opendev.org/q/topic:%22beta_driver%22
09:38:56 <jakeyip> it's clean to go in now. not 100% sure of usefulness given the drama and uncertainty. last week we agreed no harm putting it in for this cycle
09:39:00 <jakeyip> still agreeable?
09:39:57 <jakeyip> it may be a useful tool next cycle merging magnum-capi-helm
09:40:10 <mnasiadka> if we want to merge it
09:40:39 <mnasiadka> I don't like the drama, and I mentioned I also like the idea of driver being independent (so master branch and tags only) - as long as it's properly tested in CI
09:40:42 <opendevreview> Merged openstack/magnum master: Update cloud-provider-openstack registry  https://review.opendev.org/c/openstack/magnum/+/909344
09:40:56 <mnasiadka> We're not really good at backports ;-)
09:40:57 <jakeyip> I think still useful even if out of tree in separate repo
09:41:17 <mnasiadka> And we shouldn't backport features, so there will always be some issues
09:41:35 <jakeyip> because a packager may package the driver, then it will conflict
09:41:43 <jakeyip> mnasiadka: not sure what you mean for backports?
09:41:56 <mnasiadka> well, let's assume the driver is in tree
09:42:03 <mnasiadka> we merge a feature in master
09:42:16 <mnasiadka> but we can't really backport that today in stable/2023.2 or whatever stable branch
09:42:34 <mnasiadka> so people backport that downstream anyway
09:42:51 <dalees> I don't mind the beta feature going in, it looks ready. not sure if it'll be useful if drivers are out of tree, but it can be removed if it's not used.
09:42:58 <mnasiadka> with independent release cycle of a driver in a separate repo - you can get the latest and greatest driver code (as long as it passes stable Magnum CI tests)
09:43:31 <jakeyip> yeah ok
09:44:14 <jakeyip> it may still be useful for out of tree drivers by letting them be hidden for 1 cycle by default
09:44:25 <mnasiadka> and Kubernetes changes so often, that I'm pretty sure every new Kubernetes version will need something in the driver
09:45:06 <jakeyip> just because https://github.com/stackhpc/magnum-capi-helm/blob/main/magnum_capi_helm/driver.py#L60-L64 still conflicts with https://github.com/vexxhost/magnum-cluster-api/blob/main/magnum_cluster_api/driver.py#L389
09:46:35 <mnasiadka> I'll have a look in beta drivers later, added to my review queue :)
09:47:50 <jakeyip> thanks
09:49:17 <jakeyip> #topic magnum-capi-helm
09:50:40 <mnasiadka> I guess I should start
09:50:59 <jakeyip> yeah go for it
09:51:25 <mnasiadka> So basically StackHPC wants to move the driver from https://github.com/stackhpc/magnum-capi-helm to OpenDev - to be an out of tree driver still, but detangle that from SHPC ownership and move under Magnum governance
09:51:43 <mnasiadka> We will work on getting cross-repository CI working for every magnum/ and magnum-capi-helm/ change
09:52:00 <jakeyip> "OpenStack" governance :)
09:52:08 <mnasiadka> Yes, OpenStack governance
09:52:25 <mnasiadka> The driver repo would have independent release cycle - so we would just publish new versions as tags
09:52:41 <mnasiadka> And strive to make it work with all non-EOL/unmaintained Magnum versions
09:53:13 <mnasiadka> And of course add relevant documentation in both driver repo and magnum main repo
09:53:30 <jakeyip> sounds like a good halfway point to me
09:54:05 <jakeyip> who will make up the magnum-capi-helm core?
09:54:45 <mnasiadka> The regular magnum-core + current driver developers (John Garbutt and Matt Pryor) most probably
09:54:56 <mnasiadka> But regular magnum-core only is also fine with us, since I'm a core
09:55:19 <mkjpryor> I am happy to join as a core reviewer for magnum-capi-helm
09:55:39 <mkjpryor> dalees: we would also like a core reviewer from Catalyst, if you or Travis are happy to do it
09:55:54 <mkjpryor> To avoid it being a single-company thing
09:55:57 <jakeyip> I think it'll be good to have John and Matt as core. I will be happy to join, but they can drive the direction
09:56:09 <dalees> I'd not want to lose mkjpryor's reviews, though I guess he can continue reviewing in either case.
09:57:12 <dalees> mkjpryor: I'll discuss with travisholton, but I'm happy to be involved
09:57:39 <mkjpryor> :thumbsup:
09:58:02 <jakeyip> awesome
09:58:24 <mkjpryor> So there is also the question of the governance of the Helm charts
09:58:37 <jakeyip> #agree magnum-capi-helm as an out-of-tree driver
09:58:49 <mkjpryor> Currently, although they are an open-source project, they sit in the stackhpc namespace on GitHub
09:58:50 <jakeyip> #agreed magnum-capi-helm as an out-of-tree driver
09:59:46 <jakeyip> dalees made a repo for it already
10:00:16 <mkjpryor> The problem with moving them is we rely very heavily on GitHub Actions for CI, and we don't have the capacity to port to Zuul right now
10:01:38 <mnasiadka> I think we could discuss the Helm charts at the PTG again
10:01:41 <dalees> the helm charts are also used in a wider scope than just the magnum-capi-helm driver, so the reviewers may need to be aware of that.
10:02:10 <mkjpryor> What we are proposing in the interim is to create a new "azimuth-cloud" organisation on GitHub to host all the Azimuth repositories, including the capi-helm-charts, the cluster-api-addon-provider and cluster-api-janitor-openstack
10:02:45 <mkjpryor> dalees: we would like you or someone else from Catalyst to join this org as a maintainer or owner
10:03:01 <mkjpryor> Not for all of Azimuth, obviously
10:03:09 <mkjpryor> Unless you want to :-)
10:03:32 <mkjpryor> But we want to make it clear that it is not a StackHPC "product" but rather an open-source project
10:03:42 <mkjpryor> Which we provide support for
10:04:24 <dalees> mkjpryor: ok, I'll discuss with others at Catalyst and get in touch.
10:04:25 <mkjpryor> Looking forward, I think we are looking to contribute Azimuth and all its components to a foundation (TBC)
10:04:35 <jakeyip> I am not sure if that will appease the objections. I don't really have an objection
10:05:10 <mkjpryor> My understanding is that the main objection now is not to point to a "single vendor" repository.
10:05:44 <mkjpryor> If the repository was not single vendor, and especially if it belonged to a foundation like CNCF or LF, that would be less of an issue I think?
10:05:55 <mkjpryor> Even if it is not OpenInfra
10:06:30 <jakeyip> I won't know, it may be good to check in with Julia first before attempting the work
10:07:06 <jakeyip> as I said, I don't have an objection. my point is that there are so many images magnum uses which are 'single vendor'. this isn't any difference really.
10:07:43 <mkjpryor> I guess it is actually no different to, for example, Calico in that sense
10:07:46 <jakeyip> maybe they just want it mirrored or something, so in case stackhpc delete the original repo, the mirror still exists
10:07:58 <mnasiadka> I think the difference is those images/Helm charts/manifests are for 'addons', while some Helm charts are basically driver mechanics (like the one used to create a Kubernetes cluster)
10:08:12 <mkjpryor> True
10:08:53 <mnasiadka> That's why for me it's a PTG item - maybe we could split out the "core" Helm charts required for the driver somewhere
10:09:02 <mkjpryor> The alternative would be to move the capi-helm-charts, cluster-api-addon-provider and cluster-api-janitor-openstack under OpenStack governance as "part of" Magnum
10:09:27 <mnasiadka> magnum-capi-helm-charts repo is there already, but I think we'll tackle that after handling the driver code move
10:09:40 <mkjpryor> It would be possible to have a version of the Helm charts that doesn't use the addon provider, if we think that is best
10:10:01 <mkjpryor> The addon handling would be worse, but I guess it becomes an optional thing
10:10:19 <mnasiadka> Might be, I think we're already over time
10:10:30 <jakeyip> I am concerned with the github actions. will they eventually be movable?
10:10:58 <mnasiadka> They will be complicated, especially if the repo uses some bots to update dependency versions
10:10:58 <mkjpryor> It will be work to move
10:11:23 <jakeyip> I see
10:11:38 <mnasiadka> So I'd like to focus on moving the driver code for now and discuss if we could make the scope of magnum-capi-helm-charts repo smaller to things that do not change that often
10:11:45 <mkjpryor> I agree
10:11:49 <jakeyip> ok I think I'd like to try to wind up in the interest of time
10:12:15 <mkjpryor> I think we are still going to move the Azimuth repos out of the stackhpc namespace on GitHub as well, for several reasons
10:12:30 <mkjpryor> But we should discuss at the PTG
10:12:37 <jakeyip> yeap
10:12:43 <jakeyip> alright any last items?
10:12:54 <mnasiadka> jakeyip: can you comment and +1 the project-config/governance changes?
10:12:59 <jakeyip> #agreed discuss the helm charts at the PTG
10:13:19 <jakeyip> mnasiadka: yes I will. after I end this I will add the irc link ;)
10:13:28 <mnasiadka> Fantastic, thanks
10:13:50 <jakeyip> cool if there's nothing else, let's end this meeting
10:13:56 <jakeyip> #endmeeting