09:00:33 <jakeyip> #startmeeting magnum
09:00:34 <opendevmeet> Meeting started Wed Jul 19 09:00:33 2023 UTC and is due to finish in 60 minutes.  The chair is jakeyip. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:34 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:34 <opendevmeet> The meeting name has been set to 'magnum'
09:00:41 <jakeyip> #topic Roll Call
09:00:43 <jakeyip> o/
09:00:55 <dalees> o/
09:00:56 <travisholton> o/
09:01:14 <diablo_rojo> o/
09:02:15 <jakeyip> thanks all for joining the meeting
09:02:34 <jakeyip> Agenda:
09:02:35 <jakeyip> #link https://etherpad.opendev.org/p/magnum-weekly-meeting
09:03:03 <jakeyip> #topic k8s conformance testing
09:03:48 <jakeyip> let's start with this. can someone take over?
09:04:08 <diablo_rojo> Sure lol.
09:04:39 <diablo_rojo> So, basically for the last dozen or so k8s releases, up to 1.24, OpenStack Magnum has been certified as a k8s provider
09:05:34 <diablo_rojo> There's a set of conformance tests that were run by lxkong for a while and then guilhermesp_____ for a while
09:05:35 <diablo_rojo> (thank you both for keeping up on that for so long)
09:06:09 <diablo_rojo> So, as of early May we fell out of conformance - what with k8s releasing every 3 months it doesn't take long for things to expire
09:06:52 <diablo_rojo> k8s is getting ready to release 1.28 (in August I think) so it would be good to target that or at least 1.27 to get back on their list of certified providers.
09:06:59 <diablo_rojo> That's step 1.
09:07:47 <diablo_rojo> Step 2 would be to get a periodic job set up to run the conformance tests so that we A. keep track of when they merge things that we should be aware of and B. don't have to manually run the tests anymore and can just pull logs from that to submit when the time comes.
09:08:14 <diablo_rojo> #link https://github.com/cncf/k8s-conformance/tree/master Conformance Info
09:08:36 <diablo_rojo> #link https://github.com/cncf/k8s-conformance/tree/master/v1.24/openstack-magnum Our last passing conformance application thingy
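For anyone picking this up later: the CNCF submissions linked above are normally generated with Sonobuoy against an already-built cluster. A minimal sketch (the tarball name is whatever `retrieve` prints; the repo's instructions spell out which files from it go into the PR):

    # assumes KUBECONFIG points at a cluster Magnum created
    sonobuoy run --mode=certified-conformance --wait
    sonobuoy retrieve .              # downloads the results tarball
    sonobuoy results <tarball>       # summary; e2e.log + junit results go into the submission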
09:09:15 <diablo_rojo> Now, the last time guilhermesp_____ tried to run the tests with the latest magnum (I think it was Antelope), it didn't pass with k8s 1.25
09:09:23 <dalees> So passing 1.25 and 1.26 should be fairly straightforward; I submitted Catalyst Cloud's Magnum 1.25 a while back: https://github.com/cncf/k8s-conformance/pull/2414
09:09:29 <diablo_rojo> Unfortunately, I don't have his logs on me to tell you what the issue was.
09:09:33 <diablo_rojo> Oh sweet
09:09:43 <diablo_rojo> That is promising
09:09:48 <dalees> there were only minor changes required, most of these changes are merged now.
09:09:53 <dalees> (if not all)
09:10:04 <diablo_rojo> Oh even better then
09:10:32 <diablo_rojo> So I guess my ask is: does Catalyst Cloud run vanilla openstack magnum, or do you have extra stuff added into it?
09:11:05 <dalees> This relates to the Magnum Heat driver of course - we are migrating to the Magnum CAPI driver and will want to remain passing conformance, but there's some version after which we won't be looking for conformance for Magnum Heat.
09:11:45 <diablo_rojo> Yeah this rings a bell. I had talked to Matt Pryor about this at the summit a little I think.
09:11:50 <dalees> we run Magnum Wallaby, with several extra patches. I've gone through them recently and only a couple need to go upstream that relate to conformance.
09:12:18 <diablo_rojo> Sounded like vexxhost and stackhpc had made different drivers so neither of them were running pure antelope magnum
09:12:33 <diablo_rojo> dalees: oh okay that doesn't sound so bad
09:12:35 <jakeyip> for me, I have been testing with devstack and v1.25 in Antelope and v1.27 in Bobcat (IIRC)
09:13:04 <diablo_rojo> jakeyip: would you be able to run the conformance tests with that environment?
09:13:17 <jakeyip> it's not conformance though, just a basic set of tests as the environment I have will probably not be big enough
09:13:42 <jakeyip> dalees: what's the environment that you use to run?
09:14:08 <diablo_rojo> Ahhh got it - yeah that was my issue actually - hence my looking for an openstack provider that is running pure magnum, and my desire to get it running as a periodic non-voting gate job
09:14:15 <diablo_rojo> So it would be true magnum
09:14:18 <jakeyip> hm ping mnasiadka (sorry I forgot)
09:14:22 <diablo_rojo> and the latest magnum to boot
09:14:32 <dalees> I use our preproduction environment for conformance submissions, so real metal.
09:14:56 <jakeyip> disk/ram?
09:18:10 <dalees> I think the tests just required a couple of control plane nodes and a couple of workers. I created the control plane with c2r4, and c4r8 workers
09:18:28 <jakeyip> since guilhermesp_____ was the last to do it, is it possible that we contact them to find out (1) what error they were having and maybe solve that?
09:18:28 <dalees> (err, 3x control plane, not 2)
09:18:34 <diablo_rojo> dalees: yeah I think so
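As a rough illustration of that sizing, a conformance target cluster could be created along these lines (template and keypair names are hypothetical; the flavors are the ones mentioned above):

    openstack coe cluster create conformance-test \
      --cluster-template k8s-v1.25 \
      --keypair mykey \
      --master-count 3 --node-count 2 \
      --master-flavor c2r4 --flavor c4r8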
09:19:06 <diablo_rojo> jakeyip: yeah I asked for his logs in that thread.. I will go back and see if he had them.
09:19:21 <dalees> jakeyip: guilhermesp_____ emailed on 6th May with "I am trying to run conformance against 1.25 and 1.26 now, but it looks like we are still with this ongoing? https://review.opendev.org/c/openstack/magnum/+/874092   Im still facing issues to create the cluster due to "PodSecurityPolicy\" is unknown."
09:19:36 <jakeyip> oh
09:19:39 <diablo_rojo> I don't think we need to test for everything between our last cert and the current cert, for the record.
09:19:42 <diablo_rojo> Oh yeah!
09:19:46 <diablo_rojo> thanks dalees :)
09:19:58 <dalees> so that is merged
09:21:31 <jakeyip> yeah of course it's PodSecurityPolicy
09:21:38 <diablo_rojo> Lol
09:21:58 <diablo_rojo> Was that merged in antelope?
09:22:12 <diablo_rojo> Or is it in bobcat/master?
09:22:14 <jakeyip> no, Bobcat
09:22:27 <diablo_rojo> Okay that makes sense.
09:22:57 <jakeyip> so the issue is that that patch breaks compatibility between k8s <1.25 and >=1.25
09:22:58 <diablo_rojo> I will ask guilhermesp_____ to get master and run the tests again and hopefully we will be good to go. That solves part 1 I think :)
09:23:23 <diablo_rojo> Oh
09:23:47 <diablo_rojo> so vexxhost would need to be running something greater than 1.25
09:23:58 <diablo_rojo> for it to pass/work with master magnum?
09:24:01 <dalees> jakeyip: does it? we run 1.23, 1.24 and 1.25 currently, maybe we have some other patches (or our templates explicitly list everything needed for older ones to function still)
09:24:49 <jakeyip> in the Antelope cycle we didn't want to break compatibility, partly because we respect the OpenStack deprecation cycle etc., so we needed to do some comms first
09:25:17 <jakeyip> yeah it should be working with master / bobcat
09:26:05 <dalees> I wonder if we should publish a set of magnum templates with labels set to working versions for that k8s release and magnum release.
09:26:33 <diablo_rojo> That sounds like a good idea
09:26:52 <jakeyip> dalees: so if a cloud has public templates created for users and upgrades Magnum to a version past this patch, new clusters created from existing <1.25 templates will not have PodSecurityPolicy
09:28:35 <jakeyip> dalees: I was working on that. it's a big-ish job because I needed to reformat the docs... and face it, no one likes docs. no one likes reviewing docs either :P
09:29:04 <dalees> jakeyip: ah indeed, unless they define `admission_control_list` in older templates
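In other words, an older template can keep working against a patched Magnum by pinning the label explicitly. A sketch (image name, kube_tag and the plugin list are illustrative, not upstream defaults):

    openstack coe cluster template create k8s-v1.24-psp \
      --coe kubernetes \
      --image fedora-coreos-latest \
      --external-network public \
      --labels kube_tag=v1.24.16,admission_control_list="NodeRestriction,PodSecurityPolicy"
    # note: label values with embedded commas can trip up the CLI's label parsing,
    # so it's worth double-checking how your client version handles this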
09:29:18 <diablo_rojo> jakeyip: I volunteer to review your docs
09:29:22 <diablo_rojo> Just add me :)
09:29:26 <jakeyip> thanks diablo_rojo :P
09:29:44 <diablo_rojo> Lol of course :)
09:30:07 <jakeyip> so the idea is instead of updating default labels for each version, which comes with its own problems, we will publish the working labels for each version
09:30:26 <jakeyip> currently the docs format is that there is a 'default' for each version
09:30:50 <diablo_rojo> Makes sense
09:31:03 <jakeyip> this is the review I was working on https://review.opendev.org/c/openstack/magnum/+/881802
09:32:09 <diablo_rojo> Tab is open :)
09:32:12 <jakeyip> (and sorry I was wrong, it was v1.23 for Antelope)
09:32:41 <diablo_rojo> Ahh got it.
09:34:05 <jakeyip> since v1.23 is so old, at the Bobcat PTG we decided to support v1.25+ and remove PodSecurityPolicy (without a toggle / detection code to add PSP if k8s <v1.25).
09:34:43 <diablo_rojo> Makes sense.
09:34:44 <jakeyip> I attempted to write some logic / flags, but working to support an EOL k8s made that work discouraging very quickly
09:35:04 <jakeyip> hopefully operators can understand why Bobcat is such a big change
09:35:11 <diablo_rojo> Yeah. Their release cadence is insane.
09:37:05 <jakeyip> so I guess my question is, how do we handle conformance given this info?
09:37:28 <jakeyip> Bobcat release is 2023-10-04
09:37:33 <diablo_rojo> Like long term?
09:37:44 <diablo_rojo> Or right now?
09:38:19 <jakeyip> right now, there is no Magnum version that supports a non-EOL version of K8S
09:39:02 <dalees> heh, k8s 1.25 End of Life is 2023-10-28. So perhaps Bobcat should try to also ensure support for 1.26, else it'll only be relevant for a few weeks?
09:39:14 <diablo_rojo> Hmmmm yeah okay I see your point.
09:39:54 <jakeyip> (1) we backport the 'breaking' patch to Antelope. That will give us the ability to run conformance on Antelope with v1.25
09:40:07 <jakeyip> (2) we wait until Oct and support v1.25 to v1.27
09:41:23 <diablo_rojo> I would prefer option 1, but I understand that may not be the 'best' option
09:41:34 <jakeyip> I think (2) is possible right now. (1) needs a bunch more work.
09:41:40 <diablo_rojo> Right
09:42:01 <diablo_rojo> Where 'best' is for magnum devs?
09:43:27 <diablo_rojo> Hmmmm
09:44:11 <jakeyip> Personally, I think our efforts will be best placed towards (2), so we can concentrate on the ClusterAPI efforts
09:44:35 <diablo_rojo> That seems reasonable to me.
09:45:11 <jakeyip> Personally, I hate that Antelope is going to be a 'sad' release (only supporting v1.23), but we should just focus on CAPI to reduce the sadness period
09:45:41 <diablo_rojo> ...the other issue is the whole Skip Level Upgrade Release (SLURP) process.
09:45:58 <diablo_rojo> So well maybe not an issue but something to think about.
09:46:52 <jakeyip> yeah... on the surface I find that will be difficult to support given k8s release cadence
09:47:01 <diablo_rojo> sorry - thinking out loud, kinda.
09:47:11 <diablo_rojo> Yeah definitely not a straightforward solution
09:47:46 <jakeyip> SLURP will mean a yearly cycle? that'll be 4 K8S releases?
09:48:11 <diablo_rojo> Basically yeah. SLURP means people will go from Antelope to C
09:48:11 <jakeyip> hopefully with CAPI we don't need to worry about that :fingers-crossed:
09:48:15 <diablo_rojo> skipping Bobcat.
09:48:24 <diablo_rojo> Yeah that would be nice lol
09:48:34 <jakeyip> yeah more work to be done there
09:48:59 <jakeyip> we need help on tests. then we can tackle SLURP testing.
09:49:29 <jakeyip> running out of time, so summarise?
09:49:30 <diablo_rojo> I guess we advise folks using magnum not to do SLURP upgrades with k8s for the short term
09:49:41 <diablo_rojo> Sorry to hog the whole meeting.
09:49:48 <diablo_rojo> I appreciate all of everyone's input!
09:50:26 <jakeyip> so to summarise, can you see if you can run conformance with master, and let us know?
09:50:29 <diablo_rojo> So. Basically we are currently stuck between a rock and a hard place, but in the meantime we will focus on CAPI and then, when Bobcat releases, recert magnum.
09:50:37 <diablo_rojo> Oh yeah I can look into that.
09:51:11 <jakeyip> that will allow us to identify any issues, so when Bobcat gets cut we can recert it straightaway
09:51:19 <diablo_rojo> +2
09:51:22 <diablo_rojo> sounds like a good plan to me
09:51:27 <dalees> sounds good to me
09:51:54 <diablo_rojo> Thank you everyone!
09:51:55 <jakeyip> #action diablo_rojo to do conformance testing on master with k8s v1.25+
09:52:16 <jakeyip> diablo_rojo: thanks for helping out, we need all the help we can get :)
09:52:47 <jakeyip> #topic ClusterAPI
09:53:44 <jakeyip> I have been reviewing some of the patches and testing CAPI out with varying levels of success
09:54:32 <jakeyip> I think I may need some help to point me in the right direction to test - e.g. which patchset will work with devstack
09:55:20 <dalees> we continue to test it also, and travisholton has flatcar functional, which we're keen to contribute.
09:56:40 <jakeyip> dalees: have you been able to test it on devstack?
09:56:40 <dalees> jakeyip: I think the top of the patch chain is https://review.opendev.org/c/openstack/magnum/+/884891/2 (aside from flatcar addition)
09:56:45 <travisholton> jakeyip: the latest active patch is #880805
09:57:24 <dalees> oh, travisholton will know better than I!
09:57:36 <jakeyip> travisholton: will 880805 work in devstack?
09:57:43 <travisholton> 884891 works as well..just in WIP still
09:58:21 <dalees> jakeyip: yes I have had it running in devstack. I did all the CAPI management setup manually though, which I believe is in the devstack scripts now.
09:58:23 <travisholton> jakeyip: yes I've been using patches in that set for a few weeks now
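For anyone else wanting to try the same chain, the changes can be pulled into a magnum checkout with git-review (the path below assumes a devstack layout):

    cd /opt/stack/magnum
    git review -d 880805    # or 884891 for the top of the chain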
09:58:53 <jakeyip> cool I will try to jump straight to 880805
09:58:53 <travisholton> the work that I've done to set up flatcar is based on those and that has been working as well
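For reference, the devstack side of this is roughly just enabling the plugins in local.conf; a minimal sketch (whether the in-review CAPI patches need extra settings on top of this is not covered here):

    # local.conf excerpt - minimal, assumes the patched magnum tree is checked out
    [[local|localrc]]
    enable_plugin heat https://opendev.org/openstack/heat
    enable_plugin magnum https://opendev.org/openstack/magnum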
09:59:32 <jakeyip> #action jakeyip to test CAPI with #880805
09:59:51 <jakeyip> any other things related to CAPI?
10:00:03 <dalees> hoping to see johnthetubaguy or Matt Pryor around soon, we have much to discuss on those patchsets and things to contribute
10:00:13 <travisholton> +1
10:01:37 <jakeyip> +1, hopefully next week, if someone can ping them and let them know
10:02:18 <dalees> I want to discuss a few more things, but better to have StackHPC here too. Such as: 1) What we agree to merge to start with, vs add features later. 2) helm chart upstream location (magnum owned - do we need a new repo?). 3) OCI support for helm charts, and a few other things ;) I will add these to the agenda for next time
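On point 3, for what it's worth, recent Helm releases can push and install charts straight from an OCI registry; a sketch with hypothetical chart and registry names:

    helm package ./magnum-capi-helm
    helm push magnum-capi-helm-0.1.0.tgz oci://registry.example.org/charts
    helm install test-release oci://registry.example.org/charts/magnum-capi-helm --version 0.1.0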
10:03:40 <jakeyip> yes those are all important. I've created the placeholder for next week's agenda, please populate it before we forget: https://etherpad.opendev.org/p/magnum-weekly-meeting
10:05:03 <jakeyip> travisholton, dalees, diablo_rojo: we are overtime. anything else?
10:05:15 <travisholton> no not from me
10:05:21 <diablo_rojo> None from me
10:05:25 <dalees> all good for today, thanks jakeyip
10:05:26 <diablo_rojo> thanks jakeyip !
10:05:36 <jakeyip> thanks everyone for coming!
10:05:47 <jakeyip> #endmeeting