19:01:25 <clarkb> #startmeeting infra
19:01:25 <opendevmeet> Meeting started Tue Sep 19 19:01:25 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:25 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:25 <opendevmeet> The meeting name has been set to 'infra'
19:01:34 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/ZBZMOM7RXD4AXORIXJO537T3YDOJPFTW/ Our Agenda
19:01:41 <clarkb> #topic Announcements
19:01:56 <clarkb> We are deep into the OpenStack release period and OpenStack elections end tomorrow
19:02:02 <clarkb> go vote if you haven't already
19:03:00 <clarkb> #topic Mailman 3
19:03:20 <clarkb> fungi: the plan is still for starlingx and openinfra to migrate on thursday at ~15:30 UTC?
19:03:41 <fungi> #link https://etherpad.opendev.org/p/mm3migration maintenance plan starts at line 198
19:03:44 <fungi> yes
19:04:04 <fungi> tomorrow i'll prime the rsyncs and remind the relevant community managers/liaisons
19:04:10 <fungi> but we're on track there
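"Priming the rsyncs" here means doing the bulk copy ahead of time so the final sync during the outage window only has to transfer recent changes. A minimal sketch of that pattern (hosts and paths below are illustrative assumptions, not the actual maintenance commands):

    import subprocess

    # Illustrative source/destination; not the real migration hosts/paths.
    SRC = "oldserver.example.org:/var/lib/mailman/"
    DST = "/var/lib/mailman/"

    def sync(final=False):
        """First (priming) run copies the bulk of the data; later runs
        only transfer what changed since."""
        cmd = ["rsync", "-avz"]
        if final:
            cmd.append("--delete")  # final pass: make DST exactly mirror SRC
        subprocess.run(cmd + [SRC, DST], check=True)

    sync()              # day before: prime while the old server is still live
    # sync(final=True)  # during the window, with mail delivery stopped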
19:04:18 <clarkb> fungi: probably want to update dns records at the same time tomorrow too?
19:04:21 <clarkb> (to reduce TTLs?)
19:04:33 <fungi> i did the dns ttl adjustments early because one of the domains is in cloudflare
19:04:41 <fungi> already crossed off the list
19:04:48 <fungi> wanted to make sure i could still get to it
19:05:32 <clarkb> aha
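A quick way to verify those TTL adjustments is a dnspython spot-check along these lines (the library choice and domain names are assumptions for illustration):

    import dns.resolver  # pip install dnspython

    DOMAINS = ["lists.starlingx.io", "lists.openinfra.dev"]  # illustrative
    MAX_TTL = 300  # expected ceiling after lowering TTLs pre-migration

    for domain in DOMAINS:
        for rtype in ("A", "MX"):
            try:
                answer = dns.resolver.resolve(domain, rtype)
            except dns.resolver.NoAnswer:
                continue
            ttl = answer.rrset.ttl
            print(f"{domain} {rtype}: ttl={ttl}"
                  + ("" if ttl <= MAX_TTL else " STILL HIGH"))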
19:05:32 <fungi> this time we'll approve the change far enough ahead of the window that we don't end up starting late
19:05:55 <fungi> #link https://review.opendev.org/895205 Move OpenInfra and StarlingX lists to Mailman 3
19:05:59 <fungi> please review
19:06:17 <fungi> i plan to approve it by 13:30 utc on thursday
19:06:46 <clarkb> sounds like a plan. Any other prep work besides reviewing that change we can help with?
19:07:05 <fungi> it can technically be approved as far ahead as we like, but it's easier not to have to roll it back if we end up postponing for some reason
19:07:32 <clarkb> and avoids having the two lists of mailing lists diverge
19:07:42 <clarkb> though at this point I suspect we'd say please wait for the migration to complete before adding a new list
19:07:50 <fungi> no remaining prep work for this window, though assuming it goes well i'll start planning for the openstack lists (15:30-19:30 utc on thursday 2023-10-12, a week after their release)
19:08:54 <fungi> i've already jotted some preliminary notes for that one, and plan to split the server/config management deprovisioning into a separate change and put the old server in emergency disable to avoid any accidents
19:09:28 <fungi> you can find the section for the final maintenance at the very end of the previously mentioned pad
19:09:55 <clarkb> sounds good, thank you for pushing this along
19:10:19 <fungi> sure, looking forward to being done with it after what's been about a year of off-and-on effort from several of us
19:11:42 <clarkb> #topic Server Upgrades
19:11:44 <clarkb> nothing new here...
19:11:53 <clarkb> #topic Nodepool image upload situation
19:12:20 <fungi> the timeout increase merged yeah?
19:12:31 <clarkb> yup yesterday
19:12:47 <clarkb> our image builds look good too. The thing we lack in the dashboard is a listing of how old images are in each cloud provider
19:13:00 <clarkb> but I think in a week we can do a nodepool image-list and check for any that are more than 7 days old
19:13:30 <fungi> in semi-related news, cloudnull is back at rackspace and possibly has leverage/mandate to assign effort for fixing some of the problems we've observed
19:13:33 <frickler> there are 6d and 13d old ones
19:14:04 <clarkb> frickler: the 13d images are the fedora ones
19:14:10 <clarkb> we'll need to clean up that dashboard to remove them I think
19:14:11 <fungi> 6d and 13d sounds perfect. looks like we'll be getting 0d and 7d tomorrow
19:14:18 <clarkb> oh you mean in the cloud sorry
19:14:47 <clarkb> all that to say early signs are this is working well enough for us again
19:15:06 <clarkb> but let's check back in a week and ensure that the complete set of changes is looking good
19:15:14 <frickler> ack
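The "older than 7 days" check could be scripted roughly like the helper below; it works on already-parsed (provider, image, upload time) tuples, since the exact `nodepool image-list` output format varies and feeding it real data is left to the operator:

    import datetime

    MAX_AGE = datetime.timedelta(days=7)

    def flag_stale(uploads, now=None):
        """uploads: iterable of (provider, image, uploaded_at) tuples,
        e.g. parsed from `nodepool image-list`. Returns entries older
        than MAX_AGE."""
        now = now or datetime.datetime.now(datetime.timezone.utc)
        return [u for u in uploads if now - u[2] > MAX_AGE]

    # Made-up example data; real input would come from nodepool itself.
    uploads = [
        ("rax-dfw", "ubuntu-jammy",
         datetime.datetime(2023, 9, 18, tzinfo=datetime.timezone.utc)),
        ("rax-dfw", "fedora-36",
         datetime.datetime(2023, 9, 6, tzinfo=datetime.timezone.utc)),
    ]
    for provider, image, ts in flag_stale(uploads):
        print(f"stale: {image} in {provider}, uploaded {ts:%Y-%m-%d}")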
19:15:43 <fungi> we should probably also do another leaked upload cleanup to see if we keep getting more
19:16:42 <clarkb> fungi: good idea
19:16:47 <frickler> yes, now would be a good time to see if the 6h timeout helps with that
19:17:09 <fungi> i can try to find time for that later this week
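Hunting for leaked uploads could look something like this openstacksdk sketch; the cloud name and image-name pattern are assumptions, and anything it flags should be cross-checked against what nodepool still tracks before deleting:

    import re
    import openstack  # pip install openstacksdk

    conn = openstack.connect(cloud="rax")  # assumed clouds.yaml entry

    # Nodepool upload names look roughly like "<image>-<build-sequence>";
    # this pattern is an approximation for illustration.
    NODEPOOL_NAME = re.compile(r"^[a-z0-9.-]+-\d+$")

    known_uploads = set()  # populate from `nodepool image-list` first

    for image in conn.image.images():
        if NODEPOOL_NAME.match(image.name or "") and image.name not in known_uploads:
            print(f"possibly leaked: {image.name} (created {image.created_at})")
            # conn.image.delete_image(image)  # only after manual verification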
19:18:14 <clarkb> thanks. Anything else nodepool related?
19:19:04 <fungi> nothing i'm aware of
19:19:30 <clarkb> #topic Zuul PCRE deprecation
19:19:37 <clarkb> #link https://etherpad.opendev.org/p/3FbfuhNmIWT33fCFK1oK Draft announcement for OpenDev users (particularly OpenStack)
19:19:55 <clarkb> I think corvus was looking for feedback on this before sending it out. At this point I'd say we have chimed in, so corvus you are probably good to send that when ready?
19:20:37 <frickler> note I added some comments in the etherpad, so it shouldn't be sent as is
19:22:04 <clarkb> thank you everyone for reviewing that
19:22:06 <frickler> I also tasked kopecmartin with looking at the qa projects and started doing patches for OSC myself, those are the largest batches of warnings I saw
19:22:21 <clarkb> tripleo-ci repo had a lot of them too last I looked
19:22:29 <frickler> or some of the largest, yes
19:22:50 <frickler> but tripleo was to be retired somehow? need to check the timeline for that
19:23:05 <fungi> yeah, that's one to bring up with the tc
19:23:40 <fungi> i wouldn't sink lots of effort digging into tripleo, we can just ask the tc how they want it handled
19:24:10 <fungi> maybe it can be retired instead
19:24:27 <clarkb> I think that repo supports the stable branches they've kept open but in that case they should be able to fix it
19:24:32 <clarkb> either way we can come up with a solution
19:25:03 <fungi> right. it's a question of whether they're keeping those open in light of the new "unmaintained" state and its opt-in or automatic eol resolution
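For repos chasing these warnings: RE2 (what Zuul is moving to) drops lookaround and backreferences, so a stdlib-only scan like the following can surface candidate patterns in job configs (the construct list is a heuristic that approximates, and will over-match compared to, Zuul's actual validation):

    import pathlib
    import re
    import sys

    # Lookahead/lookbehind and backreferences are unsupported in RE2.
    PCRE_ONLY = re.compile(r"\(\?=|\(\?!|\(\?<=|\(\?<!|\\[1-9]")

    root = pathlib.Path(sys.argv[1] if len(sys.argv) > 1 else ".")
    for path in root.rglob("*.yaml"):
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), 1):
            if PCRE_ONLY.search(line):
                print(f"{path}:{lineno}: {line.strip()}")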
19:25:39 <clarkb> #topic Python image updates
19:25:43 <clarkb> #link https://review.opendev.org/q/(+topic:bookworm-python3.11+OR+hashtag:bookworm+)status:open
19:26:26 <clarkb> I think I've decided that we should avoid the Gerrit update until after the openstack release. The two drivers for this are that I realized there are plenty of other less impactful services that we can work on in the meantime (changes for some pushed up under that link) and that Gerrit made a 3.7.5 release we should also update to at the same time
19:26:50 <clarkb> that means our gerrit update will be a 3.7.4->3.7.5 upgrade, java 11 -> java 17 upgrade, and bullseye -> bookworm upgrade
19:27:07 <clarkb> we can split those up or do them all at once, but either way there's enough changing that I think we should avoid the openstack release window
19:27:32 <frickler> sounds reasonable to me
19:27:33 <clarkb> In the meantime reviews on the other changes I've pushed are appreciated and I think I can approve and monitor those while we wait
19:28:00 <clarkb> note I've asked kopecmartin to weigh in on the refstack update too
19:28:43 <clarkb> I'm hopeful that we'll be able to drop bullseye and python3.9/3.10 image builds soon enough. Then we can look at adding 3.12 builds once that release occurs
19:28:46 <clarkb> but one step at a time :)
19:29:28 <clarkb> #topic Redeploying the InMotion/OpenMetal cloud
19:29:47 <clarkb> I sent email to yuriy to gather some initial information on what this involves. I cc'd fungi and frickler
19:30:25 <clarkb> Yuriy responded, but it wasn't immediately clear to me what a redeployment would be based on if we deployed today. Yuriy did talk about how in the new year they would be on 2023.1 or 2023.2 openstack.
19:30:54 <clarkb> Anyway I asked for further clarification and volunteered to set up time to meet and coordinate if necessary (I think that yuriy in particular finds the more synchronous interaction beneficial)
19:30:55 <frickler> so currently they would be able to deploy yoga iiuc
19:30:57 <fungi> in light of that, it might make sense to delay rebuilding for a few more months
19:31:22 <clarkb> ya I wanted more clarification on the base OS stuff before we decide
19:31:36 <clarkb> but if the base OS doesn't move forward much by redeploying today then a delay may be worthwhile
19:32:01 <fungi> right, that's more the deciding factor than openstack version, since we can in-place upgrade openstack
19:32:29 <clarkb> Other items of note: we should call the new cloud openmetal not inmotion. They seem interested in continuing to help us successfully consume this hardware and service as well. Thank you openmetal for the help
19:33:35 <frickler> but we wouldn't rename the cloud before the redeploy? other than possibly in grafana?
19:34:05 <clarkb> frickler: I think we can if it is important to them. It is a bit of a pain to do as we have to shut down the old provider and start up a new one in nodepool
19:34:10 <fungi> yeah, i would say anything that's a significant lift in the renaming department should get wrapped into the redeployment
19:34:13 <clarkb> doable but if we can tie it into a redeployment that will be easiest
19:34:33 <fungi> we did something similar for internap->iweb though
19:35:05 <frickler> also they want to rework the networking
19:35:34 <frickler> seems currently we have 6 x /28, they'd provide a single /26 instead
19:35:38 <fungi> easy things to rename we can do straight away because why not? and right if they express urgency then we can accommodate that
19:35:51 <clarkb> frickler: yup that should hopefully simplify things for us
19:36:14 <fungi> yes, their platform previously could only allocate networks in /28 prefixes
19:36:27 <fungi> but they've apparently overcome that design limitation now
19:36:35 <fungi> (i wonder if they're any closer to ipv6)
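(For scale: a /28 is 16 addresses, so six of them total 96, while a single /26 is 64; slightly fewer addresses overall, but one contiguous block instead of six separate prefixes to route and manage.)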
19:36:40 <clarkb> It also sounded like we may need to do the self-signed cert thing
19:36:55 <clarkb> kolla supports that so hopefully it's just a matter of setting the correct vars and rerunning kolla
19:37:17 <clarkb> overall though an encouraging first email exchange. I'll try to keep the thread alive and drive it towards something we can base a plan on
19:37:18 <frickler> in 2023.2 kolla might support LE
19:37:48 <fungi> thanks!
19:38:37 <clarkb> #topic Open Discussion
19:38:40 <clarkb> Anything else?
19:38:55 <fungi> if we served the dns for the api endpoints in our opendev.org zone, we could probably wrangle our own le with our usual ansible
19:39:06 <corvus> thanks for the suggestions on the email, i'll update the etherpad later
19:39:14 <fungi> just missing the bit to inject that into kolla
19:39:36 <corvus> just wanted to make sure we were all on board with the tone and the requested actions
19:39:47 <fungi> yep, still lgtm
19:39:54 <frickler> fungi: you'd just place the resulting cert file into the correct location in the kolla config
19:40:30 <fungi> frickler: right, we're missing however we'd tie that file deployment into our ansible
19:40:57 <fungi> i guess we'd put something in our inventory for wherever that is
19:41:34 <clarkb> ianw had started looking into that
19:41:49 <fungi> seems like the linaro cloud could benefit from the same
19:41:50 <clarkb> basically a lighter weight base role application for systems where we don't want to manage firewalls and email and so on
19:42:00 <clarkb> yup it was the linaro cloud where he was proving this out
19:42:21 <clarkb> basically add our users and potentially other lightweight stuff. I could see working some of the LE provisioning into that
19:42:30 <clarkb> speaking of LE I wonder if we can unfork acme.sh yet
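frickler's earlier point about cert placement amounts to something like this sketch, assuming kolla-ansible's conventional haproxy bundle location (all paths are illustrative; check the deployed kolla version's expectations first):

    import pathlib

    # Assumed locations: LE/acme.sh output, and the PEM bundle kolla's
    # haproxy conventionally reads. Both are illustrative.
    CERT = pathlib.Path("/etc/letsencrypt/fullchain.pem")
    KEY = pathlib.Path("/etc/letsencrypt/privkey.pem")
    BUNDLE = pathlib.Path("/etc/kolla/certificates/haproxy.pem")

    # haproxy wants the certificate chain and private key in one file.
    BUNDLE.parent.mkdir(parents=True, exist_ok=True)
    BUNDLE.write_bytes(CERT.read_bytes() + b"\n" + KEY.read_bytes())
    BUNDLE.chmod(0o600)
    # then re-run kolla-ansible (deploy/reconfigure) to pick it up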
19:43:20 <clarkb> Last call for any other items, otherwise we can probably end a bit early today
19:43:59 <fungi> thanks clarkb!
19:44:23 <frickler> ack, thx all
19:45:06 <clarkb> #endmeeting