19:01:01 <clarkb> #startmeeting infra
19:01:01 <opendevmeet> Meeting started Tue Oct  3 19:01:01 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:01 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:01 <opendevmeet> The meeting name has been set to 'infra'
19:01:03 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/OTYE3H5MGJHG2LMHKB6DYOED4HVGO3JL/ Our Agenda
19:01:26 <clarkb> #topic Announcements
19:01:50 <clarkb> OpenStack is finalizing the bobcat release this week (tomorrow if all goes according to plan)
19:03:19 <clarkb> Then just under two weeks from now the PTG will be held
19:03:39 <clarkb> We aren't hosting any PTG meetings ourselves but we should expect people to use meetpad and keep an eye on that
19:03:41 <clarkb> as well as etherpad
19:04:28 <clarkb> Things to be aware of as we are making changes to the systems
19:04:34 <clarkb> #topic Mailman 3
19:04:48 <clarkb> We are down to our last mailman site migration for lists.openstack.org
19:05:01 <clarkb> The current plan is to perform that migration on October 12, 2023 around 1530 UTC iirc
19:05:10 <clarkb> fungi: ^ anything to add to that?
19:05:22 <fungi> nothing to add
19:05:39 <fungi> sorry, tc meeting was distracting and wrapped up late
19:05:41 <clarkb> did you want to bring up the mailserver configuration changes you have proposed?
19:05:58 <clarkb> In particular I think we can go ahead and add the nordix list and bitbucket the mailman address
19:06:06 <clarkb> I've +2'd both changes
19:06:17 <fungi> trying to organize my thoughts...
19:07:02 <fungi> so some of the proposed changes are driven by the recent exim and libspf2 vulnerabilities which were announced
19:08:08 <fungi> i guess it's worth noting that there's at least one unfixed (from our perspective) buffer underrun in libspf2 and our listservs are the only places that's potentially exploitable by remote connection
19:08:32 <fungi> #link https://review.opendev.org/897078 Temporarily disable SPF checking on ML servers
19:09:59 <fungi> up for discussion whether that's something we want to do. it seems like exploiting it would require a malicious recursive resolver, but also there's no clear agreement on whether the suspected underrun is the one zdi claimed to be able to exploit (the author of the fix for the publicly known underrun didn't succeed in finding a way to exploit it)
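(For context on what 897078 would change: a minimal sketch of the sort of exim ACL statement it comments out, assuming the usual libspf2-backed spf condition; the actual opendev exim template may look different.)

    # hypothetical exim ACL excerpt; commenting out the spf condition means
    # exim never calls into libspf2 for inbound connections
    acl_check_rcpt:
      # deny message = SPF validation failed
      #      spf     = fail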
19:10:41 <fungi> we've also got a configuration change for something that was probably just missed in the switch of our server configs to adapt for mm3:
19:10:50 <fungi> #link https://review.opendev.org/897086 Blackhole deliveries for Mailman v3 local user
19:11:22 <fungi> that should be safe, and would silence a lot of errors in the mta log on the new server as well as freeing up a lot of cruft in its deferral queue
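(Illustration of the blackhole approach, assuming the standard exim aliases mechanism rather than the exact contents of 897086: exim's redirect/system_aliases router understands a :blackhole: item that silently discards mail for a local user.)

    # /etc/aliases style entry: deliveries to the local mailman user are discarded
    mailman: :blackhole: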
19:11:43 <fungi> finally there's a proposal to add a new mailing list:
19:11:52 <fungi> #link https://review.opendev.org/897234 Add mailing list for Nordix environment
19:12:36 <fungi> notable as it would be the first mailing list added to our production mm3 server through the continuous deployment automation of our config, rather than as part of a migration effort
19:12:56 <fungi> i expect it will "just work" but we should check on it after deployment
19:13:34 <fungi> as for the upcoming lists.openstack.org migration, i still need to flesh out the migration plan and write the config changes for that
19:13:41 <clarkb> ya our testing does create mailing lists so it should have coverage but good to confirm
19:13:48 <fungi> well, and we've created "new" lists on the server before each new migration
19:14:32 <fungi> since they're created by ansible before we run the imports
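(Very rough sketch of what such a list addition looks like in the ansible config; the variable and field names below are guesses for illustration, not the actual system-config structure or the real list details from 897234.)

    # hypothetical host_vars snippet for the mm3 server
    mailman_sites:
      - name: lists.example.org          # placeholder domain
        lists:
          - name: example-list           # placeholder list name
            description: Example list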
19:15:19 <fungi> for the final maintenance, i'll also send a one-week reminder to the openstack-discuss ml on thursday
19:16:22 <clarkb> thanks. Anything else?
19:16:27 <fungi> nothing on my end, no
19:17:53 <clarkb> #topic InMotion/OpenMetal Cloud Replacement
19:18:22 <clarkb> I haven't followed up with yuriy again to restart the conversation. part of the reason for the delay is a realization that yuriy would probably just prefer to have a phone call (on meetpad)
19:18:58 <clarkb> I'm thinking I should propose something for later this week. Does Friday morning pacific time seem reasonable? I've also got some stuff at home so may have to jump around
19:19:17 <fungi> i can make that
19:20:52 <clarkb> cool I'll try to propose that later today
19:21:02 <clarkb> #topic Zuul PCRE Deprecation
19:21:28 <clarkb> I don't see tempest in the list of warnings zuul provides
19:21:44 <clarkb> still plenty of warnings, but I don't think there is anything special we need to do at this point? Any concerns with dropping this off of our agenda?
19:22:04 <fungi> i have no concerns with that
19:22:26 <tonyb> Should we (I) propose some patches to fix the errors?
19:22:40 <tonyb> or do we want the teams to do that?
19:22:58 <clarkb> tonyb: frickler has been driving it so far I would coordinate with him. In general I think we're hoping projects become a bit more responsive. It also isn't super urgent
19:23:02 <tonyb> Kind of a "here's an attempt, go nuts"
19:23:10 <tonyb> Okay
19:23:34 <clarkb> part of the struggle in the past has been we write a bunch of changes then no one reviews them so it feels like wasted effort
19:23:53 <clarkb> better if they write the patches knowing they can review them and we just trim projects out of zuul when their configs are invalid. But there is a balancing act
19:24:10 <tonyb> Okay
19:24:21 <fungi> i have changes along those lines that are still open since 2015
19:24:51 <fungi> i take that back, i have a change that's still open from 2014
19:24:54 <tonyb> hehe Okay
19:25:46 <clarkb> in general though the zuul features to work around the loss of regex functionality seem to be sufficient for us so far, so we don't need to give feedback to zuul on the change
19:25:54 <clarkb> just a matter of getting projects to update their configs
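(For reference, the usual fix for the deprecation warnings is to drop lookaround regexes, which RE2 can't do, in favor of zuul's negate flag on the matcher. A minimal sketch, assuming a job that should not run on a couple of old stable branches:)

    # sketch of a zuul.yaml branch matcher without negative lookahead
    - job:
        name: example-job
        branches:
          - regex: ^stable/(ocata|pike)$
            negate: true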
19:26:00 <clarkb> #topic Python Container Updates
19:26:06 <clarkb> #link https://review.opendev.org/q/(+topic:bookworm-python3.11+OR+hashtag:bookworm+)status:open
19:26:30 <clarkb> As mentioned last week zuul-registry relies on openssl 1.x and breaks under bookworm. There are however a few other things we can update under that topic if you have time for review
19:26:50 <clarkb> Early next week I would like to land the Gerrit java 17 + bookworm update as well
19:26:57 <clarkb> as that is the last major one still pending
19:28:02 <clarkb> #topic Etherpad 1.9.3 Upgrade
19:28:14 <clarkb> I haven't heard from frickler on whether or not there was time to test this
19:28:31 <clarkb> tl;dr is I noticed some weirdness that appeared to be due to browser caching, because after switching to incognito mode 1.9.3 was fine
19:28:45 <clarkb> we were hoping to get more data points from people testing against the held 1.9.3 node before we upgraded
19:29:12 <clarkb> tonyb: ^ if you'd like to do that we can dig up the IP address. You have to edit /etc/hosts to point etherpad.opendev.org at it due to redirects, but then you'll talk to the test server with its bad cert
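(The override is just one extra /etc/hosts line mapping the production name to the held node; the address below is a placeholder, the real one is in the review comments.)

    # /etc/hosts on your workstation
    203.0.113.10  etherpad.opendev.org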
19:29:13 <fungi> if we're going to update it, soon would be preferable as we're approaching another ptg and we want to stabilize the deployment
19:29:22 <tonyb> Yes please
19:29:26 <fungi> ip address is in my comment on the upgrade change
19:29:28 <clarkb> fungi: yup, I'm hoping we can do it this week
19:29:51 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/896454 Change to upgrade has test node details in comments
19:31:08 <clarkb> let us know if you test it and what you find. Then we can decide on upgrading or not
19:31:27 <tonyb> Will do
19:31:29 <clarkb> #topic Gitea 1.21
19:31:48 <clarkb> There was timeline discussion in the gitea discord room the other day and they indicated it was ~3 weeks away
19:32:14 <clarkb> We also got a 1.20.5 update recently. We should upgrade to that before we continue 1.21 planning/testing
19:32:20 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/897244 update to 1.20.5 first
19:32:44 <fungi> has a changelog for 1.21 appeared yet?
19:32:54 <clarkb> The 1.20.5 changelog looks very straightforward. We can probably merge that right now if we want, but can also delay until after the openstack release to be careful
19:33:00 <clarkb> fungi: I have not seen a 1.21 changelog yet
19:34:10 <clarkb> #topic Gerrit replication leaked task files
19:34:14 <clarkb> #link https://gerrit-review.googlesource.com/c/plugins/replication/+/387314 Clarkb wrote and pushed a fix upstream
19:35:01 <clarkb> The bulk of my time last week was spent around this. I was asked to write unit tests but I couldn't even get gerrit and its plugins to build successfully locally, which meant I couldn't run tests (also there are no docs on how to run plugin tests)
19:35:40 <clarkb> I pivoted to learning how to make that work first. Turns out every version of Gerrit needs a different version of Bazel to build. If you don't use the correct version of Bazel then the linters will fail (and possibly other things).
19:35:56 <clarkb> To address this apparently everyone using bazel uses Bazelisk to run bazel
19:36:36 <clarkb> once I figured this out and installed bazelisk I was able to build Gerrit and run tests. I pushed a docs update upstream to add bazelisk info and examples, which is in review. Then I iterated on tests until I had some that fail on the old code and pass with my update
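(A hedged sketch of that workflow for anyone repeating it locally; exact targets can vary by gerrit version.)

    # bazelisk reads .bazelversion in the gerrit checkout and fetches the matching bazel
    go install github.com/bazelbuild/bazelisk@latest
    cd gerrit
    bazelisk build plugins/replication          # builds the plugin jar
    bazelisk test //plugins/replication/...     # runs the plugin's tests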
19:37:00 <clarkb> If anyone else wants to dive into Gerrit things I'm happy to help more directly as well. Hopefully the docs updates land at some point though
19:37:24 <clarkb> All that to say I'm hopeful we can fix this issue before we upgrade to 3.8, or maybe as part of the 3.8 upgrade, depending on how things merge and get applied to stable branches
19:38:34 <clarkb> once the bookworm upgrade is done and the plugin is fixed that will be the next gerrit task on my todo list
19:38:53 <clarkb> we already have some testing of the 3.7 to 3.8 upgrade but we'll need to read release notes and probably hold nodes for testing, particularly of the rollback
19:39:00 <clarkb> #topic Open Discussion
19:39:05 <clarkb> That was all I had written in the agenda
19:39:16 <clarkb> One thing I noticed today is that our arm64 image builds all seem to be failing
19:39:54 <clarkb> I was going to say I bet the builder filled its disk. I decided to just check really quickly and that is indeed the issue
19:40:29 <clarkb> so we'll want to stop services, clean out the dib tmp dir and remove any leaked image build files, reboot for good measure (clears mount tables) and restart things
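(Sketch of those recovery steps; service names and paths here are illustrative, not lifted from the actual playbooks.)

    # on the arm64 builder
    sudo docker stop <nodepool-builder-container>   # stop the builder (placeholder name)
    sudo rm -rf /opt/dib_tmp/*                      # remove leaked dib build dirs and image files
    sudo reboot                                     # clears stale entries from the mount table
    # then start the builder container again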
19:41:31 <clarkb> oh! the other gerrit thing I wanted to mention is their community meeting is happening at 8am Pacific Time Thursday October 5 on Discord
19:41:32 <tonyb> When I was doing DIB work I had a tool that'd clear out the mount tables without a reboot, I can try and dig that up for next time
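(A hedged one-liner in that spirit, assuming the leaked mounts all live under the builder's dib tmp dir; unmounting deepest paths first lets nested mounts come off cleanly.)

    grep /opt/dib_tmp /proc/mounts | awk '{print $2}' | sort -r | xargs -r -n1 sudo umount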
19:42:15 <clarkb> I'm going to try and be at the gerrit meeting. I joined their discord. The only gotcha is I have to drop the kids off at school that day. I can probably get them there early and make it back in time for the start of the meeting
19:42:40 <clarkb> previously there were issues with that meeting because no one from google would join to let us into the google meet room
19:42:46 <clarkb> they solved that by hosting it in discord I guess
19:43:17 <clarkb> I'm going to try and be involved upstream a bit more; since I've gone through the process of learning some of their tooling I may as well fix the occasional bug
19:43:23 <tonyb> You could also do discord on your phone ;P
19:43:46 <clarkb> that is a good idea actually then I can walk home at a normal pace :)
19:44:11 <clarkb> Last call anything else to discuss?
19:44:27 <tonyb> nope.
19:47:06 <clarkb> thank you everyone for your time today and in general for helping to run opendev
19:47:12 <clarkb> we'll be back here same time and location next week
19:47:14 <clarkb> #endmeeting