19:00:05 <clarkb> #startmeeting infra
19:00:05 <opendevmeet> Meeting started Tue Apr 30 19:00:05 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:06 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:06 <opendevmeet> The meeting name has been set to 'infra'
19:00:14 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/M4JKPPJYDIJT5EQTKPCIANUZ6WNFOO5T/ Our Agenda
19:00:29 <clarkb> #topic Announcements
19:01:06 <clarkb> I didn't have anything to announce. More just trying to get back into the normal swing of things after the PTG and all that
19:01:18 <clarkb> Did anyone have anything to announce?
19:01:43 <frickler> just that I'm off for the remainder of this week
19:02:09 <clarkb> I'll be around. Though I do have parent teacher conference stuff thursday morning. But otherwise I'm generally around
19:02:51 <clarkb> #topic Upgrading Old Servers
19:03:30 <clarkb> tonyb: with our timezones a bit better aligned for a bit I'm happy to help dive into this again if your time constraints allow it
19:03:31 <tonyb> No progress but I'm in the US and it's next on my todo list to clear out my older reviews
19:03:43 <clarkb> cool feel free to ping me if I can help in any way
19:03:49 <tonyb> Will do
19:04:03 <tonyb> I'll also start an etherpad for the focal servers
19:04:29 <fungi> i'll be disappearing on thursday and gone for 11 days
19:04:31 <clarkb> ++ getting a sense of the scale of the next round of things would be good. I'm hoping that generally as we're more and more in containers this becomes easier
19:05:01 <fungi> (sorry, i missed the #topic change)
19:05:06 <clarkb> no problem
19:05:15 <clarkb> #topic MariaDB Upgrades
19:05:26 <clarkb> We've done all of the services except for Gerrit at this point
19:05:42 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/916848
19:06:20 <clarkb> This is a change to do that, but it won't be automated after landing (all of the other upgrades were). I'm 99% positive that this will simply update the docker-compose file instead and we'll need to pull and down and up -d by hand to get that through
19:06:35 <clarkb> that means there will be a short gerrit outage so any thoughts on when we should do that?
19:08:30 <tonyb> Not really.
19:08:49 <clarkb> ok I think this one is also relatively low priority since gerrit barely depends on that db at this point
19:09:06 <clarkb> reviews would be great then if an opportunity presents itself we can merge it quickly and restart services?
19:09:15 <tonyb> Sounds good
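For context, the manual pull/down/up sequence clarkb describes can be sketched as a dry-run shell snippet. The compose directory path is hypothetical, and `DRY_RUN=1` only prints the commands so the sequence can be reviewed before anyone runs it for real:

```shell
# Dry-run sketch of the manual MariaDB image bump for the Gerrit compose stack.
# The compose directory path is an assumption; set DRY_RUN to empty to execute.
DRY_RUN=1
run() {
  if [ -n "$DRY_RUN" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

cd /etc/gerrit-compose 2>/dev/null || true  # hypothetical compose dir
run docker-compose pull     # fetch the mariadb image named in the updated docker-compose.yaml
run docker-compose down     # stop the containers (start of the short Gerrit outage)
run docker-compose up -d    # restart everything on the new image
```

The `down`/`up -d` pair is what creates the brief outage window discussed above, which is why timing it needs agreement first.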
19:09:53 <clarkb> #topic AFS Mirror Cleanups
19:10:02 <clarkb> topic:drop-ubuntu-xenial has changes up for review.
19:10:20 <clarkb> That said this is going to be a slow process of chipping away at things.
19:10:48 <clarkb> One question I had for the group is whether or not you think I should push up changes to remove projects from zuul's tenant config rather than try and fix up their zuul configs properly
19:11:19 <clarkb> in particular there are a number of x/* projects with python35 jobs. I think Xenial sort of coincided with the era of everyone making a project for everything and as a result ended up with test configs that want to run on xenial
19:11:38 <clarkb> and for those it may just be easiest to remove them from zuul's tenant config entirely rather than try and coerce their configs into the future
19:12:40 <tonyb> I feel like dropping projects in that state is reasonable.  Assuming it's announced and we restore any somewhat quickly?
19:13:00 <clarkb> ya restoring them shouldn't be an issue. They will have to make their first change a zuul config cleanup but otherwise should be straightforward
19:13:07 <clarkb> (and that fixup might be to reset to noop jobs)
19:13:26 <frickler> +1
19:14:51 <clarkb> ok I'll continue down that path then and hopefully we eventually reach a point where its like 80% done and we can announce a removal date and let the fallout fall from there
19:14:54 <fungi> i'm in favor
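As a concrete example of the "reset to noop jobs" fixup mentioned above, a restored project's first change could carry a minimal Zuul config along these lines (a sketch, not a prescribed config):

```yaml
# Hypothetical minimal .zuul.yaml for a restored project:
# run only Zuul's built-in noop job until the real jobs are fixed up.
- project:
    check:
      jobs:
        - noop
    gate:
      jobs:
        - noop
```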
19:15:03 <frickler> I also just noticed an issue with the deb-ceph-reef mirror I created
19:16:00 <clarkb> frickler: looks like we need the symlink into the apache web space since it is a separate mirror entirely
19:16:03 <frickler> seems the volume is named mirror.ceph-deb-reef, while the correct name would be just mirror.deb-reef
19:16:38 <clarkb> hrm we try to keep those short due to afs limits, not sure what the limit is. Maybe that name is ok as is?
19:16:49 <clarkb> and then we just need to realign things? Or maybe we can simply rename the volume?
19:17:19 <frickler> I'm not sure, I just notice that the vos release in the reprepro log is failing with "vldb not found"
19:17:24 <frickler> anyway, if it is not considered urgent, I can look at that next week
19:17:39 <clarkb> ya I don't think it is urgent since it is new rather than affecting existing jobs
19:17:42 <frickler> but if one of you wants to fix it, go ahead
19:17:45 <clarkb> ack
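If renaming the volume turns out to be the right remediation (it was only floated as a possibility above, not decided), the OpenAFS commands might look like the following dry-run sketch. The volume names come from frickler's comment; whether this fixes the "vldb not found" failure is untested:

```shell
# Dry-run sketch: print the OpenAFS commands that would rename the
# mis-named mirror volume and re-release it. Untested against this deployment.
DRY_RUN=1
run() {
  if [ -n "$DRY_RUN" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

run vos rename -oldname mirror.ceph-deb-reef -newname mirror.deb-reef
run vos release mirror.deb-reef   # re-release so read-only sites pick up the change
```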
19:18:23 <clarkb> #topic Adding Ubuntu Noble Test Nodes
19:18:56 <clarkb> The changes we needed in glean and opendev nodepool elements all landed and dib is building noble nodes in its testing
19:19:20 <clarkb> I think the next step is to add noble to our ubuntu mirror then we can add noble image builds to nodepool
19:19:49 <frickler> I also just mentioned in the TC meeting that help on this would likely be welcome
19:20:14 <clarkb> need to check that we have room in the ubuntu mirror volume first (and bump quotas if necessary. Hopefully we don't have to split the volume due to the 2TB limit but I don't think we are there yet)
19:20:25 <frickler> (even more so for the devstack etc. part that would still need to follow)
19:20:33 <clarkb> but then it should be some pretty straightforward copy pasta from the existing mirror stuff for ubuntu
19:21:25 <frickler> mirror.ubuntu is at 850G in total, so not close to 2T at least
19:21:28 <clarkb> ubuntu is 6GB short of the 850GB quota limit. And I think openafs is limited to 2TB
19:21:44 <clarkb> so ya we probably need to bump the quota to something like 1200GB and then we should be good to land the change
19:22:14 <clarkb> and once that is done similar copy paste with the nodepool configs to build new images there. And then its the long process of getting stuff onto the new node type
19:22:25 <frickler> oh, note to self: adding the reef volume to grafana is also missing
19:23:32 <clarkb> assuming I'm able to come up for air on the things I've got in flight I can probably look at noble stuff but it might be a good thing for someone else to push along. On the config side it's largely copy paste and then checking results. Only really need infra root for the quota bump and possibly to hold locks for a manual seed of the mirror
19:23:54 <clarkb> I guess if anyone sees this and is interested in helping out let us know and we'll point you in the right direction
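For reference, the quota bump discussed above is mostly arithmetic, since OpenAFS `fs setquota` takes its `-max` value in kilobytes. The volume path below is an assumption based on the usual OpenDev mirror layout:

```shell
# Sketch of the quota math for bumping mirror.ubuntu from 850GB to 1200GB.
NEW_QUOTA_GB=1200
AFS_VOLUME_LIMIT_GB=2048          # the ~2TB per-volume ceiling mentioned above
NEW_QUOTA_KB=$((NEW_QUOTA_GB * 1024 * 1024))  # fs setquota wants kilobytes

# Path is an assumption; printed rather than executed:
echo "fs setquota /afs/.openstack.org/mirror/ubuntu -max $NEW_QUOTA_KB"
if [ "$NEW_QUOTA_GB" -lt "$AFS_VOLUME_LIMIT_GB" ]; then
  echo "still under the per-volume limit"
fi
```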
19:24:04 <clarkb> #topic Etherpad 2.0.3 Upgrade
19:24:10 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/914119 Upgrade etherpad to 2.0.3
19:24:22 <fungi> i think that's ready to go now
19:24:36 <clarkb> we are currently running 1.9.7 or something like that. At first the 2.0 update wasn't a big deal as it mostly changed how you install etherpad so we update the docker file and are good to go
19:25:07 <clarkb> then 2.0.2 broke api auth and we needed 2.0.3 to add a viable alternative. That release happened yesterday and a node is up and held for testing which fungi and I used for testing and all seems well
19:25:31 <clarkb> so ya if the rest of the change (docker file updates, docs updates, test updates) look good I think we can proceed with upgrading this service
19:25:47 <clarkb> We do need to add a new private var to bridge before landing the change
19:26:16 <clarkb> fungi: maybe we give others a chance to review between now and tomorrow morning then send it tomorrow morning if no one objects by then?
19:26:23 <fungi> wfm
19:27:41 <clarkb> #topic Gerrit 3.9 Upgrade Planning
19:27:49 <clarkb> #link https://etherpad.opendev.org/p/gerrit-upgrade-3.9 Upgrade prep and process notes
19:28:18 <clarkb> I've started on this (need to actually perform the upgrade process and downgrade process on a test node in order to take notes) and I think overall this is a straightforward upgrade
19:28:54 <clarkb> You can go over my notes and skim the release notes yourselves to see if I've made any errors in judgement or overlooked important changes so far. Feel free to add them to the etherpad if so
19:29:26 <clarkb> There are a few things worth mentioning. First is that we have the option of making diff3 the diff method for merge changes. This adds a third piece of info which is the base file state in addition to theirs and ours
19:29:42 <clarkb> There is a new default limit of 5000 changes per topic
19:30:05 <clarkb> we can increase that value if we think it is too low. I suspect that our largest topics are going to be things for like openstack releases which maybe have a few hundred?
19:30:46 <clarkb> And finally there is a new option to build the gerrit docs without external resources, but that option isn't part of the release war build so I've been asking upstream (with no luck yet) about how to combine this option with building a release war
19:31:40 <fungi> any idea how the changes per topic limit can be checked ahead of time, and what the outcome is if it's exceeded during upgrade?
19:32:07 <clarkb> fungi: no, but those are good questions. I'll try to followup with them upstream. Worst case thursday is the monthly community meeting and I should be able to get more info there
19:32:14 <fungi> also what happens when the 5001st change with the same topic is pushed (rejection with some specific error message i guess)
19:32:35 <clarkb> we should be able to test some of that on a held node easily. Just set the limit to 1 and then try and add a second change to a topic
19:32:57 <fungi> good point
19:33:17 <clarkb> fungi: maybe scribble those notes under the item on the etherpad and I'll followup with more info
19:33:53 <clarkb> as far as upgrade planning goes I suspect we can upgrade before the end of May. Maybe on the last day of May given various holidays and vacation and all that
19:34:10 <clarkb> I'll propose something more concrete next week after a bit more testing then we can announce it
19:35:04 <tonyb> If I'm doing it right new-release topic would be waaaay over 5k
19:35:04 <clarkb> #topic Wiki Cert Renewal
19:35:12 <clarkb> #undo
19:35:12 <opendevmeet> Removing item from minutes: #topic Wiki Cert Renewal
19:35:26 <clarkb> tonyb: you mean that openstack releases produce a topic with over 5000 changes?
19:36:04 <frickler> does that limit include merged changes or only open ones?
19:36:05 <tonyb> https://review.opendev.org/q/topic:new-release,5000
19:36:25 <clarkb> frickler: that is one of the open questions I made upstream that hasn't had a response yet
19:36:33 <tonyb> If it's only open then we'd be fine
19:36:41 <tonyb> Okay.
19:36:50 <clarkb> tonyb: oh I see releases use the same topic each time so they've built up over time.
19:37:27 <fungi> i tried to cover all those items in the pad
19:37:43 <clarkb> interesting that we're right around the limit too. Seems like somewhere between 5100 and 5200 changes attached to that topic
19:37:47 <frickler> yes, I was checking "formal-vote" which came to my mind first, but that's only at 750
19:38:20 <tonyb> clarkb: Ok so waaaay over was an overstatement
19:38:33 <clarkb> tonyb: but if there are problems 1 over is probably sufficient to find them :)
19:38:44 <clarkb> I'll continue to followup and try to attend the community meeting to ask directly there as well
19:38:52 <tonyb> clarkb: Thanks
19:39:02 <clarkb> #topic Wiki Cert Renewal
19:39:19 <clarkb> This is just a note to make sure people know I've said I'll deal with this ~1 week before expiry
19:39:33 <clarkb> Don't really have anything new to say. But didn't remove it from the agenda since the cert hasn't been renewed yet
19:39:53 <fungi> i'll be back from vacation a few days before it expires and can do the file installation part then
19:40:02 <clarkb> ack thanks
19:40:07 <clarkb> #topic Open Discussion
19:40:09 <clarkb> Anything else?
19:43:45 <clarkb> sounds like that is probably it
19:43:47 <clarkb> Thank you everyone
19:44:16 <clarkb> we'll be back here next week at the same time and location. I suspect there will be fewer of us but enough to have the meeting and sync up on what is going on
19:44:20 <clarkb> #endmeeting