19:00:21 <clarkb> #startmeeting infra
19:00:21 <opendevmeet> Meeting started Tue Jan 23 19:00:21 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:21 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:21 <opendevmeet> The meeting name has been set to 'infra'
19:01:22 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/HPFGK4QDZU24FUZFA6BHEAYLQIG224WD/ Our Agenda
19:01:28 <clarkb> #topic Announcements
19:01:38 <clarkb> Service coordinator nominations open February 6, 2024 - February 20, 2024
19:01:46 <clarkb> I made that official in an email to the service-discuss list
19:01:50 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/TB2OFBIGWZEYC7L4MCYA46EXIX5T47TY/
19:01:57 <clarkb> Happy to answer any questions people have about that
19:03:24 <clarkb> #topic Server Upgrades
19:03:32 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/905510 Upgrading meetpad service to jammy
19:03:52 <clarkb> I think we're largely waiting on a second reviewer for this stack? tonyb testing was happy after we fixed the websockets issue?
19:05:02 <tonyb> Yup testing was definitely happier
19:05:52 <clarkb> cool
19:05:55 <tonyb> There are still certificate issues
19:05:57 <clarkb> #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades
19:06:04 <clarkb> ya the hsts stuff is annoying
19:06:09 <tonyb> but I think having real LE certs will help that
19:06:18 <clarkb> turns out there are ways to work around it like using incognito mode
19:06:40 <clarkb> however in this case I'm not sure if it will help since jitsi will send the headers for strict verification
19:07:33 <tonyb> I was using ... https://paste.opendev.org/show/blYpdCY39nZbSijUoV2l/ ... which seemed to "do better"
19:08:52 <clarkb> that's good to know
19:09:07 <clarkb> in the etherpad I linked above there are notes for wiki replacement
19:09:19 <clarkb> tonyb: did you have a chance to see the notes fungi and I added? any concerns with what we wrote?
19:11:05 <tonyb> I read them. I don't have any concerns
19:11:19 <clarkb> anything else on this topic?
19:11:37 <fungi> quick summary is that we want to make sure the openid and patrolling extensions work, the others are likely less important for now
19:12:08 <clarkb> that's for the wiki, in case there is any confusion with the ongoing gerrit + openid problems
19:12:11 <fungi> overall the plan lftm
19:12:13 <fungi> er, lgtm
19:12:21 <tonyb> Yup.  I'm fairly confident that as the first step will be to deploy the *same* git versions but containerised we'll be okay
19:12:32 <tonyb> ... upgrades will be more work
19:12:51 <tonyb> but really I want to decouple the OS and application
19:12:53 <fungi> thanks!
19:13:32 <tonyb> yw
19:13:44 <tonyb> I think we're ready for #next_topic
19:13:53 <clarkb> #topic Python Container Updates
19:14:01 <clarkb> Nothing new here that I am aware of
19:14:27 <clarkb> I think this has largely been on the far end of the priority list due to all the other stuff going on. And that's ok. We've minimized the number of images we have to build and support etc
19:14:45 <clarkb> #topic Upgrading Zuul's DB Server
19:14:56 <clarkb> I kept this on the agenda despite the general agreement on a rough plan last week
19:15:13 <clarkb> mostly because I seem to recall saying I would let people object for the next week
19:15:35 <clarkb> any objections to the plan of trying to run a mysql/mariadb for zuul that we can eventually cluster later if we get to it?
19:15:37 <fungi> also worth noting, there were several outages of that trove instance, though zuul seems to have weathered them okay
19:16:16 <clarkb> if there are no objections now then I think we can move on and remove this topic from the agenda for next week
19:16:36 <corvus> no objections here
19:16:37 <fungi> the silent ayes have it
19:16:51 <clarkb> #topic AFS Quota Issues
19:17:07 <clarkb> I haven't made any progress on ubuntu ports trimming
19:19:45 <clarkb> #topic Broken Wheel Builds on CentOS
19:20:15 <clarkb> I think we have openafs packages that should work now
19:20:26 <fungi> that's what it sounded like yesterday
19:20:35 <clarkb> In theory I would expect that means we've got more jobs passing for this now. However, I think the final publication stuff may still have the wrong volume names?
19:22:19 <clarkb> probably worth doing another pass on checking the job statuses
19:22:25 <fungi> as in they're publishing with the wrong platform stub in the filenames?
19:22:27 <clarkb> as I suspect we may have pushed the failure forward to the next broken thing
19:22:48 <clarkb> fungi: ya the centos8 amd64 stuff fails because it was trying to publish to a volume openafs claimed didn't exist
19:22:50 <fungi> that was something that changed in a recent ansible, if memory serves
19:23:01 <clarkb> I suspect we may have made a stream volume but we're still pushing to the non stream location which was cleaned up?
19:23:14 <fungi> i think it broke for some rh-like platforms in our last ansible update
19:23:44 <fungi> where the release var started including the minor number rather than just the major
19:23:49 <tonyb> There was a similar change for Debian a while back
19:24:00 <clarkb> ah ya that could be part of the problem
19:24:40 <fungi> we switched which ansible var we use in some places, but probably missed some too
19:24:42 <clarkb> in any case we should find time to do another pass on job statuses and failure and take it from there
19:25:52 <clarkb> #topic OpenDev Pre PTG
19:26:00 <clarkb> Looking at trying to do two days, Wednesday and Thursday, sometime in February: February 7+8 or February 14+15 or February 21+22
19:26:06 <clarkb> Have two blocks of time each day one that works better for EU and another for APAC. Probably 14:00-16:00UTC and 22:00-00:00 UTC.
19:26:23 <clarkb> I haven't heard any objections to any of these days or times
19:26:37 <clarkb> I'm kinda leaning towards the 14th and 15th as that gives time to prepare but isn't so far out in the future
19:27:10 <clarkb> I'm happy to hear feedback though. This is me mostly trying to accommodate what I perceive to be the issues with various timezones as well as my own meeting schedule (tuesdays are really busy)
19:27:18 <fungi> i have schedule availability for all of them, but yeah maybe sooner is better than immediately before ptg week
19:27:39 <corvus> 14/15 slightly better for me
19:27:42 <frickler> ptg is a full month later?
19:27:53 <clarkb> frickler: ptg is first week of april I think /me double checks
19:28:00 <tonyb> 14+15 works best for me
19:28:04 <clarkb> April 8 - 12 is the PTG
19:28:29 <fungi> oh, yeah, i guess any of them is more than a month before the ptg
19:28:38 <frickler> anyway I'm fine with any of these dates
19:28:51 <clarkb> fungi: ya I mostly want to get a head start on some of this stuff
19:28:59 <fungi> absolutely
19:29:17 <clarkb> as far as topics go I'd like to do a group brainstorm/planning/prioritization sort of thing for the various debt we've got hanging around
19:29:37 <tonyb> I could do the "EU" timeslot but I do worry I wouldn't be a great asset
19:30:10 <clarkb> think podman / modern docker compose, mariadb upgrades, keycloak id stuff, openmetal/inmotion cloud redeployment, prometheus, deprecated zuul configs (think stdout/stderr split in command tasks) and so on
19:30:42 <clarkb> I've got a set of notes in my notetaking file that I need to transplant into an etherpad and then others can also add ideas as well as indicate interest on topics so that we can do our best to accommodate split scheduling
19:30:58 <fungi> also i'd generally throw a "sustainability" discussion item in there
19:31:03 <clarkb> ++
19:31:29 <clarkb> let's also say we'll do it February 14 and 15
19:32:05 <fungi> wouldn't hurt to do a "big picture" overview of everything we're still managing and ask ourselves what else might be on the losing side of cost vs benefit
19:32:07 <clarkb> and then we can use the meetpad corresponding with the planning etherpad as the location
19:32:30 <clarkb> fungi: that's a great idea. I think that sort of big picture will help with the prioritization aspect of figuring out where to apply ourselves
19:33:13 <tonyb> ++
19:33:23 <clarkb> once other things settle down I'll send emails making this all official
19:33:30 <clarkb> and work on getting that agenda etherpad populated
19:33:39 <fungi> thanks!
19:33:48 <clarkb> #topic Open Discussion
19:34:00 <clarkb> speaking of other things settling down the gerrit openid stuff has been fun
19:34:10 <fungi> "fun"
19:34:27 <clarkb> maybe we should call the meeting early and get back to that? Is there anything else to bring up now?
19:34:38 <frickler> inmotion failures?
19:35:00 <clarkb> oh ya I haven't had time to look at that
19:35:17 <clarkb> tonyb: maybe after today's school run we can dig into that together?
19:35:22 <frickler> the mirror host seems to be offline since sunday
19:35:36 <clarkb> tonyb: would be a good introduction to how things are set up there because it, like the linaro cloud, is "different"
19:35:40 <frickler> but nodepool seems to have been unhappy for much longer
19:36:03 <tonyb> clarkb: Sounds good to me
19:36:26 <clarkb> frickler: in the past we've definitely seen things leak in ways that create nodepool failures that slowly get worse over time
19:36:42 <clarkb> wouldn't surprise me if it finally got bad enough that the mirror had nowhere to run but we'll have to check logs
19:36:54 <fungi> i did add the inmotion mirror to the emergency disable list, in order to get the base deploy job working again. it's been failing since sunday, which coincides with the errors ironic reported
19:37:01 <tonyb> Oh, I have a meeting 2100-2200UTC but apart from that ...
19:37:07 <frickler> ack
19:37:14 <clarkb> tonyb: ya I think school pickup is 2200-2300 ish
19:37:36 <tonyb> Okay perfect
19:37:43 <clarkb> I have to walk because I'll be without the car
19:40:14 <clarkb> sounds like that may be it and we need to get back to debugging gerrit things
19:40:17 <clarkb> thank you everyone!
19:40:20 <clarkb> #endmeeting