19:00:21 #startmeeting infra
19:00:21 Meeting started Tue Jan 23 19:00:21 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:21 The meeting name has been set to 'infra'
19:01:20 link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/HPFGK4QDZU24FUZFA6BHEAYLQIG224WD/ Our Agenda
19:01:22 #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/HPFGK4QDZU24FUZFA6BHEAYLQIG224WD/ Our Agenda
19:01:28 #topic Announcements
19:01:38 Service coordinator nominations open February 6, 2024 - February 20, 2024
19:01:46 I made that official in an email to the service-discuss list
19:01:50 #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/TB2OFBIGWZEYC7L4MCYA46EXIX5T47TY/
19:01:57 Happy to answer any questions people have about that
19:03:24 #topic Server Upgrades
19:03:32 #link https://review.opendev.org/c/opendev/system-config/+/905510 Upgrading meetpad service to jammy
19:03:52 I think we're largely waiting on a second reviewer for this stack? tonyb testing was happy after we fixed the websockets issue?
19:05:02 Yup testing was definitely happier
19:05:52 cool
19:05:55 There are still certificate issues
19:05:57 #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades
19:06:04 ya the hsts stuff is annoying
19:06:09 but I think having real LE certs will help that
19:06:18 turns out there are ways to work around it like using incognito mode
19:06:40 however in this case I'm not sure if it will help since jitsi will send the headers for strict verification
19:07:33 I was using ... https://paste.opendev.org/show/blYpdCY39nZbSijUoV2l/ ... which seemed to "do better"
19:08:52 thats good to know
19:09:07 in the etherpad I linked above there are notes for wiki replacement
19:09:19 tonyb: did you have a chance to see the notes fungi and I added? any concerns with what we wrote?
19:11:05 I read them. I don't have any concerns
19:11:19 anything else on this topic?
19:11:37 quick summary is that we want to make sure the openid and patrolling extensions work, the others are likely less important for now
19:12:08 for the wiki if there is any confusion with the ongoing gerrit + openid problems
19:12:11 overall the plan lftm
19:12:13 er, lgtm
19:12:21 Yup. I'm fairly confident that as the first step will be to deploy the *same* git versions but containerised we'll be okay
19:12:32 ... upgrades will be more work
19:12:51 but really I want to decouple the OS and application
19:12:53 thanks!
19:13:32 yw
19:13:44 I think we're ready for #next_topic
19:13:53 #topic Python Container Updates
19:14:01 Nothing new here that I am aware of
19:14:27 I think this has largely been on the far end of the priority list due to all the other stuff going on. And thats ok. We've minimized the amount of images we have to build and support etc
19:14:45 #topic Upgrading Zuul's DB Server
19:14:56 I kept this on the agenda despite the general agreement with a rough plan last week
19:15:13 mostly because I seem to recall saying I would let people object for the next week
19:15:35 any objections to the plan of trying to run a mysql/mariadb for zuul that we can eventually cluster later if we get to it?
19:15:37 also worth noting, there were several outages of that trove instance, though zuul seems to have weathered them okay
19:16:16 if there are no objections now then I think we can move on and remove this topic from the agenda for next week
19:16:36 no objections here
19:16:37 the silent ayes have it
19:16:51 #topic AFS Quota Issues
19:17:07 I haven't made any progress on ubuntu ports trimming
19:19:45 #topic Broken Wheel Builds on CentOS
19:20:15 I think we have openafs packages that should work now
19:20:26 that's what it sounded like yesterday
19:20:35 In theory I would expect that means we've got more jobs passing for this now. However, I think the final publication stuff may still have the wrong volume names?
19:22:19 probably worth doing another pass on checking the job statuses
19:22:25 as in they're publishing with the wrong platform stub in the filenames?
19:22:27 as I suspect we may have pushed the failure forward to the next broken thing
19:22:48 fungi: ya the centos8 amd64 stuff fails because it was trying to publish to a volume openafs claimed didn't exist
19:22:50 that was something that changed in a recent ansible, if memory serves
19:23:01 I suspect we may have made a stream volume but we're still pushing to the non stream location which was cleaned up?
19:23:14 i think it broke for some rh-like platforms in our last ansible update
19:23:44 where the release var started including the minor number rather than just the major
19:23:49 There was a similar change for Debian a while back
19:24:00 ah ya that could be part of the problem
19:24:40 we switched which ansible var we use in some places, but probably missed some too
19:24:42 in any case we should find time to do another pass on job statuses and failure and take it from there
19:25:52 #topic OpenDev Pre PTG
19:26:00 Looking at trying to do two days Wednesday and Thursday sometime in February: February 7+8 or February 14+15 or February 21+22
19:26:06 Have two blocks of time each day one that works better for EU and another for APAC. Probably 14:00-16:00 UTC and 22:00-00:00 UTC.
19:26:23 I haven't heard any objections to any of these days or times
19:26:37 I'm kinda leaning towards the 14th and 15th as that gives time to prepare but isn't so far out in the future
19:27:10 I'm happy to hear feedback though. This is me mostly trying to accommodate what I perceive to be the issues with various timezones as well as my own meeting schedule (tuesdays are really busy)
19:27:18 i have schedule availability for all of them, but yeah maybe sooner is better than immediately before ptg week
19:27:39 14/15 slightly better for me
19:27:42 ptg is a full month later?
19:27:53 frickler: ptg is first week of april I think /me double checks
19:28:00 14+15 works best for me
19:28:04 April 8 - 12 is the PTG
19:28:29 oh, yeah, i guess any of them is more than a month before the ptg
19:28:38 anyway I'm fine with any of these dates
19:28:51 fungi: ya I mostly want to get a head start on some of this stuff
19:28:59 absolutely
19:29:17 as far as topics go I'd like to do a group brainstorm/planning/prioritization sort of thing for the various debt we've got hanging around
19:29:37 I could do the "EU" timeslot but I do worry I wouldn't be a great asset
19:30:10 think podman / modern docker compose, mariadb upgrades, keycloak id stuff, openmetal/inmotion cloud redeployment, prometheus, deprecated zuul configs (think stdout/stderr split in command tasks) and so on
19:30:42 I've got a set of notes in my notetaking file that I need to transplant into an etherpad and then others can also add ideas as well as indicate interest on topics so that we can do our best to accommodate split scheduling
19:30:58 also i'd generally throw a "sustainability" discussion item in there
19:31:03 ++
19:31:29 lets also say we'll do it February 14 and 15
19:32:05 wouldn't hurt to do a "big picture" overview of everything we're still managing and ask ourselves what else might be on the losing side of cost vs benefit
19:32:07 and then we can use the meetpad corresponding with the planning etherpad as the location
19:32:30 fungi: thats a great idea. I think that sort of big picture will help with the prioritization aspect of figuring out where to apply ourselves
19:33:13 ++
19:33:23 once other things settle down I'll send emails making this all official
19:33:30 and work on getting that agenda etherpad populated
19:33:39 thanks!
19:33:48 #topic Open Discussion
19:34:00 speaking of other things settling down the gerrit openid stuff has been fun
19:34:10 "fun"
19:34:27 maybe we should call the meeting early and get back to that? Is there anything else to bring up now?
19:34:38 inmotion failures?
19:35:00 oh ya I haven't had time to look at that
19:35:17 tonyb: maybe after today's school run we can dig into that together?
19:35:22 the mirror host seems to be offline since sunday
19:35:36 tonyb: would be a good introduction to how things are set up there because it, like the linaro cloud, is "different"
19:35:40 but nodepool seems to have been unhappy for much longer
19:36:03 clarkb: Sounds good to me
19:36:26 frickler: in the past we've definitely seen things leak in ways that create nodepool failures that slowly get worse over time
19:36:42 wouldn't surprise me if it finally got bad enough that the mirror had nowhere to run but we'll have to check logs
19:36:54 i did add the inmotion mirror to the emergency disable list, in order to get the base deploy job working again. it's been failing since sunday, which coincides with the errors ironic reported
19:37:01 Oh, I have a meeting 2100-2200UTC but apart from that ...
19:37:07 ack
19:37:14 tonyb: ya I think school pickup is 2200-2300 ish
19:37:36 Okay perfect
19:37:43 I have to walk because I'll be without the car
19:40:14 sounds like that may be it and we need to get back to debugging gerrit things
19:40:17 thank you everyone!
19:40:20 #endmeeting