19:00:12 #startmeeting infra
19:00:12 Meeting started Tue Apr 16 19:00:12 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:12 The meeting name has been set to 'infra'
19:00:17 #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/X5LH5DM5F4KOX5X2D2IGRGVJ5USFL3SZ/ Our Agenda
19:00:28 #topic Announcements
19:00:42 I wanted to call out this email fungi sent to the service-announce list
19:01:00 #link https://lists.opendev.org/archives/list/service-announce@lists.opendev.org/thread/4HUXROUE4ZRWLC6JFT5YF3GO6G3UULWL/
19:01:12 i had fun pulling those numbers
19:01:15 Basically if you use ecdsa keys and putty you'll want to check that out as your private key material is potentially determinable
19:01:47 anything else to announce?
19:02:38 not just any ecdsa, one specific set of nist parameters
19:02:44 but yes
19:03:22 * fungi has nothing else to announce
19:03:45 #topic Upgrading Servers
19:04:04 no new progress.
19:04:06 I don't think there is anything new to report here, but it is worth noting that OpenStack and StarlingX have made major releases and the PTG has concluded
19:04:19 we should be good to start making changes with minimized impact to development cycles
19:04:37 I'll get back into the swing of things this week
19:04:38 #topic MariaDB Upgrades
19:04:41 ack
19:04:48 Etherpad, Gitea, Gerrit, and Mailman could use upgrades.
19:04:56 #link https://review.opendev.org/c/opendev/system-config/+/915183 Upgrade mailman3 mariadb to 10.11
19:05:01 #link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.11
19:05:21 zuul runs on mariadb now and does not need upgrading :)
19:05:43 I believe that ansible and docker compose will automatically upgrade these db servers for us when we land these changes
19:05:47 i think the mailman one could go in without much concern. i'm happy to monitor it as it deploys
19:06:02 if we don't want the upgrades to be automated we can put nodes in the emergency file and do it manually
19:06:24 fungi: ya I agree. Maybe we should just go ahead with that one and do similar with etherpad once we've got a better idea of what the 2.0.x upgrade path looks like?
19:06:49 sure. we've already got one +2 on it, didn't know if anyone else wanted to look it over first
19:08:02 I guess any other reviewers can chime in during the meeting otherwise I think you're clear to proceed when ready
19:09:06 thanks!
19:09:10 #topic AFS Mirror Cleanups
19:09:35 Now that the PTG is over I need to start pulling up existing xenial configs in zuul to figure out a course of action for cleaning up xenial
19:09:55 I suspect with this one we're going to have to set a hard date and then just accept zuul errors because there are a lot of tendrils in everything
19:11:01 once I've got a general sense of what needs to be cleaned up I'll try to write appropriate mailing list emails for those affected and then we can set a date and aim for that
19:11:11 may need to merge some things bypassing errors too
19:11:48 #topic Building Ubuntu Noble Nodes
19:11:58 The other item related to cleaning up old ubuntu is adding new ubuntu
19:12:16 frickler has started some testing of this locally and discovered two problems. The first is glean needs some small updates to be python3.12 ready
19:12:25 #link https://review.opendev.org/c/opendev/glean/+/915907 Glean updates for python3.12 support
19:13:01 The other is that debootstrap in our debian bookworm based nodepool-builder images is not new enough to build noble. I think this is "normal" and we have had to pull newer debootstrap from testing/unstable to accommodate prior ubuntu releases
19:13:13 frickler is there a change to do that bump yet?
19:13:30 I haven't seen one if so. Adding one would probably be a good idea
19:13:40 no, I just created the missing symlink locally for me
19:14:07 then there's also some infra elements that need updating, I hope I can do a patch for that tomorrow
19:14:28 sounds good, thank you for looking into that
19:16:03 #topic review gerrit service troubles
19:16:48 I haven't seen any word from mnaser or guillhermesp on why the server shut down under us. At this point maybe we file this topic away and address it if we do get that info
19:17:17 However, on Sunday the service stopped responding again. This time the server itself was up and running but spinning its cpus and not getting any useful work done
19:17:46 I believe I tracked this down to a bad client and blocked them then restarted services and things have been happy since
19:18:08 We also upgraded Gerrit yesterday which brought in a bug fix for a potential dos vector
19:18:38 so that was just a single IP address?
19:18:42 frickler: yes
19:19:28 as a side note when we rebuilt the 3.8 image we also rebuilt the 3.9 image which brought in fixes for the issues I was concerned about upon upgrading. We can probably start upgrade planning and testing now
19:21:20 I don't think there is anything we need to do related to gerrit at this moment. I just wanted to get everyone up to date on the issues we have had and point out we're in a position to begin upgrade testing and planning
19:21:29 #topic Project Renaming
19:21:53 Before we upgrade Gerrit we have a project rename request. We pencilled in April 19th as renaming day since it falls after the PTG, and that happens to be this Friday.
19:22:30 Do we want to proceed with an April 19 renaming? If so we need to land https://review.opendev.org/c/opendev/system-config/+/911622 (or something like it) and prep the record keeping changes
19:22:42 Oh and we need to decide on a time to do that so we can send an announcement
19:23:35 I'm happy to shepherd that stuff along but don't want to be the only one around on Friday if we proceed
19:24:04 sounds good to me
19:24:19 i can be available whenever you are
19:24:45 fungi: ok in that case 9am pacific is good for me. I think that is 1600 UTC. Let's announce 1600-1700 UTC as the window?
19:25:21 and I'll dedicate a chunk of tomorrow to getting everything prepared well in advance
19:25:57 wfm
19:26:18 great I'll send an announcement later today
19:26:39 #topic Etherpad 2.0.x Upgrade
19:26:45 #link https://review.opendev.org/c/opendev/system-config/+/914119 WIP change for Etherpad 2.0.3
19:27:00 This change passes testing now which is a nice improvement.
19:27:25 The background on this is that Etherpad made a 2.0.0 release that largely didn't change anything user facing and had everything to do with how you install and deploy etherpad using pnpm now
19:28:01 This resulted in dockerfile updates but was reasonably straightforward. Then before the PTG could end 2.0.2 was released and they removed support for APIKEY.txt based authentication and moved everything to oauth2.0
19:28:26 so much for semver ;)
19:28:29 I filed a bug asking them to document how you can use the api like before and the result of that was new functionality in the oauth2 server to support client_credentials grants
19:28:55 The reason why that change above is a WIP is that this update (which does work for our purposes) is not in a release yet. I suspect that release will be 2.0.3 or 2.1.0
19:29:15 I also updated testing to cover the documented api tasks that we perform to ensure we can perform them via the new auth process
19:29:19 and I updated the docs
19:29:38 I do think this change is ready for review. I hope that when the release happens I can update the git checkouts as the only updates to the change and we can upgrade
19:29:54 thanks for solving that!
19:30:26 Given we don't know when that release will happen I think we can probably try to do the mariadb upgrade before we upgrade etherpad
19:30:46 I'll try to find time to watch that upgrade if no one has objections
19:30:49 wfm
19:31:59 it was a fun one. I had to rtfs to find the api endpoints because even after the docs updates details like that were not mentioned. Then spent time reading the oauth2.0 rfc to figure out the client_credentials request flow
19:32:14 it's actually fairly straightforward once you have that info, the hard part was discovering all the breadcrumbs myself
19:32:15 use the source, luke
19:32:49 I'll get a held node up soon that we can use to test normal functionality hasn't regressed as well
19:33:46 #topic Gitea 1.21.11 Upgrade
19:33:53 #link https://review.opendev.org/c/opendev/system-config/+/916004 Upgrade Gitea to 1.21.11
19:34:02 Gitea made a release overnight
19:34:11 there are bug fixes in there that we should probably consider upgrading for
19:34:35 The templates we override did not change so we don't have any template updates either
19:35:20 anything that might be related to the missing tags?
19:35:34 frickler: unfortunately I didn't see anything that looks related to that
19:36:41 #topic Open Discussion
19:36:43 Anything else?
19:37:27 as alluded to earlier, zuul-db01 is running mariadb now and zuul is using that as its db
19:37:43 corvus: and the host is out of the emergency file?
19:37:46 i've removed it from emergency, so we should consider it back in normal service
19:37:52 thanks!
19:38:05 the web ui works for me so this seems to be happy
19:38:14 corvus: we should probably plan cleanup of the trove db at some point?
19:38:18 i did leave the mysql 8 files on disk; maybe we'll delete them next weekend/week?
19:38:31 sounds good
19:38:35 yes, maybe do both mysql 8 and trove cleanups at the same time?
19:38:41 wfm
19:39:19 how about we action me on that, and i'll try to do it > fri and < tues?
19:39:28 i think i still need to clean up the old keycloak server too
19:40:14 #action corvus cleanup the zuul trove and mysql dbs
19:41:38 fungi: did you want an action on that too?
19:41:49 sure
19:41:55 /var/mariadb looks pretty full
19:42:05 #action fungi cleanup old keycloak server
19:42:42 speaking of cleanup, openinfra foundation staff are looking at moving the openstack.org dns hosting from rackspace's dns service to cloudflare where they have the openinfra.dev domain hosted. we don't rely on it for much any more, but probably worth talking through. i was at least going to see about deleting any old records of ours we no longer need so we have a better idea of what's
19:42:43 just looking at cacti
19:42:44 still in there
19:43:04 frickler: agreed. The volume is mounted via lvm though so in theory we can add another and grow it. Or add a bigger one, grow, then remove the old smaller one
19:43:24 fungi: ++ deleting old records is a great idea
19:43:27 well, removing the mysql 8 data will give half the used space back
19:43:32 if we're talking about zuul-db01
19:43:33 corvus: oh I see ++
19:43:59 we can also pvmove extents from a smaller pv to a larger one in the same vg, if it becomes necessary
19:44:07 fungi: I also mentioned to them that ianw wrote a tool to dump the rax zones in zonefile format which I offered to provide them to simplify the move
19:45:18 Probably also worth noting that meetpad seems to have done well during the PTG. There were some reports of mic problems that apparently don't happen in other tools and some indicated they couldn't connect. But on the whole it worked well and performance was reasonable from what I could see
19:46:29 some users also found the built-in noise cancellation option helpful to turn on, if their own mics didn't do a good enough job of it
19:46:37 yes, not sure still what happened in the openeuler session
19:46:55 last call on any other items. Otherwise I'll end the meeting about 12 minutes early.
19:47:06 frickler: diablo_rojo was going to check in with them to get details
19:47:54 the edge wg session(s) used the recording feature, which seems to have worked out
19:48:19 not sure if any other tracks recorded anything
19:48:29 I haven't seen any if so
19:49:25 sounds like that may be all. Thank you everyone! We'll be back next week at the same time and location.
19:49:29 #endmeeting
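
For reference on the Etherpad authentication change discussed above, a minimal sketch of an oauth2 client_credentials exchange followed by an authenticated API call, as one might script it in Python with requests. The host, client id/secret, token endpoint path, and API path below are illustrative assumptions rather than the exact values Etherpad 2.0.x or our deployment uses; check the updated system-config docs and the Etherpad source for the real ones.

    # Sketch of RFC 6749 section 4.4 (client_credentials) against a hypothetical
    # Etherpad deployment; URLs and credentials here are placeholders.
    import requests

    ETHERPAD = "https://etherpad.example.org"   # hypothetical host
    CLIENT_ID = "example-client"                # hypothetical client id
    CLIENT_SECRET = "example-secret"            # hypothetical client secret

    # Exchange the client credentials for a bearer token.
    token_resp = requests.post(
        ETHERPAD + "/oidc/token",               # assumed token endpoint path
        data={"grant_type": "client_credentials"},
        auth=(CLIENT_ID, CLIENT_SECRET),
        timeout=10,
    )
    token_resp.raise_for_status()
    access_token = token_resp.json()["access_token"]

    # Call the HTTP API with the bearer token where APIKEY.txt was used before.
    api_resp = requests.get(
        ETHERPAD + "/api/1/listAllPads",        # assumed API method/path
        headers={"Authorization": "Bearer " + access_token},
        timeout=10,
    )
    api_resp.raise_for_status()
    print(api_resp.json())

The token exchange itself is a single basic-auth POST, which matches the "fairly straightforward once you have that info" summary above; the effort was in discovering the endpoints.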