Tuesday, 2023-11-21

clarkbjust about meeting time18:59
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Nov 21 19:00:52 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
fungiindeed19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/WBYBD2663WL2IJD7NLDHBQ5ANRNRSMX3/ Our Agenda19:00
clarkb#topic Announcements19:01
clarkbIt is Thanksgiving week in the US. I saw the TC meeting was cancelled today as a result. I will be less and less around as the week progresses. Have to start on food prep tomorrow19:01
clarkbbasically heads up that it may get quiet but I'll probably check my matrix connection at times19:01
clarkb#topic Server Upgrades19:02
clarkbtonyb has made progress on this and replaced the ord mirror. The new jammy mirror is in use19:02
tonyb\o/19:02
clarkb#link https://review.opendev.org/c/opendev/system-config/+/901504 Helper tool for mirror node volume management19:03
fungiawesome job19:03
tonybI created mirror02.bhs1 today, and tested ^^^19:03
clarkbone thing that came out of that is the mirror nodes have volumes that are set up differently than all our other hosts so the existing tools can't be used19:03
clarkbto avoid manual effort which results in errors and deltas tonyb volunteered to write a tool to simplify things.19:03
clarkbI need to rereview it19:03
clarkbtonyb: other than reviewing changes and answering questions you have is there anything the rest of us can be doing to help?19:04
tonybNope I'm working through things.19:04
tonybif anything comes up I'll yell19:04
clarkbsounds good and thank you for the help!19:05
clarkb#topic Python Container Updates19:05
clarkbNo update on getting zuul-operator off of old debian. But uwsgi builds against python3.12 now so we can add python3.12 images if we want19:05
clarkb#link https://review.opendev.org/c/opendev/system-config/+/898756 And parent add python3.12 images19:06
clarkbI don't expect we'll be making use of those quickly, but I do like getting the base images ready so that we aren't preventing anyone from testing with them19:06
tonyb++19:06
clarkbThey should be straightforward reviews. The parent is a bookkeeping noop and the child only adds new images that you have to explicitly opt into using19:06
clarkb#topic Gitea 1.21.019:08
clarkbI worked through the changelog and have the gitea test job running with screenshots that look correct now19:09
clarkbHowever, it seems there is rough consensus that we'd like to rotate our ssh keys out in gitea before we upgrade to avoid needing to disable ssh key length checking19:09
clarkb#link https://review.opendev.org/c/opendev/system-config/+/901082 Support gitea key rotation19:09
clarkbThis change should allow us to do that entirely through configuration management. (the existing config management doesn't quite do what we need for rotating keys)19:09
clarkbAs written it should noop. Then we can create a new key, add it to gitea, then also update gerrit config management to deploy the key there and select it19:10
clarkbthe gerrit side is not yet implemented as I was hoping for feedback on 901082 first19:10
clarkbOh and I think we should use an ed25519 key because they have a single length which hopefully avoids gitea changing minimum lengths in the future on us19:11
tonybSounds good to me.19:11
fungii'm fine with it19:12
clarkbIf you are interested in seeing what changes with gitea other than the ssh key stuff the change is ready for review19:12
clarkb#link https://review.opendev.org/c/opendev/system-config/+/897679 Upgrade to 1.21.019:12
clarkbThere are other things that change but none of them in a very impactful way19:12
clarkb#topic Gerrit 3.8 Upgrade19:13
clarkbThis is done!19:13
tonyb\o/19:14
clarkbIt went really well as far as I can tell19:14
clarkbThe one issue we've seen is that html/js resources seem to be cached on the old version affecting the web ui file editor19:14
clarkbIf you hard refresh or delete your caches this seems to fix it19:14
clarkbI've gone ahead and started on the gerrit container image cleanup for 3.7 and updates for 3.919:14
clarkb#link https://review.opendev.org/c/opendev/system-config/+/901469 Updates our gerrit image builds post upgrade19:14
clarkbI figure we can merge those first thing next week if we still don't have a reason to rollback to 3.719:15
tonybIs it worth sending an email to service-announce (and pointing other projects at it) explaining the html/js issue19:15
clarkbtonyb: ya I can do that as a response to the upgrade announcement19:15
tonybOkay, I wasn't volunteering you ;P19:16
clarkbtonyb: no it's fine, then I don't have to moderate it through the list :)19:16
tonyb:)19:16
fungii get the impression not many people use the built-in change editor, and some of them will end up never seeing the problem because of their browser pulling the new version before they do19:16
clarkbI also sent email to upstream about it and at least one person indicated they had seen the issue before as well but weren't sure of any changes that avoid it19:16
clarkbIn related good news the gerrit 3.9 upgrade looks similar to the 3.8 upgrade. Minimal downtime to run init and then online reindexing19:18
clarkbI haven't gone through the change list though so there may be annoying things we have to deal with pre upgrade19:18
clarkbAnyway if you agree about removing 3.7 early next week maybe review the change and indicate that in review or something19:19
clarkb#topic Upgrading Zuul's MySQL DB Server19:19
clarkbIn addition to upgrading gerrit last friday we also did a big zuul db migration to accommodate buildsets with multiple refs19:19
clarkbin that migration we discovered that the older mysql there didn't support modern sql syntax for renaming foreign key constraints19:20
clarkbThis has since been fixed in the zuul migration, but to avoid similar problems in the future it is probably a good idea for us to look into running a more modern mysql/maria db for zuul19:20
clarkbI don't think we're going to create a plan for that here in this meeting but wanted to bring it up so that we can call out concerns or items to think about. I have 2. The first is where do we run it? Should it be on a dedicated host or just on say zuul01? I think disk size and memory needs will determine that. And are we currently backing up the db? If not should we before we19:21
clarkbmove it?19:21
clarkbI suspect that the size of the database may make it somewhat impactful to run it alongside of the existing schedulers and we'll need a new host dedicated to the database instead. That's fine but a small departure from how we run mariadb next to our other services19:22
fungii don't see a mysqldump in root's crontab on either of the schedulers, for reference19:22
tonybIt'd be a departure from how we typically run the DB, but consistent with how we're running it for zuul today right?19:24
clarkbtonyb: correct.19:24
clarkbtonyb: basically all of the self hosted non trove dbs currently are run out of the same docker compose for $service on the same host19:24
clarkbbut that is because all of those dbs are small enough or servers are large enough that the impact is minimal19:25
clarkbI suspect that won't be the case here19:25
tonybYup that makes sense19:25
fungiwell, first off, we're running zuul with no spof other than haproxy and that trove instance at the moment. would we want a db cluster?19:25
clarkbmaybe the thing to do is collect info in an etherpad (current db version, current db size needs for disk and memory, backups and backup sizes if any) and then use that to build a plan off of19:25
clarkbso I'm not sure how zuul would handle that19:26
clarkbfor example is it galera safe?19:26
fungiall questions we ought to ask19:26
clarkbunlike zookeeper which automatically fails over and handles clustering out of the box, with db servers it's a lot more hands on and has impacts on the sorts of queries and inserts you can do for example19:27
fungiin the short term though, should we schedule some downtime to reboot the current trove instance onto a newer version (if available)?19:27
clarkbI think it depends on how much newer we can get? If it is still fairly ancient then probably not worthwhile but if it is modern then it may be worth doing19:28
clarkbbut ya this is the sort of info gathering we need before we can make any reasonable decisions19:28
tonybYup.19:29
clarkbhttps://etherpad.opendev.org/p/opendev-zuul-mysql-upgrade <- 19:29
clarkblet's collect questions and answers there19:29
fungithe "upgrade instance" option is greyed out in the rackspace webui for that db, just checked. not sure if that means 5.7 is the latest they have, or what19:29
tonybWell that's a start.19:30
fungiif i create a new instance they have mysql 8.0 or percona 8.0 or mariadb 10.4 as options19:30
fungiso anyway, in-place upgrading seems to be unavailable for it19:31
fungino idea if those versions are also ~ancient19:31
tonybSo we could stick with trove and dump|restore19:32
clarkb10.4 is like old old stable but still supported for a bit iirc19:32
clarkbit's what a lot of our stuff runs on and I haven't prioritized upgrades yet because it isn't EOL for another year or two iirc19:32
clarkbI've got a list of questions in that etherpad now19:32
tonyb10.4.32 was released last week19:32
clarkbI think collect what we can on that etherpad then loop corvus in and make an informed decision19:34
corvusoh hi, today has been busy for me, sorry just catching up19:35
clarkbcorvus: I don't think it is urgent. Just trying to get a handle on what an updated non trove zuul db looks like19:35
corvusi think i'd add that we have generally been okay with losing the entire build db, thus the current decisions around deployment19:35
corvusand lack of backups etc19:35
corvuswe could decide to change that, but that's definitely a first-order input into requirements :)19:36
corvusif we wanted to remove the spof, we could do what the zuul operator does and run percona xtradb19:36
corvusbut none of us knows how to run it other than just using the pxc operator, so that's a k8s.19:36
corvusif we run our own mysql spof, then i think it should be on a separate host since we now treat the schedulers as disposable19:37
fungithose all sound like reasonable constraints19:39
corvusmaybe worth doing a survey of db clustering solutions that are reasonably low effort19:41
clarkb++19:41
corvusi feel like this is not important enough for us to sink huge amounts of ops time into running a zero-downtime cluster and risk more downtime by not doing it well enough.19:42
fungiand that aren't kv stores, presumably. we need an actual rdbms right?19:42
corvusso if it's hard/risky, i would lean toward just run a tidy mariadb on a dedicated node.19:42
clarkbcorvus: ++19:42
corvusbut if it's reasonably easy (like it is in k8s with pxc-operator), then maybe worth it19:42
clarkbI think it may be a really interesting learning experience if people are into that but also based on people's struggles with openstack stuff it seems running db clusters isn't always straightforward19:43
corvusfungi: yes, mysql/mariadb or postgres specifically.  no others.19:43
fungi#nopostgres19:43
corvuswe should probably not exclude pgsql from our consideration, even though we're generally mysql biased so far.19:43
fungiah, okay19:44
corvusfungi: was that a veto of postgres, or tongue in cheek?19:44
fungiit was an interpretation of your followup to my question about rdbms19:44
fungibut then you clarified19:44
clarkbok I've tried to collect what we've said so far in that etherpad19:45
corvusoh that was a deep cut.  i get it.  :)19:45
* fungi has no preference, just remember the postgres wars in openstack19:45
clarkbI think the next step(s) is/are to fill in the rest of the answers to those questions and get some basic info on clustering options19:45
corvusanyway, the ship has sailed on zuul supporting both.  both are first-class citizens and will continue to be afaict, even though supporting two is O(n^2) effort.19:45
clarkbI'm definitely not going to get to that this week :) I can already start to feel the pull of cooking happening19:45
corvusso either is fine, if, say, postgres clustering is easy.19:46
fungiwfm19:46
corvusyep. these are good notes to take and will help me remember this after recovering from pumpkin pie.19:46
clarkbmaybe resync at next week's meeting and find specific volunteers for remaining info gathering post holiday19:46
corvus++19:47
fungisounds good19:47
clarkb#topic Open Discussion19:47
clarkbI think if I could get one more thing done this week it would be to land the python3.12 image updates since that is low impact. But otherwise I'm happy to wait on the gitea ssh stuff and gerrit image cleanup/additions19:48
clarkbI'm definitely going to start being around less regularly. Apparently we have to roast turkey stuff tomorrow because we're not cooking it whole and need gravy makings19:48
clarkbbut also before that happens the turkey needs to be "deconstructed"19:49
clarkbsuch a kind way of saying "butchered"19:49
fungiopeninfra foundation board individual member representative nominations are open until december 1519:49
fungi#link https://lists.openinfra.dev/archives/list/foundation@lists.openinfra.dev/thread/YJIQL444JMKFRSHUBYDWUQHBF7P7UDJF/ 2024 Open Infrastructure Foundation Individual Director nominations are open19:50
clarkbOne thing on my todo list for after thanksgiving is to start on Foundation Annual Report content for OpenDev (and Zuul)19:51
clarkbI plan to stick that into etherpads like I've done before so that others can provide feedback easily19:51
clarkbIf there is something specific you're proud of or really want to see covered feel free to let me know19:51
clarkbLast call for anything else? otherwise we can go eat $meal a bit early19:52
tonybnothing from me19:53
clarkbSounds like that is everything. Thank you everyone for your time and I hope you get to enjoy Thanksgiving if you are celebrating19:53
clarkb#endmeeting19:53
opendevmeetMeeting ended Tue Nov 21 19:53:33 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:53
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2023/infra.2023-11-21-19.00.html19:53
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-11-21-19.00.txt19:53
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2023/infra.2023-11-21-19.00.log.html19:53
fungithanks clarkb!19:53
