Tuesday, 2024-03-12

clarkbJust about meeting time18:58
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Mar 12 19:00:17 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/IIJAA3YB34I5JUJLM7SXXRGGQYL2JXGI/ Our Agenda19:00
clarkb#topic Announcements19:00
clarkbnorth america did its DST switch. I think europe and australia have changes coming up in the near future too. Just keep that in mind as you manage your calendars :)19:01
* fungi shakes his fist at the hour the universe has borrowed from him19:01
* corvus puts food in face19:01
clarkbAlso I'm going to be fishing thursday and I think my back is demanding I finally go find a new office chair sooner than later so that will probably happen tomorrow. Tl;dr I'll be in and out this week19:02
fungii'll be starting a little later than usual on thursday (routine medical checkup scheduled)19:02
fungibut expect to be around the rest of the day19:03
clarkbThe OpenStack TC election is happening right now. Go vote if you can19:03
frickleralso openstack rc1 releases are due this week19:04
clarkb#topic Server Upgrades19:05
clarkbHaven't seen any movement on this. We'll discuss more later but the rackspace clouds.yaml updates should be in place now. Please say something if you run into trouble related to that when booting new servers19:05
clarkbthe dns updates may not work, which for 99% of things is probably fine19:06
clarkbit's not too bad to add reverse records by hand when we need them and forward records are mostly in opendev.org now19:06
fungii didn't see any issue with dns when i tested the launch-node script with an api key only cloud definition19:06
fungieasy enough to rerun that test and check for it explicitly though19:07
clarkbfungi: dns uses its own auth credentials which I think are the old ones19:08
clarkbso they will stop working once MFA is enabled19:08
fungioh, not clouds.yaml then19:08
clarkbcorrect19:08
clarkbthat's what the files we source are for when we do the dns stuff19:08
clarkbone sources creds, the other the virtualenv for the tool19:08
clarkbanyway we'll talk more about rax mfa in a bit19:09
clarkb#topic MariaDB Upgrades19:09
clarkbI've got two changes up to do some more upgrades. One will do refstack and the other etherpad. They both need reviews. I'm happy to approve and babysit though19:10
clarkb#link https://review.opendev.org/c/opendev/system-config/+/910999 Upgrade refstack mariadb to 10.1119:10
clarkb#link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.1119:10
clarkbbased on paste's upgrade these should be straightforward but always worth monitoring database changes19:10
clarkblet me know if you have any questions or concerns19:11
clarkb#topic AFS Mirror cleanups19:11
clarkbWe're in the "idle" period between active mirror/nodepool cleanup where we try and reduce the impact across our repos19:11
clarkbOn the infrastructure side of things I think we are now ready to remove the centos-7 base-jobs nodeset and the images from nodepool but I announced that would happen on friday19:12
clarkbI don't have changes up for that yet but will get them up before Friday in order to make it easy to land them19:12
clarkbin the meantime topic:drop-centos-7 is worth keeping an eye on in case there are further cleanups19:13
fricklerubuntu-ports mirror was broken likely as a result of running into quota limits19:13
fricklershould be fixed now but implies we'd better try to avoid this repeating if possible19:13
clarkb++ I'm somewhat surprised that reprepro can't resolve that on its own though19:13
fricklermaybe I just didn't do the right things19:14
clarkband if you didn't I don't blame you. reprepro docs are a bit incomprehensible :)19:14
clarkbI also started pushing some changes for xenial removal under topic:drop-ubuntu-xenial19:15
clarkbI don't think we are in a hurry there as there will be plenty to untangle for xenial. But reviews are always welcome19:15
clarkbslow but steady progress including in the projects dropping old configs. That is nice to see19:16
clarkb#topic Rebuilding Gerrit Container Images19:17
fricklerthe centos7 removal from devstack is still blocked19:17
clarkb#undo19:17
opendevmeetRemoving item from minutes: #topic Rebuilding Gerrit Container Images19:17
clarkbfrickler: that is due to jobs in other projects like keystone?19:17
frickleryes, see https://review.opendev.org/c/openstack/devstack/+/91098619:17
clarkbthanks. I don't see a change or changes yet to remove it from keystone19:18
clarkbfungi: was that something you were planning on pushing?19:18
fungiyeah, i need to pick that back up now19:18
fricklerkeystone wasn't mentioned in the list of affected projects so far, either19:19
fricklerbut since this is also on master, it might be a good idea to move the removal until after the openstack release if this can't get fixed in time19:19
fricklersays /me with release hat on19:19
clarkbfrickler: wouldn't it be just as easy to disable that job/remove that job as landing any fixes to keystone that might be necessary for the release?19:20
clarkbI guess I don't see this as a hard blocker19:20
clarkbbecause either way you're talking about $something that needs updating in keystone and if we can do that we can drop the job19:20
clarkbbut we can check in on friday and make a call if it seems particularly painful19:21
fricklerwe don't know how many other projects might also be affected that we don't know about yet19:21
fricklerand we will discover only by iterating over merging fixes19:21
fungiwe can add removal changes for stable/2024.1 before or after the release19:21
clarkbya I think my main push back is that the only way that should affect the release is if you need to update the job config for one of those projects anyway19:21
clarkbin which case having a job that needs to be deleted is equivalent to whatever else is blocking you19:22
fungiwell, or if zuul configuration is broken and the release jobs for the project's tag don't run19:22
fungithough i think those jobs are all defined in other repos19:22
clarkbI thought zuul will continue to run the jobs in that case yes19:23
clarkbI don't know that for certain though19:23
fungiproject-config and openstack-zuul-jobs mainly19:23
clarkbpart of my concern is that we can't avoid merging these until every project is fixed because we know not all will be fixed ahead of time19:23
frickleryes, that should be fine, I was more thinking about possible last minute fixes being needed19:23
fungii thought zuul wouldn't run jobs for a project if it couldn't load its configuration, but specifically for things defined via in-project configuration which hopefully the release jobs aren't19:23
clarkbfungi: and you can have errors in your config that are only a problem when you try to modify the config19:24
clarkbotherwise it will use cached configs19:24
clarkb(which isn't something to rely on, but config errors aren't usually a hard stop)19:25
clarkbanyway we can take stock in a few days and make a call then19:25
clarkb#topic Rebuilding Gerrit Container Images19:25
clarkbGerrit finally released a new version of 3.9 to update mina ssh for that mitm thing19:26
clarkb#link https://review.opendev.org/c/opendev/system-config/+/912470 Update our 3.9 image to 3.9.219:26
clarkbI try to keep our images up to date so that we're testing what we'll actually be upgrading with. However, merging this change will produce new 3.8 images too so we should try and restart gerrit even though it is running the older image19:26
clarkbhistorically we've upgraded gerrit around april/may and november/december19:27
clarkbwould be great to get this up to date then try and work towards a gerrit upgrade in the next month or two19:27
fricklershould we try to combine with project renames?19:27
clarkbmy preference is that we don19:27
clarkb*don't19:27
fricklerok19:28
fungiyes, it can result in a bigger mess or additional delays if something needs to be rolled back19:28
clarkbI think project renames are hacky enough that combining it with an upgrade is more risk than necessary. Also both should be relatively quick so we won't have massive downtimes19:28
fungii agree, two brief outages a week or two apart is preferable to one slightly longer outage which has an increased risk of something going wrong19:29
clarkb#topic Project Renames19:29
clarkbThat's a good jump to another one of today's topics. A reminder we've pencilled in April 19. If you know of people who may want to rename projects remind them to get that info pushed up19:29
clarkb#link https://review.opendev.org/c/opendev/system-config/+/911622 Move gerrit replication queue aside during project renames.19:30
clarkbI also wrote this change to add the workaround for thousands of errors on startup, which we've been using when manually restarting gerrit, to the playbook that automates it19:30
clarkbDon't think there is much else to say and I expect that is the primary prep we need before we get there (we actually test renames in our system-config-run-review jobs so should be good)19:31
clarkb#topic Rackspace MFA Requirement19:31
clarkbAs noted earlier all of our clouds.yaml files should be updated now to use the api key auth method and the rackspaceauth plugin19:32
clarkbthis should make launch node, openstackclient, and nodepool happy19:32
clarkbone major exception is the dns client and dns updates for dns hosted by rax19:32
clarkband then there is some question about swift hosted job logs but we're like 95% certain that is dedicated swift accounts which shouldn't be affected19:32
clarkbThe enforced deadline for the change is March 2619:33
clarkbwhich means we can either wait until then or since we think we're prepared opt into MFA now and see what breaks19:33
fungisee also https://review.opendev.org/912632 per our earlier discussion, dns should be fine but that change ought to help avoid future divergence19:33
clarkboh interesting it was already using the api key19:34
fungibasically, the dns api module was already set up using api keys, so it transitioned to the new approach long ago19:34
clarkbthere are three accounts that we will need to manage MFA for. I think we'll just do totp like we've done for other accounts19:35
clarkbfungi: actually I don't think that change (912632) is safe ? dns is managed via the third account19:36
clarkbwhich is different than the control plane account19:36
fungidns is, rdns is not19:36
clarkboh this is rdns specific got it19:36
fungii did not touch dns just rdns19:36
clarkbgot it19:36
fungias mentioned in the commit message19:36
fricklermaybe try to switch to mfa early next week? then we'd have some days left in case it doesn't work as expected19:37
clarkbI suspect that updating the nodepool account is going to have the biggest impact if something goes wrong19:37
clarkbfrickler: ya so maybe we start with our control plane account, test launch node stuff again, then if that works do the other two accounts19:37
clarkband next week works for me after the way this week filled up19:37
clarkbthough I won't stop anyone from getting to it sooner19:38
fungiagreed19:38
clarkbcool /me scribbles a note to try and find time for that next week19:38
clarkblet me know if you want to help I'll happily defer :)19:38
fungia lot of my week is consumed by openstack release<->foundation marketing coordination19:38
funginext week should be better though19:38
clarkbsounds like a plan19:39
clarkb#topic PTG Planning19:39
clarkb#link https://ptg.opendev.org/ptg.html19:39
clarkbwe are now on the schedule there19:39
clarkband I think we'll use the default etherpad that was "created" for us19:39
clarkb#link https://etherpad.opendev.org/p/apr2024-ptg-opendev Feel free to add agenda content here.19:40
fungiwednesday/thursday19:40
clarkbyup I used the times we talked about in here last week.19:40
clarkbI need to start adding some stuff to the agenda but everyone should feel welcome to. My intent is to make this more of an external reaching set of time but we can use it for our own stuff as well19:41
clarkbfor example ubuntu noble nodepool/dib/mirror stuff would be good to discuss19:41
clarkb#topic Open Discussion19:42
clarkbfungi: I meant to followup on git-review things but then we got distracted19:42
clarkbfungi: have we approved the changes we're already happy with?19:42
fungifrickler still wanted to take a look, it sounded like19:43
fungimainly i was looking for feedback on what already merged changes warranted adding release notes19:43
clarkbgot it19:43
frickleroh, I missed that, sorry19:43
clarkbhttps://review.opendev.org/q/project:opendev/git-review+status:open is the list of open changes19:43
clarkbmy suggestion would be that we go ahead and make a release (with release notes if necessary) for the stuff that has already landed19:43
clarkbthen we can followup and do a second release for those changes if we want19:44
fungii'll put together an omnubus reno change with the additions requested for the already merged changes since the last tag19:44
fungier, omnibus19:44
clarkbbut I think trying to do everything will just lead to more delays19:44
corvusfungi: sounds ominous19:44
fungiomnomnominous19:44
clarkbAlso I mentioned this in #opendev yesterday but got nerd sniped by eBPF and bcc as profiling tools that may be useful particularly in ci jobs19:45
fungibut yeah, the rackspace mfa notification sort of derailed my finishing up the git-review release prep19:45
clarkbI think the tools are neat and the way they work is particularly interesting to me because I don't have to care too much about specific test job workloads to profile them reasonably well. You can just do it through the lens of the kernel19:46
clarkbthat said it's not all perfect and they seem to be somewhat neglected on debuntu compared to rpm distributions19:46
clarkbthe runqslower command doesn't work on ubuntu for example and the python ustat command crashes19:47
clarkbmostly mentioning them because you may find them useful as debugging aids19:47
clarkbAnything else?19:48
clarkbI'll take that as a no. Thank you for your time and help everyone. See you around and we'll be back here same time and place next week.19:51
clarkb#endmeeting19:51
opendevmeetMeeting ended Tue Mar 12 19:51:06 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:51
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-12-19.00.html19:51
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-12-19.00.txt19:51
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-12-19.00.log.html19:51
fungithanks clarkb!19:52
