Tuesday, 2024-03-05

clarkbJust about meeting time18:59
clarkbI'm going to try and manage this while eating a bowl of rice18:59
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Mar  5 19:00:20 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/UG2JFEL6XFFLDT5UYDHCBYNAJF72XXHZ/ Our Agenda19:00
clarkb#topic Announcements19:00
fungihold bowl with one hand, chopsticks with second hand, type with third hand19:00
clarkbsmall note that I'll be AFK through a good chunk of tomorrow. Taking advantage of a morning matinee and kids being in school to see Dune19:01
clarkb#topic Server Upgrades19:02
clarkbI haven't seen any new movement on this19:02
tonybnope.  I'll address the review feedback and boot the new servers today19:02
clarkbWorth calling out that the announced rackspace mfa switch may impact our ability to run launch node. I've got notes to discuss that further at the tail end of the meeting19:03
clarkbtonyb: ah if you boot today you should be fine19:03
clarkb#topic MariaDB Upgrades19:03
clarkbThe paste db upgrade went as expected. It seems to have only touched system tables, and it backed up those tables first. The backup is less than 1MB, so it is reasonable to keep having the process do that backup19:03
clarkb#link https://review.opendev.org/c/opendev/system-config/+/910999 Upgrade refstack mariadb to 10.1119:04
clarkb#link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.1119:04
clarkbI went ahead and pushed these two changes to upgrade refstack and etherpad's backing databases. I did have to make a small change to etherpad's test cases because the log output from 10.11 was updated to say mariadb is ready instead of mysql is ready19:05
clarkbreviews welcome as well as any feedback on whether we're comfortable with docker-compose kicking the upgrade off automatically or if we'd prefer manual intervention for up to the minute backups19:06
clarkbafter these two gerrit, gitea, and mailman 3 are the remaining dbs that need upgrades. I'll try to continue to step through them19:07
clarkb#topic AFS Mirror cleanups19:07
clarkbOpenSUSE Leap and Debian Buster have been removed from afs mirroring as well as nodepool19:08
clarkbNext up is CentOS 7 which we've got some stuff in progress for under topic:drop-centos-719:08
clarkbI did realize that CentOS 7 had/has far more reach than the other two so decided to announce a removal date for March 15 in order to minimize impact to the openstack release process19:09
clarkbthe impact should still be minimal but there were enough places that centos 7 was still showing up that I didn't want to just blaze ahead like I did with the others19:09
clarkbwe're currently cleaning up project configs then late this week or early next week I'll drop zuul-jobs testing of centos 7 and remove wheel caching for centos 719:10
fungithe custom nodeset definition in devstack is nearly done merging backports across 8 active branches19:10
clarkbthen we can do the actual nodeset and nodepool removal on the 15th and once that is done clean up afs19:10
clarkb#link https://review.opendev.org/c/opendev/system-config/+/906013 Improve DKMS for CentOS OpenAFS testing/packaging19:10
fungithough i expect whichever tries merging last to fail with errors we can then use to see what old branches of other projects are using the devstack nodeset19:10
clarkbthis change isn't directly related to the cleanup but involves centos and afs and I think will make it easier to understand failures with dkms on the platform19:11
clarkbfungi: ya19:11
clarkbfungi: keystone for example19:11
clarkbslow but steady progress. And we've already freed up like 400GB of openafs consumption19:12
fungithis cleanup effort is likely to be pretty vast, since copies of bits like custom nodesets and jobs are declared across many, many branches and only the last removal will actually tell you what was using it19:12
clarkbI will note that last friday when I tried to clean up the buster mirror content afs01.dfw.openstack.org lost a "disk" and everything went sideways19:12
clarkbit isn't clear to me if this was due to deleting a few hundred gigabytes of data or just coincidence19:12
clarkbsomething we should be aware of when making other large changes to openafs. Rax addressed it quickly at least19:13
fungiyeah, and other than retrying some vos releases there wasn't any lasting impact19:13
clarkbfungi: yes, I mentioned it elsewhere but we really need openstack to clean up old stuff early in the branching process instead of at eol time19:13
clarkbbecause we're ending up with ancient configs that make no sense in modern openstack that continue to be carried forward release after release increasing the cleanup time/cost19:14
fungii think people define shared resources in branched repos without considering how zuul uses them19:14
fungiand not realizing that even if you delete something out of your master branch, other projects will just keep using it from a branch 5 releases ago19:15
clarkbAnother issue that I ran into was that openafs doesn't load on debian bookworm.19:15
clarkb#link https://gerrit.openafs.org/#/c/15668/ Fix for openafs on arm with newer gcc19:15
clarkb* doesn't load on debian bookworm arm6419:15
clarkbOnce upstream merges this fix for it I'll submit a bug to debian to see if we can get that fixed (it doesn't work at all so should be a good candidate for a fixup)19:15
clarkbOnce we've chipped enough of this old stuff out we can add in new things :)19:16
clarkbif anyone wants to get a headstart on that a new dib job to start building 24.04 might be helpful19:16
clarkb*start building Ubuntu 24.04 to avoid any confusion on what I was referring to19:17
frickleris it coincidence that openafs is using gerrit and we are using openafs? /me just notices this19:17
clarkbfrickler: yes I think it is19:17
clarkb#topic OpenDev Email Hosting19:19
clarkbDon't think we have anything new to mention on this. But kept it on the agenda in case we had any stronger opinions19:19
* clarkb will give everyone a couple minutes to chime in if so. Otherwise we can continue on19:19
fricklerI'd be fine with dropping it from the agenda and reviving once we consider it to be more urgent again19:20
clarkbwfm I can do that19:20
clarkb#topic Project Renames19:21
clarkbThis is mostly a reminder that we're planning to do renames after the openstack release on April 1919:21
clarkbwe can adjust this timing as necessary19:21
clarkbso please say something if that timing is especially bad for some reason19:22
fricklerthe release should happen earlier, the date is after the PTG19:22
clarkbcorrect. It's basically release then ptg then the 19th19:23
clarkbwe didn't want to conflict with the ptg or the release so we're doing it late19:23
clarkbwhich is a good lead into our next tope19:23
clarkb*topic19:23
clarkb#topic PTG Planning19:24
clarkb#link https://ptg.opendev.org/ptg.html19:24
clarkbI was hoping this schedule would be a bit more filled in before picking times but it is very empty19:24
clarkbrather than wait for others to fill in I think we can go ahead and grab some time.19:24
clarkbSomething like Wednesday 0400-0600 and Thursday 1400-1600UTC. Gives enough time between blocks to catch up on sleep.19:24
clarkbmonday and tuesday tend to be busy so I'm trying to accommodate that19:25
frickler+119:26
tonybWorks for me.  I admit I'll only be attending APAC friendly meetings 19:26
fungiever since the ptg organizers stopped trying to pre-schedule times for all registered teams, many teams tend to wait until the last week to book any slots19:26
tonybI was thinking of dropping into the openeuler session19:26
clarkbtonyb: that doesn't conflict with the times I proposed does it?19:27
clarkbno it is on friday so we're good there19:27
fricklerI wasn't even aware that the scheduling is already happening. seems it is only announced to PTLs/session leaders?19:27
tonybI don't think so.  the one I saw was Friday 19:28
clarkbfrickler: yes emails did go out to the session leaders. Not sure if emails went out more broadly.19:28
clarkbI can make a note that we may need to communicate this more widely19:28
tonybI think it only goes to session leaders19:28
clarkbanyway I'll get us signed up for those two blocks later today19:31
clarkb#topic Rax MFA Requirement19:31
fungisounds good19:31
clarkbfungi received email today announcing that rax will require MFA for authenticating starting march 26, 202419:31
fungithey've also added a similar notice on the login page for their portal19:32
clarkbenabling MFA breaks normal openstack api auth. We have to either use a rax api key or bearer token19:32
clarkbthis means all of our automation is impacted.19:32
clarkbSince bearer tokens expire (relatively quickly too) we've decided to investigate using the api_key method. To do this we need to install rackspaceauth as a keystoneauth1 plugin to all the places we use the api19:33
clarkbthen we need to use the api key value instead of regular user auth19:33
clarkbthe rough plan here is to test this with nodepool using a single region to start that way we can check that launcher and builder operations work (or don't)19:33
fricklerdo we know the lifetime for those api keys?19:34
clarkbthen when that works we can switch all rax nodepool providers over to the new system and update our control plane management to use the same api-key stuff. Then we can opt in to MFA when ready19:34
clarkbfungi: ^ do you know the answer to frickler's question? You were testing this with your personal account any indication of a lifetime?19:34
fungifrickler: i generated one for my personal rackspace account years ago and it's never changed19:34
fungifrom what i can tell it only changes if you click the "reset" button next to it in the account settings19:35
clarkbIf you'd like to help with reviews or pitch in pushing changes we're using topic:rackspace-mfa19:35
tonybI'm not really seeing how that helps with security at all?19:35
fungitonyb: it helps with security theater19:35
fungiif you force people to make changes then you can't say you didn't do anything19:36
clarkbfungi: for the system-config change we need to put new secrets in private vars. Is that done yet?19:36
clarkbthinking about our next steps and I think it is roughly add the new private vars, land the system-config change, then update nodepool config19:36
fungiyes, i left a comment on the change saying i did it too19:36
clarkbthen we can either land the nodepool change or try it out of the intermediate registry. Pull from the intermediate registry will only work for the launcher image I think since the builder is multiarch and docker isn't able to negotiate multiarch images out of the intermediate registry currently :/19:37
clarkbfungi: thanks!19:37
clarkbfungi: we should be able to push up a project-config update with a depends on system-config too if we haven't yet19:38
clarkbbut I think that is where we're at until a couple of things merge19:38
corvusi think if it works for launcher that's good enough to land the nodepool change19:38
clarkbas an alternative we can manually install the lib into the image if the launcher is multiarch too and can't be fetched out of testing19:38
fungiclarkb: what needs changing in project-config? i can do that19:38
corvus(i don't think we need to prove it works to land the nodepool change; it's pretty simple.  but still, it'd be nice to avoid churn or errors there since there's no real way to test it other than in prod)19:38
clarkbfungi: we have to update the nodepool/nl01.opendev.org and nodepool/nodepool.yaml files to force one of the three rax providers to use your newly defined clouds.yaml entries19:39
fungioh, right that19:39
clarkbcorvus: ++19:39
fungiyeah i'll get that proposed19:39
fungithough probably not until after 21z19:39
clarkbfungi: I would pick the rax region with the lowest capacity to reduce impact if it doesn't work19:39
fungigood idea19:39
fungiwe have three weeks to get this working, which seems like plenty, but if we run into problems that time can disappear on us very quickly19:40
clarkbagreed best to get as much info as we can as early as possible then adjust our plan as necessary19:41
fricklerwhat about log uploads, are these also affected or not? the earlier discussion in #opendev didn't seem conclusive to me19:41
fungiwe use swift account credentials for that, not keystone19:42
fungias i understand it19:42
fungithose are separate accounts defined in swift itself and scoped to specific swift acls19:42
clarkbya so I don't think they will be affected but we should double check on that (check that we are using special creds and check that they aren't affected though i'm not sure how we do this second thing)19:42
clarkbcorvus: you may recall the details as I think youset that UP/19:43
clarkb(and I can't type)19:43
fungiwe can also, worst case, fall back to only uploading to ovh in the interim while we work it out19:43
clarkbnot ideal but ya that would work19:44
clarkbas far as actual MFA implementation goes their docs refer to phone authenticator apps. Typically this means they are doing totp so we should be able to do that here as well19:44
clarkbsimilar to how some of our other accounts have done totp19:44
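[Editor's note] For readers unfamiliar with the totp mentioned above, here is a minimal sketch of the standard algorithm (RFC 6238 on top of RFC 4226 HOTP) using only the Python standard library. It is not taken from the meeting and says nothing about how Rackspace actually implements MFA; the log only guesses that authenticator apps imply totp. The function names `hotp` and `totp`, the 30 second step, and the 6 digit default are illustrative assumptions, and the secret in the usage note is the RFC test key, not a real credential.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret_b32: str, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP where the counter is the number of elapsed time steps.

    secret_b32 is the base32 string an authenticator enrollment page typically
    shows (an assumption about the provider, not something stated in the log).
    """
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    secret = base64.b32decode(padded)
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)
```

As a usage example, `totp(base64.b32encode(b"12345678901234567890").decode(), now=59)` reproduces the RFC 6238 SHA-1 test vector (truncated to 6 digits). In practice the shared secret would live wherever the other account secrets are kept, and anyone holding it can generate codes.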
clarkbStill a lot of unknowns for now but we've got a plan to learn more. Next week we can catch up and make sure there aren't any glaring issues we need to address19:46
clarkb#topic Open Discussion19:46
clarkbAnything else before we end the meeting?19:46
corvusclarkb: i don't recall the details....19:47
clarkbcorvus: ack we should be able to log in to the swift stuff and check and/or look at our secrets in zuul19:47
corvusyeah, probably worth looking into ahead of time19:47
corvusbecause i agree, something is different about it19:47
clarkbopenstack is starting to get into release mode. Keep that in mind when making changes19:48
clarkband that's about all I had19:48
clarkbsounds like that is everything for today. Thank you everyone for your time and effort operating and improving opendev19:51
clarkb#endmeeting19:51
opendevmeetMeeting ended Tue Mar  5 19:51:55 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:51
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-05-19.00.html19:51
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-05-19.00.txt19:51
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2024/infra.2024-03-05-19.00.log.html19:51
fricklerthx clarkb 19:52
