Tuesday, 2023-07-11

clarkbmeeting time19:00
* fungi waves19:00
clarkbI think I'm awake enough I can run through this agenda19:00
fungitc call is running a few minutes over but i can chair if you want just may need to delay a few minutes19:00
clarkbnah I can do it19:00
fungithanks!19:01
fungiand good morning19:01
clarkbI've got the cacophony of birds outside reminding me it is time to wake up anyway19:01
clarkb#startmeeting infra19:01
opendevmeetMeeting started Tue Jul 11 19:01:25 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/FV2S3YE62K34SWSZRQNISEERZU3IR5A7/ Our Agenda19:01
clarkb#topic Announcements19:02
clarkbI did make it to UTC+1119:02
clarkbI'm finding that the best time to sit at a computer is something like 01:00/02:00 UTC and later simply due to weather. But we'll see as I get more settled; this is only day 5 or so there19:03
clarkb#topic Topics19:04
clarkb#topic Bastion Host Updates19:04
clarkb#link https://review.opendev.org/q/topic:bridge-backups19:04
clarkbLooks like this set of changes from ianw could still use some infra root review19:05
clarkbif we can get that review done we can plan the sharing of the individual key portions19:05
clarkb#topic Mailman 319:06
clarkbfungi: any updates on the vhosting? then we can talk about the http 429 error emails19:06
fungino new progress, though a couple of things to bring up yeah19:06
clarkbgo for it on the new things19:06
fungithe first you mentioned, i'm looking to see if there's a way to create fallback error page templates for django19:07
fungibut perhaps someone more familiar with django knows?19:07
fungii know we can create specific error page templates for each status19:07
fungiso we could create a 429 error page template, but what i'm unsure about is if there's a way to have an error page template that applies to any error response which doesn't have its own separate template19:08
fungii think i recall tonyb mentioning some familiarity with django so i might pick his brain later if so19:08
clarkbI'm unsure myself19:09
fungiassuming my web searches and documentation digging turn up little of value19:09
clarkba default would be nice if possible but I suspect adding a 429 file would be a big improvement alone19:09
tonybI don't think it was me 19:09
fungioh too bad19:10
tonybsorry I'll do better :P19:10
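Editorial sketch of the fallback idea discussed above: try a status-specific error template first, then progressively more generic ones. This is only an illustration under assumptions, not something the participants proposed or tested. The template names "4xx.html" and "error.html" and both helper functions are hypothetical. In Django, a list like this could be passed to django.template.loader.select_template, which returns the first template that loads, but whether that fits the Mailman 3 web stack is not verified here.

```python
def error_template_candidates(status_code):
    """Return template names to try, most specific first."""
    specific = f"{status_code}.html"         # e.g. "429.html"
    family = f"{status_code // 100}xx.html"  # e.g. "4xx.html" (hypothetical name)
    generic = "error.html"                   # hypothetical catch-all template
    return [specific, family, generic]


def pick_template(status_code, available):
    """Pick the first candidate that exists in `available` (a set of names).

    Returns None when nothing matches, so the caller can fall back to the
    framework's default error response.
    """
    for name in error_template_candidates(status_code):
        if name in available:
            return name
    return None
```

With only "404.html" and "error.html" present, a 429 response would resolve to "error.html", while a 404 would still get its own page.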
fungithe other item is we've had a couple of (mild) spam incidents on the rust-vmm ml, similar to what hit zuul-discuss a few months back. for now it's just been one address i initially unsubscribed and then they resubscribed and sent more, so after the second time i switched the default moderation policy for their address to discard instead of unsubscribing them19:10
fungibut still might consider switching the default moderation policy for all users on that list to moderate and then individually updating them to accept after they send good messages19:11
fungithat is if the problem continues19:12
clarkbI'm good with that but ideally if we can find a moderator in that community to do the filtering19:12
clarkbI'm not sure we should be filtering for random lists like that.19:12
fungiwell, yes i stepped in as a moderator since i was already subscribed and the only current community moderator had gone on sabbatical, but we found another volunteer to take over now19:13
clarkbgreat19:14
fungimy concern is it seems like the killer feature of mm3, the ability for people to post via http, increases the spam risk as well19:14
fungiwhich is going to mean a potentially increased amount of work for list moderators19:14
clarkbThough two? incidents in ~6 months isn't too bad19:15
fungiyeah, basically19:16
fungibut these are also very low-volume and fairly low-profile lists19:16
fungiso i don't know how that may translate to some of the more established lists once they get migrated19:16
fungisomething to keep an eye out for19:16
clarkbthere is probably only one way to find out unfortunately19:16
fungiagreed19:17
fungianyway, that's all i had on this topic19:17
clarkb#topic Gerrit Updates19:17
clarkbWe are still building a Gerrit 3.8 RC image. This is only used for testing the 3.7 to 3.8 upgrade as well as general gerrit tests on the 3.8 version but it would be good to fix that19:17
clarkb#link https://review.opendev.org/c/opendev/system-config/+/885317?usp=dashboard Build final 3.8.0 release images19:17
clarkbAdditionally the Gerrit replication tasks stuff is still ongoing19:18
clarkbI think my recommendation at this point is that we revert the bind mount for the task data so that when we periodically update our gerrit image and replace the gerrit container those files get automatically cleaned up19:18
clarkb#link https://review.opendev.org/c/opendev/system-config/+/884779?usp=dashboard Stop bind mounting replication task file location19:18
clarkbIf we can get reviews on one or both of those then we can coordinate the moves on the server itself to ensure we're using the latest image and also cleaning up the leaked files etc19:19
fungiwhat's the impact to container restarts?19:19
fungiif we down/up the gerrit container, do we lose queued replication events?19:19
clarkbfungi: yes. This was the case until very recently when I swapped out the giteas though so we were living with that for a while already19:20
clarkbThe tradeoff here is that having many leaked files on disk is potentially problematic when that number gets large enough. Also these bad replication tasks produce errors on gerrit startup that flood the logs19:21
clarkbwe'd be trading better replication resiliency for better service resiliency I think19:21
clarkbfungi: that said having another set of eyes look over the situation may produce additional ideas. The alternative I've got is the gerrit container startup script updates that try to clean up the leaked files for us. I don't think the script will clear all the files currently but having a smaller set to look at will help identify the additional ones19:22
clarkb#link https://review.opendev.org/c/opendev/system-config/+/880672 Clear leaked replication tasks at gerrit startup using a script19:23
clarkbI'm happy to continue down that path as well, it's just the most risky and effort-intensive option19:24
fungithanks, makes sense19:24
clarkbrisky because we are automating file deletions19:24
clarkbfor a todo here maybe fungi can take a look this week and next week we can pick an option and proceed from there?19:25
clarkbThe other Gerrit item is disallowing implicit merges across branches in our All-Projects ACL19:25
clarkbI can't think of any reason to not do this and I don't recall any objections to this in prior meetings where this was discussed19:26
fungiyeah, i should be able to19:26
clarkbreceive.rejectImplicitMerges is the config option to reject those when set to true19:26
fungidid i propose a change for that? i can't even remember now19:26
clarkbfungi: I don't think so since it has to be done directly in All-Projects then simply recorded in our docs?19:26
clarkbthere may be a change to do the recording bit /me looks19:26
clarkbhttps://review.opendev.org/c/opendev/system-config/+/88531819:27
clarkbso ya if you have time to push that All-Projects update I think you can +A the change to record it in our docs19:27
fungioh, cool19:28
fungii guess someone did propose that19:28
clarkbthat was all I had for gerrit. Anything else before we move on?19:29
clarkb#topic Server Upgrades19:31
clarkbI'm not aware of any changes here since we last met19:31
clarkbtonyb helped push the insecure ci registry upgrade through. I may still need to delete the old server, I can't recall if I did that right now19:31
corvusze04-ze06 upgraded to jammy today19:32
clarkbI think tonyb is looking at other things now in order to diversify the bootstrapping process as an OpenDev contributor so I'll try to look at some of the remaining stragglers myself as I have time19:32
clarkbcorvus: excellent19:32
tonybthe cleanup is removing it (01) from the inventory and then infra-root deleting the 01 vm?19:32
clarkbtonyb: correct19:32
tonybOkay19:33
clarkb#topic Fedora Cleanup19:34
clarkbtonyb: I've lost track of where we were in the mirror configuration stuff. Are there changes you need reviewing on or input on direction?19:34
tonybI need to update the mirror setup with the new mirrorinfo variable19:35
clarkbtonyb: is that something where some dedicated time to work through it would be helpful? if so we can probably sort that out with newly overlapping timezones19:35
tonybYeah, that's a good idea.  I understand the concept of what needs to happen but I'm in danger of overthinking it19:36
clarkbok let's sync up when it isn't first thing in the morning for both of us and take it from there19:37
tonybgreat19:37
clarkb#topic Quo vadis Storyboard19:37
fungii think i switched a neutron deliverable repo over to inactive and updated its description to point to lp last week? openstack/networking-odl19:38
clarkbOne thing I noticed the other day is that some projects like starlingx are still creating subprojects in storyboard. We haven't told them to stop and I'm not sure we should, but they were confused that it seems to take some time to do that creation. I think we are only creating new storyboard projects once a day19:38
clarkbAt this point I'm not sure there is much benefit in having project-config updates trigger the storyboard job more quickly19:38
clarkbBut it was a thing people noticed so I'm mentioning it here19:39
fungithere was also some discussion about sb in the #openstack-sdks channel, in particular a user was surprised to discover that an unescaped <script> tag in a story description resulted in truncating the text once displayed. fairly easy to avoid, but turned into some conversation about why getting a fix for such things implemented would be tough with our current deployment19:39
tonybCan we run the create, I assume via cron, twice (#gasp) a day?19:39
clarkbtonyb: I believe the job that does it is infra-prod-remote-puppet-else19:40
clarkbwhich at this point should mostly just be storyboard?19:40
clarkbwe have removed the vast majority of any remaining puppet19:40
tonybAhhh okay19:40
clarkbso ya we basically run that job more often or when necessary to decrease the wait time19:41
clarkbanything else storyboard related?19:42
fungii got nothin'19:43
clarkb#topic Gitea Upgrades19:43
clarkbGitea 1.19.4 exists and fungi has pushed an update to upgrade us19:43
clarkb#link https://review.opendev.org/c/opendev/system-config/+/887734?usp=dashboard Upgrade Gitea to 1.19.419:43
fungiseems to be fairly minor for our purposes, fixes mostly to stuff we disable anyway19:44
clarkbThese bugfix point upgrades tend to be pretty safe and straightforward though this one has a small template update19:44
fungiso also probably not urgent19:44
clarkbThe other gitea upgrade to think about is the 1.20 update. They only have RC releases so far and no changelog so also not urgent19:44
clarkb#link https://review.opendev.org/c/opendev/system-config/+/886993?usp=dashboard Begin process to upgrade gitea to 1.2019:45
clarkbThe 1.20 upgrade should happen after we upgrade to 1.19.latest19:45
clarkbfungi: I think I may be able to be around with overlap in your timezone early tomorrow morning for me, later afternoon for you, if we want to land that 1.19.4 change and monitor19:46
clarkbI'll ping you tomorrow if I manage that and we can take it from there?19:46
fungiyeah, sure that works19:46
fungii should be at the keyboard by 1200z19:47
clarkbI don't expect trouble but good to have people around if necessary19:47
clarkb1900 is probably about as early as I can manage :)19:47
clarkbthough maybe my evening overlaps with 1200 I need to math that out19:47
fungioh, you said early in your timezone not the other way around19:47
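Editorial sketch of the timezone arithmetic mentioned above, assuming a fixed UTC+11 offset (the offset clarkb stated earlier in the log) and ignoring any daylight-saving changes. The helper name local_time and the example date are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC+11 offset, per the announcement earlier in the log.
UTC_PLUS_11 = timezone(timedelta(hours=11))


def local_time(utc_hhmm, tz=UTC_PLUS_11):
    """Convert an 'HHMM' UTC time on an example date to the given zone."""
    hour, minute = int(utc_hhmm[:2]), int(utc_hhmm[2:])
    # The date is arbitrary (the day after the meeting); only the offset matters.
    utc_dt = datetime(2023, 7, 12, hour, minute, tzinfo=timezone.utc)
    return utc_dt.astimezone(tz)


for hhmm in ("1200", "1900"):
    print(hhmm, "UTC ->", local_time(hhmm).strftime("%Y-%m-%d %H:%M"), "UTC+11")
```

Under that assumption, 1200 UTC falls late in the evening at UTC+11 and 1900 UTC falls early the next morning.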
clarkb#topic Etherpad Upgrade19:47
clarkbfungi: ya19:47
clarkbAfter a long release drought Etherpad made a 1.9.1 release19:48
fungii assume the commit we've been running on is included in that release19:48
clarkbAt first the tagged sha didn't actually build and I was forced to use a commit that fixed the build issues after the release. But I think they updated/replaced the tag and now it seems to work19:48
fungiaha, cool19:48
clarkb#link https://review.opendev.org/c/opendev/system-config/+/887006?usp=dashboard Etherpad 1.9.119:48
clarkbI updated that change to use the tag again and it did not fail19:49
clarkbNow there is an issue where numbered lists don't properly increment the list number values so every entry is 1. basically making it a weird bulleted list19:49
tonybGoing back to gitea (sorry), is there any merit to landing the bullseye -> bookworm update in the same window?19:49
clarkbtonyb: no I think we should decouple those if we can. Basically swap gitea to the new debian with a fixed gitea version19:50
tonybokay19:50
clarkbtonyb: the gitea upgrades are very low impact (we roll through them one by one and shouldn't lose any replication events and the haproxy should handle http requests too)19:50
tonybOkay19:50
clarkbif the gitea upgrades were a bit more impactful then we should consider combining but they are super transparent to users 19:50
clarkbGoing back to etherpad I think that this is also not very urgent given the known list bug19:51
clarkbI also haven't held a node yet to interact with it which is probably a good idea to double check that we don't have any plugin interactions that will create problems for us19:51
clarkbReviews welcome and I'll try to get a held node up soon19:52
clarkb#topic Open Discussion19:53
clarkbAnything else?19:53
tonybJust to note that we started the bullseye to bookworm updates19:53
fungiyay!19:53
tonybthe first set of services I tried failed due to, what I think is, missing requires to get the speculative images in the buildset registry19:54
clarkbya I looked at that briefly and wasn't able to understand what was missing. It looks like we have what we need19:54
tonybHopefully with some push and TZ overlap we can make solid progress19:55
clarkbI feel like this comes up semi regularly though and I need to be better about writing down what the issue was/improving my understanding19:55
clarkbcorvus: any chance you might have a few minutes to look at that?19:55
clarkbhttps://zuul.opendev.org/t/openstack/build/ab79e98cdd0242649cbc50593e87dae1/log/job-output.txt#723 is the failure19:55
corvusyeah i'll take a look and followup in #opendev 19:56
clarkbthank you19:56
tonybThanks19:57
clarkbsounds like that is everything for now. Thank you everyone!19:58
clarkb#endmeeting19:58
opendevmeetMeeting ended Tue Jul 11 19:58:51 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:58
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2023/infra.2023-07-11-19.01.html19:58
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-07-11-19.01.txt19:58
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2023/infra.2023-07-11-19.01.log.html19:58
