19:01:46 <clarkb> #startmeeting infra
19:01:46 <opendevmeet> Meeting started Tue Jan 24 19:01:46 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:46 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:46 <opendevmeet> The meeting name has been set to 'infra'
19:01:52 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/SAMQHW6WCCF4LKQ2IADJ4VJGZZENI72D/ Our Agenda
19:01:58 <clarkb> #topic Announcements
19:02:16 <clarkb> I sent email last week and made the Service Coordinator nomination period that begins on January 31, 2023 official
19:02:22 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/32BIEDDOWDUITX26NSNUSUB6GJYFHWWP/
19:03:13 <clarkb> I'll send a reminder email on the 31st that things have opened up too
19:03:55 <clarkb> #topic Bastion Host Updates
19:04:09 <clarkb> I suspect there hasn't been much change here after the week we had last week...
19:04:37 <clarkb> But on the todo list we need to shut down the old bridge and clean it up when we are satisfied doing so is fine
19:04:42 <clarkb> #link https://review.opendev.org/q/topic:bridge-backups
19:05:07 <clarkb> We need to review ^ that stack of changes (I need to personally do it, but anything around backups and encryption demands time and attention and I haven't had a ton of that recently)
19:05:09 <ianw> oh right, yes everyone has signed off on that
19:05:34 <clarkb> and then finally once we've dealt with those items we can start looking at parallel infra-prod jobs again:
19:05:36 <clarkb> #link https://review.opendev.org/q/topic:prod-bastion-group Remaining changes are part of parallel ansible runs on bridge
19:06:25 <ianw> (i've shut down the old bridge.  it's already in emergency.  i'll work on inventory changes, etc.)
19:06:42 <clarkb> thanks!
19:06:45 <clarkb> anything else to add to this topic?
19:06:56 <ianw> not today!
19:07:16 <clarkb> #topic Mailman 3
19:07:40 <clarkb> fungi: I know we've all been underwater with various security things, you more than others. Did you manage to make any progress on the outstanding mailman3 items?
19:07:51 <clarkb> For those following along these are the major todo items that I recall:
19:07:59 <clarkb> We need a service restart to set the site_owner config
19:08:03 <fungi> i've started catching back up on this, i added some initial notes to the bottom of https://etherpad.opendev.org/p/mm3migration in the todo section
19:08:13 <fungi> oh, right, i should add the restart
19:08:25 <clarkb> We need to figure out domain vhosting and likely change domain configuration in the mm3 django install to do this
19:08:34 <clarkb> and we need to fix the root email alias on the server
19:09:44 <fungi> i think all that's captured in the pad now
19:09:44 <clarkb> #link https://etherpad.opendev.org/p/mm3migration live todo list for mailman3 work.
19:09:51 <clarkb> excellent. Anything else to mention on this topic?
19:10:25 <fungi> nope, my focus will be the restart (should be able to do that after the meeting) and troubleshooting the job failures on 867987
19:10:45 <clarkb> sounds good, thanks. Again let me know if I can help
19:10:51 <fungi> sure thing
19:10:51 <clarkb> #topic Gerrit Updates
19:11:09 <clarkb> This has sort of morphed out of the Gerrit 3.6 post upgrade task tracking into a bigger set of items
19:11:17 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870114 Add Gerrit 3.6 -> 3.7 Upgrade test job
19:11:52 <clarkb> this change is a post upgrade item. ianw I responded to your comment there and basically indicated I feel like punting on that for now is ok/desirable since we aren't trying to fully automate the gerrit upgrade in production (yet)
19:12:11 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870874 Convert Gerrit to run on our base python image
19:12:40 <clarkb> This one looks super straightforward, but is actually fairly involved. It turns out that the old openjdk images on dockerhub aren't really something we should use going forward.
19:13:19 <clarkb> I've elected to address that by switching gerrit over to our base python image and installing java from debian repos. The reason for this is it will allow us to update python on those images to 3.10 or 3.11 for jeepyb in a straightforward manner
19:13:42 <clarkb> Debian bullseye includes java 11 (what we currently run on) and java 17 (what we'll eventually move to) which is nice too
19:13:48 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870877 Run Gerrit under Java 17
19:14:28 <clarkb> this change is a followup to the previous change that switches us from java 11 to 17 as gerrit 3.6 release notes say 17 is fully supported as of 3.6. That said, I have had to add a workaround for a bug in running gerrit under java 17 (the bug is linked to in that change)
19:14:52 <clarkb> I should write the gerrit mailing list today asking them about that because "fully supported" and "use this workaround for the jvm" seem to be in conflict with one another
19:15:25 <clarkb> And finally ianw has a change to convert us away from deprecated copy conditions in 3.6 (this needs to be done before we upgrade to 3.7 along with other things like conversion to submit-requirements)
19:15:27 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/867931 Cleaning up deprecated copy conditions in project ACLs
19:16:29 <clarkb> I need to review ^ that one and will try to do that today. Then we can land that when we're happy with the state of acls generally (which I think we may already be)
19:16:38 <clarkb> Any other Gerrit updates?
19:16:50 <ianw> yeah i think we're fine on that -- just to log what happened, i reloaded the ACLs per
19:16:59 <ianw> #link https://etherpad.opendev.org/p/760YNeM5OEFS1hlr7bE5
19:17:27 <ianw> someone else should probably double check the logs but the only "errors" were for projects that were retired and in R/O mode
19:18:01 <ianw> i saw some discussion on that in #opendev and wondering if we should still make the change to jeepyb to stop on acl failures
19:18:16 <ianw> i think we can, because we won't try to load ACLs for retired projects (normally?)
19:18:33 <clarkb> ianw: we will if we haven't already cached that we've updated them
19:18:38 <ianw> although i guess it does mean we can't run the mass reload?
19:18:42 <clarkb> which is my concern since the cache may not persist forever
19:18:45 <clarkb> ya exactly
19:18:49 <ianw> yeah
19:19:13 <clarkb> I think we need to handle errors for updating RO projects or remove RO projects from projects.yaml or something if we want to make errors more forceful
19:19:29 <clarkb> I'm open to ideas if people want to leave them in that jeepyb review
19:19:38 <clarkb> but it's not something we can land as is, so I WIP'd it
19:20:40 <clarkb> I can use your captured error logging for hints too
19:20:47 <clarkb> I should go look at that for multiple reasons :)
19:21:20 <ianw> ok, yeah something to think about.  they're all using the retired acl file
19:21:26 <fungi> i suppose our projects.yaml cleanup job could also propose removals of read-only projects $somehow
19:21:38 <ianw> i guess we need to probe though if it's been applied ...
19:22:13 <clarkb> well we know the retired acl does apply cleanly which would imply anything using that acl that fails is very likely to have failed due to being ro
19:22:30 <clarkb> I want to say gerrit says something like "you can't modify this RO project" which we could use as an indication to ignore too
19:23:10 <ianw> yeah it says
19:23:11 <ianw> openstack-attic_compute-api.txt- ! [remote rejected] HEAD -> refs/meta/config (prohibited by Gerrit: project state does not permit write)
19:23:35 <clarkb> cool, I can work on a new patchset that checks for that error specifically and only ignore it if that is returned else error
19:23:44 <ianw> ++
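The check clarkb describes above could be sketched roughly as follows. This is a hypothetical illustration, not jeepyb's actual code: the helper name is invented, and the only grounded detail is the Gerrit rejection message ianw pasted from the push output.

```python
# Hypothetical sketch: decide whether a failed push of refs/meta/config
# should be ignored. Only a rejection caused by the project being in a
# read-only state is ignorable; any other ACL push failure should still
# be treated as an error.

READ_ONLY_MARKER = "project state does not permit write"


def acl_push_failure_is_ignorable(push_output: str) -> bool:
    """Return True if the push failed only because the project is R/O."""
    return READ_ONLY_MARKER in push_output
```

A caller would run the `git push`, and on failure pass the captured output through this check before deciding whether to raise.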
19:24:24 <clarkb> anything else related to Gerrit before we go to the next thing?
19:25:36 <clarkb> #topic Gitea 1.18 upgrade
19:25:50 <clarkb> yesterday we did a minor upgrade from 1.17.3 to 1.17.4
19:26:08 <clarkb> That was in preparation for an upgrade to 1.18.x which has now made it to 1.18.3
19:26:10 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/870851 Upgrade to 1.18.3
19:26:27 <clarkb> There is a held node against the child change of ^ that can be used to preview things. It seems like it is working happily.
19:27:04 <clarkb> reviews and double checking the changelog and the held node are much appreciated. I'm happy to watch that go in
19:27:21 <clarkb> (this is something that has been on my todo list since like December so will be glad to have it done :) )
19:27:45 <ianw> ++ will do
19:28:13 <clarkb> #topic Pruning backups on the rax server
19:28:36 <fungi> yeah, i keep meaning to get to that
19:28:38 <clarkb> The rax backup server is warning us that we're at 92% of capacity and we should run the pruning tool
19:28:42 <fungi> doesn't seem dire yet
19:28:52 <clarkb> fungi: oh thanks.
19:29:03 <clarkb> I wanted to bring it up here just to make sure it didn't get completely forgotten under the fun of last week :)
19:29:24 <ianw> yeah i can clear that out
19:29:53 <clarkb> sounds like we've got a couple volunteers so should get done soon enough. Thanks again
19:30:00 <clarkb> #topic Linaro Cloud Updates
19:30:06 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/871196 Remove old linaro cloud from Nodepool
19:30:32 <clarkb> I've reviewed that stack and I think it can go in whenever we are ready. Also note that the ssl cert for that cloud expires in 2 days so sooner than later is a good idea
19:31:03 <clarkb> I know ianw is actively debugging use of the new cloud, but any chance you can give us a quick update on the modifications you made to that cloud?
19:31:56 <ianw> yep, basically we have 2tb to play with on the cloud, but it was all assigned to a cinder pool
19:32:10 <frickler> do we need a delay between the above cleanup patches?
19:32:38 <clarkb> frickler: yes, ideally we land the first one then wait until nodepool is done cleaning everything up before landing the second
19:33:07 <frickler> o.k., I just approved the first and will only review the next one, then
19:33:45 <clarkb> thanks!
19:34:13 <ianw> anyway, i deleted that cinder pool, and made it only 150gb which is enough room for the cache volume we attach to the mirror node
19:34:50 <ianw> the rest of the storage i just attached as a regular file system, and moved the glance image storage/libvirt storage into volumes mounted from there
19:35:29 <frickler> the new mirror seems to need another deployment run to recreate some things on the new volume
19:35:32 <ianw> so, basically we have enough room now to store our uploaded images and run i think as many VMs as we have floating IPs for
19:36:18 <ianw> frickler: ahh, that may be, yes i did delete its cache volume and recreate it.  i probably should have done a manual run of the mirror deployment against it.  will double check soon
19:36:35 <clarkb> sounds like good progress. Anything else to add on the arm cloud migration?
19:37:45 <clarkb> #topic Upgrading servers
19:37:58 <clarkb> this is reasonably high on my todo list but things like git security patching took precedence...
19:38:13 <clarkb> I'm hoping once I've got my backlog of gitea and gerrit things out of the way I'll be able to focus on this more
19:38:19 <clarkb> No real updates on this one.
19:38:28 <clarkb> #topic Quo vadis Storyboard
19:38:33 <clarkb> and same story on this topic :(
19:39:06 <clarkb> Which takes us to the end of our scheduled agenda
19:39:10 <clarkb> #topic Open Discussion
19:39:19 <clarkb> There are a couple things I wanted to mention here.
19:39:50 <clarkb> First is that we discovered gitea does cross repo searching (only on the primary branch) similar to hound. This caused us to wonder if we could drop hound as a result, but ianw pointed out that hound does regex search and gitea does not
19:40:12 <clarkb> That said I've been using it for simple searches and it seems to work well.
19:40:48 <clarkb> And second I've pushed tox -> nox conversion changes for bindep, jeepyb, git-review, and system-config now. For at least some of these (git-review) I don't think tox is working at all.
19:40:50 <fungi> and it sounds like the underlying search library gitea uses supports regex, so it may just be simple glue/ui patching to gitea to add that?
19:40:50 <ianw> yeah, i use both all the time :)
19:41:22 <clarkb> fungi: yes there is opportunity to improve gitea to expose regex searching as both bleve (the default we use) and elasticsearch appear to support regex searches
19:41:43 <ianw> i think hound is pretty low maintenance, i personally would like to keep it.  it probably doesn't need its own host as it does now, but not sure where else it would live
19:42:16 <clarkb> ianw: ya I think as long as it has more functionality keeping it makes sense
19:42:41 <fungi> if we get approximate feature parity in gitea, then dropping one more redundant ancillary service will still be good though
19:42:45 <clarkb> ++
19:43:23 <fungi> we can always host a static redirect from the codesearch name to gitea's explore search
19:43:30 <frickler> I like that I can just type "co" in the browser address bar and it will autocomplete and then focus in the search field
19:44:14 <frickler> two more things from me:
19:44:21 <clarkb> I've been trying to do the nox stuff when I've got a hole in my schedule that is too short to really dive into more involved tasks. I haven't seen any really strong reactions either way on nox. Please say something if you think I'm wasting my time and I'll try to fill those odd blocks of time with something else.
19:44:26 <fungi> https://opendev.org/explore/code also seems to focus on the search field, so a redirect would preserve that experience
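The static redirect fungi mentions could look something like the following Apache fragment. This is only a sketch: the hostname and the choice of Apache are assumptions, since the transcript doesn't say how such a redirect would actually be hosted.

```apache
# Hypothetical sketch: send the old codesearch hostname to gitea's
# cross-repo code search page. ServerName is an assumption.
<VirtualHost *:443>
    ServerName codesearch.opendev.org
    Redirect permanent / https://opendev.org/explore/code
</VirtualHost>
```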
19:44:58 <frickler> ah, o.k.
19:45:09 * frickler will be holidaying the upcoming two weeks and mostly be offline
19:45:22 <frickler> also I didn't get to add AFS to the agenda as promised. seems there was another cleanup in fedora, so we are good for now except maybe for some centos+ubuntu quota adjustments
19:46:00 <fungi> thanks for keeping an eye on that
19:46:00 <clarkb> ya I almost added it, but the dashboard looks pretty good right now so figured it could wait
19:46:44 <clarkb> frickler: I hope you are able to do something fun for your holidays
19:46:54 <fungi> i think i'm moments away from having updated screenshots to see if the latest revision of the donor logos addition works
19:47:56 <fungi> though looks like the gitea image build takes a while, so it will probably still spill over into my next meeting
19:48:29 <frickler> well as much fun as it gets with the weather and everything
19:50:30 <clarkb> Oh one last thought, should we offer to help debian with the git stuff since we already worked through much of it? fungi already gave them pointers, but I'm worried there hasn't been any movement there after a week. I just don't know what all might be involved particularly since that package isn't in salsa?
19:51:23 <fungi> i've been keeping an eye on https://repo.or.cz/w/git/debian.git/ and haven't seen any movement on the debian branches there either
19:51:43 <frickler> maybe ask some debian people like zigo or kevko for their judgement?
19:52:00 <fungi> that sounds reasonable
19:52:36 <fungi> i'm not a dd so couldn't nmu anything without a sponsor, but also git is central enough to so many things i'd be a little uneasy being the one to nmu that anyway
19:52:59 <clarkb> frickler: excellent idea
19:53:00 <ianw> i am a dd ... but would probably not do that for git! :)
19:53:10 <ianw> well not without consent, anyway
19:53:28 <ianw> but ... if we find the right people happy to help
19:54:11 <fungi> ianw: oh! you're a dd? i have something else i need official dds to weigh in on at some point, but it's not directly related to opendev
19:54:21 <fungi> i'll follow up with you later on it
19:55:05 <ianw> haha well yeah, i maintain a few things; i was much more active back in itanium days on ia64 things
19:55:33 <clarkb> ianw: when I worked at intel we actually had some of those racked up doing I forget what
19:55:35 <ianw> but ... well that's a slice of history now :)
19:55:51 <ianw> clarkb: probably making a lot of noise and heat
19:55:54 <clarkb> ha
19:56:11 <clarkb> sounds like that is everything for today. I've got another meeting in a few minutes so stopping here and getting time between would be great
19:56:14 <clarkb> thanks everyone!
19:56:25 <ianw> thanks clarkb!
19:56:27 <clarkb> both for your time today and all the hard work everyone does to make this machine roll forward
19:56:32 <clarkb> #endmeeting