19:01:10 <clarkb> #startmeeting infra
19:01:10 <opendevmeet> Meeting started Tue May 30 19:01:10 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:10 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:10 <opendevmeet> The meeting name has been set to 'infra'
19:01:13 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/G2YQVAPBGOGDJKUKZKDUKAWMFMIWIRRD/ Our Agenda
19:01:22 <clarkb> #topic Announcements
19:01:52 <clarkb> I've gone ahead and written down that we should skip the meeting on June 13 as many of us will be in vancouver for the summit
19:02:06 <fungi> works for me
19:02:23 <clarkb> I will also be unable to attend the meeting on June 20th as I'll either be on a plane or in a tsa line or something
19:02:27 <fungi> i expect to be travelling and likely miss the meeting on the 27th as well
19:02:32 <clarkb> but I'm happy for the meeting to happen without me
19:02:44 <ianw> i will also be AFK then!
19:02:48 <fungi> i can likely chair the one on the 20th unless someone else wants to
19:02:54 <corvus> Lol see everyone in September!
19:03:01 <fungi> i like that plan even better
19:03:23 <corvus> Jk
19:03:52 <clarkb> I do plan to have a meeting here next week though before the hiatus
19:05:10 <clarkb> Also good to generally be aware that a summit + forum + ptg is happening that week (June 13-15)
19:05:15 <clarkb> #topic Topics
19:05:22 <clarkb> #topic Migrating to quay.io
19:05:36 <clarkb> I thought about pulling this off of the agenda but decided to keep it for today in order to do a recap
19:06:05 <corvus> Run as root is new info
19:06:15 <fungi> sudo all the things
19:06:31 <clarkb> the tl;dr is that after migrating about half of the issues I discovered that transparent mirroring of images outside of docker.io does not work when using docker
19:06:50 <clarkb> this eventually led me to revert all of the moves I had already done. This is largely done except for base image locations in the zuul/* repos
19:07:12 <clarkb> and ya rootless podman really really wants a systemd session like you're logging in on a desktop
19:07:29 <clarkb> as far as I can tell our test nodes do create a session with systemd when sshing in (we have all that pam setup in place)
19:07:49 <corvus> User mapping is the bigger production issue
19:07:59 <clarkb> but that isn't sufficient to make it happy. This leads to cgroupfs override options. Then on top of that you cannot run podman su'd to another user because you lack even more systemd session stuff in that case
19:08:01 <corvus> That requires root
19:09:40 <clarkb> When I did my test conversions of system-config stuff to podman it was all running as root because that is the simple 1:1 mapping away from docker
19:09:57 <clarkb> I don't think this is a big regression compared to our use of docker but does remove some of the functionality you would hope to get out of podman
19:11:11 <clarkb> The other thing I want to call out is that dib folks are asking for some resolution on speculative image testing with nodepool. https://review.opendev.org/c/zuul/nodepool/+/884632 has been proposed now which does the hack up image names and set them via vars option we considered in my brainstorming document
19:11:28 <clarkb> I personally think rolling forward with podman there is the best way forward so I -1'd it and pointed to the change that does that
19:11:45 <clarkb> but might be good to try and close that out soon one way or another
19:12:01 <corvus> oh yeah i'm going to -2 that
19:12:18 <corvus> we haven't been working on the actual fix for weeks just to give up now that it's actually working
19:12:50 <clarkb> I think some of this confusion occurred due to the holiday creating two disparate groups of people attacking the same problem
19:13:01 <clarkb> but I'm with you I've put a lot of effort into this and would like to see us fix it more properly
19:13:18 <fungi> the fedora mirror change for dib has moments ago been revised to drop the dep on the other nodepool change anyway
19:13:50 <ianw> ++
19:14:29 <clarkb> I think that is about it on quay.io stuff. Basically reviews and progress on the zuul side of things is what remains
19:14:41 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/883311 A role to install podman. Clarkb needs to update this role
19:14:55 <clarkb> oh also this change is on my list of things to update so that we can start pushing on converting existing jammy nodes
19:15:51 <corvus> i'm not sure the thing i raised before was adequately articulated
19:16:40 <corvus> the thing that is new since the last meeting is that due to the way we bind mount in files that are owned by the in-container nodepool user, the only way we can find to make that work right now is to run podman as root so that bind-mount happens with the correct perms, then we can still run the nodepool container as the nodepool user.
19:17:13 <clarkb> corvus: right we execute `podman` as root but then the container workload can still run as a dedicated user
19:17:28 <corvus> so the implication is that we need to be okay with running the podman command as the root user (which is effectively the same as what is happening now with docker) in production, at least unless/until someone figures out a way of subuid mapping to allow that to happen with a host-level nodepool user.
19:17:33 <corvus> ya
19:18:23 <fungi> if it's basically already the case with how we run the docker client, i don't see the concern
19:18:44 <fungi> unless it's just that podman might have otherwise been an opportunity to avoid doing that
19:19:39 <corvus> fungi: yep, from my pov, it's mostly just a sad face
19:19:40 <clarkb> ya I think we had hoped we could run things more betterer
19:19:48 <clarkb> but this isn't any worse
19:20:25 <clarkb> alright anything else on this? We can pick up the zuul work in matrix since it's largely zuul specific at this point
19:20:43 <tonyb> sounds good
19:21:16 <clarkb> #topic Bastion Host Change
19:21:22 <clarkb> #link https://review.opendev.org/q/topic:bridge-backups
19:21:29 <clarkb> I think this topic still needs reviews
19:21:52 <clarkb> I do like the functionality and would like to move forward with it but also think it is sensitive enough it should be carefully reviewed (eg not move forward with just my review)
19:22:18 <tonyb> I promise to review it tomorrow
19:22:32 <tonyb> just another set of eyes
19:22:56 <clarkb> thanks!
19:23:06 <fungi> oh, right, i keep meaning to look at that too
19:23:20 <clarkb> #topic Mailman 3
19:23:25 <fungi> no news yet. next i need to initiate some delivery tests so i can check what urls end up embedded in the list-oriented headers
19:23:47 <clarkb> fungi: is the held node the same one as last week?
19:23:47 <fungi> slightly worried they'll go to the default domain instead of the list-specific domains
19:23:54 <fungi> yeah, same held node still
19:24:14 <fungi> at least now that the default domain is completely separate from the list-specific domains, it'll be easier to check for
19:24:23 <clarkb> fungi: re email headers I think that may just work because django and the email bits are separated and it was only django that we had trouble with
19:24:41 <clarkb> I think when we create the list and set the domain that is with the email backend and it should be more happy on that side of the mm3 house
19:24:44 <clarkb> but ++ to testing it
19:24:53 <fungi> yes, hyperkitty and postorius specifically are the concern, so mailman-core should probably be unaffected
19:25:02 <fungi> but i want to make sure
19:26:48 <clarkb> Sounds good
19:26:51 <clarkb> anything else related to mm3?
19:27:21 <fungi> nada
19:27:28 <clarkb> #topic Gerrit Updates
19:28:03 <clarkb> With all the quay.io stuff I haven't had a chance to look at this. I still would really like to but realistically with summit and travel etc it is unlikely. For this reason I'll push up a revert for the bind mount which we can fallback to if necessary
19:28:12 <clarkb> (again I don't think this is urgent more just super annoying)
19:29:07 <fungi> i guess the main new bit of news is that there's actually a gerrit 3.8.0 release now?
19:29:19 <clarkb> yes since my bugs haven't gotten any traction
19:29:28 <clarkb> there is a community meeting on thursday morning which I'll attempt to attend
19:29:41 <clarkb> but they cancelled the last two because no one at google would start the google meeting instance
19:29:48 <clarkb> I'm not getting my hopes up
19:30:12 <fungi> oh, i guess technically 3.8.0 was released a few days before last week's meeting
19:30:29 <fungi> time has been an absolute blur lately
19:30:35 <clarkb> #topic Upgrading Servers
19:31:01 <clarkb> As with gerrit replication task file cleanup this has been on the back burner. Unlike Gerrit replication leaks I'm hoping I might do a server or two between now and the summit
19:31:10 <clarkb> fingers crossed! but other than that I don't have any real updates
19:31:18 <corvus> i upgraded zuul mergers to jammy
19:31:19 <clarkb> corvus did replace our zuul mergers with jammy nodes.
19:31:22 <clarkb> jinx!
19:31:27 <corvus> :)
19:31:28 <fungi> thanks!
19:32:45 <clarkb> This continues to be the perpetual example of slow and steady progress
19:32:59 <clarkb> never as fast as I would like but never completely stalling out. Hopefully I can continue the trend before the summit
19:33:06 <clarkb> #topic Fedora Cleanup
19:33:06 <fungi> corvus: i guess, judging from the inventory/dns changes, you were able to do the full set of mergers in one shot?
19:33:13 <clarkb> #undo
19:33:13 <opendevmeet> Removing item from minutes: #topic Fedora Cleanup
19:33:28 <fungi> i think i was nodding off that afternoon
19:33:56 <clarkb> fungi: that is my understanding. In part because the executors also run mergers so we didn't need all of the mergers running at all times
19:34:13 <fungi> cool
19:34:20 <fungi> makes sense to me, thanks
19:34:27 <clarkb> #topic Fedora Cleanup
19:34:57 <clarkb> tonyb: I went looking for any changes around the disabling of mirrors for fedora test nodes and didn't find one. But I may have looked poorly.
19:35:12 <clarkb> I think that is the next step here, I'm happy to help if you need direction or reviews etc
19:35:15 <tonyb> I didn't get my patch published but I did a bunch of local testing
19:35:30 <tonyb> I'll push it up today after I land
19:35:35 <clarkb> sounds good thanks
19:35:38 <ianw> i really should have thought about DIB first
19:35:50 <ianw> this + the quay changes have unfortunately caused quite some confusion
19:35:57 <ianw> #link https://review.opendev.org/c/openstack/diskimage-builder/+/883798
19:36:05 <clarkb> ya, but we have changes to fix things on both sides so we should be able to make progress shortly
19:36:12 <ianw> is I think ~ right
19:36:58 <ianw> however we saw one weird failure where we couldn't parse out the .qcow2 path from a curl to the mirror
19:37:44 <ianw> it's undetermined why, but i don't think as is the work-around in there is required per my comment
19:39:02 <clarkb> once the nodepool stuff is running again we can iterate on the dib side more easily too
19:39:06 <clarkb> to figure that curl thing out
19:40:21 <clarkb> I think that is it for fedora
19:40:22 <ianw> ++ ; i mean we could also just drop building fedora from .qcow2's
19:40:34 <ianw> i don't know if anyone actually uses it, other than the test
19:40:40 <clarkb> eh it's a feature people like to have since it allows them to modify existing images pretty easily
19:40:56 <clarkb> or at least I'm always told this is why guestfs or whatever it is called is popular
19:41:01 <clarkb> and I have to remind people that dib does that too :)
19:41:37 <clarkb> #topic Storyboard
19:41:53 <clarkb> fungi: I saw some openstack-helm discussion today about this. Anything else to report?
19:42:29 <fungi> nah, i merged that change and am about to deactivate the projects it removes
19:42:39 <fungi> that's basically the extent of it
19:42:58 <clarkb> #topic Open Discussion
19:43:01 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/884563 Github merge method zuul configuration error fixes
19:43:02 <fungi> and it's just cleanup for some already retired repos, the team isn't moving their active repos off sb for now
19:43:09 <clarkb> ack
19:43:16 <clarkb> frickler: called out this change to fix some configuration errors in zuul
19:43:47 <fungi> frickler also re-lit a fire under the openstack tc to get back to cleaning up their errors
19:43:50 <clarkb> corvus: ^ Is the underlying issue there that the projects in github have chosen a merge method that zuul doesn't support so we have to override? I do think frickler is correct that the harm here is minimal since we aren't gating those projects and are instead doing third party ci
19:43:56 <corvus> i think that may be more complex than anticipated; i left a comment on the change
19:44:05 <clarkb> ah ok I should refresh and read it then
19:44:37 <clarkb> corvus: is the issue that zuul is detecting a mismatch between it and the github project configuration?
19:44:39 <corvus> well, even if the change were correct, i don't think we should merge comments that are incorrect
19:45:18 <corvus> clarkb: yes, zuul is saying that it's configured to use the "merge" merge method with a certain repo, and github says that's not an option
19:45:23 <clarkb> the ideal fix would be to set the merge mode to match the upstream project in that case. Assuming zuul supports that mode.
19:45:43 <clarkb> and then we can drop the comment entirely
19:46:00 <corvus> hrm?
19:46:23 <corvus> i mean, the comment says "this shouldn't be necessary since we're not gating" but it is necessary even if not gating
19:46:47 <corvus> so i don't want to propagate the incorrect idea that this is only important for gating, it's not.  it's always important for zuul to merge changes locally the same way they are merged remotely.
19:46:56 <clarkb> right, my point is if we set merge-mode to what github wants then the comment isn't relevant anymore and can be removed
19:47:05 <fungi> because zuul needs to be able to match its merge method in order to faithfully predict what a pr might look like once it merges
19:47:16 <corvus> i mean, we don't need a comment
19:47:40 <corvus> but if we do feel that, then let's say something like "we're setting this to match the upstream method"
19:47:43 <corvus> my objection to the comment is that it says something about zuul's behavior which could mislead people
19:48:01 <corvus> fungi: exactly
19:48:03 <clarkb> got it. I'm trying to clarify what the actual fix is here since you also say that this change won't change any behavior
19:48:18 <corvus> yes, i think that's the more important thing
19:48:23 <corvus> and i don't have an answer to that
19:48:58 <corvus> you can see right now if you look at the error it says that the 'merge' merge-mode isn't supported
19:49:14 <corvus> it's probably just not reporting an error in this case because it's not tripping the "is this a new error?" check
19:49:35 <corvus> but i bet a nickel if you merge that change the error will still be present since the conditions are the same
19:50:10 <corvus> so, what is the correct merge mode?  and if it is 'merge', then why does zuul think it's not allowed?  are the $10k questions
19:50:22 <corvus> maybe $11k now with inflation
19:50:47 <clarkb> looking at some closed PRs there doesn't seem to be a clear indication of the method being used unless 'foo merged commit 1234567' means merged explicitly
19:50:52 <tonyb> is that USD?
19:51:05 <fungi> canadian
19:51:18 <fungi> i need to use up all my leftover canadian currency
19:51:19 <tonyb> still better than AUD
19:51:33 <clarkb> but ya we can run that down maybe by asking someone at ansible or querying the github api like zuul or something
19:51:50 <corvus> maybe there's a bug where no merge methods show up as permissible to the zuul user or something.  just brainstorm.
19:51:50 <clarkb> corvus: I wonder if zuul can list the acceptable merge methods when it logs the unacceptable ones (I don't know if it has that knowledge)
19:51:55 <corvus> yeah i'd start with the latter.
19:52:04 <corvus> clarkb: that would be ideal for debugging this
19:52:18 <fungi> does sound like a useful addition
19:53:29 <clarkb> anything else?
19:55:13 <fungi> nothing here
19:55:22 <fungi> at least not that i can remember after the weekend
19:55:24 <clarkb> sounds like that is everything. Thank you for your time. Reminder we'll be back next week then take at least a one week break
19:55:31 <clarkb> possibly a two week break.
19:55:46 <clarkb> And then this meeting will occur at 6am for me so I'll feel tonyb and ianw's pain
19:55:52 <clarkb> thanks again!
19:55:55 <clarkb> #endmeeting