19:01:16 <clarkb> #startmeeting infra
19:01:16 <opendevmeet> Meeting started Tue Aug 24 19:01:16 2021 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:16 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:16 <opendevmeet> The meeting name has been set to 'infra'
19:01:22 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2021-August/000278.html Our Agenda
19:01:28 <clarkb> #topic Announcements
19:01:33 <ianw> o/
19:01:37 <clarkb> Zuul is now on our matrix homeserver at #zuul:opendev.org
19:01:55 <clarkb> this means you should wander over there if you have zuul issues/questions/concerns or just want to say hi :)
19:02:11 <clarkb> also the matrix hosting info is in the usual location for that sort of thing, in case hosting questions come up
19:02:48 <clarkb> OpenStack's feature freeze begins next week
19:02:59 <yoctozepto> o/
19:03:06 <clarkb> I expect there will be a big rush of changes as a result next week which may make zuul restarts annoying.
19:03:17 <clarkb> But otherwise try to be aware of that as we make changes so we don't impact that too much
19:04:15 <clarkb> Also I'll be taking monday off to see if I can find any salmon
19:04:22 <fungi> also we have a new (minor) gerrit version since last meeting, yeah?
19:04:48 <clarkb> oh yup. Upgraded as Gerrit released some minor updates for security concerns (neither seemed to directly affect us) and we wanted to keep up to speed with that
19:05:18 <fungi> g'luck with the salmon hunt
19:06:01 <clarkb> #topic Actions from last meeting
19:06:06 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-08-17-19.01.txt minutes from last meeting
19:06:14 <clarkb> There weren't any actions recorded
19:06:17 <clarkb> #topic Specs
19:06:22 <clarkb> #link https://review.opendev.org/c/opendev/infra-specs/+/804122 Prometheus Cacti replacement
19:06:44 <clarkb> I'm still looking for feedback on this change if you have some time over tea :)
19:06:54 <fungi> and we can close out the matrix spec now, or do we still have remaining tasks there?
19:07:19 <clarkb> Did the spec say we should implement a meetbot or not? if not I think we can close it
19:07:32 <fungi> we have users relying on it at this point anyway, so it's in production (but not with complete bot feature parity)
19:07:55 <clarkb> yup, I think as long as we didn't promise that extra bot functionality as part of the initial spec then we are good to mark it completed
19:08:01 <fungi> was anyone able to confirm if my status notice went through to the zuul channel? i wasn't watching at the time
19:08:12 <clarkb> fungi: I don't think it did. I think there isn't a status bot either
19:08:40 <fungi> so anyway, yeah, i guess if those bits are called out in the spec then we need to keep it open for now
19:09:10 <clarkb> I'll check the spec after the meeting and lunch, and can push a completion change if we didn't. Otherwise we can talk about that next week in our meeting I guess
19:09:23 <clarkb> in order to plan finishing that effort up
19:09:54 <corvus> Status bot is in scope
19:10:08 <corvus> Meetbot not in scope
19:10:15 <fungi> thanks for confirming
19:10:50 <clarkb> ok, lets move on and we can talk about statusbot once we get through everything else
19:11:01 <clarkb> #topic Mailman server upgrades
19:11:20 <corvus> So we're a little off script there but not too important.
19:11:29 <clarkb> fungi and I have scheduled a time for the lists.katacontainers.io upgrade with the kata folks. This will be happening Thursday at 16:00 UTC
19:11:41 <clarkb> based on testing we expect things to function post upgrade and the upgrade itself to take about 2 hours
19:11:45 <fungi> kudos to debian/ubuntu packaging here, the in-place upgrade from xenial to focal was quite smooth
19:12:10 <fungi> (with a stop off in bionic)
19:12:31 <clarkb> we'll turn off services on the instance, snapshot it, make sure it is up to date and rebooted. Then run through the in-place upgrades to bionic and then to focal
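For reference, the in-place upgrade sequence described above looks roughly like this (a sketch; the exact service names stopped on the host are assumptions):

    # stop the services we manage before snapshotting (service names assumed)
    sudo systemctl stop mailman exim4

    # bring the current release fully up to date, then reboot
    sudo apt-get update && sudo apt-get -y dist-upgrade
    sudo reboot

    # step through each intermediate LTS: xenial -> bionic -> focal
    sudo do-release-upgrade   # xenial -> bionic
    sudo reboot
    sudo do-release-upgrade   # bionic -> focal
    sudo reboot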
19:13:19 <fungi> worth noting, exim service disablement gets undone when the package switches from sysv-compat to systemd service unit
19:13:20 <clarkb> I've also checked that our zuul jobs are happy running our ansible against focal and they are so once we've finished the upgrade the config management should continue happily
19:13:57 <fungi> we could probably preemptively add a disable symlink for the service unit now that we know that?
19:13:58 <clarkb> #topic Improving OpenDev's CD throughput
19:14:07 <clarkb> #undo
19:14:07 <opendevmeet> Removing item from minutes: #topic Improving OpenDev's CD throughput
19:14:18 <fungi> oh, sorry, didn't mean to derail the agenda
19:14:36 <clarkb> fungi: that is a neat idea, I'm happy not doing that given it worked out ok on the test system but if you want to try that I'm not opposed
19:14:57 <fungi> i'm up for trying it on lists.k.i since it's sort of a trial run for lists.o.o anyway
19:15:13 <fungi> the risk of it breaking anything seems negligible
19:15:15 <clarkb> yup
19:15:47 <clarkb> I'll try to remember to put that in my planning doc when I write that up tomorrow
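The preemptive disablement fungi suggests can be done by masking the unit, which survives the package's switch from sysv-compat to a native service unit (a sketch; the unit name exim4 is assumed):

    # mask creates /etc/systemd/system/exim4.service -> /dev/null,
    # so the package's new native unit cannot be started or enabled
    sudo systemctl mask exim4

    # the equivalent manual symlink, if added before the native unit exists
    sudo ln -s /dev/null /etc/systemd/system/exim4.service
    sudo systemctl daemon-reload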
19:15:55 <clarkb> #topic Improving OpenDev's CD throughput
19:16:16 <clarkb> Not much new to say here as I don't think anyone has started doing the mapping (currently I have that scribbled on my todo list for friday)
19:16:31 <clarkb> But there is one change on the simple things we might be able to do upfront list:
19:16:32 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/805487 Be more specific about when inventory changes force jobs to run.
19:17:54 <clarkb> I suspect that this is safe, but I'm worried we might have situations where we do want to run jobs if the inventory/service/ paths update. But so far the only example of that I have been able to come up with is letsencrypt, and I think that is unchanged by mordred's change
19:18:19 <clarkb> which means as long as we match on inventory/service/thisservice and inventory/base that change should be ok
19:18:43 <fungi> i've approved it. we can always add specific files to those
19:18:57 <fungi> as we spot holes
19:18:58 <clarkb> ya we can do that as well
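The shape of the matcher being discussed is roughly the following (a hypothetical sketch, not the literal content of change 805487; the job name and service path are illustrative):

    - job:
        name: infra-prod-service-thisservice
        files:
          # only trigger when this service's inventory or the shared base changes
          - ^inventory/service/thisservice/.*
          - ^inventory/base/.*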
19:19:37 <clarkb> And like I said I'm hoping to start mapping out the jobs on Friday based on my current todo list
19:19:38 <ianw> yeah that feels safe
19:20:01 <clarkb> Anything else to add on this subject?
19:20:02 <fungi> i intend to help with the mapping exercise
19:20:09 <fungi> remind me when you get going on it
19:20:22 <clarkb> will do and thanks
19:21:11 <clarkb> #topic Gerrit Account Cleanups
19:21:32 <clarkb> It has been 3 weeks since I disabled the last batch of users. I intend on deleting the conflicting external ids from those retired user accounts tomorrow
19:22:16 <clarkb> That will leave us with ~30 accounts where reaching out to the users is a good idea. Hopefully that will make for a good activity during openstack feature freeze slush. My goal there is to push a single change to the external ids ref that addresses all of those and gets verified by our running gerrit server too
19:23:00 <clarkb> I'll probably create a working checkout of the ref on review02 and start committing to it. It isn't clear to me if I can make 30 separate commits or if I need to have one. I will probably try doing 30 separate ones first and then squash them all if pushing the stack is rejected by gerrit
19:23:35 <fungi> they'll probably need to be squashed, i expect gerrit to validate each change individually
19:23:52 <fungi> er, i guess they're just commits not changes, so maybe not
19:24:07 <clarkb> yup I'm not sure how it will handle that
19:24:08 <fungi> either way, i agree with the plan
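A sketch of the workflow clarkb describes, using Gerrit's refs/meta/external-ids ref in the All-Users repo (the repo path on review02 and the commit message are assumptions):

    # clone the site's All-Users repo locally on the gerrit server (path assumed)
    git clone /home/gerrit2/review_site/git/All-Users.git
    cd All-Users
    git fetch origin refs/meta/external-ids
    git checkout -b external-id-cleanup FETCH_HEAD

    # ...remove the conflicting external id files, one commit per account...

    # if gerrit rejects the multi-commit push, squash into one commit
    git reset --soft FETCH_HEAD
    git commit -m "Retire conflicting external ids"

    # push the result back to the same ref
    git push origin HEAD:refs/meta/external-ids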
19:24:52 <clarkb> #topic Gitea 1.15.0 upgrade
19:25:09 <clarkb> Gitea 1.15.0 exists now.
19:25:21 <clarkb> there are a number of bugs they are looking at addressing that people have reported against 1.15.0
19:25:25 <clarkb> #link https://github.com/go-gitea/gitea/milestone/97
19:25:44 <clarkb> #link https://github.com/go-gitea/gitea/issues/16802 Primary Keys for mysql tables
19:26:00 <clarkb> issue 16802 in particular seems important for us since we run with mariadb
19:26:25 <clarkb> Since we're entering openstack feature freeze anyway and don't have a current pressing need to upgrade I think it would be best to see if we can wait for a 1.15.1
19:26:35 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/803231 WIP'd until we are happy with the gitea situation.
19:27:16 <clarkb> That all said, one of the changes we will have to deal with is that the hosting path for the logo files changes. Our Gerrit theme consumes the opendev logo from gitea and that path needs to update. The change above does update that path, but we may need to restart gerrit after the gitea upgrade too
19:27:37 <clarkb> ianw: you had talked about looking into hosting those files in a better way. Have you had a chance to look at that yet?
19:27:44 <ianw> oh yeah, i was thinking about that
19:27:56 <ianw> the expedient approach seems to be to dump the files in an afs directory
19:28:00 <clarkb> If we don't address that before the gitea upgrade I think that is fine as a gerrit restart is quick, but I do want to bring up the relationship between the tasks here
19:28:08 <clarkb> ianw: oh interesting, and then serve them off of static?
19:28:15 <ianw> the "correct" approach seems to be to create a repo to hold logos/favicons/images and get that to publish to afs
19:28:43 <ianw> i think it's probably worth doing the latter, as logos tend to morph and it's probably worth version tracking them
19:28:52 <clarkb> Most of them are svgs anyway so sticking them in git shouldn't be too bad
19:29:05 <ianw> then we can create a static site with a long cache, or just via static.opendev.org
19:29:36 <ianw> if we think the repo and publishing is the way to go, i can do that
19:30:00 <fungi> we could put those files in system-config though and just include them from there in the images, right? don't need a separate repo?
19:30:05 <clarkb> That plan seems reasonable to me. We'd still be hosting the files away from the services but via a webserver we have far more control over.
19:30:49 <fungi> i guess i'm lost on why it needs its own repo
19:30:56 <ianw> fungi: we could ... as in via https://opendev.org/opendev/system-config/...  ?
19:31:19 <clarkb> ya the files are already largely in system-config
19:31:20 <ianw> just to keep it separate really, and to make it simpler to reuse the existing afs publish jobs
19:31:26 <fungi> yeah, make a new top-level directory... "assets" or whatever
19:32:27 <fungi> i guess if the idea is to use it in publication jobs rather than putting it in the service images (so the files are hosted from the servers embedding them in their services) then having a repo using standard docs publishing jobs makes some sense
19:33:03 <fungi> my main concern with putting them on static.o.o is that if that site is down it could cause problems for the services pointing at it for inlined files
19:33:24 <fungi> (or if afs is offline, suddenly that's an impact to all our services)
19:33:40 <fungi> if the images get written into the images we build, the services are more self-contained
19:34:02 <clarkb> yup and for some services like gitea I think we'd continue to embed because they expect to be themed that way and not with external resources
19:34:19 <fungi> er, graphical images get written into the container images we build (too many meanings for image)
19:34:27 <clarkb> for that reason having the assets in system-config probably makes the most sense so we don't have multiple copies of files? though we could clone and copy in the image builds from the new repo
19:34:49 <ianw> i don't think that failing <img> tags will bring down services, so i don't think that's a blocker as such
19:35:05 <clarkb> ianw: probably not. might make things render funny for a bit but not fatal
19:35:23 <fungi> right, i don't mind having multiple places the same image is served, the desire was to have only one version-controlled source for the images themselves
19:35:57 <fungi> so we don't have to manually copy the same favicon to multiple directories/repositories or whatever
19:36:28 <fungi> and worry about those different copies getting out of sync when we update one repo but not another
19:36:44 <clarkb> ianw: do you think it is possible to keep them in system-config under an assets/ dir?
19:36:53 <ianw> right, so the canonical favicon could be logos.opendev.org/favicon.ico or https://opendev.org/opendev/system-config/master/assets/favicon.ico you mean?
19:37:10 <fungi> yeah
19:38:10 <corvus> note that the fallback location for favicon.ico is implicitly on the same server.  only affects older browsers.
19:38:24 <fungi> the other reason i'm waffling on the serving one copy and having services link browsers out to it is that it looks a little more like webbug/tracker behavior, may set off some privacy extensions, may need cors fiddling...
19:38:26 <ianw> do we want logos.opendev.org?  what needs it?
19:39:29 <clarkb> https://opendev.org/opendev/system-config/raw/branch/master/docker/gitea/custom/public/img/opendev-sm.png that actually does seem to work
19:39:46 <clarkb> in that case I think we don't need logos.opendev.org, just a bit better organization? This may be much simpler than I had anticipated
19:40:05 <fungi> the counterargument i hear from webdev people is that having a single location from which a particular file is served speeds up user interaction due to browser-side caching (this is the argument pushed for using google's font service, for example). i'm not really sure i buy it for this case though
19:40:16 <clarkb> that may load a bit more slowly than hitting https://opendev.org/img/opendev-sm.png when caches are cold though
19:40:48 <clarkb> when we access things via the repo, gitea does repo lookups and that will be slower on a cold cache, but once gitea's cache warms it should be about equivalent
19:41:31 <clarkb> maybe start with that since it doesn't require as many changes?
19:41:45 <clarkb> and if we find things aren't served quickly enough we can make a logos.opendev.org with better lookup times?
19:42:05 <fungi> as for logos.opendev.org, maintaining yet another site does seem like unnecessary administrative overhead to me, if the real goal is just to deduplicate the graphics in our git repo(s)
19:42:21 <ianw> lodgeit is the only one i can think of https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/lodgeit/templates/docker-compose.yaml.j2#L36
19:42:51 <clarkb> gerrit does similar as well
19:43:00 <clarkb> but they could both use https://opendev.org/opendev/system-config/raw/branch/master/docker/gitea/custom/public/img/opendev-sm.png type urls?
19:44:01 <ianw> i guess one goes via gitea's static rendering (/img) and the other via actually hitting the git tree
19:44:44 <corvus> if the plan is to serve them out of the repo, it would be prudent to put them in a dedicated directory with a README telling folks of the dangers of moving any files
19:45:08 <clarkb> corvus: yes I don't think we should use the actual url above if we use that method. We should put them in an assets/ type dir
19:45:14 <ianw> but, if we simply ln -s ./assets/opendev-sm.png docker/gitea/custom/public/img/opendev-sm.png then the image builds and the png is statically served?
19:45:34 <ianw> oh ... that will not be in the build context and probably won't work
19:46:17 <clarkb> ya I think the difficult thing to sort out might be how to keep copying the files into the docker images for services that expect that method of theming
19:46:29 <clarkb> I'm sure there is some way to do it that makes docker happy but I don't know what it is
19:47:27 <ianw> i feel like bazel had this issue too with symlinks to ~/.cache or something
19:48:20 <fungi> oh, so the problem is that we don't have any pre-build step to copy other files into the tree
19:48:58 <clarkb> ya docker doesn't like things that escape the path level it is started at
19:48:59 <fungi> and for people trying to build those locally they'd need to perform an extra step to get the files which aren't present in that subdirectory where the rest of the files for the image reside?
19:49:15 <fungi> and docker builds can't fetch files from some network location
19:49:30 <clarkb> docker builds can fetch the files from a network location. I think that is the common workaround here
19:49:35 <fungi> or otherwise install things which involve getting them over a network?
19:49:42 <clarkb> but our speculative builds make that more complicated
19:49:48 <fungi> yeah, agreed
19:50:20 <fungi> the network location would be a file:/// path within the same repo or another repo checked out in the workspace as pushed by zuul, i guess
19:51:14 <clarkb> We definitely don't need to solve that right here in the meeting (which only has a few more minutes allocated) but I suspect solving the "how do the graphical images get into the docker image builds" question is the main one to solve for now?
19:51:28 <clarkb> there might be an escape hatch override in docker we can use too. I'm not familiar enough to know for sure
19:51:59 <ianw> maybe give me an action to come up with something
19:52:08 <clarkb> a separate repo could solve that by us basically punting on the idea that we care about speculative executions of those
19:52:16 <clarkb> then image builds just clone the repo and use the current state
19:52:50 <clarkb> #action ianw Figure out how to centrally store graphical images for logos while still making them consumable by docker image builds for services that want a copy of the data at build time.
19:52:54 <clarkb> something like that?
19:53:07 <fungi> but if all our builds are driven from system-config then there's no need to put the files in another repo than system-config to get speculative support
19:53:19 <clarkb> fungi: except you have to get those files into the docker image
19:53:35 <fungi> and having them in another repo makes that easier than having them in the same repo?
19:53:38 <clarkb> file:/// doesn't work and neither does cp ../../assets
19:53:42 <fungi> kinda bizarre if so
19:53:50 <clarkb> fungi: ya because you can git clone that and the repo is as small as necessary to do that
19:54:03 <clarkb> fungi: rather than git cloning yourself from somewhere that isn't the actual state you expect
19:54:15 <fungi> or not using git at all, just a relative filesystem path
19:54:30 <clarkb> but you can't do that within an image build as it is a security concern (if my understanding is correct)
19:54:45 <clarkb> docker limits you to things at and below the Dockerfile in the fs tree
19:54:51 <fungi> oh, i see we could clone system-config and use files from it
19:54:55 <clarkb> yes
19:55:07 <fungi> so still don't need a separate repo
19:55:21 <clarkb> right, but you might end up confused in a speculative state situation
19:55:26 <fungi> just a second system-config clone from local path during the build
19:55:42 <fungi> cloning from the same repo we're already using
19:55:48 <clarkb> fungi: can you do that within a git repo?
19:56:02 <corvus> technically it's the build context, not the dockerfile, but yeah
19:56:04 <clarkb> (I don't actually know I've never tried cloning a git repo to a subtree of the git repo)
19:56:42 <clarkb> corvus: ah ok maybe we shift the context to the root of the repo then rather than the dockerfile location
19:56:56 <fungi> you can clone your current repo within a subdirectory of your repo, yep, just tested it
19:56:58 <clarkb> then we would have access to everything in the git repo and not need to sort out cloning a repo into itself
19:57:06 <clarkb> fungi: neat
19:57:10 <fungi> git clone ./ foo
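Both workarounds discussed here can be sketched as follows (image names and paths are illustrative):

    # option 1: build from the repo root so assets/ is inside the context;
    # COPY paths in the Dockerfile are then relative to the repo root
    docker build -f docker/gitea/Dockerfile -t opendevorg/gitea .

    # option 2: clone the checked-out repo into the build context, keeping
    # whatever speculative state zuul has already pushed into the workspace
    git clone ./ docker/gitea/system-config
    docker build -t opendevorg/gitea docker/gitea/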
19:57:41 <clarkb> Alright, we are just about at time. Let's open it up to anything else really quickly
19:57:45 <clarkb> #topic Open Discussion
19:57:56 <clarkb> Any last minute items to call out before we go find lunch/dinner/breakfast?
19:58:12 <ianw> #link https://review.opendev.org/c/opendev/system-config/+/804916
19:58:16 <ianw> was from last week
19:58:26 <ianw> to fix the time randomisation on backups
19:58:34 <clarkb> oh thank you for that. I missed the change
19:58:43 <clarkb> I guess it went up during the meeting last week :)
19:58:51 <ianw> i'll get back to debian-stable removal
19:59:07 <ianw> in trying to figure out nodes/jobs etc i got quite side-tracked into updating how zuul shows that info :)
19:59:42 <clarkb> those improvements do make the info zuul presents more readable. Thank you for that
19:59:58 <fungi> thanks clarkb!
20:00:04 <ianw> oh and i don't think we got a response on the openeuler mirror?
20:00:16 <clarkb> ianw: I haven't seen one. But not sure I was cc'd on the original send?
20:00:17 * fungi has to skeedaddle
20:00:26 <clarkb> aha infra-root was cc'd I see that now and no response
20:00:40 <clarkb> thanks everyone we are at time
20:00:43 <clarkb> #endmeeting