Tuesday, 2023-02-07

clarkbMeeting time in a couple minutes18:59
clarkb#startmeeting infra19:01
opendevmeetMeeting started Tue Feb  7 19:01:06 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/QJK7E7D7HG5ZNT4UE7T5QIQ5TARIAXP6/ Our Agenda19:01
clarkb#topic Announcements19:01
clarkbThe service coordinator nomination period is currently open. You have until February 14 to put your name into the hat. I'm happy to chat about it if there is interest too before any decisions are made19:02
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/32BIEDDOWDUITX26NSNUSUB6GJYFHWWP/19:02
clarkbAlso, I'm going to be out tomorrow (just a heads up)19:02
clarkb#topic Topics19:04
clarkb#topic Bastion Host Updates19:04
clarkb#link https://review.opendev.org/q/topic:bridge-backups19:04
clarkbI truly feel bad for not getting to this. I should schedule an hour on my calendar just for this already. But too many fires keep coming up19:04
clarkbianw: fungi: were there any other bastion host updates you wanted to call out?19:05
fungii don't think so19:05
ianwsorry, woke up to a dead vm, back now :)19:06
clarkbyou haven't missed much. Just wanted to make sure there wasn't anything else bastion related before continuing on19:06
ianwno changes related to that this week19:06
clarkb#topic Mailman 319:06
clarkbThe restart of containers to pick up the new site owner email landed and fungi corrected the root alias email situation19:07
fungicurrent state is that i need to work out how to create new sites in django using ansible so that the mailman domains can be associated with them19:07
clarkbFixing the vhosting is still a WIP though I think fungi roughly understands the set of steps that need to be taken and now it is just a matter of figuring out how to automate django things19:07
fungiand yeah, this is really designed to be done from the django webui. if i were a seasoned django app admin i'd have a better idea of what makemigrations could do to ease that from the command line19:08
clarkbI wonder if we've got any of those in the broader community? Might be worth reaching out to the openstack mailing list?19:09
fungibut it's basically all done behind the scenes by creating database migrations which prepopulate new tables for the site you're creating19:09
fungidatabases were never my strong suit to begin with, and db migrations are very much a black box for me still. django seems to build on that as a fundamental part of its management workflow19:10
clarkbya I suspect what we might end up with is having a templated migration file in ansible that gets written out to $dir for mailman for each site and then ansible triggers the migrations19:10
clarkband future migrations should just ensure that steady state without changing much19:10
clarkbthe tricky bit will be figuring out what goes into the migration file definition19:10
fungiyeah, django already templates the migrations, as i loosely understand it, which is what manage.py makemigrations is for19:11
fungiit seems you're expected to tell django to build the migrations necessary for the new site, and then to apply those migrations it's made19:11
fungiwhich results in bringing the new site up19:12
ianwit sort of seemed like you needed a common settings.py, and then each site would have its own settings.py but with a different SITE_ID?19:12
fungii think so, but then mailman when it runs needs SITE_ID=0 instead19:12
clarkbianw: I think that's for normal django multi sites. But mailman doesn't quite do it that way? You don't have a true extra site, it just uses the site db info to vhost its single deployment19:12
fungiwhich is a magic value telling it to infer the site from the web requests19:13
clarkbya so ultimately we run a single site with ID=0 but the db has entries for a few sites19:13
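The SITE_ID behaviour described above can be sketched in plain Python. This is an illustration only, not Django or Mailman code: the `SITES` table, the `resolve_site` function, and the example domains are hypothetical stand-ins for the site rows in the database and for the host-based lookup that SITE_ID=0 is said to trigger.

```python
# Hypothetical stand-in for the site rows stored in the database.
SITES = {
    1: "lists.example.org",
    2: "lists.example.com",
}


def resolve_site(site_id, request_host):
    """Return the (id, domain) pair that should serve a request.

    A non-zero site_id pins the deployment to one site. A site_id of 0
    is treated as "infer the site from the request's Host header", which
    is the behaviour described in the discussion above.
    """
    if site_id != 0:
        return site_id, SITES[site_id]
    # Drop any port and normalise case before matching.
    host = request_host.split(":")[0].lower()
    for candidate_id, domain in SITES.items():
        if domain == host:
            return candidate_id, domain
    raise LookupError(f"no site configured for host {host!r}")
```

With a table like this, a single deployment running with an ID of 0 serves every listed domain, and a request for an unlisted host fails loudly instead of falling through to a default site.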
fungithe other related tidbit is i need to update docker on lists01 and restart the containers19:14
fungiwhich i plan to do first on a held node i have that pre-dates the new docker release19:14
clarkbcool sounds like we know what needs to happen just a matter of sorting through it. Anything else?19:15
fungii don't have anything else, no19:16
clarkb#topic Git updates19:16
clarkb#link https://review.opendev.org/c/opendev/system-config/+/873012 Update our base images19:16
fungii restacked the mm3 version upgrades change behind the vhosting work19:16
clarkbThe base python images did end up updating. Then I realized we use the -slim images which don't include git so this isn't really useful other than as a semi periodic update to the other things we have installed19:17
clarkbI was looking at the non slim images to see if git had updated not realizing we only have git where we explicitly install it. All that to say next week we can drop this topic.19:17
clarkbAnd that change is not urgent, but probably also a reasonable thing to do19:18
clarkb#topic New Debuntu Releases Preventing sudo pip install19:18
clarkbfungi called out that debian bookworm and consequently ubuntu 23.04 and after will prevent `sudo pip install` from working on those systems19:19
clarkbFor OpenDev we've shifted a lot of things into docker images built on our base python images. These don't use debian packaging for python and I suspect will be fine. However if they are not we should be able to modify the installation system on the image to use a single venv that gets added to $PATH19:19
clarkbI think this means the risk to us is relatively low19:20
clarkbAdditionally ansible is already in a venv on bridge and we use venvs on our test images19:20
ianwdocker-compose isn't though.  that's one i've been meaning to get to19:20
clarkbgood call19:20
clarkbdefinitely anything you can think of that is still running outside of a venv should be moved. We can do that ahead of the system server upgrades that will break us since old stuff can handle venvs19:21
ianw++ i'm sure we can work around it, but it's a good push to do things better19:21
clarkbElsewhere we should expect projects like openstack and probably starlingx to struggle with this change19:22
clarkbin particular tools like devstack are not venv ready19:22
fungiyeah, i posted to openstack-discuss about it as well, just to raise awareness19:22
ianwyeah there have been changes floating around for years, that we've never quite finished19:22
clarkband ya I think talking about it semi regularly is a good way to keep encouraging people to chip away at it19:23
clarkbfor a lot of stuff we should be able to make small measurable progress with minimal impact over time19:23
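The "single venv that gets added to $PATH" idea from this topic can be sketched with Python's standard `venv` module. This is a minimal sketch under stated assumptions, not what OpenDev's images actually do: the function names and the path used in the usage note are hypothetical.

```python
import os
import venv


def create_tool_venv(path):
    """Create a virtualenv at `path` and return its executables directory."""
    # with_pip=False keeps this sketch offline and quick; a real
    # deployment would want with_pip=True so packages can be installed.
    venv.create(path, with_pip=False)
    bindir = "Scripts" if os.name == "nt" else "bin"
    return os.path.join(path, bindir)


def path_with_venv(bin_dir, current_path):
    """Return a PATH string that searches the venv's executables first."""
    if not current_path:
        return bin_dir
    return bin_dir + os.pathsep + current_path
```

For example, `path_with_venv(create_tool_venv("/opt/tools-venv"), os.environ.get("PATH", ""))` would yield a PATH where tools installed into the venv shadow any system-wide copies, so nothing needs `sudo pip install` against the distro's Python.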
clarkb#topic Gerrit Updates19:25
clarkbA number of Gerrit related changes have landed over the last week. In particular our use of submit requirements was cleaned up and we have a 3.7 upgrade job19:25
clarkbThat expanded testing was used to land the base image swap for gerrit19:25
clarkbthis base image swap missed (at least) one thing: openssh-client installation19:25
clarkbthis broke jeepyb as it uses ssh to talk to gerrit for new repo creation via the manage-projects tool19:26
clarkbApologies for that.19:26
clarkbfungi discovered that even after fixing openssh jeepyb's manage-projects wedges itself for projects if the initial creation fails. The reason for this is that no branch is created in gerrit if manage-projects fails on the first run. This causes subsequent runs to clone from gerrit and not be able to checkout master19:26
clarkbTo work around this fungi manually pushed a master branch to starlingx/public-keys19:27
fungiand discovered in the process that you need an account which has agreed to a cla in gerrit in order to do that to a cla-enforced repository19:27
fungimy fungi.admin account had not (as i suspect most/all of our admin accounts haven't)19:28
clarkbI've only had a bit of time today to think about that but part of me thinks that this may be desirable as I'm not sure we can fully automate around all the gerrit repo creation failure causes?19:28
fungithe bootstrapping account is in the "System CLA" group, which seems to be how it gets around that19:28
clarkbin this specific case we could just fall back to reiniting from scratch but I'm not sure that is appropriate for all cases19:28
clarkbfungi: ya I wonder if we should just go ahead and add the admin group to system cla or something like that19:28
fungior add project bootstrappers to it19:29
clarkbah yup19:29
fungias an included group19:29
clarkbwith that all sorted I think ianw's change to modify acls is landable once communicated19:29
clarkb#link https://review.opendev.org/c/openstack/project-config/+/867931 Cleaning up deprecated copy conditions in project ACLs19:29
clarkbit would've had a bad time with no ssh :(19:30
fungiindeed19:30
fungithanks for fixing it!19:30
ianwyeah sorry, will send something up about that19:30
clarkbOther Gerrit items include a possible upgrade to java 1719:30
clarkb#link https://review.opendev.org/c/opendev/system-config/+/870877 Run Gerrit under Java 1719:30
clarkbI'd still like to hunt down someone who can explain the workaround that is necessary for that to me a bit better19:31
clarkbbut I'm finding that the new discord bridge isn't as heavily trafficked as the old slack system. I may have to break down and sign up for discord19:31
clarkbAnd yesterday we had a few users reporting issues with large repo fetches19:31
clarkbianw did some debugging on that and it resulted in this issue for MINA SSHD19:32
clarkb#link https://github.com/apache/mina-sshd/issues/319 Gerrit SSH issues with flaky networks.19:32
ianwoh, that just got a comment a few minutes ago :)19:32
ianw... sounds like whatever we try is going to involve a .java file :/19:34
clarkbya looks like tomas has a theory but we need to update gerrit to better instrument things in order to confirm it19:34
clarkbProgress at least19:34
clarkbAnything else gerrit related before we move on?19:35
ianwjayf was the first to mention it, but it is a pretty constant thing in the logs19:35
clarkbif it is a race the change in jdk could be exposing it more too19:36
clarkbsince that may affect underlying timing of actions19:36
fungiand others are still reporting connectivity issues to gerrit today (jrosser at least)19:36
clarkboh side note: users can use https if necessary. It's maybe a bit more clunky if using git-review but is a fallback19:37
ianwi think it would be easy-ish to add the close logging suggested there in the same file19:37
ianw(if it is) i could try sending that upstream, and if it's ok, we could build with a patch19:38
clarkbyup and we could even patch that into our image if upstream doesn't want the extra debugging (though ideally we'd be upstream first as I like not having a fork)19:38
ianwyeah.  although we haven't had a lot of response on upstream things lately :/  but that was mail, not patches19:38
clarkbianw: oh also March 2 at a terrible time of day for you (8am for me) they have their community meeting. Why don't I go ahead and throw this on the agenda and I'll do my best to attend19:39
clarkbI can ask about java 17 too19:39
clarkb(not that we have to wait that long just figure having a direct conversation might help move some of these things forward)19:40
ianw++19:40
clarkb#topic Python 2 removal from test images19:40
clarkb20 minutes left lets keep things moving19:41
clarkbsome projects have noticed the python2 removal. It turns out listing python2 as a dependency in bindep was not something everyone understood as necessary19:41
clarkbsome projects like nova and swift are fine. Others like glance and cinder and tripleo-heat-templates are not19:41
clarkbWhen this came up earlier today I had three ideas for addressing this. A) revert the python2 removal from test images B) update things to fix buggy bindep.txt C) have -py27 jobs explicitly install python219:42
clarkbI'm beginning to wonder if we should do A) then announce we'll remove it again after the antelope release so openstack should do either B or C in the meantime?19:42
fungiper a post to the openstack-discuss ml. tripleo seems to have gone ahead with option b19:42
ianwyeah i'm just pulling it up ...19:43
ianwi think maybe we have openstack-tox-py27 install it19:43
fungiapparently stable branch jobs supporting python 2.7 are very urgent to some of their constituency19:43
clarkbmy main concern here is that openstack isn't using bindep properly19:43
ianwi agree on that19:43
ianwif we put it back in the images, i feel like we just have to do a cleanup again at some point19:44
clarkbianw: yup I think we'd remove python2 again say late April after the openstack release?19:44
ianwat least if it's in the job, when the job eventually is unreferenced, we don't have to think about it again19:44
fungiwhat is properly in this case? they failed to specify a python version their testing requires... i guess that means they should include python3 as well19:44
clarkbthats a good point19:44
clarkbfungi: yes python3 should be included too19:44
ianwyeah, i mean the transition point between 2->3 was/is a bit of a weird time19:45
ianwthey *should* probably specify python3, but practically that's on all images19:45
ianwat least until python419:45
clarkbI suspect that nova and swift have/had users using bindep outside of CI19:45
fungialso a chicken-and-egg challenge for our jobs running bindep to find out they already have the python3 requested19:46
clarkband that is why theirs are fine. But the others never used bindep except for in CI and once things went green they shipped it19:46
clarkbSo maybe the fix is update openstack -py27 jobs to install python2 and encourage openstack to update their bindep files to include runtime dependencies19:46
fungibasically we can't really have images without python3 on them, because ansible needs it even before it runs bindep19:46
fungiso, yeah, i agree including python3 in bindep.txt is a good idea, it just can't be enforced by ci through exercising the file itself (a linting rule could catch it though)19:48
clarkbwe also don't need to solve that in the meeting (lack of time) but I wanted to make sure everyone was aware of the speed bump they hit19:48
ianw++ i'll have a suggested patch to openstack-zuul-jobs for that in a bit19:48
clarkbthanks19:48
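The linting rule mentioned in this topic could look roughly like the following. It is a crude sketch, not part of bindep: the function names are hypothetical, and it assumes only the basic bindep.txt layout of one package name per line, optionally followed by bracketed selectors, with `#` starting a comment. Matching by name prefix is a heuristic, since interpreter package names differ between distros.

```python
def declared_packages(bindep_text):
    """Return the package names listed in the text of a bindep.txt file."""
    packages = []
    for line in bindep_text.splitlines():
        # Strip comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The package name is the first token; selectors such as
        # [platform:dpkg] follow it and are ignored here.
        packages.append(line.split()[0])
    return packages


def missing_interpreters(bindep_text, required=("python3",)):
    """Return the required interpreter prefixes that no package matches."""
    packages = declared_packages(bindep_text)
    return [
        prefix
        for prefix in required
        if not any(name.startswith(prefix) for name in packages)
    ]
```

A project that still runs -py27 jobs could call `missing_interpreters(text, ("python2", "python3"))` and fail its lint job when the returned list is non-empty, which catches the omission without relying on the test image lacking the interpreter.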
clarkb#topic Docker 2319:48
clarkbDocker 23 released last week (skipping 21 and 22) and created some minor issues for us19:48
clarkbIn particular they have an unlisted hard dependency on apparmor which we've worked around in a couple of places by installing apparmor19:49
clarkbAlso things using buildx need to explicitly install buildx as it has a separate package now (docker 23 makes buildx the default builder for linux too, I'm not sure how that works if buildx isn't even installed by default though)19:49
fungihard dependency on apparmor for debuan-derivatives anyway19:49
clarkbright19:50
fungis/debuan/debian/19:50
clarkband maybe on opensuse but we don't opensuse much19:50
clarkbat this point I think the CI situation is largely sorted out and ianw has started a list for working through prod updates19:50
clarkbprod updates are done manually because upgrading docker implies container restarts19:50
clarkbMostly just a call out topic since these errors have been hitting things all across our world19:51
ianw#link https://etherpad.opendev.org/p/docker-23-prod19:51
clarkbthank you to everyone who has helped sort it out19:51
ianwmost done, have to think about zuul19:51
clarkbya zuul might be easiest in small batches19:52
ianwi'm thinking maybe the regular restart playbook, but with a forced docker update19:52
ianwrolling restart playbook19:52
clarkbya that could work too. A one off playbook modification?19:52
ianwyeah, basically just run a custom playbook19:52
fungithe pad contains list.katacontainers.io (what are we using docker for there?) but not lists.openstack.org19:52
clarkbfungi: we're not I think the entire inventory went in there and has been edited to reflect reality?19:53
corvusthat seems like it should work19:53
fungioh, i see lists.openstack.org is in the not using list19:53
fungilist.katacontainers.io probably just hasn't been checked yet19:53
ianwyeah sorry, i didn't19:53
ianwwhat i would like to do after this is rework things so we have one docker group19:53
fungino worries, i'll take a look19:53
ianwso hosts that run install-docker now are all in that group.  will take a bit of playbook swizzling19:54
clarkbok running out of time and I want to get to ade_lee's topic19:54
clarkb#topic FIPS jobs19:54
ade_lee:)19:54
clarkbspeaking of swizzling 19:54
fungiat this point 866881 needs a second zuul/zuul-jobs19:55
fungireviewer19:55
fungithe rest of the changes are ready to merge once that does?19:55
clarkb#link https://review.opendev.org/c/zuul/zuul-jobs/+/86688119:55
clarkb#link https://review.opendev.org/c/openstack/project-config/+/87222219:55
ade_leeI think so yes19:55
clarkbhttps://review.opendev.org/c/openstack/openstack-zuul-jobs/+/87222319:55
fungiianw and i +2'd the later changes ready to approve once the zuul-jobs change is in19:56
clarkband the tldr here is the jobs are getting reorganized to handle pass to parent and early fips reboot needs. They should emulate how our jobs for docker images are set up19:56
clarkbright?19:56
ade_leeyup19:56
fungimore to handle the need for secret handling in the new role that handles ubuntu advantage subscriptions19:56
clarkbah right thats the bit that needs the secret and uses pass to parent19:57
fungiua just ends up being a prerequisite for fips on ubuntu19:57
fungisince it requires a license to get the packages19:57
fungi(which opendev has been granted by canonical in order to make this work)19:58
clarkbsounds like mostly just need reviews at this point. I'll try to review today if I don't run out of time.19:58
clarkb#topic Open Discussion19:58
clarkbAny last minute concerns or topics before we can all go find a meal?19:58
ade_leeclarkb, that would be great - thanks!19:58
fungiwe're running into dockerhub tag pruning issues which are blocking deployment from image updates19:59
clarkbianw has a change to aid in debugging that19:59
fungijust a heads up to people who haven't seen the discussion around that yet19:59
clarkb#link https://review.opendev.org/c/zuul/zuul-jobs/+/87284219:59
fungias soon as that's worked out we'll have donor logos on the main opendev.org page20:00
ianwalso speaking of distro deprecated things20:00
ianw#link https://review.opendev.org/c/opendev/system-config/+/87280820:00
ianwwas one to stop using apt-key for the docker install ... it warns on jammy now20:00
fungithanks for fixing that20:00
clarkband reminder I'll be afk tomorrow20:00
clarkbthats our hour. Thanks everyone20:01
clarkb#endmeeting20:01
opendevmeetMeeting ended Tue Feb  7 20:01:18 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)20:01
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2023/infra.2023-02-07-19.01.html20:01
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2023/infra.2023-02-07-19.01.txt20:01
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2023/infra.2023-02-07-19.01.log.html20:01
clarkbWe didn't get to a couple of topics but the sqlalchemy one isn't urgent and the others didn't really have updates20:01
clarkbbut feel free to bring them up in #opendev or on the mailing list if I'm mistaken20:01
