Monday, 2023-03-13

*** jpena|off is now known as jpena08:01
*** ykarel is now known as ykarel|lunch09:01
*** ykarel|lunch is now known as ykarel10:27
*** d34dh0r5- is now known as d34dh0r5313:45
opendevreviewJulia Kreger proposed openstack/diskimage-builder master: Correct boot path to cover FIPS usage cases  https://review.opendev.org/c/openstack/diskimage-builder/+/87619214:42
clarkbI think the auto update of our haproxy deployment may have reenabled the gitea01-04 backends (though cacti show them as fairly idle the container restarted). I'll push up a change to get those pulled out of config for haproxy as soon as local updates and reboots complete15:20
fungigreat catch, i hadn't thought to check back behind that15:28
opendevreviewClark Boylan proposed opendev/system-config master: Remove gitea01-04 from our haproxy config  https://review.opendev.org/c/opendev/system-config/+/87729615:33
opendevreviewClark Boylan proposed opendev/system-config master: Remove gitea01-04 from Gerrit replication  https://review.opendev.org/c/opendev/system-config/+/87729815:35
opendevreviewClark Boylan proposed opendev/system-config master: Remove gitea01-04 from configuration management  https://review.opendev.org/c/opendev/system-config/+/87730115:38
clarkbNote the depends on in ^ is something we can probably land today?15:38
fungiyep, approved it15:41
clarkbthanks.15:42
clarkbhttps://www.phoronix.com/news/ipmitool-GitHub-Suspended cc JayF 16:43
fungioof, that's no good16:51
opendevreviewMerged opendev/system-config master: Switch borg backup from gitea01 to gitea09  https://review.opendev.org/c/opendev/system-config/+/87647116:52
fungisystem-config-run-base failing again. looking into it16:55
fungioh, failed in check, rogue vm in rax-iad from the looks of it16:58
clarkboh is jayf out this week? I should probably share that in the ironic channel17:05
fungiright, he did post to the ml saying he'd be gone17:12
clarkbthat tox + pytest situation is really interesting. More reason to use nox I guess <_<17:17
opendevreviewClark Boylan proposed opendev/git-review master: Switch from tox to nox  https://review.opendev.org/c/opendev/git-review/+/87165217:19
clarkbthat's a rebase to address a conflict introduced by the gerrit 3.4.4 swap17:19
*** jpena is now known as jpena|off17:20
opendevreviewClark Boylan proposed opendev/git-review master: Switch to Gerrit 3.7 in testing  https://review.opendev.org/c/opendev/git-review/+/87731317:21
clarkband ^ is something I meant to do previously just to see if it works at all17:21
fungiwill be interesting to see the results of that17:21
clarkbfungi: the git-review + gerrit 3.7 change passes testing18:01
clarkbpersonally I think I lean towards ensuring git-review works with newer gerrit since we're not making many changes to git-review at this point but gerrit does make changes. It also helps ensure we don't fall behind again. But there is some value in ensuring you're compatible with old gerrit too, particularly when you do make functional changes to git-review18:02
fungiwe could also parameterize it i suppose, and test both?18:03
fungimaybe drop the intermediate python versions, which would keep the job count the same18:03
fungiso we test oldest/newest python with oldest/newest gerrit18:04
fungiprobably worth bumping the upper python version too, at this point18:04
clarkbyup that is an option18:05
fungishould probably decide on the switch to nox first though, in order to not create more churn for tox config18:05
clarkbsomething like python3.6 + Gerrit 3.4 and python3.6 + Gerrit 3.7 and python3.11 + Gerrit 3.4 and python3.11 + Gerrit 3.718:06
fungiyes, exactly18:06
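The oldest/newest pairing described above can be enumerated mechanically. The following is a minimal Python sketch, not git-review's actual nox or Zuul configuration; the constant and function names are illustrative assumptions, and only the version numbers come from the discussion.

```python
from itertools import product

# Version bounds quoted in the discussion; the names below are illustrative.
PYTHON_BOUNDS = ("3.6", "3.11")
GERRIT_BOUNDS = ("3.4", "3.7")


def bounds_matrix(pythons=PYTHON_BOUNDS, gerrits=GERRIT_BOUNDS):
    """Return every (python, gerrit) pairing of the supplied versions."""
    return list(product(pythons, gerrits))


if __name__ == "__main__":
    for py, gerrit in bounds_matrix():
        print(f"python{py} + Gerrit {gerrit}")
```

Testing only the bounds keeps the job count at four, the same number that testing four Python versions against a single Gerrit would need.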
clarkbyup. I'm personally comfortable running nox locally. I also like that nox seems to be simpler but with more expressive tooling when necessary18:06
fungiprobably no need to override the golden site number, just make it so we can pass in a warfile url override?18:06
clarkbya I think you define a default and only bump the golden site number when that default changes.18:07
clarkbbut allow overrides18:07
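One way to read the "default plus override" idea is sketched below. This is only an illustration under stated assumptions: the environment variable GERRIT_WAR_URL, the placeholder URL, the version constant, and the function name are all hypothetical and are not taken from git-review's test suite.

```python
import os

# Hypothetical placeholders, not git-review's real settings.
DEFAULT_WAR_URL = "https://example.org/gerrit-default.war"
GOLDEN_SITE_VERSION = 1  # bumped only when DEFAULT_WAR_URL itself changes


def resolve_war_url(environ=None):
    """Return the override URL if one is set, otherwise the default."""
    env = os.environ if environ is None else environ
    # An empty override is treated the same as no override.
    return env.get("GERRIT_WAR_URL") or DEFAULT_WAR_URL
```

With this shape, a CI job can point at a different Gerrit by setting the override, while local runs without it keep using the default.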
fungiright, just have to decide whether we want that default to be old gerrit or new gerrit18:07
fungii guess it's a question of which would we prefer people test with locally18:07
clarkbno real preference there if CI checks the bounds18:09
fungii guess keeping the older one as default helps catch situations where people are introducing features which aren't supported until later18:11
fungiwhich was what prompted all of this in the first place18:11
clarkb++18:11
fungithat's as close to a "good reason" as i can come up with, and it's kinda flimsy18:11
fungibut better than nothing i guess18:12
opendevreviewMerged opendev/system-config master: Remove gitea01-04 from our haproxy config  https://review.opendev.org/c/opendev/system-config/+/87729618:19
clarkbI was thinking a bit about when to remove gitea01-04. Usually we've got a big rush around the openstack release as people pull the release. Maybe clean up the old servers at the end of next week after that rush?18:20
clarkbNot sure how conservative we should be as it's still a bit unknown to me if the cpu steal problem and/or iowait is an actual problem18:20
corvusyeah, i think there may be a backlog in the bridge -- i'm seeing oftc -> matrix messages a bit later18:22
opendevreviewClark Boylan proposed opendev/git-review master: Test old and new Gerrit  https://review.opendev.org/c/opendev/git-review/+/87731320:13
opendevreviewClark Boylan proposed opendev/git-review master: Test Python bounds only  https://review.opendev.org/c/opendev/git-review/+/87732120:13
clarkbsomething like that maybe for git-review20:13
opendevreviewClark Boylan proposed opendev/git-review master: Test old and new Gerrit  https://review.opendev.org/c/opendev/git-review/+/87731320:15
opendevreviewClark Boylan proposed opendev/git-review master: Test old and new Gerrit  https://review.opendev.org/c/opendev/git-review/+/87731320:17
ianwclarkb: if you have some time to loop over https://review.opendev.org/q/topic:jammy-dns that would be good20:22
ianwi haven't fully finished https://etherpad.opendev.org/p/2023-opendev-dns because some of it, like variable names etc. depend on the stack ^ for an actual checklist20:23
ianwbut i could probably work on starting replacement servers asynchronously anyway to get them ready20:23
ianwi was thinking it would be good to have one in rax and one in vexxhost.  only need to be very small instances20:24
opendevreviewClark Boylan proposed opendev/git-review master: Test old and new Gerrit  https://review.opendev.org/c/opendev/git-review/+/87731320:24
clarkbianw: I think we already do the split clouds for them so ++ to continuing that20:24
ianwyeah, it's split like that now20:25
clarkbthe swap NS records step happens via our registrar to update opendev.org at the .org level right?20:26
fungiyeah, spreading our authoritative resolvers between multiple providers makes sense for resilience, which we already do (ns1 in rax, ns2 in vex) 20:26
fungiseems you agree with the approach we originally used ;)20:27
ianwhaha yes20:29
ianwclarkb: yeah, but they should match what's in the opendev.org zone file too20:30
clarkbianw: right I'm just trying to map the sequence of events in my head. Do we add the new NS servers to the zone first then update registrar for top level zone then remove our NS servers?20:32
clarkbor does it not matter so much and we can just be out of sync for a bit as long as both sets of servers have accurate enough data and listen on port 53?20:33
ianwi think we can change the registrar, and then change the zone file, as long as we don't turn off ns1,ns2 in the mean time20:36
clarkbmakes sense20:36
fungiyeah, either order works20:36
fungiwhat we supply to the registrar ends up injected as glue records into the .org tld zone20:36
clarkbianw: for https://review.opendev.org/c/opendev/system-config/+/876936/ I think we should see if corvus has a minute to look at it. corvus wrote the initial implementation and may be able to fill in why those names were chosen20:36
clarkbfungi: right and those are ultimately what an NS query would look up right?20:37
clarkbbecause of the whole chicken and egg problem20:37
fungithat and the addition of ns records to our zones can happen in parallel, but the registrar may balk if we aren't already publishing ns records so i'd update the zonefiles first20:37
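The ordering concern in this exchange can be illustrated with a small model: at every step, each name server referenced by the registrar or by the zone file should still be answering. This is a rough Python sketch of that invariant, following the "zonefiles first" order suggested above. ns1 and ns2 are the existing servers; ns3 and ns4 are hypothetical names for the replacements, and the function name is made up for this example.

```python
def delegation_is_safe(registrar_ns, zone_ns, serving):
    """True if every name server referenced anywhere is still answering."""
    return (set(registrar_ns) | set(zone_ns)) <= set(serving)


OLD = {"ns1", "ns2"}
NEW = {"ns3", "ns4"}  # hypothetical names for the replacement servers

# Each step: (NS set at registrar, NS set in zone file, servers answering).
STEPS = [
    (OLD, OLD, OLD | NEW),        # replacements are up and serving the zone
    (OLD, OLD | NEW, OLD | NEW),  # zone file publishes the new NS records
    (NEW, OLD | NEW, OLD | NEW),  # registrar switched to the new servers
    (NEW, NEW, OLD | NEW),        # old NS records dropped from the zone
    (NEW, NEW, NEW),              # only now are ns1/ns2 turned off
]

assert all(delegation_is_safe(*step) for step in STEPS)
```

Swapping the second and third steps also satisfies the check, which matches the observation that either order works as long as ns1 and ns2 stay up in the meantime.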
ianw++ that is an optional change, it just made a bit more sense to me laid out like that20:37
fungii'll note that i haven't looked at any dns server related changes yet but would like to20:38
ianwfungi: please do :)  https://review.opendev.org/q/topic:jammy-dns is the stack that is all just mechanical stuff.  the checklist @ https://etherpad.opendev.org/p/2023-opendev-dns isn't complete but i will update that and ask for review "soon"20:39
clarkbfungi: I think jammy ssh stuff is broken against gerrit 3.4.4 :) (this is the ssh-rsa problem so we can fix it with a different key type)20:40
fungibetter than the bionic ssh problem with gerrit 3.4.5 and later, which doesn't have a great workaround20:41
fungithough interesting the py36 job would have passed in that case20:41
fungimaybe i haven't fully grokked the problem there20:42
opendevreviewClark Boylan proposed opendev/git-review master: Test old and new Gerrit  https://review.opendev.org/c/opendev/git-review/+/87731320:43
clarkbfungi: MINA 2.7 or 2.8 fixed the issue you have with 3.4.5. But as far as I can tell the MINA version that fixes things was never backported to 3.4 newer than 3.4.520:44
ianw(we could also upgrade these in place and just move on ... but i think it has some value rotating these, and having a checklist for how we did it, so that if we ever actually *need* to replace these things we have a template that worked at least once)20:44
fungiclarkb: it was supposedly fixed in mina-sshd 2.8 yeah20:44
clarkbianw: I agree, I think this is better hygiene. Particularly for DNS servers it's good to do that20:45
clarkbfungi: 3.4 is EOL now so won't ever get those updates either20:46
clarkb3.5 should go eol in a couple of months (when 3.8 is released)20:47
fungiwish we had a good idea of which gerrit versions are in use in the wild20:47
ianwfungi: there was a message in the chat the other day something along the lines of "we are still on our 2.6 port" ... openafs is one i look at occasionally that's 2.1320:49
clarkbianw: I think 2.6 is broadcom? it's one of the well involved companies so I worry less about them. They know they are behind and are working to upgrade20:49
ianw"that's a version of the replication plugin that we backported to work with 2.7 Gerrit because we're not yet off our fork..." ... that was it20:50
fungii guess similar to the openstack testing and pyca/cryptography version discussion, we (git-review maintainers) can accept bug reports for regressions with old gerrit versions even if they're too old for us to easily test in an automated fashion20:50
clarkb++20:51
ianwthat sounds like a very reasonable position20:51
clarkbthey can also use old versions of git review which were long battle tested against old gerrit20:51
fungisure, but i'd like for them to still be able to take advantage of newer features/fixes too (at least those that don't depend on newer gerrit)20:59
fungiparticularly since we may need to make future adjustments to work with newer python interpreters or git on the client side, which is independent from what gerrit version the servers they interact with might run21:00
opendevreviewMerged openstack/project-config master: gerrit/acl : Convert Backport-Candidate to submit-requirements  https://review.opendev.org/c/openstack/project-config/+/87599321:32
clarkbianw: did you want me to followup to arm with an etherpad where we can start drafting something? I was going to send an email last week then totally spaced on it and see the todo item on the list now22:01
ianwheh i was just thinking of that too22:01
ianwmaybe we can work @ https://etherpad.opendev.org/p/arm-marketing-statement22:02
corvusclarkb: ianw i think ianw's names are fine, and really the server names could be adjusted too.  i +1d the change based on that feedback, but i have not done detailed analysis of ansible, etc, so refrained from +222:06
clarkbcorvus: thanks! I think double checking the intent was mostly what we needed. We can go over the ansible22:08
ianwindeed, thanks22:19
clarkbianw: if you're starting to draft something maybe you want to followup to the email?22:22
ianwclarkb: i think https://etherpad.opendev.org/p/arm-marketing-statement is a first attempt?22:22
ianwyeah, i don't know if that should say openinfra somewhere too maybe?22:24
ianwthe link they have https://git.airshipit.org/cgit doesn't work, on https://openinfra.dev/projects22:25
clarkbianw: I tried to make an edit that incorporates something about the open infra foundation22:26
ianw++ i think that's probably pretty good now22:29
ianwi can reply to the mail22:30
clarkbianw: let me see if foundation is ok with their name in there22:31
clarkb(I'm sure it's fine but I can ask really quickly)22:31
ianw++ good idea22:31
ianwor have someone with a marketing degree write something even better, haha22:31
ianwok, the Backport-Candidate labels are now all using s-r too; has deployed https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_53d/875993/6/deploy/infra-prod-manage-projects/53d1f0b/manage-projects.yaml.log22:41
clarkbianw: I've been told someone will take a look but it might not be until tomorrow morning relative to North American time zones22:43
ianwno worries, let's reply after that to avoid confusion over what's final or not22:44
clarkb++22:44
opendevreviewMerged openstack/project-config master: gerrit/acl : handle key / values with multiple =  https://review.opendev.org/c/openstack/project-config/+/87599422:48
clarkbI've just made updates to our meeting agenda. Please let me know if I've forgotten anything22:59
clarkbI'm following up on the gitea01 to gitea09 borg backup move and I think the reason we haven't updated gitea09 to do backups yet is we need the service-borg-backup playbook to run and my change to update the inventory and host vars didn't trigger it. It should run at like 0400 UTC in about 5 hours I think?23:07
clarkbI'll look at things tomorrow in that case23:07
ianwhuh, that might be an oversight23:17
clarkbgitea load averages look much more stable now. gitea11 is busy too just without the load fun23:39
opendevreviewMerged opendev/system-config master: dns: remove old openstack.org nameservers from iptables list  https://review.opendev.org/c/opendev/system-config/+/87690823:44
opendevreviewMerged opendev/system-config master: Remove unused adns1/ns* host_vars files  https://review.opendev.org/c/opendev/system-config/+/87690923:44
opendevreviewMerged openstack/project-config master: gerrit/acl : Convert Review-Priority to submit-requirements  https://review.opendev.org/c/openstack/project-config/+/87599523:54
ianw^ since -2 blocks on that one it's a bit more "active" than the other s-r rules.  no booleans, but just keep an eye out for any complaints (hasn't deployed yet)23:55
Clark[m]It's too bad we didn't coordinate the AND thing with April 1st :)23:58
