Thursday, 2021-10-14

*** dviroel|rover is now known as dviroel|out00:19
ianwwell that's annoying.  bionic can't mount the xfs partition on the centos-9 stream ISO00:48
*** diablo_rojo is now known as Guest281100:53
fungihow is it an iso if it has a partition formatted xfs instead of isofs?00:54
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Update centos element for 9-stream  https://review.opendev.org/c/openstack/diskimage-builder/+/80681900:54
opendevreviewIan Wienand proposed openstack/diskimage-builder master: [dnm] testing centos 8 image builds  https://review.opendev.org/c/openstack/diskimage-builder/+/81391200:54
opendevreviewIan Wienand proposed openstack/diskimage-builder master: epel: match replacement better  https://review.opendev.org/c/openstack/diskimage-builder/+/81392200:54
fungii guess they got tired of worrying about rockridge extensions and all that, and assume nobody's booting them with a physical cd-rom drive these days anyway01:00
fungibut still, you'd think a different term would be in order01:00
ianwfungi: yeah, even more confused when it does double-duty with usb keys etc.01:14
fungiopenbsd just puts a .fs extension on its bootable install images01:17
opendevreviewIan Wienand proposed openstack/diskimage-builder master: [dnm] testing centos 8 image builds  https://review.opendev.org/c/openstack/diskimage-builder/+/81391201:27
ianw"skopeo --insecure-policy copy --all docker://127.0.0.1:2021/10/14 01:17:23 socat[16] W ioctl(5, IOCTL_VM_SOCKETS_GET_LOCAL_CID, ...)" ... does this ring any bells01:29
ianwit looks like one of the arguments to skopeo has a bad substitution01:29
ianwhttps://zuul.opendev.org/t/openstack/build/b7efb8aea39345ff9bc09ea8f884ceee01:29
Clark[m]ianw we skopeo through a socat ipv4 to ipv4 proxy/tunnel because neither skopeo nor docker want to support ipv6 address literals01:37
Clark[m]I think we store the result of the socat listen address in an Ansible var but the socat failed and that got stored in the var instead?01:38
Clark[m]Maybe look earlier in the job where the socat is started and see what the result of that is?01:38
ianwhttps://zuul.opendev.org/t/openstack/build/b7efb8aea39345ff9bc09ea8f884ceee/console#4/0/2/localhost ... that appears happy01:41
ianwthough indeed the root error is in https://zuul.opendev.org/t/openstack/build/b7efb8aea39345ff9bc09ea8f884ceee/console#4/0/2/localhost01:41
ianwSet socat port01:41
Clark[m]Hrm we did just update the zuul images to bullseye from buster with the last restart. Maybe related?01:43
ianw$ socat -d -d TCP-LISTEN:0,fork TCP:198.72.124.102:500001:46
ianw2021/10/14 12:45:48 socat[488127] W ioctl(5, IOCTL_VM_SOCKETS_GET_LOCAL_CID, ...): Inappropriate ioctl for device01:46
ianw2021/10/14 12:45:48 socat[488127] N listening on AF=2 0.0.0.0:4722701:46
ianwthis could be it.  looks like it might have an extra warning line now01:46
Clark[m]"neat" I guess we need to filter that out?01:46
Clark[m]W for warning?01:47
ianwyep, looks like it01:48
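[editor's note] The failure mode discussed above, where a new warning line precedes the "listening on" notice, can be sketched in Python. This is a minimal illustration and not the change proposed in 813924: the function name and regex are assumptions, and the sample lines are the ones ianw pasted.

```python
import re

# Matches socat's notice line, for example:
#   2021/10/14 12:45:48 socat[488127] N listening on AF=2 0.0.0.0:47227
LISTEN_RE = re.compile(r"socat\[\d+\] N listening on AF=\d+ \S+:(\d+)$")


def find_listen_port(lines):
    """Return the port from the first 'N listening on' line, or None.

    Hypothetical helper: warning (W) lines and anything else that does not
    match the notice format are skipped instead of being taken as the result.
    """
    for line in lines:
        match = LISTEN_RE.search(line.strip())
        if match:
            return int(match.group(1))
    return None


sample = [
    "2021/10/14 12:45:48 socat[488127] W ioctl(5, IOCTL_VM_SOCKETS_GET_LOCAL_CID, ...): Inappropriate ioctl for device",
    "2021/10/14 12:45:48 socat[488127] N listening on AF=2 0.0.0.0:47227",
]
port = find_listen_port(sample)
print(port)
```

Matching on the notice line itself, rather than taking the first line of output, keeps the parse stable if socat adds further warnings later.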
opendevreviewIan Wienand proposed zuul/zuul-jobs master: intermediate-registry: handle socat warning out  https://review.opendev.org/c/zuul/zuul-jobs/+/81392402:02
ianwfocal can't mount the partition in the 9-stream ISO either02:05
ianwhttps://zuul.opendev.org/t/openstack/build/3571002a75b44d22b7f0969d96417113/log/logs/centos_9-stream-build-succeeds.FAIL.log#35402:05
Clark[m]Does mount need an fs type hint? Though I think it relies on whatever the equivalent of block device magic numbers are for that?02:09
ianwit's something deeper, and has to do with checksums or something in the on-disk format aiui02:10
Clark[m]The release for 9-stream hasn't happened yet right? I wonder if this is the sort of thing that is open for change still02:11
ianwyeah, i mean the fs being mountable by anything other than the kernel on the iso is probably deep into "don't do that" territory ...02:12
Clark[m]But the reason we have standard filesystems is so that they are portable?02:14
Clark[m]Or maybe I misunderstood02:14
Clark[m]I mean you can even mount a windows fs on Linux now02:14
ianwi'm sure a more recent kernel can, but for now ... 02:17
ianwi guess this means you can't build centos element on ubuntu.  this is annoying, but perhaps inevitable 02:18
Clark[m]Presumably if Ubuntu updates their xfs drivers it would work?02:20
Clark[m]Or is it not even that sort of problem? Can centos 8 mount it?02:20
Clark[m]Its kernel is older than focals I think02:20
Clark[m]But perhaps with backported xfs magic02:21
ianwyeah, i imagine it does, it's all checksumy stuff and i imagine is rhel focused02:21
opendevreviewIan Wienand proposed openstack/diskimage-builder master: [dnm] testing centos 8 image builds  https://review.opendev.org/c/openstack/diskimage-builder/+/81391202:25
ianwhttps://zuul.opendev.org/t/openstack/build/c71ed47d25b34c8f95a016a1d10f47f9/log/logs/centos_9-stream-build-succeeds.FAIL.log#35402:34
ianwit looks like centos-8 can't mount it either02:34
fungikevinz: thanks for the cleanup help!02:40
fungiwow, moments ago i got notification that vorlon won't-fixed an ubuntu bug i opened more than 8 years ago, asking for an sru to get security fixes in the cacti packages for precise: https://launchpad.net/bugs/121082202:46
Unit193There was a mass-close since precise is ESM-EOL.03:14
ianwi've filed https://bugzilla.redhat.com/show_bug.cgi?id=2013894 on the weird-xfs-in-image problem03:18
*** bhagyashris is now known as bhagyashris|out03:30
ianwoh, I suppose that "socat 2>&1 | grep 'foo' > /file" means that grep isn't actually flushing04:20
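[editor's note] The buffering problem ianw describes is that a filter writing to a file (not a terminal) typically buffers its output in blocks, so a matching line may not reach the file while the producer is still running. GNU grep has a --line-buffered option for this; whether the zuul-jobs fix uses it is not stated in this log. A minimal Python sketch of the same idea, with a hypothetical helper name, flushes after every matching line:

```python
import io


def copy_matching_lines(source, sink, needle):
    """Copy lines containing needle from source to sink, flushing each one.

    Hypothetical helper for illustration only.
    """
    count = 0
    for line in source:
        if needle in line:
            sink.write(line)
            # Flush right away so a reader polling the sink sees the line
            # even while the producer (e.g. socat) keeps running.
            sink.flush()
            count += 1
    return count


# In-memory stand-ins for the socat pipe and the output file.
source = io.StringIO("W some warning\nN listening on AF=2 0.0.0.0:47227\n")
sink = io.StringIO()
copied = copy_matching_lines(source, sink, "listening on")
```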
fungiUnit193: aha, i didn't realize it was still under esm until now!04:22
Unit193I'm not sure it was, but I got a bug closed for pianobar too (one I didn't file.)04:22
fungiit was mostly a fun trip down memory lane04:24
fungibeen ages since we had any servers running precise (thankfully)04:25
Unit193Poked the maintainer about the pastebinit patches again btw, not having much luck on that.  I've seen him active on IRC elsewhere, so I may have to annoy him until he does something. :/04:26
opendevreviewIan Wienand proposed zuul/zuul-jobs master: intermediate-registry: handle socat warning out  https://review.opendev.org/c/zuul/zuul-jobs/+/81392404:27
*** ysandeep|out is now known as ysandeep04:53
opendevreviewIan Wienand proposed zuul/zuul-jobs master: intermediate-registry: handle socat warning out  https://review.opendev.org/c/zuul/zuul-jobs/+/81392405:04
*** ykarel|away is now known as ykarel05:19
ykarelianw, hi05:20
ykarelianw, while testing c9-stream image build for centos-minimal in dib-functests05:21
ykarelin https://review.opendev.org/c/openstack/diskimage-builder/+/81139205:21
ykarelwe are hitting issues05:21
ykarelhttps://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_d03/811392/6/check/dib-functests-bionic-python3/d034e0c/logs/centos-minimal_9-stream-build-succeeds.FAIL.log05:21
ykareli checked that for c9 rpms need rpm version to be > 4.1605:22
ykarelto avoid these errors05:22
ykarelbut on bionic/focal rpm version is too old05:22
ykarelcan you suggest on how to handle this situation or any other distro build hit similar situation yet05:22
ykareli can try to build on c8-stream node if that works05:23
ianwykarel: yeah, i imagined this would be the case ... it is similar with current fedora05:44
ianwabout the only option I think we have is to use the containerfile element like the current fedora does05:45
ianwthis takes an upstream container image and uses it as the initial chroot environment05:46
ianwnote the best way i'd suggest to actually do this is to use the nodepool-builder image, and just call dib out of that05:48
ianwbecause this needs podman setup, and still needs other build tools etc.  all that work is done in the nodepool-builder image, because it's the one we're using to create the gate images05:48
ykarelianw, ok i can explore how current fedora is being built in ci and apply similar for c9-stream.05:49
ykarelfor other options i am not sure where to look at05:49
ianwykarel: see also my comments on https://review.opendev.org/c/openstack/diskimage-builder/+/806819/05:50
ianwit's a similar problem, where no current distro can mount the fs in the centos 9-stream qcow2 image05:50
ykarelianw, yes i am aware about this issue05:51
ykarelwe hit that while virt-customizing c9-image on c7 host05:52
ykareland to get rid of the issue we had to use custom appliance05:52
ykarelhttps://ykarel.fedorapeople.org/appliance-1.45.6.tar.xz05:52
ykarelsimilar we had hit for c8 image virt-customize on c7 but later kernel features were backported to c7 and issue went away05:53
ykareland for c9 i have heard those new filesystem features will not be backported05:53
ykarelbut yes some one should reply officially over your bz05:54
ianwyeah, i think it's unlikely they will agree to turn "down" the features, but at least we'll have something to point to05:54
ianwwhile there are other ways to build images obviously, everyone has there silo.  i'm still unaware of anything that has as broad platform support as dib05:55
ianwfor all its faults05:55
ianw"their silo" even05:55
* ykarel_ got disconnected05:58
ykarel_ok i will explore on these options05:58
ianwykarel: see https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/fedora-container for the container approach05:59
ianwan ingenious idea credited to corvus05:59
ykarel_ianw, THanks will check in some time, in a meeting currently06:01
ianwheh no rush.  but let me know; i certainly think the whole ecosystem will benefit by pushing in this direction06:02
*** ykarel_ is now known as ykarel06:13
gthiemongeHi Folks, are there any known issues with the gates and stable/train? I have an octavia-tempest-plugin change that runs on stable/train and all those jobs are failing https://zuul.opendev.org/t/openstack/status#69845006:53
gthiemongeThere are similar backtraces on stable/train in other projects: https://zuul.opendev.org/t/openstack/build/a403d641fc7e4d689bc34032c416252a https://zuul.opendev.org/t/openstack/build/b850ddc4e2f942288c610cd99e6738e006:53
fricklergthiemonge: that looks like fallout from PyYAML-6.0 which was recently released. we might need to cap that for stable branches07:13
fricklerbut also devstack-gate should be considered eol really07:15
frickleroh, that role is from devstack-gate natively, which is branchless even. so cap that, but another nail in the coffin I'd say07:25
fricklerkopecmartin: gmann: ^^07:25
*** jpena|off is now known as jpena07:32
opendevreviewyatin proposed openstack/diskimage-builder master: [WIP] Add support for CentOS Stream 9 in DIB  https://review.opendev.org/c/openstack/diskimage-builder/+/81139207:58
fricklerfor those following only here, I made a ds-gate patch and am testing it in https://review.opendev.org/c/openstack/octavia/+/813961 , from the console log it looks good so far08:09
*** ykarel is now known as ykarel|lunch08:14
*** ysandeep is now known as ysandeep|lunch08:35
opendevreviewIan Wienand proposed openstack/diskimage-builder master: [dnm] testing centos 8 image builds  https://review.opendev.org/c/openstack/diskimage-builder/+/81391209:07
*** ykarel|lunch is now known as ykarel|lunc09:27
*** ykarel|lunc is now known as ykarel09:27
fricklerinfra-root: something seems borked between zuul and nodepool. lots of jobs waiting, lots of ready nodes in nodepool. since ~2h09:38
*** ysandeep|lunch is now known as ysandeep09:39
fricklerthese seem to be about when the issue started, umpteen of them in zuul log https://paste.opendev.org/show/810000/09:43
fricklernot sure whether I should try to restart or give someone a chance to debug09:44
fricklerseems this is where things went south https://paste.opendev.org/show/810001/09:51
fricklerlooks like https://docs.opendev.org/opendev/system-config/latest/zuul.html#restarting-the-scheduler isn't up2date, but I'm trying now based on root history on zuul.o.o10:00
frickler#status notice zuul was stuck processing jobs and has been restarted. pending jobs will be re-enqueued10:01
opendevstatusfrickler: sending notice10:01
-opendevstatus- NOTICE: zuul was stuck processing jobs and has been restarted. pending jobs will be re-enqueued10:01
ykarelianw, i tried centos-minimal 9-stream on c8-stream node10:07
ykarelit worked fine https://ba4aa52692f4678a0a1b-3dd866e87c38d155f83264836a748517.ssl.cf5.rackcdn.com/811392/7/check/dib-functests-centos-8-stream-python3-image/bb7b147/logs/centos-minimal_9-stream-build-succeeds.PASS.log10:07
iceyout of curiosity, does #opendev have a script used to persist and then re-enqueue jobs when Zuul gets restarted?10:09
fricklericey: yes, it is referenced in the above restarting doc, let me check where it is located in git10:15
frickleroh, the link is right in there https://opendev.org/zuul/zuul/src/branch/master/tools/zuul-changes.py10:16
icey:-D10:16
fricklerso actually part of zuul proper, nothing specific to opendev10:17
fricklerenqueue still running. now that I look at it, I maybe should have dropped the periodic jobs ...10:56
*** dviroel|out is now known as dviroel|rover11:09
*** jpena is now known as jpena|lunch11:27
*** akahat is now known as akahat|afk11:54
*** ysandeep is now known as ysandeep|afk11:58
*** jpena|lunch is now known as jpena12:22
*** ysandeep|afk is now known as ysandeep12:55
*** ysandeep is now known as ysandeep|brb12:57
*** bhagyashris|out is now known as bhagyashris|mtg13:00
*** ysandeep|brb is now known as ysandeep13:03
*** akahat|afk is now known as akahat13:10
crohmannHey there. I just upgraded to OpenSSH 8.8 and noticed that I cannot connect to review.opendev.org:29418 anymore due to the rsa-sha deprecation on my end. After adding "PubkeyAcceptedAlgorithms +ssh-rsa" for this target it's back. This might be worth looking into as others might run into this sooner or later as well.13:27
Clark[m]crohmann: it is a known issue as gerrit's Mina sshd does not implement key exchange extensions used to negotiate sha2 hashes over the default sha1 fallback. However, the RFC does say clients may fallback to sha2 instead https://datatracker.ietf.org/doc/html/rfc8332#section-3.313:43
Clark[m]crohmann: if you can double check whether you are falling back to ssh-rsa or rsa-sha2-* that would be helpful in knowing whether or not Mina doesn't handle a sha2 fallback properly13:44
Clark[m]I had really hoped that the client fallbacks would be updated when sha1 was deprecated. But it seems that this is getting missed13:45
Clark[m]corvus: fungi: frickler: I think we should try to prioritize ianw's 813924 fix for socat stuff. Also, it seems the issue frickler noticed was related to running out of threads after a zk disconnection? Could that also be impacted by the new bullseye runtime? I think the kernel sets max thread limits and docker containers inherit ulimits from the docker daemon so that seems unlikely13:48
Clark[m]Also, we kept the same version of python in the debian switch. Unlikely to be a new python issue either.13:51
crohmannClark[m]: I played around with it a bit now ... according to ssh -G $hostname sha2 algos are in the default list (i.e. rsa-sha2-512,rsa-sha2-256), but only if I add ssh-rsa it seems to find a common cipher.14:05
Clark[m]crohmann: ya so it must be defaulting to ssh-rsa which that RFC indicates is probably a bug if ssh-rsa is not usable14:06
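[editor's note] The check crohmann describes, reading the effective client configuration from `ssh -G`, can be sketched as a small parser. The helper name and the abbreviated sample output are assumptions for illustration; the option is printed as pubkeyacceptedalgorithms on recent OpenSSH and was called pubkeyacceptedkeytypes on older releases.

```python
def accepted_pubkey_algorithms(ssh_g_output):
    """Extract the accepted public key algorithms from `ssh -G` output text.

    Hypothetical helper: returns an empty list if the option is not present.
    """
    # Older OpenSSH releases name this option pubkeyacceptedkeytypes.
    names = ("pubkeyacceptedalgorithms", "pubkeyacceptedkeytypes")
    for line in ssh_g_output.splitlines():
        key, _, value = line.strip().partition(" ")
        if key.lower() in names:
            return value.split(",")
    return []


# Abbreviated, made-up sample in the shape `ssh -G <host>` prints.
sample_output = """\
hostname review.opendev.org
port 29418
pubkeyacceptedalgorithms ssh-ed25519,rsa-sha2-512,rsa-sha2-256
"""

algorithms = accepted_pubkey_algorithms(sample_output)
print("ssh-rsa" in algorithms, "rsa-sha2-512" in algorithms)
```

In practice the text would come from running `ssh -G review.opendev.org` and passing its standard output to the function. This only shows what the client is willing to offer, not which signature algorithm is finally negotiated with the server.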
corvusClark: there were a lot of tracebacks; perhaps there were a lot of threads too.  frickler, infra-root if it happens again, getting a thread dump by issuing sigusr2 to the main scheduler process would be helpful.14:06
fungicorvus: thanks for looking, will definitely try to get a pair of thread dumps even next time, so we also get the yappi stats14:07
crohmannClark[m]: I can see "remote software version GerritCodeReview_3.3.6-44-g48c065f8b3-dirty (APACHE-SSHD-2.4.0)" during the handshake. Is there anything keeping you from an upgrade to 2.7.0 ?14:09
crohmannseems like with 2.5.1 there was a fix for client side support of RFC 8332 --- see https://issues.apache.org/jira/browse/SSHD-110414:11
crohmannah sorry ... I meant with 2.6.0 ... 2.5.1 was the "affected version".14:11
fungicrohmann: it's embedded by gerrit. we might be able to fork their bazel build scripts to force a newer mina-sshd14:16
Clark[m]I'd rather we wait for Gerrit to update. But also we need server side key exchange extensions support. I meant to follow-up on that bug to indicate I don't think they fixed it14:16
Clark[m]They made their client negotiate but for Gerrit it is the server that lacks support. They fixed the wrong end for us (good to fix both ends though)14:17
Clark[m]Its minimal impact to us because users can use a different key type and the impact goes away.14:17
fungicrohmann: it's also worth noting, depending on your preference, another workaround is to switch to ecdsa14:18
Clark[m]Or ed2551914:18
fungiyeah14:18
fungisome supported ecc14:18
corvuslooking at the zuul logs, it appears that the scheduler was quite busy after the restart.  so i'm very curious what it was stuck on.  but that's hard to determine after the fact considering how busy it was.14:20
Clark[m]But users of these platforms that have killed sha1 RSA should be asking their platforms why the default fallback wasn't changed to sha214:21
corvussorry, not after the restart, i mean after the zk disconnection14:21
*** ykarel is now known as ykarel|away14:21
Clark[m]corvus: periodic jobs maybe? I think the timing isn't a perfect fit but possibly close enough?14:23
corvus2021-10-14 08:21:40,326 DEBUG zuul.Scheduler: Run handler sleeping14:24
corvusthat's after the reconnection, but it never woke up again14:24
corvusi wonder if the event watcher didn't get notified of new events14:25
fungiwe didn't have any tickets from rackspace about outages either14:26
corvusswest mentioned an issue where they think that zk watches didn't survive a disconnection.  that's not supposed to happen, and i haven't been able to reproduce it, but perhaps that happened here too.14:31
Clark[m]crohmann: https://issues.apache.org/jira/plugins/servlet/mobile#issue/SSHD-1141 is the issue that ianw filed for this. Unfortunately I think upstream misunderstood and thought the problem was using their client and not connecting to their server with openssh. We should bump that issue with questions about the server implementing the server sig algs kex.14:35
fungiand also mention openssh 8.8 had officially deprecated ssh-rsa now14:36
Clark[m]I have to do a school run but I can sort out an account if others don't have one and update that issue14:36
fungiyeah, i don't think i have an apache jira account, but also trying to free up to get back to adding tests to the gitea metadata update filter14:37
corvusi think i've gleaned all i can from the logs right now.  hopefully we can get more debug info when it happens again.14:49
fungithanks, i'll try to keep a closer eye on it today14:50
fungiany early warning signs we should be looking for to catch it quickly?14:50
crohmannClark[m]: Thanks for picking up on the ssh issue. Really appreciate that. Should I have raised an issue / bug about this somewhere or was IRC the place to come to?14:51
crohmannfungi: I shall be switching to another key type yes. But still you don't want contributors (think of someone wanting to push just a small documentation fix) to have a hard time debugging their ssh client ...14:55
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388614:57
fungicrohmann: i completely agree. unfortunately solving the problem is not easy, and involves a lot of finger-pointing between the server and client implementations, with us stuck in the middle14:58
fungibasically the newest openssh client release is not compatible with the sshd that gerrit embeds, even though we've been warning them about it for many months14:59
fungiboth the developers of the sshd and the openssh client developers could have made choices which would have avoided this, but neither did15:00
crohmannyeah - I did not want to add to that finger pointing at all. Maybe the necessity to now look at the cipher handshake side of things in SSH again by the Apache Mina devs helps in the long run. This will not be the last cipher or algo that is being deprecated or added.15:01
fungiopenssh apparently decided that disabling ssh-rsa didn't imply switching to a fallback other than the one they deprecated, and the mina-sshd devs seemed to think that just supporting rsa2 was sufficient without adding negotiation15:02
*** bhagyashris|mtg is now known as bhagyashris|away15:03
fungiand the only real answer we've had from gerrit and mina-sshd folks is "well switch to ecc, duh, why do you still care about rsa now that quantum computers totally pwn it? oh also isn't it 2050 yet?"15:04
crohmannClark[m]: I now switched to my ed25519 keypair - just works ;-)15:04
crohmannfungi: I am all about deprecating stuff and getting things forward. But, just like you said, this was never the issue here - it's the lack of negotiation support to find the next best alternative. And that's something that usually is part of any protocol / crypto handshake. Finding a common ground.15:06
fungiyup15:07
crohmannoh well - thanks again for revisiting this issue.15:08
fungithanks for pointing out that people are actually already using openssh 8.8!15:13
fungiyou're the first to say so, and without actual users of an official version's default configuration impacted it's been easier for gerrit and mina-sshd folks to write it off as "a fedora problem" (since fedora disabled it in a similarly broken way in advance of 8.8)15:14
funginow at least we know for sure that openssh 8.8's default is exhibiting the same symptoms15:16
crohmannfull disclosure - I am "just" using Arch Linux.15:19
clarkbya this came up when fedora made the switch with f33 I think. And since then its been a fun back and forth of no one with the ability to actually fix things being particularly interested in fixing things. This more so than quantum computing threats are why I'm starting to use ecc keys15:20
clarkbbasically I'm more worried the software won't get appropriate updates than actual exploitable flaws in the algorithm for the near future15:20
clarkbhttps://issues.apache.org/jira/browse/SSHD-1141 is the actual url that I meant to link before but was on the phone and that mangled the url to a mobile path that doesn't load without auth?15:24
clarkbanyway ^ you can read more there now15:24
clarkbok I posted a followup to that issue15:35
fungithanks!15:39
*** ysandeep is now known as ysandeep|dinner15:45
prometheanfireI'm trying to find the supported python versions for openstack yoga, but can't find docs, anyone have a pointer?15:46
clarkbprometheanfire: https://governance.openstack.org/tc/reference/runtimes/yoga.html15:48
clarkbprometheanfire: I think there may be some ongoing debate for that though15:48
fungiprometheanfire: yeah, that's a placeholder, and what should be added is a topic on the tc's ptg agenda15:48
fungifor now it's effectively identical to xena15:49
clarkbprometheanfire: completely unrelated is gentoo still using dib for image stuff? it came up in a discussion where tripleo suggested that we just use their testing of image builds and we had to point out lots of other platforms use the tool including gentoo :)15:49
fungii still need to work out how to get gentoo working again for nodepool's testing: https://review.opendev.org/771106 and its parent15:50
clarkbfungi: crohmann: I'll draft an email to service-discuss@lists.opendev.org as well to point out the rsa situation, the related issues, and perhaps push back on the client side too :)15:50
clarkbthen we can point people to that if it comes up more as things bump up to openssh 8.815:50
fungier, not nodepool testing, rather zuul-jobs testing15:51
opendevreviewMerged zuul/zuul-jobs master: intermediate-registry: handle socat warning out  https://review.opendev.org/c/zuul/zuul-jobs/+/81392415:52
fungigiven how long system-config-run-gitea has been going on my metadata update filter change, i think it's not actually filtering15:55
clarkbha15:56
fungimassive console log so far too, so yeah i think it's doing all of them15:59
clarkbyay for testing15:59
*** ysandeep|dinner is now known as ysandeep16:00
fungimore likely the cause is how i added the role and overrode the var for it16:01
fungibut it's a good exercise nonetheless, since we'll be driving it from ansible too16:02
clarkbyup16:02
clarkbit actually tests things from top to bottom really well16:02
fungilooks like the job's wrapping up, so i'll peruse the archived logs over lunch16:03
clarkbI've sent that email about ssh keys to service-discuss. here is hoping I used enough keywords to have search engines pop that up for people16:05
clarkbfungi: also if you find time going over https://etherpad.opendev.org/p/project-renames-2021-10-15 would be good16:06
clarkbI copied from the previous etherpad and edited to accommodate updates we've made16:06
fungiyep, thanks16:06
clarkbI should get breakfast myself16:07
*** marios is now known as marios|out16:13
fungiinteresting that the failure on https://zuul.opendev.org/t/openstack/build/093e4171a13a44b2ad791e5025c62e8d was related to cloning dib (fatal: could not read Username for 'https://localhost:3081': No such device or address)... i wonder if updating every repo crashed gitea16:18
Clark[m]Ya we had to back off doing the description updates all the time iirc16:19
Clark[m]We may need to do a throttle?16:20
Clark[m]The gitea logs should be in the job logs16:20
fungiwell, the goal is to only end up refreshing metadata for one repo, so if i get that right then it presumably won't be an issue16:21
Clark[m]++16:23
fungiit was more that i was amused at the resulting failure mode16:25
fungiClark[m]: if you get a moment, can you look at my change to test-gitea.yaml in 813886,3 and let me know if that's what you were suggesting?16:26
fungioh, is the play an associative array? such that specifying import_playbook more than once overwrites the previous one?16:28
fungimaybe i need to add a separate play for the sync16:30
clarkbfungi: yes you need to split them16:32
clarkbI don't know which one will win in that case16:32
fungithanks, that helps anyway16:32
clarkbfungi: you also need to update sync-gitea-repos because it hardcodes true for that value16:33
clarkbI suspect it was updating everything and that is why (implying the sync entry in the dict won?)16:33
clarkband then the error could be due to the rename not happening at all16:34
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388616:34
fungiso importing a playbook and then overriding a var like that doesn't actually override it?16:36
fungiinstead i need to override it everywhere we call the playbook?16:36
*** dpawlik5 is now known as dpawlik16:37
clarkbI'm not expert, but my understanding of import_playbook is basically a copy paste16:37
clarkbthen think of the vars as setting the variable at the top of that copy paste, then inside the copy paste it gets set again to a different value16:38
clarkbroughly foo = 1; def bar(): foo = 2; print foo; inline bar();16:39
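[editor's note] clarkb's one-line pseudocode can be written out as runnable Python. This only models the mental picture given in the chat (the importing play supplies a value, then the imported play sets it again); it is not a statement of Ansible's actual variable precedence rules, and the function is a hypothetical stand-in.

```python
def run_imported_playbook(caller_vars):
    """Hypothetical stand-in for a playbook pulled in via import_playbook."""
    # Value supplied by the play that does the import.
    play_vars = dict(caller_vars)
    # The imported play then sets the same variable again, so the
    # caller's value is no longer visible to the tasks that follow.
    play_vars["always_update"] = True
    return play_vars["always_update"]


result = run_imported_playbook({"always_update": ["opendev/disk-image-builder"]})
print(result)
```

Under this model the caller's list never takes effect, which matches the symptom of the job updating every repository.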
fungiokay, so where else do we call that playbook? i can't find anywhere it's actually referenced. were we only running it by hand?16:39
clarkbfungi: yes it is only run by hand (and hasn't been run in a long time, it was more useful when we were bootstrapping gitea and the tooling was improving so we would rerun to update new fields)16:39
*** ysandeep is now known as ysandeep|out16:39
clarkbbasically that was our get out of jail free card for gitea updates. You ran it and it did a global update16:39
fungiokay, so it's as simple as just dropping the override and maybe adding a comment16:40
clarkbyes, "calling context should set this value" type deal maybe16:41
clarkbyou can do it with ansible-playbook -e foo=bar or the way you've done it16:41
clarkbyou can also do a jinja thing to have it default to another value or something. I personally find ansible vars to be complicated in how they are evaluated. having the calling context set it seems reasonable16:41
*** jpena is now known as jpena|off16:42
clarkbit defaults to true in the role too iirc which means if you don't supply it then you'll get the same behavior as the current playbook anyway16:42
fungiit defaults to false, actually16:43
fungibecause the role gets added elsewhere that we don't want the metadata sync happening16:44
clarkboh so it does, I wonder where I saw something making me think it defaulted to true16:44
clarkbalways_update=dict(type='bool', default=True) <- the python lib defaults to true16:45
clarkbbut the ansible role defaults to false16:45
clarkbthis means if you run the python directly you get a different default than executing the role in ansible. neat16:45
clarkbanyway thats fine, this role isn't used often and is manually invoked. We can make it take parameters16:45
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388616:46
fungiso as simple as that?16:46
fungifuture improvement would be to make it able to consume the same input file as the rename playbook16:47
fungibut being able to pass a literal list for now should at least get us what we need16:47
clarkbya I think that should do it16:48
*** sboyron_ is now known as sboyron17:09
*** sboyron is now known as Guest289717:10
clarkbfungi: I think there is a problem with your change looking at it again. sync-gitea-repos clones openstack/project-config to get an up to date projects.yaml file. Problem is that opendev/disk-image-builder isn't a project in upstream project-config17:12
clarkbhrm actually maybe that is a non issue, we're just going to clone that repo and then ignore it if grepping for project_config_src shows that we're using the /home/zuul/src location everywhere17:14
clarkbIf that is the case I guess a further cleanup would be to drop that clone entirely17:14
fungiahh, yeah i hadn't even spotted that17:15
opendevreviewJeremy Stanley proposed opendev/storyboard-webclient master: Update default contact in error message template  https://review.opendev.org/c/opendev/storyboard-webclient/+/81404117:21
*** sboyron__ is now known as sboyron_17:36
fungiclarkb: looking at https://9a2b40b9abe499183777-8788f9c43324a469e03ac2bb48dfb234.ssl.cf2.rackcdn.com/813886/5/check/system-config-run-gitea/f0cf15b/bridge.openstack.org/ara-report/results/493.html i think you were right about it using the wrong copy of projects.yaml18:02
fungii don't see opendev/diskimage-builder in the output, though it went fast enough i think the filtering was at least in effect18:02
clarkbfungi: I see "project": "openstack/diskimage-builder" not opendev/diskimage-builder18:03
fungier, should be opendev/disk-image-builder even (the rename also adds a -)18:04
clarkbya implying openstack/diskimage-builder came from the old file18:05
fungiright18:05
clarkboh wait no this is right18:05
clarkbbecause again the renames don't operate with projects.yaml18:05
clarkbit's a completely separate input file18:05
fungiyeah, but the sync seems to have used the projects.yaml which has the old repo name, pre-rename18:06
clarkbfungi: I think you want a new step between rename and force update that updates the projects.yaml in place to simulate us landing the thing18:06
clarkbfungi: right, because we aren't making an update to projects.yaml as part of this rename18:06
clarkbas that is a completely separate step that we land after the rename is done in production18:06
fungimakes sense18:06
clarkbI think if you edit /home/zuul/src/opendev.org/openstack/project-config/gerrit/projects.yaml to update openstack/diskimage-builder to opendev/disk-image-builder it may work18:07
clarkband that would simulate us merging the project-config changes after the renames are complete and then manually running the sync repos18:07
clarkbso ya I think in sync-gitea-repos drop the git clone entirely then we know it isn't interfering and in the test-gitea.yaml playbook add a step to do that update in the file in place?18:08
fungii guess i can call sed -i in a task, or does ansible have a sed-like builtin?18:08
clarkbansible does have lineinfile18:09
clarkbhttps://docs.ansible.com/ansible/latest/collections/ansible/builtin/lineinfile_module.html18:09
fungithanks, found some examples18:09
fungiplaybooks/test-update-zuul-description.yaml is a clear one18:10
clarkboh ya we already do something semi similar18:11
clarkbthat job is getting a lot of stuff shoved into it. Might be worthwhile to think about breaking it up into two or three jobs to test different aspects of interacting with gitea. I'll have to ponder that18:12
clarkbjob one might be manage-projects twice to check description update and otherwise noop. job two manage-projects once then rename and do what you are working on18:13
fungishould the git clone in sync-gitea-projects just drop the force: yes so it will create it only if it doesn't already exist?18:16
clarkbfungi: it should be deleted entirely as the path is all wrong and doesn't really get in line with the new way of having zuul manage the repos for us in the jobs18:17
clarkbfungi: basically zuul is keeping project-config up to date in /home/zuul/src/opendev.org/openstack/project-config. We don't need an out of band update unless we're trying to override what is already merged18:17
clarkbbefore zuul synced that for us each job had to update the /opt repo. its a relic basically18:18
*** sboyron_ is now known as sboyron18:21
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388618:23
fungihopefully that ^ then18:23
clarkbfungi: close, your new lineinfile thing needs a - hosts: "bridge"\n  tasks: header thing. The import_playbooks literally copy paste the playbooks in which have that stuff in place18:25
fungioh, righty18:25
clarkbthe test zuul description update gives you all that if you want to copy it18:26
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388618:26
fungiindeed, i had actually deleted it after copying. put back now18:26
clarkbI think that should do it18:28
clarkbhttps://www.phoronix.com/scan.php?page=news_item&px=XFS-Linux-5.10 is the issue the centos 9 xfs fixes I guess. Basically XFS is not Y2038 happy until very new linux.18:29
clarkbI'm going to go out on a limb here and suggest that is a backport worthy feature to have in your filesystem drivers18:29
fungiespecially if you expect those systems to be in use 17 years from now18:30
clarkbif we completely ignore the bootstrapping problem from centos 8 to 9 for building images it seems that having filesystems not break in 2038 is a good thing (tm)18:30
clarkbfungi: ya and importantly an fs may long outlive its operating system18:30
clarkbthe earlier you get support into filesystems the better imo18:30
clarkb*may long outlive its original operating system18:30
fungiright, that's what i meant by "systems"18:31
clarkb++18:31
fungior at least a subset of what i meant by systems18:31
opendevreviewClark Boylan proposed opendev/system-config master: Build Gerrit 3.3.7 images  https://review.opendev.org/c/opendev/system-config/+/81404818:47
clarkbI don't think ^ is urgent, in fact I'd prefer we just do the renames first and land and restart on that next week. But they finally got the release out the door and now we have a change to build it18:47
fungiyeah, we've crept pretty close to the other maintenance by now18:51
fungiand a gerrit patch update should just be a quick restart now that we're on 3.318:52
opendevreviewJeremy Stanley proposed opendev/storyboard-webclient master: Bindep cleanup  https://review.opendev.org/c/opendev/storyboard-webclient/+/81405319:22
fungihttps://4d46fce2e9b1003e0d52-32bdd6c3d47a0aba5cb6656809ba5b6f.ssl.cf2.rackcdn.com/813886/7/check/system-config-run-gitea/660961b/bridge.openstack.org/ara-report/results/492.html definitely shows it using the new project name in the update now, so that much is working. i guess i need to add a test that the metadata was actually changed though19:39
Clark[m]fungi: A good check would be to have it change the storyboard url?19:44
fungiyeah, that's what i'm working on19:44
Clark[m]Then you can fetch the page and grep out that url in the html19:44
fungithough also we don't actually test that the rename worked, so adding that at the same time19:44
fungibasically copying the zuul description update test19:45
clarkb++19:50
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388619:53
clarkbLunch is consumed. The forecast I saw this morning said maybe there wouldn't be rain this afternoon. It was wrong. I'm not brave enough to get a bike ride in the rain yet19:53
fungii need to switch to dinner prep while that's running19:54
clarkbwhat you've got looks good19:55
clarkbalso when we run this for real what we can do is use a --limit to say gitea08 and make sure it is happy before going to the other 719:55
fungigreat idea, yeah19:55
clarkbfungi: oh one issue19:56
clarkbfungi: you register dib_content but then check DIB_content19:56
clarkbI think case matters in ansible19:56
fungid'oh, thanks!19:56
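The kind of check being discussed (fetch the renamed project's page, then look for the storyboard link in the HTML) could be sketched roughly as below. This is only an illustration: the URL and the `storyboard.openstack.org` substring are assumptions, not values taken from the actual change. The point is that Ansible variable names are case-sensitive, so the name given to `register` must match the name used in the check exactly.

```yaml
- hosts: "bridge"
  tasks:
    - name: Fetch the renamed project page (URL is a hypothetical placeholder)
      uri:
        url: "https://gitea.example.org/opendev/disk-image-builder"
        return_content: true
      register: dib_content

    - name: Check that the page links to storyboard (substring is an assumption)
      assert:
        that:
          # Must be dib_content, not DIB_content: variable names are case-sensitive
          - "'storyboard.openstack.org' in dib_content.content"
```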
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388619:57
fungiand now it's really time to make phat khii mao19:58
fungia bunch of chilis on the deck have turned red19:59
clarkbI picked all my not yet ripened tomatoes this week because we finally got proper cold nights20:00
clarkbI'm hoping they ripen on the counter in paper bags otherwise we'll be eating a bunch of fried green tomatoes20:00
ianwclarkb: i've got a few ideas for centos9 i mentioned across the two changes that i'll poke at today20:34
ianw(bib, have to run an errand dropping off car)20:34
clarkbianw: it is interesting that the issue is related to y2038. seems like making the painful jump is the right thing to do, its just a matter of figuring out if it can be made less painful20:35
fungii wonder if ubuntu will provide hwe kernels for focal20:44
fungiwe might should consider using them in our test nodes and on our servers if so20:45
clarkbfungi: we don't use xfs anywhere ourselves, but I guess if that works around the builders not being able to mount stuff we should do that on the builders?20:46
fungiright, that21:00
fungibut raises the question of whether we want to run different kernels on nb hosts than everywhere else21:01
clarkbya, we have done it for certain servers in the past (for afs I want to say?)21:01
clarkbfungi: the job failed to find the new storyboard issues url in the content. My browser is having a hard time with these logs though to understand what might have broken21:05
fungiyeah, i'll take a closer look once done eating21:06
clarkbfungi: <a class=" item" href="https://bugs.launchpad.net/disk-image-builder" target="_blank" rel="noopener noreferrer"> <- it seems to have updated but to the lp path (note the - in the name)21:08
fungihuh21:09
fungimaybe the lineinfile addition didn't work?21:09
clarkbmaybe. The default fallthrough is lp so ya it is like we aren't matching storyboard then hitting the default which updates the name21:10
fungithe role output will also include a json transformation of the projects.yaml it used21:12
clarkboh in ara. I'm looking on the zuul console21:15
fungiright21:15
fungisince it's nested, the zuul console view is a bit of a pain still21:15
clarkbya no use-storyboard in there21:15
fungimebbe i typoed21:16
clarkbThe good news is I think this is working. but I think it would be good to have it actually switch to storyboard from lp to ensure it isn't some easy mode update21:16
clarkbhttps://778d47dd8b4c675f4bbd-a97fe3629175ad6b40107df87a4a1857.ssl.cf1.rackcdn.com/813886/9/check/system-config-run-gitea/dcd9e59/bridge.openstack.org/ara-report/results/493.html is the ara url by the way21:18
fungii guess the question is why the second lineinfile task didn't seem to work21:19
fungimaybe the match was wrong?21:20
clarkbor this is some weird ansible behavior. usually when I debug these I set up a simple ansible install locally and run it against localhost and a local file21:20
clarkbthen I can fiddle with it and see what it says21:20
clarkbhttps://778d47dd8b4c675f4bbd-a97fe3629175ad6b40107df87a4a1857.ssl.cf1.rackcdn.com/813886/9/check/system-config-run-gitea/dcd9e59/bridge.openstack.org/ara-report/results/492.html says changed is false21:21
fungiso likely didn't match21:21
fungii assume the tasks run in sequence21:22
fungidoes it maybe not flush the file between those?21:22
clarkbyes ansible is sequenced (one of the reasons people prefer it over puppet)21:22
fungithe second lineinfile is matching on the line changed by the first21:23
fungican i embed a newline in the first lineinfile instead?21:23
clarkb"When modifying a line the regexp should typically match both the initial state of the line as well as its state after replacement by line to ensure idempotence." that implies to me that they would expect subsequent runs to want the new stuff21:24
fungithe "both" implies that it might not be though21:25
clarkbfungi: they say that because you want it to replace itself on the next run21:25
clarkbso you match the old and new state in the regex then when you run it the first time you go from old to new, then from there on its matching new to new21:25
clarkbI don't think that really matters here since its a one shot for each of them. But I thought maybe it indicates they do expect you to handle idempotence directly. Eg no intentional weird buffering21:26
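The idempotence advice quoted from the lineinfile documentation can be illustrated with a task whose `regexp` matches both the pre-rename and post-rename project lines. This is a sketch, not the task from the change under review; the exact regular expression is an assumption, while the path and project names come from the discussion above.

```yaml
- name: Rename the project entry (regexp matches old and new state)
  lineinfile:
    path: /home/zuul/src/opendev.org/openstack/project-config/gerrit/projects.yaml
    # First run: matches the old name and rewrites it.
    # Later runs: matches the new name and replaces it with itself (no change).
    regexp: '^- project: (openstack/diskimage-builder|opendev/disk-image-builder)$'
    line: "- project: opendev/disk-image-builder"
```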
clarkbI'm installing an ansible locally and will see if I can reproduce this21:27
fungican i embed a newline in the first lineinfile instead? then i don't have to worry about whether the second one sees the first edit21:27
fungijust basically replace the old project line with the new project line plus the storyboard line21:29
*** dviroel|rover is now known as dviroel|rover|afk21:30
clarkbI don't think that is the issue. I can reproduce it locally across multiple ansible invocations21:31
fungiam i not creating the yaml result i think i am?21:32
clarkbyou are21:34
clarkbI think this is going to be "ansible is weird" when we figure it out21:34
clarkbfungi: dropping the second line in file and doing line: "- project: opendev/disk-image-builder\n  use-storyboard: true" on the first lineinfile seems to work21:35
clarkbI still have no idea why the second isn't working21:35
clarkbbut not sure how important it is to understand ansible's weirdness21:36
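A minimal sketch of the single-task approach described above, with the newline embedded in `line`. The `hosts`/`tasks` header, the file path, and the `line` value are taken from the conversation; the `regexp` and the task name are assumptions made for illustration.

```yaml
- hosts: "bridge"
  tasks:
    - name: Simulate landing the project-config rename change
      lineinfile:
        path: /home/zuul/src/opendev.org/openstack/project-config/gerrit/projects.yaml
        # Assumed pattern for the pre-rename entry
        regexp: '^- project: openstack/diskimage-builder$'
        # Double quotes so YAML turns \n into a real newline
        line: "- project: opendev/disk-image-builder\n  use-storyboard: true"
```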
fungiyeah, i'll just do that21:36
fungithanks for confirming21:36
fungii'm about done with dinner now21:36
clarkbI have a hunch that use-storyboard: true shows up in the file later after that match and it decides that is good enough21:36
clarkbso insertafter is nooping as it isn't an insert immediately after maybe?21:36
fungioh, like insertafter means "anywhere after" and not necessarily "immediately after" because it assumes all lines will be individually unique21:37
clarkbya I'm trying to test that now21:39
clarkbya that seems to be it. If I remove all instances of use-storyboard: true from projects.yaml then run what you have it updates as expected21:40
clarkbin that case I think the newline single replacement is about as far as I want to debug :)21:41
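For reference, a task of the following shape would behave the way clarkb's experiment shows. This is a hypothetical reconstruction, not the actual task from the change: when no `regexp` is given, lineinfile only ensures that `line` exists somewhere in the file, and `insertafter` is consulted only when the line has to be added.

```yaml
- name: Add the storyboard flag after the renamed entry
  lineinfile:
    path: /home/zuul/src/opendev.org/openstack/project-config/gerrit/projects.yaml
    insertafter: '^- project: opendev/disk-image-builder$'
    # If any other project already has this exact line, the task reports
    # changed: false and nothing is inserted after the matched entry.
    line: "  use-storyboard: true"
```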
ianwclarkb: thanks for chasing up on the mina sshd thing.  i ran out of steam on that21:43
clarkbianw: I'll be honest I'm running out of steam myself. I have a todo on my todo list that is low priority to switch over to ed25519 keys21:44
clarkbI did it locally at home and haven't had any problems. I was going to batch that change up in opendev when we can do an infra-root-keys rotation if anyone else wants to get onboard with me21:45
clarkbat this point more because I don't trust people to maintain rsa properly than a true concern that rsa will be attacked successfully in the near future21:45
clarkbThat said MINA should totally support rsa :)21:46
opendevreviewJeremy Stanley proposed opendev/system-config master: Allow gitea_create_repos always_update to be list  https://review.opendev.org/c/opendev/system-config/+/81388621:47
funginow with less lineinfile21:47
fungi50% less21:47
clarkbThough I guess unmaintained code is more likely to be attacked21:49
ianwyes i think a 5.10 kernel is the minimum to mount that xfs volume.  but we wouldn't do that on nodepool-builders as i imagine we'd use the containerfile approach21:53
ianwbut i'm thinking upgrading gate testing from bionic to debian bullseye solves a lot of problems21:53
fungiwfm21:56
clarkbpumpkin carving has begun22:04
clarkbis gerrit showing you very reduced context for changes without a ton of review (like 813886) new in 3.3? At first I was annoyed it wasn't showing me things but now I think I prefer it22:05
fungii've not really used the gerrit webui much since gertty came to exist22:21
fungiso really wouldn't have noticed22:21
clarkbI have to say I think this is the best the UI has ever been for gerrit. Which is saying something because we went through some dark days with that one UI22:28
fungii do like the inverse color theme for the new gerrit, at least22:31
ianwclarkb: yep.  the only problem is that it works well on my phone, so i find myself looking at it at times when i really probably should be doing other things :)22:31
clarkbhttps://issues.apache.org/jira/browse/SSHD-1216 is the new MINA bug for server-sig-algs fwiw22:38
opendevreviewIan Wienand proposed openstack/diskimage-builder master: ubuntu-systemd-container: deprecate and remove jobs  https://review.opendev.org/c/openstack/diskimage-builder/+/81406822:38
clarkbsomeone else noticed they hadn't fixed it and filed a new bug for it22:38
clarkbI'm trying to figure out how to subscribe to it in jira22:39
clarkbWondering what vote for this issue means22:44
fungiyou can facebook-like it22:45
clarkbya I think that is what this is. Anyway I have done so22:45
fungilatest round of 813886 worked as hoped22:50
clarkbexcellent, I guess i should review it for "any reason we can't land this as is"22:51
clarkbI think this should be fine. the sync-gitea-repos playbook isn't run automatically. And the update made to the role library shouldn't impact new project creation as it only affects updates out of band after the fact22:52
clarkbfungi: do you think we should edit the projects.yaml for a second change and then leave it out of the list and verify we don't update it?22:52
clarkbthat might be overkill22:52
fungiwe can, though i think the task runtime proves it's not updating them all22:53
clarkbthat's a good point22:53
fungiwe could stand to refine the logging from that script too22:54
fungilike only mention a repo if it's creating it or updating its metadata (and differentiate those cases)22:54
fungibut also not critical22:54
clarkbfungi: have you had a chance to look at the etherpad yet?22:55
clarkbI think if we land 813886 we can update it to have steps for running the sync playbook passing it the list22:56
fungii have not yet, and may have to get to it in the morning at this rate22:57
clarkbok that should be fine as we have a lot of time before starting tomorrow22:57
clarkband the only thing I have tomorrow morning is taking the kids to school22:57
opendevreviewIan Wienand proposed openstack/diskimage-builder master: ubuntu: add Focal test  https://review.opendev.org/c/openstack/diskimage-builder/+/81407222:57
fungii'll be around more than expected tomorrow too, the appointment for my auto safety inspection fell through22:57
clarkbthinking out loud here. For Monday it would be nice to land the gerrit 3.3.7 upgrade and then restart on that. Then land the gerrit.config cleanup and the gerrit vs review ansible group cleanup changes and restart again to make sure we didn't accidentally remove something we need23:01
clarkbfungi: https://review.opendev.org/c/opendev/system-config/+/813716 is the gerrit.config cleanup which is fun because we have a bit of stuff from like gerrit 2.8 in there23:02
fungii should be able to help with those, though have at least a few ptg sessions i plan to be in as well23:05
clarkbya I'm good to drive most of it if I can get reviews to make sure there aren't any silly screwups ahead of time23:06
clarkbalso I keep forgetting it is the PTG next week23:06
clarkbI should look at a rough schedule and make sure there isn't anything I absolutely need to sit in on23:06
clarkbfungi: maybe we should do the rename work on a meetpad call? double check there aren't any major issues there?23:07
opendevreviewIan Wienand proposed openstack/diskimage-builder master: functests: drop apt-sources  https://review.opendev.org/c/openstack/diskimage-builder/+/81407423:09
fungigood idea, yep!23:09
clarkblooks like I should probably try and be around for openstack TC discussions around centos8 EOL and python runtime selections for yoga since both of those impact us if they choose weirdly23:15
fungiand i have to miss the second half of the tc session on monday as it overlaps with the security sig session23:16
opendevreviewIan Wienand proposed openstack/diskimage-builder master: centos7 : drop functional testing  https://review.opendev.org/c/openstack/diskimage-builder/+/81407523:19
stewie925hi guys, I m browsing opendev repository and I selected a random project like openstack.keystone - I wanted to see how they code logging.  but when I go to the search box and type "logging" nothing came out23:20
clarkbstewie925: yes, code search for gitea is known to be broken. We have kept https://codesearch.opendev.org running as a result23:21
clarkbstewie925: they use some homegrown golang text indexer thing and it doesn't seem to do a great job. If we can ever get around to running a true gitea cluster again we can try the new elasticsearch/opensearch based search instead23:21
stewie925clarkb: thank you!23:21
clarkbstewie925: they probably use oslo.logging though23:23
clarkber oslo.log I guess23:24
clarkbhttps://docs.openstack.org/oslo.log/latest/23:24
opendevreviewIan Wienand proposed openstack/diskimage-builder master: functests: drop minimal tests in the gate  https://review.opendev.org/c/openstack/diskimage-builder/+/81407823:49
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Remove extras job, put gentoo job in gate  https://review.opendev.org/c/openstack/diskimage-builder/+/81407923:49
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Simplify functests job  https://review.opendev.org/c/openstack/diskimage-builder/+/81408023:49
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Run functional tests on Debian Bullseye  https://review.opendev.org/c/openstack/diskimage-builder/+/81408123:49
