Thursday, 2022-03-17

*** ysandeep is now known as ysandeep|out00:59
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408501:16
opendevreviewwangxiyuan proposed openstack/project-config master: Correct openEuler mirror URL  https://review.opendev.org/c/openstack/project-config/+/83408601:35
fungiclarkb: yeah, 779546 is no longer needed. i've abandoned it01:57
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408502:22
fungii've got the backup pruning in progress in a root screen session02:49
opendevreviewJeremy Stanley proposed opendev/system-config master: Stop checking the OpenStackID HTTPS cert  https://review.opendev.org/c/opendev/system-config/+/83409402:53
ianwclarkb/fungi: https://review.opendev.org/q/topic:opendev-gerrit-retire was what i came up with to retire this repo.  unlike other repos where we're happy for the historical changesets to show up forever more, this one seems different so that's why i put the robots block in there02:55
ianws/this repo/the opendev/gerrit repo/02:55
fungiwe can also delete its branches prior to retirement if we prefer03:00
ianwyeah, i mean those branches though have some of our history?03:01
ianwmaybe nothing worth saving, i guess03:01
fungiany history which was done via changes is still reachable by named refs which won't be garbage collected03:02
opendevreviewMerged zuul/zuul-jobs master: encrypt-file: always import expiring keys  https://review.opendev.org/c/zuul/zuul-jobs/+/82985303:15
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408503:30
mnaseri've got a nodeless job that is hitting RETRY_LIMIT -- https://paste.opendev.org/show/bz2Fb6ho3KnJn5bz9q4s/ i'm failing in a "clean" way here (the change in question -- https://review.opendev.org/c/vexxhost/ansible-collection-atmosphere/+/834092 )03:33
mnaserok, my zuul_return didn't have a |list after the selectattr, maybe that's borking it03:35
mnasernope, not that..03:36
mnaseri'd appreciate if any admins have any insight in zuul logs on why it's doing that ;(03:39
ianwmnaser: one sec and i'll see if i can find something03:43
mnaserianw: thanks!03:45
ianwok, https://zuul.opendev.org/t/vexxhost/build/ebcb981c62aa4d8d9f78b5a1112c99a4 just failed like that03:45
mnaseryep, the logs i got in the paste are because i had the console open03:45
mnaseri wonder if my zuul_return is messing it up (i haven't tried to comment it out to test my theory)03:46
ianwtrying to figure out what executor it went to03:49
mnaserianw: the one i had logs for was `ze08.opendev.org` with event id `44d00029e9ea4e6c89a598b4c2dfd4b8` (thanks "Print job information")03:50
opendevreviewMerged openstack/diskimage-builder master: Use https for downloading ubuntu images  https://review.opendev.org/c/openstack/diskimage-builder/+/83399703:51
ianwthat one seems like it was canceled03:55
mnaserhmm, maybe since i pushed a new change03:57
mnaseri wonder if calling `fail` inside `run` playbook would cause it to mark the job to be retried?03:57
ianwmnaser: https://paste.opendev.org/show/b8zzsstaHo8qsAgJzUKX/04:00
ianwthat's what's failing04:00
mnaserso using the `fail` module inside a run playbook causes it to fail04:00
mnaserhrm04:00
mnaserso how can i fail a job 'on purpose' after i run the zuul_return04:01
* mnaser thinks of tox role04:01
mnaserok so it is using block + always -- https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/tox/tasks/main.yaml04:02
mnaserOk, i'll try the same concept as `tox`04:03
ianwit will retry in pre-run, but not run?04:04
mnaserianw: but the jobs are failing in `run` and not `pre-run`, that's the weird part04:04
mnaseri wonder if failures on `localhost` get retried, this is a nodeless job04:05
ianw4a609a2e704b43a39127240d3f93c72a was one of the retried ones04:12
mnaseri wonder if `fail` or `assert` fails in a different way with ansible04:13
ianwthat went to ze0604:15
ianwi think the non-prefixing of exception lines has got me again :/  i feel like we've fixed that 04:16
mnaserianw: i guess i could try removing the zuul_return part if you're not finding much more, but i'll wait before clobbering your logs more04:16
ianwmnaser: https://paste.opendev.org/show/be7DwbeG4vRIVlrbkyEb/ seems to be the problem04:16
mnaserinterestinggg04:16
ianwahh, and you're sticking things in data: zuul: warnings: so i think maybe a smoking gun here :)04:19
mnaseryes, i forgot map(attribute='msg')04:20
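A minimal Python sketch of the difference between stopping at `selectattr` and adding `map(attribute='msg')`. The shape of `results` and the message strings are hypothetical; the assumption is that `zuul.warnings` is meant to hold plain strings rather than whole result objects.

```python
# Hypothetical task results; only the "msg" attribute name comes from the log.
results = [
    {"failed": True, "msg": "lint failed on roles/foo"},
    {"failed": False, "msg": "ok"},
    {"failed": True, "msg": "lint failed on roles/bar"},
]

# Rough equivalent of `results | selectattr('failed') | list`:
# still a list of dicts, which is not what a list of warnings should contain.
failed_dicts = [r for r in results if r["failed"]]

# Rough equivalent of also applying `| map(attribute='msg') | list`:
# a list of plain strings.
warnings = [r["msg"] for r in results if r["failed"]]

# Assumed structure of the data handed to zuul_return.
zuul_return_data = {"zuul": {"warnings": warnings}}
print(zuul_return_data)
```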
ianwi'm sure i did something recently that returned exceptions like this a little better, at least as output in the build04:23
ianwhttps://review.opendev.org/c/zuul/zuul/+/829617 is what i'm thinking of04:24
ianwmight need a similar catch around this bit04:25
mnaserok finally, was able to fix it04:25
mnasernow i'll just wait for this to fail to see the comments show up04:26
mnaserthanks for that little help ianw 04:26
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408504:28
ianwnp :)04:28
*** ysandeep|out is now known as ysandeep05:05
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408505:34
*** ysandeep is now known as ysandeep|brb05:54
*** ysandeep|brb is now known as ysandeep06:10
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408506:38
opendevreviewwangxiyuan proposed openstack/diskimage-builder master: Enable Yum mirror for openEuler element  https://review.opendev.org/c/openstack/diskimage-builder/+/83396907:13
opendevreviewwangxiyuan proposed openstack/diskimage-builder master: Enable Yum mirror for openEuler element  https://review.opendev.org/c/openstack/diskimage-builder/+/83396907:13
opendevreviewMerged opendev/system-config master: Stop checking the OpenStackID HTTPS cert  https://review.opendev.org/c/opendev/system-config/+/83409407:31
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408507:41
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408507:45
*** arxcruz|off is now known as arxcruz07:50
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408507:51
fricklerianw: could you have a look at https://review.opendev.org/c/zuul/nodepool/+/826541 please? I've fixed the issue with sdk 1.0, but maybe it should be split off from the revert into its own patch?07:54
ianwfrickler: yeah, i'd probably agree in splitting out the fix07:56
ianwis that fixing it, or skipping over the quota calculation if we don't get the details?07:58
fricklerianw: both. the fix is to not check for the id attribute but for vcpus, because with latest sdk we get a Flavor() object, which does have an id, but not the one we expect08:12
fricklerI then also added the skip so we don't fail if we still don't have useful flavor data08:12
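The check described here can be sketched roughly as follows. This is not nodepool's code: `flavor_quota` and the sample objects are hypothetical, and the only thing taken from the discussion is the idea of testing for `vcpus` instead of `id` and skipping when the details are missing.

```python
from types import SimpleNamespace


def flavor_quota(flavor):
    """Return (vcpus, ram) for a flavor, or None when the details are missing."""
    # An `id` can be present even on an object without usable details,
    # so look at `vcpus` instead.
    vcpus = getattr(flavor, "vcpus", None)
    if vcpus is None:
        return None  # caller skips the quota calculation for this flavor
    return vcpus, getattr(flavor, "ram", 0)


# Hypothetical examples: a fully populated flavor and a stub that only has an id.
full = SimpleNamespace(id="abc", vcpus=4, ram=8192)
stub = SimpleNamespace(id="abc")

print(flavor_quota(full))
print(flavor_quota(stub))
```

Returning `None` silently is the part questioned in the next messages; a caller could log a warning at that point so a missing-details case does not go unnoticed.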
ianwok, i'm not sure if we should skip -- if we were skipping we wouldn't have noticed something like this?08:14
ianwdefinitely think it should be its own change with ^^ in the commit message :)08:15
frickleryeah, I'll do that and try to amend the comments so things get clearer08:16
*** ysandeep is now known as ysandeep|lunch08:33
*** jpena|off is now known as jpena08:36
elodillesclarkb fungi : no problem, i forgot to abandon it myself, sorry :S08:40
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408509:14
*** ysandeep|lunch is now known as ysandeep09:27
*** pojadhav is now known as pojadhav|lunch09:36
*** pojadhav|lunch is now known as pojadhav|10:12
*** pojadhav| is now known as pojadhav10:13
*** lajoskatona_ is now known as lajoskatona10:31
priteauHello. Is opendev git having issues? I am getting very odd job failures10:32
priteau  error: Server does not allow request for unadvertised object ffe7c0baf1eb21f7b0c3eabe480348b376cf928610:32
priteau  warning: Clone succeeded, but checkout failed.10:32
priteauhttps://zuul.opendev.org/t/openstack/build/39cdb2cfad2848faa95de52afbd910c510:50
priteauAnd in another job:10:50
priteau  Running command git clone --filter=blob:none --quiet https://opendev.org/openstack/nova /home/zuul/src/opendev.org/openstack/blazar-nova/.tox/pep8/src/nova10:50
priteau  fatal: the remote end hung up unexpectedly10:50
priteau  fatal: protocol error: bad pack header10:50
priteau  warning: https unexpectedly said: '0000'10:50
priteau  warning: Clone succeeded, but checkout failed.10:50
priteauhttps://zuul.opendev.org/t/openstack/build/a15cabe7fdd5441185844f996120b9c410:50
*** ysandeep is now known as ysandeep|afk11:04
*** dviroel|ruck|afk is now known as dviroel|ruck11:26
fungipriteau: it's possible yesterday's gitea upgrade introduced a regression, or that we've got a corrupt object in the nova repo on one of the servers in the cluster, but also why are you cloning nova over the network in a job? that's highly inefficient, and why we cache those git repositories locally on all our job nodes11:27
fungii'm testing cloning nova, but it's taking a while... longer than i would expect. could be something on my end, could be something to do with the specific server my request got routed to, could be something related to the upgrade, could also just be my pre-caffeine imagination running wild... not sure yet11:43
*** iurygregory_ is now known as iurygregory11:45
fricklerfungi: priteau: this code still references zuul-cloner and hasn't been touched in 3 years, likely could use some update https://opendev.org/openstack/blazar-nova/src/branch/stable/yoga/tools/tox_install.sh11:54
fungii'm surprised we haven't ripped out zuul-cloner already, that's a leftover from zuul v211:56
fungianyway, i've finished looking through the resource graphs for the load balancer and all the git servers. other than a bit of cache memory pressure i don't see anything especially alarming11:57
fungimy clone of openstack/nova just now completed, so took roughly 15 minutes11:58
fungino errors though. i'll try cloning directly from each of the backends next11:59
fungithough i expect it to take a while at this speed11:59
fungianother thought, i wonder if something changed with the git protocol supported by the servers in the upgrade, which is impacting whatever older version of git is on those bionic nodes12:00
fungias an aside, why would an openstack project be running stable/yoga branch jobs on bionic? openstack switched to focal several releases ago12:00
priteauI didn't realise we were running on bionic12:02
priteauwe are using the standard job templates12:02
fricklerseems the lower-constraints template could be updated12:07
priteaufungi: the openstack-tox-py39 job runs on focal and has the same issue12:08
fungithanks, so it's probably nothing to do with the version of the git client12:08
fungii was able to clone directly from gitea01 successfully12:09
priteauyeah, lower-constraints and py36 jobs ran on ubuntu-bionic, while pep8 and py39 ran on ubuntu-focal12:09
fungipriteau: is the problem you're seeing consistent or intermittent for those jobs?12:09
fricklergmann: how about bumping to py36 and focal? https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/jobs.yaml#L877-L89212:09
fricklerehm, py38 ofc12:10
priteauAll jobs failed consistently this morning. Not seen this issue ever before12:11
priteauI'll check the tox_install.sh script. I don't remember much about it, from the comment at the top it works around issues seen when nova wasn't on PyPI, which isn't the case anymore.12:12
fungifrickler: that could be one reason bionic is still used. yoga's pti says openstack will continue to test that branch against python 3.6 in order to ensure compatibility with centos stream 8 (but really it's a stand-in for rhel 8 since there's no newer rhel yet): https://governance.openstack.org/tc/reference/runtimes/yoga.html12:13
fungii've finished tests cloning from 01-03 so far. time to complete ranges between 10-15 minutes here at my house, which is reasonably well-connected12:20
fungino errors from git yet12:20
priteaufrickler: According to tox_install we should be using zuul-cloner to use the local git cache, but looking at logs I am guessing this program is gone from the CI environment, so we fall back to cloning on the network like we would from a dev's machine12:23
*** ysandeep|afk is now known as ysandeep12:23
priteauI can run these jobs fine on my machine though, no cloning issue here (but I have the latest git)12:24
fungiyes, zuul-cloner was a tool used in the old zuul v2 pull model12:24
fungizuul v3+ pushes git refs to the nodes instead, so they'll already be there on disk12:25
priteauI will try adding a check for the existence of a local repo and install from there12:26
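A rough sketch of that fallback, written in Python for illustration rather than as the actual change to tox_install.sh. The helper name is hypothetical, and the default root is an assumption based on the `/home/zuul/src/opendev.org/...` path seen in the job output above.

```python
import os


def nova_install_source(local_root="/home/zuul/src/opendev.org",
                        project="openstack/nova"):
    """Prefer a checkout already on disk, else fall back to the network."""
    local = os.path.join(local_root, project)
    # Only trust the directory if it looks like a git checkout.
    if os.path.isdir(os.path.join(local, ".git")):
        return local
    # Fallback for a developer machine without the local cache.
    return "git+https://opendev.org/" + project


print(nova_install_source())
```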
fricklerpip uses the "--filter=blob:none" which works fine on my local impish and makes the clone finish in less than a minute, but exposes the failure on focal12:29
frickler+ option for git clone12:30
fungioh, could this be impact from the pip 22.0.4 release i wonder?12:31
fungithat was something like 10 days ago, as was the virtualenv release which included it, but maybe something in the last day increased the virtualenv we were using to 20.13.3?12:32
fricklerno, I suspect a bug between old(ish) git and new gitea and that option12:33
fungiyeah, that would make more sense given the timing12:33
fricklercan we revert one gitea server easily?12:33
frickleror shall we create a revert of the update patch and hold a node with that?12:34
fungithe latter would be pretty easy. i'm not sure how downgradeable gitea is in production12:35
fungii have a feeling a production downgrade would involve wiping and repopulating everything on the server, so may just be cleaner to replace the servers if it comes down to that12:36
priteauI wonder if this issue might impact other jobs. Have you seen any rise in failed builds today?12:36
fricklerpriteau: no, I hope not many jobs try to clone repos that way, but it will affect downstream installers12:41
fungithough i also hope not that many programs are trying to pip install the clone url and are instead cloning from a local checkout for efficiency (or relying on packages)12:47
fungireading a bit more about the architecture, i wonder if this really boils down to we've upgraded the cgit version underneath gitea as part of yesterday's update. https://github.com/go-gitea/gitea/issues/11958 has some pointers on setting git options for similar situations12:53
fungimaybe newer git on the server side has stopped doing some things that older git clients are expecting with regard to what objects it thinks are or aren't advertised12:54
*** anbanerj is now known as frenzyfriday12:58
fricklerI'm trying to repro now locally using the upstream gitea containers first13:04
fricklerhmm, I get lots of errors when I try to push the nova repo to gitea. but what I do see is that with gitea 1.15 I get "warning: filtering not recognized by server, ignoring" which can also be seen in the successful blazar jobs until yesterday13:16
fricklerwith 1.16.4 I no longer get that message. so this is very likely a regression in gitea, trying to support filtering but failing13:16
fricklerseems that https://github.com/go-gitea/gitea/pull/18195/files is the change in question13:19
*** arxcruz is now known as arxcruz|ruck13:25
*** arxcruz|ruck is now known as arxcruz13:27
fricklerah, maybe the errors were because I cloned with the noblob option. trying another full clone from upstream now13:33
fungilooks like there have been some adjustments as well in https://github.com/go-gitea/gitea/pull/1837313:36
fricklerwhich version of gitea were we running before yesterday? older than 1.15 iirc?13:38
fungiv1.15.1113:39
fungiaccording to https://review.opendev.org/82818413:40
fricklerhmm, cannot reproduce with my local clone copy, so I'm rsyncing the repo from gitea01 now to test13:49
fricklero.k., I should try with something smaller, seems all repos are affected, at least some others I tried. smallest for now is https://opendev.org/openstack/python-designateclient13:56
*** pojadhav is now known as pojadhav|out14:19
fricklero.k., the workaround documented in the above PR, setting git.DISABLE_PARTIAL_CLONE = true in gitea's app.ini works14:20
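For illustration, the setting lives in the `[git]` section of app.ini. The sketch below adds it with Python's `configparser`; the helper name and the sample input are hypothetical, and it assumes every key in the input sits under a section header (a real app.ini with keys above the first section would need different handling).

```python
import configparser
import io


def disable_partial_clone(app_ini_text):
    """Return app.ini text with DISABLE_PARTIAL_CLONE = true under [git]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names upper-case as written
    parser.read_string(app_ini_text)
    if not parser.has_section("git"):
        parser.add_section("git")
    parser.set("git", "DISABLE_PARTIAL_CLONE", "true")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical minimal input, not the real opendev configuration.
updated = disable_partial_clone("[server]\nDOMAIN = example.org\n")
print(updated)
```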
fricklerclarkb: fungi: I'm in a meeting now, maybe you can check how to do that for our deployment. or verify on one of our servers first14:21
fungithanks, great find! i'm in meetings for the next couple of hours as well but should have some bandwidth to multi-task so i'll start looking into it14:22
*** tosky_ is now known as tosky14:27
*** ysandeep is now known as ysandeep|afk14:29
opendevreviewJeremy Stanley proposed opendev/system-config master: Disable partial clone feature in Gitea  https://review.opendev.org/c/opendev/system-config/+/83417414:31
fungifrickler: priteau: ^14:31
priteauThanks. In the meantime I fixed it by cloning from the local git cache14:32
fungipriteau: to be fair, you should definitely avoid cloning from a network git remote in zuul jobs14:33
priteauIndeed, it was a good way to discover that we were relying on the retired zuul-cloner14:33
fungiif you add openstack/nova to the required projects list for the job definition, zuul will provide an up-to-date checkout which will even work with cross-repo deps (depends-on)14:34
fungiyou can look at some of the neutron driver or horizon plugin projects for examples14:34
fungithey similarly rely on installing neutron or horizon from a git checkout14:34
priteauI think we should be using this for stable branches, as currently we must be testing blazar-nova stable with nova master14:49
fungiyep14:49
fungiif you relied on setting it as a required project, zuul would check out the corresponding stable branch of nova for you automatically14:49
gmannfrickler: ah right, we have to do that as per zed testing bits. I will push patch. thanks 14:50
clarkbfrickler: fungi: is there a tldr? I don't think downgrading is straightforward. Would require rebuild of entire cluster using db from backups to preserve redirects15:15
clarkbis it just zuul-cloner that broke? if so I'm inclined to say sorry that was deprecated and should've been removed from your jobs years ago15:16
fungiclarkb: tl;dr is in 83417415:16
clarkbaha thanks looking15:16
clarkboh that is an interesting thing. Should we report it upstream to pip as well?15:17
fungiprobably, i also had meetings so haven't gotten farther with it yet15:17
funginot quite sure at this point what the suggestion for pip is without nailing down which git versions are involved15:18
clarkbI'm +2 on disabling partial clones. That seems like a safe enough change if that was already the behavior in 1.15.1115:18
clarkb4b3bfd7e89cd1527d500ac44c2564d398a6b681e is the gitea commit that added it and does seem to be 1.16 specific. Also it notes that partial clones are disabled in git by default. Why would they enable them by default. Seems like they should've inverted this toggle and let people opt into it...15:20
clarkbanyway ++ to deploying that15:20
clarkbfungi: frickler: do we think we need a hold for 834174 to verify before approval or just send it?15:21
fungiclarkb: we can hold a dnm child of that change for further testing, sure, though frickler already confirmed it fixed the problem on a local install15:23
clarkboh I see the client sends the filters it wants. So if I just do a git clone I should get a full proper clone even under 1.16. But if the client is trying to be smart (like pip) then maybe things go wrong15:23
clarkbfungi: oh in that case I think we can land it15:23
fungii mean, we could also take gitea01 out of the pool and update its config by hand to test if that's desirable15:24
clarkbI'm not too worried about it considering frickler already tested locally15:25
clarkbas a followup we might consider updating our test job to do whatever it was that tripped over this15:25
clarkbwe have content in the system-config repo populated in those test jobs15:25
clarkbI guess just git clone --filter=blob:none localhost/opendev/system-config? I can work on a patch for that15:26
fungi`pip install git+https://opendev.org/openstack/designate` for example, or we could do `git clone ...` yeah15:26
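A reproducer along these lines could be sketched as below. The flags mirror the `git clone --filter=blob:none --quiet` command from the failing job output; the function names are hypothetical, and treating either a non-zero exit or the "checkout failed" warning as a failure is an assumption.

```python
import subprocess


def partial_clone_cmd(url, dest):
    """Build the blob-less clone command that pip was seen running."""
    return ["git", "clone", "--filter=blob:none", "--quiet", url, dest]


def partial_clone_fails(url, dest):
    """Run the clone and report whether it looks like the failure in the log."""
    result = subprocess.run(
        partial_clone_cmd(url, dest), capture_output=True, text=True
    )
    return result.returncode != 0 or "checkout failed" in result.stderr


print(partial_clone_cmd(
    "https://opendev.org/openstack/python-designateclient", "designateclient"
))
```

Only the command is printed here; calling `partial_clone_fails` needs network access and a git client old enough to show the problem.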
*** ysandeep|afk is now known as ysandeep15:29
opendevreviewClark Boylan proposed opendev/system-config master: Test gitea 1.16 partial clones  https://review.opendev.org/c/opendev/system-config/+/83418715:32
clarkbthat isn't based on 834174 so we should see it fail. Then we can rebase it and see it pass15:32
funginice, thanks15:33
*** ykarel is now known as ykarel|away15:34
*** ysandeep is now known as ysandeep|out15:37
fricklerunrelated news (probably): github seem to be having some issues, too https://www.githubstatus.com/15:38
fungithanks for the heads up!15:39
clarkbmaybe they upgraded to gitea 1.16 too :P15:39
fungi#status log Pruned backups on backup01.ord.rax in order to free up some storage, info in /opt/backups/prune-2022-03-17-02-13-07.log15:40
opendevstatusfungi: finished logging15:40
clarkbfungi: I wonder if that is the first time we've pruned that host. We may actually want to look over that log and check the backups that remain to ensure we didn't overprune. There were issues in the past with certain backup name prefix overlaps causing that to happen for some hosts that we thought we had sorted out and double checking is probably a good idea15:41
clarkbin particular if a host had multiple backup targets (typically due to fs backup and db backup) the pruning of one or the other could delete all backups for the other one15:41
clarkbwe had to switch them to unique prefixes to avoid that iirc15:41
clarkbfungi: probably sufficient to do a backup listing and ensure that services with database backups continue to have the correct date spread for fs and db backups15:42
fungii do see the log say it's keeping both filesystem and database backups15:45
fungilike "Keeping archive: paste01-mariadb-2022-03-16T17:28:13  Wed, 2022-03-16 17:28:15 [31cdd5957c5829be0ac464c83424b1a055e3f0d30c67c450addc83e5b4aa799c]"15:45
clarkband similar for paste01-fs-* or whatever it is called? Do we see that for review02 and gitea01 and so on?15:46
fungiright, also paste01-filesystem-...15:46
clarkbcool. As long as we see that for more than just paste we should be good15:47
clarkbin particular I worry about review db backups causing filesystem backups to be removed since gerrit db backups aren't worth much these days15:47
fungiit seems to treat "/opt/backups/borg-paste01/backup archive paste01-mariadb" separately from "/opt/backups/borg-paste01/backup archive paste01-filesystem"15:47
clarkbyes they are distinct backup archives15:48
fungireview01-filesystem and review01-mysql both kept, yes15:48
clarkbthe problem before was the prune command takes a target archive prefix (due to the common practice of appending dates to the backups) and then prunes everything it can match. So if you just pruned paste01 it would get confused15:49
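The prefix overlap can be illustrated with a simplified model. This is not borg's retention logic: it only models a "keep the newest N matching archives" rule, and the archive names and times are made up in the style of the ones quoted above.

```python
def prune_keep_last(archives, prefix, keep_last):
    """Return the archives a keep-last-N prune on `prefix` would delete."""
    # Everything matching the prefix competes for the same retention slots.
    matching = sorted(
        (name for name in archives if name.startswith(prefix)),
        key=lambda name: name[-19:],  # trailing YYYY-MM-DDTHH:MM:SS stamp
    )
    if keep_last <= 0:
        return matching
    return matching[:-keep_last]


# Hypothetical archive list for one host with two backup targets.
archives = [
    "paste01-filesystem-2022-03-15T17:20:00",
    "paste01-mariadb-2022-03-15T17:28:00",
    "paste01-filesystem-2022-03-16T17:20:00",
    "paste01-mariadb-2022-03-16T17:28:00",
]

# A shared prefix lets the newer mariadb archive push out every filesystem one.
shared = prune_keep_last(archives, "paste01", 1)
# A unique prefix per target only touches that target's archives.
scoped = prune_keep_last(archives, "paste01-filesystem-", 1)

print(shared)
print(scoped)
```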
clarkbfungi: review01 doesn't exist anymore :) what about review02?15:49
fungioh, right ;)15:49
clarkbbut ya I think we addressed it all when we noticed the problem and now it's just verification we didn't miss anything. Sounds like it is probably working as expected15:49
fungiKeeping archive: review02-filesystem-2022-03-16T17:46:02 Wed, 2022-03-16 17:46:04 [7c55371b17f71d6c5ff23171d1bfb08b327eaa320f2c0718fe23fbb2a540f136]15:50
fungiKeeping archive: review02-mariadb-2022-03-16T17:50:32 Wed, 2022-03-16 17:50:33 [20a894f13f396ecb69f98dbc9d07fbb9c8bf460ebd4f6428d4b52fe7ff2edeea]15:50
clarkbperfect!15:51
fungisystem-config-run-gitea failed on 834174, looking into it15:52
clarkbfungi: looks like failure to encrypt the logs15:53
fungiencrypt-file: Validate input file15:53
clarkbya one of those files (of course ansible doesn't tell you which) does not exist15:53
clarkbhttps://zuul.opendev.org/t/openstack/build/66afc5f33d674bab8c29c5aea9240faa/console#3/1/15/bridge.openstack.org I'm not sure how to read that. It shows 7 inputs but one output15:55
clarkbdoes that imply the first input is the one that failed? I think that file not existing would make sense since it is just a test playbook not a normal prod run playbook with the logging setup15:55
clarkbhttps://zuul.opendev.org/t/openstack/build/66afc5f33d674bab8c29c5aea9240faa/console#3/1/12/bridge.openstack.org hrm but that shows the files existing15:56
*** dviroel|ruck is now known as dviroel|ruck|lunch15:56
fungiprobably the first time we've run this job since 829853 merged15:56
fungiit added the "file does not exist" error for when: not _stat_result.stat.exists15:57
fungipreviously we only checked for encrypt_file is undefined15:57
clarkbah. In that case maybe the issue is it is checking that entire list as if it were a single file15:57
clarkbwhich is sort of how I'm reading that 7 inputs one output now15:58
clarkbit's a single input with 7 list entries15:58
fungisame15:58
fungididn't explode it for some reason15:58
fungiseems to have interpreted it as a string15:58
clarkbya15:59
clarkbfungi: I think we should do a partial revert of that change to revert the stat stuff?16:00
clarkbthe commit says this was done to avoid a weird error from gpg later. I think we can live with those weird errors for now16:00
fungiif it's not obvious how to pass a list of files to a path parameter in ansible, then yes16:02
clarkbwell I think we have to rewrite the validation to do a loop and check each one separately if it is a list. If it isn't a list then check the single input var16:03
clarkbI'm happy for someone to work on that too if they like16:03
clarkboh actually the rest of the code in there already does that so we can copy that probably16:04
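The per-entry validation being discussed could look roughly like this in Python. It is a sketch of the idea only, not the role's Ansible tasks; the function name and the sample paths are hypothetical.

```python
import os


def missing_inputs(encrypt_file):
    """Return the input paths that do not exist, for one path or a list."""
    # Normalise to a list so a single path and a list share one code path.
    if isinstance(encrypt_file, str):
        paths = [encrypt_file]
    else:
        paths = list(encrypt_file)
    return [p for p in paths if not os.path.exists(p)]


# Failure mode from the log: checking the whole list as if it were one path.
# The stringified list is not a real path, so the check reports "missing".
print(os.path.exists(str(["."])))

# Checking each entry separately reports only the entries that are absent.
print(missing_inputs([".", "./hypothetical-missing-file"]))
```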
fungiyeah, i'm fine undoing the change to roles/encrypt-file/tasks/main.yaml from 834174, i'll push that up along with an unrevert and try to make it exercise the failing job16:04
fungioh, or that16:05
clarkbeither way :)16:05
clarkbmaybe just push the partial revert for now16:05
clarkband we can work on fixing it properly as followup16:05
fungion it16:05
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fix encrypt files stat validation  https://review.opendev.org/c/zuul/zuul-jobs/+/83419416:12
clarkbfungi: ^ something like that for fixing it but I'm not super confident in that so I think we land the partial revert and then rebase this on that and work through it more carefully16:12
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: encrypt-file: roll back extended file stat  https://review.opendev.org/c/zuul/zuul-jobs/+/83419616:17
clarkb+2 thanks16:18
opendevreviewJeremy Stanley proposed opendev/system-config master: Disable partial clone feature in Gitea  https://review.opendev.org/c/opendev/system-config/+/83417416:18
opendevreviewJeremy Stanley proposed zuul/zuul-jobs master: Revert "encrypt-file: roll back extended file stat"  https://review.opendev.org/c/zuul/zuul-jobs/+/83419716:22
fungiclarkb: ianw: wip for the moment ^16:22
fungioh, you already pushed 83419416:23
clarkbfungi: ya do you want to combine it with 834197?16:23
fungii'll abandon 83419716:23
clarkbsounds good I can rebase on 83419616:23
clarkbthen we can sort out how to test it better maybe16:23
fungiperfect16:24
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fix encrypt files stat validation  https://review.opendev.org/c/zuul/zuul-jobs/+/83419416:24
clarkbthat's the rebase. Happy for others to update it if they see a good way to test it. I need some breakfast but can try and take a look after that16:25
clarkbanother thing we may want to do is make that role non fatal16:25
clarkbit's a really nice to have but the logs are always available otherwise16:25
fungiit's been a busy morning and i'm overdue for a shower, so going to disappear for a few while we wait for test results16:27
opendevreviewMarios Andreou proposed openstack/project-config master: Update channel ops for oooq (tripleo ci) channel  https://review.opendev.org/c/openstack/project-config/+/83419916:41
*** marios is now known as marios|out16:47
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fix encrypt files stat validation  https://review.opendev.org/c/zuul/zuul-jobs/+/83419416:56
clarkbthe testing we have was sufficient to catch ^16:56
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fix encrypt files stat validation  https://review.opendev.org/c/zuul/zuul-jobs/+/83419416:59
clarkband I think that should cover this case16:59
opendevreviewClark Boylan proposed opendev/system-config master: Test gitea 1.16 partial clones  https://review.opendev.org/c/opendev/system-config/+/83418717:02
*** dviroel|ruck|lunch is now known as dviroel|ruck17:05
fungiawesome17:06
lajoskatonadiablo_rojo_phone: Hi, regarding the Yoga marketing screenshots for Local IP and Off-path SmartNIC17:12
lajoskatonadiablo_rojo_phone: today I found only the official Neutron doc: 17:13
lajoskatonadiablo_rojo_phone: https://docs.openstack.org/neutron/latest/admin/ovn/smartnic_dpu.html#launch-an-instance-with-remote-managed-port17:13
lajoskatonadiablo_rojo_phone:  https://docs.openstack.org/neutron/latest/contributor/internals/local_ips.html#usage17:14
*** jpena is now known as jpena|off17:27
diablo_rojo_phonelajoskatona: yeah I think I had those links already.  17:34
diablo_rojo_phoneWhat  question did you have?17:34
lajoskatonadiablo_rojo_phone: It's not a question, just a "feedback" that I have only those from the developers of the 2 features :-)17:35
diablo_rojo_phoneAhh got it :) Noted!17:36
lajoskatonadiablo_rojo_phone: +117:36
lajoskatonadiablo_rojo_phone: is there anything I can help with for the marketing material?17:36
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fix encrypt files stat validation  https://review.opendev.org/c/zuul/zuul-jobs/+/83419417:56
clarkbwe'll get there eventually :)17:56
fungi834174 is passing now when stacked on top of the zuul-jobs partial revert18:03
fungiwhich was approved moments ago18:03
fungiso hopefully we're close to being able to land the gitea fix18:04
clarkbwoot18:04
clarkbhttps://zuul.opendev.org/t/openstack/build/e514209277c546288d95695c53777a41/log/job-output.txt#38524 my test shows the error. Now I'll rebase that on 834174 to see that it goes away, then we can land the test too to prevent future regressions18:06
opendevreviewMerged zuul/zuul-jobs master: encrypt-file: roll back extended file stat  https://review.opendev.org/c/zuul/zuul-jobs/+/83419618:06
opendevreviewClark Boylan proposed opendev/system-config master: Test gitea 1.16 partial clones  https://review.opendev.org/c/opendev/system-config/+/83418718:08
clarkbthere is the rebase to hopefully show things working and happy18:08
fungiexcellent18:09
clarkbfungi: maybe we can +A 834174 carrying over frickler's previous +2?18:09
clarkbor if you want to wait for 834187 to pass that seems fine too. I trust frickler's local testing though18:09
fungiyeah, i'm good with approving it18:10
clarkbI'm going to see about filing a bug against gitea and/or pip if I can make sense of what this clone flag does18:10
fungiall i did was add a depends-on after his +218:10
clarkbmostly not sure if pip's flag is bad or if gitea handles it wrong18:10
clarkbfungi: ++18:10
clarkbfungi: do you want to +A or should I?18:11
fungiclarkb: one of the complexities of this situation is that newer git clients don't break18:11
clarkboh interesting18:11
fungiso apparently that clone option is valid on versions of git which can't use it with gitea18:11
clarkbfungi: the bionic version breaks. Do we know if focal's works?18:11
fungifocal's breaks too18:11
clarkboh so it is pretty new git that would be needed18:12
fungibut frickler said the version on his impish system was working18:12
fungiso yes, relatively bleeding-edge18:12
clarkbin that case I'm inclined to think it is likely a gitea bug18:12
clarkbwhere their server side implementation isn't implementing something compatible with older git18:13
fungilike they only tested this implementation with very new git and missed some backward-compatible bits18:13
clarkbthey may need to ignore the filter specification based on the version of the client18:13
clarkbya18:13
fungianyway, having a consistent reproducer with git kind of rules out pip as being at fault18:13
clarkbya my tumbleweed git can clone with that filter against opendev.org18:14
clarkbso I've confirmed that behavior that frickler saw18:14
clarkbimpish is 2.32.0 I'm on 2.35.118:15
clarkbfocal is 2.25.118:15
fungii can approve 834174 once 834196 merges if nobody beats me to it. there's no real speedup for approving it sooner than that anyway, so i'm happy to give a few more minutes for (re-)reviews18:15
clarkbI'm going to double check focal locally and then work on a bug upstream with gitea18:15
clarkbfungi: 834196 has merged18:16
clarkbhirsute with 2.30.2 works18:18
fungioh, right i missed that in scrollback18:19
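The version reports above can be tabulated to bracket where the partial clone starts working. This is a minimal sketch: the pairing of tumbleweed with 2.35.1 is inferred from clarkb's two messages, and bionic is left out because its git version is not stated in the log.

```python
# Observations reported in the channel: (distro, git version, clone worked?)
# bionic also broke, but its version is not given, so it is omitted here.
OBSERVATIONS = [
    ("focal", "2.25.1", False),
    ("hirsute", "2.30.2", True),
    ("impish", "2.32.0", True),
    ("tumbleweed", "2.35.1", True),  # assumed pairing of distro and version
]


def parse_version(text):
    """Turn '2.25.1' into (2, 25, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def bracket_threshold(observations):
    """Return (newest failing version, oldest working version)."""
    failing = [parse_version(v) for _, v, ok in observations if not ok]
    working = [parse_version(v) for _, v, ok in observations if ok]
    return max(failing), min(working)


newest_bad, oldest_good = bracket_threshold(OBSERVATIONS)
print("newest version seen failing:", ".".join(map(str, newest_bad)))
print("oldest version seen working:", ".".join(map(str, oldest_good)))
```

The change in behaviour falls somewhere between those two versions; the log does not narrow it further.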
clarkbhttps://github.com/go-gitea/gitea/issues/19118 filed against gitea18:34
clarkbhttps://review.opendev.org/c/zuul/zuul-jobs/+/834194 passes testing now. Probably a good one to have ianw review before approving just to make sure there aren't other instances of this sort of problem in the new role18:46
clarkbricolin: fungi: I see there is another change to remove the -tw list https://review.opendev.org/c/opendev/system-config/+/584035 I guess which one is preferable? taking ownership of it or removing it?18:53
clarkbhttps://review.opendev.org/c/opendev/system-config/+/792020 cc infra-root anyone remember what that was trying to fix? I am not aware of any existing issues related to python with gerrit and I think we can abandon that19:01
fungiclarkb: yeah, i don't feel like i have enough context on what 834194 was trying to fix (error from passing a path for a nonexistent file to gpg?) and would definitely appreciate ianw's input there19:07
fungiit seems like the original implementation was just an attempt to catch an error condition sooner in the job and return a more helpful error19:08
opendevreviewMerged opendev/system-config master: Disable partial clone feature in Gitea  https://review.opendev.org/c/opendev/system-config/+/83417419:16
fungiclarkb: frickler: the gitea config update has deployed, are you still able to recreate the failures?19:59
fungiit doesn't look like updating the config restarts the container20:00
fungibut maybe the config is reread dynamically?20:01
clarkbfungi: just sitting back down after lunch I'll check20:07
clarkbfungi: no still seems to fail. I think we need to do rolling restarts of gitea. Those are a bit tricky because ordering matters to prevent gerrit replication issues.20:09
clarkbfungi: I'm happy to do them or walk someone else through them if interested. Basically you disable the node in haproxy, docker-compose down, then docker compose up -d only the mariadb and gitea-web processes. Then once web is responding you up -d gitea-ssh too. Then reenable in haproxy20:10
clarkbthis ensures that gerrit can't push to gitea when gitea web is unable to trigger the appropriate hooks to record the updates20:10
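The ordering clarkb describes can be written down as a small plan generator. This is a sketch only: the haproxy and health-check steps are plain-text placeholders rather than real commands, and it assumes the compose service names match the process names used in the channel (mariadb, gitea-web, gitea-ssh).

```python
def restart_plan(backend):
    """Return the ordered steps for restarting one gitea backend.

    gitea-ssh comes up last so gerrit cannot push over ssh while
    gitea-web is unable to run the hooks that record the updates.
    """
    return [
        f"disable {backend} in haproxy",           # placeholder, not a real command
        "docker-compose down",
        "docker-compose up -d mariadb gitea-web",  # service names assumed
        "wait until gitea-web responds",           # placeholder health check
        "docker-compose up -d gitea-ssh",
        f"enable {backend} in haproxy",            # placeholder, not a real command
    ]


for step in restart_plan("gitea01"):
    print(step)
```

Running the same plan for each backend in turn gives the rolling restart, with only one node out of the load balancer at a time.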
clarkboh this is interesting though https://docs.gitea.io/en-us/config-cheat-sheet/#server-server indicates a sighup might also work20:11
clarkbI'll go ahead and start the typical process with gitea01 momentarily. Then we can double check behavior against it at least20:12
clarkb'warning: filtering not recognized by server, ignoring' but my clone worked20:15
clarkbthat was against 01 after doing a restart of the service. I'll go through the other 7 in sequence20:15
clarkband done20:27
clarkbI did my clone test against all 8 backends and they all emit that warning but succeed otherwise20:27
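A per-backend clone check along these lines could be scripted as below. This is a hedged sketch: the log never names the filter, so `--filter=blob:none` is an assumption, and any backend URL passed in is up to the caller. Only the warning text is taken from the channel.

```python
import subprocess

# Warning text quoted in the channel after the restart.
WARNING = "filtering not recognized by server, ignoring"


def clone_command(url, dest):
    """Build a partial-clone command (the filter spec is an assumption)."""
    return ["git", "clone", "--filter=blob:none", url, dest]


def classify(returncode, stderr):
    """Summarise one clone attempt from its exit code and stderr."""
    if returncode != 0:
        return "failed"
    if WARNING in stderr:
        return "ok (filter ignored)"
    return "ok"


def check_backend(url, dest):
    """Run the clone against one backend and classify the outcome."""
    result = subprocess.run(
        clone_command(url, dest), capture_output=True, text=True
    )
    return classify(result.returncode, result.stderr)
```

After the config change and restart, every backend would be expected to report "ok (filter ignored)", which matches what clarkb saw on all eight.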
clarkbI think this is likely happy now. priteau you can rerun those jobs I guess20:28
priteauThank you clarkb, I will give them a try20:28
priteauSorry for generating all that work, I didn't realise my message this morning would send you down this rabbit hole20:28
fungiawesome thanks!20:29
fungipriteau: no need to apologize, we appreciate the early notice of a regression20:29
clarkbit also produced a bug report upstream so hopefully this helps them improve their software too20:31
opendevreviewSteve Baker proposed openstack/diskimage-builder master: Move grub-install to the end, and skip for partition images  https://review.opendev.org/c/openstack/diskimage-builder/+/82697620:32
ianwargghh sorry about the encrypt-file breakage20:39
clarkbianw: no worries. I think I got a fix with test coverage pushed but we wanted to make sure it made sense to you before landing it. In the meantime we just landed a partial revert of the change that broke things20:44
fungiyeah, it was not hard to find or undo, don't sweat it20:45
clarkbside note on the gitea thing. It occurred to me that the bug may be in the server side git install depending on how much gitea passes through to git. The alpine 3.13 git version matches debian bullseye's git which means we install the same git version as the upstream docker images. I am able to reproduce the issue against try.gitea.io as well.20:50
ianwthanks, that fix looks good.  it was just to give a nice error early rather than pass missing files to gpg20:51
priteauclarkb, fungi: Build succeeded!20:52
clarkbwoot20:52
fungithanks for confirming, priteau!20:53
ianwclarkb/fungi: if you have a sec, should we confirm a plan for opendev/gerrit retirement?  i'm happy to execute20:55
clarkbianw: catching up on that was on my list of todos but then I got distracted20:55
*** dviroel|ruck is now known as dviroel|afk20:55
clarkbianw: you're suggesting that instead of deleting all the repo content on every branch as we typically do we retire it normally otherwise and then update robots.txt to prevent it from being indexed?20:56
clarkbThe downside to that is we'd have to update robots on gerrit and gitea ya?20:56
clarkboh wait you're retiring the master branch on the gerrit repo as normal20:57
ianwright, i guess i can do that for the other20:57
ianw$ git branch -a | grep gerrit/openstack/ | wc -l20:57
ianw2020:57
ianw-ish branches ...20:57
funginormally we only delete and replace content from the master branch, i think20:57
ianwyeah20:58
clarkbin this case the master branch was never the interesting one20:58
fungias i mentioned yesterday, we could consider simply deleting the other branches as well20:58
ianwbut that still leaves the old changes indexable20:58
clarkbbut it is the one people are most likely to discover so I'm fine with that20:58
fungiany of the changes we merged onto those branches will not be garbage-collected anyway because of the named refs for them20:58
ianwso you won't be able to browse "what was openstack/opendev running when they were using gerrit 2.X" in the big picture, but individual changes are still there?21:00
ianwif "what was openstack running" is actually useful is questionable 21:00
clarkbI think I would prefer to retire them instead of deleting them?21:00
clarkbthe utility of that is probably minimal. If someone is running 2.x still (and they show up on the mailing list occasionally) they may be interested in our patches21:01
clarkbfor our own needs I'm not sure it is helpful anymore21:01
ianwmy concern is that google still looks at the change history, and indexes everything 21:01
clarkbwell I don't think we should delete changes in gerrit21:02
clarkb(we have that ability but it is something we should avoid)21:02
fungito see "what was opendev running" it's still fairly trivial to filter changes in the gerrit webui even if the branches to which they merged are no longer present21:02
ianwi tend to agree, hence the robots proposal to kick it out of search engines :)21:03
clarkbianw: that won't kick out changes though, just the branches21:03
ianwhrm, as in a search of merged changes on branch openstack/2.13?21:04
clarkbsince gitea doesn't serve the change refs via its web ui21:04
clarkboh I thought you meant specific change refs like refs/changes/56/12345621:04
clarkbit would block access to openstack/* branches via the branch refs21:04
clarkbleft a thought on https://review.opendev.org/c/openstack/project-config/+/833939 about maybe delaying the switch of the acls while we clean up branches. But we can do the other changes in the meantime?21:04
ianwsorry i'm just talking about search results, as in blocking opendev/gerrit/* stops indexing of everything21:05
clarkbbut only in gitea as proposed21:06
clarkbthe gerrit changes would still be indexed on review.opendev.org21:06
clarkbwhich is what I'm trying to clarify around21:06
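The effect of the robots block being discussed can be checked with Python's standard robots.txt parser. This is a sketch under assumptions: the rule text below is illustrative and is not taken from the proposed change.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule as it might be served by opendev.org (gitea) only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /opendev/gerrit/
"""


def is_indexable(url, robots_txt=ROBOTS_TXT):
    """Return True if a generic crawler may fetch url under robots_txt.

    robots.txt applies per host; the parser only looks at the URL path,
    so call this only with URLs on the host that serves the file.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


print(is_indexable("https://opendev.org/opendev/gerrit/commits/branch/master"))
print(is_indexable("https://opendev.org/opendev/system-config"))
```

Because the file is per host, a rule on opendev.org says nothing about review.opendev.org, which is the gap clarkb is pointing at: the gerrit changes stay indexable there unless that host carries its own rule.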
ianwyes, i don't think those changes matter, that's our changes21:06
ianwthe confusing thing is when you search for gerrit things, and then end up on opendev.org looking at our tree, but it's an "upstream" change21:07
ianwif that makes sense ... it's happened to me quite a few times, when searching for gerrit specific stuff21:07
clarkbya I think so. You want to know the current state but get our stale fork/mirror so the content isn't accurate21:07
clarkbif we retire the branches and replicate that state the indexers should catch up that way too right? and then we wouldn't need to have a special robots rule? That might be preferable?21:08
clarkb(I keep a local git clone fwiw and search through that as the google hosted repo isn't easy to navigate either)21:08
ianwi'm thinking that if the indexer hits https://opendev.org/opendev/gerrit/commits/branch/master though, although the latest commit will wipe everything, if it walks back in the history it might still be indexing all the old upstream commits still21:09
opendevreviewMerged zuul/zuul-jobs master: Fix encrypt files stat validation  https://review.opendev.org/c/zuul/zuul-jobs/+/83419421:10
clarkbit does, but I think the google indexer is smart enough to deprioritize the old code21:10
clarkbfungi: left a response to your comment on https://review.opendev.org/c/zuul/zuul-jobs/+/83419421:10
clarkbfungi: we do test the single file case earlier in the test case21:11
ianwfungi's point that https://review.opendev.org/q/project:opendev/gerrit+branch:openstack/2.13 is also a view of what we were doing is good too21:11
corvusi'd like to begin a rolling zuul restart; any thoughts?21:13
corvus(load looks quite low actually)21:13
clarkbcorvus: the changes to fix our gitea deployment are all complete so no concerns there21:14
clarkbI need to go do a school run nowish though so can't help much21:14
clarkbno objections from me21:14
fungisounds great corvus, thanks!21:16
ianwclarkb: perhaps then delete branches !master, and i could update the readme to show gerrit searches that show the changes in each branch?21:16
fungiclarkb: oh, yep i missed that other test case, awesome21:16
ianwand assume that search engines will not walk backwards and index the changes too much21:17
fungiianw: clarkb: the branch deletion approach is compelling insofar as it doesn't require bespoke .htaccess rules or the like which are only controlled by the sysadmin team... this is a solution any project can apply21:18
fungior robots.txt entries21:19
ianwi feel like most other projects aren't that concerned about indexing of the changes, though21:22
corvusbeginning rolling restart of mergers/executors21:23
opendevreviewIan Wienand proposed opendev/system-config master: [wip] gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408522:00
ianwohhh, it's always something22:48
ianwbecause the avatar is referenced like "https://173.231.255.102:3081/avatars/opendev" it gets served as text/xml, not image/svg+xml22:49
ianwso it works when you look at it, but doesn't work when embedded in a page, because ... something or other about xml namespaces blah blah22:49
ianwhrm, even adding .svg to it doesn't seem to help22:51
fungixml: this is why we can't have nice things22:51
fungiwe might be able to force it in the apache proxy layer if the urls are consistent22:52
fungibut that seems like an awful hack22:52
ianw                if err := png.Encode(w, *m); err != nil {23:01
ianwlooks like it really has to be a png23:01
ianwthat's the upload backend, that seems to convert whatever comes in to a png23:01
fungioh, huh23:22
clarkbI guess we need PNGs?23:26
clarkbsorry I decided to update laptop firmware after school run since I'll rely on laptop more with family visiting next week and that took much longer than I expected. fwupdmgr didn't work for whatever reason so I had to fall back to manual process23:27
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408523:39
opendevreviewIan Wienand proposed opendev/system-config master: gitea: set custom avatars for orgs  https://review.opendev.org/c/opendev/system-config/+/83408523:40
ianw^ png's it ... it will look like https://173.231.255.102:3081/opendev/23:41
ianw(until i delete that held node :)23:41
clarkbianw: left some thoughts23:49
clarkbianw: fungi: https://review.opendev.org/c/opendev/system-config/+/834187 that should be a good update to gitea testing to help prevent regressions like the one we just had today23:50
fungithanks for the reminder, i meant to look over that. lgtm!23:52
opendevreviewClark Boylan proposed opendev/system-config master: Rebuild Gerrit images  https://review.opendev.org/c/opendev/system-config/+/83424423:53
