Wednesday, 2023-06-07

opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Test insecure-ci-registry.opendev.org on jammy  https://review.opendev.org/c/opendev/system-config/+/88542100:05
opendevreviewIan Wienand proposed opendev/system-config master: system-config: update to Ansible 9  https://review.opendev.org/c/opendev/system-config/+/88542200:30
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Test insecure-ci-registry.opendev.org on jammy  https://review.opendev.org/c/opendev/system-config/+/88542101:22
*** amoralej|off is now known as amoralej07:47
*** amoralej is now known as amoralej|lunch11:02
opendevreviewMerged openstack/project-config master: Cache new cirros images  https://review.opendev.org/c/openstack/project-config/+/88500511:39
fungipython 3.7.17 is likely to be the final 3.7.x point release, as it's due to be eol after this month12:09
*** blarnath is now known as d34dh0r5312:42
*** amoralej|lunch is now known as amoralejk12:58
*** amoralejk is now known as amoralej12:58
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Test insecure-ci-registry.opendev.org on jammy  https://review.opendev.org/c/opendev/system-config/+/88542113:29
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Test insecure-ci-registry.opendev.org on jammy  https://review.opendev.org/c/opendev/system-config/+/88542113:36
tonybThat ^^ is failing on bridge99 because there isn't a letsencrypt cert for my fake insecure-ci-registry99.  I don't know if I'm off in the weeds or on the right path but either way advice needed13:53
fungiheading out to run errands but should be back at the screen in an hour-ish14:17
*** ralonsoh__ is now known as ralonsoh14:26
frickleranother thing I noticed during my zuul config error cleanup: zuul continues to run check jobs when a patch has been merged. maybe this is ok since usually nobody is expected to interfere with zuul, but maybe also an option to improve this. see e.g. https://review.opendev.org/c/openstack/adjutant/+/88538214:53
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Test insecure-ci-registry.opendev.org on jammy  https://review.opendev.org/c/opendev/system-config/+/88542115:08
fungifrickler: maybe a good feature would be another pipeline flag that works sort of like the supercedes option but is specifically for cancelling buildsets in those pipelines on merge15:19
fungithough bypassing testing is an infrequent action, so i wouldn't consider it all that wasteful15:20
fungiwe don't really use supercedes in the openstack tenant because of the clean-check antipattern we have traditionally employed there, but the more typical configuration is for check pipeline builds to be cancelled automatically if the change enqueues into the gate pipeline15:21
clarkbtonyb: that looks about right, will need to check job results if it continues to fail for more info15:25
Ramerethfrickler: I see an image on your image-list page that claims to be in deleting, but on our end it's active and still has a VM using it. This is the first time I've noticed this as it's usually queued on our end. Shall I remove the VM and the image manually?15:32
Ramerethclarkb: fungi ^15:32
fungiRamereth: what's the server instance uuid? i'll take a look15:33
Ramereth876cca52-d530-47cd-a82c-e0b529323ba915:33
fungiwe've seen it before in boot-from-volume situations since nodepool may rotate out images while servers are still booted from them (but does usually indicate a leaked server in that case)15:33
fungichecking15:33
opendevreviewJames E. Blair proposed opendev/system-config master: Replace ze07-ze09  https://review.opendev.org/c/opendev/system-config/+/88517015:33
opendevreviewJames E. Blair proposed opendev/system-config master: Replace ze01-ze03  https://review.opendev.org/c/opendev/system-config/+/88550815:33
opendevreviewJames E. Blair proposed opendev/system-config master: Replace ze04-ze06  https://review.opendev.org/c/opendev/system-config/+/88550915:33
opendevreviewJames E. Blair proposed opendev/system-config master: Replace ze10-ze12  https://review.opendev.org/c/opendev/system-config/+/88551015:33
fungiRamereth: looks like that's a debian-buster-arm64 "ready" node which nodepool booted at 02:11:15Z yesterday in anticipation of jobs requesting it, but we just haven't run any jobs since then which needed one of those. i don't think it's leaked, just that the nodepool image builders will start trying to delete old images after rotating them out, and that node has been hanging around waiting to15:38
fungibe called on longer than usual15:38
clarkbit should clean itself up after that node gets used15:39
Ramerethah ok, then I'll just ignore it. I have a nagios check querying your URL notifying me when it sees one in a deleting state15:39
fungiin clouds where we don't bfv that isn't usually an issue, but obviously a bfv node can't have its backing store deleted15:39
clarkbwe might want to set ready nodes for older labels to 0 though15:39
RamerethFWIW I did notice that url was returning 503 yesterday at one point15:39
fungiyeah, i'm good with that. we don't need to pre-boot buster nodes15:39
clarkbRamereth: which url? was it https://nb04... something?15:40
Ramerethyup15:40
opendevreviewJames E. Blair proposed opendev/zone-opendev.org master: Replace ze07-ze09  https://review.opendev.org/c/opendev/zone-opendev.org/+/88516815:40
opendevreviewJames E. Blair proposed opendev/zone-opendev.org master: Replace ze01-ze03  https://review.opendev.org/c/opendev/zone-opendev.org/+/88551315:40
opendevreviewJames E. Blair proposed opendev/zone-opendev.org master: Replace ze04-ze06  https://review.opendev.org/c/opendev/zone-opendev.org/+/88551415:40
opendevreviewJames E. Blair proposed opendev/zone-opendev.org master: Replace ze10-ze12  https://review.opendev.org/c/opendev/zone-opendev.org/+/88551515:40
clarkbthat was likely due to me stopping the backend service and upgrading things and doing a reboot15:40
Ramerethgotcha15:41
corvusfungi: clarkb you can also set max-ready-age to force old nodes to get gc'd (and still keep ready nodes)15:41
fungialso a fine alternative in my opinion15:41
fricklermax-ready-age=1d would seem to be a good match to our image rotation timing15:48
fricklerand not overly wasteful, either15:48
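The max-ready-age idea above can be sketched as a simple age check. This is only an illustration of the reasoning, not nodepool's implementation: the helper name `is_expired` is hypothetical, the one-day limit is the value suggested in the discussion, and the timestamps are taken from the debian-buster-arm64 node mentioned earlier (booted 02:11:15Z the previous day, inspected around 15:38Z).

```python
from datetime import datetime, timedelta, timezone

# Assumed limit, taken from the "max-ready-age=1d" suggestion in the discussion.
MAX_READY_AGE = timedelta(days=1)


def is_expired(ready_since, now, max_ready_age=MAX_READY_AGE):
    """Return True if a ready node has sat unused longer than max_ready_age.

    Hypothetical helper for illustration; nodepool's own logic may differ.
    """
    return now - ready_since > max_ready_age


# Timestamps from the log: node became ready 2023-06-06 02:11:15Z,
# and was looked at on 2023-06-07 around 15:38Z.
ready_since = datetime(2023, 6, 6, 2, 11, 15, tzinfo=timezone.utc)
now = datetime(2023, 6, 7, 15, 38, tzinfo=timezone.utc)

# One day expressed in seconds, in case the setting is given in seconds.
print(int(MAX_READY_AGE.total_seconds()))
print(is_expired(ready_since, now))
```

With a one-day limit, the node in question would already have been garbage-collected, so the old image could have been deleted without waiting for a job to consume the node.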
tonybclarkb: Thanks.  making slow progress, at least it's failing in testinfra now :)15:50
fricklerwhile we're at failing things, where are we at with repairing wheel builds? I think ianw had some patches up for that?15:52
clarkbfrickler: I'm not sure. I thought where we were at was cleaning up unneeded wheels but that builds were working? It has been a bit since I looked at it though15:55
fricklerclarkb: well the grafana AFS page says last released 1 month ago, I didn't dig deeper yet15:57
fricklerhttps://review.opendev.org/c/openstack/project-config/+/879722 is the patch I had in mind, that would at least decouple the failures. since we don't have working releases anyway, I would think it is a low-risk patch we can simply give a try by now16:08
frickleralso c9s seems to be the culprit once again with openafs failing https://zuul.opendev.org/t/openstack/build/cb8ce090d03d4b039501ea6c3ea87beb16:12
corvusi'm going to restart zuul-web16:12
fricklercorvus: anything happening in particular? (just being curious)16:13
corvusfrickler: oh just want to get the new errors page up16:16
corvushttps://zuul.opendev.org/t/openstack/config-errors16:17
clarkbfrickler: on that wheels change I left a small but important suggestion. Otherwise ya I think we can land that16:18
tonybclarkb: I think based on https://6badbd21c5540c7fe6af-e13746d46d33f29609826c7d7a815da2.ssl.cf1.rackcdn.com/885421/5/check/system-config-run-docker-registry/1b4d7b6/insecure-ci-registry99.opendev.org/docker/registry-docker_registry_1.txt that the new node isn't getting the correct group vars16:23
tonybclarkb: which I'd expect to be: https://6badbd21c5540c7fe6af-e13746d46d33f29609826c7d7a815da2.ssl.cf1.rackcdn.com/885421/5/check/system-config-run-docker-registry/1b4d7b6/bridge99.opendev.org/etc/ansible/hosts/group_vars/registry.yaml16:23
tonybAm I missing something that maps the new hostname into the correct group/role?16:24
clarkbtonyb: inventory/service/groups.yaml defines the groups and the new *99 host should match the registry group I think16:26
clarkbtonyb: I did leave a comment about a small thing I noticed (won't be the cause of the issue but may help debug it?)16:29
clarkbERROR! The requested handler 'letsencrypt updated insecure-ci-registry99-main' was not found in either the main handlers list nor in the listening handlers list is curious because it seems like that handler is right there in the handlers file...16:32
clarkbyou can copy that string and ^F it and it matches16:32
tonybI think you're looking at patchset 4, 5 should fix that16:33
clarkboh yup I was looking at a stale job run16:34
clarkbok we hardcode in the clouds.yaml that the cloud is rax so the errors trying to auth there are expected16:39
clarkbbut that doesn't explain why it isn't listening on port 500016:39
tonybI was assuming that the errors were fatal so the registry errors out16:40
clarkbIf that is the case I'm not sure how this test ever succeeded. Maybe push up a noop change separately to see what the current state looks like?16:41
tonybIf I modify zuul.d/system-config-run.yaml can I safely add /var/registry/conf to the logs?16:42
tonybYup I'll do that16:42
clarkbya the test env shouldn't have any real world credential access16:43
clarkbthey aren't even zuul secrets it is completely separated so fetching that in the test env should be fine16:43
tonybokay cool16:45
opendevreviewTony Breeds proposed opendev/system-config master: [dnm] checking testing for the existing registry  https://review.opendev.org/c/opendev/system-config/+/88552416:46
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Test insecure-ci-registry.opendev.org on jammy  https://review.opendev.org/c/opendev/system-config/+/88542116:49
corvusclarkb: fungi do you think we're ready to roll the executors?  did the new ze01 and ze02 get normalized wrt afs?16:56
clarkbI haven't checked yet sorry.16:58
clarkbcorvus: I think ze02 was never modified. Only ze01 was and I want to say fungi did that?16:58
opendevreviewDr. Jens Harbott proposed openstack/project-config master: wheel builds : move to individual releases  https://review.opendev.org/c/openstack/project-config/+/87972216:59
fricklerclarkb: ^^ just updated with your comment17:00
clarkb+2 thanks17:02
fungicorvus: clarkb: yes, new jammy ze01 was downgraded back to distro openafs packages after i cleaned up the ppa17:13
fungishould be ready to roll as long as we're settled on https://review.opendev.org/885419 i think?17:14
fungithat's the last remaining bit and only tangentially related, probably not a blocker17:14
fricklercorvus: https://zuul.opendev.org/t/openstack/config-errors looks much nicer already. the "name: Unknown" is still WIP I guess?17:15
fungialso the project detail view will remain broken until we can roll out https://review.opendev.org/88515517:17
fricklerthe one complain I would have is that the "blue bell" link now goes directly to that page and no longer shows the total error count, so more difficult to track my progress17:19
frickler+t17:19
corvusfrickler: yes, we can increase the fidelity there over time; there are already some other errors, just none present in opendev now.  can easily add a counter at the top of the page for total results.17:19
fricklerthat'd be nice, thx17:20
fricklerjust doing some wishful thinking, if we could somehow integrate the openstack governance project/team <-> repo mapping into the zuul config and filter things with that, that might also be helpful on multiple occasions, like not only errors, but also status and builds17:38
*** amoralej is now known as amoralej|off18:10
fungilonger term the idea was that openstack would have its own zuul tenant, and then the tenant views would be exactly that18:15
fricklerfungi: I don't understand your reply, openstack does have its own tenant mostly? I don't think one tenant per project like nova/neutron etc. would be feasible?18:48
fricklermaybe filtering by queue would work, at least for most non-integrated projects18:49
fricklerhmm, the queues on https://zuul.opendev.org/t/openstack/status are shown only for the gate pipeline, not for check or others, is that intentional?18:50
fricklerafaict rate limiting per queue still affects the check pipeline, too, so would be useful to have that information there?18:51
fricklercorvus: ^^?18:51
fungifrickler: oh, sorry i thought you meant filtering to just openstack repos so you could ignore everything non-openstack that's currently sharing the tenant18:58
fungiyes custom query views for config errors could be interesting18:58
fricklerfungi: no, my idea was to have an URL I could give to e.g. the ironic team saying: these are the errors in your projects, please check and fix them19:00
fungiit may be that queue filtering for status only returns the ones for dependent pipelines, so independent pipelines like check are omitted since they form independent per-change queues19:00
fungithough i don't immediately see why those couldn't still be filtered on queue name19:01
fricklerand then I thought that it might as well be useful for the ironic team to have a link that filters just their zuul builds or just their in-progress things on the status page19:01
frickler(choosing ironic as guinea pig because they have a rather high number of repos)19:01
fungiyou can give them a link to filter to just their builds or in-progress changes today, but it would have to be generated with a separate tool that constructs the query parameters based on what's in the governance file19:02
fungifor zuul to have some integrated functionality to do that on the fly, we'd probably need to define a standard grouping mechanism of some kind (named queues are already used for other things cross-team, therefore not good to overload for this purpose)19:03
fricklerah, I guess I could write such a tool and place the link onto some personal page as a first iteration, that's a nice idea19:05
fungisomething like the project groups concept in storyboard could make sense there. basically annotate a project with a list of zero or more groups to which it belongs, and then extend the query api to support filtering by one or more group names19:05
frickleror more general tags, but yes, similar idea19:05
fungiyeah, you could call them tags, groups, or frobnules, doesn't really matter19:06
corvusyeah, i think third-party tool to construct those queries sounds good; both the builds and config-errors page intentionally use the same query params for that19:08
fungifurther development could include known group names in the drop-down filter selectors for some dashboard views, and possibly a groups view where you can get a list of groups within the tenant similar to the current top-level tenants view that takes you directly to group-filtered views of things19:08
fungiprobably something that merits a zuul spec19:08
corvusi'm hesitant to add an organizational grouping system to zuul19:08
corvusthat's called tenants19:09
fungiunderstandable19:09
fungistarting with a build-your-own-dashboard-query tool external would be good to see how much use people find the idea anyway19:10
fungisimilar to the gerrit dashboard query builder tools that have been floating around19:10
corvus++19:11
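A minimal sketch of the external query-builder tool discussed above, assuming the dashboard pages accept a repeated `project` query parameter and that the governance data follows a team -> deliverables -> repos layout. The `governance` dict below is a hand-written, hypothetical excerpt rather than the real file, and the function names `team_repos` and `team_url` are made up for illustration.

```python
from urllib.parse import urlencode

ZUUL_BASE = "https://zuul.opendev.org/t/openstack"

# Hypothetical excerpt; a real tool would load this from the governance file.
governance = {
    "ironic": {
        "deliverables": {
            "ironic": {"repos": ["openstack/ironic"]},
            "ironic-python-agent": {"repos": ["openstack/ironic-python-agent"]},
        }
    }
}


def team_repos(governance, team):
    """Collect every repo listed under a team's deliverables, sorted."""
    repos = []
    for deliverable in governance[team]["deliverables"].values():
        repos.extend(deliverable.get("repos", []))
    return sorted(repos)


def team_url(page, governance, team):
    """Build a dashboard URL for `page` filtered to the team's repos.

    Assumes the page takes one `project` query parameter per repo.
    """
    params = [("project", repo) for repo in team_repos(governance, team)]
    return f"{ZUUL_BASE}/{page}?{urlencode(params)}"


print(team_url("builds", governance, "ironic"))
print(team_url("config-errors", governance, "ironic"))
```

Because the builds and config-errors pages are said to share query parameters, the same helper can produce both links; only the page name changes.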
jrosser`parentproject:openstack/openstack-ansible` pulls dozens of repos into our gerrit dashboard - perhaps that achieves something similar19:20
fungii haven't looked, but probably that's telling gerrit to return all projects whose gerrit config (acl) inherits from the same project. in the case of official openstack repos, they all inherit from the openstack/meta-config project19:26
fungithat way the openstack project is able to set global gerrit policy in one place, simplifying the per-project configs19:26
fricklerthat seems to be a special thing only used for openstack-ansible, never heard of that before, too. https://gerrit-review.googlesource.com/Documentation/cmd-create-project.html19:29
fricklerhttps://paste.opendev.org/show/b3m54jouBhaA9ypX8RLC/19:30
fungiopenstack/openstack-ansible-roles does an inheritFrom = openstack/openstack-ansible19:34
jrosseranyway - creating dashboard queries got completely out of hand, needing continuous adjustment to keep newly created other repos with similar names out19:35
jrosserand that's pretty much sorted it completely and made the query quite compact19:36
fungifwiw, that's the only inheritFrom in openstack namespace acls other than openstack/meta-config being inherited by everything19:37
fricklermaybe that could be used to simplify the setup for some other teams, too19:40
fricklerbut I don't think that zuul could use that information19:40
fungiagreed, that's very gerrit-specific19:40
corvuswildcard support in searches might be useful here too19:57
fungiyes, or regular expressions (though those have greater chance of security risks)19:58
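As a rough illustration of the wildcard option, shell-style patterns can be matched with Python's standard `fnmatch` module. This is only a sketch of the idea, not how Zuul searches work; the project list is an illustrative sample and `match_projects` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def match_projects(projects, pattern):
    """Return the project names matching a shell-style wildcard pattern."""
    return [name for name in projects if fnmatchcase(name, pattern)]


# Illustrative sample of project names.
projects = [
    "openstack/ironic",
    "openstack/ironic-lib",
    "openstack/nova",
    "x/ironic-staging-drivers",
]

print(match_projects(projects, "openstack/ironic*"))
```

Wildcards only offer `*`, `?`, and character sets, which keeps the patterns simple compared with accepting arbitrary user-supplied regular expressions.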
clarkbgitea 1.20 just got a release candidate22:17
clarkbI don't know what is in it yet22:17
opendevreviewIan Wienand proposed opendev/system-config master: [dnm] trigger and openafs build to get error logs  https://review.opendev.org/c/opendev/system-config/+/88555722:23
ianwhttps://8120cec50dbf22b0ed97-a15027182aab035aa882f99410b51a23.ssl.cf2.rackcdn.com/885557/1/check/system-config-zuul-role-integration-centos-9-stream/f19c651/dkms-make-logs/make.log23:23
ianwthat sure is a lot of errors23:23
ianwi can't find any obvious references to those errors in current git or gerrit23:32
ianwthere seem to be 4 options23:39
ianw1) ignore it and hope openafs releases something that fixes it eventually23:39
ianw2) work w/ openafs to fix it, backport required patches in the mean time and figure out how to deploy them to the rpms we use23:40
ianw3) perhaps like Fedora, consider 9-stream too much of a moving target to keep wheel up-to-date for23:41
ianw4) rework the publishing to go through the executor, so that the wheel build environments don't need openafs.  or indeed do something completely different like containerised builds etc etc23:42
ianwdespite writing it, I'm not 100% convinced on the ignore failure proposed by https://review.opendev.org/c/openstack/project-config/+/879722, and essentially implementing 1) here.  23:44
