Sunday, 2022-01-23

<frickler> corvus: zuul shows https://review.opendev.org/825849 to be in the check queue since 38h, so likely not related to the zuul restart, but maybe worth looking at anyway? otherwise gibi likely would have to rebase or something like that, since the "recheck" didn't help when checks are still assumed to be running [09:22]
<fungi> frickler: that seems like it may coincide with when i downed all the executor containers on friday and rebooted the servers [12:59]
<fungi> i wonder if one of the builds didn't get rescheduled [12:59]
*** clarkb is now known as Guest287 [13:55]
<fungi> last mention of that build (97381c05a56d4d6b99bf55a8b1a30a9b) was 2022-01-21 19:31:05,974 on ze07.opendev.org when i started the executor container after the reboot [14:48]
<fungi> "Held status set to False" and then "Deleting stale jobdir" [14:48]
<fungi> last mention in the scheduler logs was 2022-01-21 18:31:36,245 on zuul01 when the build was being started [14:53]
<fungi> the executor debug log has this, which i think is what should have recorded the build getting prematurely terminated? [14:55]
<fungi> 2022-01-21 19:19:48,906 DEBUG zuul.ExecutorQueue: [e: d8d565d62f874a67846ae1170f993e9c] [build: 97381c05a56d4d6b99bf55a8b1a30a9b] Updating request <BuildRequest 97381c05a56d4d6b99bf55a8b1a30a9b, job=tempest-ipv6-only, state=completed, path=/zuul/executor/unzoned/requests/97381c05a56d4d6b99bf55a8b1a30a9b zone=None> [14:55]
<fungi> so if that request is marked complete, why haven't the schedulers acted on it? [14:58]
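
When tracing a build across executor and scheduler logs like this, the fields of the pasted debug line can be extracted mechanically. The following is a minimal sketch that assumes the layout of the single line quoted above; the regular expression and the function name `parse_build_request` are illustrative and not part of Zuul, and other Zuul versions may format the line differently.

```python
import re

# Matches the "Updating request <BuildRequest ...>" debug line quoted above.
# Assumption: the layout is inferred from that one pasted line only.
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>\w+) (?P<logger>[\w.]+): "
    r"\[e: (?P<event>[0-9a-f]+)\] \[build: (?P<build>[0-9a-f]+)\] "
    r"Updating request <BuildRequest (?P<request>[0-9a-f]+), "
    r"job=(?P<job>[^,]+), state=(?P<state>[^,]+), "
    r"path=(?P<path>\S+) zone=(?P<zone>[^>]+)>"
)


def parse_build_request(line):
    """Return the named fields of a matching log line, or None if it does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


# The line pasted in the channel, used here as sample input.
SAMPLE = (
    "2022-01-21 19:19:48,906 DEBUG zuul.ExecutorQueue: "
    "[e: d8d565d62f874a67846ae1170f993e9c] "
    "[build: 97381c05a56d4d6b99bf55a8b1a30a9b] "
    "Updating request <BuildRequest 97381c05a56d4d6b99bf55a8b1a30a9b, "
    "job=tempest-ipv6-only, state=completed, "
    "path=/zuul/executor/unzoned/requests/97381c05a56d4d6b99bf55a8b1a30a9b "
    "zone=None>"
)

fields = parse_build_request(SAMPLE)
print(fields["build"], fields["job"], fields["state"])
```

Filtering on `state` and `build` in this way would make it easier to confirm that the executor recorded the request as completed while the scheduler logs show no later mention of the same build.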
*** diablo_rojo_phone is now known as Guest292 [15:16]
*** Guest287 is now known as clarkb [16:44]
<clarkb> fungi: I would assume some sort of miss of a zk state change [16:44]
<clarkb> either because it didn't happen or the scheduler didn't process it properly? [16:45]
<clarkb> I was going to push a change to compare gerrit config files but it seems ianw already did that in the 3.4 upgrade prep work (based on the etherpad content) [16:47]
<opendevreview> Clark Boylan proposed opendev/bindep master: DNM This is a simple reproducer change for a setuptools bug  https://review.opendev.org/c/opendev/bindep/+/825973 [17:02]
<opendevreview> Clark Boylan proposed opendev/bindep master: DNM This is a simple reproducer change for a setuptools bug  https://review.opendev.org/c/opendev/bindep/+/825973 [17:04]
*** rlandy__ is now known as rlandy [17:34]
<corvus> there is a known race condition with executors crashing and leaving jobs stuck.  you can probably just dequeue/enqueue the change to fix. [18:54]
<fungi> got it, so nothing really to look into which warrants leaving the item enqueued? [18:55]
<opendevreview> Neil Hanlon proposed openstack/diskimage-builder master: Add new container element - Rocky Linux  https://review.opendev.org/c/openstack/diskimage-builder/+/825957 [22:13]
<opendevreview> Neil Hanlon proposed openstack/diskimage-builder master: Add new container element - Rocky Linux  https://review.opendev.org/c/openstack/diskimage-builder/+/825957 [22:20]
<fungi> frickler: i was tempted to try out the authenticated webui for this, but laziness won over since i'm on the sofa with a system that doesn't have easy access to my web id, so i ran zuul-client dequeue and then enqueue on one of the schedulers with the following parameters and fresh builds are now in progress: --tenant=openstack --pipeline=check --project=openstack/placement --change=825849,1 [22:22]
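
The dequeue and enqueue pair described above can be written out as two `zuul-client` invocations that share the same parameters. This is a small sketch that only assembles and prints the command lines; the helper name `requeue_commands` is hypothetical, and any connection or authentication options that `zuul-client` needs are not shown in the log, so they are omitted here.

```python
import shlex


def requeue_commands(tenant, pipeline, project, change):
    """Return the zuul-client dequeue and enqueue command lines for one change.

    Hypothetical helper: it only builds argument lists and runs nothing.
    """
    common = [
        f"--tenant={tenant}",
        f"--pipeline={pipeline}",
        f"--project={project}",
        f"--change={change}",
    ]
    # Dequeue first so the stuck item is dropped, then enqueue a fresh one.
    return [["zuul-client", action, *common] for action in ("dequeue", "enqueue")]


# Parameters exactly as quoted in the channel.
for argv in requeue_commands("openstack", "check", "openstack/placement", "825849,1"):
    print(shlex.join(argv))
```

Building the argument lists separately from executing them keeps the sketch safe to run anywhere; passing each list to `subprocess.run` on a host with a configured `zuul-client` would be the natural next step.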
<fungi> the closest solution for a normal user would be to abandon and restore the change on gerrit, which should hopefully trigger similar actions [22:23]
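
For a user without Zuul admin access, the abandon and restore steps can also be driven through Gerrit's REST API instead of the web UI. The sketch below only constructs the two endpoint URLs, assuming the standard Gerrit `abandon` and `restore` change endpoints under the authenticated `/a/` prefix; the helper name `abandon_restore_urls` is hypothetical, credentials are left out, and whether Zuul reacts to these events depends on the pipeline configuration.

```python
GERRIT = "https://review.opendev.org"


def abandon_restore_urls(change_number, base=GERRIT):
    """Return the URLs to POST to, in order, to abandon and then restore a change.

    Hypothetical helper: it builds URLs only and sends no requests.
    """
    return [
        f"{base}/a/changes/{change_number}/{action}"
        for action in ("abandon", "restore")
    ]


# Change number taken from the discussion above.
for url in abandon_restore_urls(825849):
    print("POST", url)
```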
