Saturday, 2022-02-26

[00:03] <clarkb> fungi: also it is the testinfra addition that is causing images to be built. A bad idea, but an idea nonetheless: remove that and put it in a separate change?
[00:12] <corvus> i'm going to restart the scheduler on zuul01
[00:17] <corvus> db migration in progress
[00:22] <corvus> migration complete; zuul01 is reconfiguring
[00:24] *** persia is now known as Guest566
[00:27] <clarkb> ya 831064 looks like it will work. I suspect this is specifically a problem when two jobs try to push the same blob at roughly the same time and the blob being large makes it easier to meet that criteria (as the upload takes longer)
[00:29] <corvus> work==fail testing as expected?
[00:29] <corvus> zuul01 is up; restarting zuul02 now, including web/finger
[00:33] <Clark[m]> corvus: no it succeeded because I removed the 3.5 Gerrit build job which keeps it and the 3.4 build from pushing at the same time and inducing the error
[00:33] <Clark[m]> I think that we notice with Gerrit because we build two near identical images with layer overlap and some of those layers are large which means there is more opportunity to conflict with uploads in the registry
[00:36] <corvus> got it
[00:37] <corvus> there is an error due to the upgrade; i'm triaging the severity now
[00:46] <corvus> okay, the fix is posted in #zuul; the error will prevent items currently in pipelines from being reported.
[00:47] <corvus> infra-root: if we want, we can merge the zuul fix and i can do a scheduler restart with it.  that process will probably take 45 minutes minimum, but it would both confirm the fix and allow us to resume with no loss of service.  i'm happy to do that if we are okay with the pause in service in the mean time.
[00:48] <corvus> if we don't want to do that, then we'll need to accept the loss of the current queues, and i can just delete them.  i prefer option 1.
[00:50] <corvus> it's worth noting that changes enqueued after the restart will not be affected by this bug
[00:56] <Clark[m]> I can take a look really quick just need to pause dinner prep
[00:58] <clarkb> yup lgtm. I think we can proceed with your plan
[01:03] <corvus> btw, if you mouseover the word 'queued' on the status page now, it'll tell you the node request id it's waiting for, or if it's waiting for an executor.
[01:04] <corvus> and.... since you can't copy that, i guess you could write it down on a piece of paper with pencil, then type it in.... :)
[01:04] <corvus> or use links
[02:12] <opendevreview> James E. Blair proposed openstack/project-config master: Add Zuul performance metrics dashboard  https://review.opendev.org/c/openstack/project-config/+/831071
[03:05] *** mazzy5098812929580857 is now known as mazzy509881292958085
[05:41] <corvus> the fix merged; i'm restarting zuul01 now
[05:42] <corvus> and zuul02 at the same time
[05:47] <corvus> bunch of reports going to gerrit now.  looks like that fixed the issue.
[18:13] <corvus> there is a ux bug in zuul which shows items in pipelines which are not really there... i have proposed a fix (see #zuul room)
[22:14] *** dhill is now known as Guest648
