Saturday, 2021-11-06

[05:23] *** seongsoocho_ is now known as seongsoocho
[15:49] * corvus yawns and stretches
[15:50] <corvus> i'll make sure the latest images are pulled
[15:51] <corvus> and i'll restart zuul now
[15:53] <corvus> https://zuul.opendev.org/components is looking good
[15:55] <corvus> this startup may take a bit longer because we're populating the connection branch cache in zk one project at a time, and that's taking about 200ms per project.
[15:55] <corvus> that should only be on initial startup though
[16:04] <corvus> up; re-enqueuing
[16:06] <corvus> done
[16:06] <corvus> debug log chatter looks normal
[16:07] <fungi> ooh!
[16:07] <corvus> starting zuul01
[16:07] <corvus> expect status page weirdness while it starts
[16:08] <fungi> zuul01.opendev.org RUNNING 4.10.5.dev191 e2d6992a
[16:14] <corvus> i think it's fully started up now
[16:15] <corvus> okay, now we wait :)
[16:15] <fungi> there are a few cases in the debug log of "AttributeError: 'NoneType' object has no attribute 'cache_key'" but they're all from well before the restart
[16:16] <corvus> oh, i didn't think to clear out the cache before starting. hopefully everything was in a reasonable state. but if we do see an error, one thing we'll want to consider is whether it was due to previously existing bad data.
[16:16] <fungi> also "TypeError: 'NoneType' object is not subscriptable" but only prior to the restart
[16:16] <fungi> (so far)
[16:17] <fungi> the only exceptions i see after the restart are "AttributeError: 'NoneType' object has no attribute 'layout'" but those were occurring before the restart as well
[16:17] <corvus> yeah, that's a status_json formatting job running before being loaded
[16:17] <corvus> so not something to worry about
[16:18] <corvus> i'd just like to share how cool it is to watch both debug logs streaming and seeing the schedulers cooperate :)
[16:19] <fungi> some builds for reenqueued queue items have already returned results
[16:19] <fungi> (and succeeded)
[16:19] <corvus> (it's also really weird to see an idle scheduler but that's the weekend for ya)
[16:21] <fungi> yeah
[19:12] <fungi> corvus: since the restart, zuul02 has logged two instances of "AttributeError: 'NoneType' object has no attribute 'getRelatedChanges'" but it's in general cleanup so i guess that could involve lingering cache content from before the restart?
[19:28] <corvus> thx, i'll take a look after i finish stuffing my face
[19:59] <corvus> i'm going to fire up the repl and inspect that error some more
[20:16] <corvus> yeah, the issue is that there was an item in the change cache with no data, and it was in the cache before the restart
[20:17] <corvus> so i should have cleared the zk state before restarting into this. but also, it might be nice if we could recover from this
[20:34] <fungi> makes sense
[20:37] <corvus> i think the best thing to do is to clear the zk state and restart; i'm not sure how productive debugging change cache issues across this version boundary is going to be
[20:37] <corvus> i'm going to go ahead and restart
[20:39] <corvus> stopped; deleting zk state now
[20:46] <corvus> starting
[20:46] <fungi> sure, thanks!
[21:08] <corvus> re-enqueuing
[21:08] <corvus> and starting zuul01
[21:16] <corvus> all up again
[21:17] <corvus> as a data point, i just ran the relevant change key collection routine (which is what was throwing the exception earlier) and it returned without error
[21:19] <fungi> oh, awesome, so definitely was all due to preexisting cache content
[22:54] <fungi> corvus: there's been at least one more "AttributeError: 'NoneType' object has no attribute 'getRelatedChanges'" since the latest restart
[22:55] <fungi> 22:47:59,755 utc in general cleanup again
[22:57] <fungi> the only new one so far
[22:58] <fungi> identical traceback though
[23:09] <corvus> thanks, hopefully we can get better data for this one.
[23:19] <corvus> oh interesting; it's the same change as before
[23:31] <corvus> i think i see the issue
