Monday, 2023-09-04

fricklergood mornings, approved openstack ansible 8 patch now, let the fireworks begin ;)04:48
opendevreviewMerged openstack/project-config master: Switch OpenStack's Zuul tenant to Ansible 8 by default  https://review.opendev.org/c/openstack/project-config/+/89329004:56
fricklernothing exploded at first sight. I wonder though whether we'll be able to detect false positives, where jobs might not be failing when they should have, due to some change in ansible behaviour05:26
opendevreviewHarry Kominos proposed openstack/diskimage-builder master: feat: Add new fail2ban elemenent  https://review.opendev.org/c/openstack/diskimage-builder/+/89254108:03
Clark[m]frickler thank you for merging that. I'm working on BBQ things so I'll try to check in periodically throughout the day.12:09
fungii'll continue to check in whenever it won't get me in trouble with christine12:11
fricklerClark[m]: fungi: checking open patches I found https://review.opendev.org/c/opendev/system-config/+/891869 , I don't think there's any special risk in merging that one?12:37
fungiyeah, we just both +2'd a few seconds apart and neither of us noticed12:45
fungii've approved it now12:45
fungior will once my gertty decides it has to use ipv4 to get to vexxhost12:45
fricklerah, I didn't notice that the reviews were that close to each other, else I might have just workflowed myself, thx12:47
opendevreviewMerged opendev/system-config master: Run bootstrap-bridge with empty nodeset  https://review.opendev.org/c/opendev/system-config/+/89186912:53
opendevreviewMerged zuul/zuul-jobs master: Drop Helm v2 support to fix v3 issue  https://review.opendev.org/c/zuul/zuul-jobs/+/88598714:07
gthiemongeHi Folks, we have a patch in gerrit that doesn't trigger the CI: https://review.opendev.org/c/openstack/octavia/+/89364914:46
gthiemongewe had a previous version of this patch (different change-id) that had the same issue (in the last patchsets) https://review.opendev.org/c/openstack/octavia/+/87804314:47
gthiemongeany idea?14:47
Clark[m]gthiemonge: quick guess is the grenade job explicitly doesn't match the jobs.yaml file so it isn't triggered14:50
gthiemongeClark[m]: do you mean that this file would be in irrelevant-files?14:54
fungigthiemonge: or not included in the files list14:55
Clark[m]You could edit a python file in the change to check that. As .py should be included not excluded14:58
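
For readers following the files / irrelevant-files discussion above, here is a minimal sketch of how such matchers could decide whether a job runs for a change. It is an illustration only, not Zuul's actual implementation: the function name `job_matches`, the example paths, and the use of `re.match` are all assumptions.

```python
import re


def job_matches(changed_files, files=None, irrelevant_files=None):
    """Hypothetical helper: would a job with these matchers run for the change?"""
    if files:
        patterns = [re.compile(p) for p in files]
        # With a `files` list, at least one changed file must match a pattern.
        if not any(p.match(f) for f in changed_files for p in patterns):
            return False
    if irrelevant_files:
        patterns = [re.compile(p) for p in irrelevant_files]
        # With `irrelevant-files`, the job is skipped only when every
        # changed file matches one of the patterns.
        if all(any(p.match(f) for p in patterns) for f in changed_files):
            return False
    return True


# Example paths are made up for illustration.
only_config = ["zuul.d/jobs.yaml"]
config_and_code = ["zuul.d/jobs.yaml", "octavia/api/app.py"]
py_only = [r"^octavia/.*\.py$"]

print(job_matches(only_config, files=py_only))      # job would be skipped
print(job_matches(config_and_code, files=py_only))  # job would run
```

Under this model, touching a .py file in the change is a quick way to tell a matcher problem apart from something else, which is the check suggested above.
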
gthiemongehmm trying...15:01
gthiemongeI saw it briefly in https://zuul.openstack.org/status#893649 but then it disappeared15:02
Clark[m]Yes, all changes show up on the status page as they are evaluated. But go away if there is no work to do.15:13
Clark[m]Might need to grep the zuul server log to see why it is deciding to ignore the change15:15
gthiemongeClark[m]: for instance, patchset 10 in https://review.opendev.org/c/openstack/octavia/+/893649 has changes in 2 .py files, I enabled only a few jobs. but I don't see anything in zuul15:36
clarkbgthiemonge: the zuul server log reports patchset 10 has an invalid configuration. And reporting this failed because "'NoneType' object has no attribute 'result'". Not reporting seems to be a zuul bug16:41
clarkbnow to figure out where the configuration error is16:42
clarkbhttps://paste.opendev.org/show/bwf5l06NWrJQooB9EInr/ there is the relevant log. Unfortunately it doesn't add more context as to what the actual error is16:43
clarkbthe only current items reported by zuul are the warnings for the regex negative lookaheads. Those shouldn't be errors but maybe we are inadvertently treating them as such somewhere?16:45
clarkbcc corvus: couple of thoughts, maybe we should always explicitly log the errors when we record "invalid config for change" in the zuul log? that way if there are reporting errors to gerrit we have some info?16:47
clarkbalso it looks like the reporting error is because we put the builduuid in the report but that doesn't exist yet. So we need some sort of check for that before rendering the more verbose reports?16:47
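
The kind of check described above can be sketched roughly as follows. This is a hypothetical illustration of guarding against a build that does not exist yet, not the actual Zuul reporter code: `Build`, `format_job_line`, and the message wording are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Build:
    # Hypothetical stand-in for a build record; field names are assumptions.
    uuid: str
    result: str


def format_job_line(job_name: str, build: Optional[Build]) -> str:
    """Render one report line, tolerating a build that was never created."""
    if build is None:
        # Without this guard, build.result raises
        # "'NoneType' object has no attribute 'result'".
        return f"- {job_name}: no build available"
    return f"- {job_name}: {build.result} (build {build.uuid})"


print(format_job_line("example-job", None))
print(format_job_line("example-job", Build(uuid="abc123", result="SUCCESS")))
```
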
clarkbis it possible that zuul is ok loading existing configs with warnings but won't accept them in new updates? That could explain the behavior I guess.16:54
clarkbI feel like that would also be a bug if that is the cause here16:54
clarkbI think I see the bug maybe17:00
clarkbyup give me a few minutes to sit down and process it enough to write a fix17:01
gthiemongeclarkb: wow, thanks17:06
gmannyeah, even the same change ran the tests 10 days ago so something recently changed in zuul side?17:26
fungigmann: our zuul deployment upgrades automatically every saturday, so things are changing weekly. this looks to be related to new features that support non-fatal configuration warnings and potentially triggered by the new deprecation warning for regex lookarounds that the google-re2 implementation doesn't support17:29
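
To make the lookaround issue concrete: a negative lookahead such as `^(?!stable/).*$` works with Python's `re` module but is not supported by RE2-style engines. A pattern like that can often be rewritten as a plain match whose result is inverted. The sketch below uses only the standard library; the branch names and the helper `matches_negated` are assumptions made for illustration.

```python
import re

# Lookahead form: matches any branch that does not start with "stable/".
lookahead = re.compile(r"^(?!stable/).*$")

# Lookaround-free form: match "stable/" and invert the result.
plain = re.compile(r"^stable/")


def matches_negated(pattern, value):
    """Hypothetical helper: True when the plain pattern does NOT match."""
    return pattern.match(value) is None


branches = ["master", "stable/2023.1", "feature/x", "stable"]
for branch in branches:
    # Both forms should agree on every sample branch.
    assert bool(lookahead.match(branch)) == matches_negated(plain, branch)
```
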
clarkbremote:   https://review.opendev.org/c/zuul/zuul/+/893682 Allow new configs to be used when warnings are present17:30
clarkbI think that is the fix17:30
gmannack, thanks. will wait for this. 17:31
clarkbI haven't actually run that test case locally because I don't have a zuul test suite running currently here17:31
clarkbthis reminds me I still need to debug my laptop display artifacts...17:31
carlossclarkb: thanks for working on the fix for the issue pointed out by gthiemonge. We are hitting the same issue in Manila, and we currently have our gate blocked, since a change was merged in Nova (bumping the libvirt version)19:11
carlosswe need to do some changes to our CI jobs to unblock the gate19:12
corvuscarloss: check back in about an hour; that's about the minimum time it'll take to merge and restart19:18
carlosstyvm corvus :D19:19
frickler2nd patch is failing in gate :(19:34
fricklerbut just to double check my understanding, this only affects changes that modify zuul config for a project that has regex warnings? and only scheduler needs the fix or executors, too?19:36
Clark[m]Correct and only the schedulers I think19:39
fungithe second change is just a fix for testing, so less urgent19:39
Clark[m]Ya the second change is something I noticed when writing the new test19:40
corvuscan we move this to #zuul:opendev.org ?19:41
frickleryou can discuss the details of the issue and fix over there, but for assessing the impact on opendev and keeping our consumers informed, I think this channel is more suited19:51
fricklerspeaking of the latter, how about a status notice?19:51
Clark[m]Something like #status notice Gerrit changes including configuration updates may fail due warnings in the configuration. Investigation for a fix is ongoing.19:55
Clark[m]I'm not at home where I'm auth'd to send that though19:55
fricklerI can send it, I'd just amend "... due to warnings ..."19:56
frickleralso I wasn't aware that only authed users can send these. though of course that makes sense19:56
frickleranother question, if a fix turns out to need more time, can we restart the executors on an older version? or would we need to wait for a revert (of which patch(es)) and run a new image?19:57
fricklercorvus: fungi: do you agree with the above notice clark proposed?19:58
fungimaybe "silently fail"? since we haven't observed any user-facing feedback on those, right?19:59
corvusyes, i was referring to the details of the patch and its testing regime :)19:59
corvusi think the status should include the word zuul19:59
frickler"... may fail to be tested by zuul due to ..."?20:00
corvusi would suggest: "Gerrit changes including Zuul configuration updates may silently fail.  A fix is in progress."20:00
corvusactually: "Gerrit changes that update Zuul configuration may silently fail.  A fix is in progress." is better i think20:01
fricklerI think it would be better to be specific about not getting any response from zuul at all. "failing" will likely be associated with a V-1 which is not happening20:02
corvus"Some Gerrit changes that update Zuul configuration may fail with no response from Zuul.  A fix is in progress."20:03
fungiagtm20:03
fungier, sgtm20:03
fricklerack, do you want to send yourself or shall I?20:03
corvusi can20:04
corvus#status notice Some Gerrit changes that update Zuul configuration may fail with no response from Zuul.  A fix is in progress.20:04
opendevstatuscorvus: sending notice20:04
-opendevstatus- NOTICE: Some Gerrit changes that update Zuul configuration may fail with no response from Zuul. A fix is in progress.20:04
fricklerreading the latest in #zuul it sounds like the fix does look valid after all, so my other question may not be as relevant any more20:05
opendevstatuscorvus: finished sending notice20:07
*** mmalchuk_ is now known as mmalchuk20:13
*** jonher_ is now known as jonher20:13
corvusthe change merged but please hold on restarting20:56
corvusi think we need one more fix to cover the cases we saw today; details in #zuul:opendev.org 21:08
fungithanks!21:15
corvusokay, more robust fix is enqueued; eta +1h21:33
corvusi'm going to begin restarting schedulers now22:14
corvushttps://review.opendev.org/c/openstack/octavia/+/893649 looks good now -- anything else to double check?  or should we send the all clear status?22:23
corvushow about this? status notice Gerrit changes with updates to Zuul's configuration should now be handled correctly.  Recheck any changes to Zuul configuration which did not report results.22:26
corvusClark: fungi ^?22:29
corvusgthiemonge: carloss ^ things should be fixed if there's anything else you want to check22:30
fungicorvus: sorry, stepped away. lgtm22:33
fungiand thanks again!22:34
carlosscorvus: apparently all good now. thank you!22:35
corvus#status notice Gerrit changes with updates to Zuul's configuration should now be handled correctly.  Recheck any changes to Zuul configuration which did not report results.22:36
opendevstatuscorvus: sending notice22:36
-opendevstatus- NOTICE: Gerrit changes with updates to Zuul's configuration should now be handled correctly. Recheck any changes to Zuul configuration which did not report results.22:36
fungiyay!22:37
opendevstatuscorvus: finished sending notice22:39
