Wednesday, 2022-10-12

ianwclarkb: does https://zuul.opendev.org/t/openstack/build/d341689bf3ce49068c98a0ca2a2e8056/console all line in the results tab with your hires screen?00:00
clarkbianw: right, just wondering if the plan is to force merge it. It did sound like fungi was ok with doing so and as we mentioned no artifacts should need promoting?00:00
ianwi think we might need to limit the width of the node name a bit more 00:01
clarkbianw: yes they all seem to be aligned00:01
ianwyeah i see https://imgur.com/a/SnCOWRC00:02
corvusclarkb: do you know why my messages are getting to irc (i assume they are) while yours aren't?00:05
clarkbcorvus: I think because you are actually joined to the irc channel but I'm not00:06
clarkbI don't know why you are joined and I'm not though00:06
clarkbwe both dropped out at roughly the same time00:07
dasmclarkb: corvus i can see your messages on irc (if that helps)00:07
clarkbthen you rejoined at 15:58 but I haven't00:08
clarkbdasm: ya this is my IRC connection. My matrix one isn't in the channel nor do I see the messages I send to it via my irc connection00:08
ianwanyone mind if i restart the zuul-web container to pick up https://review.opendev.org/c/zuul/zuul/+/858250/ ?  i'd like to play with it and the offsets from above00:09
corvusclarkb: apparently i reconnected just before 9am today00:10
clarkbianw: I think that should be safe. I guess that landed after our restarts completed?00:10
corvusianw: i think a rolling restart of both containers is fine; remember they take a few minutes to start.00:10
corvus(so be sure to check the component status page)00:10
ianwthanks00:11
clarkbcorvus: I've tried to provide more details to #irc:matrix.org to see if they have any other insight00:12
clarkbI tried using !join #openstack-nova to join an entirely new channel with no luck (element gets the invite to the room from oftc-irc and I can join as far as element is concerned but I don't show up in the user list of the channel on the irc side of things)00:15
clarkbI smell dinner. I need to pop out now. But I'll check to see if anything has changed in the morning00:16
corvusclarkb: maybe something with your nick?  maybe you could change it and change back?  just throwing out ideas00:20
corvusbon appetit00:20
NeilHanlonclarkb: it's a breakage on my side as I maintain the mirrors.. our mirrormanager software is supposed to cull mirrors which are serving bad content and/or are inaccessible, after failing some amount of syncs.. but for some reason it doesn't appear to be catching/fixing this situation00:42
opendevreviewMerged opendev/system-config master: bootstrap-bridge: drop pip3 role, add venv  https://review.opendev.org/c/opendev/system-config/+/85659301:15
*** rlandy is now known as rlandy|out01:25
ianwso the bridge bootstrap failed -- "refusing to convert from file to symlink for /usr/local/bin/ansible"02:30
ianwhowever, it did redirect /usr/local/bin/ansible-playbook to the venv installed ansible02:30
ianwi think probably the clearest thing to do here is for me to manually run "pip uninstall ansible" on bridge; that will remove the global ansible pip install and the next bootstrap run should then be able to link it02:31
ianwthe gate is ok, because it just calls "ansible-playbook" anyway02:32
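The failure ianw describes can be sketched in a few lines of Python. This is only an illustration of the "replace a symlink, but never clobber a regular file" rule that the quoted error message suggests; it is not the Ansible implementation, and `link_to_venv`, `demo`, and the temporary file names are hypothetical.

```python
import os
import tempfile


def link_to_venv(path, target):
    """Point `path` at `target`, but refuse to clobber a regular file."""
    if os.path.islink(path):
        # An existing symlink is safe to replace.
        os.unlink(path)
    elif os.path.exists(path):
        # A regular file (e.g. a script left by a global pip install) is left alone.
        raise RuntimeError(f"refusing to convert from file to symlink for {path}")
    os.symlink(target, path)


def demo():
    """Reproduce the sequence from the log in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        venv_bin = os.path.join(tmp, "venv-ansible")
        open(venv_bin, "w").close()
        global_bin = os.path.join(tmp, "ansible")
        open(global_bin, "w").close()  # stands in for the pip-installed script
        try:
            link_to_venv(global_bin, venv_bin)
            first = "linked"
        except RuntimeError:
            first = "refused"
        os.remove(global_bin)  # the effect of the manual "pip uninstall ansible"
        link_to_venv(global_bin, venv_bin)
        return first, os.path.islink(global_bin)
```

Calling `demo()` walks through both attempts: the first is refused while the regular file exists, and the second succeeds once that file has been removed.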
ianwinfra-prod-bootstrap-bridge should re-run out of the periodic jobs soon02:41
*** diablo_rojo is now known as Guest286803:22
opendevreviewIan Wienand proposed opendev/base-jobs master: setup-keys: add bridge node to "bastion" group  https://review.opendev.org/c/opendev/base-jobs/+/86102603:23
opendevreviewIan Wienand proposed opendev/system-config master: Run jobs with a jammy bridge.openstack.org  https://review.opendev.org/c/opendev/system-config/+/85779903:57
opendevreviewIan Wienand proposed opendev/system-config master: testinfra: Update selenium calls  https://review.opendev.org/c/opendev/system-config/+/85800303:57
opendevreviewIan Wienand proposed opendev/system-config master: Abstract name of bastion host for testing path  https://review.opendev.org/c/opendev/system-config/+/85847603:57
opendevreviewIan Wienand proposed opendev/system-config master: Convert production playbooks to bastion host group  https://review.opendev.org/c/opendev/system-config/+/85848603:57
opendevreviewIan Wienand proposed opendev/system-config master: Run a base test against "old" bridge  https://review.opendev.org/c/opendev/system-config/+/86080203:57
opendevreviewIan Wienand proposed opendev/system-config master: bootstrap-bridge: use abstracted hostname  https://review.opendev.org/c/opendev/system-config/+/86103103:57
ianwok, bootstrap bridge ran ok -> https://zuul.opendev.org/t/openstack/build/bf11d099adbb43039794ce84818d275904:01
*** dasm is now known as dasm|off04:04
ianwi think that stack should pass, and i'm now running out of places i think it might be broken too :)04:08
opendevreviewDr. Jens Harbott proposed opendev/base-jobs master: Drop ara related vars from the base jobs  https://review.opendev.org/c/opendev/base-jobs/+/86069304:41
opendevreviewDr. Jens Harbott proposed openstack/project-config master: Switch the requirements-constraints job to py310  https://review.opendev.org/c/openstack/project-config/+/86103505:25
*** ysandeep|out is now known as ysandeep05:32
opendevreviewTony Breeds proposed openstack/project-config master: Switch the requirements-constraints job to py310  https://review.opendev.org/c/openstack/project-config/+/86103505:38
*** luigi is now known as luigi-out05:42
*** pojadhav is now known as pojadhav|afk06:35
ramishrabshephar: commented in the patch, I think there is one more place where it could be changed, though output_dir thing is kind of messy atm06:40
ramishraoops wrong channel:/06:43
*** pojadhav|afk is now known as pojadhav07:44
opendevreviewgnuoy proposed openstack/project-config master: Add project for managing zuul jobs for charms  https://review.opendev.org/c/openstack/project-config/+/86104608:51
*** dasTor_ is now known as dasTor09:34
opendevreviewjayaditya gupta proposed openstack/diskimage-builder master: Fix issue in extract image  https://review.opendev.org/c/openstack/diskimage-builder/+/85088209:41
*** marios is now known as marios|call10:00
*** marios|call is now known as marios10:04
*** ysandeep is now known as ysandeep|lunch10:12
*** rlandy|out is now known as rlandy10:30
*** ysandeep|lunch is now known as ysandeep11:21
*** bhagyashris_ is now known as bhagyashris11:33
opendevreviewMerged openstack/project-config master: Switch the requirements-constraints job to py310  https://review.opendev.org/c/openstack/project-config/+/86103511:38
*** ysandeep is now known as ysandeep|afk12:29
fungiper-domain import logs for the latest mm3 migration test are available in 149.202.168.204:~fungi12:48
fungii don't see any obvious new errors, and the ones about the fields which were too large for their db columns are now gone12:48
opendevreviewJeremy Stanley proposed opendev/system-config master: Add a mailman3 list server  https://review.opendev.org/c/opendev/system-config/+/85124813:37
opendevreviewJeremy Stanley proposed opendev/system-config master: Fork the maxking/docker-mailman images  https://review.opendev.org/c/opendev/system-config/+/86015713:37
opendevreviewJeremy Stanley proposed opendev/system-config master: DNM force mm3 failure to hold the node  https://review.opendev.org/c/opendev/system-config/+/85529213:37
fungiclarkb: ^ i added redirects for the old list info page and the list index page urls13:37
fungitested by adding manually on 149.202.168.204 first13:37
*** ysandeep|afk is now known as ysandeep|out13:38
fungiseems to work, though maybe i should add some testinfra checking on that now that i think about it13:38
*** dasm|off is now known as dasm14:15
fungioh, but we're not actually testing the other redirects in the new deployment either14:24
fungijust checking for listening sockets and taking some screenshots14:24
Clark[m]Because we don't have the migration data to check with. I guess we could write a basic html file and use that though14:35
fungiright14:40
fungiwell, we could test the redirects to the new interface since we pre-create the mailing lists in it, it's just the rewrites exposing the old archives we can't test without adding some content14:41
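A redirect check of the kind discussed here might look roughly like the following. It is a minimal sketch using only the Python standard library rather than testinfra itself; `fetch_redirect`, `is_expected_redirect`, and any URLs passed to them are assumptions for illustration, not the paths used in the actual deployment.

```python
import http.client
from urllib.parse import urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_redirect(url, timeout=10):
    """Return (status, Location header) for a GET; http.client never follows redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_expected_redirect(status, location, expected_prefix):
    """True when the response is a redirect whose target starts with expected_prefix."""
    return (
        status in REDIRECT_CODES
        and bool(location)
        and location.startswith(expected_prefix)
    )
```

A test would call `fetch_redirect` against the old list info or list index URL on the test node and pass the result to `is_expected_redirect` with the new interface's URL prefix.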
Clark[m]++14:43
*** marios is now known as marios|out15:20
clarkbhttps://zuul.opendev.org/t/openstack/build/d671978274324495b3ea163d3b6ad2a5 any idea what caused that to happen? Seems like our test static.o.o which loads up the afs ro content returned a 403 instead of 200 for starlingx content15:39
clarkbthe prod content is available so not a systemic issue with their afs content and the testinfra tests for static lookup other data out of afs so not an afs specific issue15:40
clarkbI've rechecked to see if they are persistent issues15:40
fungii looked at it briefly15:40
fungipretty sure something happened and the test node had trouble reaching afs when apache wanted to read the .htaccess file15:41
fungiif you look at the error details from apache you get a little more insight15:41
fungii didn't check syslog for actual afs errors, but wouldn't be surprised if there are some15:42
fungiassuming the job collected it15:42
opendevreviewMerged openstack/project-config master: Add project for managing zuul jobs for charms  https://review.opendev.org/c/openstack/project-config/+/86104616:21
jrosser_i've not received emails from review@openstack.org since the 9th - is it possible to tell if there have been delivery attempts or bounces?16:56
clarkbjrosser_: "SMTP error from remote mail server after end of data: 553-Message filtered."16:58
jrosser_oh dear :(16:59
clarkbseems to be your mail servers are filtering it as spam16:59
clarkbwe should double check on our end if the host ended up on any lists16:59
clarkbdoes sbl not take a full ipv4 address in their query form?17:02
clarkbfungi: ^17:02
clarkbI found a different sbl query form and neither the ipv4 or ipv6 address is listed17:06
clarkbjrosser_: ^ its possible a different list has it listed, but sbl at least says we are good17:06
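As an alternative to the web query forms, DNS blocklists are conventionally queried over DNS by reversing the address and appending the list's zone. The sketch below builds that query name for IPv4 and IPv6 addresses; the zone `sbl.spamhaus.org` and the example addresses are assumptions, and `dnsbl_query_name` and `is_listed` are hypothetical helpers. Some list operators refuse queries that arrive via large public resolvers, so a lookup result should be treated with care.

```python
import ipaddress
import socket


def dnsbl_query_name(address, zone):
    """Build the DNS name to look up for `address` in the blocklist `zone`."""
    ip = ipaddress.ip_address(address)
    # reverse_pointer gives e.g. "1.2.0.192.in-addr.arpa" (or "...ip6.arpa");
    # drop the two trailing labels and append the blocklist zone instead.
    reversed_labels = ip.reverse_pointer.rsplit(".", 2)[0]
    return f"{reversed_labels}.{zone}"


def is_listed(address, zone):
    """Return True if the name resolves; by convention, no answer means not listed."""
    try:
        socket.gethostbyname(dnsbl_query_name(address, zone))
        return True
    except socket.gaierror:
        return False
```

For example, `dnsbl_query_name("192.0.2.1", "sbl.spamhaus.org")` gives `1.2.0.192.sbl.spamhaus.org`.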
jrosser_ok thanks - do you have a transcript with anything useful (like which server rejected it) as the mail routing i have to endure is terrible17:07
clarkbjrosser_: the IPs seem to vary but cluster1.eu.messagelabs.com appears to be the shared fqdn17:09
jrosser_ok thats helpful17:10
jrosser_i've added openstack.org as an allowed domain into my messagelabs portal17:13
fungimaybe add opendev.org too17:23
fungisince we will likely change that from address in the future if/when we set up an mta for the new domain17:24
clarkbOur rocky 9 image should try to rebuild again shortly. If it ends up on nb02 it will run without the mirrorlist change which would be a good check of that20:49
*** timburke_ is now known as timburke20:59
ianwclarkb: if you have time to loop back over https://review.opendev.org/q/topic:bridge-ansible-venv it should be ok.  i did end up moving back to just using "bastion" as the group for testing and production as i think it's a bit easier.  there's a new change to deal with base-jobs/bootstrap as you pointed out too21:10
clarkbianw: oh ya21:12
opendevreviewIan Wienand proposed opendev/system-config master: [wip] switch testing bridge name to bridge01.opendev.org  https://review.opendev.org/c/opendev/system-config/+/86111221:19
clarkbianw: and I guess bootstrap bridge is weird due to its self referential nature?21:21
clarkbhttps://review.opendev.org/c/opendev/system-config/+/858476/6..10/playbooks/bootstrap-bridge.yaml21:21
clarkboh I see the next change splits out the handling for that21:21
ianwhttps://review.opendev.org/c/opendev/system-config/+/861031/1/playbooks/bootstrap-bridge.yaml then updates that now ... hopefully the comment helps21:22
clarkbianw: left a comment on that one. Apologies if my previous comments may have confused things.21:36
clarkbGoogle CLA issues sorted. https://gerrit-review.googlesource.com/c/gerrit/+/348194 that should fix ssh rsa problems with gerrit 3.521:40
*** rlandy is now known as rlandy|bbl22:04
JayFI feel like there might be something weird with the opendevreview bot -- https://review.opendev.org/c/openstack/ironic/+/860142 was posted as "verification failed" 3 minutes ago, but the V-2 was put on the patch at more like 20 minutes ago22:05
JayFI don't know if that's "normal" or what, but the latency surprised me so I thought I'd mention it in case it's evidence of some kind of service issue22:05
clarkbJayF: what do you mean by "was posted as verification failed"22:07
JayF> 22:01:40  opendevreview | Verification of a change to openstack/ironic stable/xena failed: Stable only: Factor out addition of packaging lib  https://review.opendev.org/c/openstack/ironic/+/86014222:07
JayFbut on the patch itself, the V-2 was voted on by zuul at 21:3822:08
JayFer, it's actually worse than that; at 20:38 22:08
JayFI do not care or am bothered by this latency; but noticed because I actioned the email notification, then saw an IRC notification and was like "oh no another one", but it was the same one22:09
*** dasm is now known as dasm|off22:09
JayFI just wanted to mention it because it's the kind of strange that you might wanna know about :)22:09
clarkbthe bot got an event from gerrit at 20:38:59 which rules out gerrit emitting the event slowly22:12
clarkband it says it sent the message at that time22:12
JayFI can guarantee it didn't hit my client at that time; and 22:01:40 is much too late for it to be like, client lag22:13
clarkbit then got a second event (the one generated by your comment) that it decided it needed to post for as well22:14
clarkband that one seems to be what generated the message you saw22:14
JayFYep, and looking back22:14
JayFI see a 20:38:59 too22:14
JayFI did issue a recheck at 21:4522:15
JayFbetween those events22:15
clarkboh actually no I think it was the comment from the arm pipeline22:15
clarkbso ya I think this is fine other than it triggering off of any zuul comment and not necessarily the one that changes the state22:15
JayFokay; makes sense. extra notifications are not so bad just very, very confusingly timed there22:15
clarkbya thats what it is. It posted at 20:38 when the -2 happened. Then at 22:01 it posts again in response to the arm64 pipeline comment22:15
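The difference between announcing on every Zuul comment and announcing only when the vote changes can be illustrated with a small sketch. This is not the bot's code: the event dictionaries, their `time` and `verified` keys, and both functions are hypothetical, modelled only on the two timestamps mentioned in the discussion.

```python
# Hypothetical event stream: the Verified -2 report, then a later comment
# from another pipeline that carries no Verified vote of its own.
EVENTS = [
    {"time": "20:38:59", "verified": -2},
    {"time": "22:01:40", "verified": None},
]


def naive_announcements(events):
    """Announce on every comment while the change still carries a negative vote."""
    current = None
    announced = []
    for event in events:
        if event["verified"] is not None:
            current = event["verified"]
        if current is not None and current < 0:
            announced.append(event["time"])
    return announced


def state_change_announcements(events):
    """Announce only when a comment actually changes the vote to a negative value."""
    current = None
    announced = []
    for event in events:
        vote = event["verified"]
        if vote is not None and vote != current:
            current = vote
            if vote < 0:
                announced.append(event["time"])
    return announced
```

With the sample events, the naive version announces at both 20:38:59 and 22:01:40, while the state-change version announces only at 20:38:59.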
JayFdoesn't help that I missed that it notified at the right time as well22:15
JayFI may have about 40000 patches up with "Stable only: " or "CI: " prefix across multiple stable branches; it's all mixing together 22:16
ianwclarkb: hrm, i guess you're right in that https://opendev.org/opendev/system-config/src/branch/master/zuul.d/infra-prod.yaml#L50 is using the zuul-run playbooks22:17
fungiJayF: clarkb: what triggered exactly? looking through the comments on that change i don't see anything amiss22:17
JayFfungi: tl;dr: a message popped at 20:38:59 that a change failed verification. This was accurate. Another identical message that it failed verification posted at 22:01:40, which appears to have been sprung by the ARM64 pipeline notification 22:18
fungizuul left the verified -2 result at 20:38, then the next comment i see from it is at 22:01 when it says the arm jobs passed22:18
fungiJayF: what does "popped up" mean in this contect?22:18
fungicontext22:18
JayFIRC robot messages in #openstack-ironic22:18
funginot the comment i guess?22:18
JayFfrom opendevreview 22:18
fungioh! i totally missed you were talking about irc there22:19
fungigot it. i think i've never noticed that behavior because none of the projects i work actively on have it set to do notifications on failure results22:19
fungijust new uploads and merges22:20
opendevreviewIan Wienand proposed opendev/system-config master: [wip] switch testing bridge name to bridge01.opendev.org  https://review.opendev.org/c/opendev/system-config/+/86111222:27
opendevreviewClark Boylan proposed opendev/system-config master: DNM testing an upstream gerrit change  https://review.opendev.org/c/opendev/system-config/+/86111722:34
clarkboh I needed to force a failure too to hold the test nodes22:34
opendevreviewClark Boylan proposed opendev/system-config master: DNM testing an upstream gerrit change  https://review.opendev.org/c/opendev/system-config/+/86111722:35
clarkbinfra-root ^ I'm going to hold the gerrit 3.5 job for that and then use it to test that ssh looks happy with rsa keys22:35
clarkbif it does I'll submit the upstream fix and we can deploy that and everyone can use rsa again22:35
fungiawesome, thanks!22:36
clarkbin theory if I can ssh from my local machines to that held node using an rsa key it is working as my openssh is new enough22:37
fungiyeah, i have overrides in my .ssh/config for review.opendev.org22:38
clarkboh heh they already submitted it upstream. Well we'll test it anyway :)22:38
fungii could test with a non-overridden config22:39
fungiin fact, if i ssh by ip address, then my overrides won't be applied anyway22:39
clarkbya I tested this pretty extensively when I fixed 3.6. So I'm 99% sure it will work22:40
clarkbbut I figure being 100% sure is worthwhile22:40
fungiabsolutely22:41
opendevreviewIan Wienand proposed opendev/system-config master: [wip] switch testing bridge name to bridge01.opendev.org  https://review.opendev.org/c/opendev/system-config/+/86111222:57
clarkbI think the test gerrit instance is working23:33
clarkbfungi: 158.69.75.25 is the host if you want to test. I logged in via the web ui as the zuul user (you click on the button on the login page) and then added an rsa key23:34
clarkbwhen I run ssh -v -i throwaway_rsa_key I get debug1: kex_input_ext_info: server-sig-algs=<...,rsa-sha2-512,rsa-sha2-256,ssh-rsa>23:34
clarkbI'm going to update my change to be a rebuild gerrit change so that we can get new images and hopefully deploy that soon23:35
opendevreviewClark Boylan proposed opendev/system-config master: Update our Gerrit images  https://review.opendev.org/c/opendev/system-config/+/86111723:38
clarkbalso I'm fairly certain I would've needed an override with my openssh client so the fact that it works at all is a good indication it is fixed23:40
fungi****    Welcome to Gerrit Code Review    ****23:42
fungiHi Zuul, you have successfully connected over SSH.23:42
fungiConnection to 158.69.75.25 closed.23:42
clarkbfungi: and that was port 29418 right?23:42
clarkbif so I'll go ahead and delete the autohold and I think we can proceed with landing 861117 when we want to plan a gerrit restart23:43
fungiyeah23:44
fungidebug1: kex_input_ext_info: server-sig-algs=<...,rsa-sha2-512,rsa-sha2-256,ssh-rsa>23:44
fungidebug3: sign_and_send_pubkey: signing using rsa-sha2-512 SHA256:...23:44
fungialso it would have to have been port 29418 for me to get the gerrit banner23:45
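The manual check of the `ssh -v` output can be automated with a short parser. This is a sketch under the assumption that the debug line has the `server-sig-algs=<...>` form quoted above; `offered_sig_algs` and `supports_rsa_sha2` are hypothetical helper names, and the elided `...` entry from the log is passed through untouched.

```python
import re


def offered_sig_algs(ssh_debug_output):
    """Extract the comma-separated server-sig-algs list from ssh -v output."""
    match = re.search(r"server-sig-algs=<([^>]*)>", ssh_debug_output)
    if not match:
        return []
    return [alg for alg in match.group(1).split(",") if alg]


def supports_rsa_sha2(ssh_debug_output):
    """True if the server advertises at least one SHA-2 based RSA signature algorithm."""
    algs = offered_sig_algs(ssh_debug_output)
    return "rsa-sha2-256" in algs or "rsa-sha2-512" in algs
```

Feeding it the captured stderr from `ssh -v` against the held node should report support when the fix is in place, which matches the `signing using rsa-sha2-512` line in the log.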
clarkbautohold deleted. Thank you for helping to test23:45
fungiany time! thanks for fixing it23:45
clarkbnow I'm trying to remember all the people who might've done an override. I guess we can send email to the mailing lists23:46
fungiyeah, just cast a wide net once we're upgraded23:50
*** rlandy|bbl is now known as rlandy23:50
