Tuesday, 2020-04-28

*** mlavalle has quit IRC00:09
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Add ensure-virtualenv  https://review.opendev.org/72330900:10
openstackgerritIan Wienand proposed openstack/diskimage-builder master: functests: use ensure-virtualenv  https://review.opendev.org/72331600:30
openstackgerritMerged opendev/system-config master: Improve zuul-web apache config  https://review.opendev.org/72371100:38
openstackgerritIan Wienand proposed openstack/project-config master: Activate more plain nodes  https://review.opendev.org/72376901:20
*** DSpider has joined #opendev02:28
openstackgerritIan Wienand proposed openstack/project-config master: Activate more plain nodes  https://review.opendev.org/72376902:58
openstackgerritIan Wienand proposed openstack/project-config master: Remove tumbleweed-plain images  https://review.opendev.org/72378002:58
openstackgerritMerged openstack/project-config master: Remove tumbleweed-plain images  https://review.opendev.org/72378003:28
openstackgerritMerged zuul/zuul-jobs master: k8-logs: use failed_when: instead of ignore_errors:  https://review.opendev.org/72364703:56
openstackgerritMerged zuul/zuul-jobs master: container-logs: use failed_when: instead of ignore_errors:  https://review.opendev.org/72364803:57
openstackgerritJan Zerebecki proposed openstack/diskimage-builder master: Retry git clone/fetch on timeout  https://review.opendev.org/72158104:01
*** ykarel|away is now known as ykarel04:28
*** sshnaidm|afk is now known as sshnaidm|off04:34
ianwdirk / cmurphy / AJaeger : so with dib 2.36.0 released we now have fresh builds of opensuse-15 and opensuse-tumbleweed images (tumbleweed just finished)04:47
ianwi would like to move quickly on getting rid of pip-and-virtualenv from suse, as i think it doesn't have too much exposure to areas where that will be difficult04:47
ianwhttps://review.opendev.org/723769 will add opensuse-15-plain nodes, then i can double check devstack and keystone (https://review.opendev.org/723762) but i expect it to "just work"04:48
ianwif there's more areas where suse might be used outside of devstack LMN04:48
*** ysandeep|away is now known as ysandeep05:21
AJaegerthanks, ianw05:42
*** dpawlik has joined #opendev06:06
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Add ensure-virtualenv  https://review.opendev.org/72330906:09
openstackgerritOpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/72380906:22
openstackgerritMerged openstack/project-config master: Activate more plain nodes  https://review.opendev.org/72376906:32
openstackgerritMerged openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/72380906:55
*** rpittau|afk is now known as rpittau07:34
*** ykarel is now known as ykarel|afk07:35
*** tosky has joined #opendev07:36
*** ralonsoh has joined #opendev07:52
*** ykarel|afk is now known as ykarel07:58
*** diablo_rojo_phon has joined #opendev07:59
*** ysandeep is now known as ysandeep|lunch08:22
*** ykarel is now known as ykarel|lunch08:39
*** ysandeep|lunch is now known as ysandeep08:53
*** andreykurilin has joined #opendev08:57
AJaegerinfra-root, Zuul is having problems, there are internal server errors08:57
openstackgerritRico Lin proposed opendev/irc-meetings master: Switch Multi-Arch SIG meeting schedule  https://review.opendev.org/72384808:58
AJaegerhttps://zuul.opendev.org/api/tenant/openstack/status gives 50008:58
AJaeger#status alert Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved.09:00
openstackstatusAJaeger: sending alert09:00
AJaegerinfra-root, https://review.opendev.org/#/c/723148/ shows "EXCEPTION" on changes, Zuul does not look happy at all09:00
-openstackstatus- NOTICE: Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved.09:00
*** ChanServ changes topic to "Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved."09:00
fricklerGearmanError("Unable to submit job to any connected servers")09:04
fricklerI vaguely remember something in backlog about gearman stopping, but don't want to fiddle too much with it right now, let's wait for the usual suspects to show up09:05
openstackstatusAJaeger: finished sending alert09:06
AJaeger#status alert Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved.09:10
openstackstatusAJaeger: sending alert09:10
AJaegerfrickler: ok, let me make my statement clearer and spam a bit more if this takes longer. Thanks for checking.09:11
-openstackstatus- NOTICE: Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved.09:11
*** ChanServ changes topic to "Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved."09:11
fricklerApr 28 08:52:59 zuul01 kernel: [71922766.912535] zuul-scheduler invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=009:14
openstackstatusAJaeger: finished sending alert09:17
AJaegerhttp://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=64794&rra_id=0&view_type=tree&graph_start=1587979104&graph_end=158806550409:18
AJaegerhttp://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=64792&rra_id=0&view_type=tree&graph_start=1587979104&graph_end=158806550409:19
AJaegerwow09:19
AJaegerno wonder09:19
iurygregorynice graphs =)09:34
*** ykarel|lunch is now known as ykarel09:48
*** avass has joined #opendev10:03
*** owalsh has joined #opendev10:09
*** rpittau is now known as rpittau|bbl10:10
*** ysandeep is now known as ysandeep|afk11:07
openstackgerritJan Zerebecki proposed openstack/diskimage-builder master: Retry zypper when refresh failed  https://review.opendev.org/72158711:12
openstackgerritJan Zerebecki proposed openstack/diskimage-builder master: Retry git clone/fetch on timeout  https://review.opendev.org/72158111:14
*** tosky has quit IRC11:23
*** iurygregory has quit IRC11:26
*** iurygregory has joined #opendev11:26
*** tosky has joined #opendev11:54
mordredAJaeger: jeez. what a nice thing11:58
*** panda|ruck has joined #opendev12:02
mordredinfra-root: zuul scheduler is really unhappy - I don't know that I can debug it any further than it is now, but I could certainly restart it12:07
mordredI'm not sure whether it's better to do that - which would potentially lose some debugging context - or wait for corvus to be awake, which might be a little longer12:08
*** rpittau|bbl is now known as rpittau12:08
* mordred is leaning towards restarting - since we're pretty dead in the water right now ...12:10
mordredinfra-root, config-core: ^^ any input?12:11
fricklermordred: from the cacti graphs, we seem to have a memory leak, so after a restart that will likely reappear in a day or two12:11
fricklermordred: which makes me support the restart12:11
mordredyeah. likely to recur again - hopefully at a time it can be investigated12:11
* mordred restarting zuul12:11
mordredfwiw: http://paste.openstack.org/show/792821/12:14
mordredwarnings in the log12:14
AJaegermordred: I'll take care of those warnings12:15
mordredok. zuul is back up12:16
AJaegerthanks, mordred12:18
AJaegermordred: shall we give the all-green again?12:19
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352812:19
AJaegerlike status ok Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC.12:20
AJaeger#status ok Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC.12:23
openstackstatusAJaeger: sending ok12:23
*** ChanServ changes topic to "OpenDev is a space for collaborative Open Source software development | https://opendev.org/ | channel logs http://eavesdrop.openstack.org/irclogs/%23opendev/"12:23
-openstackstatus- NOTICE: Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC.12:23
openstackstatusAJaeger: finished sending ok12:30
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx  https://review.opendev.org/72233912:30
*** dpawlik has quit IRC12:36
*** hashar has joined #opendev12:37
*** dpawlik has joined #opendev12:40
openstackgerritMonty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs  https://review.opendev.org/72388912:55
openstackgerritMonty Taylor proposed opendev/system-config master: Base run from master flag off of zuul pipeline  https://review.opendev.org/72389012:55
*** ykarel is now known as ykarel|afk13:01
*** roman_g has joined #opendev13:02
openstackgerritMonty Taylor proposed opendev/system-config master: Base run from master flag off of zuul pipeline  https://review.opendev.org/72389013:02
fungiaround now, looks like there was some excitement?13:07
fungii guess the queue backups weren't viable to reenqueue from?13:08
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352813:10
*** avass is now known as Guest8097013:10
AJaegerfungi: they were 4 hours old...13:12
AJaeger(if they existed)13:12
fungiyeah, looks like we have at best a couple hours of status snapshots retained before they're cleaned up13:16
fungithough we had a bunch of old snapshots from before friday's maintenance when the names of the files changed13:21
fungii just cleaned those out with a `find /var/lib/zuul/backup/ -mtime +2 -name \*.json|xargs rm`13:21
fungiyeah, we keep between 2-3 hours of snapshots, because we record a snapshot every minute, and then once an hour we delete all but the last 120 of them13:23
AJaegerah, interesting13:23
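The retention scheme fungi describes (one snapshot per minute, then an hourly cleanup that deletes all but the last 120) can be sketched roughly as below. This is only an illustration, not the actual cron job: the function name `prune_snapshots`, the `*.json` pattern, and the choice to order files by modification time are all assumptions.

```python
from pathlib import Path


def prune_snapshots(backup_dir, keep=120, pattern="*.json"):
    """Delete all but the `keep` most recently modified snapshot files.

    Hypothetical helper: the real cleanup is a cron job whose exact
    command is not shown in the log.
    """
    # Newest first, so everything past index `keep` is old enough to drop.
    files = sorted(
        Path(backup_dir).glob(pattern),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    removed = []
    for old in files[keep:]:
        old.unlink()
        removed.append(old.name)
    return removed
```

Running this once an hour with `keep=120` leaves between 120 and roughly 180 one-minute snapshots on disk, which matches the "between 2-3 hours" figure above.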
mordredfungi: maybe we should update the curl to not save if it gets an error13:23
fungithe manpage mentions a --fail-early option13:27
fungioh, looks like --fail is what we actually want13:27
mordredfungi: yeah - I agree13:28
fungithe description for that option describes our case13:28
fungidon't record the error document, just exit nonzero13:29
mordredwell ...13:29
mordredwe don't actually get an error document13:30
mordredor - maybe we do and my current test is bong13:32
fungioh, we do get an error document13:32
openstackgerritMonty Taylor proposed opendev/system-config master: Add --fail flag to zuul status backup curl  https://review.opendev.org/72389613:33
fungilook at /var/lib/zuul/backup/openstack_status_1588071601.json for example13:33
openstackgerritMerged opendev/irc-meetings master: Switch Multi-Arch SIG meeting schedule  https://review.opendev.org/72384813:33
fungiit's full of html which includes a bit that says <h2>500 Internal Server Error</h2>13:33
mordredfungi: yah - but I betcha we also threw a 50013:33
mordredright13:34
mordred?13:34
fungipresumably, maybe syslog has the stderr from it13:34
funginop13:34
fungie13:34
fungi2>/dev/null13:34
fungiso we're silencing stderr which is probably where you'd see it echoed13:34
mordredoh well13:35
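The behaviour wanted from `curl --fail` here (do not record the error document, just report failure) can be sketched in Python as follows. This is an illustration under assumptions, not the change proposed in 723896: `save_status_snapshot` and its `opener` parameter are hypothetical names, and the sketch only handles HTTP error statuses, not connection failures.

```python
import urllib.error
import urllib.request
from pathlib import Path


def save_status_snapshot(url, dest, opener=urllib.request.urlopen, timeout=30):
    """Fetch `url` and write the body to `dest` only on an HTTP success.

    Returns True if a snapshot was written, False if the server answered
    with an error status. `opener` is injectable so the function can be
    exercised without a network (an assumption made for illustration).
    """
    try:
        with opener(url, timeout=timeout) as response:
            body = response.read()
    except urllib.error.HTTPError:
        # urlopen raises HTTPError for 4xx/5xx responses. Skipping the
        # write keeps "500 Internal Server Error" pages out of the backups.
        return False
    Path(dest).write_bytes(body)
    return True
```

With a check like this, a backup directory only ever holds real status JSON, so whatever is newest is safe to re-enqueue from.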
fungialso, any idea where the zuul-scheduler-status-kata-containers and zuul-scheduler-status-prune-kata-containers cronjobs are coming from?13:38
fungiare those cruft?13:38
fungiit claims they were put there by ansible13:38
fungioh, nevermind, we have a {tenant} jinja replacement13:38
fungiso we're instantiating that backup for the openstack and kata-containers tenants only, at the moment13:39
mordredyeah13:45
mordredmaybe we should install it for all tenants?13:45
mordredfungi: next time you have a sec, mind looking at https://review.opendev.org/#/c/723022/ ?13:46
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: haskell-stack-test: add haskell tool stack test  https://review.opendev.org/72326313:54
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352813:59
*** ykarel|afk is now known as ykarel14:00
openstackgerritMonty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs  https://review.opendev.org/72388914:07
openstackgerritMonty Taylor proposed opendev/system-config master: Base run from master flag off of zuul pipeline  https://review.opendev.org/72389014:07
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: haskell-stack-test: add haskell tool stack test  https://review.opendev.org/72326314:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: Use 'block: ... always: ...' instead of ignore_errors  https://review.opendev.org/72364014:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-sphinx: use failed_when: false instead of ignore_errors: true  https://review.opendev.org/72364214:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: go: Use 'block: ... always: ...' and failed_when instead of ignore_errors  https://review.opendev.org/72364314:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ara-report: use failed_when: false instead of ignore_errors: true  https://review.opendev.org/72364414:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: use failed_when: instead of ignore_errors:  https://review.opendev.org/72365314:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: add-build-sshkey: use failed_when: instead of ignore_errors:  https://review.opendev.org/72365414:08
AJaegerroman_g: could you abandon the open reviews for airship-in-a-bottle, please?14:11
openstackgerritGuilherme  Steinmuller Pimentel proposed openstack/project-config master: Add vexxhost/google-directory-api-linux-agent  https://review.opendev.org/72390414:12
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: Use 'block: ... always: ...' instead of ignore_errors  https://review.opendev.org/72364014:14
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-sphinx: use failed_when: false instead of ignore_errors: true  https://review.opendev.org/72364214:14
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: go: Use 'block: ... always: ...' and failed_when instead of ignore_errors  https://review.opendev.org/72364314:14
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ara-report: use failed_when: false instead of ignore_errors: true  https://review.opendev.org/72364414:14
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: use failed_when: instead of ignore_errors:  https://review.opendev.org/72365314:14
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: add-build-sshkey: use failed_when: instead of ignore_errors:  https://review.opendev.org/72365414:14
roman_gAJaeger: will do today.14:15
AJaegergreat14:17
*** ysandeep|afk is now known as ysandeep14:19
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx  https://review.opendev.org/72233914:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: Use 'block: ... always: ...' instead of ignore_errors  https://review.opendev.org/72364014:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-sphinx: use failed_when: false instead of ignore_errors: true  https://review.opendev.org/72364214:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: go: Use 'block: ... always: ...' and failed_when instead of ignore_errors  https://review.opendev.org/72364314:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ara-report: use failed_when: false instead of ignore_errors: true  https://review.opendev.org/72364414:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: use failed_when: instead of ignore_errors:  https://review.opendev.org/72365314:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: add-build-sshkey: use failed_when: instead of ignore_errors:  https://review.opendev.org/72365414:20
openstackgerritMerged opendev/system-config master: Use the sync-project-config role in service-zuul  https://review.opendev.org/72302214:28
*** rpittau is now known as rpittau|brb14:32
corvuswin 1214:34
mordredinfra-root: https://review.opendev.org/#/c/723889/ is an impl of the "run from tip of master in periodic" we discussed yesterday. the followup https://review.opendev.org/#/c/723890 is a modification that I think might be more maintainable14:36
*** jrichard has joined #opendev14:39
corvusmordred: what's the thinking behind the naming of the var as "zuul_base_..."?14:41
jrichardHow do I get added as the first core reviewer for the https://opendev.org/starlingx/portieris-armada-app repo?14:42
fungijrichard: one of us adds you, and then you can add whoever else you need14:44
corvusjrichard, fungi: done14:44
fungithanks corvus! you beat me to it14:44
jrichardthanks14:46
*** jrichard has quit IRC14:46
*** mlavalle has joined #opendev14:47
mordredcorvus: well - now that you say it - bad thinking - my first patch was using the var in the run-base playbook - which was a mistake14:49
mordredcorvus: how about I name that something completely different?14:49
corvusmordred: sounds great.  maybe just squash all that too; i'm strongly in favor of the pipeline check.14:52
mordredcorvus: ok. cool14:52
openstackgerritMonty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs  https://review.opendev.org/72388914:53
mordredcorvus: done ^^14:53
mordredcorvus: since we're doing scheduler-in-docker-compose now - would it maybe make sense to run gearman in a separate container instead of  as a scheduler subprocess? that way if it gets oom-killered, compose could restart it? (obviously we don't want to be ooming)14:56
*** rpittau|brb is now known as rpittau14:57
clarkbfungi: at 09:30 ish today we see bup running in the dstat data. I think that rules bup out as a cause14:57
clarkbfungi: then 10:16 ish we see a bunch of different listinfo processes14:58
clarkbfungi: those are started by apache requests which makes me think that indexing bot is the next thing to rule out14:58
clarkbfungi: any objection to adding a robots.txt to the apache docroot that tells SEMrush bot to go away?14:59
*** jrichard has joined #opendev14:59
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352815:00
fungiclarkb: yeah, looks like we had two oom kills today, 10:17:43 and 10:18:1315:01
corvusmordred: no objections15:01
*** ykarel is now known as ykarel|away15:01
fungiclarkb: sounds reasonable... https://www.semrush.com/bot/ https://www.knownhost.com/forums/threads/fighting-semrushbot.4461/ https://forums.digitalpoint.com/threads/how-to-block-semrushbot.2800415/15:03
fungiapparently we're not the only ones it's causing issues fr15:03
fungifor15:03
clarkbfungi: I do wonder if this is at all related to things jimmy is doing15:03
clarkbsince it's relatively recent15:03
roman_gAJaeger: done15:12
*** hashar has quit IRC15:15
AJaegerroman_g: thanks. config-core, https://review.opendev.org/#/c/720160/ is ready to retire airship-in-a-bottle, please review15:22
clarkbinfra-root I've put a robots.txt in /var/www/ on lists.openstack.org. This seems to be served by apache properly. If this changes the OOMing situation we can encode that into puppet15:36
clarkbhttp://lists.opendev.org/robots.txt if you want to see it15:37
mordredclarkb: cool15:37
corvusclarkb: ++15:43
fungithanks clarkb!!!15:50
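The contents of the robots.txt that clarkb installed are not shown in the log. As a hedged sketch, a file that tells only the SEMrush crawler to stay away could look like the `ROBOTS_TXT` string below, and Python's standard `urllib.robotparser` can confirm what such a file permits. The `SemrushBot` user-agent token and the `is_allowed` helper are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents; the real file on the list server may differ.
ROBOTS_TXT = """\
User-agent: SemrushBot
Disallow: /
"""


def is_allowed(robots_txt, user_agent, path="/"):
    """Return True if `user_agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

Note that robots.txt is advisory: it only helps if the crawler chooses to honour it, which is why the OOM situation still has to be watched afterwards.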
*** lpetrut has joined #opendev15:59
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx  https://review.opendev.org/72233916:03
mordredclarkb: have a sec for https://review.opendev.org/#/c/723889/ ?16:05
mordredclarkb: also https://review.opendev.org/#/c/723105/16:05
clarkbmordred: ya I need to reboot and then find food but reviewing things is on my todo list16:06
mordredclarkb: cool. I think both of those would be nice to get in16:11
clarkbmordred: corvus also I think we may need to restart apache on zuul scheduler to pick up the caching changes I made16:13
clarkbwe do seem to get gzipped status json now which is nice16:13
clarkbbut the js files may not be compressed yet?16:13
yoctozeptomorning16:16
AJaegerconfig-core, https://review.opendev.org/#/c/720160/ is ready to retire airship-in-a-bottle, please review16:16
yoctozeptoany idea why zuul chooses python2 on aarch64 nodes in CI?16:16
yoctozeptohttps://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_58b/723361/8/check-arm64/kolla-ansible-debian-source-aarch64/58bbcb8/zuul-info/host-info.primary.yaml16:17
clarkbyoctozepto: I don't think we explicitly set it? Ansible does autodetection16:17
clarkbyou can override that though16:18
*** rpittau is now known as rpittau|afk16:18
clarkbmordred: does setting a git repo like https://review.opendev.org/#/c/723889/3/playbooks/zuul/run-production-playbook.yaml imply update to latest?16:18
yoctozeptoyeah, it's odd because it does fine on x86_64 though16:18
clarkbmordred: looks like update yes is the default?16:18
yoctozeptohmm, looks like ansible has no debian defaults16:20
yoctozeptoand goes fallback16:20
yoctozeptoand first is /usr/bin/python16:20
yoctozeptoonly then python316:21
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx  https://review.opendev.org/72233916:21
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node  https://review.opendev.org/72407916:21
yoctozeptoso it must be that x86_64 debian images have no python2 or it is linked via /usr/bin/python hmm16:21
openstackgerritMerged zuul/zuul-jobs master: Do not set buildset_fact if it's not present in results.json  https://review.opendev.org/72352416:22
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node  https://review.opendev.org/72407916:23
yoctozeptohmm, seemingly it's the same one16:23
yoctozeptoneed to dig more as to why it works then16:23
clarkbyoctozepto: python == python216:23
yoctozeptothanks clarkb for clearing up the dep16:23
clarkbthat should be pretty universal on all our systems16:23
mordredclarkb: yes - I believe it does16:24
yoctozeptoclarkb: yeah, that's why it should fail on master :-)16:24
clarkbyoctozepto: I don't understand what you mean by that16:24
clarkbzuul's python should be completely independent of whatever you are testing's python16:24
yoctozeptoit fails on aarch64 because of python216:25
yoctozeptoyet not on x86_6416:25
yoctozeptoyeah, I mean we might have some dependency on it, need to investigate eh16:25
yoctozeptowe = kolla-ansible16:25
*** DSpider has quit IRC16:26
clarkbmordred: comment on https://review.opendev.org/#/c/723889/316:27
yoctozepto"The setuptools package must be installed for both the Ansible Python interpreter and for the version of Python specified by this option."16:30
yoctozeptoquirk of the year, thanks ansible!16:30
yoctozeptoso it's just that x86_64 image has python2 setuptools and it hides this quirk...16:31
AJaegerconfig-core, please review https://review.opendev.org/#/c/723309/ and https://review.opendev.org/#/c/720160/16:32
*** diablo_rojo has joined #opendev16:40
*** DSpider has joined #opendev16:42
clarkbmordred: where does https://review.opendev.org/#/c/723105/2 set python2 for refstack?16:47
*** lpetrut has quit IRC16:51
fungiper distutils-sig ml, pip 20.1 is going to be released any moment16:52
fungikeep an eye out for new disruption related to that16:52
mordredclarkb: in a file on my laptop16:53
openstackgerritMonty Taylor proposed opendev/system-config master: Use python3 for ansible  https://review.opendev.org/72310516:54
mordredclarkb: how's that?16:54
clarkbmordred: +2 thanks. Also see note on the other change you linked16:55
mordredclarkb: doh17:01
openstackgerritMonty Taylor proposed opendev/system-config master: Update to tip of master in periodic jobs  https://review.opendev.org/72388917:01
mordredclarkb: thanks17:01
mordredcorvus: ^^ clarkb found an oops, can you re-review?17:02
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx  https://review.opendev.org/72233917:06
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node  https://review.opendev.org/72407917:06
*** roman_g has quit IRC17:12
*** roman_g has joined #opendev17:13
*** roman_g has quit IRC17:20
clarkbinfra-root is now a good time to restart apache on zuul.open*.org? I think that is necessary to pick up the caching changes17:44
clarkb(it's really hard to tell if that is working properly due to how apache hashes cached things)17:44
mordredclarkb: wfm17:45
*** dpawlik has quit IRC17:48
clarkbactually I kinda want to remove the /etc/apache2/sites-enabled/{4,5}0-zuul* files too17:49
clarkbmordred: ^ those are files leftover from puppetry. Are you ok with my moving them aside and restarting to ensure the whole ansible deployed config is happy?17:49
clarkbthat should help reduce confusion in future debugging too17:49
mordredclarkb: yeah please17:52
clarkbcool I'll do that shortly17:52
clarkbok I've restarted apache and it is all happy18:10
clarkbunfortunately I've in the process realized the vhost update didn't get applied because this job failed https://zuul.opendev.org/t/openstack/build/d1ed5613516e4735a7a933d87782151718:10
clarkbmordred: ^ fyi18:10
clarkbthere is a bug in our zuul ansible /me tries to figure it out18:12
mnaserhm18:14
mnaseris the intermediate registry broken by any chance on opendev right now?18:14
mnaserit seems to be returning 404 making jobs fail in post18:14
mnaserhttps://zuul.opendev.org/t/vexxhost/build/2c5e2247629946b18c0d7372def7400618:15
mordredclarkb: it looks like it didn't run at all18:15
openstackgerritClark Boylan proposed opendev/system-config master: Don't restart the zuul scheduler in prod  https://review.opendev.org/72411518:16
clarkbmordred: ^ thats at least part of it18:16
mnaseractually, looking again, that's 127.0.0.1 that it's not copying from, aka the buildset registry18:17
mnaserso might be a zuul thing18:17
mordredclarkb: poop. good catch18:17
clarkbmordred: we run zuul-web after zuul-scheduler so we were bailing out before we got to web18:17
mordredclarkb: nod18:17
clarkbmnaser: I think 404s from the buildset registry are "normal" ?18:18
clarkbmnaser: the docker config is supposed to also try docker hub as well iirc18:18
mnaserclarkb: this 404 is when it's trying to do the skopeo copy to the intermediate registry18:18
mnaserspecifically inside the `push-to-intermediate-registry` role18:19
clarkbah ya that wouldn't be a case of normal 404s then18:19
mnaserhttps://opendev.org/zuul/zuul-jobs/commit/6fb73060ec919d4e2364e418db84ce6aaa50492d seems to line up to the timeline of failures18:19
* mnaser moves to #zuul18:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: stat for result.json on the executor  https://review.opendev.org/72411618:26
mnaser(for completion: that was a zuul-jobs issue we just fixed with ^ opendev is okay -- other than likely being affected by the same thing inside zuul/zuul-jobs)18:31
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352818:34
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Revert "Do not set buildset_fact if it's not present in results.json"  https://review.opendev.org/72412018:35
*** ysandeep is now known as ysandeep|away18:37
fungiokay, pip 20.1 release is now up at https://pypi.org/project/pip/ so keep your eyes peeled18:37
fungialso looks like the plan is that pip 21 (likely a few months out still) will be dropping python 2.7 support18:39
fungimost of the meat of 20.1 is described in the 20.1b1 release notes18:40
fungipip freeze output for things installed from not-pypi (particularly distro packages) is changing18:41
fungibuilds now happen in-place instead of in a temporary location followed by a copy18:41
fungi--user and --target options can't be combined any longer18:42
fungithere's a preview implementation of the new resolver logic included, but it's disabled by default still18:43
fungiinvocation as python -m pip now removes the cwd from the import path18:44
fungiresolvelib and toml (replacing pytoml) were added to the vendored modules, and versions were updated for vendored copies of certifi, contextlib2, distro, idna, msgpack, packaging, pep517, pyparsing, requests, and urllib318:46
fungithose are the things most likely to impact jobs we run, i think18:47
fungihttps://pip.pypa.io/en/stable/news/ for the complete release notes18:47
*** ralonsoh has quit IRC18:51
mordredfungi: fingers crossed it doesn't break the world19:01
mordredianw, corvus, fungi: got a quick sec for https://review.opendev.org/#/c/723105 ?19:02
openstackgerritMerged zuul/zuul-jobs master: Revert "Do not set buildset_fact if it's not present in results.json"  https://review.opendev.org/72412019:09
openstackgerritMerged openstack/project-config master: Add vexxhost/google-directory-api-linux-agent  https://review.opendev.org/72390419:12
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json""  https://review.opendev.org/72413219:13
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json""  https://review.opendev.org/72413219:23
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json""  https://review.opendev.org/72413219:24
openstackgerritClark Boylan proposed opendev/system-config master: Don't restart the zuul scheduler in prod  https://review.opendev.org/72411519:25
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Revert "Revert "Do not set buildset_fact if it's not present in results.json""  https://review.opendev.org/72413219:26
clarkbmordred: https://review.opendev.org/724115 took a different approach there that hopefully makes testing happier19:26
mordredclarkb: ++19:29
openstackgerritMerged opendev/system-config master: Use python3 for ansible  https://review.opendev.org/72310519:53
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352819:55
mordredianw: were you the one who tracked down ntp vs systemd-timesync before?19:56
mordredianw: if so - see ^^ - there's an issue in focal with systemd-timesync too19:56
ianwmordred: that sounds like something simultaneously i remember and also have actively forgotten :)19:58
ianwthat goes back to ntp for focal?20:04
ianwi'm thinking afs01.dfw.openstack.org is not happy20:09
ianwping but no ssh for me ... pulling up a console20:09
ianwyeah, console there has ... bad stuff ... hung tasks etc and no response.  going to reboot it20:11
fungii wouldn't be surprised if we get a ticket from rackspace in the next few minutes saying there's a problem with the hypervisor host that vm is on20:12
ianwok, it's back ... i'm not sure about static01 now though20:13
ianwit's probably worth a reboot too with all the hung i/o apache processes20:13
ianw#status log reboot afs01.dfw.openstack.org due to host hypervisor issues killing server20:14
openstackstatusianw: finished logging20:14
ianwstatic rebooting20:14
ianwit couldn't ls /afs/openstack.org20:15
*** rchurch has joined #opendev20:16
smcginnisThat explains the failures I saw.20:17
ianwok it seems happy again20:17
ianw#status log follow-up reboot of static01 after a lot of hung i/o processes due to afs01 issues20:18
openstackstatusianw: finished logging20:18
ianwi think it's ok if a server just goes away ... but when they end up in a weird zombie state is when things go wrong20:18
ianwafs server20:19
ianwsmcginnis: yeah, unfortunately with a server that busy, that will have some effect on some jobs :/20:19
smcginnisIt was a promote job, so no big deal. It will just get taken care of with the next merge.20:20
ianwmordred: "Ubuntu 19.10's systemd package introduced /lib/systemd/system/systemd-timesyncd.service.d/disable-with-time-daemon.conf. This prevents systemd-timesyncd.service from starting if the ntp package has been installed. "20:23
ianwhttps://bugs.launchpad.net/ubuntu/+source/systemd/+bug/184915620:23
openstackLaunchpad bug 1849156 in systemd (Ubuntu Eoan) "systemd-timesyncd.service broken on upgrade to 19.10 if ntp was installed" [High,Confirmed]20:23
ianwmordred: yeah, we're preinstalling ntp during build -- https://nb04.opendev.org/ubuntu-focal-0000000004.log -- so i think that means systemd-timesyncd masks itself20:31
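[editor's note] The behaviour quoted from the Launchpad bug above can be sketched in Python. This is a rough illustration, not something run in the channel. It assumes the drop-in expresses its rule as negated `ConditionFileIsExecutable=!<path>` lines; that layout is an assumption and should be checked against the real file on a focal image. The function name `blocking_daemons` and the injected `is_executable` check are hypothetical helpers introduced here.

```python
import os
from typing import Callable, List

# Path quoted in the bug report text above.
DROPIN_PATH = (
    "/lib/systemd/system/systemd-timesyncd.service.d/"
    "disable-with-time-daemon.conf"
)


def _is_executable_file(path: str) -> bool:
    """Default check: a regular file that the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


def blocking_daemons(
    dropin_text: str,
    is_executable: Callable[[str], bool] = _is_executable_file,
) -> List[str]:
    """Return the daemon paths that would keep systemd-timesyncd from starting.

    Assumption: each rule is a line 'ConditionFileIsExecutable=!/some/path',
    meaning "only start if /some/path is NOT executable". Other condition
    syntax (for example a leading '|') is ignored by this sketch.
    """
    blockers = []
    for raw in dropin_text.splitlines():
        line = raw.strip()
        if not line.startswith("ConditionFileIsExecutable="):
            continue
        value = line.split("=", 1)[1].strip()
        # A negated condition fails when the named file IS executable.
        if value.startswith("!") and is_executable(value[1:]):
            blockers.append(value[1:])
    return blockers
```

On a real node one might read `DROPIN_PATH` and pass its contents in; a non-empty result would be consistent with timesyncd staying down because ntp was preinstalled during the image build.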
clarkbianw: thank you for taking care of afs20:38
*** diablo_rojo has quit IRC20:39
ianwnp20:40
*** diablo_rojo has joined #opendev20:42
clarkbinfra-root https://review.opendev.org/#/c/724115/ that passes testing now and should fix our zuul-scheduler ansible applications20:42
clarkb(this will have the side effect of updating our apache vhost config to compress javascript and css files as they are quite large for zuul dashboard)20:43
ianwi have to afk for a bit but lgtm20:50
openstackgerritMerged zuul/zuul-jobs master: Add ensure-virtualenv  https://review.opendev.org/72330920:51
clarkbianw: ^ fyi20:52
mordredclarkb: so that it doesn't fall into the cracks: https://review.opendev.org/#/c/723896/21:09
mordredclarkb: (when we restarted zuul this morning after OOM, there were no status backups to replay)21:10
clarkbmordred: looking21:10
mordredclarkb, ianw: I'm landing the patch to update nodepool system-config tests to run a real zookeeper. it should have no production impact - mostly just trying to squish the outstanding stacks21:12
mordredclarkb: I think you could rebase your reorg of playbooks on top of https://review.opendev.org/#/c/720527/ - I don't expect we have much _structural_ outstanding21:13
fungiyeah, it's maybe less awesome because it will now no longer record the content of error reports from the status api, but ultimately that was probably a poor means of debugging intermittent api errors regardless21:13
mordredyeah21:13
fungicounterargument, maybe hours-old status snapshots are useless21:14
fungiit's a good coffee-talk topic21:14
mordredmaybe - unless zuul has been effectively AWOL the whole time21:14
mordredso maybe they're better than nothing? but maybe they are useless21:14
clarkbmordred: looking (fwiw I was able to reproduce the zuul test failures that happened in CI locally and so am being distracted trying to sort that out)21:17
mordredooh- that's exciting21:19
clarkbhrm we don't collect logs on successful tests so hard to know if this exception is expected or not :) anyway I'm learning things /me dives back into the hole21:27
*** DSpider has quit IRC21:34
openstackgerritMonty Taylor proposed opendev/system-config master: Run zookeeper cluster in nodepool jobs  https://review.opendev.org/72070921:52
*** jrichard has left #opendev22:30
openstackgerritMerged opendev/glean master: Add container build jobs  https://review.opendev.org/72328522:43
diablo_rojomordred, corvus would I be correct in my understanding you are working on getting jitsi setup? (I apologize for being very out of the loop)22:57
fungidiablo_rojo: a poc is hosted at https://meetpad.opendev.org/ and there's a spec outlining the plan at https://docs.opendev.org/opendev/infra-specs/latest/specs/jitsi-meet.html23:02
corvusdiablo_rojo: yes, i think most of the infra team has pitched in on it actually :)  --  we think it's basically done -- at least, done enough for more testing to find out what isn't done23:02
corvusi have a small todo to add an http->https redirect (which is important because it does not work over http)23:03
openstackgerritClark Boylan proposed opendev/system-config master: Organize zuul jobs in zuul.d/ dir  https://review.opendev.org/72239423:07
clarkbmordred: corvus ^ rebased (actually I just redid it as I think that was simpler)23:07
diablo_rojoI had an idea for testing23:08
diablo_rojoEnd of release week I was planning a like.. celebration for ussuri. We could use that and see how it goes?23:08
diablo_rojoI don't know how many people will join exactly.23:08
diablo_rojoFigured I would throw it out there as an idea.23:08
diablo_rojoAssuming that's not too soon?23:09
diablo_rojofungi, thank you!23:09
fungidiablo_rojo: note that there's no dial-in access, it's internet-only23:09
diablo_rojo..I think that is okay? I dunno. Hadn't thought it all the way through really.23:10
corvusdiablo_rojo: that sounds like a great opportunity to throw people at the server and see what happens :)23:10
johnsommordred I see you have been doing some focal work. Have you run into an issue where the instance doesn't bring up the NIC?23:10
openstackgerritJames E. Blair proposed opendev/system-config master: Meetpad: redirect 80 to 443  https://review.opendev.org/72419923:11
corvusif i'm nginxing correctly, i think that's what's needed for the redirect ^23:11
diablo_rojocorvus, cool :) I can have a zoom link as a backup, but I figured it would be a good chance to test things in a more.. real world scenario?23:11
clarkbjohnsom: I don't think we've seen that but we also use glean23:11
johnsomI think our issue is we are adding the ifupdown package, which seems to confuse the new networking in focal and/or cloud-init23:12
johnsomOk, thought I would ask before I get too deep into this strangeness. Thanks23:13
diablo_rojocorvus, I'll get a link from you maybe early next week? (assuming that patch does the redirect you need)23:13
corvusdiablo_rojo: also note that meetpad is etherpad-focused, so if it works as designed, then people will see a big etherpad in the middle with faces on the side (but you can still turn on and off the etherpad individually).  so it's not the best party venue... or, well, it's like a party venue with a giant whiteboard in the middle... which is great if you want to play ascii pictionary i guess.  so, in short:23:14
corvusawesome party venue.  :)23:14
corvusdiablo_rojo: no need to wait, just head on over to https://meetpad.opendev.org/ any time23:14
corvusdiablo_rojo: (just make sure to use https:)23:15
corvus(we just need to land that http redirect so that if someone types "meetpad.opendev.org" into their browser without a protocol they go to the right place)23:16
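[editor's note] The http to https redirect discussed above can be illustrated with a small Python stand-in. This is not the nginx change under review at 724199, whose contents are not shown in this log; it is only a sketch of the idea, using the standard-library `http.server` module. The helper `https_location`, the handler class, and port 8080 are hypothetical choices made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def https_location(host_header: str, path: str) -> str:
    """Build the https URL for a plain-http request.

    Drops any ':port' suffix from the Host header so the client lands on
    the default https port. IPv6 literals are not handled in this sketch.
    """
    host = host_header.split(":", 1)[0]
    return f"https://{host}{path}"


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD with a permanent redirect to https."""

    def do_GET(self):
        # Fall back to the meetpad hostname if the client sent no Host header.
        host = self.headers.get("Host", "meetpad.opendev.org")
        self.send_response(301)
        self.send_header("Location", https_location(host, self.path))
        self.end_headers()

    do_HEAD = do_GET


def serve(port: int = 8080) -> None:
    """Run the redirector until interrupted (8080 avoids needing root)."""
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

Calling `serve()` and then requesting `http://localhost:8080/some-room` should return a 301 whose Location header uses the https scheme, which is the behaviour wanted for someone typing the bare hostname into a browser.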
diablo_rojocorvus, oh cool. Thanks and will do :)23:16
diablo_rojo(and noted)23:16
*** tosky has quit IRC23:39
SotKooh that etherpad integration is lovely23:46
