Wednesday, 2020-04-29

*** diablo_rojo has quit IRC00:01
*** mlavalle has quit IRC00:23
*** diablo_rojo has joined #opendev00:31
ianwjohnsom: instance as in that nic should be brought up by glean?00:58
mordredjohnsom: I have not seen that issue01:17
mordredianw: oh - haha: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_276/723528/9/check/system-config-run-zuul/276cc43/bridge.openstack.org/ara-report/result/4a8210fa-682e-4af6-b689-fdc263b00b49/01:18
mordredianw: this is where I'm at atm01:18
johnsomianw: I am just using the ubuntu-minimal element, which uses cloud-init. I am pretty sure it is the ifupdown we add (historical reasons) that is the troublemaker.01:20
johnsomI will dig in tomorrow01:20
johnsommordred: thank you.01:20
mordredjohnsom: we use simple-init instead of cloud-init01:21
mordredjohnsom: you might try that - ubuntu-minimal element itself doesn't install cloud-init I don't think - the ubuntu one does01:21
mordredbut you could try ubuntu-minimal simple-init and see if it works for you - it's working well for us01:22
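mordred's suggestion above maps to a diskimage-builder invocation along these lines. This is an illustrative sketch, not the actual opendev build config: `disk-image-create`, `DIB_RELEASE`, and the `ubuntu-minimal`, `simple-init`, and `vm` elements are real diskimage-builder pieces, but the output name is arbitrary and the `vm` element is included only as the usual companion for bootable images.

```shell
# Illustrative only: build a bionic image that uses glean (via the
# simple-init element) for NIC setup instead of cloud-init.
DIB_RELEASE=bionic disk-image-create \
    ubuntu-minimal simple-init vm \
    -o ubuntu-bionic-minimal
```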
mordredianw: 1.8.4~pre1-1ubuntu2 <-- focal has that version of afs - which is later than what's in our ppa01:22
mordredianw: maybe we should skip the ppa on foca?01:22
mordredfocal?01:22
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352801:25
mordredianw: giving that a try01:25
ianwmordred: i'm very suspicious of anything with ~pre in it wrt afs01:31
ianwmordred: i think we should update to 1.8.5, https://launchpad.net/~openafs/+archive/ubuntu/stable has packages but not for arm6401:38
ianwmordred: i've got 1.8.5 building for focal @ https://launchpad.net/~openstack-ci-core/+archive/ubuntu/openafs/+packages now ... let's see how it goes01:55
ianwi don't think it's worth updating the ppa with 1.8.5 for older distros.  if it ain't broke ...02:02
ianwamd64 and arm64 built.  will run integration tests when they publish02:22
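ianw's wariness about `~pre` comes from Debian version ordering, where `~` sorts before everything, including the empty string, so `1.8.4~pre1` is earlier than both `1.8.4` and `1.8.5`. A simplified sketch of dpkg's comparison rule (it ignores epochs and compares the revision as part of the string; for the real thing use `dpkg --compare-versions`):

```python
import re
from itertools import zip_longest

def _char_order(c):
    # dpkg ordering: '~' sorts before end-of-string, letters before
    # other characters, everything else after letters
    if c == '~':
        return -1
    if c == '':
        return 0
    if c.isalpha():
        return ord(c)
    return ord(c) + 256

def debcmp(a, b):
    """Return -1/0/1 comparing two Debian version strings.

    Simplified sketch: no epoch handling, and the Debian revision is
    compared as part of the string rather than split off.
    """
    while a or b:
        # compare the leading non-digit runs character by character
        na = re.match(r'\D*', a).group()
        nb = re.match(r'\D*', b).group()
        for ca, cb in zip_longest(na, nb, fillvalue=''):
            oa, ob = _char_order(ca), _char_order(cb)
            if oa != ob:
                return -1 if oa < ob else 1
        a, b = a[len(na):], b[len(nb):]
        # compare the leading digit runs numerically
        da = re.match(r'\d*', a).group()
        db = re.match(r'\d*', b).group()
        ia, ib = int(da or 0), int(db or 0)
        if ia != ib:
            return -1 if ia < ib else 1
        a, b = a[len(da):], b[len(db):]
    return 0

# the '~pre' prerelease in focal sorts before the 1.8.5 in the ppa:
assert debcmp("1.8.4~pre1-1ubuntu2", "1.8.5-1") == -1
# and before the final 1.8.4 release itself:
assert debcmp("1.8.4~pre1", "1.8.4") == -1
```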
openstackgerritIan Wienand proposed opendev/system-config master: Add focal integration tests  https://review.opendev.org/72421402:30
ianw^ passed ... so i think that's the way to go02:59
clarkbI'll try to figure out why https://review.opendev.org/#/c/722394/ is failing first thing tomorrow to reduce the time it's sitting out there conflicting03:12
clarkbI think the docs issue is a zuul sphinx bug03:14
*** diablo_rojo has quit IRC04:51
*** ykarel|away is now known as ykarel04:51
*** ysandeep|away is now known as ysandeep05:45
*** ysandeep is now known as ysandeep|brb05:55
*** rpittau|afk is now known as rpittau06:32
*** ysandeep|brb is now known as ysandeep06:51
*** DSpider has joined #opendev06:57
*** dpawlik has joined #opendev06:59
*** tosky has joined #opendev07:29
*** ykarel is now known as ykarel|afk07:29
*** ysandeep is now known as ysandeep|lunch07:42
*** ralonsoh has joined #opendev07:48
kevinzArm64 bionic images fail to set up Devstack today, with a newly built image: https://zuul.opendev.org/t/openstack/build/2b2cbea8882844bfa0bf5cc62c70524208:04
*** ykarel|afk is now known as ykarel08:21
*** ysandeep|lunch is now known as ysandeep08:26
*** ykarel is now known as ykarel|lunch08:35
*** ykarel|lunch is now known as ykarel09:44
*** mrunge has quit IRC09:57
*** mrunge has joined #opendev09:57
*** rpittau is now known as rpittau|bbl10:55
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add loop var policy to ansible-lint  https://review.opendev.org/72428111:02
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add loop var policy to ansible-lint  https://review.opendev.org/72428111:22
*** ysandeep is now known as ysandeep|coffee11:29
*** rpittau|bbl is now known as rpittau12:31
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431012:33
openstackgerritThierry Carrez proposed opendev/system-config master: Disable global Github replication  https://review.opendev.org/71847812:37
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431012:50
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431012:54
*** ykarel is now known as ykarel|afk12:58
mordredianw: awesome12:58
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352812:59
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431013:00
* ttx sighs13:02
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352813:07
openstackgerritMonty Taylor proposed opendev/system-config master: Run test playbooks with more forks  https://review.opendev.org/72431713:07
mordredttx: yes?13:08
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431013:10
ttxmordred: things my local tests fail to catch13:11
mordredttx: oh look - there are some repos in governance that have been retired13:11
ttxfun13:11
AJaegerttx, what about using a project-template?13:11
ttxAJaeger: would that really be less wordy?13:12
AJaegerttx, one line instead of three - and if you ever want to change, it's easier.13:12
mordredttx: I look forward to a release-bot job that runs on governance and project-config changes and proposes update patches :)13:12
AJaegerName it "official-openstack-repo-jobs" ;)13:12
ttxhmmm13:13
AJaegeror something like that...13:13
ttxok. I just need to change the code that actually generates that change :)13:13
mordredttx: also - fwiw, if you grab project-config and look in gerrit/projects.yaml - if acl-config for a project is "/home/gerrit2/acls/openstack/retired.config" - it's retired13:13
mordredI don't know if that makes anything easier - but just mentioning in case it does13:14
ttxI already had a lot of fun with your weird alpha comparison in there13:14
mordredwe have weird alpha comparison?13:14
ttxyou normalize - and _13:14
mordredoh - good for us13:14
ttxI know, right13:14
mordredwe like to make things easier for you13:15
mordredit's what we're here for13:15
mordredespecially when jeepyb is involved13:15
ttxYou think I'm getting too lazy on my python coding13:15
ttxso you throw weird curve balls at me13:15
mordredI do - I wouldn't want your muscles to atrophy13:15
ttxok, stay put, adding template13:15
AJaegerttx, we don't normalize - sort does13:16
ttxhmm13:16
AJaegerLC_ALL=C sort --ignore-case ...13:16
ttxI think you do https://opendev.org/openstack/project-config/src/branch/master/tools/zuul-projects-checks.py#L4113:17
AJaegerttx, yeah that surprised me once as well13:17
ttxreturn s.lower().replace("_", "-")13:17
AJaegerttx, you're right. I looked at the wrong place.13:17
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: tox: allow tox to be upgraded  https://review.opendev.org/69005713:24
fungittx: mordred: in the opendev metrics script i'm using, i check whether project.state is ACTIVE (as opposed to READ_ONLY) as a proxy for identifying what isn't retired13:33
fungithough i ultimately only use that as a filter right now for building a list of namespaces with at least one non-retired project13:34
fungican just hit the gerrit rest api method for /project/13:35
fungianonymously even13:35
fungino need to separately parse acl config options in project-config13:35
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add loop var policy to ansible-lint  https://review.opendev.org/72428113:36
fungittx: actually if you call for a list of projects by regex like ^openstack/.* then gerrit will give you just that subset. filter it by ['state'] == 'ACTIVE' and then if you want, use a set.intersection() against the set of deliverable repos13:39
fungifrom governance13:40
fungiassuming you really just want the subset of writeable openstack namespace repos and deliverable repos in the governance projects.yaml, that is13:41
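fungi's approach above: query Gerrit's `/projects/` REST endpoint (optionally with a `?r=` regex like `^openstack/.*`) and keep only entries whose `state` is `ACTIVE`. Gerrit prefixes all of its JSON responses with the magic `)]}'` line to defeat XSSI, so that has to be stripped first. A sketch of the parsing step, using a hypothetical sample payload rather than a live API call:

```python
import json

def active_projects(raw):
    """Parse a Gerrit REST /projects/ response, keeping ACTIVE projects."""
    # Gerrit prepends the line ")]}'" to JSON responses (XSSI guard)
    payload = json.loads(raw.split("\n", 1)[1])
    return sorted(name for name, info in payload.items()
                  if info.get("state") == "ACTIVE")

# Hypothetical sample of what /projects/?r=^openstack/.* might return:
sample = """)]}'
{"openstack/nova": {"state": "ACTIVE"},
 "openstack/retired-repo": {"state": "READ_ONLY"}}"""
print(active_projects(sample))  # ['openstack/nova']
```

The resulting set can then be intersected with the deliverable repos from the governance projects.yaml, as fungi suggests.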
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add loop var policy to ansible-lint  https://review.opendev.org/72428113:41
*** ykarel|afk is now known as ykarel13:46
ttxAJaeger: where are templates defined those days?13:47
AJaegeropenstack-zuul-jobs13:48
ttxAlso to make it one line instead of 3 I have to break how 'templates:' is used in that file13:48
AJaegerI would be fine to put the template into a new file in project-config/zuul.d/project-templates.yaml13:50
AJaegerttx, let's see what other reviewers think...13:50
ttxopenstack-zuul-jobs looks ok to me13:51
hrwmorning14:01
hrwcan someone look at arm64 mirrors in opendev infra? jobs fail while connecting14:02
*** rkukura has quit IRC14:03
AJaegerttx, I updated your description a bit.14:04
*** rkukura has joined #opendev14:04
openstackgerritSorin Sbarnea proposed zuul/zuul-jobs master: Add testing of fetch-sphinx-tarball role  https://review.opendev.org/72158414:05
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431014:06
ttxarh14:07
AJaegerttx, now double the size? Is that what arh means?14:08
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431014:08
ttxsorry, I fumbled14:08
ttxok, this one /should/ be ok14:09
ttx(famous last words)14:09
ttxgood thing being, my script is not idempotent :)14:10
ttxs/not/now14:10
ttxI need to sleep14:10
fungihrw: have a link to a failed build report? it'll be easier to start from there14:11
fungihrw: since we have arm-specific regions, i'm going to guess something has happened to the local mirror server in one of them14:12
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352814:12
AJaegerttx, LGTM14:12
*** hashar has joined #opendev14:12
mordredcorvus, fungi: apparently https://podman.io/getting-started/installation.html says that we should be getting podman and friends from opensuse kubic now rather than the ppa14:13
ttxalso submitted https://review.opendev.org/724334 so that the repos removed from Zuul are no longer in governance14:13
AJaegerinfra-root, please review  https://review.opendev.org/718478 https://review.opendev.org/724329 and https://review.opendev.org/#/c/724310 together - that's github mirroring change14:13
ttxThey should be linked by depends on14:14
AJaegerttx, https://review.opendev.org/#/c/721723/ exists for i18n-specs, let's get it merged ;)14:14
*** avass has joined #opendev14:15
ttxwait, I left the post job defined in release-test14:15
AJaegerttx, they are correctly linked - still I wanted to have people review them together.14:15
ttxone laaaast update14:15
zbrdo we still have problems with zuul? i just got a job failed with "EXCEPTION" at https://review.opendev.org/#/c/721844/14:15
hrwfungi: https://zuul.opendev.org/t/openstack/build/50687729da274e74a70ad8fd9e9fb26d https://zuul.opendev.org/t/openstack/build/c1a316962a044ac79917b2a4a7130777 https://zuul.opendev.org/t/openstack/build/fcdc90d63fde4038aa6dc4fd1689ac7f14:16
hrwsorry, had a call14:16
hrwfungi: ubuntu debian centos814:16
openstackgerritThierry Carrez proposed openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431014:16
AJaegerinfra-root, Zuul is dead again - https://zuul.opendev.org/tenants gives error 50014:17
AJaegerinfra-root, seems we again ran out of swap: http://cacti.openstack.org/cacti/graph.php?action=zoom&local_graph_id=64794&rra_id=0&view_type=tree&graph_start=1588083483&graph_end=158816988314:18
corvusi'll restart it14:20
AJaegerthx, corvus14:20
fungioom already killed a scheduler process (likely the geard fork again) at 14:01:3214:20
corvussomething happened to free up memory at 22:40 yesterday14:21
corvusit looks like zuul-web is the big user here14:21
fungithe fact that our last good queue backup (openstack_status_1588168861.json) was at 14:01 i think confirms it was geard which got killed14:21
mordredcorvus: clarkb updated the apache to do better caching14:22
fungier, no, just that's also when the web process started returning 50014:22
corvusmordred: did that get restarted?14:22
mordredcorvus: I think I remember him saying he was going to restart apache to pick up the new settings14:23
corvuscool; i'll restart it again just in case14:23
fungi2020-04-28 12:12:33,399 INFO zuul.Scheduler: Starting scheduler14:23
fungithat's the last scheduler start i find in the logs14:23
mordredcorvus: kk14:23
fungiwhich was after the oom a couple days ago i think14:23
fungi(immediately after i mean)14:23
corvuszuul-web isn't shutting down  after docker-compose stop14:24
*** priteau has joined #opendev14:24
corvusnor fingergw14:25
corvusi ran 'docker stop' + their container ids14:25
corvusthat seems to have stopped them14:25
mordredcorvus: wow14:25
avasscorvus: interesting14:25
corvusstarting again14:25
corvusoh14:26
corvusi forgot they are in different docker-compose directories14:26
corvusso operator error on my part there, sorry14:27
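The "operator error" corvus describes: with each service in its own docker-compose project directory, `docker-compose stop` only acts on the services defined in the compose file of the directory it is run from, so zuul-web and fingergw kept running until stopped from their own directory (or via `docker stop` with the container ids, as done above). A sketch with illustrative directory names, not the actual opendev layout:

```shell
# docker-compose only manages the project in the current directory,
# so each service has to be stopped from its own compose directory.
# (Directory names are illustrative.)
cd /etc/zuul-scheduler && docker-compose stop
cd /etc/zuul-web && docker-compose stop   # covers zuul-web and fingergw
```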
*** tkajinam has joined #opendev14:27
corvus-rw-r--r-- 1 root root 1340022 Apr 29 14:01 openstack_status_1588168861.json14:27
corvusi'll restore from that14:27
AJaegershould we send an all green status afterwards?14:27
fungithanks, that matches the last good queue backup i found as well14:28
*** sean-k-mooney has joined #opendev14:28
*** stephenfin has joined #opendev14:28
*** dpawlik has quit IRC14:28
fungidoesn't look like the --fail option addition patch for that curl cronjob has gotten applied yet14:28
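The `--fail` fix fungi refers to: without it, curl exits zero on an HTTP 500 and writes Zuul's error page over a good queue backup. A hedged sketch of the corrected cron entry; the URL and backup path are assumptions inferred from the `openstack_status_<epoch>.json` filenames seen above:

```shell
# With --fail, curl exits non-zero on an HTTP error instead of saving
# the error page as a "backup". (URL and path are assumptions.)
* * * * * curl --fail -sS -o \
    /var/lib/zuul/backup/openstack_status_$(date +\%s).json \
    https://zuul.opendev.org/api/tenant/openstack/status
```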
corvusit's up and re-enqueing14:29
corvusyeah, a status notice indicating 14:04 - 14:30 should be rechecked would probably be good14:30
fungi14:01 looks like?14:30
fungithough also who knows how many events might have also been lost in its events queue14:31
AJaeger status notice Zuul had to be restarted, all changes submitted or approved between 14:04 UTC to 14:30 need to be rechecked, we rechecked already those running at 14:0414:31
AJaegeris that good? Anybody better wording?14:31
corvusAJaeger: yes, except i typod: should be 14:01.  maybe just say 14:0014:31
fungii would probably round that start time to 14:00 just to be on the safe side, though you could say we've queued earlier changes14:32
fungi(we didn't technically recheck them)14:32
AJaegerSo: #status notice Zuul had to be restarted, all changes submitted or approved between 14:00 UTC to 14:30 need to be rechecked, we queued already those running at 14:0014:32
fungilgtm14:33
AJaeger#status notice Zuul had to be restarted, all changes submitted or approved between 14:00 UTC to 14:30 need to be rechecked, we queued already those running at 14:0014:33
openstackstatusAJaeger: sending notice14:33
-openstackstatus- NOTICE: Zuul had to be restarted, all changes submitted or approved between 14:00 UTC to 14:30 need to be rechecked, we queued already those running at 14:0014:33
openstackstatusAJaeger: finished sending notice14:36
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352814:38
corvusdone enqueing14:41
corvuswe should check in on this in a few hours and see what's going on14:41
*** mlavalle has joined #opendev14:41
fungiso looking at our scheduler restart timelines, we didn't restart for weeks up until 2020-04-2014:42
fungithere was a memory jump on 2020-04-16 but corvus you said that coincided with when you were doing stuff with repl, yeah?14:42
corvuswas it that one? or was it before that?  i can't recall14:43
corvusthis is the current usage of the processes: http://paste.openstack.org/show/792884/14:43
fungiif we discount that, a leak could have been brought in with the 2020-04-20 13:39:31 restart (though we didn't see evidence of a leak) or the 2020-04-24 22:22:05 restart14:45
fungiwe saw our first oom event between the 2020-04-25 15:56:59 and 2020-04-28 12:12:33 restarts14:45
corvusAJaeger: we cleared out more swap on this restart -- maybe not everything was restarted yesterday (especially zuul-web)?14:47
fungihrw: looking at the errors you linked, the first is about "libssl-dev : Depends: libssl1.1 (= 1.1.0g-2ubuntu4) but 1.1.1-1ubuntu2.1~18.04.5 is to be installed" which suggests there's a mismatch between an older libssl-dev you're trying to install and a libssl1.1 package14:56
mordredcorvus: should we combine the compose file and just have the scheduler start run docker-compose up scheduler -d  - and the web start do  up web fingergw -d ?14:56
fungihrw: the second looks like it's that we're not hosting a /debian/dists/buster-backports/main/source/Sources on our mirror network14:57
mordredcorvus: it seems like the experience with split files so far is not super positive14:57
corvusmordred: i don't know; cause we may well want to split these to different hosts later, so keeping the roles separate may be good.14:57
corvusi'm inclined to stick with the status quo a bit longer14:57
fungihrw: and the third looks like missing (maybe stale) content under /centos/8/AppStream/aarch64/os/repodata/14:58
mordredcorvus: ok14:58
fungihrw: we should probably look into each of these as separate problems, i don't see that they're likely to be related in any way14:58
hrwfungi: backports were hosted, as we have used them for months in kolla.14:58
hrwfungi: ok. it just happened at the same time14:59
fungihrw: but also, none of these look like connection failures as you originally suggested15:00
hrwsorry15:00
hrwwill do some checking later during evening.15:00
fungihrw: so for the first one, it may be a problem with the mirror server in that region as i can retrieve that directory on our other servers in our mirror network, e.g. mirror.iad.rax.openstack.org/debian/dists/buster-backports/main/source/Sources15:02
fungii mean, that file15:04
fungiand i guess that was actually the second failure example, not the first15:05
*** ykarel is now known as ykarel|away15:06
fungiokay, so second and third failures look like they may be related, as the files they complained about being unable to retrieve are available from endpoints in our other locations15:07
fungii think i see the problem too15:08
fungi/dev/vda1        78G   78G     0 100% /15:08
fungirootfs has filled up on mirror.regionone.linaro-us.opendev.org15:08
*** larainema has joined #opendev15:08
fungiit's possible the first example was also a cascade failure related to being unable to fetch some file from the mirror server15:09
fungiseeing what i can do to clean up the disk some and then i'll reboot the server to make sure it's operating correctly again, but ultimately this was built with a too-small rootfs and seems to have no separate volume to put caches on15:11
fungiit was likely deemed "good enough" when there was very little arm64 testing going on15:11
fungibut it's clearly insufficient now15:11
hrwkevinz: ^^ we need to enlarge rootfs on mirror.regionone.linaro-us.opendev.org15:11
hrwfungi: suggested size?15:12
fungieither get cinder volumes working there or rebuild it with a flavor which has at least a 200gb rootfs15:12
hrwfungi: thanks15:12
fungiright now it looks like it probably used the same flavor as our test nodes, which just doesn't provide nearly enough disk space for the afs and apache proxy caches15:13
hrwsure15:13
clarkbmordred: corvus we need https://review.opendev.org/#/c/724115/ to land to apply zuul-web updates15:13
clarkbuntil that gets fixed my changes to apache config are never applied because we fail in the zuul-scheduler portion of our zuul service playbook and never get to zuul-web15:14
fungihrw: kevinz: https://docs.opendev.org/opendev/system-config/latest/contribute-cloud.html actually suggests a 500gb disk, but we can get by with 200gb right now15:14
mordredclarkb: any reason to not land that now?15:15
clarkbmordred: I dont think so15:15
mordredclarkb: great. +A15:15
clarkbfungi: no lists OOMs overnight15:16
hrwfungi: mailed kevinz to be sure it is not lost15:18
fungiclarkb: yeah, still a bit of a memory spike between 09:00-10:00 but not like previous days15:18
clarkbfungi: I think that was bup15:19
fungihrw: we've got 2.6gb free on the rootfs now, after deleting the systemd journal and rebooting15:19
hrwfungi: thank you15:20
clarkbfungi: hrw was it booted on a larger flavor?15:20
fungii'll also make sure apache tries to purge as much of its proxy cache as it thinks is stale, though that can take a while and may not buy much additional room15:20
fungiclarkb: it was not, no, it's got a 80gb rootfs and no cinder volume15:20
hrwclarkb: kevinz is an admin there but it is night in China now so it has to wait for tomorrow15:21
fungilooks like it already started a fresh htcacheclean run at boot, so i won't run a separate one15:21
fungilooks like this server was created 2020-01-2215:22
fungiwhich i guess is when the new region was being brought online15:22
fungii'll check the flavor list just to make sure there's not already a suitable one available to us for this15:23
fungimaybe the mirror server was just incorrectly created15:23
corvusclarkb: do we need a zuul-web restart,  or just apache?15:23
corvusclarkb: (once 115 lands)15:23
openstackgerritMonty Taylor proposed opendev/system-config master: Use only timesyncd on focal  https://review.opendev.org/72435415:25
mordredcorvus: I believe just an apache15:25
mordredcorvus: and that should reduce the pressure on zuul-web aiui15:25
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add loop var policy to ansible-lint  https://review.opendev.org/72428115:26
openstackgerritClark Boylan proposed opendev/system-config master: Set up robots.txt on lists servers  https://review.opendev.org/72435615:26
clarkbcorvus: just apache15:26
fungiclarkb: hrw: kevinz: i've confirmed with openstack flavor list, the largest flavor available is 80gb, but we do seem to have some cinder quota there (the arm64 nodepool builder is using 400gb of it), so i'll try to add a 200gb cinder volume for the mirror15:27
clarkbinfra-root ^ change above puppetizes the robots.txt change I made on lists.o.o which seems to have addressed the OOMing15:27
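The robots.txt clarkb mentions is not shown in the log, so its exact contents are unknown; the usual shape for taking crawler pressure off a Mailman/pipermail server looks something like the following. All paths and values here are hypothetical:

```
# Hypothetical robots.txt for a Mailman list server; the actual rules
# applied to lists.openstack.org are not shown in this log.
User-agent: *
Disallow: /cgi-bin/
Crawl-delay: 2
```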
hrwfungi: thanks15:27
AJaegerinfra-root, please review  https://review.opendev.org/718478 https://review.opendev.org/724329 and https://review.opendev.org/#/c/724310 together - that's github mirroring change. Let's get those quickly in to reduce merge conflicts further...15:27
clarkbAJaeger: I've approved the bottom of the stack but might be good to have corvus ack https://review.opendev.org/#/c/724310 since he was involved in getting the jobs set up right15:29
fungihrw: kevinz: clarkb: okay, so we either need a bigger rootfs flavor or more cinder quota... VolumeSizeExceedsAvailableQuota: Requested volume or snapshot exceeds allowed gigabytes quota. Requested 200G, quota is 500G and 400G has been consumed.15:30
clarkbfungi: did we have a volume that disappeared itself?15:30
clarkboh no its the builder15:30
fungii could make a 100gb volume in there, but that won't be enough15:30
fungiclarkb: yeah, we have 500gb quota but like i said the arm64 nodepool builder is using 40015:31
clarkbfungi: 100GB might be enough if we put apache cache on cinder and afs cache on root disk15:31
fungiyeah, i can probably make that work if i mount creatively15:31
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352815:37
openstackgerritMonty Taylor proposed opendev/system-config master: Use only timesyncd on focal  https://review.opendev.org/72435415:37
clarkbinfra-root actually https://review.opendev.org/724356 is likely to fail on the same job my zuul config reorg is failing on. We need to set at least one private hiera var in testing. I'll sort that out shortly15:39
clarkbthe other major issue is that zuul-sphinx doesn't seem to like it when you have configs in subdirs under zuul.d/ which the zuul docs say is valid15:39
clarkbI'll work on sorting that out too15:39
fungihrw: a few builds in that region may get connection refused errors while i've got apache stopped to relocate its cache onto the separate cinder volume15:42
fungishouldn't be long15:42
hrwfungi: ok15:44
openstackgerritClark Boylan proposed opendev/system-config master: Set up robots.txt on lists servers  https://review.opendev.org/72435615:48
fungi#status log moved /var/cache/apache2 for mirror01.regionone.linaro-us.opendev.org onto separate 100gb cinder volume to free some of its rootfs15:49
openstackstatusfungi: finished logging15:49
clarkbfungi: thanks!15:49
fungihrw: kevinz: clarkb: ^ we could still stand an additional 100gb of cinder quota so we can move the afsclient cache off the rootfs as well, but that should get us through for now15:50
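A sketch of what "moved /var/cache/apache2 onto a separate cinder volume" typically involves on a server like this. The device name, filesystem, and intermediate mountpoint are assumptions, not taken from the log:

```shell
# Sketch only: /dev/vdb and ext4 are assumptions.
systemctl stop apache2
mkfs.ext4 /dev/vdb
mount /dev/vdb /mnt
mv /var/cache/apache2/* /mnt/
umount /mnt
echo '/dev/vdb /var/cache/apache2 ext4 defaults 0 2' >> /etc/fstab
mount /var/cache/apache2
systemctl start apache2
```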
fungiianw: ^ heads up, i know you've been dealing with that region more than most of us, so just be aware15:51
clarkbmordred: https://zuul.opendev.org/t/openstack/build/cd423a364032445bbd9cb4f200c0c871/log/job-output.txt#51660 thats a puppet test failure for nodepool puppeting because it needs a private sshkey in 'hiera'. I expect this is going to be handled by the ansibleification. Do we want to hold off on the zuul.d reorg for that as that should fix the job? I've got to fix the lists job and zuul-sphinx (with a release)15:51
clarkbanyway so waiting may not be terrible15:51
clarkbmordred: note https://review.opendev.org/724356 bundles the lists fix so that the robots.txt change can land15:51
clarkbI'm going to work on zuul-sphinx now15:51
mordredclarkb: ++15:53
hrwfungi: thanks15:56
fungiwelcome!15:57
openstackgerritMerged openstack/project-config master: Add Github mirroring job to all official repos  https://review.opendev.org/72431015:58
openstackgerritMonty Taylor proposed opendev/system-config master: Run zookeeper cluster in nodepool jobs  https://review.opendev.org/72070916:07
mordredclarkb, corvus : rereview of that ^^ when you get a sec, it was missing a private hiera var16:08
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052716:10
*** ysandeep|coffee is now known as ysandeep|away16:12
openstackgerritMonty Taylor proposed opendev/system-config master: Use only timesyncd on focal  https://review.opendev.org/72435416:13
clarkbmordred: I think thats the fix for the nodepool job failure I had too fwiw16:17
*** Dmitrii-Sh0 has joined #opendev16:20
*** rkukura has quit IRC16:21
*** rkukura has joined #opendev16:21
mordredclarkb: yeah - I think so16:22
*** hashar has quit IRC16:22
*** rkukura has left #opendev16:23
clarkbinfra-root I've noticed that I never committed our host var / group var changes from friday to fix ssh keys for zuul things16:23
ttxfungi, mordred, corvus: looking at https://zuul.openstack.org/builds?job_name=openstack-upload-github-mirror it seems to have picked up mirroring16:23
fungiwoo!16:23
ttxThe duration of those jobs is still a bit disturbing16:24
*** Dmitrii-Sh has quit IRC16:24
*** Dmitrii-Sh0 is now known as Dmitrii-Sh16:24
clarkbI'm committing those changes now16:24
ttxthe mirroring itself takes about 5 seconds but there is lots of boilerplate16:24
ttxabout 50s to set up the job and 30s to close it16:25
ttxI wonder if that blocks the executors for too long16:27
AJaegerclarkb, fungi, can you now +A https://review.opendev.org/#/c/718478/ to remove the gerrit mirroring?16:27
fungittx: the executors run ansible for multiple builds in parallel, and really only throttle taking on new builds if they start to get excessive system load or memory utilization16:28
ttxok, maybe keep an eye on it and see if it horribly slows down things or not16:29
ttxat least the refs/changes cleanup was useful16:29
fungittx: so running an additional 1-2 minute job for each commit which merges is probably not going to make a dent given the average number of test-hours each of those changes consumed already16:30
ttxFYI it took about 4 days of continuous work to clean them all up16:30
fungithat's a lot of deletions, indeed16:32
ttxalso github fails horribly when you try to delete more than 100 at a time16:32
fungiyou could have just stopped at "github fails horribly" ;)16:33
ttxfungi: https://review.opendev.org/#/c/718478/ ready to go16:34
ttx(removing the gerrit-level mirroring)16:34
mordredttx: this is very exciting - thanks for doing that work!16:35
ttxnow I can start to aggressively move abandoned things out of the openstack org16:35
ttxthanks all for the help!16:36
AJaegermordred: want to +A the change after your +2 and corvus', please? It is ready now to merge16:37
mordredAJaeger: done16:37
*** rpittau is now known as rpittau|afk16:37
AJaegerthx16:37
fungittx: we'll still need a gerrit restart after 718478 merges16:38
fungisince the replication plugin config is only read on gerrit service startup16:39
ttxnoted16:39
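The restart is needed because Gerrit's replication plugin reads replication.config only when the service starts, as fungi notes above. The stanza being removed would look roughly like this; the option values are illustrative, not the actual opendev config:

```
# Illustrative replication.config stanza (git-config syntax); Gerrit
# re-reads this file only on service restart.
[remote "github"]
    url = git@github.com:${name}.git
    mirror = true
    replicationDelay = 1
```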
AJaegerfungi: do we need a long downtime for it?16:39
clarkbAJaeger: no just a restart16:39
clarkbshouldn't be more than a couple minutes16:40
fungiAJaeger: nope, just a quick restart16:40
AJaegerso, something we can sneak in today?16:40
fungimaybe, though this is a busy couple of weeks for openstack16:40
fungifinal rcs next week16:40
*** Dmitrii-Sh has quit IRC16:41
AJaegerif we don't restart, is there a problem that gerrit pushes and the job pushes as well?16:41
fungithough there's not much of a rush on the gate at this point, mostly just the deployment projects trying to catch up i think16:42
*** Dmitrii-Sh has joined #opendev16:42
openstackgerritMerged zuul/zuul-jobs master: Add loop var policy to ansible-lint  https://review.opendev.org/72428116:42
AJaegerfungi: yes, RC1 is cut for most (all?) repos16:42
fungiAJaeger: not really, it may just generate errors when it tries to replicate to nonexistent github repos16:42
fungi(errors nobody will see unless they look at the gerrit error log)16:43
AJaeger;)16:43
fungithough it might also trigger some sort of throttling behavior from github if we're bombarding them with replication attempts to nonexistent repos16:43
fungiso probably still best avoided16:43
AJaegerso, let's ask ttx to wait with repo removal until the restart16:44
openstackgerritMonty Taylor proposed opendev/system-config master: Run zookeeper cluster in nodepool jobs  https://review.opendev.org/72070916:44
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052716:44
ttxsure, np16:45
mordredclarkb, corvus : sorry - one more time - I forgot that puppet only recognizes a single group for a host - so putting the vars in nodepool-launcher was bogus - they have to go in nodepool16:45
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352816:56
openstackgerritMonty Taylor proposed opendev/system-config master: Use only timesyncd on focal  https://review.opendev.org/72435416:56
*** tkajinam has quit IRC16:57
clarkbinfra-root https://review.opendev.org/#/c/723756/ increases the system-config-run-zuul job's timeout because we get failures like https://zuul.opendev.org/t/openstack/build/7c4c4eaaaf0646c08ac0355212c6f60b17:01
clarkbI think openafs dkms builds are a good chunk of that17:02
mordredclarkb: I've actually got an increase in zuul-executor patch too :)17:02
openstackgerritMonty Taylor proposed opendev/system-config master: Increase timeout on system-config-run-zuul  https://review.opendev.org/72375617:03
AJaegersystem-config reviewer, https://review.opendev.org/723251 removes git*openstack.org from cacti, can we merge that, please?17:03
openstackgerritMonty Taylor proposed opendev/system-config master: Run test playbooks with more forks  https://review.opendev.org/72431717:04
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352817:04
openstackgerritMonty Taylor proposed opendev/system-config master: Use only timesyncd on focal  https://review.opendev.org/72435417:04
mordredclarkb: ^^ rebased yours then rebased the focal stack on top of it so I could de-dupe the timeout increase17:05

clarkbmordred: cool17:05
clarkband actually my system-config reorg doesn't have that in it17:05
clarkbso once we think all three of the failing jobs on ^ are fixed I should refresh it with up to date content again17:05
clarkbthen try and land it fast :)17:06
mordredclarkb: :)17:08
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052717:19
mordredclarkb: https://review.opendev.org/#/c/720709/ is green now17:29
AJaegermordred: could I trouble you with reviewing 723251, please? Will make scrolling for zuul in cacti a tiny bit easier ;)17:29
mordredcorvus: https://review.opendev.org/#/c/723889/ could use a re-review17:29
clarkbmordred: +A17:30
*** priteau has quit IRC17:32
AJaegermordred: thx17:32
*** diablo_rojo has joined #opendev17:33
*** priteau has joined #opendev17:34
*** priteau has quit IRC17:40
*** ralonsoh has quit IRC17:41
openstackgerritClark Boylan proposed opendev/puppet-mailman master: Create /srv/mailman  https://review.opendev.org/72438917:47
openstackgerritClark Boylan proposed opendev/system-config master: Set up robots.txt on lists servers  https://review.opendev.org/72435617:48
clarkbmordred: ^ eventually we should get a working test there17:48
openstackgerritClark Boylan proposed opendev/puppet-mailman master: Create /srv/mailman  https://review.opendev.org/72438918:03
mordredclarkb, corvus: woot - https://review.opendev.org/#/c/720527/ is green again18:03
*** priteau has joined #opendev18:10
openstackgerritMonty Taylor proposed opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352818:11
clarkbmordred: I need to pop out for a bit but will try and properly review that one after18:12
*** priteau has quit IRC18:14
mordredclarkb: maybe by the time you get back the executor-on-focal patch will be grreen too18:26
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx  https://review.opendev.org/72233918:37
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node  https://review.opendev.org/72407918:37
*** iurygregory has quit IRC18:38
*** Dmitrii-Sh1 has joined #opendev18:38
*** Dmitrii-Sh has quit IRC18:42
*** Dmitrii-Sh1 is now known as Dmitrii-Sh18:42
clarkbmordred: I don't think the system-config-run jobs that use puppet are doing depends-on properly19:02
clarkbmordred: https://review.opendev.org/#/c/724356/3 that is still failing even though its parent creates the dir it complains about19:03
corvuswhere do modules like puppet-mailman get installed?19:40
clarkbcorvus: I think /etc/puppet/modules then they are ansible synchronized onto remote hosts into the puppet install19:42
clarkbthats /etc/puppet/modules on bridge19:43
*** hashar has joined #opendev19:43
clarkblooking at production bridge that seems to be the case19:43
corvusansible-role-puppet does the copying from bridge to remote node19:44
corvusso what, in the system-config-run job puts the repo in /etc/puppet/modules?19:44
*** priteau has joined #opendev19:44
clarkbcorvus: looks like playbooks/roles/install-ansible/tasks/main.yaml calling install_modules.sh19:45
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Support multi-arch image builds with docker buildx  https://review.opendev.org/72233919:45
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: DNM Run builder tests on expanded node  https://review.opendev.org/72407919:45
mordredyes - install-ansible19:46
mordredclarkb, corvus and yes - getting install-modules to use zuul prepared repos is one of the next things on my list19:46
clarkbmordred: ok in this case should we just land https://review.opendev.org/724389 and recheck https://review.opendev.org/724356 ?19:47
mordredwe can actually simplify it quite a bit19:47
mordredyes - I think that would be better than waiting on reworking install-modules19:47
clarkbmordred: https://review.opendev.org/#/c/720709/ failed on timed out nodepool job fwiw19:47
mordredI also think we can improve it even past just installing from zuul - to not syncing all of the puppet to every host19:48
mordredbut I think that's two steps19:48
mordredclarkb: sigh. afs modules19:48
mordredclarkb: we should maybe bump the timeout there in the same way19:49
clarkbif afs is involved then very likely19:49
mordredclarkb: I've also got a patch up to increase the forks setting19:49
openstackgerritMerged opendev/system-config master: Update to tip of master in periodic jobs  https://review.opendev.org/72388919:49
mordredso that we do more in parallel on the jobs with multiple hosts19:49
mordredclarkb: https://review.opendev.org/#/c/724317/19:50
corvusmordred: i don't understand how afs relates to 70919:50
mordredcorvus: it runs the nodepool job, which is installing the afs package - which compiles the kernel module19:51
corvuswhy does the nodepool job install the afs package?19:51
mordredoh - wait. it doesn't19:51
mordrednevermind - I'm dump19:51
mordreddumb19:51
*** priteau has quit IRC19:51
* mordred is confusing launchers and executors again19:51
mordredlemme go see what went on19:51
openstackgerritMerged opendev/system-config master: Disable global Github replication  https://review.opendev.org/71847819:53
openstackgerritMerged opendev/system-config master: Remove git*.openstack.org  https://review.opendev.org/72325119:53
mordredcorvus: if I'm not reading the log wrong, I think we're running the run playbooks serially - like, there doesn't seem to be much in the way of parallelism20:03
corvusmordred: i'm having a lot of trouble reading those logs, they seem to mostly be rsync file lists; can you summarise what you're seeing?20:04
corvusmordred: like what playbook?  i thought we're only running one -- the service-nodepool playbook?20:05
corvuser what playbooks20:05
mordredyes - that's what I mean - I think our run of service-nodepool is taking longer because the change added more hosts and we don't have good parallelism20:07
mordred(trying to find some good links - this is still just in hypothesis range)20:08
clarkbmordred: few things on https://review.opendev.org/#/c/72052720:08
corvusmordred: right, but there's only one playbook, so you're saying the playbook itself has poor paralellism?20:08
mordredyes - I'm saying I think this: https://review.opendev.org/#/c/724317/2/playbooks/zuul/run-base.yaml might help with that20:08
corvusmordred: well default is 5 and we don't have more than 5 hosts20:09
corvusso that should help in prod but not test?20:09
mordredoh - right - the default is 5 isn't it20:09
clarkbansible will do each task across all hosts before moving to the next task right?20:10
mordredthen I agree - that won't be super helpful for this one - I think it helped the zuul change because there are 6 hosts there20:10
clarkbmight be quicker to have them run free if that isn't going to cause problems20:10
*** avass has quit IRC20:10
corvusthe plays are free20:11
clarkbk20:11
corvusbut the playbook is still a series of plays20:11
corvusit's nb01,nl01  followed by nb04, followed by nb0120:12
corvusand i'm guessing that happens after the zk stuff20:12
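[editor's note: as corvus points out, forks defaults to 5 and only bounds parallelism within a single play, while a "free" strategy lets each host run ahead through tasks independently; a minimal hedged sketch follows - host group and role names are illustrative, not the real playbook]

```yaml
# Sketch: one play over all nodepool hosts with strategy "free", so hosts
# proceed through the role list independently; "forks" (default 5) caps
# how many hosts run concurrently within this play.
- hosts: nodepool
  strategy: free
  roles:
    - install-docker
    - nodepool-base
```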
mordredyeah -actually - I think we could streamline a bit by rearranging that and de-duplicating a bit20:13
corvusmordred: i think we could rework the service-nodepool playbook to be more parallel but with a little more yaml repetition20:13
corvusheh, i would describe it as re-duplicating :)20:13
mordredyeah20:13
corvusi'm not quite sure what that looks like though; this seems a little hard to describe in ansible20:14
mordredcorvus: https://etherpad.opendev.org/p/aDp6AHnb84UKlAD7LUvH20:14
mordredcorvus: there's one change that I think would add more parallelism for install-docker, nodepool-base and configure-openstacksdk20:15
mordredwe could further make configure-openstacksdk happen on all of them at the same time20:15
corvusstructurally, isn't the way to get more parallelism to have fewer plays?20:15
mordredforks only helps within a play20:17
mordredso in the cases where we have a given role spread into two plays, we can't do that role in parallel - we're doing it twice on different sets of hosts20:18
mordredcorvus: so - in that etherpad, we're always doing each role on all of the hosts - and only listing it once20:18
mordredit's possible it's a horrible strategy20:18
corvusi think that's more serial20:19
mordredI think it's both more serial and more parallel :)20:19
corvusmordred: in https://github.com/ansible/proposals/issues/31 bcoca says "have your cd system run multiple ansible-playbook processes"20:19
mordredyeah - really here the issue is that we have one service-nodepool playbook doing 3 completely different things20:20
corvusmordred: we only have one of each host in the test, so if a play only applies to one class of host, then it's as serial as it can possibly get in the test, right?  1 play, one host.20:20
mordredcorvus: yeah - that's why I was combining things like install-docker into a play with more hosts20:20
mordredso that we wouldn't run it twice across the series - but instead would only ever run it once20:20
mordredbut I think the real answer is to decompose this into more playbooks20:21
mordredand trigger them differently even20:21
corvusthere's one other alternative but its ugly20:21
mordredyeah?20:21
corvusa playbook with a task list that does conditional role inclusion20:21
mordredeww20:21
mordredyeah20:21
corvusit's the best way to get high parallelism with the playbook construction we have now20:21
mordredthis is true20:22
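[editor's note: the "conditional role inclusion" idea corvus describes can be sketched like this - one play over all hosts sharing a single forks pool, with roles gated by group membership; group and role names are illustrative]

```yaml
# Sketch: a single play so every host shares the same parallelism budget,
# with each role included only on the hosts whose group needs it.
- hosts: all
  tasks:
    - include_role:
        name: nodepool-builder
      when: "'nodepool-builder' in group_names"
    - include_role:
        name: nodepool-launcher
      when: "'nodepool-launcher' in group_names"
```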
corvusif we want to split the playbooks into "nodepool-builder" "nodepool-launcher" "new-style-nodepool-builder" then we still also need a way to run those in parallel20:22
mordredwant me to try that? also - this is an extra bad case currently because we've got multiple different deploy strategies going on at the same time20:22
mordredcorvus: well - for tests we don't really need to run all of them in the same test20:23
corvusmordred: yeah, but there's a huge test setup cost20:23
mordredyeah20:23
corvusmordred: we could also bump the timeout here and kick this down the road until we're not straddling puppet20:23
mordredcorvus: yeah. I think we do need to make this better - but this is a particularly bad version of this20:23
corvusit still may be worth improving paralleism then, but that might just be a single role20:24
mordredyeah - a bunch more of them get to be shared - liek nodepool-base20:24
corvusso our calculus about whether to split the playbook vs do conditional role inclusion may be a lot different20:24
mordred++20:24
mordredthat said - maybe we should have the rsync of puppet modules be quieter?20:25
mordredbecause there's just a crapton of rsyncing output20:25
openstackgerritMonty Taylor proposed opendev/system-config master: Run zookeeper cluster in nodepool jobs  https://review.opendev.org/72070920:27
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052720:27
openstackgerritMonty Taylor proposed opendev/system-config master: Increase timeout for run-service-nodepool  https://review.opendev.org/72441520:27
mordredcorvus: while we're on the topic - wanna review https://review.opendev.org/#/c/724317/ and https://review.opendev.org/#/c/723756/ ?20:28
*** mrunge has quit IRC20:31
corvusmordred: yeah, we should dial down the rsync :)20:31
openstackgerritMonty Taylor proposed opendev/ansible-role-puppet master: Add flag to control logging the rsyncs  https://review.opendev.org/72441820:34
openstackgerritMonty Taylor proposed opendev/system-config master: Stop logging the rsync of puppet  https://review.opendev.org/72441920:35
mordredcorvus: ^^ done20:35
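[editor's note: one way the "flag to control logging the rsyncs" patch could look - a hedged sketch only; the variable name `puppet_log_rsync` and paths are assumptions, not necessarily what the change used]

```yaml
# Sketch: suppress the per-file rsync output from ansible-role-puppet
# unless a logging flag is explicitly enabled.
- name: Copy puppet modules to remote host
  synchronize:
    src: /etc/puppet/modules
    dest: /etc/puppet/
  no_log: "{{ not (puppet_log_rsync | default(false)) }}"
```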
clarkbmordred: did you see my note on https://review.opendev.org/#/c/720527/ ?20:43
mordredclarkb: I did not - but do now - and agree - thank you20:44
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052720:51
mordredclarkb: responded and fixed20:51
clarkbmordred: the problem with that first testinfra condition is the not20:52
clarkbmordred: I think its still wrong?20:52
mordredclarkb: oh - duh20:59
mordredit wants to be "if not nl, skip20:59
clarkbya20:59
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052721:00
mordredyah. with you now21:00
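[editor's note: the bug discussed above was an inverted testinfra guard - "if not nl, skip"; this is a toy sketch of that predicate with illustrative hostnames, since the real test uses testinfra's host fixture]

```python
def should_skip_launcher_tests(hostname):
    """Skip launcher assertions on any host that is not an nl* launcher."""
    return not hostname.startswith('nl')
```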
clarkbcool I'm going to pop out for a bike ride now. If anyone is able to review those mailman changes, particularly https://review.opendev.org/#/c/724389/ so https://review.opendev.org/#/c/724356/3 can be rechecked that would be great21:04
clarkbthat coupled with zuul-sphinx release and mordred's nodepool work should make it possible for us to split the zuul configs in system-config21:04
openstackgerritMerged opendev/system-config master: Increase timeout on system-config-run-zuul  https://review.opendev.org/72375621:08
openstackgerritMerged opendev/system-config master: Run test playbooks with more forks  https://review.opendev.org/72431721:08
openstackgerritMonty Taylor proposed opendev/system-config master: Run zookeeper cluster in nodepool jobs  https://review.opendev.org/72070921:18
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052721:18
mordredclarkb, corvus : sigh - I had to squash the timeout change with the zk change because of a different fix that's also in the zk change. so I need to ask for another re-review21:19
corvusmordred: i'm going to +3 for clarkb on that one21:30
mordredcorvus: cool21:32
*** hashar has quit IRC21:35
mordredcorvus: oh wow - going for it with the launchers!21:36
mordredcorvus: I believe we're going to need to chown things on them like we did on the zuul nodes - but obviously things won't be down in a user-facing way while we do21:37
corvusmordred: er, should i not have +3d that? i thought they had all been approved already21:38
corvusmordred: i'll change that to a +221:39
*** DSpider has quit IRC21:39
mordredcorvus: I mean - honestly it's probably fine as long as we're ready to go chown things21:40
mordredbut - yeah - maybe we should land it when we're all watching just in case?21:40
corvusi've got enough other stuff going on21:41
corvusit looks like memory use on zuul01 has increased greatly21:41
corvusi'll start looking into that21:41
mordredcorvus: ++21:42
corvusit does appear that it's actually zuul-web that's the lion's share21:42
mordredcorvus: gross (although I suppose it's better that than a scheduler memory leak)21:42
mordredcorvus: have we restarted apache with clarkb's changes yet?21:43
mordredcorvus: nope21:43
corvushere's the current snapshot: http://paste.openstack.org/show/792904/21:43
mordredcorvus: https://review.opendev.org/#/c/724115/ is the patch that fixes ansible so that the zuul-web changes can get applied21:44
corvusmordred: we may have one of those too21:44
mordredand it hit a timeout - which we have now fixed in the job21:44
mordredso I'm going to recheck that21:44
corvusyeah i think maybe we don't need to be landing any prod changes that aren't fixing things at this point :)21:44
mordred++21:44
mordredcorvus: how about I enqueue that one into the gate21:44
corvusso is the apache fix on disk?21:44
mordredno21:45
fungitook a peek at zuul.o.o memory utilization just now, still growing well beyond our normal levels and zuul-web seems to be consuming the lion's share21:45
funginow i see you just switched gears to talking about that21:45
corvusmordred: then yes please21:45
mordredit's not been applied because ansible keeps bombing out21:45
mordredcorvus: ok. enqueued21:45
corvusi'll restart zuul-web to buy us more time21:45
mordredthat should fix the ansible run which should then restart apache with the new caching settings in place21:46
mordred(that was one of those fun rabbit holes)21:46
fungii agree, this seems to be growing fast enough it almost had to be something in the restart on friday/saturday21:47
mordredI think the friday restart was _definitely_ involved with the zuul-web issue21:47
mordredgiven the followup apache fix21:48
openstackgerritMonty Taylor proposed opendev/system-config master: Use only timesyncd on focal  https://review.opendev.org/72435421:49
fungipresumably something which merged between 2020-04-20 14:03:36 and 2020-04-24 22:42:4821:49
corvusthere was a cherrypy release on april 1721:49
corvuslast one before that was nov 27; we should keep that in mind21:49
fungiotherwise i think we would have been seeing higher memory utilization late last week, which cacti doesn't indicate21:50
corvusit's probably good to reduce the exposure of zuul-web, however, i'm not sure it should be using linearly growing memory under any circumstances21:54
corvusso i'm not sure we should expect or consider the apache changes to be a fix for the memory use21:54
mordredcorvus: no - if anything I expect them at best to be a damper on the growth - if the issue is that cherrypy is getting hit for status.json more frequently and for some reason it's leaking that (that being likely the largest object it interacts with frequently) - then the apache config issue could simply exacerbate it21:57
mordredbut I completely agree - it's not reasonable for zuul-web to use memory like that21:57
corvusi wonder if we should try a pin to cherrypy release-121:57
mordredcorvus: worth a try - the memory growth is quick enough we should be able to get some data by having done so21:58
mordredcorvus: (we should collect a baseline growth with apache config in place first)21:59
mordredso that we're not comparing apples to oranges21:59
corvusmordred: or even temporarily revert the apache change if it "helps" too much21:59
mordredyah22:00
mordredcorvus: speaking of: https://review.opendev.org/#/c/723855/22:00
mordredcorvus: so there is apparently another reason to pin back anyway22:01
mordredcorvus: I need to afk for a bit - you in ok shape for me to do that?22:01
corvusmordred: yeah, but i don't understand that commit message22:02
fungiwe should be able to do that without a full scheduler restart, just restarting zuul-web, right?22:02
corvusyeah22:03
mordredcorvus: I think the issue is in prepping for the new depsolver from pip22:04
mordredcorvus: since we pin cheroot and that is different from what cherrypy declares, current pip will install it, but the pip check which will apply the depsolver will fail - and future pip+depsolver will fail22:04
mordredsince future pip will refuse to install conflicting sets of dependencies22:05
corvusmordred: thanks22:05
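[editor's note: the conflict described is exactly the shape pip's new resolver rejects - a local pin contradicting a dependency's own declaration; the versions in this requirements sketch are purely illustrative, not the actual pins]

```
cherrypy==18.5.0   # cherrypy declares its own minimum cheroot version
cheroot==8.2.0     # local pin below that minimum: `pip check` flags the
                   # mismatch today, and the new resolver refuses to
                   # install the conflicting pair at all
```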
*** mlavalle has quit IRC22:09
openstackgerritMonty Taylor proposed opendev/system-config master: Run nodepool launchers with ansible and containers  https://review.opendev.org/72052722:11
*** mlavalle has joined #opendev22:12
*** mlavalle has quit IRC22:20
openstackgerritMerged opendev/system-config master: Don't restart the zuul scheduler in prod  https://review.opendev.org/72411522:22
openstackgerritMerged opendev/system-config master: Run zookeeper cluster in nodepool jobs  https://review.opendev.org/72070922:22
*** mlavalle has joined #opendev22:26
openstackgerritIan Wienand proposed opendev/system-config master: install-docker: remove arch match  https://review.opendev.org/72443522:39
clarkbmordred: corvus anything I can do to help with that ?22:40
clarkbI'm back from biking22:40
clarkbthat == zuul web stuff22:41
corvusclarkb: i think the next step is to restart once we have an image with https://review.opendev.org/72385522:46
clarkbrgr22:46
corvusthat should confirm or eliminate cherrypy as a cause, but we need to keep your apache fixes in mind -- we might want to temporarily revert them on disk to reduce variables22:46
clarkbk22:48
clarkbthough I don't think my changes have taken effect yet22:49
clarkb(so all of the leaking was without them)22:49
corvusclarkb: right, but they're about to (or just have) since 724115 merged22:50
clarkbya22:50
*** tkajinam has joined #opendev22:51
openstackgerritIan Wienand proposed opendev/system-config master: Add system-config-run-base-arm64  https://review.opendev.org/72443923:04
openstackgerritMerged opendev/system-config master: Add --fail flag to zuul status backup curl  https://review.opendev.org/72389623:05
*** tosky has quit IRC23:11
ianwwhat is Open-E JovianDSS CI?23:30
clarkbianw: what is the context?23:36
ianwclarkb: sorry i probably should have said "have we had any communication from ... "23:38
ianwit's been leaving config error comments on devstack changes (at least) for a while23:39
clarkbnot that I am aware of23:39
fungiianw: i'm guessing some third-party ci system for a cinder driver23:39
clarkbusually we can disable those if they are too chatty23:39
clarkbdisable the account in gerrit I mean23:39
fungiweb searches for "Open-E JovianDSS" turn up network-attached storage devices23:40
fungisounds like it may be misconfigured to report on the wrong project(s)23:41
clarkbmy changes to the zuul vhost config have applied23:41
clarkbthey appear to be functional23:41
clarkbmy browser says I'm getting back gzipped js and css files now which is good and transfer time for those seems a bit slower23:41
clarkbthe caching changes should only apply to status.json though23:41
ianwi'll reach out ... https://wiki.openstack.org/wiki/ThirdPartySystems/Open-E_JovianDSS_CI23:42
clarkbwe could extend the caching to apply to static resources too?23:42
clarkbanyway hopefully caching the status json helps with memory use23:42
clarkb(we are using the disk not memory cache iirc)23:42
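[editor's note: the disk-backed caching and gzip behavior described above is ordinary mod_cache_disk / mod_deflate configuration; a hedged sketch - the URL path, cache root, and expiry are assumptions, not the actual opendev vhost config]

```apache
# Cache status.json responses on disk so repeated hits don't reach
# zuul-web (mod_cache + mod_cache_disk).
CacheEnable disk /api/status
CacheRoot /var/cache/apache2/mod_cache_disk
CacheDefaultExpire 1

# Compress js/css responses on the way out (mod_deflate).
AddOutputFilterByType DEFLATE application/javascript text/css
```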
ianwfor the record i've sent a mail to the contacts there asking to take a look, i'll log in qa too23:43
clarkbcorvus: looks like we failed to promote that image for zuul-web23:44
clarkbcorvus: I think we may have suppressed the logging of why it failed too23:44
clarkbhttps://zuul.opendev.org/t/zuul/build/dd5783de71054298904f7890adf45529/console#1/0/3/localhost23:44

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!