Tuesday, 2020-05-26

openstackgerritMerged opendev/system-config master: Retire pabelanger as infra-root  https://review.opendev.org/66819200:01
*** iurygregory has quit IRC00:49
*** Meiyan has joined #opendev00:55
openstackgerritIan Wienand proposed openstack/diskimage-builder master: dib-lint: use yamllint to parse YAML files  https://review.opendev.org/73069002:05
openstackgerritIan Wienand proposed openstack/diskimage-builder master: package-installs : allow a list of parameters  https://review.opendev.org/73069102:05
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Revert "Revert "ubuntu-minimal : only install 16.04 HWE kernel on xenial""  https://review.opendev.org/73069202:05
openstackgerritIan Wienand proposed openstack/diskimage-builder master: package-installs : allow a list of parameters  https://review.opendev.org/73069102:16
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Revert "Revert "ubuntu-minimal : only install 16.04 HWE kernel on xenial""  https://review.opendev.org/73069202:16
openstackgerritIan Wienand proposed openstack/diskimage-builder master: package-installs : allow a list of parameters  https://review.opendev.org/73069102:33
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Revert "Revert "ubuntu-minimal : only install 16.04 HWE kernel on xenial""  https://review.opendev.org/73069202:33
*** cloudnull has quit IRC03:02
*** ykarel|away is now known as ykarel03:35
*** ykarel is now known as ykarel|afk03:54
*** ysandeep is now known as ysandeep|brb04:08
*** ysandeep|brb is now known as ysandeep04:38
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: test-base-roles: update include to import_playbook  https://review.opendev.org/73067405:05
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Make gentoo jobs nv  https://review.opendev.org/72864005:23
*** ykarel|afk is now known as ykarel05:25
openstackgerritMerged zuul/zuul-jobs master: test-base-roles: update include to import_playbook  https://review.opendev.org/73067405:33
openstackgerritSagi Shnaidman proposed zuul/zuul-jobs master: WIP Add ansible collection roles  https://review.opendev.org/73036005:49
*** slaweq has joined #opendev06:52
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Make gentoo multinode job nv  https://review.opendev.org/72864007:11
*** iurygregory has joined #opendev07:22
*** tobiash has quit IRC07:28
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Remove install-* roles  https://review.opendev.org/71932207:35
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: bindep: update include to import_tasks  https://review.opendev.org/73066007:37
*** tosky has joined #opendev07:38
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: update include to import_tasks  https://review.opendev.org/73067307:39
*** openstackstatus has quit IRC07:39
*** openstackstatus has joined #opendev07:39
*** ChanServ sets mode: +v openstackstatus07:39
*** DSpider has joined #opendev07:40
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-bazel: update include to include_tasks  https://review.opendev.org/73067207:40
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-package-repositories: update include to include_tasks  https://review.opendev.org/73067107:41
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Rename test install role to ensure-  https://review.opendev.org/73072007:42
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: add-build-sshkey: update include to include_tasks  https://review.opendev.org/73067007:45
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Make gentoo multinode job nv  https://review.opendev.org/72864007:45
*** rpittau|afk is now known as rpittau07:46
ianwclarkb: if you could look at https://review.opendev.org/730690 and https://review.opendev.org/730691 that should allow us to get focal support (arm64 + amd64) into the last dib 2.0 release07:50
fricklerAJaeger: should we also drop the tumbleweed jobs until someone fixes image builds? see e.g. https://nb01.opendev.org/opensuse-tumbleweed-0000226275.log . the existing image still seems to refer to a now non-existing openstack.org mirror07:58
openstackgerritBogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Support overrideable package_mirror  https://review.opendev.org/73060207:59
AJaegerfrickler: either fix or remove07:59
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: update include to include_tasks  https://review.opendev.org/73067307:59
ianwfrickler: oh, that's a shame.  15 is working in the dib gate, but not tumbleweed i guess08:00
*** moppy has quit IRC08:01
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: bindep: update include to include_tasks  https://review.opendev.org/73066008:01
ianwsame failure https://zuul.opendev.org/t/openstack/build/82663349aa314c0ea7aa4105659993ff/log/nodepool/builds/test-image-0000000003.log#319008:01
*** moppy has joined #opendev08:01
AJaegerfrickler: http://mirror.us.leaseweb.net/opensuse/tumbleweed/repo/oss/x86_64/?C=M&O=D looks recent - isn't that our mirror?08:01
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: add-build-sshkey: update include to include_tasks  https://review.opendev.org/73067008:04
AJaegercmorpheus: do you know who can look at those nodepool failures for tumbleweed? ^08:04
AJaegerfrickler: let me send a patch and WIP it for a day or two...08:04
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: update include to import_tasks  https://review.opendev.org/73066808:05
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: update include to import_tasks  https://review.opendev.org/73066808:06
AJaegerianw, frickler : when did we last build tumbleweed with success?08:07
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-virtualenv: update include to inlude_tasks  https://review.opendev.org/73066908:07
*** tobiash has joined #opendev08:07
openstackgerrityatin proposed openstack/project-config master: Add publish-to-pypi to ansible-config_template  https://review.opendev.org/73072608:08
AJaegerfound it - looks two weeks ago ;(08:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-podman: update include to include_tasks  https://review.opendev.org/73066708:08
fricklerAJaeger: ianw: seems the normal tumbleweed sync is working fine, but the /update part seems to be failing http://paste.openstack.org/show/793971/08:09
*** hashar has joined #opendev08:09
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: use-docker-mirror: update include to include_tasks  https://review.opendev.org/73066408:10
AJaegerfrickler: thanks, writing an email now...08:11
openstackgerritMerged zuul/zuul-jobs master: Make gentoo multinode job nv  https://review.opendev.org/72864008:12
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: configure-mirrors: update include to include_tasks  https://review.opendev.org/73066608:12
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: persistent-firewall: update include to include_tasks  https://review.opendev.org/73066508:14
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: multi-node-bridge: update include to include_tasks  https://review.opendev.org/73066208:16
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Remove tumbleweed from testing  https://review.opendev.org/73072708:16
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-pip: update include to import_tasks  https://review.opendev.org/73066108:19
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-pip: update include to include_tasks  https://review.opendev.org/73066108:20
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Fix deprecation warning from multinode tests  https://review.opendev.org/73047908:21
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-bazel: replace ignore_errors with failed_when  https://review.opendev.org/73073308:39
*** ykarel is now known as ykarel|lunch08:44
*** priteau has joined #opendev08:54
*** tobiash has quit IRC09:02
*** tobiash_ has joined #opendev09:02
*** dtantsur|afk is now known as dtantsur09:05
*** tkajinam has quit IRC09:05
*** ysandeep is now known as ysandeep|lunch09:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-bazel: replace ignore_errors with failed_when  https://review.opendev.org/73073309:16
*** iurygregory has quit IRC09:17
*** iurygregory has joined #opendev09:18
*** SotK has quit IRC09:24
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: multi-node-bridge: update include to include_tasks  https://review.opendev.org/73066209:37
*** hashar has quit IRC09:48
*** tosky__ has joined #opendev09:50
*** tosky is now known as Guest6865809:50
*** tosky__ is now known as tosky09:50
*** ykarel|lunch is now known as ykarel09:51
*** Meiyan has quit IRC09:52
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add ensure-dnf-copr  https://review.opendev.org/73074309:59
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add ensure-dnf-copr  https://review.opendev.org/73074310:02
*** ysandeep|lunch is now known as ysandeep10:05
*** rpittau is now known as rpittau|bbl10:20
*** priteau has quit IRC10:35
*** avass has quit IRC10:39
openstackgerritSagi Shnaidman proposed zuul/zuul-jobs master: WIP Add ansible collection roles  https://review.opendev.org/73036011:18
zbrtristanC: clarkb: https://review.opendev.org/#/c/729974/ please.11:21
*** sshnaidm is now known as sshnaidm|afk11:54
*** hashar has joined #opendev11:55
openstackgerritMerged zuul/zuul-jobs master: Remove install-* roles  https://review.opendev.org/71932212:05
openstackgerritMerged zuul/zuul-jobs master: Add option to prefer https/ssl in configure-mirrors  https://review.opendev.org/72940712:09
mordredfrickler: morning! if you have a spare second, https://review.opendev.org/#/c/730483/12:10
hrwmordred: if you have spare: https://review.opendev.org/#/c/728810/12:12
*** cloudnull has joined #opendev12:14
openstackgerritOleksandr Kozachenko proposed zuul/zuul-jobs master: Add container and pod log in the test for ensure-kubernetes role  https://review.opendev.org/72792912:15
mordredhrw: done12:18
hrwmordred: thanks12:19
*** rpittau|bbl is now known as rpittau12:22
*** ykarel is now known as ykarel|afk12:23
*** SotK has joined #opendev12:26
openstackgerritBogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Support overrideable package_mirror  https://review.opendev.org/73060212:27
openstackgerritBogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Support overrideable package_mirror  https://review.opendev.org/73060212:28
openstackgerritMerged opendev/base-jobs master: add arm64 nodesets  https://review.opendev.org/72881012:35
openstackgerritMerged zuul/zuul-jobs master: packer: namespace test jobs correctly  https://review.opendev.org/73050012:41
openstackgerritMerged zuul/zuul-jobs master: ensure-pip: update include to include_tasks  https://review.opendev.org/73066112:43
openstackgerritMerged zuul/zuul-jobs master: ensure-package-repositories: fix loopvar collision  https://review.opendev.org/73047712:51
openstackgerritMerged zuul/zuul-jobs master: Do not interpolate values from tox --showconfig  https://review.opendev.org/72952012:51
openstackgerritMerged zuul/zuul-jobs master: bindep: update include to include_tasks  https://review.opendev.org/73066012:51
openstackgerritMerged zuul/zuul-jobs master: Add python3-devel to bindep  https://review.opendev.org/72870812:51
openstackgerritMerged zuul/zuul-jobs master: ensure-bazel: update include to include_tasks  https://review.opendev.org/73067212:51
openstackgerritMerged zuul/zuul-jobs master: Add container and pod log in the test for ensure-kubernetes role  https://review.opendev.org/72792912:55
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: multi-node-bridge: update include to include_tasks  https://review.opendev.org/73066212:56
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: ensure-package-repositories: update include to include_tasks  https://review.opendev.org/73067112:58
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: multi-node-bridge: update include to include_tasks  https://review.opendev.org/73066212:59
openstackgerritMerged opendev/system-config master: Update username in Zuul executor initscript  https://review.opendev.org/73048313:00
*** ykarel|afk is now known as ykarel13:01
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: multi-node-bridge: update include to include_tasks  https://review.opendev.org/73066213:02
*** sshnaidm|afk is now known as sshnaidm13:10
*** redrobot has joined #opendev13:20
*** ykarel is now known as ykarel|afk13:24
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: multi-node-bridge: update include to include_tasks  https://review.opendev.org/73066213:26
openstackgerritMerged zuul/zuul-jobs master: tox: update include to include_tasks  https://review.opendev.org/73067313:29
openstackgerritMerged zuul/zuul-jobs master: add-build-sshkey: update include to include_tasks  https://review.opendev.org/73067013:29
openstackgerritJavier Peña proposed openstack/project-config master: Remove tox-py27 job for x/packstack  https://review.opendev.org/73081313:30
*** tkajinam has joined #opendev13:37
*** tobiash_ is now known as tobiash13:37
openstackgerritAndreas Jaeger proposed openstack/project-config master: Fix sphinx playbook: install-if-python was renamed  https://review.opendev.org/73081813:43
AJaegerconfig-core, we missed the change above , please review to unbreak sphinx publishing ^13:43
AJaegerthanks, mordred !13:48
*** sgw has quit IRC13:52
*** dtantsur is now known as dtantsur|brb13:53
openstackgerritMerged zuul/zuul-jobs master: Fix deprecation warning from multinode tests  https://review.opendev.org/73047913:56
openstackgerritMerged zuul/zuul-jobs master: tox: empty envlist should behave like tox -e ALL  https://review.opendev.org/73032213:56
openstackgerritMerged zuul/zuul-jobs master: ensure-podman: update include to include_tasks  https://review.opendev.org/73066713:56
openstackgerritMerged zuul/zuul-jobs master: ensure-virtualenv: update include to inlude_tasks  https://review.opendev.org/73066913:56
*** yoctozepto8 has joined #opendev14:00
*** roman_g has joined #opendev14:00
*** yoctozepto has quit IRC14:01
*** yoctozepto8 is now known as yoctozepto14:01
*** rpittau is now known as rpittau|brb14:10
openstackgerritJavier Peña proposed openstack/project-config master: Remove tox-py27 job for x/packstack  https://review.opendev.org/73081314:14
openstackgerritMerged zuul/zuul-jobs master: ensure-package-repositories: update include to include_tasks  https://review.opendev.org/73067114:14
openstackgerritMerged openstack/project-config master: Fix sphinx playbook: install-if-python was renamed  https://review.opendev.org/73081814:16
*** ykarel|afk is now known as ykarel14:21
*** tkajinam has quit IRC14:36
*** rpittau|brb is now known as rpittau14:39
openstackgerritJavier Peña proposed openstack/project-config master: Remove tox-py27 job for x/packstack  https://review.opendev.org/73081314:42
openstackgerritJavier Peña proposed openstack/project-config master: Remove check/gate jobs for x/packstack  https://review.opendev.org/73081314:43
*** iurygregory has quit IRC14:45
*** dtantsur|brb is now known as dtantsur14:50
*** iurygregory has joined #opendev14:53
*** sgw has joined #opendev14:54
openstackgerritMerged zuul/zuul-jobs master: fetch-subunit-output: update include to import_tasks  https://review.opendev.org/73066815:06
*** mlavalle has joined #opendev15:09
*** cmorpheus is now known as cmurphy15:15
cmurphyAJaeger: i don't know who could help with tumbleweed problems besides dirk15:16
cmurphyi don't see any recent open bugs that look related on opensuse's bugzilla15:18
*** ysandeep is now known as ysandeep|afk15:19
AJaegercmurphy: ok, let's see - I might move forward with the zuul-jobs change and we can revert once dirk is back from vacation.15:24
openstackgerritAndreas Jaeger proposed zuul/zuul-jobs master: Remove tumbleweed from testing  https://review.opendev.org/73072715:30
*** priteau has joined #opendev15:34
*** dtantsur is now known as dtantsur|afk15:34
*** sshnaidm is now known as sshnaidm|afk15:34
openstackgerritClark Boylan proposed opendev/base-jobs master: Test mirrors with ssl  https://review.opendev.org/73086115:43
openstackgerritClark Boylan proposed opendev/base-jobs master: Use mirrors with ssl globally  https://review.opendev.org/73086215:43
clarkbinfra-root ^ that first change should be completely safe to land and I'll resurrect my testing job in zuul-jobs for base-test15:43
clarkbif that looks good I think we can land the second job15:43
clarkb*land the second job update (first is base-test, second is base)15:44
openstackgerritMerged zuul/zuul-jobs master: persistent-firewall: update include to include_tasks  https://review.opendev.org/73066515:50
openstackgerritMerged zuul/zuul-jobs master: use-docker-mirror: update include to include_tasks  https://review.opendev.org/73066415:50
*** rpittau is now known as rpittau|afk15:51
*** ykarel is now known as ykarel|away15:51
AJaegerzuul-jobs-maint, a trivial change for review, please https://review.opendev.org/73072015:52
clarkbcorvus: nb01 and nb02 use letsencrypt for the webserver where they serve their image logs and image files15:54
clarkbon the zuul runs of our production playbooks are we still needing to debug problems?15:55
*** ysandeep|afk is now known as ysandeep15:56
openstackgerritMerged zuul/zuul-jobs master: configure-mirrors: update include to include_tasks  https://review.opendev.org/73066615:57
*** ysandeep is now known as ysandeep|afk16:02
mordredclarkb: https://review.opendev.org/#/c/730483/ is the last known issue and is merged16:07
corvusit does look like system-config-run-zuul has been successful recently16:09
corvusmordred: can you review/approve https://review.opendev.org/729785 and its parent16:11
corvusmordred: https://review.opendev.org/729786 is interesting; if you look at the output here: https://zuul.opendev.org/t/openstack/build/e1fdfaf85ebc447399daf073ad7fd258/log/zuul01.openstack.org/docker/zuul-scheduler_scheduler_1.txt16:21
corvusmordred: it looks like we've managed to get the scheduler to nearly start -- basically, every connection configured is erroring out in some manner16:22
corvusmordred: so on the one hand: failure -- we aren't able to use our production config with fake private data to get the system started.  on the other hand: success -- we've at least gotten basic connectivity up.16:23
corvusmordred: i think we're at a crossroads: should we try to get the 'gate-test zuul' more functional, or should we stop here and see if we can do some testinfra validation that all the processes have started and are communicating with each other?16:24
corvusthough, actually, i'm not 100% sure we have yet gotten to the 'gearman has started' stage....16:25
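A minimal sketch of the "has the service started" kind of check discussed above. It uses only the Python standard library as a stand-in and is not testinfra code; the function name `port_is_open` is hypothetical.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is accepting connections on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `port_is_open("localhost", 4730)` would probe gearman's default port; the port actually used in the gate is an assumption here. A check like this only shows that something is listening, not that the scheduler and merger are talking to each other.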
*** hashar is now known as hasharAway16:36
mordredcorvus: might be easier to test all the way to connectivity if we ran gearman as a separate process (then it wouldn't need the scheduler to start it)16:37
mordredcorvus: but - I agree about the crossroads - I kinda feel like if we can get testinfra to verify services are running that may be "good enough" for this?16:39
corvusmordred: i believe gearman is running and the scheduler has connected to it; it's unclear if the merger has.  i can't see why it wouldn't, but i don't see a confirming log message.  but maybe it just isn't emitted.16:40
corvusi think the main thing i'm worried about is that right now, the scheduler happens to be starting and busy-waiting because of the particular combination of bad data we've given it.  a subtle change in zuul behavior in the future could cause it to fail to start, and therefore fail any "is it running" level tests we set up... :/16:41
mordredcorvus: https://zuul.opendev.org/t/openstack/build/e1fdfaf85ebc447399daf073ad7fd258/log/zuul01.openstack.org/docker/zuul-scheduler_scheduler_1.txt#68 seems like a real error16:41
clarkbcorvus: you might be able to confirm it was connected by forcing it to disconnect (eg stop the gearman process)16:41
mordredcorvus: yeah16:41
corvusmordred: it's a real error due to bad fake data16:42
corvusthat's a private key which is incorrect in the gate16:42
corvusbasically, every single connection is failing for a similar reason16:42
mordredah - got it. I was reading that as EACCES16:43
corvusnope, that's "your key is not a key"16:43
corvusmordred: if we give it a slightly better, but still wrong fake private key, zuul actually stops the scheduler16:43
corvusso we're in kind of a weird place here.  it's possible zuul improvements could make our test result worse.16:43
mordredhah16:44
clarkbcan we feed it a different set of connection data for testing?16:44
mordredyeah - it's like the uncanny valley of fake deployments16:44
clarkbuse a local git dir and run noop jobs or something16:44
mordredclarkb: but then we're not testing that we put all the keys in place properly16:44
clarkbmordred: yes, but it tells us the zuul is functional16:44
corvusclarkb: yeah, that would be the most robust thing i think.  but then things like mordred's "did we write the github key with the correct perms" would be opaque to our testing.16:44
clarkbI think for now properly testing that we've got the github credentials correct is tricky and somewhat orthogonal to "is zuul working with massive internal connectivity changes"16:45
clarkbboth are valuable but right now we are most worried about the second thing right?16:45
mordredwe almost need to legitimately also spin up a full gerrit that looks like review.o.o16:46
corvusi honestly don't know which is better, but i think i worry that zuul will make the decision for us (because, frankly, i'm never going to say "no" to a patch to zuul entitled "correctly report key format error")16:46
corvusmordred: and a github? :)16:46
mordredcorvus: well - yeah - that's an issue  :) - but we could probably do that with an explicit testinfra test - I think clarkb's #2 is the thing we most want yes?16:47
clarkbya and thinking ahead I think the transition to more state in zk will mean we want the internal connectivity testing again there as well16:47
corvusokay, i'll look into alternative connection data for gate16:49
*** mnasiadka has quit IRC16:49
*** vblando has quit IRC16:50
*** ysandeep|afk is now known as ysandeep|away16:52
*** vblando has joined #opendev16:57
*** mnasiadka has joined #opendev16:58
hrwmorning17:02
hrwfungi: had you had time to look at https://review.opendev.org/730342 https://review.opendev.org/730323 patches? (wheel builds in a need of AFS volumes)17:03
fungihrw: it's on my list for today, hopefully in the next couple of hours before the weekly meeting17:04
hrwfungi: thanks17:05
fungimy pleasure17:05
clarkbI want to say the big issue with wheels for arm64 was lack of reliable afs on arm64 with various platforms?17:05
*** vblando has quit IRC17:05
hrwclarkb: hope it got better17:06
clarkbhrw: I doubt it did for those older platforms like centos717:06
clarkbbuster, focal, and bionic will probably work though?17:06
hrwI never used afs so hard to tell17:06
*** mnasiadka has quit IRC17:07
*** Open10K8S has quit IRC17:08
*** Open10K8S has joined #opendev17:13
fungiyeah, generating them is probably fine because i think we copy the files through an intermediary anyway17:14
fungiserving them is the trick, since the "mirror" hosts are local to each cloud environment and the arm64/aarch64 environments we have are homogenous so we need openafs or kafs kernel modules which work on those architectures17:16
clarkbfungi: I didn't think we copied through the executor, its directly off the build host17:16
fungibuild host being the job node?17:16
clarkbya17:17
clarkbI'm sure ianw can fill us in if there was something missing. I thought it was afs on the nodes building the wheels but maybe it was something else or simply never done17:18
fungiwell, at one point it was afs on the nodes building the wheels because we did it with cron jobs and not zuul17:21
fungii'm not sure if that's still a problem though17:21
*** mnasiadka has joined #opendev17:22
*** vblando has joined #opendev17:22
fungialso fun, looks like all our wheel build jobs have been failing for a while17:23
fungihttps://zuul.opendev.org/t/openstack/build/5ee8721a0c5345349b9010fdf798ddb817:23
openstackgerritMerged opendev/system-config master: Correct the test gearman certs  https://review.opendev.org/72977117:25
openstackgerritMerged opendev/system-config master: Fix whitespace in zuul-executor PPAs  https://review.opendev.org/72978517:25
clarkbfungi: openstack/project-config/roles/copy-wheels seems to do the copying and it seems to run against the remote nodes17:26
funginow that we don't copy everything but only what we've actually built, i wonder if we could just copy through the executors and then don't even need secrets on the nodes17:28
fungiman that job produces some large logs17:30
clarkbya my browser is basically giving up on that17:30
fungii recently culled 99.9% of my open tabs, so browser is slightly more responsive at least17:31
fungibut yeah, the log prettifier is struggling still17:31
clarkbfungi: it failed to rm the wheels we didn't want because there were no args given17:32
fungiclarkb: yes, because of an earlier failure17:32
fungii'm trying to find where though17:32
clarkbhttp://paste.openstack.org/show/794013/17:32
fungiyeah, look just above that though, tox exited nonzero because... reasons17:33
clarkbya echo '*** FAILED BUILDS FOR BRANCH stable/ussuri'17:34
fungiwe could make the remove wheels step a little more robust against that and have it emit a clear error when the step which generates remove-wheels.txt finds no matches17:34
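A rough sketch of the "emit a clear error" idea, assuming the removal list is a plain text file with one wheel file name per line. Only the name remove-wheels.txt comes from the discussion; the function name and the file layout are assumptions.

```python
from pathlib import Path


def read_removal_list(list_path):
    """Return the wheel names listed in list_path, failing loudly if there are none."""
    lines = Path(list_path).read_text().splitlines()
    names = [line.strip() for line in lines if line.strip()]
    if not names:
        # An empty list most likely means the step that generated the file
        # matched nothing, so stop here instead of calling rm with no arguments.
        raise RuntimeError(
            f"{list_path} lists no wheels to remove; "
            "check the step that generates it"
        )
    return names
```

Failing at this point would surface the real problem (nothing matched) rather than the downstream symptom of an rm call with no arguments.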
fungiright, i'm currently reviewing the ussuri python3 build log17:34
clarkbfungi: I think we collect log files for every wheel we build17:34
clarkbwhich should hopefully tell us why the thing that failed failed (assuming we can also browse those logs)17:35
fungiright, i don't see any problem with python3 so maybe we recently broke ussuri python217:35
fungi(which wouldn't surprise me at all)17:36
clarkbfungi: https://765a239ecc34c44a5b00-7f41bd38e61f614f484bdc6903cb8f38.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/requirements/master/publish-wheel-mirror-ubuntu-bionic/5ee8721/python2/failed.txt17:38
clarkbfungi: I think the problem is we expect python2 to work at all anymore :)17:38
clarkbwe may need to update it to be best effort for python217:39
fungiyeah, i kinda figured17:39
fungii think there may be patches in flight for that...17:39
fungiAJaeger: ^ do you recall?17:39
fungiprometheanfire: ^ maybe you17:39
clarkbbasically build as many wheels as we can, remove any that are redundant and do best effort17:39
clarkbhttps://765a239ecc34c44a5b00-7f41bd38e61f614f484bdc6903cb8f38.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/requirements/master/publish-wheel-mirror-ubuntu-bionic/5ee8721/python2/build/ussuri/1/tempest%3D%3D%3D24.0.0/stderr17:40
clarkband ya its basically the problem of "things have moved on"17:40
openstackgerritJames E. Blair proposed opendev/system-config master: WIP: fake zuul_connections for gate  https://review.opendev.org/73092917:44
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: keep subunit2html.py in the role  https://review.opendev.org/73093017:51
*** mlavalle has quit IRC17:54
openstackgerritSagi Shnaidman proposed zuul/zuul-jobs master: WIP Add ansible collection roles  https://review.opendev.org/73036017:55
*** slittle1 has quit IRC18:00
*** slittle1 has joined #opendev18:02
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: keep subunit2html.py in the role  https://review.opendev.org/73093018:03
AJaegerfungi: no idea, sorry18:03
clarkbfungi: where I've gotten to is we need to download the files the job generates in order to run the find + grep for wheel downloads to see why that was empty18:09
openstackgerritMerged openstack/project-config master: Add base replication jobs for oslo-metrics  https://review.opendev.org/72882018:09
clarkbfungi: unfortunately there are 29k files so it might be a little while18:09
clarkb(also I tried to do this by hand via the browser but then I realized how big the scope was)18:10
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: keep subunit2html.py in the role  https://review.opendev.org/73093018:10
clarkblooking at the script I think it may be fine with python2 errors18:10
clarkbthe problem is that the removal list is empty when it shouldn't be18:11
clarkb(and I've confirmed that things like six should be removed but everywhere I see it in the logs it says not downloading because its already in my cache, need to find where it is initially downloaded and ensure that our find + grep works against that)18:11
clarkbI guess I can just do a local pip install too to check if the format of its output has changed18:12
*** mlavalle has joined #opendev18:13
clarkb'  Downloading openstacksdk-0.46.0-py3-none-any.whl (1.3 MB)' is what a local pip command seems to output18:17
*** slittle1 has quit IRC18:18
clarkbwhich won't match sed -n 's,.*Downloading from URL .*/\([^/]*\.whl\)#.*,\1,p'18:18
clarkbI think that is the issue18:18
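To illustrate the mismatch, here is a hedged Python sketch of a pattern that accepts the bare file name form quoted above, a full URL, and the "Downloading from URL" form the sed expression expects. It is not the job's actual parsing code; the names `WHEEL_RE` and `downloaded_wheel` are hypothetical, and the sample lines in the usage note use made-up hosts and versions.

```python
import re

# Capture the wheel file name after "Downloading", whether pip printed a bare
# file name, a full URL, or "from URL <url>#<hash>".
WHEEL_RE = re.compile(
    r"Downloading (?:from URL )?(?:\S*/)?([^/\s]+\.whl)(?=[#\s]|$)"
)


def downloaded_wheel(line):
    """Return the wheel file name mentioned in a pip log line, or None."""
    match = WHEEL_RE.search(line)
    return match.group(1) if match else None
```

With this pattern, `downloaded_wheel("  Downloading openstacksdk-0.46.0-py3-none-any.whl (1.3 MB)")` yields the file name, as does a line such as `Downloading from URL https://example.invalid/pkgs/six-1.15.0-py2.py3-none-any.whl#sha256=abc`. Lines that mention no wheel return `None`.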
fungioh! i wonder if the log format changed out from under us18:21
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-tox-output: empty envlist should behave like tox -e ALL  https://review.opendev.org/73033418:23
openstackgerritClark Boylan proposed openstack/project-config master: Update pip output parsing to fix wheel mirror builds  https://review.opendev.org/73093318:24
clarkbfungi: ^ I think that will fix it, but some independent confirmation of the behavior change would be good18:25
fungiwell, also if we're running different versions of pip on different platforms the message format might differ too18:30
clarkbya, though we should be able to run an up to date or at least consistent pip everywhere?18:40
clarkb20.something for python2 and 3 support18:40
clarkbI think it is all done in virtualenvs too?18:40
fungiwe're using volumes named like mirror.wheel.bionicx64 for the x86-64/amd64 wheels, what should i call the corresponding aarch64/arm64 volumes? mirror.wheel.bionicaa64 or something else18:41
openstackgerritGhanshyam Mann proposed opendev/irc-meetings master: Add Secure Default policies popup team meeting  https://review.opendev.org/73093518:49
clarkbfungi: the changes hrw pushed made a choice I think18:50
clarkbfungi: since the volume name is used to replicate18:50
fungiclarkb: what version of pip? seems like 19.2.3 still gives me the old output like: Downloading https://files.pythonhosted.org/packages/e1/e5/df302e8017440f111c11cc41a6b432838672f5a70aa29227bf58149dc72f/urllib3-1.25.9-py2.py3-none-any.whl (126kB)18:50
fungii'll test again with upgraded pip18:51
clarkbfungi: 20.1.1 is what I used18:51
clarkbfungi: we run `build_env/bin/pip install --upgrade pip` in the build so they should all be using latest currently18:52
clarkbwe'll likely need to do <21.0.0 to continue to support python2 (but that can be a separate thing)18:52
hrwclarkb: just tried to follow with naming18:53
prometheanfirefungi: hi?18:53
clarkbhrw: ya I think your choices were fine, but fungi's volume creation should match that change18:53
hrwclarkb: refreshing patch is a moment once names are known18:54
prometheanfirewe are talking about getting rid of py27 from gate infra?  iirc reqs doesn't use it at all, only swift does as far as I can tell18:54
fungino biggie, i missed that we had volume names there already. happy to just use them18:54
fungiprometheanfire: more about how to make sure we still generate whatever py2 wheels are needed in ussuri, or whether there are patches in flight to stop trying to build py2 wheels with ussuri constraints18:55
hrwpy2 wheels may be useful for stable branches18:55
prometheanfireya, useful for stable branches I can see18:57
fungiwe iterate over the stable branches already18:58
fungidifferent constraints in them anyway18:58
prometheanfirebuilding them from master I'm not sure about, except for swift18:58
fungii think their py2 constraints moved in-repo18:59
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: keep subunit2html.py in the role  https://review.opendev.org/73093018:59
*** diablo_rojo has joined #opendev18:59
fungienabling the requirements repo to drop all the python2 conditionals18:59
fungifrom master18:59
clarkbremember we are only building things for which there aren't wheels on pypi19:01
clarkbswift may need its liberasurecode wheels built though19:01
AJaegerfungi: requirements has already dropped py2 conditionals from master19:01
openstackgerritJames E. Blair proposed opendev/system-config master: WIP: fake zuul_connections for gate  https://review.opendev.org/73092919:01
fungiAJaeger: yeah, in which case it probably makes sense for us to stop trying to build py2 wheels from ussuri constraints19:03
prometheanfirefungi: we don't have py27 signified here https://github.com/openstack/requirements/blob/master/upper-constraints.txt19:03
prometheanfiresame with gr19:04
* prometheanfire removed that a week or two ago19:04
*** dpawlik has joined #opendev19:04
* prometheanfire should read a more recent backlog19:04
AJaegerfungi: ussuri still has py27 constraints, only master dropped AFAIU19:04
clarkbright fungi is pointing out that you also need to update the wheel building19:05
clarkband that some things in ussuri like tempest seem to not work either19:05
hrwclarkb: you run 'pip install a lot of packages' and those which are on pypi are just fetched anyway19:08
*** roman_g has quit IRC19:08
clarkbhrw: yes, but we don't want to publish those in our local wheel mirror we want them to be fetched from pypi instead19:09
clarkbso we install/build wheels for everything then only copy out the subset that isn't already available on pypi19:09
hrwclarkb: yep19:09
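Editor's sketch of the selection step clarkb describes above: build or install wheels for everything, then keep only the ones pip did not fetch from PyPI. This is a minimal illustration, not the actual mirror build script; the function name and the second wheel filename are hypothetical.

```python
def wheels_to_publish(built_wheels, downloaded_from_pypi):
    """Return the built wheel filenames that were not fetched from PyPI.

    built_wheels: filenames found in the local wheel output directory.
    downloaded_from_pypi: filenames pip reported downloading, meaning
    they already exist upstream and should not be mirrored locally.
    """
    already_on_pypi = set(downloaded_from_pypi)
    return sorted(name for name in built_wheels if name not in already_on_pypi)


if __name__ == "__main__":
    built = [
        "urllib3-1.25.9-py2.py3-none-any.whl",
        # Hypothetical wheel that had to be compiled locally.
        "example_pkg-1.0-cp36-cp36m-linux_aarch64.whl",
    ]
    fetched = ["urllib3-1.25.9-py2.py3-none-any.whl"]
    print(wheels_to_publish(built, fetched))
```

Only the locally compiled wheel survives the filter, which is why parsing pip's "Downloading" lines correctly matters for the mirror contents.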
openstackgerritMonty Taylor proposed opendev/system-config master: WIP Move users into a base subdir  https://review.opendev.org/73093719:24
*** dpawlik has quit IRC19:35
markmcclainI'm trying to diagnose a POST_FAILURE on tarball upload: https://zuul.opendev.org/t/openstack/build/f5fd340f0158428e85e4b13eb85f6bc2/log/job-output.txt#100419:41
clarkbmarkmcclain: pypi doesn't allow you to replace a release iirc19:43
clarkbmarkmcclain: you can remove a release or provide a newer versioned release, but you can't replace an existing release19:43
clarkb(this way if you've vetted a release you're not getting something different later on)19:43
fungimore specifically it doesn't let you upload a file with the same filename as any file which has been previously uploaded (even if it's since been deleted)19:44
fungias a safety precaution19:44
markmcclainhmm... PyPI shows .whl release for 2018.2.8 on Apr 4 and May 20th19:44
markmcclainboth from openstackci Apr-4 from 23.253.136.207 and May-20 from 104.130.127.10219:45
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: fetch-subunit-output: keep subunit2html.py in the role  https://review.opendev.org/73093019:45
fungiwhat's the project name on pypi? sorry, in the middle of a meeting so can't comb through logs just now19:46
markmcclainx/networking-arista19:46
fungiso looking at https://pypi.org/project/networking-arista/2018.2.8/#files then?19:47
markmcclainactually scratch that... there's only one attempt on May 20th19:47
markmcclainwas looking at 2017.2.819:48
fungioh, above you said 2018.2.8, sorry19:49
markmcclainthe view is essentially the same with the tarball there19:50
fungiyeah, https://pypi.org/project/networking-arista/2017.2.8/#files19:50
markmcclainupload date is May 20th, but zuul log shows it failed19:50
clarkbthe wheel actually uploaded fine earlier in the log19:51
clarkbit makes me wonder if twine uploaded both when it uploaded the wheel19:52
clarkbthen it failed when we try to upload the tar.gz because it is already there19:52
markmcclainultimately I'm trying to figure out why versioned tarballs stopped publishing to tarballs.o.o in late Feb, but this was first error I encountered sifting through logs19:53
fungiappears there have been a bunch of similar failures: https://zuul.opendev.org/t/openstack/builds?job_name=release-openstack-python&project=x/networking-arista19:54
*** dpawlik has joined #opendev19:55
markmcclainright.. this is impacting all branches and most recently last week's releases19:56
clarkblooking at pypi I think the job did what we wanted19:56
clarkbit just failed to record that properly (and maybe that affected tarball uploads too)19:56
clarkbmy hunch is twine uploaded both artifacts at the same time19:57
clarkbthen tried to do that again and failed19:57
ianwhrw: i can look in on those arm64 wheel builds19:59
hrwianw: thanks19:59
mordredcorvus: when you were doing the iptables stuff based on group membership - did you have any issues finding things for hosts not included in the current playbook?19:59
corvusmordred: i don't think so, because i was accessing inventory variables20:00
corvusmordred: (we suspected that may be different than facts)20:00
mordredgotcha - so the list of hosts in a group worked fine20:00
corvusyep20:00
mordredasking because we use iptables in base20:01
corvusoh wait20:01
ianwclarkb / mordred: could i ask you to look over dib changes https://review.opendev.org/#/c/730690/1 https://review.opendev.org/#/c/730691/3 https://review.opendev.org/#/c/730692/3 which fix my screwup with the ubuntu kernel installs20:01
fungimarkmcclain: yeah, starts up here: https://zuul.opendev.org/t/openstack/build/f5fd340f0158428e85e4b13eb85f6bc2/log/job-output.txt#95620:01
corvusmordred: let me clarify: i believe that getting the list of hosts in a group is fine -- it is worth noting that in gate tests, we don't have the full inventory available, only the inventory used for that job.20:02
fungimarkmcclain: clarkb: i agree, looks like twine may have started uploading everything on invocation20:02
corvusmordred: so "what are the zuul hosts?" will always work.  "what are the zuul hosts?" when run on, say, "system-config-run-meetpad" will return the empty set.  but on "infra-prod-meetpad" it would return the actual zuul hosts.20:03
clarkbmarkmcclain: fungi though looking at the logs more closely the file sizes it prints don't support this theory20:03
fungialso output may be duplicated in that log, i don't see the same in the console view broken down by task20:04
mordredcorvus: nod.20:05
clarkbianw: looks like centos7 functests failed on the first one though that should be separate from the change (so I +2'd)20:05
clarkbfungi: markmcclain is it possible that we are running the job multiple times?20:06
fungior two similar jobs, but yeah i'm trying to pull up the buildset for that20:07
fungii guess we don't have an easy link from a build to its corresponding buildset20:08
fungihttps://zuul.opendev.org/t/openstack/builds?project=x%2Fnetworking-arista&pipeline=release20:08
corvusfungi: try the link after "buildset" on the build summary page20:09
fungithat doesn't seem to indicate any jobs i would expect to race20:09
clarkbfungi: could be a separate pipeline maybe?20:09
fungicorvus: oh, thanks, i'm blind20:09
fungii was sure there was one and then wasn't able to spot it for some reason20:10
fungiclarkb: not that i can see from https://zuul.opendev.org/t/openstack/builds?project=x%2Fnetworking-arista#20:10
clarkbquickly checking zuul configs for it I don't see any overlapping jobs20:11
clarkbya it really does seem like maybe twine is uploading both or pypi is returning an error improperly20:11
clarkbbasically we upload the tar.gz to pypi and there is no conflict but it reports there is (while still accepting the upload)20:11
*** cloudnull has quit IRC20:12
clarkbI need to pop out for a bike ride now, but if that hasn't been sorted out yet when I'm back I'll try to take a second look with fresh eyes20:12
fungilooks like we're using twine 1.15.0 there20:14
clarkbianw: small thing on the second change but worth addressing early I think20:16
fungiwhich is interesting... latest release of twine is 3.1.120:16
fungi1.15.0 was from september20:17
openstackgerritIan Wienand proposed openstack/diskimage-builder master: package-installs : allow a list of parameters  https://review.opendev.org/73069120:18
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Revert "Revert "ubuntu-minimal : only install 16.04 HWE kernel on xenial""  https://review.opendev.org/73069220:18
ianwclarkb: ^ thanks, updated20:18
fungiaha... "Twine now requires Python 3.6 or later. Use pip 9 or pin to “twine<2” to install twine on older Python versions." https://twine.readthedocs.io/en/latest/changelog.html20:19
fungiunfortunately still no idea why it's complaining about an existing file20:20
fungibut i guess it does suggest we're running with older python in that job20:20
clarkbfungi: that pin may be pre bionic20:21
clarkbwe can probably unpin it now?20:21
clarkbthen see if we get different results20:21
openstackgerritMonty Taylor proposed opendev/system-config master: Clean up base playbook  https://review.opendev.org/73098520:23
mordredclarkb, fungi : yeah - maybe old twine is confused by latest pypi20:24
mordredclarkb, corvus: ^^ that's not b) yet - that's just a pre-cursor I think we're going to want first for either b) or c)20:24
fungiclarkb: we're not pinning, other than to skip a known broken version20:25
fungii think we're just running it with older python20:25
fungiso pip is selecting the most recent release which supports python220:25
fungi(because we "Use pip 9")20:25
openstackgerritMonty Taylor proposed opendev/system-config master: WIP Move users into a base subdir  https://review.opendev.org/73093720:29
corvusmordred: istr you were looking into the idea of gate-only data overriding public hostvars and identified that it's tricky -- because the public hostvars are in the playbook-adjacent inventory directory, and that seems to take precedence over the /etc/ansible/hosts hostvars?20:29
mordredcorvus, clarkb: ^^ also - base.users doesn't work - but base/users does in local testing20:29
mordredcorvus: yes - I believe that's correct.20:30
mordredcorvus: you know ...20:30
fungi#status log ze12 was rebooted around 18:50z due to a hypervisor host problem in the provider, ticket 200526-ord-000103720:30
mordredcorvus: perhaps we combine this topic with the one from the meeting about splitting out 2 sets of public host vars20:30
openstackstatusfungi: finished logging20:30
mordredcorvus: what if we moved hostvars out of playbooks/ entirely...20:31
fungilooks like the zuul-executor process did not start when ze12 was rebooted, and i'm unable to start it manually (though i did confirm the initscript patch got applied at least)20:31
mordredcorvus: and made *2* inventory directories - one that has the openstack.yaml file and the base hostvars20:31
mordredcorvus: and one that includes groups.yaml and includes the per-service hostvars20:31
mordredcorvus: then I believe the order in which the inventory sources are defined in ansible.cfg determines precedence20:31
mordredcorvus: and we can then have an understandable inventory override in the gate jobs20:32
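Editor's sketch of the precedence idea in mordred's proposal: if inventory sources are merged in the order they are listed and later sources win, then a gate-only source listed last can override public hostvars. This is a toy model under that assumption, not Ansible's implementation; the host name and variable names are hypothetical.

```python
def merge_hostvars(sources):
    """Merge per-host variables from an ordered list of inventory sources.

    Each source maps host name -> dict of variables. Sources listed
    later override values from earlier ones, key by key.
    """
    merged = {}
    for source in sources:
        for host, variables in source.items():
            merged.setdefault(host, {}).update(variables)
    return merged


if __name__ == "__main__":
    # Hypothetical data: base hostvars, service hostvars, gate overrides.
    base = {"example01.opendev.org": {"open_ports": [22]}}
    service = {"example01.opendev.org": {"service_port": 443}}
    gate = {"example01.opendev.org": {"service_port": 8443}}
    print(merge_hostvars([base, service, gate]))
```

With the gate source listed last, `service_port` resolves to the gate value while the base variables are left alone.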
fungiaha! /var/log/zuul/* are owned by user 300020:32
fungiinfra-root: ^ we probably need to fix this across all our providers20:32
fungier, executors20:32
corvusfungi: on it20:32
corvusmordred: sounds promising20:32
fungithanks corvus!20:32
mordredcorvus, fungi we should audit uid - I think I remember seeing the zuul user being 3000 on some hosts and 10001 on others20:33
mordred(now the zuuld user)20:33
mordredcorvus: I'll poke at that idea too20:33
fungia find in relevant subtrees looking for -uid 3000 is probably prudent20:33
mordredcorvus: you know nothing I like more than patches that re-arrange ansible yaml files20:33
fungioh, or comparing the uids in /etc/passwd too, yep20:34
mordredfungi: may want to do an ansible grep of zuuld: in /etc/passwd20:34
mordredyeah20:34
fungizuuld is uid 10001 and gid 10001 on ze1220:34
fungisame on ze01, but on ze02 for example it's uid 3000 gid 1000120:35
fungiso yeah, not consistent20:35
*** priteau has quit IRC20:36
corvusoh i thought that had been done20:36
fungimaybe ansible couldn't update the uid because the executor processes had to be offline?20:36
corvuswell yeah, i mean that's what we were expecting to maybe have problems with20:36
corvusbut i thought someone over the weekend checked the uids and fixed it20:37
corvusbut i guess i misunderstood20:37
fungiit was the username change ansible was complaining about over the weekend (user was named zuul in /etc/passwd and usermod refused to update that to zuuld)20:38
fungimordred ran a one-off play to sed the passwd file20:38
corvushow is the uid 3000 on ze02 if the playbook ran successfully and says that the uid should be 10001?20:39
fungii wasn't able to start the executor i had stopped over the weekend, spotted that the username was still wrong in the initscript, and pushed up a patch for that but it didn't land until today20:39
corvushuh, apparently that task failed20:41
corvusoh yeah, lots of failures in that playbook20:41
corvushow is it that playbook failed and the job succeeded?20:41
corvusoh it didn't!20:42
corvusit's still all kinds of failing20:42
corvusokay, i guess i will continue to whack the moles there then20:42
corvushttps://zuul.opendev.org/t/openstack/builds?job_name=infra-prod-service-zuul20:42
mordredcorvus: ooh! this "reorganize hostvars" has another nice benefit - it means we can put groups of playbooks in subdirs too - because we don't have to worry about playbooks being adjacent to hostvars20:44
* mordred has nice patch coming20:44
corvusi think i need to shut down most of the executors to do the re-iding20:45
corvusi'll try the process on ze02 now20:46
corvusoh wait20:46
corvusfungi: you said ze12 is down?20:47
fungicorvus: yes, and probably 01 as well20:47
* fungi checks20:47
corvusokay, i'll start there20:47
fungiyeah, 01 is the one i stopped over the weekend to see how far we could get with ansible20:47
fungibut couldn't start it again because of (at least) the incorrect initscript20:48
corvuswhat's the correct way to run ansible on a host from bridge?  https://docs.openstack.org/infra/system-config/sysadmin.html#force-configuration-run-on-a-server is out of date20:48
fungiand now 12 is down since a couple hours ago when rackspace had to do a reboot migration on it20:48
fungiand the executor failed to start (presumably because it couldn't append to the logs)20:48
mordredcorvus: cd ~/src/opendev.org/opendev/system-config20:49
mordredcorvus: then run ansible-playbook playbooks/foo.yaml20:50
corvusas zuul20:50
corvusor cd ~zuul and run as root20:50
mordredsorry - ~zuul20:50
mordredrunning as root works fine20:50
corvusk, that's what i thought; running now20:50
corvusokay, we're not doing any recursive chowning of logfiles, etc20:52
openstackgerritMonty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/73099120:54
mordredcorvus: I *think* that should work ^^ ... although it's sure going to run all of the system-config test jobs20:55
openstackgerritColleen Murphy proposed openstack/diskimage-builder master: Pre-install xz package in opensuse chroot  https://review.opendev.org/73099220:55
corvusi'm running this on ze12: chown -R --from=3000 zuuld /var/log/zuul; chown -R --from=3000 zuuld /var/lib/zuul20:55
mordredcorvus: ++20:55
fungiyou can probably pass multiple paths to chown, but yeah that looks fine21:00
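Editor's sketch of the audit fungi suggests (finding everything still owned by the old uid before or after the chown). It is a read-only check, roughly what a find by uid would report; the function name is hypothetical, and the paths in the usage example are the ones from the chown command above.

```python
import os


def paths_owned_by(root, uid):
    """Yield paths at or below root whose owner is the given numeric uid.

    Uses lstat so symlinks are reported by their own ownership and are
    never followed into other trees.
    """
    if os.lstat(root).st_uid == uid:
        yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.lstat(path).st_uid == uid:
                yield path


if __name__ == "__main__":
    # 3000 is the stale uid mentioned in the discussion.
    for tree in ("/var/log/zuul", "/var/lib/zuul"):
        if os.path.isdir(tree):
            for path in paths_owned_by(tree, 3000):
                print(path)
```

An empty listing after the chown would indicate nothing under those trees is still owned by uid 3000.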
corvusstarting up ze1221:01
corvusseems to be happy21:01
corvusi'll move on to ze01 now21:01
fungiawesome21:02
corvusownership looks ok on ze0121:02
corvusze01 was stopped by request on 5-2321:03
fungiyes, that's the one i stopped manually but couldn't start again because the initscript was wrong at the time21:03
corvusk, it's up and running now21:04
fungiinitscript fix merged earlier today i just hadn't gotten a chance to try it yet21:04
corvusze03 is correct and running21:05
corvusze05 same21:05
corvusze07 same21:06
corvusso 1,3,5,7,12 are good -- i'm going to stop 2,4,6,8,9,10,11 all at the same time and run the corrective steps21:06
*** factor has joined #opendev21:06
mordredhttps://www.youtube.com/watch?v=hFZFjoX2cGg <-- next time anyone needs a brainhole break - I cannot recommend this youtube video highly enough21:11
fungiso roughly half the executors, sounds good21:17
corvusi ran a manual usermod and that chown, now i'm running the playbook just to make sure we got everything21:21
*** hasharAway has quit IRC21:23
corvusno errors; starting back up now21:25
corvusokay, 12 executors running builds now21:27
mordred\o/21:29
corvusi think we expect the next instance of prod-zuul to succeed21:29
*** slaweq has quit IRC21:34
jrosserwhen i look at zuul status, some of my jobs say (2nd attempt) - what makes this happen?21:41
clarkbjrosser: network connectivity issues will do that. Likely due to the effort corvus was doing on executors above21:42
jrosserhttps://review.opendev.org/729878 would be an example of this just now21:42
clarkbzuul will restart jobs like that if it detects a failure outside of the job content21:43
*** cloudnull has joined #opendev21:43
jrosseraaah ok, thanks21:43
openstackgerritMonty Taylor proposed opendev/system-config master: Clean up base playbook  https://review.opendev.org/73098521:47
openstackgerritMonty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/73099121:47
*** dpawlik has quit IRC21:56
clarkbfungi: markmcclain's job ran on a bionic node21:57
clarkbbionic has python3.6, does that mean we are running twine under python2?21:58
openstackgerritMonty Taylor proposed opendev/system-config master: Move base roles into a base subdir  https://review.opendev.org/73093721:58
clarkbthe command is python3 -m pip install twine!=1.12.021:59
fungiclarkb: clarkb 3.521:59
fungitwine 2.0 requires python 3.6 or later22:00
fungiso pip>=9 on python<3.6 will install twine<222:00
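Editor's sketch of the selection fungi describes: an installer that honours each release's minimum Python version picks the newest release the running interpreter satisfies, skipping explicitly excluded versions. This is a simplified model, not pip's resolver. The minimum Python shown for the 1.x releases and for 3.1.1 is an assumption for illustration; only the 3.6 floor for twine 2.0 comes from the changelog quoted above.

```python
def pick_release(releases, python_version, excluded=()):
    """Return the newest release version usable on python_version.

    releases: iterable of (version_tuple, min_python_tuple).
    excluded: version tuples to skip, like a != requirement.
    Returns None when no release qualifies.
    """
    candidates = [
        version
        for version, min_python in releases
        if version not in excluded and python_version >= min_python
    ]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    twine_releases = [
        ((1, 12, 0), (2, 7)),  # assumed floor, excluded below
        ((1, 15, 0), (2, 7)),  # assumed floor
        ((2, 0, 0), (3, 6)),   # per the twine changelog
        ((3, 1, 1), (3, 6)),   # assumed to keep the same floor
    ]
    skip = {(1, 12, 0)}
    print(pick_release(twine_releases, (3, 5), skip))
    print(pick_release(twine_releases, (3, 6), skip))
```

Under these assumptions a Python 3.5 interpreter ends up with 1.15.0 and a 3.6 interpreter with 3.1.1, matching the versions seen in the discussion.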
clarkbfungi: bionic is 3.622:01
clarkbwhere is 3.5 coming from?22:01
melwittdoes anyone know if/how one can re-create the same env that a zuul job creates locally from a proposed review? context is my colleague is working on this change https://review.opendev.org/730143 and it's failing upstream zuul but passing in a downstream-created environment. is there a way he could re-create the env zuul is using locally to try and debug?22:02
clarkbmelwitt: tripleo has a zuul reproducer tool, but it's fairly involved (you basically have to run a mini zuul I think)22:03
clarkbmelwitt: is the pep8 issue a problem or just the devstack failure?22:03
melwittthe devstack failure22:04
fungiit's challenging, because zuul exists specifically to do things which are so complex that you can't really do them locally (like assembling job configuration from distributed sources, integrating changes from lots of different repos, et cetera)22:04
corvusmelwitt: i'm assuming your colleague has run devstack, tempest, etc with the change in place and that's succeeding.  so is it the case there's a suspected nuance of, say, the images used in opendev that's affecting it?22:05
fungiwe do make the images available for download, if that helps22:05
corvusmelwitt: or does your colleague need a way to "run devstack like it runs in the gate?"22:06
melwittfungi: that might help if you could point me to it22:06
fungimelwitt: https://nb01.opendev.org/images/22:07
clarkband that job ran on ubuntu-bionic so you want the ubuntu-bionic image. It will configure the root ssh keys if given config drive metadata22:07
fungiyou'll probably need to supply a configdrive with ssh keys if you need to log into it22:07
ianwfungi: i'll be happy to pickup the wheel work ... it's on my todo list, esp. new builders22:07
melwittcorvus: he's run the tests he's working on through a tripleo deployed env downstream and things pass there. it's a feature in libvirt that became available in a certain version and zuul is installing the needed version, so it's so far a mystery why it's failing in the upstream zuul case22:08
clarkbmelwitt: and yall have seen https://zuul.opendev.org/t/openstack/build/df8489e5f9ea46cb988911fb51137b26/log/compute/logs/screen-n-cpu.txt#21493 ?22:08
clarkbmelwitt: the error seems to happen in nova not libvirt fwiw (though that could be bubbled out of libvirt I suppose)22:09
fungiianw: oh, thanks, i got the additional volumes for 730323 vos created and added the read-only sites for them, but i think there may be a problem with one of the afs servers because i can't do the vos release step prior to mounting them into the tree22:09
fungiianw: Failed to start transaction on 536871099; Possible communication failure; Could not release lock on the VLDB entry for volume 536871098; Error in vos release command.22:09
fungi(et cetera)22:09
corvusmelwitt: it's probably worth running it in a devstack deployment since that's what that job does22:09
clarkbfungi: I can't see how we'd be using python3.5 on ubuntu bionic. We may be using python2 though22:09
melwittclarkb: yes. admittedly I don't know the detail of how this feature works but it sounds like it happens if libvirt isn't doing a thing we expect it to given the version. and we're wondering is this some difference between distro version of libvirt somehow or what22:10
fungiclarkb: yeah, i'm baffled too, but... https://zuul.opendev.org/t/openstack/build/f5fd340f0158428e85e4b13eb85f6bc2/log/job-output.txt#91322:10
fungiclarkb: i wonder if we used python3.5 on the nodepool builder to create the virtualenv somehow?22:11
fungithough no, that's no virtualenv/venv22:11
melwittcorvus: yeah, makes sense22:12
clarkbfungi: ya thats not a virtualenv, super weird22:12
fungiclarkb: so why is pip referencing packages from /usr/local/lib/python3.5 on a bionic node?!?22:12
clarkbmelwitt: does the feature require kvm and not just qemu?22:13
clarkbmelwitt: devstack does not do nested virt by default because it is so flaky22:13
melwittclarkb: that I don't know. I'll ask22:14
fungimelwitt: this is also a good reference to pass along https://docs.opendev.org/opendev/infra-manual/latest/testing.html22:15
ianwfungi: ok, i see cent8 and focal rw volumes in the volume list, but not R/O22:16
melwittthanks for the info, this is all very helpful22:16
fungiianw: i did the `vos addsite afs01.dfw.openstack.org a mirror.wheel.focalx64` (and afs02.dfw) for each of the volumes22:17
fungii didn't hit any errors, but yeah i suppose that's where the problem lies and why vos release isn't working for them22:17
fungipossibly hung addsite transactions22:18
clarkbfungi: I've ssh'ed into a random limestone bionic node and there is no python3.5 that I can see22:18
openstackgerritGhanshyam Mann proposed opendev/irc-meetings master: Add Secure Default policies popup team meeting  https://review.opendev.org/73093522:18
fungiianw: maybe we're hitting a limit somewhere?22:18
fungiclarkb: and no /usr/local/lib/python3.5 either i guess22:18
clarkbthere is 3.6 and 2.7 in /usr/local/lib as expected22:18
clarkbfungi: correct22:18
clarkboh!22:18
clarkbI see now22:18
fungiis that on the executor?22:18
clarkbfungi: ya22:19
melwittconfirmed that no nested virt is involved. thank you for the ideas, we have a path forward now, thanks so much clarkb corvus fungi22:19
clarkbwe do the wheel build on the bionic node, then copy it to the executor and attempt to twine it from there22:19
fungimelwitt: we're here if you have more questions, just let us know22:19
ianwfungi: hrmmm ... ls: cannot access 'centos-8-x86_64': Connection timed out after i tried to mount it22:19
fungiclarkb: yeah, okay, so that much makes sense (and explains why we get older twine)22:20
melwittthanks all ++ :)22:20
fungiianw: i didn't even get as far as trying to mount because our instructions say to vos release first22:20
fungiand that was breaking22:20
ianwyeah, i'd imagine that because the r/o mirrors aren't showing up (?) vos release will not be happy22:21
ianwother than the creation logs, nothing of interest in afs server logs i can see22:23
*** DSpider has quit IRC22:23
fungii wonder if we've still got something hung from the afs02 outage last month22:23
fungimore stale transaction records or something?22:23
ianwvicepa plenty of space, nothing in dmesg type logs22:23
clarkbmelwitt: the job appears to be using libvirt 4.0.0 https://libvirt.org/formatdomain.html#elementsVideo says that you need 4.6.0 to use model type none22:25
fungiclarkb: okay, so mystery of the older twine is out of the way, but i'm still no closer to figuring out why x/networking-arista seems to consistently hit file-exists errors for sdist uploads to pypi but other projects like openstack/python-ironic-inspector-client here are just peachy: https://zuul.opendev.org/t/openstack/build/7171b1ae6b5f42eb8055c539db8cbe4f22:26
ianwfungi: you working on mirror-update.opendev.org?22:27
melwittclarkb: oh geez.. let me look into that22:27
fungiianw: nope22:27
fungiianw: i was just using my fungi/admin kerberos account locally22:27
fungiianw: ooh, could it be i'm using too-new openafs?22:28
fungithat wouldn't surprise me22:28
ianwok, that seems to have 1.8.5 openafs packages, but with an uptime of 175 days that would presumably still be running the older 1.8.3 kernel module22:28
fungiyeah, i'm using 1.8.6~pre122:28
fungii bet that's it22:28
fungiwe probably have to cancel the transactions for my addsite commands22:29
clarkbfungi: did you want to push the new patchset for https://review.opendev.org/#/c/730933/ to get the regex to match old and new pip?22:29
clarkbianw: ^ is related to mirror work22:29
fungiclarkb: sure, i can do that, just a jiffy22:29
ianwfungi: i don't know, stuff like this shouldn't break22:29
fungiianw: oh, was it using new server with old clients that was the problem not the other way around?22:29
clarkbfungi: actually one sec, your regex won't match the new case22:30
fungiclarkb: huh, i thought i tested it locally against that22:30
clarkbfungi: you need to match the (5MB) suffix22:30
clarkbwithout a #22:30
melwittclarkb: looks like that should totally be the problem 😬22:30
clarkboh wait I see the # is optional so I think we just need to remove the |$ case?22:30
clarkbfungi: I think yours will work but we should just remove the $ case I think22:31
clarkbwe shouldn't ever hit that branch in the regex22:31
clarkbfungi: re why other packages don't hit the dup error, could it be related to upload sizes somehow (eg a race in pypi)22:32
melwittclarkb: sorry for the noise :(22:32
clarkbmelwitt: no worries22:32
*** tobiash has quit IRC22:33
fungiclarkb: ahh, yeah i added the $ case just in case future iterations ended the line immediately after the .whl22:33
*** tobiash has joined #opendev22:34
fungibut it would be fine to only insist on # or space for patterns we've seen thus far22:34
fungiwas just trying to make the match as accepting as possible22:34
clarkbgot it in that case your proposed version is probably fine22:34
ianwfungi: i think to start, we should reboot mirror-update.opendev.org to make sure its kernel openafs module is in sync with the userspace tools22:35
fungiwe'll still need to update it if they change case on Downloading or use a different word or wrap the filename in brackets or...22:35
clarkbfungi: ianw https://review.opendev.org/#/c/730861/ should be a quick review if you have a moment, that is for testing base-test with ssl'd mirrors22:35
clarkbfungi: well we'll be pinning pip soon enough I bet to accommodate python2 there22:35
clarkbfungi: and if we do that the output should be pretty stable22:35
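Editor's sketch of a pattern in the spirit of the regex discussion above: pull the wheel filename out of a pip "Downloading" line whether it is preceded by a full URL or not, and whether it is followed by `#`, a space, or the end of the line. This is not the regex from change 730933. The bare-filename line in the usage example is an assumption about how newer pip prints it; the URL line is the 19.2.3 output fungi pasted earlier.

```python
import re

# Optional URL or path prefix, then the wheel filename, then '#',
# whitespace, or end of line.
WHEEL_RE = re.compile(r"Downloading\s+(?:\S*/)?([^/\s#]+\.whl)(?:[#\s]|$)")


def downloaded_wheel(line):
    """Return the wheel filename from a pip Downloading line, or None."""
    match = WHEEL_RE.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    old_style = (
        "Downloading https://files.pythonhosted.org/packages/e1/e5/"
        "df302e8017440f111c11cc41a6b432838672f5a70aa29227bf58149dc72f/"
        "urllib3-1.25.9-py2.py3-none-any.whl (126kB)"
    )
    # Assumed shape of the newer output.
    new_style = "Downloading urllib3-1.25.9-py2.py3-none-any.whl (126 kB)"
    print(downloaded_wheel(old_style))
    print(downloaded_wheel(new_style))
```

As fungi notes, any pattern like this still breaks if pip changes the leading word or wraps the filename differently, which is the argument for running a consistent pip version.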
openstackgerritMonty Taylor proposed opendev/system-config master: Split inventory into multiple dirs and move hostvars  https://review.opendev.org/73099122:36
openstackgerritMonty Taylor proposed opendev/system-config master: Move base roles into a base subdir  https://review.opendev.org/73093722:36
mordredclarkb: I don't have this totally working yet - but does that idea make sense of where that's going? ^^22:36
clarkbmordred: I'm not getting how the inventory split helps. We already organize group_vars roughly by service, and host_vars are not collapsed any further than they are already?22:38
clarkbmordred: the roles side makes sense to me22:40
openstackgerritMonty Taylor proposed opendev/system-config master: Split out a base playbook for the zuul service  https://review.opendev.org/73099922:43
mordredclarkb: the inventory split will eventually let us have 2 different files for each host/group - one for variables that are needed for base roles and one for variables that are needed by service-specific roles22:44
clarkbmordred: gotcha so the split is service/ vs base/ not between services as much22:45
mordredclarkb: that way we can file-match inventory/service/host_vars/zuul01.openstack.org.yaml in infra-prod-service-zuul and inventory/base/host_vars/zuul01.openstack.org.yaml in base-zuul22:45
clarkbbut that then helps to know when to run base or not22:45
mordredyeah22:45
mordreda side-benefit is it will also let us override hostvars in gate-specific vars22:46
openstackgerritJeremy Stanley proposed openstack/project-config master: Update pip output parsing to fix wheel mirror builds  https://review.opendev.org/73093322:46
mordredwhich we can't do when the hostvars are just adjacent to playbooks, since that's a higher-priority location22:46
clarkbianw: ^ is worth a review given the mirror work22:46
mordredclarkb: granted - I'm not 100% sure this is good/better yet22:46
openstackgerritMerged opendev/base-jobs master: Test mirrors with ssl  https://review.opendev.org/73086122:46
clarkbI've restored https://review.opendev.org/#/c/680178/ to test ^22:47
clarkbcompletely unrelated, gitea has a few commits on the 1.12 branch after the rc tag22:51
clarkbthey are largely bug fixes so we may want to wait for another rc or release before deploying that22:52
*** tosky has quit IRC22:52
*** tkajinam has joined #opendev22:57
openstackgerritClark Boylan proposed zuul/zuul-jobs master: DO NOT MERGE test base-test with no virtualenv perms modifications  https://review.opendev.org/68017823:00
ianw       server afs01.dfw.openstack.org partition /vicepa RO Site  -- Not released23:07
ianw       server afs02.dfw.openstack.org partition /vicepa RO Site  -- Not released23:07
ianwfungi: ^ so they are there is vldb23:07
ianwi mean in listvldb23:07
fungiyeah, i wonder why the vos release hangs then23:08
ianwi don't see it hang on afs0123:09
ianwbut it doesn't work23:09
ianwfungi: http://paste.openstack.org/show/794022/23:10
fungiVOLSER: volume is busy23:11
fungihuh23:11
fungiand it's afs01.dfw which is struggling apparently?23:11
clarkbinfra-root config-core I think https://review.opendev.org/#/c/730862/1 is ready as base-test testing shows apt and pypi things working with ssl23:12
ianwfungi: "vos status" does not show anything going on, afaics23:12
clarkbdoes listvldb show any locks?23:13
clarkb(if it does maybe we can work back from that to find what locked it?)23:13
ianwno, doesn't show locks23:14
fungiclarkb: looking through the networking-arista project journal on pypi (logged in with our openstackci creds), i see february 22 is the last time it had three log entries for a new release ("new release" and "add py2.py3 file" for the wheel and then "add source file" for the sdist). the next release on april 5 just has the "add py2.py3 file" entry as does each release after it, presumably also corresponding23:22
fungito the sdist upload failures23:22
ianwfungi: ok, i ran a salvage on it, it seemed to do something, and it's still not happy23:22
fungi:/23:22
fungiianw: could we delete and try to recreate the sites and volumes? it's not like they have any content whatsoever23:23
clarkbfungi: does the journal not show the sdist for the newer releases? because there are sdists available23:24
fungiclarkb: it doesn't log them being uploaded, no, only the wheels23:25
fungiit also doesn't log the release creation any longer23:25
fungii have no idea if this is a behavior change in pypi or what23:25
clarkbweird, it definitely lists the sdists23:25
fungithe security history log still shows events for each release creation though23:26
fungitimestamps are a bit weird too23:27
fungiinteresting, for 2017.2.8 the security history shows a release created event at May 20, 2020, 9:47:08 PM by user openstack-arista but then the journal shows the wheel upload at May 20, 2020, 9:58:00 PM by user openstackci (from the ip address of ze06)23:30
clarkbooh interesting, is it possible they did an sdist upload out of band then we did the wheel upload?23:31
fungimarkmcclain: ^ you might ask whoever is handling your release process what steps they're following... it shouldn't be necessary to manually create a release on pypi. are they doing that by uploading an sdist?23:31
clarkbmarkmcclain: ^ if you are still around maybe you know?23:31
fungilooking at the security history, the last working releases were when openstackci created the releases23:32
fungithe releases starting in april where openstack-arista created them prior to upload correspond to the build failures we've been seeing23:32
fungievery release starting from april was created by the openstack-arista account rather than by zuul uploading release artifacts23:34
fungithat's got to have something to do with it23:34
clarkb++23:34
fungifar too coincidental of a match23:34
fungithat also explains why i'm not seeing this problem for other repos, they show openstackci creating the corresponding releases at time of package upload23:36
ianw    number of sites -> 323:45
ianw       server mirror-update01.opendev.org partition /vicepa RW Site23:45
ianw       server mirror-update01.opendev.org partition /vicepa RO Site23:45
ianw       server afs02.dfw.openstack.org partition /vicepa RO Site23:45
ianwi feel sure that very odd server name has something to do with all this23:45
clarkbwait did mirror-update get set up as a server?23:46
ianwno ... hence the very odd bit :)23:46
fungii just double-checked and none of the commands i ran to create these volumes/sites had "mirror-update" anywhere in them23:47
fungiso it's presumably not from today at least23:47
ianwi'm running syncserv on mirror-update now, to see if that brings it back in line23:48
ianwfungi: i have no idea how things got so messed up :/  i had to zap a volume, and then vos delentry ... but i think after recreating it's working23:59
clarkbthe existing meetpad server is an 8vcpu 8gb memory server. jvb scaling seems to be more cpu dependent than memory dependent so I'll probably use a flavor with at least 8vcpu (which I'm guessing the smaller is also with 8gb memory) for adding a jvb server23:59