Wednesday, 2022-10-26

*** tkajinam is now known as Guest396600:09
*** rlandy|bbl is now known as rlandy00:20
*** rlandy is now known as rlandy|out00:21
opendevreviewMerged opendev/system-config master: Switch bridge to bridge01.opendev.org  https://review.opendev.org/c/opendev/system-config/+/86111201:13
*** dasm|rover is now known as dasm|off01:18
*** tkajinam is now known as Guest397201:47
opendevreviewIan Wienand proposed opendev/system-config master: run-production-bootstrap-bridge: fix bridge name  https://review.opendev.org/c/opendev/system-config/+/86266502:14
*** raukadah is now known as chandankumar03:10
opendevreviewMerged opendev/system-config master: run-production-bootstrap-bridge: fix bridge name  https://review.opendev.org/c/opendev/system-config/+/86266503:13
ianwok, the bridge bootstrap is running better now03:16
ianwit's installing ansible03:16
ianw× Encountered error while trying to install package.03:20
ianw╰─> netifaces03:20
ianwit doesn't have gcc so this failed to build when installing openstacksdk03:21
ianwthis probably works in the gate because we build a wheel for it ...03:21
ianwhttps://pypi.org/project/netifaces/0.11.0/#files ... i guess no 3.10 wheel03:24
opendevreviewIan Wienand proposed opendev/system-config master: install-ansible: unconditionally install build-essential  https://review.opendev.org/c/opendev/system-config/+/86266603:28
opendevreviewMerged opendev/system-config master: install-ansible: unconditionally install build-essential  https://review.opendev.org/c/opendev/system-config/+/86266604:23
opendevreviewIan Wienand proposed opendev/system-config master: install-ansible: also install python3-dev  https://review.opendev.org/c/opendev/system-config/+/86266805:13
*** marios is now known as marios|ruck05:26
*** tkajinam is now known as Guest398605:43
opendevreviewMerged opendev/system-config master: install-ansible: also install python3-dev  https://review.opendev.org/c/opendev/system-config/+/86266806:18
ianwalright, bridge01.opendev.org is bootstrapped -> https://zuul.opendev.org/t/openstack/build/c647c32a0d4242aa9f4392d582deb07606:24
*** tkajinam is now known as Guest398806:53
*** jpena|off is now known as jpena07:18
opendevreviewIan Wienand proposed opendev/system-config master: Add new bridge to allowed root logins  https://review.opendev.org/c/opendev/system-config/+/86267007:46
noonedeadpunkfolks, do you know anything about centos 8 stream epel mirror? Ie was it dropped/buggy/etc? 08:47
noonedeadpunkAs we got https://zuul.opendev.org/t/openstack/build/c9655845446540368d870da56369785e yesterday and I'm not sure if it's worth just re-checking failed jobs or the issue is deeper08:59
ianwnoonedeadpunk: afaik we have not done anything to it recently09:01
opendevreviewMerged opendev/system-config master: Add new bridge to allowed root logins  https://review.opendev.org/c/opendev/system-config/+/86267009:04
noonedeadpunkhm, ok, will try to re-check then09:10
fricklercentos isn't exactly known to provide us with high quality mirrors to pull from, we've seen things being out of sync for extended periods before09:18
noonedeadpunk:D09:19
noonedeadpunkyeah, I know, but decided to double check before wasting resources09:19
fricklerno obvious error in the rsync logs afaict, so if there is still an issue, it likely will affect our upstream mirror, too09:29
*** rlandy|out is now known as rlandy10:33
*** soniya is now known as soniya|afk10:42
*** dviroel|afk is now known as dviroel11:28
fricklernoonedeadpunk: FYI epel 9 seems to be broken for kolla, too https://062b194e5e6e841b5adf-7651d79ea360a4aa04fbe96029f7a5e2.ssl.cf1.rackcdn.com/656603/4/gate/kolla-ansible-rocky9-source/b2fede2/primary/logs/build/000_FAILED_mariadb-server.log12:17
noonedeadpunkfrickler: yeah we also have failures on centos 9, true12:21
noonedeadpunkand recheck failed same way 12:21
fungihttps://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror-update/files/epel-mirror-update#L31 says we're mirroring from rsync://pubmirror1.math.uh.edu/fedora-buffet/epel12:29
opendevreviewMerged zuul/zuul-jobs master: Pin py38 jobs to focal  https://review.opendev.org/c/zuul/zuul-jobs/+/86262812:54
opendevreviewMerged zuul/zuul-jobs master: Add tox-py311 job  https://review.opendev.org/c/zuul/zuul-jobs/+/86262912:54
*** soniya|afk is now known as soniya13:01
*** dasm|off is now known as dasm|rover13:38
noonedeadpunkgot side-pinged. I can spawn a vm and check if it has issues with the original mirror14:12
fricklerclarkb: fyi I mentioned the storyboard mail on several project channels where I knew they were having that topic during the PTG, and a lot of feedback was that they weren't even aware of the service-discuss list, so maybe you'll want to send a link to openstack-discuss. or maybe fungi as tact lead.14:34
opendevreviewJeremy Stanley proposed opendev/system-config master: Correct block_storage_endpoint_override for rax  https://review.opendev.org/c/opendev/system-config/+/86270615:07
fungiclarkb: ^ frickler spotted that and it seems to allow much simpler venvs with newer openstacksdk/cli15:08
*** dviroel is now known as dviroel|lunch15:08
fungii'm able to make it work with a fresh venv on bridge now with just pip install openstackclient 'python-cinderclient<5'15:10
fungimake that <8 if we drop the v1 override on the cli15:12
opendevreviewJeremy Stanley proposed opendev/system-config master: Correct block_storage_endpoint_override for rax  https://review.opendev.org/c/opendev/system-config/+/86270615:13
clarkbfrickler: ya list membership is something we've struggled with. When we first split opendev and created those lists every email I sent to the project lists told people to subscribe15:14
clarkbI think now the front page of opendev.org points people to those lists (if not we should add that)15:14
fungiit's listed as "contact info" at the bottom of the page15:20
fungibut we should probably make it more prominent15:20
*** marios|ruck is now known as marios|out15:33
clarkbthe mm3 image upstream merged my change to add lynx to the images16:23
clarkbanother thing that occurred to me is that our fork isn't going to be equivalent to the tagged images on docker hub because our images will be newer. This potentially complicates upgrades later or a shift to the upstream images?16:23
clarkbthere is a "rolling" tag that was updated recently though which might be closer to what we want?16:24
*** dviroel|lunch is now known as dviroel16:26
opendevreviewMerged opendev/system-config master: Correct block_storage_endpoint_override for rax  https://review.opendev.org/c/opendev/system-config/+/86270616:39
fungioh, that sounds like an interesting option16:44
fungiclouds.yaml fix has rolled out. confirming on the new bridge01, this is working:16:55
fungipython3 -m venv foo16:55
fungifoo/bin/pip install -U pip setuptools wheel16:55
fungifoo/bin/pip install openstackclient 'python-cinderclient<8'16:56
fungisudo foo/bin/openstack --os-cloud=openstackci-rax --os-region-name=DFW volume list16:56
*** jpena is now known as jpena|off16:56
fricklernodepool rollout failed, like all the hourly jobs it seems? https://zuul.opendev.org/t/openstack/builds?job_name=infra-prod-service-nodepool&project=opendev/system-config17:19
fungioh, looking17:20
fricklerthat might be fixed by https://review.opendev.org/c/opendev/system-config/+/862670 which also failed to deploy though17:21
fungioddly, the ansible log doesn't indicate a failed task17:22
fungioh! new bridge17:23
fungiTASK [nodepool-base : Get zk config]17:24
fungibut we filter the task output, so no clear explanation for what failed with it17:24
fungiguess i'll step through it17:25
fungiplaybooks/roles/nodepool-base/library/make_nodepool_zk_hosts.py still starts with #!/usr/bin/env python317:26
fungifrickler: ^ i expect that's it, new ansible is touchy about that17:27
clarkbfungi: double check since I thought we were going to keep using old ansible and only upgrade once new bridge was stable17:27
fungioh, mebbe17:27
fungiansible-playbook [core 2.11.12]17:28
clarkbya it's 2.13.x that had the problem, so I think that is unlikely to be the issue17:29
fungiansible            4.0.017:29
fungiansible-core       2.11.1217:29
fungithat's what we've got in /usr/ansible-venv anyway, so sounds like the problem lies elsewhere17:29
clarkbmake_nodepool_zk_hosts builds the nodepool config details for our zookeeper stuff out of inventory and host vars17:29
clarkbI would double check that those are looking good17:30
clarkbI need to pop out soon for an appointment so can't dive in right now myself17:31
fungino worries17:31
fungisudo /usr/ansible-venv/bin/ansible-inventory --list17:31
fungiERROR! Unexpected Exception, this is probably a bug: Object of type bytes is not JSON serializable17:31
clarkbthis may be the thing I mentioned somewhere which is that we rely on the yamlgroup inventory plugin which wasn't installed when I had a chance to look at new bridge yesterday 17:32
clarkbianw thought that the bridge bootstrapping would do it, but maybe not17:33
fungiyeah, pip list doesn't report yamlgroup installed17:33
clarkbI'm not sure if it was ever pip installed17:33
clarkbyou drop the script into /etc/ansible/somewhere17:33
clarkband have to set it in your ansible.cfg config file17:33
clarkbalso maybe cross check with ianw's etherpad that was linked yesterday. This may be a known thing17:34
fungilooks like we expect it to be in /etc/ansible/inventory_plugins/yamlgroup.py17:34
fungiand it's there, so maybe ansible isn't finding it17:34
fungi/etc/ansible/ansible.cfg does have enable_plugins=yaml,yamlgroup,advanced_host_list,ini17:35
fungiinterestingly, if i add -y it works17:37
fungiso the json encoding problem is specific to the default output format17:37
clarkbI think -y may mean interpret it as a normal yaml inventory though? which isn't quite right17:38
opendevreviewArtom Lifshitz proposed zuul/zuul-jobs master: DNM: Debugging failures  https://review.opendev.org/c/zuul/zuul-jobs/+/86273917:38
clarkbI would've expected it to use the enabled plugins list to find the right stuff instead17:38
fungioh, i thought -y was telling it to output yaml instead of json17:38
clarkbmaybe? (the last time I had to fiddle with ansible inventory I couldn't figure out how to make it talk to localhost without a local connection, it's all very confusing)17:40
fungifwiw, ansible-inventory on the old bridge gives the same behaviors from what i can see17:47
fungi`sudo /usr/ansible-venv/bin/ansible-config view` also returns configuration with the yamlgroup plugin specified, so it does appear to be reading that config at least17:53
fungi`sudo /usr/ansible-venv/bin/ansible zookeeper --list-hosts` gives me zk04.opendev.org, zk05.opendev.org and zk06.opendev.org so i think that means the group is known17:55
clarkb++17:57
fungii'm not sure how to easily test that plugin on the command line though18:01
fungilooks like it requires a hostvars dict and a zk_group dict18:02
fungiseems it needs a json blob piped into it18:05
fungiand expects a ANSIBLE_MODULE_ARGS key in that18:05
fungii guess this is part of ansible's module specification/protocol18:06
fungii'm able to get the json format correct, but am unclear on what the actual values for hostvars and zk_group should be, so the script is just returning a changed=true failed=true if i provide a nonempty zk_group18:15
fungii'm thinking it should be "zk_group": ["zk04.opendev.org", "zk05.opendev.org", "zk06.opendev.org"]18:17
Clark[m]It would be whatever is passed by the Ansible that calls the module18:17
fungiyeah, which is hard to suss out because we no_log that task18:17
fungithe task passes zk_group: "{{ groups['zookeeper'] }}" which i think gets expanded before being sent into the module18:18
fungialso passes hostvars: "{{ hostvars }}" but i have no idea what that is or where it gets set (is that some sort of ansible builtin?)18:19
fungior maybe that's a whole inventory blob?18:20
fricklerthat should be the complete result of gather_facts. but I also have no idea why we set no_log on that task. but in the fail case there should also be an exception msg returned?18:45
fungior an unsafe value in one of the facts could get echoed into the log maybe18:48
fungianyway, i've cracked the input format18:48
fungiecho '{"ANSIBLE_MODULE_ARGS": {"hostvars": {"zk04.opendev.org": {"ansible_host": ""}}, "zk_group": ["zk04.opendev.org"]}}' | sudo /usr/ansible-venv/bin/python3 ~zuul/src/opendev.org/opendev/system-config/playbooks/roles/nodepool-base/library/make_nodepool_zk_hosts.py18:49
fungithat gets me successful output, albeit with incomplete data, but it's enough to exercise part of the script at least18:49
fungiso i can be reasonably sure it executes and isn't raising an exception at least18:50
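The shell one-liner above can also be driven from Python, which makes it easier to vary the inputs. This is only a sketch: the `ANSIBLE_MODULE_ARGS` wrapper and the `hostvars` / `zk_group` argument names come from the command above, while the helper names `build_module_payload` and `run_module` are made up for illustration, and the module path is left for the caller to supply.

```python
import json
import subprocess
import sys


def build_module_payload(hostvars, zk_group):
    """Wrap the module arguments in the ANSIBLE_MODULE_ARGS envelope
    used by the echo command in the log."""
    return json.dumps(
        {"ANSIBLE_MODULE_ARGS": {"hostvars": hostvars, "zk_group": zk_group}}
    )


def run_module(module_path, payload, python=sys.executable):
    """Pipe the payload to the module on stdin and return its stdout.

    Hypothetical helper: module_path is whatever copy of the module
    script is being tested.
    """
    result = subprocess.run(
        [python, module_path],
        input=payload,
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


# Same minimal input as the echo command: one host with an empty address.
payload = build_module_payload(
    {"zk04.opendev.org": {"ansible_host": ""}},
    ["zk04.opendev.org"],
)
print(payload)
```

Note that `json.dumps` here would itself raise if any value in `hostvars` were not JSON serializable, so a hand-built payload like this one only exercises the module, not the step where ansible serializes the real arguments.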
fricklerUnable to pass options to module, they must be JSON serializable: Object of type bytes is not JSON serializable19:01
fungiwhere are you getting that?19:02
fungifrom gather_facts?19:02
fungioh, trying to pass the gather_facts blob in i suppose19:02
fricklerno, from running that module. I made a copy of nodepool-base in /root/roles on bridge01 and run /root/sn.yaml19:03
fricklerwhich is a stripped down version of service-nodepool.yaml19:03
fricklerI'll add a debug to show the module parameters before that call19:04
fungiper above, if i pipe some json into the module i can get a successful response, so i guess that comes down to how ansible is injecting the json19:04
fricklerso I haven't found which of the hostvars has bytes in it. but it isn't one of the zk_hosts, so with a bit of filtering this passes https://paste.opendev.org/show/bIspSkl8YqUXvC8lPWoo/19:42
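One way to track down which hostvar carries the bytes is to walk the data recursively and report the key path of every bytes value. The following is a rough sketch rather than anything that was actually run on bridge01; the function name `find_bytes_paths` and the sample host and variable names are invented for illustration.

```python
def find_bytes_paths(value, path=()):
    """Yield the key path of every bytes value nested in dicts and lists."""
    if isinstance(value, (bytes, bytearray)):
        yield path
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_bytes_paths(item, path + (key,))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from find_bytes_paths(item, path + (index,))


# Made-up sample data standing in for a dump of hostvars.
sample = {
    "zk04.example.org": {"ansible_host": "203.0.113.4"},
    "ze01.example.org": {"some_keytab": b"\x05\x02binary"},
}

offenders = list(find_bytes_paths(sample))
print(offenders)
```

Each reported path names the host and the variable to look at, which is enough to find the offending entry in the host or group vars.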
fricklerwith that I'm out for today and leave the fun for ianw I guess19:42
ianwo/ ... looking 19:45
ianwthis is infra-prod-nodepool right?19:47
clarkbI think so. There was a change to update the clouds.yaml on bridge and the nodepool nodes to fix a cinder issue with rax19:50
clarkbit apparently applied to bridge but the nodepool job failed on the zk config module thing19:50
ianwbridge run also has : "Failed to lock apt for exclusive operation: Failed to lock directory /var/lib/apt/lists/: E:Could not get lock /var/lib/apt/lists/lock."19:52
ianwIt is held by process 438359 (python3)19:52
ianw... occasionally19:54
ianwso either something is happening in the background more frequently updating apt with jammy, or possibly somehow there's some sort of racy thing where ansible steps on itself19:54
clarkbunattended upgrades should only run once a day. But ya maybe there is a newer thing doing it more often19:56
clarkbin particular the lists lock is checking for package updates I think19:57
*** dviroel is now known as dviroel|afk20:03
ianwhere is the problem serializing -- "zuul_executor_keytab": "20:27
ianwthis looks like pretty much the same thing i reported and fixed and was rejected @ https://github.com/ansible/ansible/issues/45098 :/20:29
jrosser_i think that make_nodepool_zk_hosts.py might be done natively a bit like this https://paste.opendev.org/show/bgogpaWUq7TTmwFEHyu0/20:31
jrosser_right now the python is really a reimplementation of the default filter20:32
ianwjrosser_: yes, i have to agree it's probably better to do it like that -- rather than serialising all of hostvars and passing it :)20:32
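To illustrate the point about not serialising all of hostvars: the lookup only needs the entries for the hosts in the zookeeper group. The sketch below is written in Python for clarity and is not the contents of make_nodepool_zk_hosts.py or of the linked paste. The `ansible_host` key is taken from the test input earlier in the log; the fallback to the host name, the output shape and the port (2181, ZooKeeper's default client port) are assumptions.

```python
def zk_hosts(hostvars, zk_group, port=2181):
    """Build a host/port list for the group, touching only those hosts.

    Assumption: fall back to the inventory name when ansible_host is
    not set, similar in spirit to a default filter.
    """
    hosts = []
    for name in zk_group:
        address = hostvars.get(name, {}).get("ansible_host", name)
        hosts.append({"host": address, "port": port})
    return hosts


# Made-up inventory data; the second host has binary data that is never read.
example_hostvars = {
    "zk04.example.org": {"ansible_host": "203.0.113.4"},
    "ze01.example.org": {"some_keytab": b"\x05\x02binary"},
    "zk05.example.org": {},
}

result = zk_hosts(example_hostvars, ["zk04.example.org", "zk05.example.org"])
print(result)
```

Because only the group members are consulted, unrelated variables on other hosts never have to be converted to JSON at all.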
ianw... but ... we do also have !!binary data in our hostvars, which as 45098 says is actually unsupported20:33
clarkbbut it also worked before? I guess that may have been due to older pyyaml or something?20:33
ianwwhy it did work is a question.  it may be a python3.6 -> python3.10 thing20:34
ianwi'm not sure we even use the variable "zuul_executor_keytab"20:34
ianwhttps://review.opendev.org/c/opendev/system-config/+/515181 seemed to remove zuul_launcher_keytab20:36
ianwhttps://review.opendev.org/c/opendev/system-config/+/371818 added it to publish ... 20:37
ianwyep it works with them commented out20:39
ianwi'll do a more thorough investigation in a bit, have to get kid out door20:39
clarkbI think that may have existed before zuul secrets existed20:40
clarkband so we had it as part of the executor install?20:40
ianwi can't see we ever used zuul_executor_keytab in system-config.  the zuul_launcher_keytab we don't use any more20:41
ianwall our keytabs are just base64 encoded strings, e.g.20:42
ianwfoo_keytab: | <base64>20:42
ianwi'm not sure how these ..._keytab: !!binary bits hung on in there20:43
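The difference between the two forms matters for serialisation. As far as I understand it, a `!!binary` scalar is loaded by PyYAML as a Python bytes object, whereas a plain base64 block stays a string. The standard library alone is enough to show why the former breaks JSON encoding; the variable name and the data here are made up.

```python
import base64
import json

# Stand-in for what a !!binary value becomes once loaded: raw bytes.
raw = b"\x05\x02 not real keytab data"

try:
    json.dumps({"example_keytab": raw})
    error = None
except TypeError as exc:
    error = str(exc)
print(error)

# Keeping the value as a base64 string serialises without trouble,
# and the consumer can decode it back to the original bytes.
encoded = base64.b64encode(raw).decode("ascii")
serialized = json.dumps({"example_keytab": encoded})
round_tripped = base64.b64decode(json.loads(serialized)["example_keytab"])
```

The error text printed by the first half matches the "is not JSON serializable" message seen from ansible-inventory and from the module call earlier in the log.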
opendevreviewClark Boylan proposed opendev/system-config master: WIP Upgrade to Gitea 1.18  https://review.opendev.org/c/opendev/system-config/+/86266120:57
opendevreviewClark Boylan proposed opendev/system-config master: DNM intentional failure to hold a node  https://review.opendev.org/c/opendev/system-config/+/84818120:57
opendevreviewClark Boylan proposed opendev/system-config master: Disable unused gitea features  https://review.opendev.org/c/opendev/system-config/+/86275620:57
clarkbI think 862756 is safe to land whenever. I noticed those options when looking at 1.18 configs20:58
clarkbI put a hold on 848181 to enable interactive debugging20:59
clarkboh I didn't update its commit message. Oh well20:59
clarkbother than nodepool, zuul-jobs and the openstack releases announce job have we seen fallout from the jammy base nodeset update?21:01
clarkbdoes anyone remember what gitea paths showed files as vendored?22:07
fungii bet my irc logs remember22:08
fungichecking22:08
clarkbI'm looking at the currently deployed setup on opendev.org and half wondering if the fixed lib made it into that already22:08
fungiapparently almost everything in https://opendev.org/openstack/oslo.cache/commit/7fb06bc2034d9747c9721c9d3eff06925a4483c6 showed up as "vendored" according to my chat logs22:09
clarkbaha its on the commits not the file browser thats what I needed22:09
fungistill says "vendored" when i bring that up22:09
clarkbhttps://opendev.org/opendev/system-config/commit/2d9d24d07d73d959241f3a7e4ba50e83542ebed0 is a system config path that shows it. But we can check the 1.18.0-rc0 against that22:10
clarkbhttps://198.72.124.43:3081/opendev/system-config/commit/2d9d24d07d73d959241f3a7e4ba50e83542ebed0 no more vendored22:10
clarkbyay my fix worked22:11
fungiw00t!22:12
clarkbthe key was being shown it was on the commit pages and not on the general file browser22:13
opendevreviewIan Wienand proposed opendev/system-config master: nodepool-base: don't call out to find zk_hosts  https://review.opendev.org/c/opendev/system-config/+/86275922:23
*** rlandy is now known as rlandy|bbl22:26
ianwjrosser_: ^ not calling out is about 20 times faster :)22:27
fungiwow22:28
ianwit's a lot to serialise; bit of a corner case but a good one to be aware of22:32
clarkbdoes running set fact against itself in a loop work like that in ansible?22:40
ianwit does seem to -> https://paste.opendev.org/show/bF50adDZ66VuSYQbNMd6/22:51
clarkbI think a lot of the 1.18 stuff may be around features we don't use like federation and email and the proxy protocol. That means the 1.18 upgrade is likely to be straightforward for us. I do want to investigate enabling the proxy protocol though as I think that might help with our logging. The problem is I'm not sure if apache will grok it (haproxy does)23:26
opendevreviewIan Wienand proposed opendev/system-config master: bootstrap-bridge: Codify allowed Zuul logins  https://review.opendev.org/c/opendev/system-config/+/86276123:35
