Friday, 2022-10-28

*** rlandy|bbl is now known as rlandy|out00:09
opendevreviewIan Wienand proposed opendev/system-config master: [wip] testing with bridge99.opendev.org  https://review.opendev.org/c/opendev/system-config/+/86284501:16
opendevreviewIan Wienand proposed opendev/base-jobs master: infra-prod: Move project-config reset into base-jobs  https://review.opendev.org/c/opendev/base-jobs/+/86285303:56
opendevreviewIan Wienand proposed opendev/system-config master: Remove old bridge testing  https://review.opendev.org/c/opendev/system-config/+/86276603:59
opendevreviewIan Wienand proposed opendev/system-config master: Refernce bastion through prod_bastion group  https://review.opendev.org/c/opendev/system-config/+/86284503:59
opendevreviewIan Wienand proposed opendev/system-config master: Revert "Update to tip of master in periodic jobs"  https://review.opendev.org/c/opendev/system-config/+/86285403:59
opendevreviewIan Wienand proposed opendev/base-jobs master: infra-prod: Move project-config reset into base-jobs  https://review.opendev.org/c/opendev/base-jobs/+/86285307:00
*** jpena|off is now known as jpena07:22
*** benj_0 is now known as benj_08:07
*** ShadowJonathan_ is now known as ShadowJonathan08:07
*** andrewbonney_ is now known as andrewbonney08:07
*** walshh__ is now known as walshh_08:07
*** open10k8s_ is now known as open10k8s08:07
*** aprice_ is now known as aprice08:07
*** erbarr_ is now known as erbarr08:07
*** TheJulia_ is now known as TheJulia08:07
*** gouthamr_ is now known as gouthamr08:07
*** eball_ is now known as eball08:07
*** snbuback_ is now known as snbuback08:07
*** PrinzElvis_ is now known as PrinzElvis08:07
*** chateaulav_ is now known as chateaulav08:07
*** odyssey4me_ is now known as odyssey4me08:13
fricklerinfra-root: https://review.opendev.org/c/opendev/system-config/+/862759/1/playbooks/roles/nodepool-base/tasks/main.yaml is broken, the attr must be named "host" instead of "addr"09:58
fricklernoticed this via mail from failing nodepool job09:58
opendevreviewDr. Jens Harbott proposed opendev/system-config master: Fix generated zookeeper config for nodepool  https://review.opendev.org/c/opendev/system-config/+/86287810:01
fricklerunrelated: wheels were last successfully built 8 days ago10:03
*** rlandy_ is now known as rlandy10:35
*** arxcruz is now known as arxcruz|ruck10:36
fricklerlooks like afs rpm issue again https://zuul.opendev.org/t/openstack/build/80ad124ac3cf4933a7b9a381ad2f0b9c10:36
fricklerregarding nodepool, someone might want to check why the failures can be seen in the job log, but the job still passed https://bb6d07e84661d82562d5-daf98166b205b84408724e1df10e75fa.ssl.cf5.rackcdn.com/862759/1/gate/system-config-run-nodepool/4cb5b5b/nl01.opendev.org/docker/nodepool-docker_nodepool-launcher_1.txt11:23
*** dviroel is now known as dviroel|rover11:44
opendevreviewMerged opendev/system-config master: Fix generated zookeeper config for nodepool  https://review.opendev.org/c/opendev/system-config/+/86287812:11
frickleractually, with that patch in, there is still something wrong in the nodepool.yaml generated in gate: the host addresses are empty https://c60c9b47a943d67d7acd-72f68c73c06acdc7229714e8d93d40d1.ssl.cf1.rackcdn.com/862878/1/gate/system-config-run-nodepool/73b5a84/nl01.opendev.org/docker/nodepool-docker_nodepool-launcher_1.txt12:21
fricklerwill have to watch what gets generated by the next periodic job on the live systems12:22
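
Editorial sketch of the check discussed above: one way to catch a generated nodepool config whose ZooKeeper entries use the wrong key or have an empty host. It assumes the config has already been parsed into a dict and that servers are listed under `zookeeper-servers` with a `host` key. The function name, hostnames and port are hypothetical and not taken from system-config.

```python
from typing import Any, Dict, List


def bad_zookeeper_entries(config: Dict[str, Any]) -> List[int]:
    """Return indexes of zookeeper-servers entries lacking a non-empty 'host'."""
    servers = config.get("zookeeper-servers") or []
    return [
        index
        for index, server in enumerate(servers)
        # An entry is bad if it is not a mapping, has no 'host' key,
        # or has a 'host' that is empty.
        if not (isinstance(server, dict) and server.get("host"))
    ]


# Hypothetical parsed config: one good entry, one empty host, one 'addr' key.
example = {
    "zookeeper-servers": [
        {"host": "zk01.example.org", "port": 2281},
        {"host": "", "port": 2281},
        {"addr": "203.0.113.5", "port": 2281},
    ]
}
print(bad_zookeeper_entries(example))
```

A test job could fail whenever the returned list is non-empty, rather than passing while the launcher logs errors.
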
fungiwe can revert https://review.opendev.org/862759 and just go back to relying on the old module for now, worst case12:29
fricklernodepool servers seem to be happy again since 1h, so the issue seems to happen only in CI13:40
fricklerrevert would have been difficult unless we also revert to using old bridge13:41
funginot really. new bridge was already working without 862759, that was merely a performance improvement13:54
fungithe actual problem we hit on the new bridge was related to some unused zk keys which were embedded as raw binary, and were causing encoding problems for ansible13:55
fungithat was fixed by a separate change just prior to 86275913:55
fricklerah, good, do you remember where that fix was? can't find anything matching in system-config14:10
frickleralso this is failing for a couple of days https://zuul.opendev.org/t/openstack/builds?job_name=infra-prod-base&project=opendev/system-config14:10
frickler"usermod: user zuul is currently used by process 1370895" on bridge01 itself14:12
fungifrickler: possible it was an edit to our private group vars, checking...14:14
fungifrickler: yep, that's where it was14:15
fungi"Remove unsued keytab entries" (most recent commit in /etc/ansible/hosts on the new bridge)14:16
fricklerfungi: ah, right, thx14:28
fungianyway, those were what was gumming up the works for newer python, apparently14:29
*** dasm|off is now known as dasm14:58
clarkbre the usermod error that seems similar to the issue I addressed with removing the ubuntu user as part of launch node15:10
clarkbI addressed that by forcing the removal regardless as a subsequent step does a reboot. I doubt we're trying to remove the zuul user there but maybe any modification is tripping over similar?15:10
opendevreviewAndy Ladjadj proposed zuul/zuul-jobs master: fix(packer): prevent task failure when packer_variables is not defined  https://review.opendev.org/c/zuul/zuul-jobs/+/83674415:21
opendevreviewAndy Ladjadj proposed zuul/zuul-jobs master: [upload-logs-base] add ipa extension to known mime types  https://review.opendev.org/c/zuul/zuul-jobs/+/83404515:21
opendevreviewAndy Ladjadj proposed zuul/zuul-jobs master: [upload-logs-base] add android mime-type  https://review.opendev.org/c/zuul/zuul-jobs/+/83404615:22
clarkbfungi: frickler: re nodepool config we should double check that the old code produced different results too. Thinking out loud here I wonder if our inventory in the test jobs has enough of the bits we use in production to produce a working config15:27
opendevreviewAndy Ladjadj proposed zuul/zuul-jobs master: [ensure-python] install python version only if not present  https://review.opendev.org/c/zuul/zuul-jobs/+/77065615:36
*** dviroel|rover is now known as dviroel|rover|lunch15:37
clarkbinfra-root after breakfast I'll get around to deleting gitea-lb01 and jvb02 from their respective clouds15:42
clarkbI'll start by shutting down services on the hosts and letting them sit for about an hour just to make sure there isn't any unexpected fallout then turn them off15:42
clarkbs/then turn them off/then delete them/15:43
*** jpena is now known as jpena|off15:48
fungisounds great, thanks!15:50
clarkbservices are now off15:51
clarkbexpect deletions to occur around 1700 UTC15:51
*** dviroel|rover|lunch is now known as dviroel|rover16:31
clarkbinfra-root the rax dns backup is failing on bridge01, but it is/was also failing on bridge.o.o. Not a regression16:37
clarkbinfra-root I've just realized that both bridges will attempt to run the zuul restart playbook later today16:43
fungid'oh!16:43
clarkbSince bridge.o.o shouldn't be configured automatically anymore, I'm going to manually comment out the crontab entry for the playbook on that server, allowing bridge01 to be the lone zuul restart commander16:43
fungiprobably time that we shut down the old bridge (just not delete it)16:44
fungibut yeah that's a good intermediate step16:44
clarkbcrontab is edited16:44
clarkbya we should probably wait for ianw's monday before doing that?16:44
fungiagreed16:44
clarkbjust in case he feels there are still things that need cross checking16:44
fungithe new bridge is working great though16:44
fungiand now that we've sussed out how to make latest osc work with rackspace volume management, i don't really have anything i need to preserve from the old bridge16:45
clarkbya I've got a bunch of stuff in my homedir but I'm fairly certain none of it is particularly important16:47
clarkbrunning `openstack` for the first time on bridge01 reminds me this is a docker command16:51
clarkbit started doing things I didn't expect at first so I ^C'd16:51
clarkbpersonally I'm not sure how I feel about consuming osc that way16:52
clarkbit's definitely a surprise to have stuff download in response to a server list16:52
clarkband now I'm wondering why python-openstackclient and openstackclient are both on pypi16:54
clarkbpython-openstackclient appears to be the up to date one16:55
clarkbopenstackclient says it is a meta package that installs the same major version of python-openstackclient. It doesn't have new releases so I guess that stopped getting updated16:56
clarkbanyway I'm setting up another venv because in the past that has become necessary. I'm just jumping the gun on that.16:59
clarkband I can list nodes and volumes in vexxhost which is what I need to double check my work deleting gitea-lb0117:00
clarkbinfra-root it's been an hour since I shut down services on jvb02 and gitea-lb01, any objections to deleting them now?17:00
clarkbok gitea-lb01 is deleted, but that didn't auto delete its bfv volume. Going to delete that too17:04
clarkbthe volume doesn't seem to remember the last thing it was attached to which makes deletion difficult if you don't do a volume list first (I did this out of fear this may happen)17:06
clarkbI can't server list against rax17:08
clarkbVersion 2 is not supported, use supported version 3 instead.17:09
clarkbInvalid client version '2.0'. Major part should be '3'17:09
clarkbfungi: do you have this working on bridge01? your comments about the volume stuff make me think that this may be the case17:09
clarkb#status log Deleted gitea-lb01 (e65dc9f4-b1d4-4e18-bf26-13af30dc3dd6) and its BFV volume (41553c15-6b12-4137-a318-7caf6a9eb44c) as this server has been replaced with gitea-lb02.17:10
opendevstatusclarkb: finished logging17:10
clarkbok the issue is the cinder api. Apparently we don't support API v2 in the latest version17:12
clarkbok downgrading to osc<5.0.0 fixes it (5.0 might work too? I'm not sure). It reused the python-cinderclient wheel I already had cached which implies the issue is in osc not cinderclient17:15
clarkb#status log Deleted jvb02.opendev.org (a93ef02b-4e8b-4ace-a2b4-cb7742cdb3e3) as we don't need this extra jitsi meet jvb to meet ptg demands17:18
opendevstatusclarkb: finished logging17:18
clarkbgtema: ^ fyi re client issues17:18
opendevreviewClark Boylan proposed opendev/zone-opendev.org master: Remove gitea-lb01 and jvb02 from DNS  https://review.opendev.org/c/opendev/zone-opendev.org/+/86294117:20
clarkbThe good news is that the new bridge can do this stuff with some minor tweaks17:23
fungiclarkb: latest cli/sdk worked for me by pinning python-cinderclient<817:37
fungiit was cinderclient 8.0.0 which dropped volume v2 api support17:38
clarkbfungi: it works fine with old osc and cinderclient 9.1.017:41
fungithe ~fungi/foo venv on bridge01 is able to volume list, and was built via just `pip install openstackclient 'python-cinderclient<8'`17:41
clarkbit could be that both things are breaking it in different ways but if you change one or the other then it works17:41
clarkboh openstackclient should install the same version that python-openstackclient<5.0.0 installs17:42
clarkbI don't think I'm going to debug this further, I just want to call it out as something that downgrading osc alone seems to have fixed, so it isn't solely cinderclient's issue17:42
fungithat working venv has openstackclient==4.0.0 and python-openstackclient==6.0.017:42
fungialso openstacksdk==0.102.017:43
clarkbmine has python-openstackclient==4.0.2 and python-cinderclient==9.1.0 and openstacksdk==0.102.017:43
clarkbI guess the promise that openstackclient always installs the same major version of python-openstackclient as its major version is wrong17:43
fungiso it's either old osc with new cinderclient, or new osc with old cinderclient?17:43
clarkbya I think so17:44
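
Editorial sketch summarising the combinations reported above as a small lookup rule. It encodes only what was observed in this discussion (python-openstackclient below 5.0.0 worked with python-cinderclient 9.1.0; python-openstackclient 6.0.0 worked with python-cinderclient below 8; 5.0 was not tested). The function names are hypothetical and the rule is an assumption generalised from those reports, not a statement about either project's compatibility policy.

```python
from typing import Optional


def major(version: str) -> int:
    """Return the leading integer of a version string such as '9.1.0'."""
    return int(version.split(".")[0])


def volume_v2_expected_to_work(osc: str, cinderclient: str) -> Optional[bool]:
    """Guess whether volume listing against a v2-only cloud works.

    Returns True for combinations matching those reported as working,
    None for python-openstackclient 5.x with a new cinderclient
    (not tested in the discussion), and False otherwise.
    """
    if major(cinderclient) < 8:
        # Reported: cinderclient 8.0.0 dropped volume v2 API support.
        return True
    if major(osc) < 5:
        # Reported: old osc still worked with cinderclient 9.1.0.
        return True
    if major(osc) == 5:
        # "5.0 might work too? I'm not sure"
        return None
    return False


print(volume_v2_expected_to_work("4.0.2", "9.1.0"))
print(volume_v2_expected_to_work("6.0.0", "9.1.0"))
```
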
fungiclarkb: to provide a different clouds.yaml file path/name with osc you need to export an envvar, right? i remember there was some way to do it but can't find a command-line flag at least18:00
fungii guess i should be looking for the old oscc docs18:21
clarkbyes I think it is an env var. Something like OS_CONFIG_FILE18:41
clarkbI forget the actual name though18:41
clarkbfungi: OS_CLIENT_CONFIG_FILE openstacksdk defines it not osc18:48
fungiaha, thanks!18:49
fungii had taken to grepping through the source because i didn't spot it in the docs18:49
clarkbya it might be worth having osc's docs link to sdk's docs on the subject18:50
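
Editorial sketch of the approach described above: pointing the client at a different clouds file by setting `OS_CLIENT_CONFIG_FILE` in the environment of the command being run. The helper name and the file path are hypothetical; only the environment variable name comes from the discussion.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_clouds_file(
    path: str, base: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with OS_CLIENT_CONFIG_FILE set."""
    # Copy so the caller's environment (or os.environ) is left untouched.
    env = dict(os.environ if base is None else base)
    env["OS_CLIENT_CONFIG_FILE"] = path
    return env


# Hypothetical path; pass the result as env= to subprocess.run, for example
# subprocess.run(["openstack", "volume", "list"], env=env)
env = env_with_clouds_file("/tmp/alt-clouds.yaml", {"PATH": "/usr/bin"})
print(env["OS_CLIENT_CONFIG_FILE"])
```

Building a separate environment dict keeps the override scoped to one command instead of exporting it for the whole shell session.
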
*** dviroel|rover is now known as dviroel|rover|afk20:22
*** dasm is now known as dasm|off20:23
clarkbpart of me wants to start upgrading gitea backends to jammy now, but I think waiting for the openssl vuln to be fixed is a good idea21:02
*** dviroel|rover|afk is now known as dviroel|rover21:34
*** dviroel|rover is now known as dviroel|out21:36
fungioh, is jammy already openssl 3.x?21:45
clarkbI think so21:47
fungilooks like it is, yeah21:52
fungi3.0.2-0ubuntu1.621:52
