Tuesday, 2022-11-22

opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt: build txt record lists betterer  https://review.opendev.org/c/opendev/system-config/+/86520300:02
opendevreviewIan Wienand proposed opendev/system-config master: system-config-run-gitea: use standard bridge host  https://review.opendev.org/c/opendev/system-config/+/86520400:02
opendevreviewIan Wienand proposed opendev/system-config master: [wip] Bump bridge ansible to 6.6.0  https://review.opendev.org/c/opendev/system-config/+/86519500:02
ianwsystem-config-run-gitea : " ERROR: Could not find a version that satisfies the requirement setuptools>=61.0 " ... what is this all about :/00:07
fungiwhich package is requiring that?00:08
clarkbianw: is that for the ansible 6 upgrade change?00:08
fungidoes newer ansible need bleeding edge setuptools?00:08
clarkbits almost certainly going to be the module shebang stuff00:08
clarkbfungi: the problem is that we did a thing where we used /usr/bin/env python or /usr/bin/env python3 in our module files but ansible blows up on that00:09
clarkband it tries to run the modules against different python and things break00:09
fungioh it's trying to run with python 2.7 which has a too-old setuptools maybe (or no setuptools at all)00:09
clarkbin this case ansible is being run out of a virtualenv on bridge but then executing the module using system python and things are unhappy00:09
clarkbwell I think python3 in giteas case but yes00:10
fungioic00:10
ianwno actually this was a -2 on https://review.opendev.org/c/opendev/system-config/+/864600/, just that little doc update00:10
clarkbinteresting it failed to install the node launcher00:11
ianwmaybe it is transient, i always get paranoid when setuptools/pip turn up in a trace 00:12
clarkbianw: its bridge99 complaining00:13
clarkband it has python3.6 and appears to be bionic.00:13
clarkbI wonder if the problem is simply an old node specification for bridge in that job?00:13
clarkb(and the new python package for bridge requires modern python which isn't found on bionic)00:13
clarkbya bridge is bionic on that job00:14
ianwyeah, i fixed that in https://review.opendev.org/c/opendev/system-config/+/86520400:14
ianwi wonder how it's been working ...00:14
clarkbianw: but that fix isn't before the change that failed00:15
clarkbI suspect we just don't run system-config-run-gitea often enough to have hit that problem00:15
clarkbhttps://review.opendev.org/c/opendev/system-config/+/865204/2 seems like something that shouldn't be in a chain of things and should be an independent fix? +2 anyway00:15
ianwbut interestingly it did pass in the check00:19
ianwyeah, i agree, i can split it out now00:20
ianwansible 6 is picking up some list constructions that i have no idea how they used to work :)00:20
clarkbianw: I think we only just landed the launch env change00:21
clarkbpossible when the check pass ran the launch env didn't exist on bridge yet00:21
ianwoh yeah, *that* is probably it00:23
ianwthat makes sense00:23
opendevreviewIan Wienand proposed opendev/system-config master: system-config-run-gitea: use standard bridge host  https://review.opendev.org/c/opendev/system-config/+/86520400:24
opendevreviewIan Wienand proposed opendev/system-config master: system-config-run-gitea: use standard bridge host  https://review.opendev.org/c/opendev/system-config/+/86520400:26
opendevreviewIan Wienand proposed opendev/system-config master: borg-backup-server: build borg users betterer  https://review.opendev.org/c/opendev/system-config/+/86520200:26
opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt: build txt record lists betterer  https://review.opendev.org/c/opendev/system-config/+/86520300:26
opendevreviewIan Wienand proposed opendev/system-config master: [wip] Bump bridge ansible to 6.6.0  https://review.opendev.org/c/opendev/system-config/+/86519500:26
*** rlandy|rover|biab is now known as rlandy|rover00:33
ianwi seem to have written a lot of things that don't really evaluate to lists but seem to work :)00:59
fungiyou've mastered dwim programming01:00
ianwhttps://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/letsencrypt-request-certs/tasks/main.yaml#L30 in hindsight probably isn't the best way to do things01:05
*** rlandy|rover is now known as rlandy|out01:08
*** tkajinam is now known as Guest216202:04
opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt: build txt record lists betterer  https://review.opendev.org/c/opendev/system-config/+/86520302:30
opendevreviewIan Wienand proposed opendev/system-config master: [wip] Bump bridge ansible to 6.6.0  https://review.opendev.org/c/opendev/system-config/+/86519502:30
opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt-request-certs: refactor certcheck list  https://review.opendev.org/c/opendev/system-config/+/86521802:30
opendevreviewMerged opendev/system-config master: system-config-run-gitea: use standard bridge host  https://review.opendev.org/c/opendev/system-config/+/86520402:40
*** tkajinam is now known as Guest217403:28
opendevreviewIan Wienand proposed opendev/system-config master: [wip] Bump bridge ansible to 6.6.0  https://review.opendev.org/c/opendev/system-config/+/86519504:02
opendevreviewIan Wienand proposed opendev/system-config master: gitea-git-repos: remove #!/usr/bin/env python  https://review.opendev.org/c/opendev/system-config/+/86522404:02
*** yadnesh|away is now known as yadnesh04:03
*** ysandeep|out is now known as ysandeep04:49
*** ysandeep is now known as ysandeep|ruck04:49
*** pojadhav- is now known as pojadhav05:03
opendevreviewMerged opendev/system-config master: opendev.org: add status update links  https://review.opendev.org/c/opendev/system-config/+/86460005:24
*** tkajinam is now known as Guest217905:36
StutiArya[m]fungi: can you please share the link for #openstack-qa channel that you mentioned05:38
*** pojadhav- is now known as pojadhav05:57
*** ysandeep__ is now known as ysandeep|ruck06:53
opendevreviewIan Wienand proposed opendev/system-config master: opendev.org: close <li> tag properly  https://review.opendev.org/c/opendev/system-config/+/86523307:19
*** yadnesh is now known as yadnesh|afk08:20
*** jpena|off is now known as jpena08:22
*** ysandeep|ruck is now known as ysandeep|ruck|brb08:37
*** yadnesh|afk is now known as yadnesh08:51
*** jpena is now known as jpena|off08:56
*** jpena|off is now known as jpena08:57
fricklerinfra-root: do we have a policy to clean up unused groups in gerrit? or do we just ignore them? like https://review.opendev.org/c/openstack/project-config/+/814597/2 left kolla-cli-core unused. bit confusing while searching for active kolla groups IMO08:59
*** ysandeep|ruck|brb is now known as ysandeep|ruck09:24
*** ysandeep|ruck is now known as ysandeep|ruck|brb10:53
*** ysandeep|ruck|brb is now known as ysandeep|ruck11:09
*** dviroel|afk is now known as dviroel11:20
*** rlandy|out is now known as rlandy|rover11:23
*** dviroel_ is now known as dviroel11:38
*** dviroel_ is now known as dviroel12:16
fungifrickler: long ago i would empty the groups, then rename them with a prefix and then set them to hidden, but i haven't done so for a long time12:42
fungiand it looks like our options for cleaning those up haven't really changed. gerrit still doesn't seem to have a mechanism for deleting groups once they're created12:52
fungiprobably we could run a (complex) query to generate a list of groups which are neither referenced from an acl directly nor transitively through inclusion in another group, and then bulk rename and hide them12:57
fungiand i'm not finding any official gerrit plugins for group deletion either13:00
fungithough i did happen across this one which has account deletion: https://gerrit.googlesource.com/plugins/account/13:00
*** ysandeep|ruck is now known as ysandeep|ruck|afk13:07
fricklerseems someone started some time ago but then gave up (see the linked review, too) https://www.gerritcodereview.com/design-docs/delete-groups-solution-plugin.html13:07
fricklerthis is also related https://duckduckgo.com/l/?uddg=https%3A%2F%2Fwww.gerritcodereview.com%2Fdesign%2Ddocs%2Fdelete%2Dgroups%2Dconclusion.html&rut=56e19180877adfa2ce096f0eee57023cbfd501c57e808cc4734b42a415aa8cb513:07
fricklermeh, should've been without that bird wrapper13:08
fricklerhttps://www.gerritcodereview.com/design-docs/delete-groups-conclusion.html13:08
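The query fungi describes above (groups neither referenced from an ACL directly nor transitively through inclusion in another group) amounts to a reachability check. A minimal sketch, assuming the group list, the ACL references and the inclusion map have already been exported from Gerrit by some other means; the function name and the sample data other than kolla-cli-core are hypothetical:

```python
from collections import deque


def find_unreferenced_groups(all_groups, acl_groups, includes):
    """Return groups that no ACL reaches, directly or via inclusion.

    all_groups: every group name known to the server
    acl_groups: groups named directly in some project ACL
    includes:   mapping of group -> groups it includes as members
    """
    reachable = set()
    queue = deque(acl_groups)
    while queue:
        group = queue.popleft()
        if group in reachable:
            continue
        reachable.add(group)
        # Anything included by a referenced group is referenced too.
        queue.extend(includes.get(group, ()))
    return sorted(set(all_groups) - reachable)


# Hypothetical sample data for illustration only.
groups = ["kolla-core", "kolla-release", "kolla-cli-core"]
acls = ["kolla-core"]
includes = {"kolla-core": ["kolla-release"]}
print(find_unreferenced_groups(groups, acls, includes))
```

The result would be the candidate list for the bulk rename and hide step; it says nothing about whether a group still has members.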
frickleroh, nice, the email address parser in gerrit is broken, it makes some.one@gmail.com split into two http URLs separated by a normal "@" https://review.opendev.org/c/openstack/requirements/+/86503013:26
fungiinfra-root: the new mailman3 import i started last night based on the latest fix resulted in no more django path warnings, and my cleanup on the production list footer fields seems to have eliminated those template conversion errors too. now the only errors are for two hidden/private lists which were configured not to create archives (and in both cases the error is that there was no archive to13:30
fungiimport)13:30
fungiif anyone wants to poke around at the held node, it's 104.130.140.226 just be aware that if you want to test authenticated features like list configuration/moderation or altering subscription preferences, you'll need to "reset" your password and then fish the corresponding token out of a deferred message in exim's queue, since we're blocking outbound delivery from the test node13:31
fungiif this looks good, we're probably clear to merge the topic:mailman3 changes (except for the "dnm" one on top of course)13:33
fungiafter i do some spot checks myself, i'm going to start working on booting the new production server and checking its addresses against blocklists (filing exception/removal requests as needed)13:34
*** ysandeep|ruck|afk is now known as ysandeep|ruck13:53
*** dasm|off is now known as dasm13:55
*** frenzy_friday is now known as frenzy_friday|doc14:43
Clark[m]fungi: should we squash the image change into the parent or maybe flip the order so that we build images first and then deploy them?15:09
*** pojadhav is now known as pojadhav|afk15:32
fungiClark[m]: i'm still a little too fuzzy on the container building and consuming workflow to know which of those is preferable. i figured the current sequence allows us to isolate the forking effort so that it's theoretically easier to roll back off of later if we want to?15:35
Clark[m]fungi: the main thing is updating the docker compose file and then removing the other bits. I agree having a separate commit simplifies things15:51
fungionly thing odd of note is that the exim queue had a bunch of notices to subscribers that their subscriptions had been disabled due to bounces. i *think* it's because those subscribers were set to get digests, and exim was prevented from delivering them, so working as designed if so. but hard to be entirely certain of that16:03
fungiif we configured iptables to drop instead of reject outbound smtp, then i'd still have the originals in the queue in order to say for sure16:04
fungioh, though maybe i can tell from the delivery log16:05
fungialso i was wrong about the password reset workflow. it appears users are not precreated by the import script (or maybe that changed in a more recent version than we originally tested), so i needed to click "sign up" and create an account on one of the list sites and then confirm the account creation by following the link from the outbound message stuck in exim's queue16:07
fungibut once i logged in with that new account, it had my existing subscriptions and owner/moderator status for various lists already linked16:08
fungiand i was able to log into another list site with the same username/password after that16:08
fungiconfirming that account creation is still global across all the list sites (but login state is not, just for the record, so you still need to log into each site separately to make changes)16:09
*** yadnesh is now known as yadnesh|away16:12
*** dviroel is now known as dviroel|lunch16:19
*** dasm is now known as dasm|off16:23
clarkbfungi: ya I think at this point I'm happy to land the changes and start pushing towards a real host. Your testing has been fairly extensive and we've caught a fair bit of stuff but I expect we're about at the limit of what is reasonable for our testing to cover16:32
fungialso we're running clean on the absolute latest release of mailman from a few weeks ago, which is quite exciting16:33
clarkbI hesitate to +2 the changes since I wrote a fair bit of them. But I assume if ianw, corvus, and/or frickler don't object as a third vote we can proceed?16:34
fungiyeah, i mean, they've been up for review for months and discussed ad nauseam, so if anyone was going to object to the implementation/design choices they've had ample opportunity to do so16:36
fungiand we seemed to have consensus on overall direction for this effort16:36
fungias we're following a spec we collectively approved last year16:36
*** hjensas is now known as hjensas|afk16:57
*** marios is now known as marios|out17:00
*** dviroel|lunch is now known as dviroel17:00
corvusi say proceed :)17:00
fungithanks!17:01
*** ysandeep|ruck is now known as ysandeep|out17:04
fungii went ahead and approved the two changes. they don't alter current production services, so there's still time to make further adjustments before initial import maintenance for the opendev and zuul list sites anyway17:06
*** jpena is now known as jpena|off17:33
*** frenzy_friday|doc is now known as frenzy_friday17:36
opendevreviewMerged opendev/system-config master: Add a mailman3 list server  https://review.opendev.org/c/opendev/system-config/+/85124818:00
fungionce the other shoe drops, i'll start launching the server18:02
fungitrying out ianw's shiny new launcher package18:02
fungilooks like it's /usr/launcher-venv/bin/launch-node on bridge0118:05
clarkbthe readme was updated as part of the change but ya that sounds right18:07
fungiinterestingly, the osc/sdk versions installed in that venv don't work with rackspace's volume api18:08
fungi$ sudo /usr/launcher-venv/bin/openstack --os-cloud=openstackci-rax --os-region-name=DFW volume list18:08
fungiNo module named 'cinderclient.v2'18:08
fungithough works with the versions of things used by ~fungi/foo/bin/openstack (just gives a deprecation warning for cinder v2)18:09
clarkbI think we half expected we might need to adjust those18:09
fungimy venv was populated with `pip install openstackclient 'python-cinderclient<8'`18:10
opendevreviewMerged opendev/system-config master: Fork the maxking/docker-mailman images  https://review.opendev.org/c/opendev/system-config/+/86015718:11
fungilooks like my venv has python-openstackclient==6.0.0 vs 4.0.2 in the global one, but i have older python-cinderclient==7.4.1 instead of 9.1.018:13
clarkbI think those versions came from my venv on bridge01 ~clarkb/oldoscenv and those worked. But looking in my history they may have only worked against server commands and not volume commands. I thought I had tested both18:14
clarkbI'm happy to update to match yours if it works18:15
fungimy venv only backdates the cinderclient version sufficiently to not reach v2 support removal and otherwise works with latest versions of anything pip is able to install with that18:16
opendevreviewJeremy Stanley proposed opendev/system-config master: Improve launch-node dependency versions  https://review.opendev.org/c/opendev/system-config/+/86532018:25
fungiclarkb: ianw: ^18:25
fungi~fungi/foo has been rebuilt now installing the launch package with that patch applied, and i can openstack volume list from rax dfw just fine18:28
fricklerhttps://review.opendev.org/c/openstack/neutron-tempest-plugin/+/857031 looks weird to me, why was that recheck needed? the devstack patch merged 3 days ago, I assumed zuul to trigger gate for projects in the same (integrated) queue automatically18:31
clarkbyou probably need to look at the zuul scheduler logs from the 19th18:34
clarkbthe way it should work is when the parent change enqueues it also enqueues the child18:34
fungieven for cross-project dependencies?18:35
fungii thought it only knew to do that for git relationships18:35
clarkbyes I think so. However, it may rely on the change cache and the child was approved on the 16th which was possibly long enough for it to lose that change in the cache18:36
fungilooks like the change it depends-on was approved two days later18:37
clarkblooking at zuul really quickly we seem to track this on the child side so ya if the child isn't active any longer then maybe we lose that info18:40
fungi857031 was approved 2022-11-16 13:05z, then the change it depends-on (860795) was approved 2022-11-18 10:06z and was immediately rejected because it in turn depends-on another change with which it doesn't share a change queue18:40
fungithen someone blindly rechecked 860795 thinking zuul must be confused about that fact and it was immediately rejected again for the same reason18:41
clarkbthough there is getChangesDependingOn which does a Gerrit query18:41
fungiit was approved again and enqueued into the gate at 2022-11-19 20:11, but 857031 did not get automatically enqueued at the same time (i'm still not 100% sure i've ever seen it add previously-approved depends-on changes into the queue but i probably haven't paid close attention)18:43
clarkbya I think it tries to. getChangesDependingOn is part of that18:44
fungialso note that there was a zuul rolling upgrade between those, if that matters at all18:45
clarkbfrom zuul01 2022-11-19 20:11:12,812 DEBUG zuul.Pipeline.vexxhost.gate: [e: e3203a0382ea4077a21d6c60c0d76cf2]   No changes need <Change 0x7f006cb75720 openstack/devstack 860795,918:51
clarkber thats the wrong pipeline18:51
funginew problem with launch-node...18:53
fungiopenstack.exceptions.BadRequestException: BadRequestException: 400: Client Error for url: https://dfw.servers.api.rackspacecloud.com/v2/610275/servers, Bad networks format18:53
clarkbfungi: that might be why I used older osc18:54
clarkbwe might need older of both things I guess18:54
funginope, same with the current version in /usr/launcher-venv18:54
fungii tried both that venv and mine just to be sure18:54
fungisudo /usr/launcher-venv/bin/launch-node --cloud=openstackci-rax --region=DFW --flavor="8GB Standard Instance" lists01.opendev.org18:55
fungithat's what i ran18:55
clarkbfrickler: https://paste.opendev.org/show/b6P1Uy2VjAMZOqtBbwyz/ that is zuul deciding that there is one tempest change following that devstack change but that it isn't ready to gate. Your change doesn't show up in that list18:56
clarkbfrickler: so zuul didn't find the change for one reason or another. Maybe the gerrit query didn't return it in the list for some reason?18:56
clarkbit does seem to if I manually construct the gerrit query so not sure why that would happen18:57
clarkbfungi: I think ianw and corvus  ran into this too fwiw on old bridge and they sorted it out there so they may have input18:57
fungiwondering if i need to pass some specific --network label18:58
clarkbfungi: no, rax doesn't do networks18:59
clarkbit's a client/sdk/something issue where it thinks it has to do that but in reality you don't iirc18:59
clarkbfrickler: the depends on string for the change it found is identical to the one in your change18:59
clarkbhttps://review.opendev.org/q/message:%257BDepends-On:+https://review.opendev.org/c/openstack/devstack/%252B/860795%257D is the query it should run I think19:00
clarkbcorvus: ^ any idea why zuul didn't find https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/857031 when https://review.opendev.org/c/openstack/devstack/+/860795 was approved but did find https://review.opendev.org/c/openstack/tempest/+/861110/ per https://paste.opendev.org/show/b6P1Uy2VjAMZOqtBbwyz/19:03
fungiproblem #1 trying with the same package versions as corvus-env on the old bastion, new bastion python appears to be incompatible with openstacksdk 0.41.0:19:27
fungiAttributeError: module 'collections' has no attribute 'MutableMapping'19:27
clarkbya you need 0.99.0 or newer for some of those python fixes iirc. There is another one that impacts python 3.1119:28
clarkber I guess the 3.11 issue was addressed in 0.99.0. Not sure about that one19:28
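The AttributeError above is the usual symptom of code that still reaches for the old `collections.MutableMapping` alias, which Python 3.10 removed in favour of `collections.abc.MutableMapping`. A minimal sketch of the portable spelling; the `Settings` class is hypothetical and is not taken from openstacksdk:

```python
import collections.abc


class Settings(collections.abc.MutableMapping):
    """Tiny dict-backed mapping, only here to exercise the import."""

    def __init__(self):
        self._data = {}

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


settings = Settings()
settings["region"] = "DFW"
print(dict(settings))
```

The `collections.abc` name has existed since Python 3.3, so this spelling works on both old and new interpreters.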
fungi0.51.0 seems to work19:31
clarkbfungi: I think you need ~0.50.0 for that fix19:32
fungi0.50.0 did not19:32
fungianyway, if that works, i'll see how far forward i can wind it19:32
fungiFileNotFoundError: [Errno 2] No such file or directory: '/var/cache/ansible/inventory'19:34
fungithat does indeed not exist on bridge0119:34
clarkbfungi: launch node is simply removing cache entries19:35
clarkbso that may be something we can ignore now? but if ansible caches elsewhere we would need to update the path19:36
fungiwell, "ignore" will need patching i think. at least it seems to imply that's why the command failed19:36
clarkbyes you'd need to update the tool19:37
clarkbI'm just pointing out that the code deletes contents of that dir so if the dir doesn't exist we're already winning :)19:37
fungiyep19:38
fungii've wrapped that in a try/except for FileNotFoundError so seeing if it gets farther19:39
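A minimal sketch of the kind of guard described above, assuming the cleanup simply unlinks the files found in the cache directory. The function name and return value are hypothetical and this is not the actual launch-node code:

```python
import os


def clear_inventory_cache(cache_dir):
    """Delete cached inventory files; return how many were removed."""
    removed = 0
    try:
        entries = os.listdir(cache_dir)
    except FileNotFoundError:
        # No cache directory means there is nothing stale to clear.
        return removed
    for name in entries:
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path):
            os.unlink(path)
            removed += 1
    return removed


if __name__ == "__main__":
    # Path taken from the traceback above; absent on bridge01.
    print(clear_inventory_cache("/var/cache/ansible/inventory"))
```

Catching only FileNotFoundError keeps other failures, such as a permissions problem, visible instead of silently skipping the cleanup.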
fungilooks like launch-node is expecting to find ansible playbooks relative to but outside the virtualenv, or something like that19:50
fungiERROR! the playbook: /home/fungi/bar/lib/python3.10/site-packages/opendev_launch/../playbooks/set-hostnames.yaml could not be found19:50
clarkbya line 215 ish of launch node script19:50
clarkbthat should probably just be hardcoded to the zuul managed repo paths19:51
clarkbthe upside to relative paths in the past was it allowed you to make local edits19:51
fungior get embedded as data files19:51
clarkbfungi: I'm never certain if that is safe because ansible looks content up relative to the playbook file path19:51
clarkbfungi: I would be wary of doing that unless you vendor in an entire system-config19:51
fungioh, right, we have to consider not just python but also ansible there19:52
clarkbcan just default to the zuul path and then make it overrideable with a command line option if we find we need to do local edits of playbook19:52
fungias best as i can tell, this would break for /usr/launcher-venv/bin/launch-node as well19:53
clarkbyes19:54
clarkbthats why I suggest just setting it to /home/zuul/src/opendev.org/opendev/system-config/playbooks/set-hostnames.yaml and so on19:54
clarkband a flag to change the prefix would be a good addition too if we decide using different sets of playbooks is useful19:55
clarkbanother option would be to not install the script to the venv as a command19:55
clarkband instead curate a venv to run it out of but expect users to invoke the script out of the launch dir as before19:56
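The first suggestion above (default to the zuul-managed checkout and allow an override) could look roughly like this. It is a sketch only: the `--playbook-dir` flag name and the helper functions are hypothetical, and the real launch-node script takes more options than shown here:

```python
import argparse
import os

# Zuul-managed system-config checkout on bridge, per the discussion above.
DEFAULT_PLAYBOOK_DIR = (
    "/home/zuul/src/opendev.org/opendev/system-config/playbooks"
)


def build_parser():
    parser = argparse.ArgumentParser(description="launch a node (sketch)")
    parser.add_argument("name", help="hostname of the server to launch")
    # Hypothetical flag: lets an operator point at locally edited playbooks.
    parser.add_argument(
        "--playbook-dir",
        default=DEFAULT_PLAYBOOK_DIR,
        help="directory containing the playbooks to run",
    )
    return parser


def playbook_path(playbook_dir, playbook):
    """Build the full path to a named playbook."""
    return os.path.join(playbook_dir, playbook)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(playbook_path(args.playbook_dir, "set-hostnames.yaml"))
```

Pointing at a full checkout rather than packaging the playbooks as data files sidesteps the concern that ansible resolves other content relative to the playbook's location.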
fungilooks like it's probably getting a lot further now that i've embedded the full playbook path19:58
fungioh, yeah it made it past the reboot19:59
ianw(thanks for looking at this, i figured it might not 100% work.  sorry it's on my todo list to launch nodes once it merged, but i got sidetracked by the ansible update stuff)19:59
fungihey no sweat, you did the heavy lifting20:00
fungii'll add these tweaks to 86532020:00
fungiand try to work out the latest actually viable package set which can do openstack server create and openstack volume list20:01
fungifor rackspace specifically20:01
clarkbfungi: looking in /var/cache/ansible I think we may want to delete host specific facts20:02
clarkbbut that is only an issue if we reuse names which we do far less of today20:02
clarkbso ya I think just not deleting things is fine20:02
fungiwe sort of reuse hostnames if we have to keep retrying to boot the same machine over and over to test the launch tooling ;)20:02
fungithis launch is still progressing, so seems it's probably working. just need to work out a more exact formula for the package versions to install into the launch venv now20:15
opendevreviewMerged opendev/system-config master: opendev.org: close <li> tag properly  https://review.opendev.org/c/opendev/system-config/+/86523320:20
fungilaunching this server has already taken 30 minutes. hopefully it's close to done applying our configuration20:26
fungilooks like it's outputting syslog/journald messages instead of ansible progress. just constant spam from systemd and dbus-daemon20:28
fungiand multipathd20:28
fungithis seems to be TASK [Run unattended-upgrade on debuntu] which has been going for nearly 25 minutes now20:29
fungialso it's dawned on me that i didn't tell it to boot a jammy image so it's using focal20:30
fungijust as well, i was expecting to test this at least once more after i settle on newer python package versions20:30
*** dasm|off is now known as dasm20:31
fungiit finished! roughly 40 minutes start to end20:34
fungiianw: is there a patch to the launch script to make it include host keys in the copy-paste blob for the inventory at the end?20:34
fungijust happened to notice that wasn't included20:34
*** dviroel is now known as dviroel|afk20:37
clarkbfungi: re unattended upgrades that ensures that base image is up to date before we reboot and try to configure stuff on it. But yes it can take some time occasionally20:38
fungiyep, just didn't imagine it would take that long... then again this was focal, and possibly a very old one20:38
fungiinteresting. rackspace's jammy image insists we supply keypair data, but the focal image doesn't20:40
fungiThe requested image '...' requires remote authentication credentials to be passed to the image, but no supported credentials were found within the request. Supported methods: 'ssh_keypair'. (HTTP 400)20:40
clarkbcorvus: frickler: I've tried to manually step through what zuul would do to find that dependency. So far everything seems to work as expected. One thing I noticed is that the commit used Depends-on instead of Depends-On but we ignore case in our re matching20:44
clarkbwe know it found one following change which means it didn't shortcut due to change.uris being empty20:45
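To illustrate why the Depends-on versus Depends-On spelling should not matter, here is a minimal sketch of case-insensitive footer matching. The pattern and function name are assumptions made for illustration and are not copied from zuul; the Change-Id value is made up:

```python
import re

# Assumed pattern: one dependency footer per line, any letter case.
DEPENDS_ON_RE = re.compile(
    r"^Depends-On: (.*?)\s*$", re.MULTILINE | re.IGNORECASE
)


def find_dependency_headers(message):
    """Return the target of every Depends-On footer in a commit message."""
    return DEPENDS_ON_RE.findall(message)


message = (
    "Fix a thing\n"
    "\n"
    "Depends-on: https://review.opendev.org/c/openstack/devstack/+/860795\n"
    "Change-Id: I0000000000000000000000000000000000000000\n"
)
print(find_dependency_headers(message))
```

With re.IGNORECASE both spellings yield the same URL, which fits the observation that the footer's case was not what hid the change.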
fungihuh, so openstackclient can do a server create just fine with newer sdk, but the launch script seemingly cannot20:46
clarkbfungi: we do pass a keypair though20:48
clarkbits a throwaway key that we remove from both ends once bootstrapping is done20:48
clarkbcorvus: frickler: there is gerrit io logging but it seems we disable this (not surprising it would be very verbose)20:49
clarkbbut I'm beginning to think we got an incomplete response from gerrit20:49
clarkbmaybe the gerrit indexes were stale for some reason20:49
clarkbeither that or the gerrit query was case sensitive (though in my testing via the rest api it seems that it isn't)20:49
ianwfungi: not yet (on the ssh keys)20:59
clarkbI think there are one of two explanations. First is that gerrit just returned less data than we expect for some reason so zuul wasn't aware20:59
clarkbsecond is that our query is returning no data for some reason (encoding, etc) and we're relying on the cached info about changes and that tempest change was live when the other change was approved21:00
clarkbBut I have no hard evidence of this as all my testing returns the results we expect21:00
*** join_eggdreamnft is now known as \join_subline21:01
clarkbI'm going to step away from depends on debugging now to look at python weirdness on jammy with ceph in tempest jobs21:01
corvusclarkb: i think there's a third possibility which is some error updating the change cache (missed an update, race, etc), but that seems unlikely, especially if you didn't see any indications it wrote an updated change value in response to that query.21:07
corvusclarkb: any extra debug lines you think we should add in case it happens again?21:07
corvus(if you think the only additional info would be the gerrit io log, then that's probably the end of the line, since i don't think we want to turn that on in production)21:08
ianwhttps://fosstodon.org/@opendevinfra has a green tick now, which is kinda cool21:09
clarkbcorvus: the logs I have appear to be a complete accounting for that processing so ya I don't think it had cache errors (we do log those iirc)21:14
ianwclarkb: "the input device is not a TTY" <- does this seem familiar?  istr you having some sort of issue with it with a recent jammy upgrade21:15
clarkbcorvus: maybe we should log following changes found by getChangesDependingOn separately from those already on the change object? then we'd have a clearer idea if the problem is talking to gerrit or our cache?21:15
clarkbianw: is that with docker?21:15
clarkbianw: when I ran into it it was with docker -it and it was a bug in docker. Upgrading docker fixes it (it took them a bit of time to make a fixed release though so at the time we just waited)21:16
ianwyes, this is with https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8a4/865195/7/check/system-config-run-gitea/8a45d93/bridge99.opendev.org/ara-report/results/523.html21:18
ianwwhich is where we poke at the giteadb with a mariadb call to update logos21:18
ianwbut I guess this only appears when using ansible 6 on the bridge21:19
clarkbpreviously the issue was in docker itself and could be worked around by not doing interactive sessions. Just running commands which can be clunky21:20
ianwand it's talking to, and running this, on the gitea host which is bionic21:20
clarkbcorvus: ya I think maybe a bit of split up logging in enqueueChangesBehind() might help differentiate where we are missing info21:20
ianwclarkb: hrm, yeah in this case it's "/usr/local/bin/docker-compose -f /etc/gitea-docker/docker-compose.yaml exec  mariadb"21:21
corvusclarkb: kk ping me if you write that21:21
clarkbianw: by default docker-compose asks for a tty iirc which is the inverse of `docker`21:21
clarkbianw: I think the flag is -T? to disable it?21:21
opendevreviewIan Wienand proposed opendev/system-config master: borg-backup-server: build borg users betterer  https://review.opendev.org/c/opendev/system-config/+/86520221:26
opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt-request-certs: refactor certcheck list  https://review.opendev.org/c/opendev/system-config/+/86521821:26
opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt: build txt record lists betterer  https://review.opendev.org/c/opendev/system-config/+/86520321:26
opendevreviewIan Wienand proposed opendev/system-config master: gitea-git-repos: remove #!/usr/bin/env python  https://review.opendev.org/c/opendev/system-config/+/86522421:26
opendevreviewIan Wienand proposed opendev/system-config master: [wip] Bump bridge ansible to 6.6.0  https://review.opendev.org/c/opendev/system-config/+/86519521:26
opendevreviewIan Wienand proposed opendev/system-config master: gitea-set-org-logos: use -T on mariadb command  https://review.opendev.org/c/opendev/system-config/+/86533921:26
opendevreviewIan Wienand proposed opendev/system-config master: borg-backup-server: build borg users betterer  https://review.opendev.org/c/opendev/system-config/+/86520221:26
opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt-request-certs: refactor certcheck list  https://review.opendev.org/c/opendev/system-config/+/86521821:26
opendevreviewIan Wienand proposed opendev/system-config master: letsencrypt: build txt record lists betterer  https://review.opendev.org/c/opendev/system-config/+/86520321:26
opendevreviewIan Wienand proposed opendev/system-config master: gitea-git-repos: remove #!/usr/bin/env python  https://review.opendev.org/c/opendev/system-config/+/86522421:26
opendevreviewIan Wienand proposed opendev/system-config master: gitea-set-org-logos: use -T on mariadb command  https://review.opendev.org/c/opendev/system-config/+/86533921:26
opendevreviewIan Wienand proposed opendev/system-config master: [wip] Bump bridge ansible to 6.6.0  https://review.opendev.org/c/opendev/system-config/+/86519521:26
ianwsorry, rebased to master instead of origin/master :/  anyway, there's a chance that's a complete stack for ansible 6 support21:27
*** dasm is now known as dasm|off22:25
fungiokay, through trial and error i've determined that launch-node will work with rackspace avoiding that network response parsing exception if i pin openstacksdk<0.99 (which results in installing 0.61.0)22:33
fungithe problem arises in 0.99.022:34
opendevreviewIan Wienand proposed opendev/system-config master: Bump bridge ansible to 6.6.0  https://review.opendev.org/c/opendev/system-config/+/86519522:35
opendevreviewIan Wienand proposed opendev/system-config master: bridge: Use any 6.X Ansible release  https://review.opendev.org/c/opendev/system-config/+/86534522:35
ianwfungi: do you think it's worth bisecting?22:36
fungimaybe, but for now i've at least got a patch i can push with accumulated launch-node fixes22:36
clarkbfungi: and openstacksdk 0.102.0 also errors?22:37
fungiyes22:37
clarkbfungi: I ask because I was hoping to update nodepool but I think that means nodepool can't talk to rax if we do :/22:37
fungii started at openstacksdk<1 and then wound backward by version until i hit the last one which avoided the error22:38
ianwdo we have a theory?  if not, i can make some time to bisect it down to a change ... especially if we can pinpoint it for nodepool too22:38
opendevreviewJeremy Stanley proposed opendev/system-config master: Improve launch-node deps and fix script bugs  https://review.opendev.org/c/opendev/system-config/+/86532022:41
fungiianw: i have no theory, other than something in 0.99.0 probably started expecting modern network information returned in the server create response which doesn't match what's coming back from rackspace22:42
fungialso 0.99.0 was the merge of a feature branch, if memory serves, so may not be trivial to bisect22:42
fungianyway, 865320 is what i installed in a venv in my homedir on bridge01 and that seems to be working start to finish to launching a jammy node for the new lists server22:46
ianw++22:47
ianwi'll see if i can make a small replicator ...23:00
*** rlandy|rover is now known as rlandy|out23:09
clarkbfungi: did you want to check my comments on the launch node change and approve if you feel like those updates are too much? we can always do them in followups as we launch more nodes23:29
fungii'll take a look in a bit23:33
clarkbianw: question about https://review.opendev.org/c/opendev/system-config/+/865218/3/playbooks/roles/letsencrypt-request-certs/tasks/main.yaml you updated the example to show a port (8000) and the construction seems to default to :443. I thought certs didn't specify a port or service though?23:36
clarkboh wait this is just for verification not generation I get it23:36
ianwclarkb: yep, that's right -- it's just what goes in the ssl check file23:37
ianwthe first regex should turn anything with ":<port>" into " <port>" and the second should be 'if this doesn't have a space in it, then add " 443"'23:38
ianwi half considered dropping into python with a lib file for it, but i think it *just* meets criteria of being understandable :)23:38
clarkbianw: and the second regex matches because there are no spaces if we didn't replace : with ' ' ? Otherwise we don't match because the $ interferes with our trailing space?23:43
clarkbleft a small nit on it (use + instead of *) but otherwise that lgtm23:43
ianwright; there shouldn't be any trailing spaces.  i guess we could double check that with a |trim23:45
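The two-step rewrite described above can be sketched in Python as follows. The change itself does this with Ansible filters, so the patterns here are an approximation rather than the ones under review, and the hostname is hypothetical:

```python
import re


def certcheck_line(entry):
    """Turn "host[:port]" into the "host port" form of the check file."""
    # Trim first, matching the |trim idea, so stray spaces cannot
    # defeat the "no space means no port" test below.
    line = entry.strip()
    # First pass: "host:port" becomes "host port".
    line = re.sub(r":(\d+)$", r" \1", line)
    # Second pass: a line with no space had no port, so default to 443.
    # \S+ rather than \S* so an empty entry is left alone.
    line = re.sub(r"^(\S+)$", r"\1 443", line)
    return line


for entry in ("example.opendev.org:8000", "example.opendev.org"):
    print(certcheck_line(entry))
```

After the first substitution a ported entry contains a space, so the second pattern cannot match it and only the bare hostname picks up the default.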
opendevreviewIan Wienand proposed opendev/system-config master: rax: remove identity_api_version 2 pin  https://review.opendev.org/c/opendev/system-config/+/86535123:56
clarkbianw: we should check the catalog or whatever it is to double check there is a v3 there23:58
clarkbjust to make sure we know what we are falling back on23:59
ianwyeah i'm still trying to understand it all 23:59
ianwi can say it makes "openstack server list" go from not working to working :)23:59
