Monday, 2020-08-24

ianwdmsimard: re our discussion; there's a few ara changes for stable in to bring stable/0.x support, it gets the jobs green and works with our system-config gate tests of -devel branch (
ianwinfra-root: is a stack of changes to get the -devel job working again, related to ^00:24
kevinzianw: Hi morning! Sorry I was on PTO last Thursday and Friday02:11
kevinzianw: Any problem with Linaro Cloud?02:11
ianwkevinz: hey, no worries!  i was having trouble starting a new mirror host in the control plane tenant, didn't seem to be getting ipv402:11
kevinzo, let me check now02:11
ianwi want to replace the mirror with a focal node, we seem to still have the issues of the bionic node shutting down randomly02:11
kevinzianw: sure, let me check the problem02:12
ianwi can start a node if you like, lmn02:12
kevinzOK, you can start a node. And I will check if the IPV4 pool is full02:13
ianwok, it was getting an address, just no response02:13
ianw139.178.85.140 is what it's been given02:14
dmsimardianw: hey, I'll have a look when I have a chance02:16
kevinzianw: you mean that it can be assigned an IPv4 address, but is not pingable from outside, right?02:16
dmsimardoh right, 0.x has ansible as a dependency02:21
dmsimardfor that config02:21
ianwkevinz: yeah, and it's not connecting out either02:36
ianwkevinz: ipv6 works02:36
ianwdmsimard: yeah, it's three hacks really :)  are we the main users of the 0.x branch?02:38
dmsimardi replied on the patches, thanks for that02:41
dmsimardhard to keep track who is using it still, maybe osa/kolla/tripleo too, would need to check02:42
dmsimardtheir playbooks work with 1.x though:
dmsimardhard requirement on py3 hurt some adoption02:45
dmsimardmaybe the new cli in 1.x will be enough to warrant upgrading :p02:46
ianwdmsimard: sorry if i'm out of the loop; it's mostly the static generation -- is there a path for that?02:48
dmsimardstatic generation is in 1.x but somewhat limited: no search or pagination02:49
ianwahh, ok, well that's enough.  i can look at upgrading then02:49
dmsimardthis is static content:
dmsimardah looks like the links are broken :/02:50
ianwdmsimard: do you mean put in a check if running with ansible 2.10?  or just try to import it and catch the exception?02:51
dmsimardI haven't had to look at those in a bit, not sure if it's legit broken or an issue with swift02:51
dmsimardhmmm maybe either could work02:52
dmsimardI'd just rather not touch what is already working for previous versions02:52
dmsimardcan add a if for 2.10 or try/catch for probably a similar result02:52
dmsimardgetting late for me, see you later02:56
ianwdmsimard: NP, thanks for looking :)  i updated 747337 to check for importerror02:57
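A minimal sketch of the "try to import it and catch the exception" option discussed above, as opposed to an explicit version check. This is not the code from change 747337; the helper name and the module names in the usage line are hypothetical placeholders.

```python
import importlib


def load_first_available(candidates):
    """Return the first module in `candidates` that imports cleanly.

    Earlier names are preferred, so code that already works with the
    first candidate is left untouched; later names are only fallbacks.
    """
    last_error = None
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember the failure and try the next candidate.
            last_error = exc
    raise ImportError(f"none of {candidates!r} could be imported") from last_error


# Hypothetical usage: the first name stands in for a module that is
# missing in this environment, the second for one that is present.
mod = load_first_available(["hypothetical_missing_module", "json"])
print(mod.__name__)
```

Compared with an `if` on the Ansible version, this keys off what is actually importable, so it needs no version parsing.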
kevinzianw: could you re-trigger the creation of the focal node?03:41
ianwkevinz: sure!03:41
kevinzianw: this time it worked03:43
ianwkevinz: ok, looking good :)03:43
ianwit's just about to test ipv603:43
ianwyep, working ... cool.03:44
kevinzI think the reason should be one L3-agent could not update the iptables..03:44
ianwkevinz: i ended up creating a few because of system-config errors, i wonder if that broke something03:51
ianwon friday i mean03:51
openstackgerritIan Wienand proposed opendev/ master: Add linaro mirror02 (focal mirror)
openstackgerritIan Wienand proposed opendev/system-config master: Add linaro Focal mirror
ianwinfra-root: ^ host is up, with a 200g volume split for apache/openafs caches, so those should be good.  note doesn't switch the CNAME till we've verified it05:46
*** elod_off is now known as elod06:00
*** sshnaidm|afk is now known as sshnaidm07:03
*** fressi has joined #opendev07:10
*** andrewbonney has joined #opendev07:11
*** hashar has joined #opendev07:20
*** tosky has joined #opendev07:20
openstackgerritSorin Sbarnea (zbr) proposed opendev/system-config master: Improved ask read-only message
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: DNM: Add unified synchronize-repos role that works with linux and windows
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Update synchronize-repos
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: synchronize-repos: Remove unecessary git path modifications
*** ysandeep|lunch is now known as ysandeep09:12
openstackgerritSorin Sbarnea (zbr) proposed opendev/gerritlib master: Fixed POLLIN event check
*** priteau has joined #opendev09:24
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Enable linting of test-playbooks
*** AJaeger has joined #opendev09:32
openstackgerritSorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Enable configuration via environment variables
*** ysandeep|brb is now known as ysandeep11:54
openstackgerritSorin Sbarnea (zbr) proposed opendev/elastic-recheck master: Enable configuration via environment variables
zbrclarkb: let me know when we can resume on e-r12:11
openstackgerritLon Hohberger proposed openstack/diskimage-builder master: rhel-common: Provide method to select module streams
openstackgerritLon Hohberger proposed openstack/diskimage-builder master: rhel-common: Provide method to select module streams
clarkbzbr: I'm hoping today, but need to catch up on reviews and emails first. I figured I'd approve the puppet change then check that the install flipped over to python314:56
*** qchris has quit IRC14:57
clarkbif anyone groks pandoc better than me, there is a thread about gitea's rst rendering on openstack-discuss. It drops out when hitting the zuul job role directives whereas github's rendering seems to just ignore the error and render them "raw"15:20
clarkbOur gitea is configured to use pandoc for the rendering and I wonder if there is an option we can set to have it ignore those unknown directives rather than giving up15:21
openstackgerritMatthew Thode proposed openstack/diskimage-builder master: update various gentoo bits
*** fressi has quit IRC15:35
 does similar, but includes the error in the output document15:57
clarkblooking at review-test we seem to have put review-site (and /home/gerrit2) on the root device. I think this may have been done to make snapshotting easy. Unfortunately I think that won't lead to very accurate testing since production git repos are on a cinder volume and that is where notedb will be hosted16:24
clarkbare we able to snapshot a cinder volume? I think we can and that may be a better option for us to test similar "devices"16:24
fungii'm totally on vacation, but couldn't you just attach and mount a cinder volume there?16:24
clarkb(though we're back to cinder v1 api on rax iirc)16:24
clarkbfungi: we can but what I'm not sure about is our snapshotting ability16:25
clarkbI also need to poke around the server more and get a better feel for its general state16:25
clarkba lot of the data seems to be there but gerrit isn't running16:25
fungirsync data from rootfs to a cinder volume and then (bind)mount it into place before each test? that way you can just discard the content on it16:26
clarkbya that may be the simple thing /me was pulling up cinder docs to check if snapshots there are viable16:26
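A rough sketch of fungi's suggestion: keep a pristine copy on the rootfs, rsync it onto the cinder volume, then bind-mount the volume into place before each test. All paths are hypothetical, and the function only builds the command lists rather than running them, since the real layout of review-test is not described here.

```python
def reset_test_data_commands(pristine_dir, volume_dir, mount_point):
    """Build the commands that restore test data onto a volume.

    Assumes `mount_point` is not already bind-mounted; otherwise only
    the rsync step would be needed.
    """
    # Trailing slashes make rsync copy directory contents, not the
    # directory itself.
    src = pristine_dir.rstrip("/") + "/"
    dst = volume_dir.rstrip("/") + "/"
    return [
        # --delete discards anything a previous test run left behind.
        ["rsync", "-a", "--delete", src, dst],
        ["mount", "--bind", volume_dir.rstrip("/"), mount_point],
    ]


# Hypothetical paths, for illustration only.
for cmd in reset_test_data_commands(
    "/srv/pristine/review_site", "/mnt/cinder/review_site", "/home/gerrit2/review_site"
):
    print(" ".join(cmd))
```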
clarkbcinder v2 and v3 do snapshotting16:27
clarkbv1 docs aren't even published (anymore)16:27
*** iurygregory has joined #opendev16:27
clarkbbut also I'm thinking it might be easier to not try and keep this test server in full sync with the prod server16:27
clarkbwe can build a point in time from $nowish, then that should be a reasonable enough approximation16:28
clarkbbecause the syncing adds another layer of complexity on top of the snapshotting16:28
clarkbif we snapshot then drift due to syncing with prod how do we reconcile that with a snapshot restore, etc16:28
clarkbfungi: while you aren't here, you did confirm that v1 volume api worked with rax right?16:30
clarkbdid a change get pushed to encode that in our clouds.yaml?16:30
fungii confirmed it worked in my homedir to remove the v2 overrides16:59
*** cmurphy_afk is now known as cmurphy17:14
clarkbanyone else want to review to convert e-r to python3 on our deployment? I'm able to watch that go in (and revert/fix/etc if necessary)17:20
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: bindep: Add missing virtualenv and fixed repo install
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: bindep: Fixed runtime warnings
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: bindep: install packages one by one
openstackgerritSorin Sbarnea (zbr) proposed openstack/diskimage-builder master: Validate virtualenv and pip
openstackgerritMerged zuul/zuul-jobs master: ara-report: add option for artifact prefix
openstackgerritMerged zuul/zuul-jobs master: tox: include command output in log/error
fungiinfra-root: rackspace opened a ticket to let us know there was a problem with the host for zm07 and that it's been rebooted19:39
clarkbthe server is up and zuul-merger is running on it19:40
fungigood deal19:40
openstackgerritMerged opendev/puppet-elastic_recheck master: Use py3 with elastic-recheck
openstackgerritBrian Rosmaita proposed opendev/system-config master: Turn off rendering of RST files by default
clarkbok e-r hasn't updated its install because the install resource is subscribed to the git repo and is subscribe only20:49
clarkbzbr: ^ that means if you land an e-r change the next hourly puppet run should do it20:49
openstackgerritMerged opendev/infra-specs master: Central Authentication Service
clarkbianw: when your day starts I wanted to ask about my dib package list parsing change21:36
clarkb(mostly if there are any concerns with that approach)21:36
ianwclarkb: sorry, yeah will review22:15
ianwi don't think so22:15
clarkbianw: I reviewed your ansible-devel stack and left a couple comments on one or two of them22:18
prometheanfireianw: is debian-minimal considered 'maintained' (as opposed to ironic agent)?  I ask because it seems to not install what's needed to add keys22:28
prometheanfire2020-08-24 22:28:09.535 | dib-run-parts Running /tmp/dib_build.trgNuKIW/hooks/root.d/75-debian-minimal-baseinstall22:28
prometheanfire2020-08-24 22:28:09.624 | E: gnupg, gnupg2 and gnupg1 do not seem to be installed, but one of them is required for this operation22:28
ianwclarkb: thanks, will go over it22:28
ianwprometheanfire: hrm, well we build it in the gate, so ... yeah?22:29
ianwwhat release are you seeing that for?22:29
clarkbwe do install gnupg2 in infra-package-needs22:29
prometheanfireI think the install is done too late22:30
clarkbI think debootstrap dropped it from its default list22:30
clarkbso we just add it on after?22:30
prometheanfire adds keys at the start22:30
prometheanfireinstalls gpg at the end22:30
prometheanfiresome fun ordering issues there22:30
* prometheanfire would just like a way to add it to debootstrap22:31
prometheanfireDIB_DEBOOTSTRAP_EXTRA_ARGS probably?22:31
clarkb/opt/dib_tmp/dib_build.l7lqZiFA/hooks/root.d/75-debian-minimal-baseinstall:main:78 :   sudo chroot /opt/dib_tmp/dib_build.l7lqZiFA/mnt /usr/bin/apt-get install -y systemd-sysv busybox sudo gnupg2 python3 <- that is when we install it according to our logs22:32
clarkbah we use our mirrors in our builds22:33
clarkband we disable verification with those. I bet that explains how it got through22:33
prometheanfiremy issue is an interaction between adding a external repo with an apt key also needed22:34
prometheanfiretrying with    export DIB_DEBOOTSTRAP_EXTRA_ARGS='--include=gpg'22:35
clarkbprometheanfire: debian installs gpgv in debootstrap22:35
clarkbwhich is a minimal gpg version used only to verify signatures22:35
clarkbI think we can just do debootstrap then install gpg after?22:35
clarkbwhere is it failing for you? (since I think this is what we do)22:36
prometheanfireclarkb: part of 75-debian-minimal-baseinstall is to add the keys22:36
prometheanfirenot just verification22:36
clarkbwe don't seem to run that (probably because we don't have gpg keys to verify with the debootstrap step)22:38
clarkbnow that said, I think you can just copy the files in place with modern apt22:38
clarkbso rather than run apt-key add - we can copy the contents instead?22:39
prometheanfireadding ya, probably better to change how that's done, iirc both stretch and buster can do it22:39
clarkbthat may also be why gpg proper was dropped as a dep22:39
prometheanfireI'll play with it later tonight and make a review (unless someone beats me to it)22:40
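A sketch of the "copy the files in place" idea, assuming an apt new enough to read key files dropped into etc/apt/trusted.gpg.d (binary .gpg, or ASCII-armored .asc on apt 1.4 and later). This is not the diskimage-builder change itself; the function name and the demo paths are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def install_apt_key(key_file, chroot, name=None):
    """Copy a repository key into the chroot's trusted.gpg.d directory.

    No gpg binary is needed inside the chroot, unlike `apt-key add -`.
    """
    src = Path(key_file)
    if src.suffix not in (".gpg", ".asc"):
        # apt decides how to read the key from the file extension.
        raise ValueError(f"unexpected key file extension: {src.suffix!r}")
    dest_dir = Path(chroot) / "etc/apt/trusted.gpg.d"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (name or src.name)
    shutil.copyfile(src, dest)
    # apt must be able to read the key as an unprivileged user.
    dest.chmod(0o644)
    return dest


if __name__ == "__main__":
    # Demo against a throwaway directory standing in for the chroot.
    with tempfile.TemporaryDirectory() as tmp:
        key = Path(tmp) / "example-repo.asc"
        key.write_text("placeholder, not a real key\n")
        print(install_apt_key(key, Path(tmp) / "chroot"))
```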
openstackgerritIan Wienand proposed opendev/system-config master: tox: drop test-requirements.txt
openstackgerritIan Wienand proposed opendev/system-config master: Collect tox logs
openstackgerritIan Wienand proposed opendev/system-config master: Fix ansible-devel job for Ansible 2.10 changes
openstackgerritIan Wienand proposed opendev/system-config master: Ansible devel testing: install ansible-collections from checkout
ianwahh, that would make sense, yes we probably do use our mirrors everywhere and thus avoid gpg errors22:44
ianwclarkb: you ok with to add the linaro mirror02 dns?  sorry should have been a deps on, need it for LE22:48
clarkbianw: yup +2. That reminds me did the launch node script get updated to exclude dsa keys properly22:52
* clarkb tries to find that change again22:52
clarkbit did if anyone else wants to review that one22:53
ianwclarkb: yeah, it did.  i tried it out, and then realised it needs one other thing too ...22:53
ianwbecause we were re-using ip addresses, the no verify key didn't work, also need to set the hosts file to null22:54
ianwthen i wonder though, what wins between sshfp records and known_hosts?22:54
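For illustration, the two OpenSSH client options ianw describes (skip host key verification, and point the known hosts file at /dev/null so a stale entry for a reused IP cannot conflict) could be combined as below. This is a sketch, not the launch node script; the function name, user, and address are placeholders, and it does not settle how SSHFP records interact with known_hosts.

```python
def ssh_command(host, user="root", remote_cmd=("true",)):
    """Build an ssh argv that ignores previously recorded host keys."""
    return [
        "ssh",
        # Do not prompt for, or reject, an unknown host key.
        "-o", "StrictHostKeyChecking=no",
        # Neither read old entries nor record new ones.
        "-o", "UserKnownHostsFile=/dev/null",
        f"{user}@{host}",
        *remote_cmd,
    ]


# 203.0.113.10 is a documentation address, not a real host.
print(" ".join(ssh_command("203.0.113.10")))
```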
clarkboh interesting22:54
ianware we aware of the botocore install errors22:54
clarkbI'm not aware of botocore issues (in what context?)22:55
ianwsorry, zuul-jobs is where i'm seeing it.  wasn't sure if was a global breakage ... investigating22:55
ianw1.14.48 of boto3 released about 4 hours ago22:56
ianw"ERROR: No matching distribution found for botocore<1.18.0,>=1.17.48 (from boto3->-r /home/zuul/src/ (line 24))"22:59
*** mlavalle has quit IRC23:00
clarkb seems to be the latest version23:00
clarkbbut it isn't at
clarkb does have it23:01
clarkbstale cdn nodes again?23:01
clarkb has it so ya I think this must be the CDN acting up23:01
ianwdo i remember correctly fungi manually clearing caches?23:03
clarkbfungi did a bunch of wget/curls to see if the problem was persistent last week iirc23:03
clarkbwe only cache those indexes for 10 minutes23:03
clarkbI think the hope was to identify a specific backend that was bad but I don't think that happened23:04
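One way to compare what different caches serve is to fetch the project's simple index page through each of them and check whether the expected release is listed. The sketch below covers only the parsing half and assumes the page is a plain HTML list of file links; the function name and sample HTML are made up, and project name normalization (for example underscores in wheel names) is ignored.

```python
from html.parser import HTMLParser


class _LinkText(HTMLParser):
    """Collect the text of every <a> element (the file names)."""

    def __init__(self):
        super().__init__()
        self._in_a = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_a = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_a = False

    def handle_data(self, data):
        if self._in_a and data.strip():
            self.names.append(data.strip())


def index_lists_version(index_html, project, version):
    """Return True if any listed file belongs to project==version."""
    parser = _LinkText()
    parser.feed(index_html)
    parser.close()
    prefix = f"{project}-{version}"
    # sdists continue with ".tar.gz" or ".zip", wheels with "-<tags>.whl";
    # requiring "." or "-" next keeps 1.17.4 from matching 1.17.48.
    return any(
        name.startswith(prefix + ".") or name.startswith(prefix + "-")
        for name in parser.names
    )


# Made-up sample of a stale page that stops one release short.
stale = '<a href="#">botocore-1.17.47.tar.gz</a>'
print(index_lists_version(stale, "botocore", "1.17.48"))
```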
openstackgerritIan Wienand proposed zuul/zuul-jobs master: Add ensure-rust role
ianwlet's see if ^ goes better ...23:06
openstackgerritMerged opendev/ master: Add linaro mirror02 (focal mirror)
openstackgerritMerged zuul/zuul-jobs master: Add ensure-rust role
ianwfor reference, the MTL cache is definitely getting a different result for the pypi index page for botocore23:36
ianw"Update - We're currently investigating performance issues with our URL and Surrogate Key purging services. Purge All and all other services are unaffected. " ... dunno what that means23:38
clarkbwe're likely going to need to rethink docker image pulls23:39
ianwyeah i saw that ... we should be able to heavily proxy it though?23:40
clarkbmaybe, all requests to docker hub are authenticated23:40
clarkband apache doesn't cache any of those authenticated requests (even though it can)23:40
clarkbthat likely means we need to look at a different caching tool, something which will cache authenticated requests if they are marked public content23:40
clarkbalso interesting in there they rate limited blob requests not manifests requests, what we currently cache are the blob requests so would hopefully be in a better spot but they are changing it to rate limit manifests (arg!)23:41
clarkb"Stay tuned in the coming weeks for a blog post about configuring CI and production systems in light of these changes."23:43
clarkbI guess we wait for their thoughts on CI in particular and go from there23:43
clarkbthere will also be open source plans which we may want to sign up for opendev and zuul23:43
clarkbcorvus: ^ fyi23:43
corvusi guess they are reconciling their accounts payable with their business model23:43
clarkbI think my only real gripe with the changes are the move from blob to manifest based limits23:44
clarkbwe cache the blobs which are the expensive bits already but will be penalized for trying to be good citizens upfront23:44
clarkbfwiw cache-control: public is the header specifier that indicates you can cache a request that was requested with an authorization header23:45
clarkbapache seems to ignore that. I assume there exists some cache out there that doesn't23:45
clarkb(docker hub does seem to set that header properly too)23:46
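The rule clarkb refers to can be sketched as a small decision function. It is a deliberate simplification of the HTTP caching rules for shared caches: it looks only at the Authorization request header and a few Cache-Control response directives, and ignores freshness, status codes, Vary, and everything else. The function name is hypothetical, and this is not how Apache or any particular proxy implements it.

```python
def shared_cache_may_store(request_headers, response_headers):
    """Decide whether a shared cache may store and reuse a response."""
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}
    # Reduce "public, max-age=600" to the set {"public", "max-age"}.
    directives = {
        part.strip().split("=", 1)[0].lower()
        for part in resp.get("cache-control", "").split(",")
        if part.strip()
    }
    if "no-store" in directives or "private" in directives:
        return False
    if "authorization" in req:
        # Authenticated requests need explicit permission from the origin.
        return bool(directives & {"public", "must-revalidate", "s-maxage"})
    return True


# Placeholder token; an authenticated request whose response is marked public.
print(shared_cache_may_store(
    {"Authorization": "Bearer placeholder"},
    {"Cache-Control": "public, max-age=600"},
))
```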
clarkbalso we can just drop the caching entirely and as long as IPs don't get recycled too often probably be fine23:47
* clarkb hopes the post for CI has some better ideas23:47
corvusclarkb: it is a distributed ci system ;)23:47
ianwjust logging for reference but here are the mtl headers giving bad botocore results ATM on mtl01.inap

