Thursday, 2020-05-07

clarkbianw: I think you can also just restart the persistent firewall service00:00
clarkb(fwiw I'm convincing myself that nb01/nb02.opendev.org will get the nodepool configs we want them to get now)00:00
ianwyeah, they should be fine with the normal "nodepool.yaml" config file00:01
clarkbyup just found the ternary condition for that00:01
clarkbit will use a host specific config first if found else fallback to nodepool.yaml which is what we want for nb01 and nb0200:02
ianwyep and when up, we can remove nb04 and make sure 01/02 are building everything00:02
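The fallback clarkb describes (use a host-specific config if one exists, otherwise nodepool.yaml) can be sketched roughly as below. This is only an illustration: the function name, the directory layout and the "<hostname>.yaml" naming are assumptions, not the actual system-config logic.

```python
from pathlib import Path


def pick_nodepool_config(config_dir: Path, hostname: str) -> Path:
    """Return the host-specific config if it exists, else the shared nodepool.yaml.

    The "<hostname>.yaml" naming is a hypothetical convention for this sketch.
    """
    host_specific = config_dir / f"{hostname}.yaml"
    if host_specific.is_file():
        return host_specific
    # No host-specific file found: fall back to the shared default.
    return config_dir / "nodepool.yaml"
```

Under this sketch a builder with no dedicated file, such as nb01 or nb02, would simply end up with nodepool.yaml.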
openstackgerritMerged opendev/system-config master: install-docker: remove arch match  https://review.opendev.org/72443500:34
ianwhrm, a POST_FAILURE on one of the beaker jobs00:40
ianwhttps://zuul.opendev.org/t/openstack/build/1eb289a5cbcc42f8aba47ce3c35a2eff00:40
clarkbianw: that could be the ovh issue00:41
clarkbthey are investigating and have confirmed the problem00:41
clarkbdetails in #openstack-infra00:41
clarkb(that is where amorin was lurking)00:41
ianwi did see that but didn't we stop uploading there?00:42
clarkbianw: I don't think that change merged00:43
ianwavailable identity versions when contacting https://auth.cloud.ovh.net/00:46
openstackgerritMerged opendev/system-config master: Add nb01/nb02 opendev servers  https://review.opendev.org/72602100:51
fungithe change to stop uploading to ovh has not been approbed00:56
fungiapproved00:56
ianwok, well i'll keep an eye ...01:02
ianwi think it might be time to merge it, about 2/3 of my changes managed one job that failed01:08
openstackgerritIan Wienand proposed opendev/base-jobs master: Revert "Temporarily disable OVH Swift uploads"  https://review.opendev.org/72602801:10
openstackgerritMerged opendev/base-jobs master: Temporarily disable OVH Swift uploads  https://review.opendev.org/72594301:13
openstackgerritMerged zuul/zuul-jobs master: ensure-tox: use venv to install  https://review.opendev.org/72573701:28
ianwnb01.opendev.org was in emergency from prior work, i've taken it out02:09
*** olaph has quit IRC02:39
ianw01 needs another run to fix up its certs, as i only noticed it was in emergency after LE phase ran02:42
ianwbut it appears to be building02:42
ianwas mentioned nb01/02.openstack.org are currently borked with locked files.  i'm going to just shut them down now that nb01/nb02.opendev.org seem to be working02:48
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Upload images to dockerhub with buildx when using buildx  https://review.opendev.org/72603302:50
mordredianw: woot! that's exciting02:51
mordredianw: so that means we're just down to nb03 - and that should be gtg as soon as we actually start uploading our multi-arch images02:54
ianw#status nb01/02.openstack.org shutdown and in emergency file; nb01/02.opendev.org are replacements02:55
openstackstatusianw: unknown command02:55
ianw#status log nb01/02.openstack.org shutdown and in emergency file; nb01/02.opendev.org are replacements02:55
openstackstatusianw: finished logging02:55
ianwyep!02:55
openstackgerritIan Wienand proposed opendev/system-config master: Retire nb01/02.openstack.org  https://review.opendev.org/72603403:07
openstackgerritIan Wienand proposed opendev/system-config master: nodepool-builder: fix servername  https://review.opendev.org/72603503:10
openstackgerritIan Wienand proposed opendev/system-config master: [wip] arm64 builder test  https://review.opendev.org/72603703:33
openstackgerritMerged opendev/system-config master: Add system-config-run-base-arm64  https://review.opendev.org/72443903:54
*** ykarel|away is now known as ykarel04:21
openstackgerritIan Wienand proposed opendev/system-config master: [wip] arm64 builder test  https://review.opendev.org/72603704:59
openstackgerritIan Wienand proposed opendev/system-config master: Move build-essential arm64 things to base  https://review.opendev.org/72603904:59
openstackgerritIan Wienand proposed opendev/system-config master: service-bridge: skip osc/kubectl things for arm64  https://review.opendev.org/72604004:59
openstackgerritIan Wienand proposed opendev/system-config master: Move build-essential arm64 things to base  https://review.opendev.org/72603905:25
openstackgerritIan Wienand proposed opendev/system-config master: service-bridge: skip osc/kubectl things for arm64  https://review.opendev.org/72604005:25
openstackgerritIan Wienand proposed opendev/system-config master: [wip] arm64 builder test  https://review.opendev.org/72603705:25
*** ysandeep|away is now known as ysandeep05:42
openstackgerritIan Wienand proposed openstack/project-config master: Remove special nb04 config file  https://review.opendev.org/72604305:53
openstackgerritIan Wienand proposed openstack/project-config master: Remove special nb04 config file  https://review.opendev.org/72604305:54
openstackgerritIan Wienand proposed opendev/system-config master: [wip] arm64 builder test  https://review.opendev.org/72603706:12
*** DSpider has joined #opendev06:51
*** ralonsoh has joined #opendev07:26
*** tosky has joined #opendev07:31
*** jaicaa has quit IRC07:31
*** jaicaa has joined #opendev07:33
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Bumb ansible to version 2.6  https://review.opendev.org/72605407:40
*** rpittau|afk is now known as rpittau07:47
*** ysandeep is now known as ysandeep|lunch07:50
*** mnasiadka has quit IRC07:56
*** panda has quit IRC07:59
*** mnasiadka has joined #opendev08:01
*** panda has joined #opendev08:02
*** roman_g has quit IRC08:20
*** ysandeep|lunch is now known as ysandeep08:44
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567808:45
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567808:46
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567808:53
*** ykarel is now known as ykarel|lunch09:01
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567809:10
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Bumb ansible to version 2.6  https://review.opendev.org/72605409:16
*** dtantsur|afk is now known as dtantsur09:18
*** diablo_rojo has quit IRC09:24
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567809:38
*** Eighth_Doctor has quit IRC09:39
*** sshnaidm|afk is now known as sshnaidm09:44
*** ykarel|lunch is now known as ykarel09:44
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567809:54
*** Eighth_Doctor has joined #opendev09:58
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567810:07
*** rpittau is now known as rpittau|bbl10:10
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567810:27
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567810:46
*** ysandeep is now known as ysandeep|brb10:57
*** ysandeep|brb is now known as ysandeep11:10
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567811:18
*** ysandeep is now known as ysandeep|afk11:48
*** ysandeep|afk is now known as ysandeep12:00
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567812:02
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567812:11
*** rpittau|bbl is now known as rpittau12:11
*** hashar has joined #opendev12:24
*** hrw has joined #opendev12:48
hrwmorning12:49
hrwcan someone take a look at http://mirror.regionone.linaro-us.opendev.org/ mirror? it 403s all centos8 packages12:50
hrwhttps://39a670f0ca097ca9d2d3-0327a7e653cb74ce0efd34fcb3f0b3e6.ssl.cf5.rackcdn.com/725032/5/check-arm64/kolla-build-centos8-source-aarch64/ff0437e/kolla/build/000_FAILED_base.log shows12:50
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567812:52
*** ykarel is now known as ykarel|afk13:05
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Made sequence indent consistent  https://review.opendev.org/72553813:15
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567813:17
*** roman_g has joined #opendev13:18
openstackgerritMerged openstack/project-config master: Add repository for oslo.metrics  https://review.opendev.org/72584713:21
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567813:22
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Upload images to dockerhub with buildx when using buildx  https://review.opendev.org/72603313:26
fungihrw: those urls seem to be okay when i try to retrieve them13:29
fungido you know whether it's still happening?13:29
fungilooks like there were probably a slew of afs connectivity issues from the linaro mirror at 10:29 utc, and then i see a stray error at 11:55 and another at 12:0913:36
*** rosmaita has quit IRC13:38
fungithough dmesg shows a bunch more outside those times13:38
fungidoesn't look like it's on the afs server side as our other mirrors aren't reporting similar errors13:39
donnydis there a central ARA for the CI?13:39
donnydlike a place to see everything13:39
donnydor are they by job13:39
fungidonnyd: nope, though not sure what you mean by "everything"13:39
fungioh, as in an aggregate of all job logs? no, we have logstash/elasticsearch/kibana though13:40
donnydI didn't know if there was like a central ARA or something. Shows how much I know about ARA13:40
fungiwe don't run ara generally for our zuul jobs, though some jobs which involve nested ansible invocation also install ara in the job and produce a report from the inner ansible13:41
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Add upload-artifactory role  https://review.opendev.org/72567813:42
dmsimarddonnyd: it's doable but openstack-scale is hard :)13:42
dmsimardhttps://api.demo.recordsansible.org/ has data from different jobs13:43
*** jhesketh has quit IRC13:43
donnydthank you dmsimard13:44
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Upload images to dockerhub with buildx when using buildx  https://review.opendev.org/72603313:46
*** jhesketh has joined #opendev13:47
fungihrw: so far, it looks like there may be some sort of intermittent network connectivity issue preventing mirror.regionone.linaro-us.opendev.org from reaching afs01.dfw.openstack.org for around 30 seconds at a time. the server itself seems healthy other than there are some fairly large spikes of network utilization around the same times i'm seeing the afs errors:13:50
fungihttp://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=67974&rra_id=all13:50
fungii guess that could be job nodes in that region retrieving files from the mirror13:50
fungiit looks somewhat consistent with previous days, so could just be that those are the times it's being used, and we're less likely to see errors retrieving files when nothing's retrieving files13:52
donnydIs OE going to be put in there to monitor?13:57
fungioh, is it not configured in cacti yet? i'll get to that in a bit14:01
fungiyeah, i don't see that we're collecting system metrics for the openedge mirror14:02
*** ykarel|afk is now known as ykarel14:07
openstackgerritMerged zuul/zuul-jobs master: Bumb ansible to version 2.6  https://review.opendev.org/72605414:21
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Upload images to dockerhub with buildx when using buildx  https://review.opendev.org/72603314:24
*** tosky_ has joined #opendev14:29
*** tosky has quit IRC14:32
*** ysandeep is now known as ysandeep|afk14:33
openstackgerritMerged zuul/zuul-jobs master: Made sequence indent consistent  https://review.opendev.org/72553814:42
*** panda is now known as panda|pto14:42
*** tkajinam has quit IRC14:43
openstackgerritMonty Taylor proposed opendev/system-config master: Add focal to system-config base job  https://review.opendev.org/72567614:47
mordredclarkb, fungi: https://review.opendev.org/#/c/723528/ is finally ready to go14:47
mordredas is its child adding focal to the base job14:47
mordredinfra-root: also - when you get a chance, https://review.opendev.org/#/c/724682 could use a +A14:48
corvusi'm going to perform a shopping outing this morning, which as you doubtless know, can be a bit of an event, so i'll be afk for a while (starting in maybe 30m)14:50
fungiour grocery pickup appointment is tomorrow morning, but the town has also reopened the recycling facility and we have an entire station wagon load of the past couple months recycling pile to go drop off14:52
fungiso yeah, i'll probably disappear for a good chunk of tomorrow morning14:53
openstackgerritMerged zuul/zuul-jobs master: Upload images to dockerhub with buildx when using buildx  https://review.opendev.org/72603315:01
*** tosky_ is now known as tosky15:04
mordredcorvus, fungi: before you disappear, have a minute to review https://review.opendev.org/#/c/724682 ? we need it in so we can re-do clarkb's zuul.yaml reorg15:06
clarkbfungi: and then stand in line behind all the other full station wagons15:07
corvusmordred: +2 (i looked at that yesterday -- but i understand it today :)   did not +W; will let you or fungi do that15:08
fungiclarkb: indeed, so very much looking forward to that15:09
hrwfungi: thanks. will check when other ci job run15:09
fungimordred: my outing's not until tomorrow, but i will try to take a look while in meetings right now15:09
fungihrw: yeah, i still need to perform some more thorough network tests to see if i can spot more general network connectivity issues from the linaro-us mirror15:10
mordredcorvus: awesome - thanks!15:12
openstackgerritMerged openstack/project-config master: Remove special nb04 config file  https://review.opendev.org/72604315:13
clarkbinfra-root disk for / on zuul01 is running out. It has occurred to me I didn't run docker-compose prune in zuul-web after updating its image a few times. I'm going to do that now15:16
clarkboh wait, we run docker image prune, not docker-compose image prune. Does the second thing even exist?15:17
clarkbmordred: ^ I think this is a general issue with us pulling images but not restarting and pruning15:18
clarkbmordred: I'm thinking we should move the pull into the start playbooks15:18
clarkbotherwise we add on more and more images and slowly fill the disk15:18
fungioh, could that explain some of the disk utilization on review.o.o too?15:19
clarkbfungi: we build images less frequently there, but it may explain a small portion of it15:19
clarkbfungi: the issue here is I think every time zuul builds a new image we are pulling them down with ansible but not restarting and pruning on them15:20
clarkbinfra-root I'm going to run the image prune on zuul now. It will also remove newer zuul-scheduler images than we are running because that hasn't been restarted. The next run of ansible should pull the latest down which will have us covered for tomorrow15:21
clarkbTotal reclaimed space: 3.285GB15:22
mordredclarkb: ++15:22
mordredclarkb: I agree about moving the pull to the start15:22
clarkbmordred: cool I'll work on that patch after tea and breakfast15:23
mordredcool15:23
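The "pull at start, prune afterwards" idea agreed above could look something like the following sketch. The function name and the injectable runner are assumptions made for illustration; the log does not state which commands or flags the real playbooks use.

```python
import subprocess
from typing import Callable, List, Sequence


def _run(cmd: Sequence[str]) -> None:
    # Default runner: execute the command and raise on a non-zero exit code.
    subprocess.run(list(cmd), check=True)


def pull_restart_prune(run: Callable[[Sequence[str]], None] = _run) -> List[List[str]]:
    """Pull, restart and prune in one pass so stale images do not pile up."""
    steps = [
        ["docker-compose", "pull"],          # fetch the newest images
        ["docker-compose", "up", "-d"],      # (re)start containers on them
        ["docker", "image", "prune", "-f"],  # drop images no longer needed
    ]
    for cmd in steps:
        run(cmd)
    return steps
```

Keeping the pull next to the restart means images are only fetched when they are about to be used, which is the disk-usage concern raised in the discussion.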
mordredfungi: you're good with apache - wanna review clarkb's apache caching patch: https://review.opendev.org/#/c/724682 ?15:23
clarkbmordred: I think we should also have bup exclude the docker stuff if we haven't already15:24
mordredalso https://review.opendev.org/#/c/724778/ is related to apache and caching15:24
clarkb/root/.bup on zuul and gerrit continue to be huge consumers of disk15:24
clarkband that is index files for files being backed up aiui15:24
mordredclarkb: I believe I checked and I believe we do? but maybe we don't?15:24
mordredclarkb: there's almost certainly _something_ we're backing up that we don't need to :)15:25
clarkb/var/lib/docker/* <- is excluded15:25
mordred\o/15:26
fungii'll take a look at the apache change next once i get through 72468215:27
clarkbmordred: I think we can exclude /var/lib/zuul too?15:27
mordredclarkb: well - on the scheduler we want to make sure we're backing up keys15:28
clarkbmordred: fungi and ya /var/cache/apache2 is 1.5GB so another not small cost15:28
clarkbmordred: good point15:28
clarkbthough that means we are backing up the status.json dumps15:29
clarkband those are reasonably large and change constantly which could be part of the large /root/.bup15:29
clarkbmaybe we are ok with excluding those?15:29
mordredclarkb: we could exclude /var/lib/zuul/backups on zuul15:29
mordredyeah15:29
clarkbwe do already exclude /var/cache/* so aren't trying to backup apache's cache15:30
clarkb(which is good)15:30
mordredclarkb: /var/lib/zuul/times might also be icky to backup?15:30
clarkbmordred: ++ we can live without those too I think15:30
clarkbmordred: basically I think the /root/.bup cost is indexing many files15:30
clarkband I expect having files with names that change over time contributes to that too if I've read bup docs correctly15:31
clarkbbut also we might consider adding these excludes then "restarting" zuul backups in order to clear all that out?15:31
openstackgerritMerged opendev/system-config master: Remove old init scripts and services for zuul/nodepool  https://review.opendev.org/72601115:31
openstackgerritMerged opendev/system-config master: Run cloud_launcher from zuul  https://review.opendev.org/71879815:31
clarkbI think that was the only way I could find to clean that up. Basically start from a new epoch for the append only backups15:31
clarkbalright breakfast and tea and will write some changes15:31
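As a rough illustration of the exclude list being discussed, the sketch below checks whether a path falls under one of the excluded directories. It is not how bup itself evaluates excludes; the helper name is made up, and the two /var/lib/zuul entries are the proposals from this conversation rather than confirmed configuration.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Directories whose contents would be skipped. The first two are reported as
# already excluded in the discussion; the last two are only proposed there.
EXCLUDED_DIRS = (
    "/var/lib/docker",
    "/var/cache",
    "/var/lib/zuul/backups",
    "/var/lib/zuul/times",
)


def is_excluded(path: str, excluded: Iterable[str] = EXCLUDED_DIRS) -> bool:
    """Return True if path lies somewhere below one of the excluded directories."""
    parents = PurePosixPath(path).parents
    return any(PurePosixPath(d) in parents for d in excluded)
```

With this list, anything under /var/lib/zuul/backups is skipped, while other files under /var/lib/zuul (for example the scheduler keys mordred mentions) would still be backed up.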
mordredclarkb: ++15:32
fungimordred: your reference to "clarkb's apache caching patch" back at 15:23 seems to have been a link to your ansible roles change i was already reviewing. which one did you mean?15:32
AJaegermordred: https://zuul.opendev.org/t/openstack/build/5ce869e00e044ed1b6f328fc7f8265d7 failed, that's the deploy from 72604315:34
mordredfungi: whoops! https://review.opendev.org/#/c/724444/ and https://review.opendev.org/#/c/724778/15:35
fungicool, i just approved that second one, taking a look at the first now15:35
mordredAJaeger: thanks - looking15:35
mordredError executing /usr/sbin/apache2ctl: AH00526: Syntax error on line 19 of /etc/apache2/sites-enabled/001-nb.conf:15:36
mordredSSLCertificateFile: file '/etc/letsencrypt-certs/nb01.opendev.org/nb01.opendev.org.cer' does not exist or is empty15:36
mordredI think ianw said something about needing another pulse of le - but that doesn't seem to have happened15:36
mordredyeah - we haven't run letsencrypt again - let me run it real quick15:37
clarkbmordred: as a side note that is just for serving images and status so it's not critical to nodepool function15:42
mordredclarkb: yeah - but it causes the nodepool playbook to bomb15:42
clarkbah15:43
mordredok - I've run the le playbook and have re-run the service-nodepool playbook15:44
AJaegerthanks, mordred !15:45
*** priteau has joined #opendev15:46
openstackgerritClark Boylan proposed opendev/system-config master: Update bup excludes for zuul-scheduler  https://review.opendev.org/72618315:46
openstackgerritClark Boylan proposed opendev/system-config master: Pull and prune docker images together  https://review.opendev.org/72618515:54
clarkbmordred: infra-root ^ two more disk saving changes15:54
smcginnisInteresting workshop on GitHub Actions. They've done a lot to make GitHub work like Gerrit.15:56
smcginnisOther than needing to commit a whole stack of half baked commits to get to a good state.15:56
mordredsmcginnis: yeah - and lack of support for patch series15:57
fungiyou can avoid that, you just lose the history15:57
clarkbalso acls15:57
smcginnisThat patch series support has been something I've really missed in GitHub projects.15:57
clarkbfungi: they actually save the diff contexts and comments within that scope now15:57
* mordred ran in to the patch series issue with subsurface on monday - I pushed up a patch, review found an unrelated thing but that needed to be on top - I had to wait until the first PR was landed before I could push up the followup15:58
fungiclarkb: oh, even if you push --force a new commit?15:58
clarkbfungi: yup, it's still not great because you can't easily revert and I think comments outside the context of the git diff are either lost or only kept with no context15:58
clarkbbut it's something15:58
smcginnisComments on the PR will be kept, but the commits will be lost.15:58
smcginnisAFAIK15:58
mordredthe comments are still there, but they don't show15:58
clarkbsmcginnis: correct the commits go away.15:58
mordredif you go to your email and find when you got the original comment and click that15:59
mordredit'll show it to you15:59
mordredbut you can't browse to it15:59
smcginnisAnyway, not trying to throw shade or anything. They've done a lot of great work. Just found it humorous how much work has been put in to make it gerrit like.15:59
fungimordred: could you have created a new branch off your first pr's branch and then committed the second fix there and made a pr from that>16:00
fungi?16:00
mordredfungi: I could have - but it would have still shown the first pr's commits too16:00
clarkbya there is no way to partition the review context16:00
clarkbthe PR will always contain HEAD to remote HEAD commits in it16:00
mordredsmcginnis: yah - totally - there has been some excellent work done16:00
mordredsmcginnis: it's still not a system I enjoy using, but it has definitely improved over the years16:01
smcginnis++16:01
openstackgerritMerged zuul/zuul-jobs master: Check blocks recursively for loops  https://review.opendev.org/72496716:02
fungi#status log deleted openstack/rally branch stable/0.9 (304c76a939b013cbc4b5d0cbbaadecb6c3e88289) per https://review.opendev.org/72168716:03
openstackstatusfungi: finished logging16:03
clarkbsmcginnis: I think what we are seeing is a realization that they have popularity but if they want to continue to be considered by serious software development environments they need to improve the tools that are available to those groups16:04
clarkbGithub has always been great for hobby projects or school assignments and even small groups of people working together.16:04
clarkbBut I have no idea how projects like ansible or kubernetes manage to survive on it16:05
smcginnisThe one thing they have now that I think could really help us, especially for drive by contributions, is being able to open a web based VS Code environment to make updates.16:05
clarkband slowly github is making the lives of ^ projects like that better16:05
clarkbsmcginnis: eh16:05
clarkbsmcginnis: you forget the CLA16:05
smcginnisYeah, I know there are a lot of other things that need to be done.16:05
clarkbfrom a technical standpoint maybe, but you've got to get the board to get out of the way16:05
smcginnisBut that's kind of the problem.16:05
smcginnisDrive by contributors may have a quick fix, look at the steps needed to submit a patch, and say a great big nope.16:06
openstackgerritMerged zuul/zuul-jobs master: Remove opensuse-15-plain testing  https://review.opendev.org/72575016:06
clarkbsmcginnis: sure, but the biggest issue there is legal and not technical so addressing it from a technical standpoint doesn't help16:06
clarkbif we can remove the legal hurdle then addressing technical hurdles makes a lot more sense16:06
clarkband gerrit is adding that functionality16:07
*** ykarel is now known as ykarel|away16:07
clarkbI think paladox was saying it is really close16:07
paladoxclarkb you mean file uploads?16:07
clarkbpaladox: ya I think you said that could be used to generate a new change in the browser?16:08
paladoxah, yeh. That's been merged now16:08
paladox(so will be included in 3.2)16:08
paladoxit's live on https://gerrit-review.googlesource.com16:08
clarkbnice16:09
paladoxhttps://imgur.com/a/dL91kwI is what it looks like16:09
fungiright, for now the biggest hurdle to casual contribution, as clarkb points out, is that the osf board of directors and legal counsel are still unwilling to consider that the legal safety net they believe to be crucial to prevent companies from going to war with each other over their contributions to these projects is a significant hindrance to the code contributors themselves, but especially to casual16:15
fungicontributors16:15
fungiwe need folks who are on the osf board of directors to take up the challenge of revisiting that and possibly refuting the earlier concerns of member companies whose legal counsel were so insistent on enforcing contributor legal agreements (especially the corporate contributor license agreement)16:17
JayF(I'm 100% serious) can someone run for the board on a platform of "lowering barriers to entry for contributors", that'd get my vote16:17
mordredit's complicated16:17
mordredthe board mostly works via consensus, so if even a few of the directors are generally uncomfortable with something, it's unlikely to go anywhere16:18
fungii've heard from the folks managing ccla compliance at some of these larger member companies who ask why osf still insists on such arcane legal red tape when other large projects are able to get by with something simple like the dco16:18
mordrednot because it'll get voted down - but because the board won't feel like it has gotten to a point where a vote would be consensual16:19
mordredso - as a former board member who has been in favor of fixing this for years - I can tell you it's an arcane enough topic that enough of the members just don't feel like they have enough of a handle on it that they're not interested in taking on the risk16:20
mordredwithout an active push from legal counsel saying "this is a good idea and we should do it" (which allows such board members to just say "legal counsel said it's a good idea, if I don't understand the issue I can just trust that") - I think it's unlikely to get traction16:21
mordredALL of that said - we *do* have approval from the board to drop the icla in favor of the dco, which movement forward on is in our court16:22
mordredand is backed up behind a few other things last I remember16:22
funginot really16:22
fungii mean we could basically ignore the real thrust of that board resolution16:22
fungiwhich said that we could drop the icla *if* we did a better job of reporting ccla affiliation16:23
clarkbwhich is somehow our problem16:23
fungiand that we could rely on the dco for unaffiliated contributors16:23
mordredI do not remember it that way16:23
fungireread it ;)16:23
mordredoh - no, ccla affiliation is the important bit16:23
mordredbut I do not remember *us* having to report that better being part of it16:23
clarkbmordred: right companies are unable to track their own employees so we have to do it for them16:23
fungihttps://wiki.openstack.org/wiki/Governance/Foundation/26Oct2015BoardMeetingMinutes16:24
mordredthe members who cared said they were fine with updating employee ccla activity16:24
mordred"WHEREAS, the Foundation needs to develop and implement new software to identify individuals who are listed in the Corporate Contribution License Agreements and implement a new process for all contributions (“New Process”); "16:24
mordredyes16:24
mordredthat's not us16:24
mordredthat's the foundation16:24
mordredand my understanding is openstackid is the answer from the affiliation side16:24
fungistuff like "...adoption of the DCO instead of the Individual Contributor License Agreement (“ICLA”) for contributions by individuals who are not making contributions on behalf of a corporate employer..."16:24
mordredso when I say it's backed up16:25
mordredI mean the issue we still have on our side is that we can't let people log in with openstackid to gerrit16:25
fungi"...the Foundation needs to develop and implement new software to identify individuals who are listed in the Corporate Contribution License Agreements and implement a new process for all contributions..."16:25
mordredopenstackid is the "affiliation tracking system"16:25
mordredso it has long been my understanding that we can't make progress on this until we make progress on new SSO16:26
fungibut anyway, we have the okay from the board to adopt the dco for individuals not making contributions on behalf of their employer, so long as we can figure out who they are16:26
clarkbmordred: I seemed to remember it being a lot more strict than that but maybe the thoughts around it have changed over time.16:27
mordredwhether there needs to be additional things past that at that point remains to be seen, but until tying an openstackid to a dev account is a thing, there is really nothing we can do16:27
fungimy take is that we could just drop the icla enforcement, turn on dco enforcement for relevant repos, and call it a day. but we need to present that in a way that the osf executive team will be okay with16:27
clarkbmordred: ah maybe that was it. step 0 is openstackid maybe there is step 1, 2, n16:28
mordredfungi: right - my memory is that once people can login with openstackid, the executive team is much more comfortable16:28
mordredyes16:28
mordredthere might be a step 1 or 216:28
fungiand if the osf executives feel like they can't make that call without checking with legal counsel and/or the board, we're sunk16:28
mordredbut we haven't even thought about it because of step 016:28
*** dtantsur is now known as dtantsur|afk16:28
mordredand step 0 turns out to be hard16:28
mordredI'm mostly just pointing out - there is a known step 0 that is blocking on us - even if there may be unknown step 1, 2, n - and it's a step 0 we want to do for other reasons anyway16:29
mordredit might still be blocked on other people or other things after step 016:29
fungion a related note, i have some 350 lines scraped from years of old e-mail threads chucked into a spec template i'm trying to refine to make something current16:30
fungii still feel like it's not *actually* blocked on us, it's blocked on somewhat unreasonable logistical demands we have to come up with a technical solution for16:31
mordredsmcginnis: also - for the record - the cncf also requires a CCLA for contributions, and it took me over a month to figure out how to get myself added to a ccla list so that I could submit a PR to a cncf project16:32
fungiwe can attempt to un-block it with a semi-reasonable technical compromise, but there's every chance we'll be told that's still not good enough to meet the legal obligation the board insisted on16:33
mordredso - let's remember next time someone tells us "it's easier for projects on github" - that for other big projects that statement is an outright lie16:33
smcginnismordred: Not an uncommon experience either.16:33
mordredyeah16:33
fungiheck, corvus has a tiny (two line?) patch for jitsi-meet rotting in their pr queue blocked on red hat legal signing jitsi's ccla16:34
*** rpittau is now known as rpittau|afk16:34
mordredfungi: ++16:34
*** hashar has quit IRC16:39
clarkbmordred: you reviewed the pull/prune symmetry change but not the bup update https://review.opendev.org/#/c/726183/ any chance you can review that one too?16:39
mordredclarkb: lgtm16:43
clarkbmordred: is the stack at https://review.opendev.org/#/c/724682 the last major piece before the reorg of system-config zuul confs?16:43
*** sshnaidm is now known as sshnaidm|afk16:46
mordredclarkb: yes17:03
mordredclarkb: https://review.opendev.org/#/c/723528/ <-- also that please17:03
mordredI have confirmed that the cloud-launcher cronjob has been removed from bridge, so landing the "stop removing cron job" patch is fine - I've +A'd the middle of the stack17:04
mordredclarkb: https://review.opendev.org/#/c/726034/ is also a fun one to land :)17:05
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Add docker_dockerfile to upload-docker-image defaults  https://review.opendev.org/72621017:08
clarkbmordred: that is a fun one17:08
mordredclarkb: ikr?17:08
clarkbugh I think weechat decided mouse mode is on for some reason17:10
openstackgerritMerged opendev/system-config master: python-builder: drop # from line  https://review.opendev.org/72537417:10
openstackgerritMerged opendev/system-config master: Configure htcacheclean for zuul-web  https://review.opendev.org/72477817:10
openstackgerritMerged opendev/system-config master: Cache static zuul resources in apache  https://review.opendev.org/72444417:10
clarkbmordred: ianw re https://review.opendev.org/#/c/726034/ should we s/nb04/nb01/ in a followup?17:11
mordredclarkb: yeah - might as well17:11
*** priteau has quit IRC17:25
openstackgerritMerged zuul/zuul-jobs master: Add docker_dockerfile to upload-docker-image defaults  https://review.opendev.org/72621017:25
*** avass has joined #opendev17:50
clarkbmordred: https://review.opendev.org/#/c/724682 failed in the gate17:50
*** ralonsoh has quit IRC17:51
openstackgerritMerged opendev/system-config master: Test zuul-executor on focal  https://review.opendev.org/72352817:53
openstackgerritMerged opendev/system-config master: Retire nb01/02.openstack.org  https://review.opendev.org/72603418:00
mordredclarkb: I've got a recheck on it - it's an unreachable error to the xenial node, so I think it was a cloud-derp18:02
mordred(the arm test came after the recheck because slow arm - the recheck is still running)18:02
*** roman_g has quit IRC18:04
corvusi'm back from errands, but going to do some lunch prep, etc, now, so not really fully back yet18:09
clarkbcorvus: all fine I think we are mostly waiting on gerrit/zuul for the next step in CD stuff18:10
openstackgerritMerged opendev/system-config master: Use zuul checkouts of ansible roles from other repos  https://review.opendev.org/72468218:41
openstackgerritMerged opendev/system-config master: Stop logging the rsync of puppet  https://review.opendev.org/72441918:41
openstackgerritMerged opendev/system-config master: Stop removing cloud-launcher cron  https://review.opendev.org/71879918:41
mordredclarkb: I got a question in sdks about image checksum issues in osc - do you remember us having some issue with that here in the last couple of months?18:43
*** dtroyer has joined #opendev18:44
clarkbmordred: ish. They don't do anything useful except for the swift based upload dedup18:49
clarkbmordred: glance can change the image without telling you and so the checksum you provide to glance is useless; also glance doesn't check it aiui18:50
clarkbthose are issues but nothing that affected usability in a regressive way18:52
mordredclarkb: nod. well - issue at hand is issues uploading an image to a vmware-backed cloud with osc v5 which uses sdk instead of glanceclient with a checksum mismatch18:56
mordredI'm guessing the vmware driver is modifying the image so the cloud checksum is different18:56
mordredwhich makes me think having osc pass "validate_checksums=True" is a bad idea and we should set that to false - and probably default it to false in sdk18:57
clarkbya I don't think you can generally verify checksums unfortunately18:57
clarkbit would be good for glance to checksum what it receives separately to what it stores for use18:57
clarkbthen clients can confirm the remote received the data properly18:58
mordredclarkb: ++19:08
mordredclarkb: and could know that the cloud modified the binary payload19:08
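A minimal sketch of the idea discussed above: if a service reported one checksum for the bytes it received and a separate one for what it stored, a client could tell a bad transfer apart from a server-side modification of the image. The names `received_sha256` and `stored_sha256` are hypothetical and purely illustrative; they are not fields of any Glance or SDK API.

```python
import hashlib


def classify_upload(payload: bytes, received_sha256: str, stored_sha256: str) -> str:
    """Compare a local digest against two hypothetical server-side digests."""
    local = hashlib.sha256(payload).hexdigest()
    if local != received_sha256:
        # The server did not receive the bytes we sent.
        return "transfer-corrupted"
    if received_sha256 != stored_sha256:
        # The upload arrived intact, but the stored image differs
        # (for example, a backend driver converted it).
        return "modified-by-cloud"
    return "unchanged"
```

With two digests the client can still fail hard on a corrupted transfer while treating a backend conversion as informational.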
mordredclarkb: ZOMG - the stack landed19:12
mordredclarkb: I pass you the mutex on the repo for zuul.yaml changes19:12
mordredclarkb: also - I'm going for a bike ride - back in a bit19:12
openstackgerritAndreas Jaeger proposed openstack/project-config master: Stop building opensuse-15-plain images  https://review.opendev.org/72575119:20
clarkbcool will get the new ps up in a moment19:20
corvusreally back now19:20
clarkbfinishing lunch19:20
fungiand i'm about to start cooking dinner19:20
fungilike ships that pass in the night ;)19:21
*** dpawlik has quit IRC19:23
clarkbcorvus: fungi https://review.opendev.org/726183 and https://review.opendev.org/726185 are additional efforts to get disk consumption into reasonable levels19:27
clarkband now to redo the reorg change19:27
openstackgerritJames E. Blair proposed opendev/system-config master: WIP: add Zookeeper TLS support  https://review.opendev.org/72030219:28
corvusclarkb: the prune is an improvement, but i think that the ideal (if we have time to figure this out) is something more like the kernel: always pull, and prune all but the current and latest.19:30
corvusclarkb: that way we're not pulling on start, but we're also not wasting space19:31
corvusclarkb: actually -- does prune do that?19:31
corvusclarkb: ie, does it not remove the running image because it's running, and not remove the :latest image because it's tagged?19:32
openstackgerritClark Boylan proposed opendev/system-config master: Organize zuul jobs in zuul.d/ dir  https://review.opendev.org/72239419:37
clarkbcorvus: it doesn't remove the running image. I think it may remove the latest image19:38
clarkbbut I'm not sure about latest19:38
clarkbit removes all "dangling" images. Now to figure out what that means19:39
clarkb"Dangling images are layers that have no relationship to any tagged images."19:39
corvusclarkb: when i do 'docker image prune' locally, i still have a bunch of tagged images in the list19:39
clarkbso ya we can probably just pull and prune always19:39
corvus++19:39
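A toy model of the behaviour worked out above, assuming (as the quoted documentation and the local `docker image prune` observation suggest) that a prune only removes images that carry no tag and are not used by a container. The `Image` type and `prune_dangling` function are illustrative only and are not part of any Docker API.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    image_id: str
    tags: list = field(default_factory=list)
    in_use: bool = False  # True if some container references this image


def prune_dangling(images):
    """Return the images that survive a prune in this simplified model."""
    return [img for img in images if img.tags or img.in_use]
```

Under this model, pulling and then pruning on every run keeps the running image and the freshly pulled tagged image, and drops only the superseded untagged ones.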
clarkbinfra-root https://review.opendev.org/#/c/722394/ should be ready for review and the conflicts with list looks less bad than before19:40
clarkbI believe all testing that will run is happy now too but there may be surprises I suppose19:40
clarkbcorvus: I guess we move the pull and prune into main.yaml and simplify start.yaml then? I can work on that patchset if we like that idea19:41
corvusclarkb: yeah that sounds right19:43
openstackgerritClark Boylan proposed opendev/system-config master: Pull and prune docker images together  https://review.opendev.org/72618519:52
clarkbcorvus: ^ I think I switched gears there properly19:52
openstackgerritMerged opendev/system-config master: Update bup excludes for zuul-scheduler  https://review.opendev.org/72618319:54
openstackgerritMerged openstack/project-config master: Stop building opensuse-15-plain images  https://review.opendev.org/72575120:06
paladoxcorvus wondering if you saw my ping from earlier in the morning? :)20:08
paladoxerr wrong channel20:08
corvuspaladox: no -- which channel? :)20:13
corvusah found it20:15
paladoxzuul :)20:15
openstackgerritGage Hugo proposed openstack/project-config master: Retire syntribos - Step 1  https://review.opendev.org/72623720:26
clarkbinfra-root if you get a chance it might be good to get other eyeballs onto the disk constraints issue on zuul01.openstack.org. Hoping we can avoid filling that disk, but worried the last thing I'm seeing there is the /root/.bup dir which I'm not sure how to safely clean up20:30
clarkbianw: ^ I know you've done some bup things any idea if we can move the old zuul backups aside, rm /root/.bup then start a new backup series for zuul01?20:30
clarkbor possibly even rm /root/.bup and keep backing up to the existing location?20:31
clarkbhttps://review.opendev.org/#/c/722394/ passed testing20:41
corvusclarkb: interesting it ran so many jobs...20:43
corvusclarkb: they shouldn't be changing, right?20:44
clarkbcorvus: the content shouldn't be changing, it's literally copy paste from one file into another20:44
clarkbbut I think zuul may see that as an update anyway?20:44
corvusclarkb: it's basically a diff of the json serialization of the job20:44
clarkbcorvus: is it possible that order matters?20:45
clarkbcorvus: iirc zuul loads the jobs in a sorted order when in zuul.d20:45
clarkband that new order may be different from what was in .zuul.yaml?20:45
mordredcorvus: do you want me to remove my +A?20:45
corvusclarkb: should be job-by-job20:47
corvusmordred: nah; let's just keep an eye out20:47
clarkbalso for each patchset I didn't do a rebase. Instead I redid the copy paste process to be sure I caught changes in master20:48
clarkbit was fairly mechanical so doing that was easier than reasoning about merge conflicts20:48
mordredclarkb: you might find https://review.opendev.org/#/c/725103/ amusing20:50
clarkbmordred: yay puppet20:51
openstackgerritMonty Taylor proposed opendev/system-config master: Remove puppet on non-puppeted servers  https://review.opendev.org/72510420:53
corvusmordred: holy cow, the zk tls change passed tests20:57
mordredcorvus: woot!20:57
corvusmordred: however, it seems it managed to do that despite the zuul scheduler failing to start20:59
mordredcorvus: for the job diff - is it possible that the serialization contains yaml context info (for error line numbers and stuff?)20:59
mordredcorvus: I think we need better testing of the zuul/nodepool services actually starting20:59
corvusmordred: yeah, there's a source_context in there21:00
corvuswe should drop that from the comparison21:00
mordred\o/21:00
* mordred had a thought21:01
corvusmaybe the description, too?21:01
mordredcorvus: yeah - maybe so - I don't think we need to run tests if the description changes20:01
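A rough sketch of the comparison being proposed above, assuming a job can be represented as a flat dictionary that is serialized to JSON and diffed. This is not Zuul's actual implementation; the key names `source_context` and `description` come from the discussion, and everything else is hypothetical.

```python
import json

# Keys that describe where or how a job is written, not what it does.
IGNORED_KEYS = ("source_context", "description")


def job_fingerprint(job: dict) -> str:
    """Serialize a job deterministically, leaving out the ignored keys."""
    trimmed = {k: v for k, v in job.items() if k not in IGNORED_KEYS}
    return json.dumps(trimmed, sort_keys=True)


def job_changed(old: dict, new: dict) -> bool:
    """Report whether two job definitions differ in a meaningful way."""
    return job_fingerprint(old) != job_fingerprint(new)
```

Moving a job from one file to another changes only its source context, so a comparison like this would no longer treat the move as a job update.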
corvusmordred, clarkb: it looks like the zuul installation as created by system-config on current master largely doesn't work21:15
corvuswe need to fix that before we can use it to validate the tls change21:15
corvusthe scheduler says:21:15
corvusconfigparser.NoSectionError: No section: 'gearman'21:15
corvuswhich is weird, because the zuul.conf has a gearman section.  and the other services don't have that error21:15
clarkbmaybe a bind mount issue?21:16
corvusmaybe; that makes me wonder how prod is working though?21:16
clarkb/etc/zuul:/etc/zuul is a bind mount in docker-compose21:17
clarkband there is a gearman section in prod's /etc/zuul/zuul.conf21:18
corvusthe one we copy to logs is in /etc/zuul/zuul.conf21:18
corvusand it has a gearman section21:18
clarkbI would expect a permissions error to manifest differently21:20
* mordred is confused21:22
corvusi downloaded the zuul.conf from the test and configparser is fine with it21:27
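The local check described above can be sketched roughly as follows: feed the configuration text to `configparser` and see whether reading from the `gearman` section raises `NoSectionError`. The sample contents in the usage note are made up for illustration and are not the real zuul.conf.

```python
import configparser


def has_gearman_server(conf_text: str) -> bool:
    """Return True if the text has a [gearman] section with a server option."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    try:
        parser.get("gearman", "server")
    except (configparser.NoSectionError, configparser.NoOptionError):
        return False
    return True
```

For example, `has_gearman_server("[gearman]\nserver=127.0.0.1\n")` succeeds, while a file the process reads as empty or different would fail the same way the scheduler does, which would point at what the service actually sees inside its container.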
mordredcorvus: do we need to hold a node and recheck the job?21:27
mordredI don't see anything that explains what's different from the logs21:27
mordredeverything ran that should and the file has the right content21:27
openstackgerritJames E. Blair proposed opendev/system-config master: DNM: fail zuul tests  https://review.opendev.org/72624821:31
corvusmordred: yeah, i guess so21:31
corvusnodepool fails to connect to zk21:33
corvusso i guess i should update that too21:33
openstackgerritMonty Taylor proposed opendev/system-config master: Remove puppet on non-puppeted servers  https://review.opendev.org/72510421:33
openstackgerritJames E. Blair proposed opendev/system-config master: DNM: fail zuul tests  https://review.opendev.org/72624821:33
mordredcorvus: your comment on the previous ps ^^ made me realize I was doing that WAY too absurdly21:33
*** DSpider has quit IRC21:34
corvusmordred: much nicer :)21:34
corvusokay, we should have a held zuul and nodepool set soon21:35
clarkbmordred: we need to disable puppte agent on puppet hosts though21:36
clarkbmordred: so I think we want to keep running the old role on puppet too21:36
clarkbmordred: left that note on https://review.opendev.org/#/c/725104/5 if yo ucan double ceck it21:39
clarkband I can't type21:39
openstackgerritMerged opendev/system-config master: Stop running mcollective  https://review.opendev.org/72510321:42
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Actually include platform in the upload build  https://review.opendev.org/72625121:58
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Process siblings in upload-image push  https://review.opendev.org/72625322:02
openstackgerritMonty Taylor proposed opendev/system-config master: Add focal to system-config base job  https://review.opendev.org/72567622:15
mordredclarkb: ^^ I rebased this on your reorg patch22:16
clarkbhrm did we break a bunch of our tests? the reorg change is failing everything now22:16
mordredclarkb: oh good22:16
clarkbmordred: openshift client tarball fetch is returning HTTP 40322:17
corvusthis is a really helpful error from systemd: https://27fbadc9cf2dd544239f-7d0e556db3075d25d1b91bbdcc8a4562.ssl.cf1.rackcdn.com/722394/3/gate/system-config-run-static/cc59be9/bridge.openstack.org/ara-report/result/25990b08-c9a9-4a19-9b80-7839f9c6b1f3/22:17
mordred2020-05-07 22:10:09.258383 | bridge.openstack.org | TASK [install-kubectl : Download openshift client tarball] *********************22:18
mordredyeah22:18
mordredcorvus: yeah - it really is isn't it?22:18
corvusclarkb: https://zuul.opendev.org/t/openstack/build/cc59be9dfac6482dbd543a9a9258f93c/log/static01.opendev.org/syslog.txt#155622:19
corvusclarkb, mordred: ^ that's the static/apache failure22:19
corvussomething about the LE cert22:19
clarkbcorvus: I'm guessing the LE demo site failed and so we didn't fake issue our fake cert?22:20
clarkbdecoupling from their fake site now that we know things essentially work may be a good idea22:20
clarkbianw: ^ is that something you've thought about as an option?22:20
clarkbthe openshiftclient issue seems to be a github thing?22:20
mordredyeah - github seems to be happier now ... I just curl'd the file locally22:20
mordredalthough if I try to click on it in my browser it's still sad22:20
clarkbI'm still getting the aws error22:21
corvusclarkb: do we hit their fake server?22:21
mordrednope. still error22:21
openstackgerritMerged zuul/zuul-jobs master: Actually include platform in the upload build  https://review.opendev.org/72625122:21
clarkbcorvus: yes, then if fake server succeeds we copy over a snakeoil cert22:21
openstackgerritMerged zuul/zuul-jobs master: Process siblings in upload-image push  https://review.opendev.org/72625322:21
corvusclarkb: this is nothing failing for the tarball server: https://27fbadc9cf2dd544239f-7d0e556db3075d25d1b91bbdcc8a4562.ssl.cf1.rackcdn.com/722394/3/gate/system-config-run-static/cc59be9/bridge.openstack.org/ara-report/result/bc0f0019-15c4-438d-9a54-41198f4bcf85/22:21
mordredI found better url22:22
corvusclarkb: i can't see anything wrong22:22
mordredhttps://artifacts-openshift-release-3-11.svc.ci.openshift.org/zips/openshift-origin-client-tools-v3.11.0-d699176-406-linux-64bit.tar.gz is not a github url and is mentioned on the github release page22:22
mordred(also it works)22:22
corvusmordred: it's a google ip22:23
clarkbthis happens on kubernetes releases too22:23
mordredshould we update our download to download from there instead?22:23
clarkbso I think its a complete failure of github's release system22:23
mordredclarkb: but this isn't a new release issue22:23
mordredthis is an old release22:23
clarkbmordred: ya I know22:23
clarkbmordred: I'm saying other releases for other projects have the same issue22:23
corvusmordred: google's usually pretty good about keeping their servers up22:23
mordrednod22:23
mordredcorvus: yeah22:23
clarkbit looks like they've misconfigured their aws file proxying22:23
mordredclarkb: you wanna update your patch or I can I've got it in my cache22:23
clarkbmordred: we can update the url in a separate change right? I think that will be cleaner for review and history22:24
mordredyeah22:25
openstackgerritMonty Taylor proposed opendev/system-config master: Organize zuul jobs in zuul.d/ dir  https://review.opendev.org/72239422:25
openstackgerritMonty Taylor proposed opendev/system-config master: Add focal to system-config base job  https://review.opendev.org/72567622:25
openstackgerritMonty Taylor proposed opendev/system-config master: Use google hosted oc client tools location  https://review.opendev.org/72625622:25
mordredclarkb, corvus : ^^ that's the file and checksum listed at https://artifacts-openshift-release-3-11.svc.ci.openshift.org/zips/22:26
clarkbmordred: hrm I wasn't expecting different shas there22:26
clarkblooks like the filename embedded sha and the blob sha are different? Is that like when we do a dev release its 3.11 + this delta?22:26
mordredme either - but I'm guessing they built their own to upload to the google location and had github build the ones on github maybe?22:27
clarkbheh I was going to download both and compare them22:28
corvusmordred: where's the page where openshift promises this isn't a haxor trojan thing?22:28
clarkbthen I realized I can't download the old one :)22:28
ianwclarkb: we test against staging ... if anything we should move to testing against some sort of dummy issuer22:29
mordredeek no don't load that22:29
mordredthat's the build from the tip of the release-3.11 branch22:29
mordredand it'll change every time they land a new commit22:29
clarkbmordred: thats what I was thinking22:29
ianwhttps://zuul.opendev.org/t/openstack/build/cc59be9dfac6482dbd543a9a9258f93c/log/static01.opendev.org/acme.sh/acme.sh.log doesn't seem to have anything obvious22:30
clarkbfwiw I can download and extract it and it seems to have the contents we expect, but if that's dev ya I don't think we want that22:30
mordredhttps://github.com/openshift/origin/commits/d699176b22a0836c4f2dcd327a685473249e963322:30
mordredthat's the sha in that filename22:30
* mordred unrestacks that22:30
openstackgerritMonty Taylor proposed opendev/system-config master: Organize zuul jobs in zuul.d/ dir  https://review.opendev.org/72239422:30
openstackgerritMonty Taylor proposed opendev/system-config master: Add focal to system-config base job  https://review.opendev.org/72567622:30
ianwsorry, take that back it failed with "35" -> https://zuul.opendev.org/t/openstack/build/cc59be9dfac6482dbd543a9a9258f93c/log/static01.opendev.org/acme.sh/acme.sh.log#46122:31
mordredI -2'd the update patch and then abandoned it22:31
clarkbianw: thanks for tracking that back22:31
mordredCURLE_SSL_CONNECT_ERROR22:31
mordredhaha22:31
mordredit's an ssl error22:31
corvusmordred: nice :) le inception22:32
corvusi think that part of the internet is telling us to go do something else for a while22:32
mordredcorvus: yeah22:32
corvusoh, the nodepool/zuul jobs are done22:33
mordredneat!22:33
clarkbit certainly feels that way22:33
mordredianw: fwiw - I still don't have an arm image for you - there were a series of tiny derps in the publish job22:33
clarkbits not made it to the github status dashboard22:33
clarkbthough I expect if I mention it to jlk at this point it would just be an unnecessary interrupt22:34
mordredianw: hahaha. your patch to add build-essential for the arm host - is that because we pip install docker-compose? that's actually pretty funny22:34
ianwmordred: ok ... https://c8abc17054e797f6cc7e-38b170b34202fbd22a3d39c7d4e00ec5.ssl.cf5.rackcdn.com/726037/4/check-arm64/system-config-run-nodepool-builder-arm64/886487d/bridge.openstack.org/ara-report/ got further than i thought22:35
openstackgerritJames E. Blair proposed opendev/system-config master: DNM: fail zuul tests  https://review.opendev.org/72624822:35
corvusfailed to fail22:35
mordredianw: oh. hah. I hadn't thought about needing arm zk images for our tests22:35
ianwmordred: yeah i think that was it, but it could be anything we install that has a wheel on x86 that doesn't on arm22:36
mordredianw: yah - I just checked because we're trying to install fewer things with global pip on these docker hosts22:36
ianwit did pull a nodepool image, though22:36
mordredbut - docker-compose is the exception22:36
mordredianw: it did?22:36
clarkbmordred: ianw in theory zk on arm should be easy because jvm at least22:36
corvusmordred, clarkb: can we switch back to distro d-c on focal?22:36
ianwyeah ... just looking now :)22:37
mordredcorvus: maybe22:37
mordredso - uhm. how did the job pull a nodepool image?22:37
ianw"Pulling nodepool-builder ... digest: sha256:82af2f94898157ea82...",22:37
ianwnot sure what that's cut off22:38
mordredI do agree - it does very much look like it successfully pulled on22:38
mordredone22:38
* clarkb looks at ubuntu package versions22:38
mordredbut - I haven't seen us upload one22:38
ianwi'm presuming you have memorised the sha256's of all our container images22:38
mordredcorvus: is it possible that the original docker push based upload to dockerhub actually was working for multi-arch manifest?22:39
clarkbcorvus: yes I think docker-compose on focal is plenty new enough (its from november 2019)22:39
mordredcool22:39
ianwshould we see it @ https://hub.docker.com/r/zuul/nodepool-builder/tags ?22:40
mordredianw: I think I'd expect to see more than one arch at https://hub.docker.com/layers/zuul/nodepool-builder/latest/images/sha256-f7d46486c717ed08d1b4ea90f77bf1a6e578cbfcb24f4bb353dd7345c88410a9?context=explore22:42
ianwclarkb: i just saw those openshift client failures too, but can download the file here; is it just a recheck situation?22:44
clarkbif I click on the kubernetes release file it works now22:44
corvusmordred: no idea, i took your word that it wasn't working22:44
clarkbianw: ya I think if there isn't another source for that (maybe that obs kubic related packaging for libcontainers has a sibling for openshift?) then we just recheck once it works and I think it's working for me now22:45
mordredcorvus: I haven't been able to see any evidence it was working22:45
ianwmordred: am i missing how to see older versions?22:46
mordredianw: there isn't a way to see older versions22:46
corvusmordred: easiest thing would be to fetch the manifest, yeah?22:46
mordredcorvus: yeah - and manifest inspect for me shows me only amd6422:47
corvusalso, i guess the arch should show up on https://hub.docker.com/r/zuul/nodepool-builder/tags ?22:47
mordredyeah22:47
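A small sketch of the manifest check discussed above, assuming the registry returns a Docker manifest list whose `manifests` entries each carry a `platform` object with `os` and `architecture` fields. The function name and any sample documents passed to it are illustrative.

```python
import json


def manifest_platforms(manifest_list_json: str):
    """List the os/architecture pairs advertised by a manifest list."""
    doc = json.loads(manifest_list_json)
    return sorted(
        "{}/{}".format(m["platform"]["os"], m["platform"]["architecture"])
        for m in doc.get("manifests", [])
    )
```

An image that was pushed for a single architecture would yield one entry here, while a multi-arch push should list both `linux/amd64` and `linux/arm64`.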
mordredso - I'm baffled by the pull working in that nodepool job -- oh wait22:48
mordredcorvus: is there any chance that job was finding an arm image in our intermediate registry?22:48
mordredthose should be cross-tenant so almost certainly not, right?22:49
mordredianw: what change was that job log from?22:49
ianwhttps://review.opendev.org/#/c/726037/22:49
*** tkajinam has joined #opendev22:49
ianwbut the registry job started at 6:1922:50
ianwhttps://zuul.opendev.org/t/openstack/build/7980170503ff45cfbb354eb5630e5c63 : 2020-05-07T06:19:1622:50
mordredyeah - there's nothing about that that should have been able to find a non-published internal image22:50
ianwand it also *didn't* find a zk image22:51
ianwi've got  a recheck going, let's see if it replicates first22:53
mordred++22:54
corvusmordred, ianw: am i correct in understanding that your current question is "what is sha256:82af2f94898157ea82 and where did it come from?"22:54
mordredcorvus: well - my main question is "how did docker on arm find _anything_ to install for zuul/nodepool" - but I think that's a valid sub question22:55
ianwcorvus: yes, that is what the arm64 job appeared to pull22:55
corvusmordred: it's hard to directly answer the real question because we changed dockerhub since then, right?22:55
corvus(once ianw rechecks, we'll have a test where we can actually inspect dockerhub at the same time though)22:55
ianwhttps://c8abc17054e797f6cc7e-38b170b34202fbd22a3d39c7d4e00ec5.ssl.cf5.rackcdn.com/726037/4/check-arm64/system-config-run-nodepool-builder-arm64/886487d/bridge.openstack.org/ara-report/result/8a4c85f5-536f-4ecd-98b0-b352b83dc6b7/22:56
mordredhttps://zuul.opendev.org/t/openstack/build/886487d6c14749ce8b7e7b29196fcec5/log/nb03.opendev.org/docker/nodepool-builder-compose_nodepool-builder_1.txt#122:56
mordredit pulled amd6522:56
mordredamd6422:56
mordredand then failed22:56
corvusianw, mordred: can you give your links more context?22:57
corvusmordred: that build link is different than the one ianw pointed to earlier22:57
ianwcorvus: https://review.opendev.org/#/c/726037/ is the job that is trying to run gate testing of arm64 nodepool builder22:57
mordredI'm looking at https://zuul.opendev.org/t/openstack/build/886487d6c14749ce8b7e7b29196fcec5 - which is the build for that job22:57
corvusok, thx22:58
mordredand that was the docker log for the nodepool container - which while docker pulled an image for it failed to exec the contents inside -  because "awesome"22:58
ianwso we see it failing to get a zk image due to the arch not matching, but downloading an x86 container for nodepool-builder and trying to run it22:58
ianwzk failure to pull due to arch @ https://c8abc17054e797f6cc7e-38b170b34202fbd22a3d39c7d4e00ec5.ssl.cf5.rackcdn.com/726037/4/check-arm64/system-config-run-nodepool-builder-arm64/886487d/bridge.openstack.org/ara-report/result/d48ca6f2-75bf-4a17-8dff-3bf2659f270b/22:58
corvusianw, mordred: this is the build that pushed sha256:82af... https://zuul.opendev.org/t/zuul/build/efe660457c1a4c34bb9fe46e3587dbe223:01
ianwhopefully the currently running one holds the nodes23:02
ianwssh root@2604:1380:4111:3e56:f816:3eff:fe74:ee7d is the zk node ; ssh root@2604:1380:4111:3e56:f816:3eff:fe5f:c9cf is the nb (to be) host23:05
corvusianw, mordred: dockerhub is updated and the images are now multi-arch: https://hub.docker.com/r/zuul/nodepool-builder/tags23:05
ianwheh, ok so this build is going to grab that anyway, hopefully23:06
* mordred needs to eod23:06
mordredcorvus: woot!23:07
ianwif things are looking good, i guess i'll just move forward on a nb03.opendev.org23:09
ianwi wonder actually if we have enough quota for a duplicate builder23:15
openstackgerritMerged opendev/system-config master: Pull and prune docker images together  https://review.opendev.org/72618523:16
mordredcorvus: I'm trying to eod - but I notice that we've uploaded to latest there, not just to change_{foo}_latest23:17
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Don't upload to the tag with buildx - only to the change tag  https://review.opendev.org/72626123:18
mordredcorvus, ianw: ^^ we should land that soonish - only affects nodepool images - but still23:19
ianwmordred: we're a few seconds away from seeing what the test nb03 tries to pull :)23:19
corvusmordred: yes, important, thanks23:20
corvusmordred: i just assumed the promote job failed to delete the change tag23:20
mordredianw: it should pull the right thing - although the k8s functional test in the nodepool job has not succeeded - so this is an accident :)23:21
ianw            "Labels": {23:21
ianw                "com.docker.compose.config-hash": "2549820a8f46c69784b2981e97d86858ce83ccbb73c5abd374c5c17abe038156",23:21
ianwstandard_init_linux.go:211: exec user process caused "exec format error"23:21
corvusmordred: but yes, now i see that change hasn't even merged yet :(23:21
ianwi would say it has not23:21
mordredpoo23:22
mordredSO - I noticed something in the zk situation23:22
mordredwhich is that it was looking for linux/arm64/v823:22
mordredand we are so far only building linux/arm6423:22
mordreddo we need to try building linux/arm64/v8 ?23:22
ianwArchitecture: aarch6423:23
ianwfrom docker info23:23
mordredlinux/amd64, linux/arm64, linux/riscv64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v623:24
mordredthose are the archs we can build for23:24
ianwbut again, the zk one failed with the no arch match ...23:24
mordredwe're so close ...23:25
ianwthe only difference seems to be latest in nodepool v image: docker.io/library/zookeeper:3.523:27
ianwa specific tag23:27
corvuszk only has amd64 for that tag23:27
ianwOH ... i thought it was our zk image ... that's an upstream image23:28
mordredcorvus: so, this: docker run --rm arm64v8/alpine uname -a23:28
mordredon mttest-docker23:28
corvussome old zk image tags have linux/arm/v6 arch23:28
mordredcorvus: shows that binfmt has been registered for the appropriate arch for our arm64/v8 host23:29
mordredcorvus: so I think we just need to learn how to configure the builder we're launching to work with it23:29
mordredassuming that everything is pulling the right images23:29
mordredoh - actually - arm64v8/alpine is arm64 arch in its image manifest23:30
mordredso maybe those two are compatible23:30
corvusmordred: yes i was about to point that out23:30
corvusi think the v8 thing is a red herring23:31
mordredyeah23:31
corvus(that came from the zk failure, but the zk failure doesn't have *any* kind of an arm image, so we shouldn't fixate on that)23:31
mordredagree23:32
corvusianw: did your test run pull this image?  https://hub.docker.com/layers/zuul/nodepool-builder/latest/images/sha256-606d630c83ada2f40a182f0ea88d6616f2005a1effb11f8e305aed07ee57ea85?context=explore23:32
ianwno23:33
ianwzuul/nodepool-builder   latest              d3999fec2fbc        41 minutes ago      634MB23:33
openstackgerritMerged zuul/zuul-jobs master: Don't upload to the tag with buildx - only to the change tag  https://review.opendev.org/72626123:34
ianwthat has in it "org.zuul-ci.change_url": "https://review.opendev.org/614074"23:34
ianwwhen i inspect that, it has "Architecture": "arm64",23:35
corvusianw: that's weird; dockerhub says that :latest was uploaded 32 minutes ago23:35
mordredcorvus: also - one more thought - we do install dumb-init in python-base - perhaps with docker layers we're getting an x86 dumb-init23:35
corvusmordred: interesting23:35
mordredcorvus: so maybe we actually DO need to build arm python-base23:35
ianwhttp://paste.openstack.org/show/793297/ is the full inspect23:36
mordredthe python base image itself is multi-arch - and nothing in our docker build of nodepool interacts with dumbinit23:36
*** tosky has quit IRC23:36
openstackgerritMonty Taylor proposed opendev/system-config master: Build multi-arch python-base/python-builder  https://review.opendev.org/72626323:39
corvusianw: based on the labels and arch, that appears to be the right image23:39
*** mlavalle has quit IRC23:40
mordredcorvus: so x86 python-base dumb-init is my best hypothesis right now23:40
corvusianw: maybe you can try some docker run in that image?23:41
corvusianw: basically, everything should work except dumb-init if mordred's hypothesis is correct23:41
ianwcorvus: yeah i am, but i always get the exec format error23:41
mordredianw: what are you running?23:41
ianw# docker run --entrypoint "/usr/bin/python" zuul/nodepool-builder:latest23:41
ianwstandard_init_linux.go:211: exec user process caused "exec format error"23:41
mordredah - interesting23:41
ianwis there some way to inspect the container from the outside; mount it as a volume or something?23:42
ianwi could export it i guess?23:42
corvusianw: yeah, docker save?23:42
mordredok. I really do have to EOD- good luck23:43
ianw file /usr/bin/python2.723:45
ianw /usr/bin/python2.7: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV)23:45
ianwok, so i exported the image and untarred it23:47
ianw# ./bin/bash23:47
ianw-bash: ./bin/bash: cannot execute binary file: Exec format error23:47
ianwhttp://paste.openstack.org/show/793298/ ...23:49
ianwfile format elf64-littleaarch64 (good) v file format elf64-little (bad)23:51
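The per-binary check being done by hand above can be sketched by reading the `e_machine` field of the ELF header directly. The constants are the standard ELF machine numbers for x86-64 and AArch64; the function name is illustrative.

```python
import struct

EM_X86_64 = 62
EM_AARCH64 = 183


def elf_machine(header: bytes) -> int:
    """Return e_machine from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # EI_DATA (byte 5) gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is the 16-bit field at offset 18, right after e_type.
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return machine
```

Reading the first 20 bytes of each binary in the exported image (`open(path, "rb").read(20)`) and comparing the result with `EM_AARCH64` would show which layers contributed files built for the wrong architecture.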
