Wednesday, 2020-04-22

clarkbianw it looks like ubuntu-ports got interrupted so it stopped running was that you?00:30
clarkbif not I'll start it again00:31
ianwi don't think so ...00:31
clarkbok starting it again.00:31
ianwSelect returned error 4: Interrupted system call ... anything in dmesg about afs?00:32
clarkbnot recently00:32
ianwnope00:32
clarkbconsidering the size of the repo its probably not completely unexpected00:32
clarkband we'll just keep running it until it goes happy00:32
ianwyeah, i didn't ctrl-c it or anything00:34
clarkbpretty sure I didn't either. Will just continue to watch it00:34
clarkbI checked quota usage and we have plenty of room there too00:34
ianwlooks like the same just happened on the ubuntu main sync00:41
clarkbianw: looks like ubuntu-ports picked up where it left off00:41
clarkbis it possible the two are interacting with each other?00:41
clarkbthey were fine for a long time :?00:41
clarkbianw: maybe I should start the ubuntu main repo sync again then see if ubuntu-ports stops soon?00:41
clarkboh you got it thanks00:42
clarkbok I'm eating dinner now will keep checking on it after00:42
ianwi just restarted the main one ... we don't have a timeout in the script do we?00:42
clarkbianw: oh maybe00:42
clarkbwe do in the cron but I thought running it manually like this might be avoiding that00:42
ianwyeah, i think we want NO_TIMEOUT00:42
ianwin fact, we do want that00:43
clarkbah yup00:43
clarkbok I guess next time it times out we can set that00:43
clarkbsorry about that I misread the script00:43
ianwsetting it only under cron is a good idea00:45
ianwkeep that in mind when we move it all over :)00:45
clarkb++00:46
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add initial haskell job  https://review.opendev.org/72173500:55
openstackgerritIan Wienand proposed openstack/diskimage-builder master: yum-minimal: strip env vars in chroot calls  https://review.opendev.org/72172600:57
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] switch func tests to containers  https://review.opendev.org/72151100:57
openstackgerritTristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add initial haskell job  https://review.opendev.org/72173501:10
clarkbianw: I updated the etherpad with the timeout info01:27
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176301:27
clarkbit should timeout in about 34 minutes or so01:27
clarkbI'll try to get them then01:27
ianwclarkb: i can restart them, don't worry too much if you've got better things to do :)01:28
clarkbianw: ok maybe I'll take you up on that. It's just adding NO_TIMEOUT=1 to the beginning of the command fwiw01:28
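The timeout behaviour discussed above (a limit applied under cron, skipped when NO_TIMEOUT is set for a manual run) can be sketched roughly as follows. This is an illustration in Python and not the actual mirror update script: the names `build_command` and `run_sync`, the default limit, and the example command are all hypothetical. Only the `NO_TIMEOUT` variable name comes from the discussion.

```python
import os
import subprocess


def build_command(cmd, env=None, limit_seconds=7200):
    """Return cmd wrapped in coreutils `timeout` unless NO_TIMEOUT is set.

    limit_seconds is a hypothetical default; the real limit is not stated
    in the log.
    """
    env = os.environ if env is None else env
    if env.get("NO_TIMEOUT"):
        # Manual run: let the sync go for as long as it needs.
        return list(cmd)
    # Cron run: cap the runtime so a hung sync cannot block the next one.
    return ["timeout", str(limit_seconds)] + list(cmd)


def run_sync(cmd, env=None):
    """Run the (possibly wrapped) command and return its exit status."""
    return subprocess.call(build_command(cmd, env))
```

With this shape, `NO_TIMEOUT=1` in the environment of a manual invocation leaves the command untouched, while a cron invocation without it gets the runtime cap.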
ianwyep01:28
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176301:30
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176301:36
openstackgerritMerged openstack/diskimage-builder master: Make ipa centos8 dib job voting  https://review.opendev.org/71770001:44
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176301:58
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176302:19
ianw^ ok both now running with NO_TIMEOUT02:20
*** DSpider has joined #opendev02:46
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176302:47
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176303:13
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176303:42
ianwdirk / mordred / frickler : ^ can we go with something like this before next dib release?03:45
*** kevinz has quit IRC04:24
*** ykarel|away is now known as ykarel04:25
clarkbianw: looks like both are still going. quotas look good too04:51
clarkbianw: if this ends up going past your EOD I'll just pick it up in the morning04:51
clarkbbut now I am going to call it a night. Thanks!04:52
*** ysandeep|away is now known as ysandeep05:34
dirkianw: can you explain why this is a revert?05:59
dirkIt seems more like a refinement of the previous change. E.g. different way of making it work with opensuse 1506:00
dirkianw: please note that we readded python2-six etc to tumbleweed. So it is possible with tumbleweed to still create a python2 virtualenv. Parts of the change would not be needed then06:01
openstackgerritMerged zuul/zuul-jobs master: helm-template: enable using values file  https://review.opendev.org/72136506:09
AJaegerfrickler: could you review https://review.opendev.org/#/c/721719/ to retire i18n-specs, please?06:13
*** dpawlik has joined #opendev06:24
ianwdirk: for tumbleweed we don't want pip-and-virtualenv involved at all, per our removal plans in https://docs.opendev.org/opendev/infra-specs/latest/specs/cleanup-test-node-python.html06:33
ianwfor 15; as i've mentioned several times now, that is using the python2 path to install the python3 packages (it doesn't specify _do_py3) which is wrong and makes a mess even more messy06:37
ianwi believe we could actually drop pip-and-virtualenv from the image right now06:38
ianwunfortunately last time, https://review.opendev.org/#/c/712609/ got -2'd and i didn't notice, so devstack seemed broken, and it led mordred and others on a wild goose chase trying to restore an old image06:39
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Revert "opensuse: fix python 2.x install"  https://review.opendev.org/72176306:51
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Restore SUSE tests to gate  https://review.opendev.org/72177906:58
ianwclarkb: yeah, not sure if i'll get back to it, it's up to l & p in universe it seems07:18
dirkianw: thanks for sharing the background07:21
dirkAlthough you mentioned it several times before ;-) I don't read all of the channels all the time so I might miss comments that are not including my.nick07:22
ianwthat's ok ... the best thing would be if we can *all* never have to think about this element again! :)07:23
*** rpittau|afk is now known as rpittau07:24
*** ralonsoh has joined #opendev07:25
*** tosky has joined #opendev07:45
openstackgerritSorin Sbarnea proposed openstack/project-config master: Enable promote to unarchive gz archives in addition to bz2  https://review.opendev.org/72165208:08
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: run default envlist if tox_envlist is empty  https://review.opendev.org/72179008:22
*** roman_g has quit IRC08:23
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: Run all environments in tox_envlist  https://review.opendev.org/72179008:36
*** Dmitrii-Sh has joined #opendev08:45
*** ysandeep is now known as ysandeep|lunch08:46
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: Run all environments in tox_envlist  https://review.opendev.org/72179009:00
*** ysandeep|lunch is now known as ysandeep09:06
yoctozeptoinfra-root: is ethercalc all right? I can't access https://ethercalc.openstack.org/kolla-infra-service-matrix09:12
*** ykarel is now known as ykarel|lunch09:18
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: tox: update tox_envlist documentation  https://review.opendev.org/72179609:19
frickleryoctozepto: indeed, apache2 error log was full with "[mpm_event:error] [pid 5272:tid 140651845121920] AH00485: scoreboard is full, not at MaxRequestWorkers" starting with today's log rotation09:26
frickler#status log restarted apache2 on ethercalc.openstack.org which seems to have gotten stuck during today's log rotation09:27
openstackstatusfrickler: finished logging09:27
yoctozeptofrickler: thanks09:27
frickleruptime 616d, not bad for a cloud service09:30
*** ykarel|lunch is now known as ykarel09:45
*** rpittau is now known as rpittau|bbl10:42
*** roman_g has joined #opendev11:15
*** ysandeep is now known as ysandeep|brb11:46
*** rpittau|bbl is now known as rpittau11:50
openstackgerritMerged openstack/diskimage-builder master: Remove Babel and any signs of translations  https://review.opendev.org/72067311:51
ttxnow running the async refs/changes cleanup script on the openstack github mirrors. Should take a few days11:51
*** Eighth_Doctor is now known as Conan_Kudo12:25
*** Conan_Kudo is now known as Eighth_Doctor12:27
*** Eighth_Doctor is now known as Conan_Kudo12:28
*** Conan_Kudo is now known as Eighth_Doctor12:29
*** Eighth_Doctor is now known as Conan_Kudo12:29
*** Conan_Kudo is now known as Eighth_Doctor12:30
*** ysandeep|brb is now known as ysandeep12:41
AJaegerconfig-core, could you review https://review.opendev.org/#/c/721719/ to retire i18n-specs, please?12:52
AJaegerttx: thx, hope you have a stable connection ;)12:52
*** ykarel is now known as ykarel|afk13:00
openstackgerritMonty Taylor proposed opendev/system-config master: Move in-tree hiera settings to ansible vars  https://review.opendev.org/72162913:08
openstackgerritMonty Taylor proposed opendev/system-config master: Add new etherpad to cacti  https://review.opendev.org/72163313:09
openstackgerritMonty Taylor proposed opendev/system-config master: Stop cloning a bunch of puppet modules we don't use  https://review.opendev.org/72089213:09
ttxAJaeger: I procured a cloud instance to do that :) Don't want to see my IP blacklisted on GitHub13:12
AJaeger;)13:14
mordredttx: neat. we already landed the "don't replicate new refs/changes" config yes?13:23
ttxyes13:23
ttxhttps://review.opendev.org/#/c/720679/13:24
mordredinfra-root: holy crap! https://review.opendev.org/#/c/721098 is green! that means it fixes the speculative container support - and otherwise all works!13:34
mordredcorvus: let me know when you've got a few minutes to chat about nodepool containers13:36
AJaegermordred: time for a quick review of https://review.opendev.org/#/c/721719/ to retire i18n-specs, please?13:39
openstackgerritMonty Taylor proposed opendev/system-config master: Stop cloning a bunch of puppet modules we don't use  https://review.opendev.org/72089213:40
AJaegerthx, mordred13:42
corvusmordred: i'm here.  also, w00t!13:45
*** sgw has quit IRC13:47
corvusmordred: left some comments on that change13:49
*** ykarel|afk is now known as ykarel13:51
openstackgerritMerged openstack/project-config master: Retire i18n-specs repo  https://review.opendev.org/72171913:53
openstackgerritMonty Taylor proposed opendev/puppet-cgit master: Retire repo  https://review.opendev.org/72196213:53
openstackgerritMonty Taylor proposed opendev/puppet-accessbot master: Retire repo  https://review.opendev.org/72196313:55
openstackgerritMonty Taylor proposed opendev/puppet-etherpad_lite master: Retire repo  https://review.opendev.org/72196413:55
openstackgerritMonty Taylor proposed opendev/puppet-exim master: Retire repo  https://review.opendev.org/72196513:55
openstackgerritMonty Taylor proposed opendev/puppet-gerrit master: Retire repo  https://review.opendev.org/72196613:55
openstackgerritMonty Taylor proposed opendev/puppet-gerritbot master: Retire repo  https://review.opendev.org/72196713:55
openstackgerritMonty Taylor proposed opendev/puppet-infracloud master: Retire repo  https://review.opendev.org/72196813:55
openstackgerritMonty Taylor proposed opendev/puppet-ipsilon master: Retire repo  https://review.opendev.org/72196913:56
openstackgerritMonty Taylor proposed opendev/puppet-iptables master: Retire repo  https://review.opendev.org/72197013:56
openstackgerritMonty Taylor proposed opendev/puppet-jenkins master: Retire repo  https://review.opendev.org/72197113:56
openstackgerritMonty Taylor proposed opendev/puppet-nodepool master: Retire repo  https://review.opendev.org/72197213:56
openstackgerritMonty Taylor proposed opendev/puppet-openstackci master: Retire repo  https://review.opendev.org/72197313:56
openstackgerritMonty Taylor proposed opendev/puppet-os_client_config master: Retire repo  https://review.opendev.org/72197413:56
openstackgerritMonty Taylor proposed opendev/puppet-packagekit master: Retire repo  https://review.opendev.org/72197513:56
openstackgerritMonty Taylor proposed opendev/puppet-phabricator master: Retire repo  https://review.opendev.org/72197613:57
openstackgerritMonty Taylor proposed opendev/puppet-zuul master: Retire repo  https://review.opendev.org/72197713:57
openstackgerritRoman Gorshunov proposed openstack/project-config master: Retire airship-in-a-bottle: end project gating  https://review.opendev.org/72197813:57
mordredAJaeger: ^^ retire patches!13:57
mordredcorvus: cool - thanks13:57
AJaegermordred: thanks!13:58
mordredcorvus: so - it looks like ianw has sorted the nodepool builder container issue - which means we should either grapple with the arm question or continue with the "switch builders to use ansible" approach. did you get a chance to read the scrollback from me from the other day about the arm stuff?13:58
mordredhttp://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-19.log.html#t2020-04-19T15:04:32 for context13:59
corvusmordred: on the issue of buildset registry -- i believe it's just a request that it be in the same provider.  if that provider is not capable of producing the label, then it falls back to normal behavior14:02
corvusmordred: so i think we can still use a single buildset registry if we want, we'll just be shipping a lot of data over the internet14:03
mordredcorvus: ok - that's good14:03
mordredthat part was the biggest "this might not work at all"14:03
corvusyeah, i think that's worth a try14:03
corvusmordred: what's the manifest manipulation for?  i didn't understand that part14:04
mordredyou have to create a manifest that references the other images you made - and you push that to docker hub as "the image"14:04
corvusah, that makes me understand the man page now :)14:05
mordredso we make nodepool-builder:x86 and nodepool-builder:arm and then a manifest that is nodepool-builder:latest that references nodepool-builder:x86 as the image for x86 and nodepool-builder:arm as the image for arm14:05
mordredcorvus: \o/14:05
*** sgw has joined #opendev14:06
corvusmordred: i think the existing tooling is using docker; do we want to do that and leave buildah for later?14:07
mordredcorvus: yeah- I think I mostly mentioned buildah in case it was easier - it seems like the manifest step is going to be a new job on a new node, so it could likely use whatever is easiest (sort of like using skopeo for the transfers)14:08
mordredI'm guessing a "nodepool-build-manifest" job that dependencies: ['nodepool-build-image-arm', 'nodepool-build-image-x86'] yeah?14:09
corvusmordred: i guess to do this incrementally, we could download the manifest, add an entry, then upload it again?14:09
mordredcan we?14:09
corvusi'm wondering if we can do that without a third job; but even if we could, coordinating between those 2 jobs would be difficult.  so maybe it's simplest/best to do your suggestion14:10
mordredcorvus: we could do it incrementally perhaps if we also used a semaphore14:11
corvus(basically, the step of discovering whether a job was the 1st or second image would be a race)14:11
mordredyeah14:11
corvusmordred: yeah, but then we're serializing image builds which is lame14:11
mordredtotally agree14:11
mordredso this will be fun - I think maybe we just start with nodepool image build jobs in repo - and then extract out a build-combined-manifest role?14:12
mordredor do you think we should go ahead and do a build-combined-manifest role in zuul-jobs to start with?14:12
mordredguess it doesn't really make it any harder14:12
corvusyeah, i'd do that :)14:13
mordredk. I'll work on this today - because I think it's more forward-looking (and honestly a cool new zuul thing) than regressing to a pip install14:13
corvusok14:14
mordredI've got a couple of calls in the middle of the day, so I might fall into a hole for a minute, but I think it's reasonably doable14:14
mordredoh - I guess we should figure out what the docker architecture tag is for our arm hosts14:14
corvusmordred: i'm guessing the docker manifest command needs to run on a machine with dockerd, so it'll need a real node14:15
mordredyeah14:16
corvusmordred: i wonder 2 things: 1) if buildah could do it on the executor; 2) if we can just write a json file and skopeo it into to the registry, without any extra tools?14:16
mordredcorvus: ooh - that would be neat14:17
mordred(the 2 one)14:17
mordredI mean - it IS just a json file14:17
mordredarm64 seems to be the architecture tag14:17
corvusyeah, i think that might be worth it14:17
mordredcorvus: we could do that as a followup14:18
corvusit looks like we just need to know the size and sha256 of the underlying image manifests?14:18
corvusmordred: i think it might be easier to just start with this?14:18
mordredoh yeah?14:18
corvusmordred: we might be able to add the sha256 and size information to the zuul artifact return14:20
corvusmordred: which means the 'create a json' file step could just be a jinja template14:20
corvuswe wouldn't even need to move any data around14:20
corvusactually, maybe we can just 'skopeo inspect' the remote image14:22
mordredoh yeah - almost certainly14:22
* mordred was just running docker manifest inspect ubuntu:latest - and docker manifest inspect -v ubuntu:latest - to see what it looked like14:23
mordredcorvus: oh - hah. I was thinking we could just start with nodepool - but really we need to start with python-base/python-builder :)14:25
corvusskopeo inspect docker://insecure-ci-registry.opendev.org:5000/zuul/zuul-merger:c96e6abbfd954049ab90323c268b26fc_latest14:26
corvusFATA[0001] Error determining repository tags: invalid status code from registry 404 (Not Found)14:26
mordredhrm14:26
corvusthat's weird; it's there, i can pull it, and the registry itself only logs 200 return codes14:26
corvusa14:27
corvus2020-04-22 14:27:07,184 INFO cherrypy.access.140475213507728: 70.36.235.80 - - [22/Apr/2020:14:27:07] "GET /v2/zuul/zuul-merger/tags/list HTTP/1.1" 404 762 "" "Go-http-client/1.1"14:27
corvusmordred: ^ maybe the registry is missing an api endpoint and i should flesh that out14:27
mordredcorvus: ++14:28
corvusmordred: how about we proceed under the assumption that i can get that working today and that we'll be able to use 'skopeo inspect' to get the necessary info?14:28
AJaegermordred: https://docs.opendev.org/opendev/infra-manual/latest/drivers.html#step-2-remove-project-content has a slightly different README, want to update for the retirement - or stay with yours? I wouldn't -1 yours, just find the other nicer14:28
mordredcorvus: I'll start working on the other parts of the puzzle14:28
corvusmordred: oh -- i don't see the size in the inspect output; only the sha14:29
corvusif we need size that may not be enough14:29
mordredcorvus: I see size in inspect output from docker14:30
mordredoh - that's on a manifest - one sec14:30
mordredcorvus: yeah- I see size from docker14:30
mordredbut that's not against insecure-ci-registry - that's just against local dockerhub stuff14:31
corvusi pulled an image and 'docker inspect' shows the size, but 'skopeo inspect' does not; it looks like it's missing from skopeo output14:32
mordredcorvus: "awesome"14:32
mordredcorvus: is there a -v option or anything to skopeo inspect?14:32
corvuslooking14:33
corvusoh, i think i was getting confused with manifests and non-manifests14:34
corvusgimme a sec to sort this out14:34
mordredcorvus: skopeo inspect --raw gives me what docker gives me14:35
mordredcorvus: skopeo inspect --raw docker://docker.io/opendevorg/python-base14:35
mordredand skopeo inspect --raw docker://docker.io/library/ubuntu is uglier but shows the manifest version14:36
corvusmordred: yeah, i think that'll work -- i think the ['config']['size'] from that is what we want to put in ['manifests'][0]['size'] of the multi arch manifest14:37
mordredyeah14:37
corvusw00t, then i think we can proceed as planned :)14:37
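The "it IS just a json file" idea for the multi-arch manifest can be sketched as below. This is a rough illustration only: the function names are hypothetical, and the input is assumed to be the raw per-architecture manifest bytes such as `skopeo inspect --raw` would print. The sketch takes both size and digest from those raw manifest bytes, which is one reading of the manifest list format; it does not use the ['config']['size'] value mentioned above, so that choice should be verified against the registry spec before relying on it.

```python
import hashlib
import json

MANIFEST_V2 = "application/vnd.docker.distribution.manifest.v2+json"
MANIFEST_LIST_V2 = "application/vnd.docker.distribution.manifest.list.v2+json"


def manifest_entry(raw_manifest, architecture, os_name="linux"):
    """Describe one per-architecture image manifest for the manifest list."""
    return {
        "mediaType": MANIFEST_V2,
        # Assumption: size and digest describe the raw manifest bytes.
        "size": len(raw_manifest),
        "digest": "sha256:" + hashlib.sha256(raw_manifest).hexdigest(),
        "platform": {"architecture": architecture, "os": os_name},
    }


def build_manifest_list(raw_manifests):
    """Build a manifest list from a mapping of architecture -> raw bytes."""
    return {
        "schemaVersion": 2,
        "mediaType": MANIFEST_LIST_V2,
        "manifests": [
            manifest_entry(raw, arch)
            for arch, raw in sorted(raw_manifests.items())
        ],
    }


def render(manifest_list):
    """Serialize the manifest list as the JSON document to upload."""
    return json.dumps(manifest_list, indent=2)
```

Because nothing here needs a Docker daemon, a step like this could in principle run anywhere the per-architecture manifests can be fetched, which is the appeal of the approach discussed above.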
mordredcorvus: I think this is going to be real nice14:37
mordredand as a bonus - doesn't require nested virt! (which is what the docker buildx command needs)14:38
openstackgerritAndreas Jaeger proposed openstack/project-config master: Fix i18n-specs jobs  https://review.opendev.org/72198914:38
corvusmordred: it does require an arm cloud though14:38
corvusand we don't have a great track record with those14:39
mordredindeed14:40
mordredcorvus: well - if our current arm cloud goes away, so does our need for multi-arch images14:40
corvusyeah; but since this is going to run on every nodepool change, it's going to add extra time to each.  and we may need to frequently disable the job due to outages.14:42
mordredcorvus: but if we're worried about that - we could just do parallel jobs with an explicit arm64 tag in arm specific pipelines14:42
corvustrue14:42
mordredand in the deployment tell nb03 to boot from nodepool-builder:arm6414:42
mordredit's less sexy - but it might be more resilient in the short term14:42
corvusthat works for check, but it's not great for gate or promote14:44
mordredcorvus: in fact, we could go more simple for arm and just do a post-arm64 pipeline that does the build and upload - since with parallel pipelines we aren't going to do gate so that ^^14:44
mordredcount on x86 for the testing and speculative (not like we're likely to want to run full integration tests on arm anyway)14:45
mordredand treat the arm build as more of a packaging step14:45
corvusyeah, that'd work, but then the images might get out of sync for deployment14:45
*** factor has quit IRC14:45
corvusthat may not be a concern if we're not going to auto restart the builders14:45
mordredyeah14:46
openstackgerritRoman Gorshunov proposed openstack/project-config master: Retire airship-in-a-bottle: end project gating  https://review.opendev.org/72197814:48
clarkbinfra-root ubuntu-ports is still going but ubuntu finished. Unfortunately ubuntu reports "there have been errors" so I think I'm going to rerun reprepro on ubuntu14:50
fungithanks for the heads up14:50
clarkbin theory it should be much quicker this time because the bulk of the data has been transferred14:51
fungiit does still wind up scanning a bunch of what's there, but yeah, at least it doesn't need to retransmit and write14:51
clarkbyup14:51
clarkbalso it appears that the ubuntu-ports reprepro might have stalled? We don't have timing info within the reprepro log output so hard to say for sure. But if it stays in its current log state for the next $breakfasttime then maybe I'll stop it start it again too14:53
clarkboverall though quota usage looks reasonable and about what I expected14:53
fungistrace the newest child process i guess?14:54
*** mlavalle has joined #opendev14:55
clarkbstrace says it is doing a lot of reading. Maybe it is verifying the state of the repo?14:56
clarkbI'll leave it be at least until strace shows it idling for a long period of time14:56
*** ykarel is now known as ykarel|away14:57
openstackgerritMerged openstack/project-config master: Fix i18n-specs jobs  https://review.opendev.org/72198915:06
clarkblooks like ubuntu main has reported "there are errors" again15:14
clarkbit did add and remove more packages though15:14
clarkbso maybe we just need to converge on happy state? I think that is how it works in the normal daily cron15:14
clarkbI'll try running it again since that ran relatively quickly and gives us more data15:14
clarkbactually hrm the same unhappy files show up this time around too15:16
clarkblets look at those more closely15:16
clarkbhttp://paste.openstack.org/show/792549/15:17
clarkbfungi: ^ I think that means the db is saying it knows about some files that aren't present on disk15:17
clarkbfungi: any idea on how to recover that?15:17
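The suspected inconsistency above (a database that references files no longer present on disk) can be checked with a small helper like the one below. This is only a sketch and does not use any reprepro interface: the function name is hypothetical, and the list of referenced paths is assumed to have been collected separately, for example from the error output.

```python
import os


def missing_files(base_dir, referenced_paths):
    """Return the referenced paths (relative to base_dir) absent on disk."""
    return [
        rel for rel in referenced_paths
        if not os.path.isfile(os.path.join(base_dir, rel))
    ]
```

An empty result would suggest the files are present after all and the problem lies elsewhere; a non-empty result gives the exact set of entries to reconcile.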
openstackgerritMonty Taylor proposed opendev/puppet-cgit master: Retire repo  https://review.opendev.org/72200115:18
openstackgerritMonty Taylor proposed opendev/puppet-etherpad_lite master: Retire repo  https://review.opendev.org/72200215:18
openstackgerritMonty Taylor proposed opendev/puppet-exim master: Retire repo  https://review.opendev.org/72200315:18
openstackgerritMonty Taylor proposed opendev/puppet-gerrit master: Retire repo  https://review.opendev.org/72200415:19
openstackgerritMonty Taylor proposed opendev/puppet-gerritbot master: Retire repo  https://review.opendev.org/72200515:19
openstackgerritMonty Taylor proposed opendev/puppet-github master: Retire repo  https://review.opendev.org/72200615:19
openstackgerritMonty Taylor proposed opendev/puppet-infracloud master: Retire repo  https://review.opendev.org/72200715:19
openstackgerritMonty Taylor proposed opendev/puppet-ipsilon master: Retire repo  https://review.opendev.org/72200815:19
openstackgerritMonty Taylor proposed opendev/puppet-iptables master: Retire repo  https://review.opendev.org/72200915:19
openstackgerritMonty Taylor proposed opendev/puppet-jenkins master: Retire repo  https://review.opendev.org/72201015:19
openstackgerritMonty Taylor proposed opendev/puppet-nodepool master: Retire repo  https://review.opendev.org/72201115:19
openstackgerritMonty Taylor proposed opendev/puppet-openstackci master: Retire repo  https://review.opendev.org/72201215:19
openstackgerritMonty Taylor proposed opendev/puppet-os_client_config master: Retire repo  https://review.opendev.org/72201315:19
openstackgerritMonty Taylor proposed opendev/puppet-packagekit master: Retire repo  https://review.opendev.org/72201415:19
openstackgerritMonty Taylor proposed opendev/puppet-phabricator master: Retire repo  https://review.opendev.org/72201515:19
openstackgerritMonty Taylor proposed opendev/puppet-zuul master: Retire repo  https://review.opendev.org/72201615:19
mordredAJaeger: ^^ fixed readme15:19
mordredAJaeger: SIGH15:20
mordredAJaeger:  I did not commit amend well15:20
clarkbports just finished and didn't report errors so I am going to vos release it now15:21
clarkbstill trying to make sense of the ubuntu main complaints15:21
clarkboh I forgot to remove the package indexes for trusty on ubuntu-ports. This is what I get for doing this first thing in the morning15:22
clarkbthat shouldn't break anything will just advertise trusty when we don't have packages for it15:22
clarkbI've made a note on the etherpad to do that then vos release again15:22
fungiclarkb: honestly, not sure, the last time i ran into inconsistency issues (for the debian/buster backports addition) i ended up removing reprepro's database and letting it recreate it15:23
clarkbfungi: it looks like deleteifunreferenced (the thing that fails) is a less angry version of deleteunreferenced (based on man page reading)15:24
openstackgerritMonty Taylor proposed opendev/puppet-accessbot master: Retire repo  https://review.opendev.org/72201815:24
clarkbso I'm going to try deleteunreferenced then rerunning reprepro15:24
fungiworth a shot15:24
clarkbthis was the command used at the beginning to clear out trusty anyway so should be safe15:24
mordred++15:25
clarkboh you know what we might be feeding it that list of chrony things manually15:25
clarkbI'm checking on that and if so I might just move that file aside15:25
clarkbbecause it likely got cleared in the trusty clearing15:25
openstackgerritJan Zerebecki proposed openstack/diskimage-builder master: Retry git clone/fetch on timeout  https://review.opendev.org/72158115:27
clarkbya I think that is it15:27
openstackgerritAndreas Jaeger proposed openstack/project-config master: Finish retiring i18n-specs  https://review.opendev.org/72172215:28
AJaegermordred: sorry, I shouldn't have asked ;)15:28
clarkbthat file only had the three chrony packages in it. I've backed it up to my homedir then cleared it out15:29
clarkbthat should get this moving I think15:29
clarkbassuming that reports success I'll clear out the trusty indices in ubuntu and vos release. Then do the same for ubuntu-ports once its vos release is completed15:30
mordredAJaeger: it's ok - I abandoned the bad ones15:32
mordredclarkb: \o/15:32
clarkbit's crazy how arcane reprepro is15:33
mordredclarkb: also - fwiw - ianw got the container fixed up for nb04 - and corvus and I discussed some strategies for building arm containers15:33
mordredclarkb: you might enjoy reading that scrollback - there's a couple of different options we can do15:33
clarkbmordred: have a timestamp I should look for?15:33
mordredclarkb: literally the conversation that ended right before you said the first thing today15:34
AJaegermordred: https://review.opendev.org/#/c/722013/ is in merge conflict?15:35
clarkbmordred: corvus as an intermediate step we could possibly build nodepool-builder-arm64 and not manifest? I don't know if that makes things simpler15:36
clarkbthe last time I looked into the docker manifest stuff it seemed to be painful because client tools weren't all up to speed but that was a while back and they may be better now15:36
corvusclarkb: i think that's effectively mordred's latest option, except he spelled nodepool-builder-arm64 as nodepool-builder:arm64 :)15:37
corvusclarkb: do you recall the client problems?15:38
clarkbcorvus: iirc they simply didn't know about the manifest thing so you'd pull an image and it wouldn't work15:38
corvus(it seems likely that our base debian images would be multi-arch)15:38
clarkbits likely that was an issue for distro shipped docker as newer docker added the support15:39
clarkboh good point so ya we are probably fine there15:39
corvusyep, python:3.7-slim is multiarch15:40
corvus(and does include arm64)15:40
clarkbubuntu main just exited cleanly. I'm removing the trusty indexes now and will vos release \o/15:41
corvusclarkb, mordred: so i think the dedicated image post job is the way to go if we're concerned about cloud stability and build times.  if/when we think our arm64 clouds are in a good enough place in terms of reliability and capacity to put them in the critical path for landing every nodepool change, then the multi-arch manifest is the way to go.15:42
clarkbcorvus: wfm15:44
clarkbfungi: mordred: re focal. The next thing I notice is we don't create a backports or security index15:44
clarkbI believe this is a known thing with reprepro when upstream has no packages15:44
clarkbI forget how we dealt with that in the past but heads up that we'll likely need to handle it here too15:44
clarkbboth ubuntu-ports and ubuntu are vos releasing now15:45
openstackgerritJan Zerebecki proposed openstack/diskimage-builder master: Retry zypper when refresh failed  https://review.opendev.org/72158715:45
openstackgerritJan Zerebecki proposed openstack/diskimage-builder master: Retry git clone/fetch on timeout  https://review.opendev.org/72158115:46
fungiclarkb: in the past we've avoided adding the security and backports suites to sources lists until they have packages15:46
clarkbfungi: doesn't that break our mirror setup on test nodes?15:46
clarkbmaybe we addressed that and its fine15:46
clarkbI'm not sure anything is actually wrong here. Just calling out that the previous behavior that I Think gave us problems is present here too15:47
fungiyeah, i think the only thing it breaks is if systems add lines to their sources list for those suites15:52
mordredI believe security and backports get created upstream once final release happens15:52
fungisince there won't be any package indices for them until there are packages to index for them15:52
fungimordred: at least for debian it's even later15:52
mordredyeah15:53
fungiusually security suite gets created after there's a security update, and backports once there's a package backported15:53
persiaHistorically, for Ubuntu, it was a week or so before release (related to the last final test milestone, as the final installer wanted to have -backports and -security available,. and that needed testing).15:53
fungior, well, technically the backports and security suites for debian *do* exist earlier. the problem here is that reprepro won't create indices for them because they have no packages to mirror15:54
persiaYes.  Reprepro is the limiting factor here.  The decision to avoid adding the suites until they have packages is the correct decision for opendev.15:54
fungiwe have to wait to rely on them until after reprepro will create indices for them, which won't be until there are some packages in them15:54
fungior we need some separate process to create those empty indices outside of reprepro15:55
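A rough sketch of the "separate process" idea fungi mentions above: write empty Packages indices for a suite that reprepro skipped. The layout follows the usual dists/<suite>/<component>/binary-<arch>/ convention. The helper name and the suite, component and arch values are illustrative assumptions, not anything from the opendev mirror scripts, and a real mirror would also need a matching Release file, which this does not produce.

```python
import gzip
from pathlib import Path


def write_empty_indices(mirror_root, suite, components, arches):
    """Create empty Packages and Packages.gz files for a suite.

    Hypothetical helper, not part of reprepro or the opendev mirror
    scripts. It does not generate or sign a Release file.
    """
    created = []
    for component in components:
        for arch in arches:
            index_dir = (
                Path(mirror_root) / "dists" / suite / component / f"binary-{arch}"
            )
            index_dir.mkdir(parents=True, exist_ok=True)
            plain = index_dir / "Packages"
            plain.write_bytes(b"")
            # A valid gzip stream that decompresses to zero bytes.
            with gzip.open(index_dir / "Packages.gz", "wb") as fh:
                fh.write(b"")
            created.append(plain)
    return created
```

Whether apt clients accept such a suite depends on the Release metadata, so this would only be one piece of a workaround.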
clarkbmordred: corvus re nodepool ansible/containers/etc I'd like to restart nl03 with the change to address quota accounting and see if that makes inap things happier16:17
clarkbis there any reason to not just do that now on the existing host?16:17
mordredclarkb: I can't think of one16:23
clarkbk will do that shortly16:26
*** rpittau is now known as rpittau|afk16:29
clarkbinfra-root I'm stopping and starting nodepool-launcher on nl03 to get the quota accounting fix in16:36
clarkbinap is currently happy so the next time we have delete issues we should see if this helps with grabbing too many node requests16:37
clarkbnodepool==3.12.1.dev69  # git sha cbdf374 is what I restarted on16:38
*** dpawlik has quit IRC16:40
*** dpawlik has joined #opendev16:40
*** dpawlik has quit IRC16:45
openstackgerritMerged opendev/base-jobs master: Enable promote to unarchive gz archives in addition to bz2  https://review.opendev.org/72170616:47
openstackgerritMerged openstack/project-config master: Enable promote to unarchive gz archives in addition to bz2  https://review.opendev.org/72165216:48
*** _mlavalle_1 has joined #opendev16:48
clarkbI've confirmed that nodepool_pool_name='main', nodepool_provider_name='inap-mtl01' are set as properties via openstack server show on newly booted instances16:49
*** mlavalle has quit IRC16:50
clarkbmordred: next up on my list is the gerrit /opt/lib replication things. I'm about to be distracted by school for a bit though but wanted to let you know that is still on my list to help sort out16:57
clarkbvos releases continue16:57
mordredclarkb: do we still have an issue with that?17:06
mordredclarkb: I thought we were solid on it now17:07
mordredclarkb: I chowned the three repos - I think they were chown'd weird because they were created when the mounts were inconsistent17:07
clarkbmordred: ah ok I missed that. And yes I see no more retries in gerrit show queue17:07
mordredso I *think* we should be gtg there pending further failures17:07
clarkbmordred: do we need to trigger replication?17:08
clarkbor did that get done too17:08
mordredthat said - I can't build a narrative as to exactly why they got into that state17:08
mordredI did that yesterday too17:08
mordredso I expect them to be ok now17:08
clarkbawesome. I completely missed that. Thank you!17:08
mordredsure!17:08
mnaserso i was confused by this for a long time17:40
mnaserbut in our opendev images, install-from-bindep is more like install-from-bindep-and-python-wheels17:40
clarkbmnaser: I don't think bindep installs anything via pypi/pip17:41
mnaser~maybe~ we need to rename that at some point, i dont think its hugely important, but just wanted to throw some thought at it, cause i was confused for a while about where the wheels were being installed17:41
clarkbmnaser: it should only run bindep, then run your package manager with the list bindep supplies17:41
mnaserclarkb: sorry i meant https://opendev.org/opendev/system-config/src/branch/master/docker/python-builder/scripts/install-from-bindep (from opendev images, not the zuul role)17:41
clarkboh the docker images, ya I think thats less concerned about being "clean"17:42
clarkbbut I agree making that clearer would be a good thing17:42
fungiyes, naming is confusing there, i agree17:42
mordred++17:42
mordredit's bad naming17:42
mordrednever let me name things17:42
mordredthat should be "install the bindep dev depends needed to build wheels, collect the final list of bindep and also build all the wheels and collect them"17:43
clarkbcalling it "install-dependencies" might be more accurate17:43
mordredwell - it's not even quite that17:43
clarkboh right this is the staging area17:43
clarkb"build-python-dependencies" maybe17:43
mordredit should be "stage wheels and bindep lists"17:43
clarkb++17:44
mordredthat it installs things in the builder image is an impl detail17:44
mordredit does that to have the side effect of producing wheels for the deps and their transitive deps17:44
mordredmaybe just "stage-install-artifacts"17:44
mordredand then put in a big comment at the top of it that explains what it's doing and why17:44
mnasermordred: this doesn't just stage it though does it? when running in base image it also installs all the wheels into the image17:49
mnaserunless i'm misunderstanding things :)17:50
mordredmnaser: yes - it installs the wheels in the base image17:50
mordredmnaser: but it installs the wheels that were produced into the wheel cache by installing everything in the builder image17:51
mordredwe do that so that we're sure we're always installing from wheels in the base image and therefore do not need any of the compile bindep depends installed17:51
mordredthe wheel install in the base image should basically be the equiv of just untarring a bunch of pre-built python dirs17:51
clarkbright we use the builder to stage all that so that the result on the actual image is minimal17:52
mordredso the builder image installs all the wheels, which puts them and their depends into the wheel cache - then the wheels in the wheel cache are copied into the base image from the builder image17:52
mordredyup17:52
clarkbotherwise you have to have full compiler toolchains and end up with compile artifacts and all that17:52
mordredexactly17:52
mnasermordred: right, for some reason, it's installing memcached in the build phase17:53
mnaserbut its not installing it in the base image after17:53
clarkbmnaser: is memcached only listed as a compile dep?17:53
mordredyeah - that sounds like a bindep file issue17:54
clarkb(but also you probably don't want memcached running alongside whatever python daemon you've got)17:54
mnaserclarkb: i'm the king of improper info, i meant python-memcached, and its listed in extras -- exactly like https://review.opendev.org/#/c/722023/17:54
mnaseri see it install in the get-extra-packages stage17:54
mordredoh hrm. and it's not winding up in your final image? that's super weird17:55
mnaserso i see: Successfully installed python-memcached-1.5917:55
mnaserin my build image17:55
mnaserthe _only_ thing that might be odd is this is based on the uwsgi-base thing17:56
mnasersee https://review.opendev.org/#/c/721702/5/images/horizon/Dockerfile17:56
*** ralonsoh has quit IRC17:56
clarkbpossible that it isn't building a wheel for python-memcached for some reason?17:57
clarkbif that is the case then we won't install it after staging17:58
clarkbmnaser: ^ you might want to check the logs to see if it indicates how that is pulled in on the build container17:58
mnaserlet me double check, that is likely the case here17:58
clarkbmnaser: because its an extra it isn't in requirements17:59
clarkbso it would only be installed if it was in the wheel cache17:59
mordredwe use the extras trick to install extra things in the nodepool images though - so it _should_ work17:59
clarkbhttps://pypi.org/project/python-memcached/#files publishes a wheel too so that should've been used18:01
clarkband would wind up in the cache as expected18:01
clarkbvos releases are still running fwiw18:02
*** sshnaidm is now known as sshnaidm|afk18:04
clarkbI've signed us up for those three 2 hour blocks of PTG time I mentioned yesterday18:20
clarkbI've also filled out the attendance survey for opendev18:20
clarkb"see" you there :)18:20
clarkbcorvus: I listed Zuul as a project to avoid conflicts with, but not sure if you planned to do zuul time at all18:21
corvusclarkb: i think our most recent plan (from the before times) was to pass this ptg by and try to focus on berlin; i don't think we ever bothered to revisit that after the apocalypse.  i'm fine with that, and no one else has raised that with me, so i think we'll probably keep that as the plan18:23
mnaserok checking this issue now18:57
mnaserclarkb, mordred: ok i can confirm that i dont have an actual python-memcached wheel in the /output/wheels folder..19:04
mnaserhaving said that, maybe we're doing the copy wrong because19:04
mnaserthere is /output/wheels/wheels19:04
mordredmnaser: there shouldn't be an /output/wheels/wheels19:04
mnaser(note this is still in the build image)19:04
mordredso - yeah - that sounds like a copy issue19:05
mordredmnaser: your copy line is fine - you're copying /output19:05
mordredmnaser: that makes me wonder if we're setting the wheel cache dir wrong19:05
mordredmnaser: I have a patch to clean up something - let me pull it up - maybe what we should do is iterate on it and see if we can't figure out a way to verify/test the steps19:06
mnasermordred: to repro.. http://paste.openstack.org/show/792563/19:06
mnaserand then docker run -it --rm on the target image locally and you can inspect and see both python-memcached missing (it isn't even in wheels folder)19:07
mnaseralso the wheels folder seems to be "indexed"19:07
mordredmnaser: https://review.opendev.org/#/c/715717/19:07
mnaser(like the folders are hashed)19:07
mordredyeah - I think that's expected19:08
mordredbut I thought we were supposed to deal with it19:08
mordredlemme read code real quick19:08
mordredmnaser: and I'm going to run your repro19:09
mnaserpip install --force --cache-dir=/output/wheels /output/wheels/*.whl -- only installs the horizon wheel, nothing else19:10
mnaserooh19:11
mnasermordred: i think Open10K8S caught the issue19:11
mordredyeah?19:11
mnasermordred: see this diff https://www.diffchecker.com/07K8988S19:12
mnaserany reason why we're saving the extras inside path/name/requirements.txt instead of path/requirements.txt?19:12
mordredyeah - originally the idea was that you could name the extra after the image name19:13
mnaseroh i see19:13
mordredso if you wanted an extra only for the zuul-executor image19:13
mordredBUT19:13
mordredI don't think that's doing a thing currently19:13
mordredbecause we tend to only run assemble once19:13
mnaserso the two things in play are: requirements.txt is not being modified to append those extra requirements, wheels are installed in /output/wheels/wheels/xx/xx/foo.whl and only horizon is inside /output/wheels/horizon.whl19:14
mnaserthe latter might not be that much of an issue i think19:14
mnaserbut the first one is, because nothing is making sure the memcache wheel is ending up on the final image19:14
mordredyeah. so - lemme look at something19:15
mordredmnaser: yeah - I think the extra dir isn't a ton of an issue - we install horizon and then the requirements.txt file - and in both cases we point at the cache dir - so if pip is writing the cache dir to /output/wheels/wheels - it'll read from there - so as long as it's getting the right list on the input it should do the right thing on the output19:18
mordredso I think Open10K8S is right - we need to update that to append the extras to requirements.txt19:18
mordredbecause I don't think we're using the "only install this extra in this image" idea - that's a leftover from pbrx - we don't know that in assemble19:19
clarkbmordred: wouldnt it get picked up by the wheel installs though19:19
clarkbwe install everything in the wheel cache or attempt to19:19
mnaserclarkb: you only install the thing we just built aka horizon, the wheel cache is a weird layout on disk like /xx/xx/foo.whl19:20
mnaserso we'd have to do something like installing /output/wheels/wheels/*/*/*.whl or something19:20
clarkbmaybe just do a find19:20
mnaserfiles are like this: /output/wheels/wheels/8f/14/b5/d4c1642a6d221ac2fb3e7bbdcad180929d2f2b7c9c92cd3cdc/XStatic_term.js-0.0.7.0-py3-none-any.whl19:21
clarkbbut I think that was the intent19:21
mordredno - let's fix the thing19:21
mordredno - it's not - the intent is that we use the wheel cache as a wheel cache19:21
mordredand just install the thing19:21
clarkbmordred I see19:21
mordredbut installing the thing with the wheel cache pointed to the right thing should cause the right things to get installed19:21
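To illustrate the layout point from the exchange above: a top-level glob such as /output/wheels/*.whl only matches the freshly built wheel, while the pip cache keeps its wheels in hashed subdirectories that only a recursive search reaches. This is a minimal sketch with made-up file names, not the actual assemble script.

```python
import tempfile
from pathlib import Path


def top_level_wheels(wheel_dir):
    """Wheels a shell-style <wheel_dir>/*.whl glob would match."""
    return sorted(p.name for p in Path(wheel_dir).glob("*.whl"))


def all_wheels(wheel_dir):
    """Every wheel under wheel_dir, including hashed cache subdirectories."""
    return sorted(p.name for p in Path(wheel_dir).rglob("*.whl"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical file names standing in for the layout described above.
        (root / "horizon-1.0-py3-none-any.whl").touch()
        nested = root / "wheels" / "8f" / "14" / "b5"
        nested.mkdir(parents=True)
        (nested / "python_memcached-1.59-py2.py3-none-any.whl").touch()
        print(top_level_wheels(root))
        print(all_wheels(root))
```

This is why the install step depends on the requirements list being complete: pip resolves names against the cache, and nothing walks the cache directory itself.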
mnaserOpen10K8S: wanna push up a patch to fix get-extra-packages in the repo with gerrit?19:22
mordredclarkb: we do "/output/wheels/*whl" purely because we don't know what the wheel name is19:22
mordredmnaser, Open10K8S we should make sure we collect the extras and append to requirements.txt BEFORE we make the primary wheel19:23
mordredotherwise the metadata won't pick it up right19:23
mnaserah yes19:23
mordredmnaser: as a followup, we should also potentially allow passing a list of extras you care about19:23
mordredlike - keystone has several and might not want ALL of them - a set of defined extras might be a nice improvement19:24
mnaserya that would be nice, allowing different build profiles19:24
mordredoh - actually ...19:24
mordredcan I suggest a slightly different fix19:24
clarkbfwiw I've always been a fan of just installing all the things19:24
clarkbthe python libs are relatively small so the cost is low19:25
clarkband then users dont get confused about why something doesnt work19:25
mordredclarkb: yah - but I think some pre-existing things might have conflicting package sets19:25
mnaserin openstack context people were complaining about "oh but i dont want vmware libs when i deploy with libvirt"19:25
mordredso might have a setup.cfg that wouldn't work if you tried to install all of them - and they might want a get-out-of-jail19:25
clarkbmnaser: ya but it doesnt actually matter compared to installing python19:25
mordredmnaser: well - I agree with clarkb on that one - they should get over it - it's just libraries :)19:25
clarkbthe cost is basically nil19:25
mnaserplease chime in then https://review.opendev.org/#/c/720107/ :)19:26
mordredI'm more concerned about cases where it's literally conflicting things19:26
mordredmnaser: instead of changing to append ...19:26
mnaserextra-requirements.txt?19:26
mordredlet's change assemble to look for /output/wheels/*/extra-requirements.txt and install those too19:26
mordredmnaser: and then as a followup - let assemble take an option which is the name of an extra to only install in addition to the main thing19:27
mordredmnaser: that way we COULD have dockerfiles that run assemble with a specific extra as an argument to produce a more targeted sub-image19:27
mordreddoes that make sense?19:27
mnaserdoesn't that mean that the wheel itself will not have that extra included as a dependency?19:27
mnaserso really we don't install anything in build-time and only "side-load them" in stage time19:28
mordredyeah - but that's ok - just do "pip install /output/wheels/*whl -r /output/wheels/extra1/extra-requirements.txt -r /output/wheels/extra2/extra-requirements.txt"19:28
mordredmnaser: oh - good point19:28
mordredmnaser: nevermind - let's skip what I said19:28
mordredwe need to run the installs19:29
mordredlet's fix it like Open10K8S found - and we can do the thing I'm thinking as a followup - since I think it'll be more involved to get right19:29
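The "append the extras to requirements.txt" fix being discussed could look roughly like the sketch below. It assumes each extra's requirements were written to <output_dir>/<extra>/requirements.txt, as mnaser describes; the helper name is hypothetical and the real get-extra-packages script may well differ.

```python
from pathlib import Path


def append_extras(output_dir):
    """Merge <output_dir>/<extra>/requirements.txt into <output_dir>/requirements.txt.

    Hypothetical helper for illustration only. Lines already present in
    the main file are not duplicated.
    """
    output_dir = Path(output_dir)
    main = output_dir / "requirements.txt"
    lines = main.read_text().splitlines() if main.exists() else []
    for extra_file in sorted(output_dir.glob("*/requirements.txt")):
        for line in extra_file.read_text().splitlines():
            line = line.strip()
            if line and line not in lines:
                lines.append(line)
    main.write_text("\n".join(lines) + "\n")
    return lines
```

As mordred notes, the merge would have to happen before the primary wheel is built so the extras end up in the list that the final install consumes.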
clarkbvos release is still going19:29
mnasercool, makes me wonder how this even worked ever :)19:29
mnasercause i know zuul relies on extras i think19:30
mordredmnaser: yeah - that's my question ... we do19:30
mnasermordred: https://opendev.org/zuul/zuul/src/branch/master/Dockerfile#L4619:31
mnaseraha.19:31
mnaserso we'd break zuul if we did this19:31
mordredah - that's how19:31
mordredyeah19:31
mordredmnaser: so - maybe just doing that pattern for now? (which is sort of what I was suggesting above)19:32
mnasermordred: yeah i guess that what we'll do19:33
mordredmnaser: and we can improve it by letting assemble take an optional list of extras to install so you don't have to do that19:33
clarkbhow does that install extras?19:33
clarkbthe extras arent in the requirements file right?19:33
mordredclarkb: that requirements file is created by the extras extraction in the builder image19:34
clarkbI see so it isn't part of the main repo, its generated19:34
mordredyeah19:35
mordredmnaser: clearly we need better api and better api docs here don't we?19:35
mnasermordred: i think so, maybe those images could be on their own repo19:36
mnaserclarkb, mordred: can we land https://review.opendev.org/#/c/713953/11 ? we are using it and it works :)19:38
openstackgerritMonty Taylor proposed opendev/system-config master: Allow requesting a list of extras to install  https://review.opendev.org/72212519:39
clarkbopendev doesn't use uwsgi anywhere19:40
clarkbmnaser: I think we are the wrong venue for that19:40
mordredclarkb: I was thinking it was ok because the python-base and python-builder images are generally useful and it would be nice to be able to rebuild all three if we change any of them19:40
mordredalthough I can also see not wanting to19:41
mordredmnaser, clarkb: ^^ that might be some nice syntactic sugar to help with the extras install19:41
mnaserthings like pastebin might need that image fwiw19:41
clarkbmnaser: why does it set PBR_VERSION=0.019:41
mnaserclarkb: /me points at mordred "he did it"19:42
mordredmnaser: yeah - in fact, maybe if we update pastebin to use that it would make more sense in opendev19:42
*** diablo_rojo has joined #opendev19:42
clarkbgrepping for PBR_VERSION in system-configs existing images doesn't show us doing that19:42
mordredclarkb: there was a reason ... trying to think what it was19:42
mordredclarkb: it might just be leftover from previous hacking19:42
mnasermordred: i think you added it when you did uwsgi install via assemble19:43
mnaserbc it tries to do a pip install or something19:43
mordredoh19:43
mordredit's because we're faking a pbr install19:43
mordredso that python builder works19:43
clarkbI think thats a bad thing to do19:43
mordredclarkb: see the setup.cfg and friends in that repo19:43
clarkball your versions will be wrong as a result19:43
clarkbmordred: that repo?19:44
mordredin that change19:44
mordredthe other files19:44
mordredbut I agree - let me relook at a solution for that real quick19:44
mordredI've got it19:45
clarkbout of curiosity why can't the images just use normal python-* and add uwsgi to their deps?19:45
clarkbthis seems unnecessary if you can pip install uwsgi19:45
mordredbecause you need bindep stuff to install uwsgi - and it's not strictly a requirement for the python project itself, just for the images  of that project19:46
mordredbut - I think I have an idea of how to simplify this19:47
clarkbI mean if the project is expected to run with uwsgi it kind of is a dep of the python project19:47
clarkbI -1'd for the PBR_VERSION thing because I think that is an actual bug. That can leak around19:48
clarkb(its only intended to be set for the uwsgi install thing but if its in the env I think other pbr invocations will use it too)19:48
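One way to keep PBR_VERSION from leaking, sketched here under the assumption that the override is only needed for a single command: pass it in that child process's environment rather than exporting it for the whole build. The helper name is made up for illustration and this is not what the uwsgi-base change does.

```python
import os
import subprocess
import sys


def run_with_pbr_version(cmd, version):
    """Run cmd with PBR_VERSION set only in the child's environment."""
    env = dict(os.environ)  # copy, so the parent environment stays untouched
    env["PBR_VERSION"] = version
    return subprocess.run(cmd, env=env, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    child = [sys.executable, "-c", "import os; print(os.environ.get('PBR_VERSION'))"]
    result = run_with_pbr_version(child, "0.0")
    print(result.stdout.strip())          # value seen by the child
    print(os.environ.get("PBR_VERSION"))  # parent environment is unchanged
```

In a Dockerfile the equivalent would be setting the variable inline on the one RUN command instead of through ENV, so later pbr invocations compute their real versions.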
corvuswhat's the next thing i can help with re nodepool ansiblification?19:49
clarkbcorvus: vos releases still running on ubuntu mirror updates. But I think we can move forward on ansible + containering the other x86 builders? (I don't know what that requires though)19:50
corvusi think, if i'm following correctly, mordred's strategy has shifted back 'get builders running in containers' since the /proc problem has been resolved19:51
corvusso i think that means nodepool isn't blocked on focal?19:51
clarkbcorvus: correct19:51
corvusmaybe the next step is the dedicated arm64 image idea19:51
corvusi think i need mordred to indicate whether he's started on patches for that19:52
clarkbthat would enable a rebuild of nb03 which is probably the biggest blocker right now? since nb01/02 can be rebuilt like 0419:52
corvusyeah, that's what i'm thinking19:52
corvusi wonder if nb04 is back in prod19:52
corvusi'll check on that and if not, see what's necessary while i wait for an update from mordred19:52
openstackgerritMonty Taylor proposed opendev/system-config master: Allow passing an arbitrary package list to assemble  https://review.opendev.org/72213319:54
openstackgerritMonty Taylor proposed opendev/system-config master: Add a uwsgi-base container image  https://review.opendev.org/71395319:55
mordredcorvus: I have not - I was caught up in some calls - I'm now done and about to make the dedicated-arm image jobs19:55
openstackgerritJames E. Blair proposed openstack/project-config master: Revert "Move Ubuntu builds away from nb04"  https://review.opendev.org/72213419:56
corvusmordred, clarkb: cool; nb04 is running the latest nodepool image, so maybe we can go ahead and merge ^ now19:56
mordredcorvus: ++19:56
clarkbmordred: corvus can you point me to the place where the new image fixed the debootstrap issue?19:56
corvusclarkb: see commit message19:57
mordredclarkb: https://review.opendev.org/#/c/721394/19:57
clarkbgot it thanks19:57
mordredyeah19:57
mordredI would love it if we could convince debian to land those - if not, I think this is fine for now, but also think we should investigate switching to mmdebstrap19:58
mordredin dib19:58
corvusor using the containerfile element?19:58
mordredcorvus: yeah - like - I think doing this is good enough for a time - but I'd like a long-term solution to be better19:58
corvusi kinda thought we all felt that the containerfile element would be a good long term plan19:58
mordredyeah - although I also think that fixing dib so that you can run any of the elements in a container without breaking the container would be good from a dib pov - we can't exactly prevent nodepool users from using arbitrary elements - unless we want to update nodepool's config to align more explicitly with the containerfile element20:00
mordredwhich is to say - yes - and I think the ppa workaround for now is fine as a workaround as we work on the long term plan20:00
clarkbmordred: was debian not happy with those changes?20:01
mordredclarkb: they aren't landed yet - I think there were review comments that weren't happy about the need to care about being in a container and hinted at wanting things to be different20:02
clarkbI mean unmounting /proc seems like a bad idea in any context :)20:03
corvusmordred: does this error mean anything to you? https://zuul.opendev.org/t/openstack/build/3d3f5320bd9a40efa659e4aa9eebf21b/log/logs/containerfile_bionic-build-succeeds.FAIL.log#146-14720:04
mordredcorvus: no20:05
corvusk, we might need more debug logging or a node hold there20:05
mordredcorvus, mnaser: remote:   https://review.opendev.org/722135 Actually install extras from nodepool_base20:06
mordredis based on the things we learned from mnaser's image issue earlier20:07
mordred(turns out we did not succeed in adding objgraph and yappi to the nodepool image)20:07
mnasermordred: your uwsgi-base attempt had a sad: python: can't open file 'setup.py': [Errno 2] No such file or directory20:19
clarkbubuntu-ports vos release completed20:19
clarkbI'm going to clean up its trusty indexes now and revos release20:20
clarkboh hey maybe they are already gone (perhaps ianw removed them?)20:20
clarkboh right its because we never mirrored trusty ubuntu-ports \o/20:21
mnasermordred: it actually took the path with -z $PACKAGES in the code20:21
clarkbok I'm going to remove my lock for ubuntu-ports and call that one done20:21
openstackgerritMonty Taylor proposed opendev/system-config master: Build arm64 versions of python-base and python-builder  https://review.opendev.org/72214020:23
mordredcorvus: ^^ how does that look20:23
clarkbI'm dialing back the quota for ubuntu-ports now20:24
mordredmnaser: why did it do that???20:25
clarkbwill do that for ubuntu as well (adding that note to the etherpad)20:25
mnasermordred: i have no idea bc i made a simple local assemble and the $* behaviour is right20:25
mnasermaybe because docker?20:25
corvusmordred: comment on something we should have caught earlier20:25
mordredcorvus: oh poo. so we will need a dedicated base job20:28
corvusdo we have reliable nested-virt yet?20:29
corvus(i'm just trying to weigh the reliability of the arm clouds vs nested virt)20:29
clarkbcorvus: we have ~2 clouds running nested virt evaluation flavors20:30
openstackgerritMerged opendev/system-config master: Allow requesting a list of extras to install  https://review.opendev.org/72212520:30
clarkbI don't think its to the point of 100% reliable yet, but those clouds have indicated they are trying to make it work20:30
clarkb(so if it breaks they will actually try to fix it)20:30
corvusand one arm64 cloud20:31
corvusmordred: as a bonus, we need a job like that for building and publishing tags20:34
corvus(i mean docker images corresponding to git tags)20:34
mordredcorvus: hrm. I'm torn - maybe the docker qemu thing is the right choice? we'd need to add a flag to the build and upload jobs to say "use buildx"20:36
corvusmordred: yeah, i'm unsure too.  buildah doesn't (yet) have support for this; but we could probably run buildah inside of qemu ourselves (oy)20:37
corvus(we don't use buildah, yet, just thinking ahead)20:37
corvusmordred: actually, are you sure we need *nested* virt?20:38
corvusmordred: i mean, we're talking cross-arch -- the only thing nested virt would get us is a fast build inside of qemu for the native arch20:38
mordredcorvus: oh - yeah - maybe not20:38
mordredcorvus: that's a very good point20:39
corvusmordred: but if we split it into different jobs, maybe we run the native build normally, and then a buildx qemu emulated build for the others20:39
corvusi wonder how long that takes; maybe i'll try that on my desktop?20:39
mordredcorvus: docker buildx build --platform=linux/amd64,linux/arm64 .20:40
mordredhttps://github.com/docker/buildx/#with-buildx-or-docker-190320:40
fungithis problem goes on the list of things which would be so much easier if gerrit let you review tags20:41
clarkboh ya nested virt may not help much good point20:42
fungi(the building from tags problem i mean)20:42
corvusmordred, fungi: well, re tags... hrm, actually... maybe we can promote them20:43
corvuswe'd still need to map the change to the merge commit20:43
mordredI was thinking we were waiting on zuul v4 to implement support for promoting from the previous ...20:44
corvusbut i think that can be done reliably with git?20:44
fungioh, maybe20:44
mordredyeah - that - I thought last we discussed it the plan was to record something in the zuul db so that we could look up the change that built an image for a given commit so that we could promote the actual artifact20:44
corvusit might be possible.  it's not trivial.  not sure how motivated we are to solve it.20:45
mordredand we were waiting on v4 to implement that20:45
corvusmordred: that's not ringing a bell.20:45
mordredbecause ultimately when we tag a commit if we've otherwise promoted things built in gate we'd conceivably want not to rebuild the image as much as promote the previously gated image20:45
corvusat least, i don't see how anything planned for v4 helps20:45
mordredcorvus: my understanding was that we'd need to record the mapping so we'd know the docker sha for a given change - but it's entirely possible I just hallucinated this20:46
fungialternatively, if we built a new image from the tag, we'd want the tag to not get applied to the canonical copy of the repo unless the image is viable and can pass tests20:46
fungi(which is where the reviewing tags thought came into play)20:46
mordredyeah - that's the problem - it's non-trivial20:47
corvusmordred: oh, maybe the idea was to use direct-push so we know the merge commit20:47
mordredyeah20:47
mordredmaybe that was it20:47
corvusthat is one option20:47
mordredso that we could then reverse map the merge commit to the docker sha20:47
corvusi do think that digging the change out from the merge commit via git is an option20:47
mordredyeah20:48
corvusbut it's decidedly non-trivial20:48
clarkbubuntu mirror has released20:48
mordredcorvus: incidentally - my laptop just built amd64 and arm64 in parallel of python base in short order20:48
clarkbI'm going to reduce the quota on that volume now then remove my locks20:48
mordredcorvus: 84 seconds20:48
corvusmordred: cool, docker is giving me a really weird error20:49
corvusso i have no dockerx20:49
clarkb#status log Removed Trusty from our Ubuntu mirrors and added Focal. Updates have been vos released and should be in production.20:50
openstackstatusclarkb: finished logging20:50
mordredyou have to install it as a plugin - instructions at https://github.com/docker/buildx/#with-buildx-or-docker-190320:50
corvusmordred: no i get that20:50
corvusmordred: i am unable to build the plugin20:50
mordredcorvus: I'm cooking up running instructions20:50
mordredcorvus: oh!20:50
mordredcorvus: maybe you need the instructions for older docker above that section?20:50
corvusmordred: nope20:50
corvusthere is a definite error which i would like to share, but am not yet comfortable sharing20:51
mordredkk20:51
corvusgot it.  it was left over from registry testing20:52
mordredcorvus: so - I can build the image, but I can't look at it because pushing to a registry is the only currently supported export format20:55
mordredI mean - I guess I could push to emonty/python-base for now20:55
mordredcorvus: ok - I have pushed to emonty/python-base and it looks correct20:58
clarkbinfra-root if anyone else is willing to review a gitea upgrade change from mordred, https://review.opendev.org/#/c/720202/1, I'm happy to approve that and watch it (probably tomorrow morning?)20:59
clarkbthat was hiding on my todo list from last week20:59
fungii'll take a look now20:59
fungialso i noticed we're getting some strange cronspam from the gitea servers, for how long i'm not sure20:59
mordredcorvus: http://paste.openstack.org/show/792566/21:00
mordredcorvus: that's the sequence of commands I ran to build and push to a registry21:00
corvusmordred: i'm trying to get a comparison of an arm64 build to amd6421:00
corvusmordred: can you try doing those one at a time?21:00
fungithere's a mysqldump-in-a-container cronjob complaining "No container found for mariadb_1"21:00
mordredcorvus: yeah - you want me to time them?21:00
corvusyep21:00
corvus(i've built an amd64 image, but i apparently need more qemu to build an arm64 image)21:01
mordredcorvus: k. one sec21:01
clarkbfungi: thats our db backups I bet its fallout from the docker-compose upgrade I'll take a look21:01
corvusmordred: i suspect you're running qemu in your docker-provided linux vm21:01
mordredcorvus: I'm going to skip push this time21:02
mordredcorvus: and yes21:02
mordredcorvus: I'm doing python-builder this time because it does more - and also so that I don't have any local cache21:03
mordredcorvus: nicely - the buildkit output gives us times for each step too21:04
mordredI'll paste the whole thing21:04
clarkbfungi: that is really weird21:04
clarkbdocker-compose ps shows nothing21:04
clarkbbut docker ps does21:04
mordredthe qemu arm is definitely slower - but not _stupid_ slower21:04
clarkbfungi: I've got it I think21:05
clarkbgive me a few minutes to confirm21:05
mordredcorvus: http://paste.openstack.org/show/792567/21:07
clarkbfungi: its a path issue. I'll get a change up as soon as I can type it up21:07
mordredcorvus: less than a second for the first two steps, 2 seconds for the third, 4 minutes total for the build - most expensive build step took 36s on x86 and 158s on arm21:08
fungiclarkb: cool, thanks!21:08
corvusmordred: so if we run them on your laptop, we can expect the builds to take ~4x21:09
mordredcorvus: it's doing them in parallel - so we're long-poled on the arm time - x86 finished quickly21:09
mordredyeah21:09
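For reference, the "~4x" estimate follows from the two per-step timings mordred quotes (36s on x86, 158s under emulated arm). A trivial check, with a made-up function name:

```python
def slowdown(emulated_seconds, native_seconds):
    """Return how many times slower the emulated step was than the native one."""
    return emulated_seconds / native_seconds


# Most expensive build step: 158s on emulated arm vs 36s on x86.
ratio = slowdown(158, 36)
print(f"{ratio:.1f}x")
```

This only compares a single step from a parallel run, so it is a rough figure rather than an estimate of whole-job time.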
corvusmordred: oh i thought you were doing it serially21:09
corvusmordred: can you repeat that and do it serially?21:10
mordredI did the steps serially and timed21:10
mordredsure21:10
corvusi'm confused21:10
mordredI am too - I'm not sure what you're wanting to see?21:10
corvusmordred: did the arm and amd images build in parallel or serial?21:10
mordredparallel21:10
mordredI misunderstood what things you wanted individual timing of21:10
corvusmordred: okay, that's going to give us bad numbers for estimating what running multiple jobs in parallel would do21:10
openstackgerritClark Boylan proposed opendev/system-config master: Fix rooted path to docker-compose  https://review.opendev.org/72214521:11
clarkbfungi: infra-root ^ fyi21:11
clarkbnote docker-compose is still in /usr/bin/ on gitea0X which is why the error is odd21:11
mordredcorvus: nod. out of curiosity - why do we think that might be a better choice? (doing the serial steps now)21:11
corvusmordred: so i think if you can do an amd build, then an arm build, that'll do it21:11
clarkbon other hosts it seems to have been removed properly21:11
corvusmordred: i'm not sure it will; it depends on how the build time compares to the setup time21:12
mordredcorvus: doing that now21:12
mordredah - well - the setup time was 2 seconds21:12
*** dpanech has joined #opendev21:12
corvusmordred: i mean job setup21:12
mordrednod21:12
mordredcorvus: x86 took 48s21:14
corvusmordred: a zuul image build job takes 11 minutes to build the docker image and 42 minutes to run. to be honest, i don't know what's happening in the other 32 minutes.21:16
fungiclarkb: in a similar vein, the mysqldump cronjob on the etherpad server is complaining "/bin/sh: 1: /usr/bin/docker-compose: not found"21:17
clarkbfungi: yup its fixed in the same change21:17
corvuswe have those nice ansible task times now, and nothing other than the docker image build steps is > 1 minute21:17
clarkbfungi: it was etherpad that made me realize what was going on21:17
mordredcorvus: weird21:17
fungiclarkb: aha, got it21:17
corvusmordred: i'm looking at https://zuul.opendev.org/t/zuul/build/a8798b497e2c44baa9b514e06786e37c21:17
mordredcorvus: fwiw: https://www.docker.com/blog/getting-started-with-docker-for-arm-on-linux/ has a couple of steps in it that I don't have to do on my laptop - notably docker run --rm --privileged docker/binfmt:820fdd95a9972a5308930a2bdfb8573dd4447ad321:17
clarkbcorvus: https://zuul.opendev.org/t/zuul/build/a8798b497e2c44baa9b514e06786e37c/log/job-output.txt#1185821:18
clarkbthe job is pausing21:18
corvusmordred: yeah the binfmt situation on bionic is bad; we may want to do this on focal21:18
corvusclarkb: ah thanks :)21:18
corvuswe should see if we can get that in the console log21:19
corvusokay, so the docker build is about 11 minutes out of 2121:19
corvusor, really, we don't care about the time after the pause21:19
clarkbright thats quickstart or whatever time and isn't directly worried about in the build context aiui21:20
mordredcorvus: yeah21:20
mordredcorvus: 3:31 to run the arm - full log http://paste.openstack.org/show/792568/21:20
dpanechHi there, I need to push to f/centos8 branch in starlingx/kernel.git, can someone help me? I don't have the permissions. I'm splitting out the kernel from integ.git.21:21
dpanechsgw: please confirm it's ok21:21
corvusso let's say 5 minutes setup time for the job, and 11 minutes docker build time21:21
*** DSpider has quit IRC21:22
sgwHi gang, yes dpanech needs the branch to complete the kernel split work in starlingx.  I might have complicated things by creating the f/centos8 branch, so it might need to be deleted first21:22
mordredcorvus: yeah. so we could imagine the qemu arm build would make that take 40-45 minutes - although we still might want to test it - because a ton of what's going on in that docker build is zuul-manage-ansible installing ansible a bunch - so cpu vs. disk performance is going to come in to play21:22
mordredor might come in to play21:23
clarkbsgw: dpanech give me a couple minutes21:23
corvusmordred: do you have an amd build yet?21:23
mordredcorvus: yeah - it was 48s21:23
corvusoh sorry i missed that21:23
sgwclarkb: thanks, ping when you're ready for us.21:23
mordredcorvus: so still roughly 4x longer - on my laptop21:24
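The "roughly 4x" figure and the 40-45 minute estimate above can be reproduced from the timings quoted in the conversation. This is only a back-of-the-envelope sketch: the function names are hypothetical, and it assumes the laptop slowdown carries over unchanged to the build nodes, which the discussion itself flags as uncertain.

```python
# Timings quoted in the conversation, in seconds.
X86_STEP_S = 36             # most expensive build step on x86
ARM_STEP_S = 158            # the same step under qemu arm
X86_TOTAL_S = 48            # full x86 build
ARM_TOTAL_S = 3 * 60 + 31   # full arm build (3:31)


def slowdown(arm_seconds, x86_seconds):
    """Ratio of emulated arm time to native x86 time."""
    return arm_seconds / x86_seconds


def estimate_arm_minutes(amd_build_minutes, factor):
    """Scale a known amd64 build time by an assumed slowdown factor."""
    return amd_build_minutes * factor


step_factor = slowdown(ARM_STEP_S, X86_STEP_S)
total_factor = slowdown(ARM_TOTAL_S, X86_TOTAL_S)

# Using the rounded 4x factor on the 11 minute zuul image build.
arm_estimate = estimate_arm_minutes(11, 4)

print(f"per-step slowdown: {step_factor:.2f}x")
print(f"whole-build slowdown: {total_factor:.2f}x")
print(f"estimated arm build: {arm_estimate} minutes")
```

Both ratios land a little above 4, which is consistent with the rounded figure used in the estimate.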
fungisgw: dpanech: clarkb: the acl for that repo says the members of this group can create new branches in it already: https://review.opendev.org/#/admin/groups/starlingx-release21:24
fungisgw: you're one of them21:24
corvusmordred: https://zuul.opendev.org/t/openstack/build/3199c3bc10564997a63c589b65286b68/console  took 51 seconds, so i think your laptop == build nodes for this comparison21:24
mordredcorvus: good to know :)21:24
fungisgw: go to https://review.opendev.org/#/admin/projects/starlingx/kernel,branches and you should see a "create branch" button21:24
fungiif not, we can troubleshoot it21:25
mordredcorvus: oh - although - it's worth noting we don't need to do this for all of our images21:25
sgwfungi: yea, I was able to create the branch, but what we really want is for dpanech to be able to push the new branch. so I really need to delete the one I just created and allow dpanech to push the branch directly21:25
clarkbsgw: you mean bypassing review?21:25
fungisgw: oh... you have content you want to bypass code review and push directly?21:25
mordredcorvus: really only python-base, python-builder and nodepool-builder - although it would likely be awkward to just do builder so might as well do nodepool - we might want to look at nodepool build logs instead of zuul ones21:25
sgwyes, in this case it's to preserve the history21:25
mordredsince zuul's is extra long due to the manage ansible stuff21:25
corvusmordred: splitting into separate jobs would allow us to build an untested arm64 image while we continue jobs testing the built amd64 images21:26
clarkbsgw: dpanech typically that requires a gerrit admin.21:26
fungisgw: it's probably safest to have one of us do it in that case, otherwise you'll need a group added to the acl which has the ability to push commits and merge requests on behalf of other users21:26
corvusmordred: yeah, looks like a nodepool build is going to be about 5m21:27
corvusmordred: (or 20m for arm64)21:27
fungisgw: dpanech: is there a public copy of that branch somewhere one of us can pull from?21:27
sgwfungi: was just about to ask what you wanted!21:27
mordredso - then the question is - do we want to always do separate and always have a job to combine a manifest - or do we want both versions - do a combined build for python-base/python-builder and then do a separate build for nodepool?21:27
sgwdpanech: do you have a public github that you can push your work to?21:28
dpanechsgw: fungi: give a moment21:28
corvusmordred: this is one option: http://paste.openstack.org/show/792569/21:33
corvusmordred: with timing http://paste.openstack.org/show/792570/21:36
dpanechsgw: fungi: here you go: https://github.com/dpanech/starlingx-kernel/tree/f/centos821:36
clarkbdpanech: sgw worth noting for the future that the repo creation step will import existing branches so you can prep the import step to avoid the work at this stage21:37
sgwclarkb: thanks, since we have lost Dean, we lost some of that knowledge, hopefully we don't do too much splitting like this.21:38
openstackgerritMohammed Naser proposed opendev/system-config master: Allow passing an arbitrary package list to assemble  https://review.opendev.org/72213321:38
openstackgerritMohammed Naser proposed opendev/system-config master: Add a uwsgi-base container image  https://review.opendev.org/71395321:38
mordredcorvus: yeah. and with quick-start - the total job time is likely not extended21:38
corvusmordred: and the other option: http://paste.openstack.org/show/792571/21:39
fungisgw: dpanech: cool, i'll pull from https://github.com/dpanech/starlingx-kernel/tree/f/centos8 and push --force to https://opendev.org/starlingx/kernel/src/branch/f/centos8 through review.opendev.org momentarily21:39
corvusmordred: though i may be guessing a bit on the setup times....21:39
clarkbfungi: sgw dpanech fwiw it does look like its just a merge commit and the .gitreview commits?21:39
clarkbfungi: possible that dpanech could push that merge commit afterall?21:39
dpanechfungi: thanks21:39
clarkb(also appears to be a fastforward from the current branch head)21:39
clarkbbut fungi pushing it should also work fine21:40
corvusmordred: maybe for this particular case of nodepool-builder it doesn't matter that much and we should just do it in one job?21:40
dpanechwait21:40
dpanechit's not going into master21:40
mordredcorvus: so - the second option is quite a bit easier to implement (no need for merge images - just need to add an option to the build jobs)21:40
dpanechits going into f/centos821:40
clarkbdpanech: I know21:40
dpanechI tried pushing before and it didn't allow me21:40
clarkbdpanech: you have two merge commits "merge branch 'master'"21:40
mordredyeah - but - I think the first option has some potentially nice benefits and is also still slightly faster21:40
fungidpanech: yep, sgw created the f/centos8 from master head21:41
clarkbdpanech: yes you need special permissions to push merge commits21:41
sgwdpanech: the branch needs an updated .gitreview also21:41
clarkbsgw: that is there too21:41
corvusmordred: i think i'd be in favor of the second option for now and see how it goes21:41
sgwclarkb: ah good, something I had forgotten once!21:41
clarkbI was just pointing out that if you were in the correct group I think you could push those two merge commits for review21:41
mordredcorvus: so we could follow up with option #1 as a later improvement - or if we wanted to start making arm images of zuul21:41
clarkbwhere that could fail is if the commits in the merge weren't already known to gerrit21:41
clarkb(and that is where an admin pushing would be required)21:42
mnasermordred: i know what happened.  we don't have a requires on the zuul job for uwsgi-base, so it was pulling master image21:42
mnasers/master/latest/21:42
mordredmnaser: hah21:42
mordredmnaser: that makes total sense21:42
mordredmnaser: and thank god :)21:42
mnaseryeah i was starting to lose it21:42
mnaseri found a little bug, we needed to quote the "$PACKAGES" in the if to avoid issues with bash complaining21:43
mnaserbut that was it21:43
dpanechi'm confused, did I screw something up?21:43
clarkbdpanech: I don't think so21:43
clarkbdpanech: https://review.opendev.org/#/admin/groups/1921,members that group is allowed to push merge commits like the ones you've made into gerrit for review21:44
sgwI think what is going on is that dpanech could have shared this work with me and I could have pushed the merge commit directly21:44
clarkbdpanech: if you were a member of that group I don't think you need fungi to push for you21:44
clarkbsgw: yes basically21:44
openstackgerritMonty Taylor proposed opendev/system-config master: Allow passing an arbitrary package list to assemble  https://review.opendev.org/72213321:44
openstackgerritMonty Taylor proposed opendev/system-config master: Add a uwsgi-base container image  https://review.opendev.org/71395321:44
clarkbI'm not 100% certain of that yet because gerrit does want to have previously seen all commits pushed into it via a merge commit push21:45
mordredmnaser: like that ^^ ?21:45
clarkbso depending on the contents of those merge commits this may not be the case21:45
sgwAnd looking at the group, I could have added dpanech temporarily21:45
clarkband either way its fine for fungi to push them21:45
*** tobiash has quit IRC21:45
mnasermordred: left a note :)21:45
mordredmnaser: so I don't need the second one you said is fine?21:46
dpanechalso: it's not just the review commits, there's a bunch of earlier commits from integ.git's centos8 in that branch, extracted with paths re-written21:46
mnasermordred: in my local testing yes only the one at the if was needed21:46
dpanechstill confused what you guys are talking about21:46
mnasermordred: or i would see things like this -- ./assemble: line 6: [: foo: binary operator expected21:46
openstackgerritMonty Taylor proposed opendev/system-config master: Allow passing an arbitrary package list to assemble  https://review.opendev.org/72213321:47
openstackgerritMonty Taylor proposed opendev/system-config master: Add a uwsgi-base container image  https://review.opendev.org/71395321:47
clarkbdpanech: https://github.com/dpanech/starlingx-kernel/commits/f/centos8 I'm looking at that. 4th commit down is current master's HEAD21:47
*** tobiash has joined #opendev21:47
clarkbdpanech: that means you've got a fast forward of 3 commits from master|f/centos8 to the state you want21:47
clarkbdpanech: two of those commits are merge commits21:47
clarkbyou are allowed to push merge commits to gerrit of commits that are already in gerrit if you have the push merge commit permissions which "starlingx-release" does have21:48
mnasermordred: yay.  ok, cool.  i don't think we can take advantage of those because requires/provides needs things to be in the same tenant, correct?21:48
fungiclarkb: looks like it's rather a lot of commits21:48
fungi"Your branch is ahead of 'origin/f/centos8' by 31 commits."21:48
mnaseraka vexxhost tenant cant just do requires21:48
fungi(that's after fetching https://github.com/dpanech/starlingx-kernel f/centos8 and resetting --hard to it)21:48
clarkbfungi: ya its probably calculating the deltas pulled in by the merge21:48
corvusmnaser: correct21:48
clarkbfungi: but I think only the merge commits matter if the other commits were already accepted by gerrit21:49
clarkbfungi: because those other commits are accepted and reviewed and merged. What you'd be reviewing here is the combo of the two branches together via a merge commit21:49
clarkb(I don't actually know if the statement above is true in this particular case)21:49
mnasercorvus: bummer -- ok, so we have no means of testing that till we merge uwsgi-base (or if we decide to not have that image then we'll just move it in) -- the changes mordred introduced makes it more trivial to install it in build time anyways..21:49
openstackgerritJames E. Blair proposed openstack/diskimage-builder master: WIP: boot test of containerfile image  https://review.opendev.org/72214821:50
clarkbbut generally in say the feature branch case that we support that is how it works21:50
mordredmnaser: how about update the lodgeit image to use it?21:50
clarkbyou don't need an admin to support feature branches, you just need to push commits that gerrit has already accepted21:50
mnasermordred: oh i could do that21:50
clarkb*push merge commits for commits that gerrit has already accepted21:50
mnasersounds like work but if that means we can land uwsgi-base i'll do it :p21:50
fungiclarkb: also some of the commits like https://github.com/dpanech/starlingx-kernel/commit/f683f85 aren't in gerrit yet21:51
dpanechto be clear: there are earlier commits in that branch that are not in master, not just the top few21:51
clarkbfungi: gotcha so it wouldn't just work in this case because it violates the commit previously existing in gerrit rule21:51
fungiyup21:51
fungii just wanted to double-check21:52
mordredmnaser: I got it21:52
mordredmnaser: one sec21:52
clarkbdpanech: yup the merge commits reconcile that21:52
mnasermordred: im almost done ;)21:52
mordredmnaser: kk21:52
clarkbdpanech: if we ignore your specific situation for a minute the general rule is you can push merge commits to gerrit just fine. It does require an extra bit of permissions (starlingx-release has that) and gerrit must already know about the commits (basically this means they went through review on some branch)21:55
clarkbdpanech: going forward what this means is you don't need a gerrit admin to push for you as long as you keep the work in gerrit21:56
openstackgerritMohammed Naser proposed opendev/lodgeit master: docker: switch to using uwsgi-base  https://review.opendev.org/72214921:56
mnasermordred: ^ that should work21:56
mnaserexcept we don't really have any tests for the image right now, boo :X21:56
mnaser... i have a helm chart that deploys from the image .. but thats also in another tenant, lol21:57
fungi#status log replaced content of starlingx/kernel f/centos8 branch with f/centos8 branch from github.com/dpanech/starlingx-kernel per sgw21:57
openstackstatusfungi: finished logging21:57
fungidpanech: sgw: http://paste.openstack.org/show/792572/21:57
fungithat's the output of the push command, for the record21:57
fungiyou should be all set21:57
sgwthanks guys21:57
fungiyou can ignore the warning lines, gerrit's just nagging about well-formatted commit messages21:59
dpanechclarkb: sorry I have little experience with gerrit. I thought every review = one patch/commit. Not sure how I would go about importing a pile of commits from another repo via gerrit. I guess I'll read up on it later or ask someone to show me...21:59
mnasermordred: oh joy of joys, opendev/lodgeit is in the opendev tenant, lol21:59
clarkbdpanech: yes every review is one patch/commit. Git merge commits are sort of a workaround to that where you create a single commit that has multiple parents effectively combining streams22:00
mnaseroh but wait, opendev/system-config is in the opendev tenant, that should be ok22:00
clarkbdpanech: so you can still review a merge commit in gerrit as usual. The caveat is that each of the commits involved in those merge streams must also have been gerrit changes at one point22:00
dpanechclarkb: you mean it can track them by change ids across repos or something?22:02
clarkbdpanech: it does it by commit sha so as long as those haven't changed I think its fine with it22:02
fungithe idea is that you propose changes as you go along rather than winding up with a stack outside gerrit which can't be code-reviewed22:03
clarkbin this case I think there was some underlying repo rearrangement and that made it an exception to the rule22:04
clarkbI was just trying to make sure we don't think all merge commits need to go through a gerritadmin22:04
mnaserim seeing a lot of retries in zuul status22:04
dpanechclarkb: fungi: clear as mud, but ok. Can I ask you to hold my hand next time I need this ? :)22:05
clarkbdpanech: sure22:05
mnaserim seeing a bunch of retries in zuul status, expected due to any issues with a provider?22:06
clarkbmnaser: no22:06
clarkb(the known provider issues shouldn't cause retries)22:06
mnaserchange 713953,15 had 4 jobs that retried22:06
corvusmirrors or images are more likely22:06
mnaser2x had 2 attempts and 2x had 3 attempts22:06
clarkbya its the indexes being unexpected sizes22:07
clarkbhttps://54b78720d7619cd1224f-698cf8cdc9dda09c975180d31d914d21.ssl.cf2.rackcdn.com/717754/2/gate/neutron-grenade-multinode/8ab891d/job-output.txt22:07
clarkbso we updated the mirror with new indexes and now mirrors are serving stale versions?22:07
clarkbcorvus: you ran checkvol ? check something when static was sad maybe that will fix this?22:07
fungidpanech: basically it's hard to say what you could have done in this case to avoid needing a push --force to incorporate commits which were not in gerrit without knowing what the development process was which led to the creation of those commits, but the general idea is to make sure commits are pushed into gerrit as reviews normally when possible22:08
corvusclarkb: i'll look up our convo from last time while you log into an afs head and see if you can find what's out of date?22:08
corvusclarkb: (so we can confirm the problem/fix quickly)22:08
clarkbcorvus: k22:08
clarkb/afs/openstack.org/mirror/ubuntu/dists/bionic-updates/main/binary-amd64/Packages.gz is 1266584 bytes according to afs22:10
corvusclarkb: what does mirror.iad.rax.openstack.org think it is?22:10
clarkb/afs/openstack.org/mirror/ubuntu/dists/bionic-updates/Release: 3ed520b3f935f958321db0b24f340d8c 1266584 main/binary-amd64/Packages.gz22:11
openstackgerritIan Wienand proposed openstack/diskimage-builder master: pip-and-virtualenv: drop f31 & tumbleweed, rework suse 15 install  https://review.opendev.org/72176322:11
corvusclarkb: apt expected it to be 920439 ?22:12
clarkbcorvus: it seems to agree22:12
clarkbFile has unexpected size (1266584 != 1266587). Mirror sync in progress?22:12
clarkbcorvus: it expected 126658722:12
clarkbbut the Release file says 126658422:12
corvusoh sorry was looking at the wrong line22:12
ianwclarkb: i'm guessing the release didn't go as well as hoped ... lmn if i can help, i'm around22:12
clarkbianw: well I think the reprepro side went ok. This looks like fallout due to afs caches?22:13
clarkbcorvus: so I think that particular one may be fine now22:13
clarkb(afs eventually started serving synced up content)22:13
mnasermordred: left 2 reviews on those 2 patches which i think are the last things needed to make it work :)22:13
corvusclarkb: huh i don't understand is 1266587 not the correct value?22:13
clarkbcorvus: aiui the Release file eg http://mirror.iad.rax.openstack.org/ubuntu/dists/bionic-updates/Release gives you a hash and filesize for the other files22:14
clarkbcorvus: in this case the file above says the Packages.gz file is the correct size22:14
clarkbcorvus: the implication is that when this failed either the Release file or the Packages.gz file were out of date22:14
clarkband that caused apt to be mad22:14
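The consistency check being described (a Release file listing a hash and size for each index, and apt rejecting an index whose size differs) can be sketched roughly as below. This is a simplified illustration, not apt's actual implementation: the helper names are hypothetical, and the parser assumes only that checksum entries are three whitespace-separated fields of hash, size and path, as in the line quoted earlier.

```python
def parse_release_sizes(release_text):
    """Map each listed path to its expected size in bytes.

    Only lines of the form '<hash> <size> <path>' are considered; header
    fields and anything else are skipped. A path listed under several
    checksum sections simply keeps the last size seen.
    """
    sizes = {}
    for line in release_text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[1].isdigit():
            sizes[parts[2]] = int(parts[1])
    return sizes


def size_mismatch(release_text, path, actual_size):
    """Return an apt-style message if actual_size disagrees, else None."""
    expected = parse_release_sizes(release_text).get(path)
    if expected is None or expected == actual_size:
        return None
    return f"File has unexpected size ({actual_size} != {expected})"


# The entry quoted from the bionic-updates Release file above.
SAMPLE_RELEASE = (
    " 3ed520b3f935f958321db0b24f340d8c 1266584 main/binary-amd64/Packages.gz\n"
)
PACKAGES = "main/binary-amd64/Packages.gz"

print(size_mismatch(SAMPLE_RELEASE, PACKAGES, 1266584))
print(size_mismatch(SAMPLE_RELEASE, PACKAGES, 1266587))
```

A mismatch only shows that the Release file and the index came from different points in time; it does not say which of the two was stale.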
corvushrm22:15
corvusi don't understand how this happens;22:15
corvusit's kind of the entire value proposition of using afs for this22:15
fungiunless there was an interim vos release after a failed reprepro execution22:15
corvusi think clarkb held the lock the whole time?22:16
clarkbcorvus: I did22:16
clarkbI did release it when I was done though22:16
fungiright, and you only called vos release at the end22:16
clarkbbut that was after the vos release had completed22:16
clarkbfungi: correct22:16
fungionce things should have been in sync22:16
ianwfwiw if it was done in the background automatically it would show up in http://grafana.openstack.org/d/ACtl1JSmz/afs?orgId=122:16
fungiyeah, so nothing released that volume without checking the flock22:17
clarkbis it possible apache is caching things for us22:17
clarkbwhat I notice in my example log is that it is a multinode job22:17
clarkband each node complains about different indexes22:17
clarkbthat implies to me that apache is serving correct content to some but not others?22:18
clarkbcorvus: ^ re afs value perhaps this is part of the problem22:18
corvusi didn't think we had apache caching these?22:18
clarkbcorvus: I didn't either22:18
clarkbit should all be done by suffix and on that vhost should only be pypi but I'm looking now to double check22:19
clarkbbut notice on that example job on one host it is bionic-updates universe that is mad and main is fine then on the other host its bionic-updates main that is mad22:19
corvus"Wed Apr 22 15:44:56 2020" is the last release, so those errors didn't happen to have two requests span the release22:20
openstackgerritMohammed Naser proposed opendev/system-config master: Allow passing an arbitrary package list to assemble  https://review.opendev.org/72213322:20
clarkbhttps://opendev.org/opendev/system-config/src/branch/master/modules/openstack_project/templates/mirror.vhost.erb#L82 is how we configure caching for specific suffixes22:21
clarkband ya only /pypi and /pypifiles are set on the main vhost22:21
clarkbso /ubuntu shouldn't be cached22:21
clarkb(not by apache anyway)22:21
openstackgerritMohammed Naser proposed opendev/system-config master: Add a uwsgi-base container image  https://review.opendev.org/71395322:21
openstackgerritMohammed Naser proposed opendev/system-config master: Add a uwsgi-base container image  https://review.opendev.org/71395322:21
clarkbfwiw it seems like jobs are settling down? eg not many have failed due to retries22:22
clarkbso whatever the out of sync problem was may have corrected itself (which doesn't make it easier to debug)22:22
clarkbit does definitely seem like we are serving different content to different hosts at roughly the same time22:23
clarkbthat implies to me that afs is probably not the problem because those fs reads should all be consistent across that time period?22:23
corvusclarkb: do you know how many variations we have?22:25
mnaseralso maybe it was problematic in a specific region?22:25
corvus(if there are only two, then it makes some kind of old/new mixup more likely; i don't know what >3 would mean)22:26
clarkbcorvus: so far its just those two but I'm mostly only looking at the one job so far22:26
clarkblet me see if logstash has more22:26
mnaser(my comment about specific region is because they eventually would pass with enough retries)22:26
clarkbmnaser: the region with known failures reports good data to my browser22:27
clarkbmnaser: which is another reason I think it corrected itself22:27
clarkbthe most recent errors of this type in logstash are from 20:36UTC ish22:30
clarkb(and they go back well before that)22:30
clarkbthe failure logstash is reporting seem to largely be from rax.ord22:31
clarkbimplying that it could be a mirror state thing of some sort22:32
clarkb(we don't run the bulk of our jobs in rax-ord last I checked so it shouldn't have the bulk of these failures)22:32
clarkblogstash shows that it is bionic-updates main and universe and bionic-security main though pretty consistently22:33
clarkbwhich is the same set we see on this job in iad22:33
clarkbcorvus: so maybe we try to flush those 3 then check logstash in a few hours (assuming that zuul isn't complaining too much)?22:34
* clarkb makes a list of file paths22:34
corvusclarkb: ok.  i'm winding down for the day, so maybe we can check in tomorrow22:35
clarkbcorvus: if I do a flush volume is that applied globally?22:35
clarkbor do I have to do it on each client?22:35
clarkbhttps://etherpad.opendev.org/p/tMfB9VRcqe7NhS9a4-ZX is updated with a list of paths to flush22:37
openstackgerritMerged opendev/system-config master: Fix rooted path to docker-compose  https://review.opendev.org/72214522:40
corvusclarkb: that's a client thing22:41
openstackgerritpanticz.de proposed openstack/diskimage-builder master: Install required netbase package for ubuntu-minimal element. Based on similar commit for debian-minimal Icc81635870961943707cf6b3f61a9ddbd51cb8fd  https://review.opendev.org/72216422:42
clarkbcorvus: k I'll run checkvolume and those flushes across our mirrors22:43
clarkbord and iad are done22:43
clarkbdoes anyone know where we write out the zuul deployment vars?22:48
clarkbI want to make sure I get the correct list of mirrors22:49
corvusclarkb: you mean /etc/zuul/site-variables.yaml ?22:52
openstackgerritIan Wienand proposed openstack/diskimage-builder master: pip-and-virtualenv : fix fedora 30 install  https://review.opendev.org/71679522:54
clarkbcorvus: ya I think so22:55
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Bionic functional tests should be voting  https://review.opendev.org/71683922:56
ianwinfra-root: there is a stack out @ https://review.opendev.org/#/q/topic:builder-container which is all focused on getting all our image types fully boot tested under the container builder, review appreciated23:04
openstackgerritIan Wienand proposed openstack/diskimage-builder master: yum-minimal: strip env vars in chroot calls  https://review.opendev.org/72172623:18
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] switch func tests to containers  https://review.opendev.org/72151123:18
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Restore SUSE tests to gate  https://review.opendev.org/72177923:18
clarkbok I think I got all the mirrors checkvolume'd and flushvolumed23:23
clarkbdid it wrong the first pass, needed to use -path with flushvolume and it didn't error when I didn't23:23
clarkbso roughly after 23:23UTC April 22 we should see things happier in logstash?23:23
*** Dmitrii-Sh has quit IRC23:24
*** Dmitrii-Sh has joined #opendev23:32
openstackgerritMonty Taylor proposed opendev/system-config master: Allow passing an arbitrary package list to assemble  https://review.opendev.org/72213323:35
openstackgerritMonty Taylor proposed opendev/system-config master: Add a uwsgi-base container image  https://review.opendev.org/71395323:35
toskyuhm, if I can't clone/fetch from https://opendev.org, is it a problem of my connection? I can fetch from the gerrit remote23:37
toskyoh, forget that, it worked23:37
*** _mlavalle_1 has quit IRC23:45
*** Dmitrii-Sh has quit IRC23:46
*** Dmitrii-Sh has joined #opendev23:47
*** tosky has quit IRC23:51
ianwoh we still have trusty tests in dib ... i guess they're time has come23:51
ianwtheir time even23:51
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Remove Trusty testing  https://review.opendev.org/72216823:56