Saturday, 2021-01-16

*** akrpan-pure has quit IRC [00:10]
*** yoctozepto1 has joined #opendev [00:21]
*** yoctozepto has quit IRC [00:22]
*** yoctozepto1 is now known as yoctozepto [00:22]
*** tosky has quit IRC [00:29]
*** artom has quit IRC [00:31]
*** DSpider has quit IRC [00:33]
*** yoctozepto4 has joined #opendev [01:05]
*** yoctozepto has quit IRC [01:06]
*** yoctozepto4 is now known as yoctozepto [01:06]
*** yoctozepto5 has joined #opendev [01:20]
*** yoctozepto has quit IRC [01:21]
*** yoctozepto5 is now known as yoctozepto [01:21]
*** yoctozepto5 has joined #opendev [01:41]
*** yoctozepto has quit IRC [01:42]
*** yoctozepto5 is now known as yoctozepto [01:42]
*** d34dh0r53 has quit IRC [02:59]
*** d34dh0r53 has joined #opendev [03:01]
*** iurygregory has quit IRC [03:10]
*** cloudnull has quit IRC [03:44]
*** cloudnull has joined #opendev [04:10]
*** sboyron has joined #opendev [07:11]
*** DSpider has joined #opendev [09:27]
*** sboyron has quit IRC [10:36]
*** JayF has quit IRC [10:58]
*** jhesketh has quit IRC [11:02]
*** jhesketh has joined #opendev [11:02]
*** JayF has joined #opendev [11:02]
*** tosky has joined #opendev [11:16]
*** danpawlik has quit IRC [11:33]
*** danpawlik5 has joined #opendev [11:33]
*** fbo has quit IRC [12:17]
*** fbo has joined #opendev [12:19]
<openstackgerrit> Jeremy Stanley proposed openstack/project-config master: Revert "Un-pause Gentoo image builds"  https://review.opendev.org/c/openstack/project-config/+/771104 [15:03]
<fungi> prometheanfire: ^ i've tried to include as much diagnostic info as i can in that commit message [15:04]
<openstackgerrit> Jeremy Stanley proposed zuul/zuul-jobs master: Temporarily stop running Gentoo base role tests  https://review.opendev.org/c/zuul/zuul-jobs/+/771105 [15:19]
<openstackgerrit> Jeremy Stanley proposed zuul/zuul-jobs master: Revert "Temporarily stop running Gentoo base role tests"  https://review.opendev.org/c/zuul/zuul-jobs/+/771106 [15:19]
<openstackgerrit> Jeremy Stanley proposed opendev/system-config master: Correct path in mk-archives-index cronjob on lists  https://review.opendev.org/c/opendev/system-config/+/771107 [15:29]
*** mlavalle has quit IRC [15:50]
*** _mlavalle_1 has joined #opendev [15:50]
*** tosky has quit IRC [16:16]
<openstackgerrit> Jeremy Stanley proposed opendev/bindep master: ArchLinux: ignore unrelated warnings from pacman  https://review.opendev.org/c/opendev/bindep/+/771108 [16:24]
*** cgoncalves has quit IRC [16:58]
*** cgoncalves has joined #opendev [17:03]
*** cgoncalves has quit IRC [17:19]
*** cgoncalves has joined #opendev [17:27]
<prometheanfire> fungi: kk [17:37]
<prometheanfire> fungi: it seemed like it was still unpacking if there was a lock on distfiles [17:41]
*** brinzhang has joined #opendev [18:02]
*** brinzhang_ has quit IRC [18:04]
<fungi> prometheanfire: yeah, i wonder if whatever unpacking was going on died and the parent process didn't notice or something... but dmesg didn't indicate an oom or anything of the sort [18:33]
<prometheanfire> fungi: is this a single instance of an issue or a repeated issue? [18:42]
<fungi> prometheanfire: it happens consistently. image build starts, it gets to installing six, then sticks like that for 4+ hours and finally nodepool gives up [18:52]
<fungi> image builds aren't completing [18:52]
<prometheanfire> k [18:54]
*** sgw has quit IRC [18:55]
*** sgw has joined #opendev [18:56]
*** brinzhang_ has joined #opendev [19:10]
*** bodgix_ has joined #opendev [19:12]
*** bodgix has quit IRC [19:12]
*** fbo has quit IRC [19:12]
*** dmellado has quit IRC [19:12]
*** brinzhang has quit IRC [19:13]
*** fbo has joined #opendev [19:13]
*** dmellado has joined #opendev [19:13]
*** slittle1 has joined #opendev [19:14]
*** tosky has joined #opendev [19:24]
<fungi> prometheanfire: could we maybe add some additional debugging output to the emerge? [19:49]
<prometheanfire> heh, there is a --debug option [20:01]
<prometheanfire> it's kinda verbose :P [20:01]
<prometheanfire> fungi: would this be a good test?  https://dpaste.com/5S8JTJDUY [20:05]
<prometheanfire> iirc that's what I was running before when I initially developed the stuff [20:06]
<prometheanfire> then I can test locally [20:06]
<fungi> prometheanfire: probably? i honestly haven't tried reproducing an image build locally for a while [20:09]
<prometheanfire> ok, I'll assume it's right [20:09]
<fungi> prometheanfire: i'm also wondering if it could be related to the kernel version on our builder... but also nb01 has now ceased to be reachable so i'm going to see what's happened to it [20:18]
<prometheanfire> heh [20:18]
*** sgw has left #opendev [20:32]
<fungi> ahh, my bad, i was trying to reach old servers which we never cleaned up in dns [20:34]
<fungi> i'll clean that up [20:35]
<prometheanfire> I am having an issue (not the same one though) [20:35]
<prometheanfire> https://gist.github.com/prometheanfire/42cfd32e92df5f3a474848e414b2191b [20:36]
<prometheanfire> I'm thinking that six isn't installed in the base image for me [20:38]
<prometheanfire> it happens before I can install anything [20:38]
<prometheanfire> project-config/nodepool/elements/openstack-repos/extra-data.d/50-create-repo-list does it [20:38]
<fungi> #status log deleted old aaaa records for nonexistent nb01.openstack.org and nb02.openstack.org servers [20:39]
<openstackstatus> fungi: finished logging [20:39]
<fungi> looks like somebody cleaned up the ipv4 address records but not ipv6 [20:40]
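
[Editor's note] The leftover-record situation fungi describes (IPv4 address records removed, IPv6 ones left behind) can be spotted with a check along the following lines. This is only a rough sketch, not anything run in the channel: the function names `address_families` and `looks_half_cleaned` are made up for illustration, and `socket.getaddrinfo` goes through the system resolver, so local overrides such as a hosts file would affect the answer.

```python
import socket


def address_families(hostname, resolver=socket.getaddrinfo):
    """Return the set of families ("ipv4", "ipv6") that resolve for hostname.

    `resolver` is injectable so the logic can be exercised without network
    access; by default the system resolver is used.
    """
    labels = {socket.AF_INET: "ipv4", socket.AF_INET6: "ipv6"}
    found = set()
    for family, label in labels.items():
        try:
            if resolver(hostname, None, family):
                found.add(label)
        except socket.gaierror:
            # Nothing resolved for this family; treat it as absent.
            pass
    return found


def looks_half_cleaned(hostname, resolver=socket.getaddrinfo):
    """True when exactly one of the two families still resolves."""
    return len(address_families(hostname, resolver)) == 1
```

A host that returns only "ipv6" here would match the case in the log, where clients that prefer IPv6 keep trying an address that no longer has a server behind it.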
<prometheanfire> gonna try wrapping that import in a try/except [20:42]
<prometheanfire> yep, that worked [20:43]
<prometheanfire> https://dpaste.com/33V38UK3P [20:44]
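
[Editor's note] The workaround prometheanfire mentions, wrapping the six import in a try/except, generally looks something like the sketch below. The log does not say which name the 50-create-repo-list script actually imports from six, so `urlparse` and the helper `repo_host` are assumptions chosen purely for illustration; the pasted change linked above is the authoritative version.

```python
try:
    # Preferred path: use the six compatibility shim when it is installed.
    from six.moves.urllib.parse import urlparse
except ImportError:
    # Fallback (assumption): six is missing from the base image, so use the
    # Python 3 standard library equivalent instead.
    from urllib.parse import urlparse


def repo_host(url):
    """Hypothetical helper: return the host part of a repository URL."""
    return urlparse(url).netloc
```

Either branch binds the same name, so the rest of the script does not need to know whether six was available.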
<fungi> prometheanfire: so as far as replicating the issue, if it comes down to it, we're running this in an ubuntu 18.04 lts vm with docker-compose using this compose file: https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/nodepool-builder/templates/docker-compose.yaml.j2 [20:46]
<fungi> with defaults, so "image: docker.io/zuul/nodepool-builder:latest" [20:47]
<prometheanfire> it'll be a minute, for my testing [20:47]
<fungi> i'm not sure if any of those details will be involved in the problem [20:47]
<prometheanfire> 2021-01-16 20:47:09.165 | Caching gerrit from https://opendev.org/opendev/gerrit.git in /opt/dib_cache/source-repositories/gerrit_0a56dd139195635d3ada2296d9ddf8ce967dea28 [20:47]
<fungi> zuul seems to have caught up on its backlog, so maybe i'll restart the scheduler for gerrit wip support after dinner [20:52]
*** calcmandan_ has joined #opendev [21:00]
*** calcmandan has quit IRC [21:00]
*** lbragstad has quit IRC [21:01]
*** smcginnis has joined #opendev [21:42]
<fungi> looks like the static site volumes are releasing on a normal cadence again [21:48]
<fungi> there are still outstanding transactions for some of the mirrors though, so we'll need to consider if we want to try to abort them now that we can make rpc calls in a timely fashion again [21:48]
<fungi> we should be able to approve the revert for serving static sites from the writeable path at least, if any other config-core wants to review: https://review.opendev.org/770857 [21:51]
<fungi> zuul utilization had a bit of a spike around 20:30z so i'll give it a bit longer before i restart the scheduler so i won't need to reenqueue quite so many builds (we have around 150 nodes in use at the moment according to the zuul dashboard in grafana) [21:53]
<mnaser> fungi: I think you’re looking for infra-root perhaps :) — I can’t review that :p [21:53]
<fungi> mnaser: d'oh, you're right, that's a system-config repo. sorrt! [21:53]
<fungi> er, sorry! [21:54]
<fungi> but thanks for looking i guess :/ [21:54]
*** tosky has quit IRC [21:59]
<mordred> fungi: lgtm [22:15]
<fungi> thanks! [22:22]
*** tosky has joined #opendev [22:34]
<fungi> we're down around 45 nodes in use now... getting ready to restart the scheduler shortly if it drops a bit more [22:51]
<fungi> er, 65 i mean [22:51]
<openstackgerrit> Merged opendev/system-config master: Revert "Temporarily serve static sites from AFS R+W vols"  https://review.opendev.org/c/opendev/system-config/+/770857 [23:12]
* fungi sighs [23:25]
<fungi> another (smaller) spike around 23:00z, so just over 100 nodes in use at the moment [23:25]
