Friday, 2018-10-26

*** agopi|brb has quit IRC00:00
*** rpioso is now known as rpioso|afk00:10
*** betherly has joined #openstack-infra00:11
*** betherly has quit IRC00:15
*** darvon has joined #openstack-infra00:21
*** betherly has joined #openstack-infra00:31
*** betherly has quit IRC00:36
*** agopi|brb has joined #openstack-infra00:37
*** agopi|brb has quit IRC00:39
*** agopi|brb has joined #openstack-infra00:39
*** longkb has joined #openstack-infra00:40
*** betherly has joined #openstack-infra00:52
*** diablo_rojo has quit IRC00:55
*** betherly has quit IRC00:56
*** ansmith_ has joined #openstack-infra01:02
*** jamesmcarthur has joined #openstack-infra01:03
*** jamesmcarthur has quit IRC01:05
*** jamesmcarthur has joined #openstack-infra01:05
*** betherly has joined #openstack-infra01:12
*** betherly has quit IRC01:17
*** mrsoul has joined #openstack-infra01:17
*** betherly has joined #openstack-infra01:33
*** jamesmcarthur has quit IRC01:35
*** betherly has quit IRC01:37
*** carl_cai has quit IRC01:48
*** lbragstad has quit IRC01:49
*** lbragstad has joined #openstack-infra01:49
*** betherly has joined #openstack-infra01:53
*** rcernin has joined #openstack-infra01:55
*** betherly has quit IRC01:58
*** jamesmcarthur has joined #openstack-infra02:05
*** liusheng__ has joined #openstack-infra02:07
*** jamesmcarthur has quit IRC02:09
*** betherly has joined #openstack-infra02:14
*** bobh has joined #openstack-infra02:14
*** betherly has quit IRC02:19
openstackgerritMerged openstack-infra/project-config master: Disable inap-mtl01 provider  https://review.openstack.org/61341802:22
dmsimardWould it be a good idea to force http -> https redirection on our things that are available over ssl ?02:27
dmsimardlogs, git, zuul, etc02:27
dmsimardI could write a patch like that02:27
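For context on the redirect idea above: before forcing http -> https everywhere, it is easy to survey which services already redirect. A minimal sketch using only the Python standard library, with an illustrative (not authoritative) host list; the actual change would live in the Apache vhost configuration, not in Python:

    # Report whether each service answers plain HTTP with a redirect to HTTPS.
    # The host list is illustrative only.
    import urllib.error
    import urllib.request

    HOSTS = ["git.openstack.org", "logs.openstack.org", "zuul.openstack.org"]

    class NoFollow(urllib.request.HTTPRedirectHandler):
        def redirect_request(self, req, fp, code, msg, headers, newurl):
            return None  # surface the 3xx as an HTTPError instead of following it

    opener = urllib.request.build_opener(NoFollow)
    for host in HOSTS:
        try:
            resp = opener.open("http://%s/" % host, timeout=10)
            print("%s: HTTP %s, no redirect" % (host, resp.status))
        except urllib.error.HTTPError as err:
            print("%s: HTTP %s -> %s" % (host, err.code, err.headers.get("Location", "?")))

The real fix is Apache configuration copied from a vhost that already redirects, as discussed further down the log.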
*** adrianreza has joined #openstack-infra02:31
*** betherly has joined #openstack-infra02:34
*** betherly has quit IRC02:39
*** bhavikdbavishi has joined #openstack-infra02:47
dmsimardWhat's the thing that closes PRs on github with a template ?02:47
*** rh-jelabarre has quit IRC02:48
*** roman_g has quit IRC02:49
*** betherly has joined #openstack-infra02:54
*** betherly has quit IRC03:00
*** bobh has quit IRC03:10
*** ykarel|away has joined #openstack-infra03:25
*** dpawlik has quit IRC03:27
*** carl_cai has joined #openstack-infra03:27
*** cfriesen has quit IRC03:29
*** dpawlik has joined #openstack-infra03:29
*** betherly has joined #openstack-infra03:35
*** betherly has quit IRC03:40
*** udesale has joined #openstack-infra03:51
*** betherly has joined #openstack-infra03:56
*** lpetrut has joined #openstack-infra03:58
*** betherly has quit IRC04:00
*** dave-mccowan has quit IRC04:14
*** janki has joined #openstack-infra04:22
ianwclarkb: https://review.openstack.org/613503 Call pre/post run task calls from TaskManager.submit_task() I think explains our missing nodepool logs04:28
*** lpetrut has quit IRC04:34
*** dpawlik has quit IRC04:36
*** dpawlik has joined #openstack-infra04:39
*** kjackal has joined #openstack-infra04:45
*** ramishra has joined #openstack-infra05:12
*** yamamoto has quit IRC05:26
*** yamamoto has joined #openstack-infra05:26
*** kjackal has quit IRC05:29
*** carl_cai has quit IRC05:33
*** betherly has joined #openstack-infra05:36
*** betherly has quit IRC05:40
*** bhavikdbavishi1 has joined #openstack-infra05:47
*** trown has joined #openstack-infra05:49
*** kopecmartin has joined #openstack-infra05:50
*** elod_ has joined #openstack-infra05:50
*** evrardjp_ has joined #openstack-infra05:51
*** quiquell|off is now known as quiquell05:53
*** jpenag has joined #openstack-infra05:53
*** hemna_ has joined #openstack-infra05:54
*** ianw_ has joined #openstack-infra05:54
*** dims_ has joined #openstack-infra05:54
*** bhavikdbavishi has quit IRC05:55
*** apetrich has quit IRC05:55
*** dhill_ has quit IRC05:55
*** Diabelko has quit IRC05:55
*** SotK has quit IRC05:55
*** gothicmindfood has quit IRC05:55
*** kopecmartin|off has quit IRC05:55
*** dims has quit IRC05:55
*** dulek has quit IRC05:55
*** jpena|off has quit IRC05:55
*** strigazi has quit IRC05:55
*** elod has quit IRC05:55
*** nhicher has quit IRC05:55
*** lucasagomes has quit IRC05:55
*** gnuoy has quit IRC05:55
*** hemna has quit IRC05:55
*** evrardjp has quit IRC05:55
*** mudpuppy has quit IRC05:55
*** mattoliverau has quit IRC05:55
*** cgoncalves has quit IRC05:55
*** brwyatt has quit IRC05:55
*** emerson has quit IRC05:55
*** bradm has quit IRC05:55
*** chkumar|off has quit IRC05:55
*** ianw has quit IRC05:55
*** Qiming has quit IRC05:55
*** jlvillal has quit IRC05:55
*** aluria has quit IRC05:55
*** mdrabe has quit IRC05:55
*** mpjetta has quit IRC05:55
*** Keitaro has quit IRC05:55
*** trown|outtypewww has quit IRC05:55
*** bhavikdbavishi1 is now known as bhavikdbavishi05:55
*** ianw_ is now known as ianw05:55
*** brwyatt has joined #openstack-infra05:56
*** irclogbot_1 has quit IRC05:58
*** apetrich has joined #openstack-infra06:02
*** dhill_ has joined #openstack-infra06:02
*** Diabelko has joined #openstack-infra06:03
*** Keitaro has joined #openstack-infra06:05
*** chandankumar has joined #openstack-infra06:06
*** ykarel|away is now known as ykarel06:10
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: New Repo: OpenStack-Helm Docs  https://review.openstack.org/61189306:20
*** ccamacho has quit IRC06:20
*** xinliang has joined #openstack-infra06:21
AJaegerconfig-core, two new repos for review, please https://review.openstack.org/#/c/611892 and https://review.openstack.org/61189306:22
AJaegerdmsimard: openstack-infra/jeepyb/jeepyb/cmd/close_pull_requests.py - let me fix quickly...06:23
*** gfidente has joined #openstack-infra06:26
openstackgerritAndreas Jaeger proposed openstack-infra/jeepyb master: Use https for links  https://review.openstack.org/61350906:28
AJaegerdmsimard: ^06:28
*** aojeagarcia has joined #openstack-infra06:29
*** aojea has quit IRC06:33
*** ccamacho has joined #openstack-infra06:41
*** bhavikdbavishi has quit IRC06:48
*** ccamacho has quit IRC06:49
*** ccamacho has joined #openstack-infra06:51
*** yamamoto has quit IRC06:53
*** yamamoto has joined #openstack-infra06:53
*** yamamoto has quit IRC06:53
*** yamamoto has joined #openstack-infra06:54
openstackgerritOpenStack Proposal Bot proposed openstack-infra/project-config master: Normalize projects.yaml  https://review.openstack.org/61351106:55
*** quiquell is now known as quiquell|brb06:57
*** ginopc has joined #openstack-infra07:07
*** quiquell|brb is now known as quiquell07:14
*** rcernin has quit IRC07:22
*** ykarel is now known as ykarel|lunch07:24
*** shardy has joined #openstack-infra07:26
*** strigazi has joined #openstack-infra07:26
*** bauzas is now known as bauwser07:35
*** witek has quit IRC07:35
*** xek has joined #openstack-infra07:35
*** evrardjp_ is now known as evrardjp07:37
*** tosky has joined #openstack-infra07:46
*** hashar has joined #openstack-infra07:53
*** kjackal has joined #openstack-infra07:57
*** rossella_s has joined #openstack-infra08:00
*** jpich has joined #openstack-infra08:03
*** SotK has joined #openstack-infra08:06
*** elod_ is now known as elod08:07
*** carl_cai has joined #openstack-infra08:22
*** derekh has joined #openstack-infra08:23
openstackgerritMerged openstack-infra/project-config master: Normalize projects.yaml  https://review.openstack.org/61351108:29
*** ccamacho has quit IRC08:32
*** panda|off is now known as panda08:32
*** lucasagomes has joined #openstack-infra08:33
*** ccamacho has joined #openstack-infra08:33
openstackgerritFrank Kloeker proposed openstack-infra/openstack-zuul-jobs master: Rename index file of doc translations  https://review.openstack.org/61353108:35
ianwhrm, there seems to be something up with http://mirror.regionone.limestone.openstack.org/08:40
ianw#status log restarted apache2 service on mirror.regionone.limestone.openstack.org08:41
openstackstatusianw: finished logging08:41
ianwnothing really odd in the logs08:41
*** ykarel|lunch is now known as ykarel08:42
*** dulek has joined #openstack-infra08:46
openstackgerritMerged openstack-infra/irc-meetings master: Remove ironic-bfv and ironic-ui meetings  https://review.openstack.org/61269509:05
*** xinliang has quit IRC09:09
*** e0ne has joined #openstack-infra09:16
*** kjackal_v2 has joined #openstack-infra09:16
*** kjackal has quit IRC09:20
*** xinliang has joined #openstack-infra09:21
*** kjackal_v2 has quit IRC09:28
*** kjackal has joined #openstack-infra09:28
*** Qiming has joined #openstack-infra09:35
*** yamamoto has quit IRC09:36
*** alexchadin has joined #openstack-infra09:37
*** electrofelix has joined #openstack-infra09:58
*** dpawlik has quit IRC10:03
*** dpawlik_ has joined #openstack-infra10:03
*** lpetrut has joined #openstack-infra10:11
*** jamesmcarthur has joined #openstack-infra10:12
*** ssbarnea has joined #openstack-infra10:12
*** jamesmcarthur has quit IRC10:16
*** bhavikdbavishi has joined #openstack-infra10:28
*** bhavikdbavishi has quit IRC10:32
mtreinishfungi: we should be already running the fix for 7651, we switched to the ppa to get 1.15 (which includes the fix for 7651)10:32
mtreinishfungi: also persia and I backported that fix for ubuntu at the dublin ptg: https://bugs.launchpad.net/ubuntu/+source/mosquitto/+bug/175259110:32
openstackLaunchpad bug 1752591 in mosquitto (Ubuntu Bionic) "CVE-2017-7651 and CVE-2017-7652" [Undecided,Fix released]10:32
mtreinishso unfortunately I don't think it will fix our crashing issue, that's a bug with the log handling10:33
*** bhavikdbavishi has joined #openstack-infra10:34
*** jpenag is now known as jpena10:36
*** bhavikdbavishi has quit IRC10:38
mtreinishfungi: it probably doesn't hurt to bump up the version, but I'm not optimistic that it would fix the crashing10:40
*** betherly has joined #openstack-infra10:41
*** kjackal has quit IRC10:46
*** pbourke has quit IRC10:48
*** pbourke has joined #openstack-infra10:48
*** dtantsur|afk is now known as dtantsur10:48
*** ssbarnea has quit IRC10:49
*** e0ne has quit IRC10:51
*** e0ne_ has joined #openstack-infra10:52
*** alexchadin has quit IRC10:52
*** AJaeger_ has joined #openstack-infra10:57
*** AJaeger has quit IRC10:59
*** jpena is now known as jpena|lunch11:01
mtreinishfungi: we set it to 'present' in the puppet. So bumping the package will have to be done manually: https://git.openstack.org/cgit/openstack-infra/puppet-mosquitto/tree/manifests/init.pp#n1611:06
slaweqhi infra team11:06
slaweqI just spotted error like: http://logs.openstack.org/14/613314/1/check/neutron-grenade-multinode/6874aba/job-output.txt.gz#_2018-10-26_09_08_36_328644 (/tmp/ansible/bin/ara: No such file or directory) in two different jobs running on Neutron rocky branch, do You know what could cause that?11:07
*** kjackal has joined #openstack-infra11:10
*** dave-mccowan has joined #openstack-infra11:15
*** EmilienM is now known as EvilienM11:24
*** udesale has quit IRC11:28
*** panda is now known as panda|lunch11:29
*** hashar is now known as hasharAway11:31
*** ramishra has quit IRC11:31
*** janki has quit IRC11:36
*** ansmith_ has quit IRC11:39
*** rh-jelabarre has joined #openstack-infra11:43
*** longkb has quit IRC11:49
*** jpena|lunch is now known as jpena11:57
*** ykarel is now known as ykarel|away11:58
*** yamamoto has joined #openstack-infra12:02
*** carl_cai has quit IRC12:02
*** ykarel|away has quit IRC12:02
*** jcoufal has joined #openstack-infra12:04
*** kjackal has quit IRC12:08
*** kjackal has joined #openstack-infra12:09
*** emerson has joined #openstack-infra12:15
dmsimardslaweq: that comes from devstack-gate: http://codesearch.openstack.org/?q=%2Ftmp%2Fansible%2Fbin%2Fara&i=nope&files=&repos=12:16
dmsimardThe ara not found is intriguing. I need to drop kids at school, I'll be able to check in ~20 minutes12:18
slaweqdmsimard: thx a lot12:23
*** panda|lunch is now known as panda12:24
*** eharney has joined #openstack-infra12:25
*** bobh has joined #openstack-infra12:26
*** e0ne_ has quit IRC12:26
*** carl_cai has joined #openstack-infra12:29
*** yamamoto has quit IRC12:32
*** rlandy has joined #openstack-infra12:36
*** quiquell is now known as quiquell|lunch12:37
fungidmsimard: that should be pretty easy to do. we already have some sites/services we do that for (e.g. review, docs, governance, security) so i'd argue there's not a lot of reason to serve any of the rest of them via both http+https these days anyway12:40
*** agopi|brb is now known as agopi12:41
fungilooks like the releases site redirects http->https as well12:41
fungishould be able to just copy configuration from one or more of those, and apply it to anything in our ssl cert check config which is missing that12:42
*** roman_g has joined #openstack-infra12:43
openstackgerritSimon Westphahl proposed openstack-infra/zuul master: Use branch for grouping in supercedent manager  https://review.openstack.org/61333512:44
dmsimardslaweq: if you look a bit above that ara command not found error, you'll see that we failed to install ansible in the first place.. looks like timeout to the limestone mirror http://logs.openstack.org/14/613314/1/check/neutron-grenade-multinode/6874aba/job-output.txt.gz#_2018-10-26_08_42_51_58164412:44
slaweqdmsimard: thx for investigating that, so it looks that it was probably temporary issue on one cloud provider only12:45
dmsimardslaweq: the server looks healthy and reachable right now, there may have been a temporary network issue12:46
dmsimardplease recheck and let us know if it reoccurs12:46
dmsimardfungi: ok, I'll take a stab at it12:47
*** jamesmcarthur has joined #openstack-infra12:47
slaweqdmsimard: sure, thx a lot12:47
fungislaweq: dmsimard: earlier (08:40z in scrollback) ianw noted that apache had died on that mirror and he restarted it. also logged at https://wiki.openstack.org/wiki/Infrastructure_Status12:48
dmsimardah, well there we go12:49
dmsimardI'm not fully awake yet haha12:49
funginp, i'm already well on my way to caffeination12:50
*** mdrabe has joined #openstack-infra12:53
*** yamamoto has joined #openstack-infra12:55
quiquell|lunchfungi: Do you know why I have "This change depends on a change that failed to merge" here https://review.openstack.org/#/c/613297/12:56
quiquell|lunchfungi: all of them have been rebased12:56
fungiquiquell|lunch: the timing of the message is usually an indicator12:57
quiquell|lunchfungi: ahh wait... I didn't rebase one of the... git pull --rebase does not do the job12:58
slaweqfungi: thx also for help12:58
fungiquiquell|lunch: you uploaded patchset #5 at 10:15z, so it was queued for testing or possibly in the midst of running some jobs, then at 11:11z one of its dependencies got uploaded12:58
logan-regarding the limestone mirror apache issue, the disk is 90% full because of the base image churn from yesterday. there are 2 sets of base images cached on all of the nodes currently until nova deletes the old nodepool images today.12:59
quiquell|lunchfungi: ack thanks !12:59
fungiquiquell|lunch: and so it was queued to test with dependent change 613316,2 but you uploaded 613316,3 so zuul was informing you that the original dependency can never merge now and it has aborted the queued/running jobs12:59
*** ansmith_ has joined #openstack-infra12:59
logan-i will remove that hv from the aggregate for now so no nodepool images will get scheduled there, that will keep the usage steady until the cleanup occurs12:59
fungiquiquell|lunch: a recheck of 613297 will queue it to test with the new dependency you uploaded13:00
*** kgiusti has joined #openstack-infra13:00
*** dave-mccowan has quit IRC13:00
fungilogan-: thanks! one thing worth noting, to work around the full disk issues crashing the mirror vm completely we "preallocated" the remaining rootfs by writing zeroes to a file and then deleting it once we hit enospc13:01
*** derekh has quit IRC13:01
logan-yeah, I suspect the disk hit 100% at some point this morning (90% right now with 12 nodepool vms running), and the preallocation probably prevented it from crashing ;)13:03
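The "preallocation" trick mentioned above amounts to: fill the remaining guest filesystem with zeroes so the thin-provisioned backing storage gets allocated up front, then delete the file so the guest has its space back. A hypothetical sketch of that step (the path and chunk size are made up, and on the real mirrors this was done by hand):

    # Force allocation of the remaining rootfs, then give the space back.
    # Path and chunk size are illustrative only.
    import errno
    import os

    FILLER = "/var/fill.zeros"
    CHUNK = b"\0" * (1024 * 1024)  # 1 MiB of zeroes

    with open(FILLER, "wb", buffering=0) as f:
        try:
            while True:
                f.write(CHUNK)
        except OSError as err:
            if err.errno != errno.ENOSPC:
                raise  # only expect "no space left on device" here

    os.remove(FILLER)  # guest sees free space again; host-side allocation remains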
*** quiquell|lunch is now known as quiquell13:03
logan-218G of cached images weighing heavy on it heh13:04
*** derekh has joined #openstack-infra13:04
*** derekh has quit IRC13:04
fungioof!13:04
fungihow old are some of those? are we leaking images? that sounds like rather more than i would expect13:05
*** derekh has joined #openstack-infra13:05
logan-i think everything was rebuilt simultaneously yesterday during the zuul/nodepool maintenance so we ended up with 2x the number of images cached than normal13:05
fungiwe should only ever at most have 3x the number of image labels we've defined (current, previous as a safety fallback, and one uploading before the oldest gets deleted)13:06
logan-because iirc nova keeps the base images cached on the hv for 24h after their last use13:06
fungiohhh13:06
fungiso on the compute nodes, not in glance13:06
logan-since that maintenance is coming up on 24h i think this should just work itself out over the next few hours and then I can put the host back in the aggregate :)13:06
logan-yup13:07
openstackgerritSorin Sbarnea proposed openstack-dev/pbr master: Correct documentation hyperlink for environment-markers  https://review.openstack.org/61357613:07
fungialso i think in glance we'll generally run much closer to 2x than 3x because we only upload one image at a time and then delete the oldest for that label13:07
*** tpsilva has joined #openstack-infra13:09
logan-yup, glance is on a 30TB ceph pool so no concerns there13:10
logan-images leak often but I think clarkb cleaned up all of the old leaked images yesterday13:10
*** AJaeger_ is now known as AJaeger13:18
*** dpawlik_ has quit IRC13:18
*** mriedem has joined #openstack-infra13:19
*** dpawlik has joined #openstack-infra13:20
*** efried is now known as fried_rice13:23
*** e0ne has joined #openstack-infra13:25
*** chandankumar is now known as chkumar|off13:33
fungiyes, i believe he did shortly after the upgrade13:35
fungier, the zk cluster replacement for nodepool i mean13:35
*** agopi is now known as agopi|brb13:35
*** agopi|brb has quit IRC13:40
*** jamesmcarthur has quit IRC13:46
ssbarnea|bkp2fungi: regarding moving browbeat config to repo at https://review.openstack.org/#/c/613092 -- already merged in repo, do we need to keep the publish-to-pypi inside project-config or can we remove the entire section?13:48
ssbarnea|bkp2it is already listed inside repo.13:48
*** boden has joined #openstack-infra13:50
fungissbarnea|bkp2: it looks like other official projects have kept the publish-to-pypi or publish-to-pypi-python3 template application in project-config but i'll admit i haven't been following the goal work there closely enough to know for sure whether that's intended (i have to assume it must be?). AJaeger: do you know the reason for that?13:51
*** bnemec has joined #openstack-infra13:54
*** munimeha1 has joined #openstack-infra13:58
*** agopi|brb has joined #openstack-infra14:05
*** agopi|brb is now known as agopi14:05
openstackgerritMonty Taylor proposed openstack-infra/zuul master: DNM Link to change page from status panel  https://review.openstack.org/61359314:10
*** onovy has quit IRC14:16
AJaegerfungi: we left them in project-config since tagging does not know about pipelines, so a branched project needs to have the job declared in project-config. This is mentioned in infra-manual as well14:17
AJaegerssbarnea|bkp2: https://review.openstack.org/#/c/613004/6/.zuul.yaml did *not* import publish-to-pypi, it's not in-repo14:17
AJaegerssbarnea|bkp2, fungi, so https://review.openstack.org/#/c/613092 is fine to +2A IMHO.14:18
ssbarnea|bkp2AJaeger: no worry. i can add it. i just wanted to know if there is something preventing a full move.14:18
*** gfidente has quit IRC14:18
AJaegerssbarnea|bkp2: https://docs.openstack.org/infra/manual/creators.html#central-config-exceptions14:19
*** dpawlik has quit IRC14:19
fungiAJaeger: ahh, right, we still haven't decided on the possible https://review.openstack.org/578557 behavior change for that14:21
*** jamesmcarthur has joined #openstack-infra14:21
ssbarnea|bkp2AJaeger: so i was remembering something from that doc. Still "should" in specs is such a gray area... :)14:21
*** stephenfin is now known as finucannot14:22
*** dpawlik has joined #openstack-infra14:24
openstackgerritMonty Taylor proposed openstack-infra/zuul master: quick-start: add a note about github  https://review.openstack.org/61339814:25
*** tosky has quit IRC14:25
*** roman_g has quit IRC14:27
bodenhi.. I've been trying to update tricircle's in repo zuul config and tox for zuul v3 (they appear to be out of date) in https://review.openstack.org/#/c/612729/  previously they were installing required projects in tox, but when I remove that and add them to the .zuul.conf there's import errors http://logs.openstack.org/29/612729/3/check/openstack-tox-py27/a00e36e/job-output.txt.gz#_2018-10-25_20_17_28_206916  I see14:28
boden neutron installed as a sibling so I'm confused as to root cause of import err14:28
bodenany ideas?14:28
*** roman_g has joined #openstack-infra14:28
*** tosky has joined #openstack-infra14:29
*** dpawlik has quit IRC14:29
*** kjackal has quit IRC14:32
*** kjackal has joined #openstack-infra14:32
ssbarnea|bkp2AJaeger fungi : small css improvement on os-loganalyze (no more horizontal browsing on pip reqs listings): https://review.openstack.org/#/c/613383/14:35
*** smarcet has joined #openstack-infra14:37
bodenactually maybe it's because those dependencies are not in the requirements... I'll try that14:38
*** quiquell is now known as quiquell|off14:42
*** carl_cai has quit IRC14:42
fungiboden: yeah, http://logs.openstack.org/29/612729/3/check/openstack-tox-py27/a00e36e/tox/py27-siblings.txt indicates to me that it didn't get installed (probably owing to it not being in the requirements as you noted)14:45
bodenfungi thanks... what makes you say it wasn't installed from that log.. I see "Sibling neutron at src/git.openstack.org/openstack/neutron"  doesn't that mean it's already there?? just trying to understand for my own benefit14:46
fungimordred: is that ^ correct? would there be both a "sibling at path" line and a "found neutron python package installed" line in that log if it had been?14:46
mordredfungi: reading14:47
fungiboden: i interpreted that to mean that it sees src/git.openstack.org/openstack/neutron and is aware it's listed as a required-project but not necessarily that tox installed it into the resulting virtualenv14:47
bodenhmm ok.. thanks14:47
mordredyes - a found sibling must already be in the requirements for it to be installed14:47
mordredso if there are repos in required-projects but not listed in requirements.txt they will not be installed14:48
fungithe pip freeze is here too which i think confirms it: http://logs.openstack.org/29/612729/3/check/openstack-tox-py27/a00e36e/tox/py27-5.log14:48
fungino neutron in the freeze output14:48
mordred(this is to avoid things like pip install -e src/git.openstack.org/openstack/requirements - which is not what we'd want to have happen)14:48
fungilooks like the only package installed from local source in that freeze is tricircle==5.1.1.dev3514:49
fungiboden: ^14:50
fungissbarnea|bkp2: thanks! that looks pretty straightforward14:50
bodenfungi: ack, got it...14:50
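A rough illustration of the rule mordred and fungi spell out above: a repo listed in required-projects only gets installed as a sibling when the package it provides is already among the project's declared requirements. This is illustrative pseudologic with made-up names, not the actual tox-siblings role:

    # Install a checked-out sibling only if its package is already a requirement.
    import subprocess

    def maybe_install_sibling(repo_path, package_name, requirement_names):
        """requirement_names: package names parsed from requirements.txt."""
        if package_name.lower() not in {name.lower() for name in requirement_names}:
            # e.g. openstack/requirements may be in required-projects but must
            # never be pip-installed into the test environment.
            print("sibling %s not in requirements, skipping" % package_name)
            return False
        subprocess.check_call(["pip", "install", "-e", repo_path])
        return True

    # For tricircle's jobs this means neutron has to appear in requirements.txt
    # (or test-requirements.txt) before the src/ checkout is actually used.
    maybe_install_sibling("src/git.openstack.org/openstack/neutron", "neutron",
                          ["pbr", "neutron", "oslo.config"])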
*** armstrong has joined #openstack-infra14:51
*** ssbarnea|bkp2 has quit IRC14:51
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add the process environment to zuul.conf parser  https://review.openstack.org/61282414:51
*** rossella_s has quit IRC14:53
corvusinfra-root: i'm going to be afk until wednesday14:56
*** cfriesen has joined #openstack-infra14:56
fungithanks for the heads-up! i hope it's for fun reasons14:57
*** rpioso|afk is now known as rpioso14:58
*** ssbarnea has joined #openstack-infra15:00
*** diablo_rojo has joined #openstack-infra15:00
*** dansmith is now known as SteelyDan15:01
*** dave-mccowan has joined #openstack-infra15:04
*** dave-mccowan has quit IRC15:10
*** hasharAway is now known as hashar15:12
openstackgerritSorin Sbarnea proposed openstack-dev/pbr master: Correct documentation hyperlink for environment-markers  https://review.openstack.org/61357615:16
*** gyee has joined #openstack-infra15:23
*** onovy has joined #openstack-infra15:31
*** smarcet has quit IRC15:40
*** apetrich has quit IRC15:44
*** zul has quit IRC15:52
clarkbmorning, having a slow start to the day and I need to run some errands so it may be a bit before I'm actually around. I'd like to look at using the new compute resource usage logs to produce a report of some sort that shows usage by projects (and maybe by distro-release and other stuff if we can do it)15:52
*** agopi is now known as agopi|food15:52
clarkbit looks like zk is still happy and the node count has leveled off15:53
*** gothicmindfood has joined #openstack-infra15:54
*** apetrich has joined #openstack-infra15:58
*** lpetrut has quit IRC16:00
*** ginopc has quit IRC16:02
*** dtantsur is now known as dtantsur|afk16:07
*** kjackal has quit IRC16:10
*** e0ne has quit IRC16:10
*** kopecmartin is now known as kopecmartin|off16:13
*** bnemec is now known as beekneemech16:13
dmsimarddoes "flake8: noqa" no longer work ? I'm seeing pep8 failures that should be ignored16:18
openstackgerritAlex Schultz proposed openstack-infra/project-config master: Add noop to instack-undercloud  https://review.openstack.org/61363016:19
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Support node caching in the nodeIterator  https://review.openstack.org/60464816:20
*** shardy has quit IRC16:20
*** fried_rice is now known as fried_rolls16:20
dmsimardlooks like it's "noqa" instead of "flake8: noqa" now16:21
dmsimard¯\_(ツ)_/¯16:21
clarkbdmsimard: it has always been just # noqa iirc16:23
fungii don't recall ever using "flake8: noqa" and only ever used "noqa" myself16:23
dmsimardhttp://codesearch.openstack.org/?q=flake8%3A%20noqa&i=nope&files=&repos=16:24
fungiinteresting. i guess that must have worked at some point or else it's a really huge case of cargo-culting16:26
ssbarneasame with me, only used # noqa --- .... when it was not really possible to avoid it.16:26
mordredit definitely _used_ to work16:26
*** hashar is now known as hasharAway16:28
fungiskimming through http://flake8.pycqa.org/en/latest/release-notes/index.html it doesn't look like they ever deprecated it and i even see a reference to it in the latest 3.6.0 notes16:28
fungihttp://flake8.pycqa.org/en/latest/release-notes/3.6.0.html#features16:28
fungidmsimard: so it should still work?16:29
fungidmsimard: care to link to the failure in question?16:29
*** aojeagarcia has quit IRC16:30
dmsimardsure, hang on16:30
dmsimardexample: http://logs.openstack.org/99/613399/1/check/openstack-tox-pep8/6b14dee/job-output.txt.gz#_2018-10-25_23_54_59_038898 fixed by https://review.openstack.org/#/c/613634/16:31
*** jpich has quit IRC16:31
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Support node caching in the nodeIterator  https://review.openstack.org/60464816:31
*** mriedem is now known as mriedem_away16:34
fungidmsimard: is "E261 at least two spaces before inline comment" perhaps the actual problem you ended up solving (inadvertently) there?16:35
dmsimardyeah that's the first thing I tried16:36
fungiyou replaced "foo # flake8: noqa" lines with "foo  # noqa" (note the leading double-space)16:36
dmsimardthese failures sort of confused me to be honest because this code hasn't been touched in a very long time16:36
dmsimardand it started failing just now16:36
fungidid you suddenly switch to a newer flake8?16:37
dmsimardnot sure, to be fair there isn't exactly a lot of traffic on ara since everything is focused on the new 1.0 repos so it may be something that failed now but the cause dates back days/weeks16:38
fungiflake8==3.6.016:38
fungithat's the newest release from 3 days ago16:38
*** agopi|food is now known as agopi16:38
fungiand note the comment about noqa in the release notes i linked for 3.6.016:38
fungi"Only skip a file if # flake8: noqa is on a line by itself (See also GitLab#453, GitLab!219)"16:39
fungiso i take that to mean that prior to 3.6.0 it was skipping that whole file because at least one line had "# flake8: noqa"16:39
dmsimardI think it was only meant to skip a particular line but don't quote me on that16:39
dmsimardat least, that's my understanding of it16:40
fungiyeah, see those gitlab links16:40
dmsimardand from codesearch, it seems to be how projects are using it too16:40
fungiwhich would explain why all those unrelated linting errors for that file suddenly popped up when switching to 3.6.016:40
dmsimardoh! so it ignored the whole file instead of just the one line16:40
fungihttps://gitlab.com/pycqa/flake8/issues/45316:40
dmsimardwhich is in all likelihood not the original intent16:40
fungiyeah, i think people were misusing it16:40
fungiso les cultes du cargo at work16:41
dmsimardwell, if pep8 jobs start failing all over the place, we'll know why :D16:41
* fungi butchers french for your pleasure16:41
dmsimardmaybe openstack-dev worthy16:41
fungiyes, i think this will be of interest to openstack-dev ml16:41
dmsimardI'll send something16:41
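To make the flake8 3.6.0 change concrete, here are the three forms side by side; per the release note fungi quotes, only the second still skips a whole file, and the misused first form now suppresses nothing, which is why previously hidden errors surfaced:

    # Misused form: before flake8 3.6.0 this silently skipped the ENTIRE file.
    # In 3.6.0 it no longer does anything here, so other errors in the file show up.
    import os  # flake8: noqa

    # Still-valid file-level skip in 3.6.0: the comment must sit on its own line.
    # flake8: noqa

    # Per-line suppression, which is usually what was actually intended:
    import sys  # noqa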
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Support node caching in the nodeIterator  https://review.openstack.org/60464816:41
fungiodds are few people have run into it yet because we generally pin linters for official openstack projects at the start of a cycle16:42
*** fuentess has joined #openstack-infra16:42
fungiso this will require a fair amount of cleanup from a lot of projects who were up to now doing the wrong thing and not realizing it16:43
*** trown is now known as trown|lunch16:43
dmsimard++16:43
*** sthussey has joined #openstack-infra16:44
dmsimardfungi: I don't see a pin on flake8 in openstack/requirements.. would that be elsewhere ?16:49
fungidmsimard: it's in each project. we omit linters from requirements tracking explicitly16:50
dmsimardah16:50
fungibecause different projects will want to raise their linter caps at their own pace16:51
dmsimardmakes sense16:53
*** derekh has quit IRC16:58
*** bauwser is now known as bauzas17:01
*** betherly has quit IRC17:03
*** zul has joined #openstack-infra17:04
*** jpena is now known as jpena|off17:06
*** lpetrut has joined #openstack-infra17:08
*** jamesmcarthur has quit IRC17:16
*** electrofelix has quit IRC17:32
*** ykarel|away has joined #openstack-infra17:34
ssbarneafungi: no meeting in progress, good time to merge https://review.openstack.org/#/c/613022/ ?17:35
ssbarneaif i remember well openstack approach regarding linting was to pin to hacking which was pinning flake8, right?17:36
*** bobh has quit IRC17:36
fungiyes on both questions17:37
*** lbragstad is now known as elbragstad17:37
fungi613022 could use a second infra-root reviewer though since i'm the only +2 on it17:37
*** Swami has joined #openstack-infra17:38
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: Cleanup down ports  https://review.openstack.org/60982917:41
Shrewsianw: addressed your comments in ^^^17:42
*** smarcet has joined #openstack-infra17:49
*** trown|lunch is now known as trown17:50
*** xek has quit IRC17:58
*** jamesmcarthur has joined #openstack-infra18:01
AJaegerconfig-core, please review https://review.openstack.org/613092 https://review.openstack.org/#/c/611893/ https://review.openstack.org/#/c/61189218:10
*** armstrong has quit IRC18:10
*** munimeha1 has quit IRC18:11
*** mriedem_away is now known as mriedem18:28
*** apetrich has quit IRC18:34
*** apetrich has joined #openstack-infra18:35
*** e0ne has joined #openstack-infra18:35
openstackgerritFelipe Monteiro proposed openstack-infra/project-config master: Remove airship-armada jobs, as they are all in project  https://review.openstack.org/61101318:35
openstackgerritMerged openstack-infra/project-config master: Move openstack-browbeat zuul jobs to project repository  https://review.openstack.org/61309218:35
openstackgerritMerged openstack-infra/project-config master: New Repo - OpenStack-Helm Images  https://review.openstack.org/61189218:40
openstackgerritSean McGinnis proposed openstack-dev/pbr master: Fix incorrect use of flake8:noqa  https://review.openstack.org/61366518:43
openstackgerritMerged openstack-infra/project-config master: New Repo: OpenStack-Helm Docs  https://review.openstack.org/61189318:47
*** jamesmcarthur has quit IRC18:48
clarkbfungi: any idea if there are any meetings we need to worry about for 613022?18:53
fungididn't sound like it, but i haven't checked18:53
fungissbarnea seemed to think it was safe earlier when he brought it up18:54
clarkbeavesdrop seems to think it is ok18:54
ssbarneawhat can go wrong?18:55
clarkbssbarnea: the meetbot is restarted when we change its config so if a meeting is running when that happens it will break the logging of that meeting18:55
ssbarneaahh, yeah. that is why i suspect weekends are the best times for that. i doubt we have official meetings during them.18:56
clarkbthe latest meeting on eavesdrop is 1500UTC18:57
clarkbso I approved it (1900UTC now)18:57
*** armax has quit IRC19:05
openstackgerritAlex Schultz proposed openstack-infra/project-config master: Add noop to instack-undercloud  https://review.openstack.org/61363019:06
*** fried_rolls is now known as efried19:07
*** efried is now known as fried_rice19:11
*** e0ne has quit IRC19:14
*** jcoufal has quit IRC19:15
*** dave-mccowan has joined #openstack-infra19:18
*** toabctl has quit IRC19:19
*** e0ne has joined #openstack-infra19:19
*** smarcet has quit IRC19:19
*** hasharAway is now known as hashar19:20
*** toabctl has joined #openstack-infra19:21
fungias long as it doesn't take several days to merge, we should be fine ;)19:22
*** e0ne has quit IRC19:23
*** anticw has joined #openstack-infra19:26
anticwzuul/pipeline q ... is it possible to have a 3rd party gate that can test some but not all PS?  and then have it +Verified at which point zuul will no longer spend effort testing a PS?19:27
*** harlowja has quit IRC19:27
clarkbanticw: third party testing can filter patchsets however they like. Not sure what you mean by the second bit. You want zuul to not test a patchset if it gets +1 from third party ci? if so that isn't possible because you need a +1 and +2 from zuul to merge code19:31
Shrewsclarkb: oh, forgot to answer your question from yesterday. no, nodepool should not reuse image names. it does, however, retry upload attempts that fail. perhaps a failure actually succeeded?19:33
openstackgerritClark Boylan proposed openstack-infra/zuul master: Small script to scrape Zuul job cpu usage  https://review.openstack.org/61367419:35
clarkbShrews: interesting, that could be19:35
openstackgerritMerged openstack-infra/system-config master: Adding openstack-browbeat  https://review.openstack.org/61302219:36
*** lpetrut has quit IRC19:37
openstackgerritMerged openstack-infra/zuul master: quick-start: add a note about github  https://review.openstack.org/61339819:42
openstackgerritJeremy Stanley proposed openstack-infra/zuul master: Add reenqueue utility  https://review.openstack.org/61367619:44
*** jamesmcarthur has joined #openstack-infra19:46
clarkbmentioned this over in the tc channel but got some simple scripting going to determine nodepool node usage rates by project19:48
clarkbhttp://paste.openstack.org/show/733154/ produced by https://review.openstack.org/61367419:48
clarkbthe breakdown is tripleo: ~50% of all cpu time, neutron: ~14% and nova ~5%19:48
clarkbfor the 13 hour period I scraped the logs for19:48
Shrews50%? wow. i wonder what percentage of our nodes that ends up being19:51
clarkbShrews: 50%19:51
Shrewsoh, i misunderstood what you meant by cpu time19:51
clarkbya sorry. the calculation is job runtime * number of nodes used19:52
clarkband that is 50% of what we used not 50% of theoretical max (though I think we were behind the entire 13 hour period so should be the same)19:52
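The metric clarkb describes is simply job duration multiplied by the node count, summed per project. A toy version of that aggregation (the real thing is the script proposed in https://review.openstack.org/613674; the input records here are invented):

    # Toy per-project "node time" aggregation from (project, seconds, nodes) records.
    from collections import defaultdict

    records = [
        ("openstack/tripleo-heat-templates", 3600, 3),  # invented sample data
        ("openstack/neutron", 5400, 2),
        ("openstack/nova", 2700, 1),
    ]

    usage = defaultdict(float)
    for project, seconds, nodes in records:
        usage[project] += seconds * nodes

    total = sum(usage.values())
    for project, node_seconds in sorted(usage.items(), key=lambda kv: -kv[1]):
        print("%-40s %7.1f node-hours %5.1f%%" % (
            project, node_seconds / 3600.0, 100.0 * node_seconds / total))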
*** kjackal has joined #openstack-infra19:55
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Support node caching in the nodeIterator  https://review.openstack.org/60464820:03
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Rate limit updateNodeStats  https://review.openstack.org/61368020:03
openstackgerritTobias Henkel proposed openstack-infra/nodepool master: Rate limit updateNodeStats  https://review.openstack.org/61368020:09
fungiclarkb: thanks, sounds like it roughly matches up with what we expected20:10
clarkbfungi: ya no surprises for me other than nova and neutron are lower than they were a year-ish ago when I ran the numbers20:11
clarkbbut they still seem to be right near the top20:11
clarkbfungi: the other piece of info I find interesting is kolla and osa vs tripleo20:11
clarkb(are they not testing enough or is tripleo just incredibly inefficient, maybe both)20:11
clarkbalso 374 days of testing in 13 hours20:12
*** ansmith_ has quit IRC20:12
clarkbnotmyname also pointed out that activity is likely to factor in. Particularly so since I only had a small window of data20:12
clarkbI think if we can look at a month's worth over, say, the month of november we'll have a much better overall picture20:13
fungiyeah, this is really early to be drawing detailed conclusions20:15
fungialso what will be more interesting is not the snapshot but the trends over time20:15
fungiis that 50% decreasing? and how quickly? that will be interesting to find out20:15
clarkbyup20:16
clarkband whether or not it syncs up with the release cycle in interesting ways20:16
clarkbor is random etc20:16
fungii mean, they're already aware it's a concern and are working to improve the situation. now we can tell them how effective their attempts are to that end20:16
fungiwhich is far more interesting to me than blamethrowing and witch hunts20:16
anticwclarkb: there is no way to have an external bot +2 in which case zuul doesn't need to test?20:19
clarkbanticw: not in the current system. Zuul is our gatekeeper and it doesn't know how to share those duties with another system20:20
clarkb(I think that is intentional fwiw not a bug)20:20
clarkbthe reason for that is zuul has to ensure that the changes going through it don't break zuul itself20:20
anticwclarkb: would it be perverse then to have the zuul job check an external gate for status and short-circuit to OK?20:23
clarkbanticw: it might be better to try and understand what it is you are trying to do more concretely? What does the third party job do? Does it have to be third party? etc20:24
anticwopenstack helm jobs are involved and take a long time, i'm asking people if we can do some of this work and avoid hitting the gates so hard20:24
anticwthere are also sometimes quite long delays before a job will run (3-4 hours isn't uncommon)20:24
*** zul has quit IRC20:25
anticwconcretely, i'm looking at zuul right now for 613611,1 for example, we're 5 hours 14 minutes into it20:26
clarkbanticw: that is relevant to the discussion fungi and I were just having above. In the last 13 hours openstack-helm was 0.8% of our resource consumption20:26
clarkbanticw: put another way helm isn't hitting the gates so hard20:26
clarkb(so we shouldn't expect helm moving third party to change the backlog situation dramatically)20:27
*** kjackal has quit IRC20:27
anticwok good to know ... so are we doing things poorly that are causing delays?  these delays aren't new20:27
anticwalso, we get a lot of post-failures20:27
anticw(it's better in the last week after some refactoring but still not very fast)20:27
clarkbanticw: no the delays are due to total demand, we have a fixed number of test resources and people trying to test far more than we can keep up with (tripleo is ~50% over the last 13 hours for example)20:28
*** armax has joined #openstack-infra20:28
clarkbanticw: the ways to improve the backlog are either to reduce demand (fix bugs in software to reduce gate resets and number of failures that are "invalid") or to increase the number of resources we have20:28
*** jamesmcarthur has quit IRC20:29
clarkbanticw: are you having post failures due to timeouts?20:29
anticwsometimes, unclear why things are slow20:29
clarkbanticw: examples would be good if you have them because post failures can happen for a number of reasons20:29
anticwhttps://review.openstack.org/#/c/613356/20:30
anticwjust taking a recent job20:30
anticwalso, things run slower than i would expect ... if i run the jobs on a VM locally ... on very old hardware (very old) and slow rotating disks, the gate jobs for me run in about half the time20:30
fungianticw: the thread starting at http://lists.zuul-ci.org/pipermail/zuul-discuss/2018-October/000575.html is also probably relevant to your concerns20:30
anticwsometimes less20:30
anticwtesting the gate jobs in aws and azure the timing is even better than my local test20:31
clarkbanticw: do you know what sort of resources you are constrained by? are you running dstat or similar so that we can see what the hold up is?20:31
fungianticw: are your slow-running jobs reliant on nested virtualization performance, perhaps?20:31
clarkbanticw: fwiw kata switched from azure to vexxhost (one of our providers) and the runtime was cut in half? something like that20:31
anticwfungi: no20:31
clarkbanticw: likely important to identify what the resource contention is if we want to improve it20:31
anticwclarkb: i'm guessing we're badly IO limited in some but not all cases20:32
anticwcertainly some builder infra (as identified by hostname which might be bogus) seem worse than others20:32
clarkbanticw: the last time this came up with OSH I had asked for more logging and data like dstat. Any idea if we have that? It's easy to point and say "this is bad, aws is better" but I can't make that actionable20:32
anticwclarkb: we have 'more logging' but i don't know that it's enough to pinpoint it just yet20:33
anticwsrwilkers: ^ ?20:33
*** ykarel|away has quit IRC20:34
anticwnot entirely useful but https://pastebin.com/HGKCEGgJ is a grep from the job running locally20:35
anticwthat shows the job ran in about 15 minutes ... again ... on a VM on pretty old (2010) hardware20:35
clarkbanticw: that particular post failure appears to be due to one of the instances not being reachable at the end. It appears that the job failed properly earlier in the job due to mariadb not starting (possibly because it was supposed to run on the non-responsive instance?)20:35
anticwusing the aforementioned url i think that took 1h 2 mins on a gate20:36
clarkbanticw: ish, but it timed out waiting for a thing to happen that never happened. I don't think that timeout was due to slowness but instead due to network communication problems20:36
clarkb(still not good, but important to identify the issue)20:36
anticwclarkb: ok, network issues are something that's been pointed out before20:36
anticwi'm not really sure what those would be ... and why some builders would have them20:37
anticwagain, i tried aws and azure as reference points and for the most part they were rock solid and considerably faster (2x to 4x)20:37
clarkbright and I'm asking you to help us identify why that is the case so that we can hopefully improve the situation with our zuul20:38
anticwyeah, so if it's networking ... what do you suggest to help there?20:38
funginetwork connectivity between instances in some cloud providers can vary in quality, for sure. that's been one of our biggest challenges for overall reliability of jobs20:38
anticwfungi: ok, but ... we're not doing a lot of networking20:38
clarkbanticw: I don't know that networking was slow. It appears networking didn't work at all for one of your 5 instances. Those two issues may be orthogonal to each other20:38
anticwand networking between VMs in providers isn't a new thing20:38
fungium, yes i'm quite aware20:39
anticwclarkb: others have claimed networking issues as well20:39
fungiis your problem a new thing?20:39
anticwno, not new20:39
clarkbfungi: no we tried debugging this a while back20:39
clarkbI asked for logging and never got any20:39
fungiokay, just figuring out what you mean by "networking between VMs in providers isn't a new thing"20:39
anticwjust the number of checks required has increased so transient failures bite more now20:39
clarkbbecause unfortunately we weren't logging why the containers were failing to start20:39
clarkbjust that they had failed20:39
fungianticw: and this is a 5-node job?20:39
anticwfungi: i'm saying whilst i accept networking might be an issue ... in this day and age that seems surprising20:39
anticwthe above example was yes20:40
fungioh, then prepare to be surprised20:40
fungicloud providers love to under^Wright-size their network gear20:40
fungiand it gets saturated massively for some periods20:40
anticwyeah ... but even so ... networking got super fast and super cheap ... you'd really have to put effort in to make it that poor20:40
anticw10GbE+ is basically free at this stage ... i have a box of 10G nics someone gave me even20:41
fungihas nothing to do with effort making network gear slow and everything to do with noisy neighbors sharing network resources with you20:41
clarkbok I've confirmed the node that had ssh connectivity issues in ara is the one that was trying to host the failed mariadb container20:41
fungiand if cloud providers are using servers from 2010, imagine that their network gear can easily be of the same vintage20:41
anticwclarkb: how did you verify that?20:41
anticwfungi: i'm using 2010 hardware, i imagine they are using something less ancient20:42
fungiahh, in some cases they aren't (or not much newer anyway)20:42
*** smarcet has joined #openstack-infra20:42
*** openstackstatus has quit IRC20:42
*** openstack has joined #openstack-infra20:44
*** ChanServ sets mode: +o openstack20:44
clarkbanticw: I can't use that data to say that would cause slowness (I don't know enough about your tests), but I am fairly confident that is why it failed20:44
anticwi guess our tests take a long time20:44
anticwwhich makes things worse20:44
fuentessclarkb: hi Clark, can you help me disable the kata Fedora job for the proxy repo? We still have some issues with Fedora on vexxhost, so would be good to disable it until we resolve them20:45
anticwmore likely to have some sort of glitch somewhere the longer we run20:45
clarkbfuentess: yup20:45
anticwclarkb: i don't really know how to instrument cpu/io on zuul VMs but i could locally ... is that useful?20:46
clarkbfuentess: https://git.openstack.org/cgit/openstack-infra/project-config/tree/zuul.d/projects.yaml#n45 is the section of code to edit. Remove the line for the fedora job20:46
clarkbanticw: we've used dstat for a long time with things like devstack + tempest jobs20:46
fuentessclarkb: ohh cool, thanks20:46
anticwyeah, we could have dstat log in the background20:47
clarkbanticw: captures io (network and disk), memory use, cpu usage etc every second iirc20:47
clarkbanticw: and there are tools to render the data into more human friendly formats like stackviz20:47
fungii wonder if we even already have a dstat role for starting it early in the job and then collecting its logs20:47
clarkb(though I think stackviz dstat rendering is broken right now)20:47
clarkbfungi: that type of work has been ongoing in zuul land20:47
clarkbfungi: no definitive answer yet but progress I think20:48
fungiyeah, i could see that being generally useful across a broad variety of jobs20:48
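dstat is the usual tool for this, but where it isn't available a few lines of Python with the third-party psutil package can log the same kind of per-second CPU/memory/disk/network counters. A sketch, not an existing infra role:

    # Rough dstat-style sampler: one log line per second of resource counters.
    # Requires the psutil package.
    import time
    import psutil

    def sample(count=60, logfile="resource-usage.log"):
        with open(logfile, "a") as out:
            for _ in range(count):
                cpu = psutil.cpu_percent(interval=1.0)  # also acts as the sleep
                mem = psutil.virtual_memory().percent
                disk = psutil.disk_io_counters()
                net = psutil.net_io_counters()
                out.write("%s cpu=%.1f%% mem=%.1f%% rd=%d wr=%d rx=%d tx=%d\n" % (
                    time.strftime("%H:%M:%S"), cpu, mem,
                    disk.read_bytes, disk.write_bytes,
                    net.bytes_recv, net.bytes_sent))
                out.flush()

    if __name__ == "__main__":
        sample()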
anticwre: multinode i could spit out a DS that does NxN network pings and have that log i guess20:49
anticw(ping in a generic sense)20:49
anticwpeople like mellanox who have their own CI ... does that also require zuul for merges?20:50
fungithat might at least allow you to also short-circuit your job early if one of your expected nodes in the multinode set becomes unreachable20:50
anticw(i forget where i saw this, some typo on a PS# and it popped up once)20:50
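anticw's NxN reachability idea is easy to sketch: from each node, check every peer and bail out early if one is unreachable, rather than timing out deep inside the job. The peer list here is hypothetical; a real job would take it from the inventory Zuul provides:

    # Fail fast if any peer in a multinode set is unreachable (hypothetical hosts).
    import socket
    import sys

    PEERS = ["primary", "subnode-1", "subnode-2"]
    PORT = 22  # ssh

    unreachable = []
    for host in PEERS:
        try:
            with socket.create_connection((host, PORT), timeout=5):
                pass
        except OSError as err:
            unreachable.append((host, err))

    if unreachable:
        for host, err in unreachable:
            print("cannot reach %s:%d: %s" % (host, PORT, err))
        sys.exit(1)  # short-circuit instead of failing much later in the job
    print("all peers reachable")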
clarkbanticw: the other thing to keep in mind is that the jobs themselves can crash the networking on the host too (I have no idea if that is happening here)20:50
clarkbeither by updating the firewall improperly or applying new config to interfaces that won't work within a provider. We've seen both things happen with jobs in the past20:51
anticwclarkb: how does networking on the host crash?  that seems like it should be pretty rare20:51
anticwthat's the sort of thing i would expect on a c-6420:51
fungi"crash" is a relative term here ;)20:51
clarkbanticw: crash in the sense it stops working not kernel panic crash20:51
clarkbwe've had jobs use invalid network ranges and apply them to the actual host interfaces20:52
clarkbthat will break things fast20:52
fungior you could do something to inadvertently flush the iptables rules suddenly blocking all traffic on that instance20:52
clarkbwe've also had jobs apply firewall rules that prevent ssh ya ^20:52
fungior something could simply cause the service you're trying to talk to on that node to die and not restart20:52
anticwwe used to use a lot of memory20:52
anticwthat's better now but not ideal20:52
fungiyes, oom killer knocking out a crucial service is not unusual20:53
anticwour stuff is kinda bloaty :(20:53
anticwi don't know that we get oom, mostly just poor IO performance (lack of page cache hits)20:53
fungiare your jobs setting up swap memory? if not, that will lead to oom faster than you expect20:53
clarkbfungi: I think we do that in the base job now?20:54
clarkbbut it might be devstack specific?20:54
anticwfungi: swap will cause k8s to cry, though can be worked around20:55
clarkbusually you want swap not because you expect the job will succeed, but because swap will allow you to get the necessary data to diagnose problems that happen when memory runs out20:55
fungiwow, really? kubernetes is allergic to swap memory? that seems strange20:56
clarkbfungi: likely as much as anything else is, like mysql or kvm20:56
clarkbthings will get really slow and stop working within timeouts20:56
fungii mean, obviously you don't want active tasks paging out memory they're still accessing, but it can give you breathing room for other background processes to get paged out20:56
anticwkinda, there is a long thread/debate about it and i'm just gonna get angry if i get into it :)20:57
fungifair! ;)20:57
*** hashar has quit IRC20:57
anticwi also had someone back into me an hour ago so am a bit grouchy20:57
anticwinto my car i mean20:57
clarkbin any case I think the short term answer here is it would be great to get more log data if possible. Understanding what resource contentions you do have so that we can at least attempt to address them would be good20:58
anticwi like the idea of a long running dstat ... or netstat20:59
anticwi think that would be useful20:59
anticwand some sort of networking sanity checker20:59
anticwit might also be we're just asking too much from the VMs and should move entirely to a 3rd party gate (if possible)20:59
clarkbyup that is possible too, but hard to say without data like ^21:00
anticwclarkb: well, one old data point is that when i run a test locally it needs over 8GB ... how it even runs on the gates i'm not sure21:00
clarkbas for third party gating I think your hack is the closest you will get. Zuul has to gate its config changes21:00
anticwwe would still have to wait for things to work through the queue though21:01
anticweven if we fall out in 20s21:01
clarkbanticw: jobs that just want to do an http request don't need to use a nodeset with nodes. They can run directly on the executor21:01
clarkbit is a very constrained environment and we use it for stuff like retriggering read the docs builds21:02
anticwclarkb: yeah, but we might not know if we were able to test it in some cases21:02
clarkbalso tripleo has a plan to reduce test resource needs as well as make tests more reliable. Here is hoping that improves the demand side of things for them21:02
anticwit would require us to have thorough and robust external gates, i was thinking more "if we can..."21:03
*** trown is now known as trown|outtypewww21:03
openstackgerritSalvador Fuentes Garcia proposed openstack-infra/project-config master: Remove Fedora Job for Kata project  https://review.openstack.org/61369021:03
clarkbfuentess: thanks, I went ahead and approved it21:03
fuentessclarkb: thank you21:04
*** jamesdenton has quit IRC21:11
*** fuentess has quit IRC21:12
*** eharney has quit IRC21:12
*** jamesmcarthur has joined #openstack-infra21:13
fungiwow, even the new zuul status ui is taking a while to load for me21:22
fungitripleo changes currently account for 75% of the changes in the gate pipeline21:23
*** jamesmcarthur has quit IRC21:24
fungiand roughly a third of them are indicating job failures21:24
fungior merge conflicts or dependency on a failed change21:25
fungii should say roughly a third of the changes near the top of their gate queue anyway21:25
fungilooks like the wait for node requests in the check pipeline is a little over 5 hours at this point21:26
*** boden has quit IRC21:26
fungiwe're hovering around 700 nodes in use at the moment21:33
fungiwith another ~100 building/deleting21:34
*** jbadiapa has quit IRC21:34
clarkbya thats about right with inap disabled21:34
fungiand still down half of ovh right?21:34
clarkbno ovh is up21:36
clarkband rarely leaking ports in gra121:37
*** rlandy has quit IRC21:41
*** mriedem has quit IRC21:52
*** armax has quit IRC21:53
*** jlvillal has joined #openstack-infra22:09
*** ansmith_ has joined #openstack-infra22:16
openstackgerritAlex Schultz proposed openstack-infra/project-config master: Add noop to instack-undercloud  https://review.openstack.org/61363022:19
*** tpsilva has quit IRC22:27
*** bobh has joined #openstack-infra22:28
*** armax has joined #openstack-infra22:59
*** agopi has quit IRC23:02
*** pcaruana has quit IRC23:10
*** rh-jelabarre has quit IRC23:18
*** diablo_rojo has quit IRC23:40
*** Swami has quit IRC23:43
*** rcernin has joined #openstack-infra23:43
*** jesusaur has quit IRC23:45
*** smarcet has quit IRC23:45
*** rcernin has quit IRC23:51
*** kgiusti has left #openstack-infra23:51
*** gyee has quit IRC23:55

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!