Monday, 2019-03-04

*** jamesmcarthur has quit IRC00:00
*** jamesmcarthur has joined #openstack-infra00:00
*** jamesmcarthur has quit IRC00:02
*** ijw has joined #openstack-infra00:08
ianwthere's not enough space to, even if we wanted to, and we should maybe clean up old volumes.  i'll add a meeting item00:16
ianwfungi: it's already at a 5 minute ttl, so i think most things would be not really caching the remote address00:28
*** hwoarang has quit IRC00:32
*** jamesmcarthur has joined #openstack-infra00:32
*** hwoarang has joined #openstack-infra00:33
ianw#status log graphite.o.o A/AAAA records renamed to graphite-old.o.o, graphite.o.o now a CNAME to these until switch to graphite.opendev.org00:36
openstackstatusianw: finished logging00:36
openstackgerritMerged openstack-infra/project-config master: Add windmill-config project  https://review.openstack.org/64051000:52
*** wolverineav has joined #openstack-infra00:55
fungiianw: that assumes applications re-resolve hostnames periodically at all00:56
fungifor datagrams it might not be as prevalent as established tcp sockets though00:57
fungi(continuing to assume the original address to which you resolved a name is relevant)00:57
*** wolverineav has quit IRC01:00
*** jamesmcarthur has quit IRC01:01
ianwfungi: yeah, we only open udp 8125 and i think it all pretty much uses the statsd client, every packet blast will be a fresh lookup to the local resolver anyway01:01
*** markvoelker has joined #openstack-infra01:02
fungiahh, in that case it's probably fine01:04
*** sdake has joined #openstack-infra01:06
*** markvoelker has quit IRC01:07
*** jesusaur has quit IRC01:19
*** jesusaur has joined #openstack-infra01:25
*** ijw has quit IRC01:27
*** ijw_ has joined #openstack-infra01:27
*** ijw_ has quit IRC01:28
*** ijw has joined #openstack-infra01:29
*** ijw has quit IRC01:30
*** ijw_ has joined #openstack-infra01:30
*** sdake has quit IRC01:44
*** diablo_rojo has joined #openstack-infra01:48
*** sdake has joined #openstack-infra01:50
*** wolverineav has joined #openstack-infra01:56
*** sdake has quit IRC01:59
*** wolverineav has quit IRC02:01
*** markvoelker has joined #openstack-infra02:03
*** sdake has joined #openstack-infra02:05
*** diablo_rojo has quit IRC02:06
*** markvoelker has quit IRC02:06
*** sdake has quit IRC02:08
*** chason has quit IRC02:36
*** chason has joined #openstack-infra02:38
*** sdake has joined #openstack-infra02:41
*** wolverineav has joined #openstack-infra02:42
*** sdake has quit IRC02:44
ianwhttps://github.com/jsocol/pystatsd/blob/master/statsd/client/udp.py#L30 ... boo it's not fine, it looks up once02:47
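[editor's note] A minimal sketch of the alternative discussed above: a UDP sender that resolves the destination hostname on every send, so a DNS change (within the TTL) is picked up without a restart. This is not the pystatsd code; the class name `ReResolvingUDPSender` and its methods are hypothetical, and IPv4-only resolution is an assumption made to keep the example short.

```python
import socket


class ReResolvingUDPSender:
    """Hypothetical sender that looks up the host on every send."""

    def __init__(self, host, port=8125):
        # Store the name, not a resolved address.
        self.host = host
        self.port = port
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send(self, payload):
        # Resolve at send time; the local resolver decides whether to
        # answer from its cache or to query again after the TTL expires.
        info = socket.getaddrinfo(
            self.host, self.port, socket.AF_INET, socket.SOCK_DGRAM
        )
        addr = info[0][4]  # (ip, port) tuple of the first result
        self.sock.sendto(payload, addr)
        return addr
```

The trade-off is one resolver lookup per packet, which is usually cheap against a caching local resolver but is not free. A client that resolves once at construction time, as observed above, keeps sending to the old address until the process restarts.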
*** sdake has joined #openstack-infra02:50
*** wolverineav has quit IRC02:50
*** psachin has joined #openstack-infra02:52
*** ijw_ has quit IRC02:58
*** ijw has joined #openstack-infra02:59
*** jamesmcarthur has joined #openstack-infra03:01
*** wolverineav has joined #openstack-infra03:05
*** rkukura has quit IRC03:05
*** jamesmcarthur has quit IRC03:05
*** ijw has quit IRC03:08
*** ijw has joined #openstack-infra03:09
*** ykarel|away has joined #openstack-infra03:11
*** yamamoto has joined #openstack-infra03:16
*** ykarel|away is now known as ykarel03:17
ianw03:21:35.787558 IP 117.114.139.162.59587 > graphite.openstack.org.8125: UDP, length 4903:22
ianwi can not log into this host, but yet it's making it past the graphite firewall, despite not being listed in iptables :/ ?03:22
ianwin china somewhere, must be the old arm builder?03:23
*** sdake has quit IRC03:24
*** sdake has joined #openstack-infra03:26
*** jamesmcarthur has joined #openstack-infra03:27
ianwlooks like a zuul executor?  something to do with ustack?  http://paste.openstack.org/show/747206/03:29
clarkbthats unexpected03:29
clarkbis it in our firewall rules? the system config log may have ideas03:30
ianwok, so it's tcpdump, it's *not* making it past the firewall.  but someone is misconfigured to send their stats to us03:30
ianwyeah, it's something from https://www.ustack.com/03:35
*** sdake has quit IRC03:41
*** jamesmcarthur has quit IRC03:41
fungisounds like someone deployed a ci system using a copy of our configuration management and didn't change the statsd destination variable03:43
*** wolverineav has quit IRC03:47
*** wolverineav has joined #openstack-infra03:48
*** janki has joined #openstack-infra03:52
*** wolverineav has quit IRC04:14
*** yamamoto has quit IRC04:18
*** udesale has joined #openstack-infra04:20
*** ramishra has joined #openstack-infra04:26
*** diablo_rojo has joined #openstack-infra04:32
*** yamamoto has joined #openstack-infra04:36
*** ijw has quit IRC04:37
*** udesale has quit IRC04:39
*** ijw has joined #openstack-infra04:39
*** udesale has joined #openstack-infra04:41
*** ijw has quit IRC04:45
ianwyep, just had me confused because we did have a builder at one point in CN sending stats (moved to london though).  but yeah, it's executor, builder, scheduler stats coming in so a misconfiguration04:50
*** ykarel is now known as ykarel|afk04:51
*** wolverineav has joined #openstack-infra04:52
*** Tengu_ is now known as Tengu04:58
*** sdake has joined #openstack-infra05:09
ianw#status graphite.opendev.org now active replacement for graphite.openstack.org.  everything on the firewall list that might need a restart to pickup new address has been done05:14
openstackstatusianw: unknown command05:14
ianw#status log graphite.opendev.org now active replacement for graphite.openstack.org.  everything on the firewall list that might need a restart to pickup new address has been done05:14
openstackstatusianw: finished logging05:15
*** ijw has joined #openstack-infra05:15
*** snapiri has joined #openstack-infra05:19
*** wolverineav has quit IRC05:23
*** ykarel|afk is now known as ykarel05:25
*** ricolin has joined #openstack-infra05:27
*** udesale has quit IRC05:39
*** udesale has joined #openstack-infra05:39
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Separate out executor server from runner  https://review.openstack.org/60707905:41
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: zuul-runner: implement prep-workspace  https://review.openstack.org/60708205:41
*** raukadah has quit IRC05:45
*** chandankumar has joined #openstack-infra05:46
*** sdake has quit IRC05:48
*** sdake has joined #openstack-infra05:54
*** sdake has quit IRC06:01
*** sdake has joined #openstack-infra06:02
*** sdake has quit IRC06:03
*** diablo_rojo has quit IRC06:04
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Add API endpoint to get frozen jobs  https://review.openstack.org/60707706:26
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Get executor job params  https://review.openstack.org/60707806:26
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: Separate out executor server from runner  https://review.openstack.org/60707906:26
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: zuul-runner: implement prep-workspace  https://review.openstack.org/60708206:26
openstackgerritTristan Cacqueray proposed openstack-infra/zuul master: zuul-runner: add yaml based configuration file  https://review.openstack.org/64067206:26
*** apetrich has joined #openstack-infra06:30
*** tkajinam_ has joined #openstack-infra06:36
*** tkajinam has quit IRC06:38
*** udesale has quit IRC06:46
*** udesale has joined #openstack-infra06:47
*** markvoelker has joined #openstack-infra06:48
*** wolverineav has joined #openstack-infra06:54
*** wolverineav has quit IRC07:02
*** quiquell|off is now known as quiquell07:03
*** udesale has quit IRC07:05
*** udesale has joined #openstack-infra07:08
*** udesale has quit IRC07:09
*** udesale has joined #openstack-infra07:09
*** auristor has quit IRC07:12
*** slaweq has joined #openstack-infra07:15
*** markvoelker has quit IRC07:21
AJaegerconfig-core, could you go over the queue, please? we have some changes since mid February waiting for a second +2...07:25
*** wolverineav has joined #openstack-infra07:27
*** jtomasek has joined #openstack-infra07:30
*** wolverineav has quit IRC07:31
*** yamamoto has quit IRC07:40
*** wolverineav has joined #openstack-infra07:41
*** yamamoto has joined #openstack-infra07:41
*** auristor has joined #openstack-infra07:41
*** pgaxatte has joined #openstack-infra07:42
*** quiquell is now known as quiquell|brb07:43
*** rkukura has joined #openstack-infra07:44
*** wolverineav has quit IRC07:47
*** ginopc has joined #openstack-infra07:50
*** snapiri has quit IRC07:52
*** adriancz has joined #openstack-infra07:56
*** kopecmartin|off is now known as kopecmartin08:04
*** jesusaur has quit IRC08:06
*** jesusaur has joined #openstack-infra08:10
*** panda|ruck|off is now known as panda|ruck08:12
*** rpittau|sardegna is now known as rpittau08:13
*** dpawlik has joined #openstack-infra08:16
*** markvoelker has joined #openstack-infra08:18
*** quiquell|brb is now known as quiquell08:18
*** rascasoft has joined #openstack-infra08:20
*** pcaruana has joined #openstack-infra08:25
*** mandre_away is now known as mandre08:25
*** tosky has joined #openstack-infra08:27
*** helenafm has joined #openstack-infra08:29
*** rkukura has quit IRC08:31
*** wolverineav has joined #openstack-infra08:33
*** e0ne has joined #openstack-infra08:34
*** wolverineav has quit IRC08:37
*** iurygregory has joined #openstack-infra08:40
*** jpich has joined #openstack-infra08:43
*** dtantsur|afk is now known as dtantsur08:44
*** tkajinam_ has quit IRC08:48
*** markvoelker has quit IRC08:50
*** jpena|off is now known as jpena08:56
*** rossella_s has joined #openstack-infra08:56
*** rkukura has joined #openstack-infra08:58
*** roman_g has joined #openstack-infra08:59
*** janki has quit IRC09:00
*** janki has joined #openstack-infra09:00
*** udesale has quit IRC09:04
openstackgerritMerged openstack-infra/storyboard-webclient master: removes # for cards in automatic worklists  https://review.openstack.org/64012809:10
openstackgerritRoman Gorshunov proposed openstack-infra/irc-meetings master: Update Airship meeting time, change chair  https://review.openstack.org/64035909:15
*** ykarel is now known as ykarel|lunch09:21
*** derekh has joined #openstack-infra09:34
*** e0ne has quit IRC09:40
*** jaosorior has joined #openstack-infra09:41
*** jistr is now known as jistr|sick09:42
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration  https://review.openstack.org/63985509:43
*** markvoelker has joined #openstack-infra09:47
*** jbadiapa has quit IRC09:51
*** gfidente has joined #openstack-infra09:52
*** IvensZambrano has joined #openstack-infra09:54
*** gfidente has quit IRC10:00
*** mwhahaha has quit IRC10:01
*** mwhahaha has joined #openstack-infra10:01
*** jpich has quit IRC10:05
*** jpich has joined #openstack-infra10:06
*** electrofelix has joined #openstack-infra10:07
*** gfidente has joined #openstack-infra10:08
*** ykarel|lunch is now known as ykarel10:10
*** wolverineav has joined #openstack-infra10:10
*** jpich has quit IRC10:10
*** jpich has joined #openstack-infra10:12
*** jbadiapa has joined #openstack-infra10:13
*** wolverineav has quit IRC10:15
*** e0ne has joined #openstack-infra10:16
*** luizbag has joined #openstack-infra10:18
*** dtantsur has quit IRC10:18
*** yamamoto has quit IRC10:18
*** dtantsur has joined #openstack-infra10:18
DobroslawHello zuul magicians10:19
DobroslawIs there any magic that needs to be done to run custom job on release pipeline?10:19
DobroslawWhat is my problem: I want to push docker image with proper tag to docker hub on release10:19
DobroslawI added it to `.zuul.yml` https://github.com/openstack/monasca-common/blob/master/.zuul.yaml#L36-L3810:19
Dobroslawbut looks like it wasn't run10:19
Dobroslawthis job is running only in periodic and post10:19
Dobroslawhttp://zuul.openstack.org/builds?project=openstack%2Fmonasca-common&job_name=docker-publish-monasca-base10:19
DobroslawI checked the kolla repo and it looks like it has the same problem10:19
Dobroslawhttps://github.com/openstack/kolla/blob/master/.zuul.d/ubuntu.yaml#L26-L2910:19
Dobroslawhttp://zuul.openstack.org/builds?project=openstack%2Fkolla&pipeline=release&job_name=kolla-publish-ubuntu-binary10:19
openstackgerritSlawek Kaplonski proposed openstack-infra/project-config master: Move openstack-tox-lower-constraints to UT jobs graph  https://review.openstack.org/63932110:19
*** markvoelker has quit IRC10:21
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: Proposed spec: tenant-scoped admin web API  https://review.openstack.org/56232110:25
*** yamamoto has joined #openstack-infra10:36
*** yamamoto has quit IRC10:36
*** yamamoto has joined #openstack-infra10:37
openstackgerritMerged openstack-infra/project-config master: Add openstack-tox-py37 job to Neutron dashboard  https://review.openstack.org/63958810:37
*** yamamoto has quit IRC10:41
openstackgerritRico Lin proposed openstack-infra/project-config master: Add an openstack/auto-scaling-sig repository  https://review.openstack.org/63712510:42
*** ykarel is now known as ykarel|mtg10:43
openstackgerritSlawek Kaplonski proposed openstack-infra/project-config master: Move openstack-tox-lower-constraints to UT jobs graph  https://review.openstack.org/63932110:48
openstackgerritMerged openstack-infra/project-config master: Set up placement project to use storyboard  https://review.openstack.org/63944510:48
AJaegerDobroslaw: I see the job run, see http://zuul.openstack.org/builds?job_name=docker-publish-monasca-base10:48
DobroslawAJaeger: but not on release10:49
AJaegerDobroslaw: please follow our naming conventions for in-repo jobs at https://docs.openstack.org/infra/manual/drivers.html#consistent-naming-for-jobs-with-zuul-v310:49
AJaegerIt should be monasca-common-X - and not X-monasca-common.10:49
DobroslawOK, will fix10:49
AJaegerwe share a namespace and that makes it easier to see where jobs come from.10:49
AJaegerthanks10:50
*** ricolin has quit IRC10:50
Dobroslawbut it still is not running on release10:50
fricklerDobroslaw: this is what I see in the zuul log, but I cannot find where the BranchMatcher:master comes from http://paste.openstack.org/show/747225/10:50
cmurphyinfra-root I'm seeing some mirror issues with limestone regionone e.g. http://logs.openstack.org/14/640214/3/check/devstack-xenial/de9a63e/job-output.txt.gz http://logs.openstack.org/93/628193/21/check/openstack-tox-py35/117afc2/job-output.txt.gz10:50
fricklerDobroslaw: may well be a zuul bug10:50
AJaegerfrickler: the job is defined on master - so might be implicit10:51
AJaegerDobroslaw, best ask later again once US westcoast is awake...10:52
DobroslawAJaeger: frickler OK, will wait10:52
Dobroslawthank you10:52
Dobroslawbut release is going from master so to me it looks like a zuul bug10:53
fricklercmurphy: looking10:55
*** jangutter has joined #openstack-infra10:57
frickler[Mon Mar 04 06:39:15.952767 2019] [core:error] [pid 1507:tid 139635334301440] [client 2607:ff68:100:54:f816:3eff:fe8e:3c5d:55636] AH00037: Symbolic link not allowed or link target not accessible: /var/www/mirror/ubuntu11:03
fricklerinfra-root: did we change our mirror setup somehow lately? ^^11:04
fricklerin addition I'm also seeing flapping connectivity to that mirror node, like maybe a duplicate node issue11:05
fricklerhmm, actually the latter seems to be the more significant issue11:07
*** jangutter has quit IRC11:09
*** e0ne has quit IRC11:10
*** e0ne has joined #openstack-infra11:11
*** yamamoto has joined #openstack-infra11:14
*** udesale has joined #openstack-infra11:15
ianwfrickler: yeah, no changes and http://grafana.openstack.org/d/ACtl1JSmz/afs?orgId=1 looks healthy11:18
*** markvoelker has joined #openstack-infra11:18
*** jangutter has joined #openstack-infra11:26
openstackgerritJens Harbott (frickler) proposed openstack-infra/project-config master: Disable provider limestone  https://review.openstack.org/64073711:42
fricklerinfra-root: ^^ let's disable limestone until we can get stable networking to the mirror node again11:43
*** markvoelker has quit IRC11:50
*** udesale has quit IRC11:52
*** udesale has joined #openstack-infra11:53
*** wolverineav has joined #openstack-infra11:59
*** e0ne has quit IRC12:00
openstackgerritMerged openstack-infra/project-config master: Move openstack-tox-lower-constraints to UT jobs graph  https://review.openstack.org/63932112:02
*** wolverineav has quit IRC12:04
*** e0ne has joined #openstack-infra12:08
*** aojea has joined #openstack-infra12:09
*** ykarel|mtg is now known as ykarel12:13
*** janki has quit IRC12:16
*** janki has joined #openstack-infra12:16
*** udesale has quit IRC12:20
*** udesale has joined #openstack-infra12:21
*** dave-mccowan has joined #openstack-infra12:22
fricklerthere are also afs failures on mirror02.limestone, which probably explain the issues above:12:22
fricklerMar  4 12:18:53 mirror02 kernel: [2541863.823809] afs: Lost contact with file server 104.130.138.161 in cell openstack.org (code -1) (all multi-homed ip addresses down for the server)12:22
*** priteau has joined #openstack-infra12:30
*** jcoufal has joined #openstack-infra12:34
openstackgerritLuigi Toscano proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-sahara-dashboard-dsvm-integration  https://review.openstack.org/64075112:38
*** jpena is now known as jpena|lunch12:38
*** udesale has quit IRC12:38
*** udesale has joined #openstack-infra12:41
*** markvoelker has joined #openstack-infra12:48
*** yamamoto has quit IRC12:48
*** edmondsw has joined #openstack-infra12:49
*** zbr has quit IRC12:53
*** zbr|ssbarnea has joined #openstack-infra12:53
*** zbr|ssbarnea has quit IRC12:54
*** zbr has joined #openstack-infra12:55
*** rlandy has joined #openstack-infra12:57
*** jistr|sick is now known as jistr|sick|mtg13:01
*** e0ne has quit IRC13:04
*** udesale has quit IRC13:08
*** udesale has joined #openstack-infra13:09
*** yamamoto has joined #openstack-infra13:16
*** rh-jelabarre has joined #openstack-infra13:17
*** markvoelker has quit IRC13:21
*** sdake has joined #openstack-infra13:23
*** jamesmcarthur has joined #openstack-infra13:24
*** jpena|lunch is now known as jpena13:31
*** jamesmcarthur has quit IRC13:32
*** jamesmcarthur has joined #openstack-infra13:33
*** udesale has quit IRC13:36
*** mhu has quit IRC13:36
*** mhu has joined #openstack-infra13:37
*** yamamoto has quit IRC13:37
*** jamesmcarthur has quit IRC13:38
*** yamamoto has joined #openstack-infra13:39
*** wolverineav has joined #openstack-infra13:47
*** sthussey has joined #openstack-infra13:47
*** rfolco is now known as rfolco|pto13:50
*** wolverineav has quit IRC13:52
fungilogan-: when you're around, we've been seeing intermittent but significant packet loss to the ipv4 address of our mirror server there (216.245.200.132)13:59
fungii was seeing upwards of 60% icmp packet loss a few minutes ago from multiple parts of the internet, though it's cleared up again for the moment14:00
fungii haven't observed any packet loss to its ipv6 address, but i wasn't testing both concurrently so that could just be luck14:01
*** jamesmcarthur has joined #openstack-infra14:03
fungii suspect it was impacting v6 as well since i expect cacti is polling snmp on it over v6 and i see some prominent gaps in our graphs for it... take the root disk utilization graph for example http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=64934&rra_id=all14:06
*** jamesmcarthur has quit IRC14:06
*** jamesmcarthur has joined #openstack-infra14:06
fricklerfungi: yes, I was pinging v6 from bridge earlier and also seeing gaps in connectivity14:07
openstackgerritMerged openstack-infra/project-config master: Disable provider limestone  https://review.openstack.org/64073714:08
fungiand now it's entirely unreachable14:13
fungifd3b:86c4:135e:d033::100 is returning no route to host for it14:13
*** sdake has quit IRC14:14
fungithat seems to be a non-globally-routable address somewhere in limestone's network, i guess on a router serial interface14:15
fungithough i can reach it over ipv4 currently14:16
fungiincreasingly strange14:16
*** markvoelker has joined #openstack-infra14:18
*** janki has quit IRC14:19
fungiprobably coincidence, but i started pinging the v6 gateway from the mirror server (over a v4 ssh connection) and suddenly stopped getting ipv4 packets through14:20
fungibut my previously hung v6 ssh session is suddenly working again14:20
*** janki has joined #openstack-infra14:24
*** jamesmcarthur has quit IRC14:24
*** sdake has joined #openstack-infra14:25
*** sdake has quit IRC14:25
mordredfungi: maybe it's a poltergeist14:28
mordredinfra-root: the project creation playbook for gitea seems to be working!14:28
pabelangerI see things!14:30
*** pcaruana has quit IRC14:31
mordredpabelanger: hopefully not dead people ...14:31
pabelanger++14:31
mordredhttps://review.openstack.org/#/c/640218/ and https://review.openstack.org/#/c/640431/ could use review14:32
*** beekneemech is now known as bnemec14:33
*** mriedem has joined #openstack-infra14:36
*** ykarel is now known as ykarel|away14:37
*** sdake has joined #openstack-infra14:38
*** jamesmcarthur has joined #openstack-infra14:38
*** ykarel|away has quit IRC14:46
*** ekultails has joined #openstack-infra14:47
*** sdake has quit IRC14:49
*** markvoelker has quit IRC14:50
*** sdake has joined #openstack-infra14:56
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration  https://review.openstack.org/63985514:56
*** e0ne has joined #openstack-infra14:59
fungii am mildly curious what happens if a project is renamed to the old name of a different already-renamed project14:59
fungiwhether gitea drops the old redirect at that point, or rejects the rename attempt14:59
fungi(or option #3)14:59
*** 18WAAFJGJ has quit IRC15:00
*** guilhermesp has joined #openstack-infra15:01
fungiand now we're back to ipv4 working to the limestone mirror but not v615:01
fungii wonder if i can replicate the earlier switch in behavior15:02
*** e0ne has quit IRC15:02
*** jistr|sick|mtg is now known as jistr|sick15:02
funginevermind, now they're both unresponsive again15:02
mordredfungi: I think corvus actually submitted a patch upstream related to that - but I don't remember what the sitch is15:03
fungiand now v4 is responding for me again15:04
*** e0ne has joined #openstack-infra15:04
*** eharney has joined #openstack-infra15:04
fungipinging the ipv6 default gateway from the mirror over an ipv4 ssh session doesn't seem to have broken v4 connectivity this time15:05
fungihowever v6 is working again now15:05
fungii wonder if something is breaking neighbor discovery and/or arp15:06
corvusfungi, mordred: yeah, that case should be handled in gitea master: https://github.com/go-gitea/gitea/pull/6216 merged15:07
*** kgiusti has joined #openstack-infra15:07
fungineat! glad i'm not the only one who has these idle what-ifs15:07
fungisince odds are at some point we would have run into that case15:08
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Remove TaskManager and just use keystoneauth  https://review.openstack.org/64064315:08
*** sdake has quit IRC15:11
*** priteau has quit IRC15:11
*** zul has joined #openstack-infra15:12
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Remove TaskManager and just use keystoneauth  https://review.openstack.org/64064315:12
corvusfungi: yeah, i ran into it pretty quick in testing :)15:14
*** sdake has joined #openstack-infra15:15
*** roman_g has quit IRC15:19
*** roman_g has joined #openstack-infra15:19
funginot surprised15:22
*** pcaruana has joined #openstack-infra15:25
corvusfungi: can you review https://review.openstack.org/636775 when you have a moment?  it will improve run_all.sh a bit15:26
fungihappy to!15:27
corvusthough, oops15:28
fungiunapproved15:28
fungiwhat's the oops i overlooked?15:28
corvusfungi, mordred, clarkb: it looks like remote_puppet_git is taking 1.5 hours now15:28
fungioof15:28
fungithat sounds pathological. any idea what task is taking so long?15:29
corvusprobably gitea creation :)15:29
fungiyeah, just thinking maybe that's a one-time cost?15:30
fungior several-time until it gets all the way through?15:30
fungior is it in theory already done creating all the repos15:31
fungi?15:31
corvusyeah... do we know if it's gotten through yet?15:31
mordredcorvus: it SEEMS like all the projects are in - it's at least not showing the one-project-per-run pattern from before and there are a lot in there15:32
*** chandankumar is now known as raukadah15:32
corvusmordred: there are failures in the log for the 'create repo' task15:32
mordredcorvus: ah. then maybe not15:32
corvusthank goodness we still have the debug line in there that tells us what project it's doing :)15:34
*** wolverineav has joined #openstack-infra15:35
*** armax has joined #openstack-infra15:36
corvusmordred, fungi: it looks like we're stuck.  i've checked the last 3 runs, and gitea08 has bombed on stackforge/halthnmon15:36
corvusi think because the description is too long15:36
corvusso i think we just need to update the playbook to use a substring15:37
corvus(i thought we did that already)15:37
openstackgerritMerged openstack-infra/system-config master: Add gitea to project rename playbook  https://review.openstack.org/64021815:37
corvusand yeah, i think they're all stuck on that repo15:38
openstackgerritMonty Taylor proposed openstack-infra/nodepool master: Remove TaskManager and just use keystoneauth  https://review.openstack.org/64064315:38
fungii can't find a halthnmon15:38
corvushealthnmon15:39
fungiahh, stackforge/healthnmon15:39
fungiyeah, tried some obvious variations ;)15:39
mordredgood old healthnmon15:39
corvusfungi: we're playing "spot the typo!"15:39
fungi296 characters, give or take a newline15:40
*** wolverineav has quit IRC15:40
fungiis that field limited to 256 now?15:40
mordredI bet it is15:40
mordredor probably 25515:40
mordredsince it's probably a database field length15:40
fungilimited in code or by db column width?15:40
fungiahh, yeah15:40
mordredI'm guessing db column width15:40
mordredcorvus: you doing a patch or want me to?15:41
openstackgerritDavid Moreau Simard proposed openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara  https://review.openstack.org/64078515:41
corvusmordred: can you?  i'll go double check the length15:41
mordredyup15:41
*** janki has quit IRC15:42
corvusyeah, both repo and org descriptions are limited to 25515:42
*** ricolin has joined #openstack-infra15:42
fungilimestone update, the mirror is now fairly responsive over both ipv4 and ipv6 again15:42
fungiat least for my ssh sessions, but snmp still seems to not be getting much in the way of responses yet according to cacti15:43
fungiahh, yeah seeing packet loss again already. that didn't seem to last long15:44
corvusfungi, mordred: did you know that our base playbook alternates between taking ~12 seconds and 20 minutes to run?15:44
fungiweird...15:44
corvushttp://grafana.openstack.org/d/qzQ_v2oiz/bridge-runtime?panelId=6&fullscreen&orgId=1&from=now-7d&to=now15:44
fungiand, no, i did not know that ;)15:44
corvusoh there's a third time in there too... ~10m15:45
corvusoh nm15:45
corvusthat's a bug i fixed in that patch15:45
fungiquick!15:45
corvusone of those is the "k8s_bootstrap" run15:45
corvusso as far as timings go -- i think the git playbook isn't going to get any faster.  we're already looking at it taking 1.5 hours to no-op its way from start to healthnmon15:47
*** quiquell is now known as quiquell|off15:47
corvusless says that's 98% through the file, so we *are* almost done15:47
*** markvoelker has joined #openstack-infra15:48
*** zul has quit IRC15:48
fungithis is using api queries not direct to database, right?15:48
openstackgerritRico Lin proposed openstack-infra/project-config master: Add an openstack/auto-scaling-sig repository  https://review.openstack.org/63712515:48
fungiis it parallel across all the gitea servers or serialized?15:48
corvusright, though most of the playbook was already doing that, the only thing we changed was updating the settings15:49
corvusfungi: parallel15:49
fungioof, so ~1.5 hours just to check that ~2k projects don't need updating15:49
corvuswe do execute the settings update regardless15:50
corvuswe have task profiling turned on, but it's not really helping to point out where we're spending the time on this one.  there are too many tasks15:50
fungiso we're managing to average around 2.7 projects per second there15:50
corvuswe ran 65688 tasks across all 8 servers15:51
corvusi can tell you the 50 slowest ones.15:51
fungier, i divided backwards. 2.7 seconds per project15:51
corvusbut i think we need average run time for each task15:51
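[editor's note] The back-of-the-envelope figure above can be checked with a few lines. The 1.5 hour run time and the "~2k projects" count are the rough numbers quoted in the channel, so the result is only an estimate.

```python
# Rough figures quoted in the discussion above.
run_seconds = 1.5 * 60 * 60  # "1.5 hours" for the playbook run
projects = 2000              # "~2k projects" (approximate)

seconds_per_project = run_seconds / projects
projects_per_second = projects / run_seconds

print(round(seconds_per_project, 1))  # 2.7
print(round(projects_per_second, 2))  # 0.37
```

This matches the corrected reading: about 2.7 seconds per project, not 2.7 projects per second.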
mordredcorvus: ok. I may need your ansible/jinja help15:52
mordredcorvus: project.description | default('') ... where does a string slice go in that?15:52
corvus"{{ (project.description | default(''))[:255] }}"15:53
corvusi've found that if you sprinkle extra parens, it turns the filters back into something more python-like :)15:54
mordredcorvus: wow. that's neat15:57
openstackgerritMonty Taylor proposed openstack-infra/system-config master: Limit project description to 255 characters  https://review.openstack.org/64078815:57
*** wolverineav has joined #openstack-infra15:57
mordredcorvus, fungi: ^^15:57
fungihuh, nifty slicing15:58
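[editor's note] For readers less familiar with the Jinja expression above, here is a plain Python sketch of the same idea: fall back to an empty string when there is no description, then slice to the 255 character limit mentioned earlier. The function name and the dict layout are hypothetical. One difference is an assumption worth flagging: `or ""` here also covers a `None` value, whereas Jinja's `default('')` without its second argument only covers an undefined variable.

```python
MAX_DESCRIPTION_LENGTH = 255  # limit on repo and org descriptions noted above


def truncated_description(project):
    """Return the project description, cut to the allowed length."""
    # Missing or None descriptions become an empty string before slicing.
    description = project.get("description") or ""
    return description[:MAX_DESCRIPTION_LENGTH]


# A 296 character description, like the one that failed, gets trimmed.
long_project = {"name": "example/project", "description": "x" * 296}
print(len(truncated_description(long_project)))  # 255
```

Slicing a string shorter than the limit returns it unchanged, so no length check is needed before the slice.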
corvusmordred, fungi: when we ask gitea for the repo list, we are told the description, but we don't have the other info (like tracker url, etc).  but maybe under the circumstances, we should only run the update settings task if the description doesn't match.15:58
mordredcorvus: that works for me as a hack for now15:59
mordredcorvus: maybe we can make a PR to upstream to return more things in the API call - or have a 'detailed-repo-list' call or something15:59
corvus++15:59
*** josephrsandoval has joined #openstack-infra15:59
*** sdake has quit IRC16:00
corvusmordred: you want to work on the update if description changed change?16:00
mordredcorvus: should we set a different default description then? so that we know that if a description is there, it means we've run the settings update at least once?16:00
corvusmordred: oh, we don't update the description with the settings16:01
corvusoops, i just assumed we did.16:02
mordrednod. so - that doesn't get us much it seems16:02
corvusmordred: ok, how about we just go back to updating settings only when we create the project16:02
mordredyeah. and if we want we can make a version of this playbook that can be run by hand to fix up any botched creations16:02
corvusmordred: ya.  the only project that will cause a problem at the moment is healthnmon.16:03
openstackgerritMonty Taylor proposed openstack-infra/system-config master: Only update gitea project settings during creation  https://review.openstack.org/64078916:04
*** e0ne has quit IRC16:06
*** wolverineav has quit IRC16:07
openstackgerritMonty Taylor proposed openstack-infra/system-config master: Add utility playbook for fixing gitea project settings  https://review.openstack.org/64079216:07
mordredcorvus: there's a utility playbook that we can use for a forced full sync16:08
*** luizbag has quit IRC16:08
corvus++16:08
openstackgerritCamila Moura proposed openstack-infra/storyboard master: Creates the tag with project name and priority  https://review.openstack.org/64079316:09
mordredcorvus: I also don't see an api call we can use to update a description (was going to add a task to update the description if it didn't match)16:10
*** e0ne has joined #openstack-infra16:12
*** tobeass-urdin is now known as tobias-urdin16:17
toskyuhm, I have a zuul job (jobA) defined in the repository A and it uses the bindep role. If I try using the job in another repository (B), it fails because some binary dependencies are not installed16:18
toskyso I guess that jobA tries to use the bindep.txt file from B, not from A16:18
toskyis that expected?16:18
*** gfidente has quit IRC16:19
clarkbtosky: if you are using the infra provided parent job stuff that runs bindep then yes that is relative to the repo under test16:19
toskyclarkb: isn't it a bit unexpected? Or at least it was for me - I would expect the job to be self-contained16:20
*** e0ne has quit IRC16:20
toskyso that I shouldn't need to copy the content of bindep.txt, as it is an implementation detail of the job16:20
clarkbtosky: if you want the job to run against a fixed target you'll need to implement that in the job itself16:21
*** markvoelker has quit IRC16:21
clarkbtosky: most of our job design assumes reusable code that applies to arbitrary repos, not a job fixed against a repo that is mixed in with other repos16:21
clarkbthis is possible, you just have to set it up that way16:21
*** e0ne has joined #openstack-infra16:22
toskyso I guess I would need to tune the call to the bindep role16:22
clarkbtosky: for example the job that sets up the bindep stuff is probably defined in openstack-zuul-jobs. We don't want to force every tox job to use openstack-zuul-jobs bindep file16:22
clarkbto catch up on scrollback it sounds like we've disabled limestone due to network flakiness and the gitea project creation playbook works now but is really slow?16:23
*** e0ne has quit IRC16:23
clarkblooks like the gitea sign in box is still present. Do we need to retrigger docker image builds to rebuild that image after the docker promotion fixes?16:23
*** wolverineav has joined #openstack-infra16:24
mordredtosky: when you use the job in another repository, you can set zuul_work_dir in the job variables16:25
*** gfidente has joined #openstack-infra16:25
mordredtosky: when I make jobs that I expect to run in a consistent context regardless of who triggered them, I'll oftentimes set zuul_work_dir in the job definition16:26
mordredtosky: in fact - thanks! you just caused me to realize I've got a job that's not doing what I thought it was16:29
toskythanks clarkb and mordred16:29
toskyin fact the other role which is defined and used in the job uses a variable which points to the correct place, I just need to pass it to the bindep role16:29
mordredtosky: remote:   https://review.openstack.org/640804 Make tox tips job actually run sdk16:30
mordredtosky: if you set zuul_work_dir it should make its way to the bindep role16:30
mordredtosky: but cool!16:31
*** wolverineav has quit IRC16:31
mordredtosky: (that's the patch to openstacksdk that this discussion just made me realize I needed ... so thanks again)16:31
clarkbif you do that it won't run bindep for the repo under test which might be needed if installing that repo as well16:31
*** ramishra has quit IRC16:31
mordredclarkb: indeed. we don't really have a bindep-siblings behavior defined particularly well16:32
*** gyee has joined #openstack-infra16:32
toskysomething like "merge all the bindep dependencies" ?16:32
mordredyeah. oh - actually - no, we should not add that16:34
*** sreejithp has joined #openstack-infra16:34
mordredclarkb: a repo will ultimately need to define transitive bindep depends in its own bindep file for real-world usage anyway16:34
mordredbecause pip install foo does not know how to find the bindep file of foo and install it first16:34
mordredso if we did bindep-siblings, we might actually wind up letting people develop things that have transitive bindep depends and work in the gate but not in the real world16:35
*** sreejithp has quit IRC16:35
*** sreejithp has joined #openstack-infra16:35
clarkbya16:35
*** aojea has quit IRC16:38
clarkbmordred: is https://review.openstack.org/#/c/640789/ that the fix for slow gitea runs?16:38
*** sreejithp has quit IRC16:39
*** sreejithp has joined #openstack-infra16:39
mordredclarkb: yeah16:39
mordredclarkb: and the followup lets us cleanup if something deps16:39
mordredderps16:39
clarkbok I'm caught up on that stack and it is now approved16:40
clarkbany idea on the sign in link on the gitea UI? my hunch is we need to rerun image builds since the docker image building jobs were unhappy when the fix for that merged16:41
* clarkb finds tea now that fixes are approved16:41
*** yamamoto has quit IRC16:41
corvusi'll take a look at that16:42
*** rpittau is now known as rpittau|afk16:42
corvusclarkb: i think the image is updated, but i don't think that docker-compose automatically redeploys it for us16:43
*** helenafm has quit IRC16:43
corvushere's the last build: http://zuul.openstack.org/build/0c842abbb7474874b406393a4e12710816:45
openstackgerritMerged openstack-infra/system-config master: Limit project description to 255 characters  https://review.openstack.org/64078816:47
fungitosky: clarkb: remember that bindep is about declaring dependencies for a project, not dependencies for a ci job. if a cross-project job has specific distro packages it needs installed, it should take care of that itself16:48
mordredcorvus, clarkb: yes - I believe that is correct - I think we need to tell docker-compose to update16:48
*** ginopc has quit IRC16:49
*** wolverineav has joined #openstack-infra16:49
fungitosky: clarkb: though if the job needs to install packages which are dependencies declared for multiple projects, it seems reasonable to consider concatenating the bindep.txt from each of those projects and use that to determine what to install16:51
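A minimal sketch of the concatenation idea fungi floats above, assuming each project checkout has a bindep.txt at its top level. The function name `merge_bindep` and the deduplication rule (exact line match) are illustrative assumptions, not part of bindep or of any existing Zuul role.

```python
from pathlib import Path
from typing import Iterable, List


def merge_bindep(paths: Iterable[Path]) -> List[str]:
    """Concatenate bindep.txt files, dropping blanks, comments, and exact duplicates.

    Hypothetical helper: bindep itself does not provide this.
    """
    seen = set()
    merged = []
    for path in paths:
        if not path.is_file():
            continue  # a repo without bindep.txt contributes nothing
        for raw in path.read_text().splitlines():
            line = raw.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comment lines
            if line not in seen:
                seen.add(line)
                merged.append(line)  # keep first-seen order
    return merged
```

The merged lines could then be written to a temporary file and handed to bindep. Note mordred's caveat earlier in the discussion still applies: a job that installs sibling dependencies this way can hide a missing entry in a project's own bindep.txt.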
fricklercorvus: there was a question by Dobroslaw earlier about why some jobs aren't running in the release queue. I found this in zuul.log, but have no idea where that BranchMatcher comes from, might be a bug? http://paste.openstack.org/show/747225/16:51
*** pgaxatte has quit IRC16:51
*** Vadmacs has joined #openstack-infra16:51
*** yamamoto has joined #openstack-infra16:52
clarkbfrickler: you can't have branch matchers on tag based pipelines since tags don't have a single branch (I think there is work to make this optionally work?). That said I agree I don't see where that branch matcher is set16:53
*** wolverineav has quit IRC16:54
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration  https://review.openstack.org/63985516:54
fricklerclarkb: yes, I was looking for a "branch: master" or similar in the zuul.yaml, but I didn't find it16:55
corvusclarkb: https://review.openstack.org/578557 is the change you're thinking about, but i'm not sure that's the problem at issue here.16:55
*** dpawlik has quit IRC16:56
corvusclarkb, frickler: i stand corrected, i think that may be at issue here16:56
fricklercorvus: yeah, at least the commit message seems to match somehow16:57
*** yamamoto has quit IRC16:57
frickleror maybe even https://review.openstack.org/64027216:58
corvusfrickler: here's our documentation that says "don't put tag jobs in-repo": https://docs.openstack.org/infra/manual/creators.html#central-config-exceptions16:58
dmsimardAJaeger: eh, I need to send a patch to add the noop-jobs template to ara-plugins/server/clients in project-config because removing the zuul.yaml file from the repo leaves the projects with no jobs16:58
fricklercorvus: I see, would it make sense to add a note that this covers any release job?17:00
*** josephrsandoval has quit IRC17:00
openstackgerritDavid Moreau Simard proposed openstack-infra/project-config master: Add noop-jobs to ara-{server,clients,plugins}  https://review.openstack.org/64081817:01
fricklergotta leave, bbl17:02
fungifrickler: release, pre-release and tag pipelines in this case17:04
*** kopecmartin is now known as kopecmartin|off17:04
corvusinfra-root: i'd like to restart the nodepool launchers and all of zuul to pick up the affinity changes.  Shrews, should i restart the builders as well?17:06
clarkbcorvus: does that include the base64 commit message escaping?17:07
clarkb(thinking it would be nice to get that in if we can)17:07
corvuslemme check that has landed17:07
mordredcorvus: we should keep our eyes peeled on the launchers/builders due to new openstacksdk release17:08
mordredcorvus: it should be fine - but, you know, a release happened, so being aware isn't terrible17:08
corvusclarkb: yes, that is branch tip in fact17:08
Shrewscorvus: builders are optional for your purposes, but we should do so at some point17:08
mordred(the image code in particular got moved around - although the code is all the same)17:08
openstackgerritDavid Moreau Simard proposed openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara  https://review.openstack.org/64078517:08
Shrewscorvus: b/c of the things mordred said17:08
*** psachin has quit IRC17:09
corvusdo we have a restart nodepool playbook?17:09
Shrewsinfra-root: fyi, launchers will restart with the port cleanup code (again)17:09
*** wolverineav has joined #openstack-infra17:09
Shrewsclarkb: you have a good way to monitor that? ^^^17:09
clarkbShrews: it's tough because most of the time the clouds also do similar17:10
clarkbShrews: we only notice on nodepool side when cloud side stops keeping up and i think we are in a period of clouds keeping up right now17:10
clarkbShrews: but in general port list and grep for DOWN or Null attached ports17:10
openstackgerritDavid Moreau Simard proposed openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara  https://review.openstack.org/64078517:10
clarkbyou should see that they go away within the port cleanup run period17:11
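A rough sketch of the check clarkb describes, assuming the port list has already been fetched and converted to plain dicts. The field names `id`, `status`, and `device_id` follow the Neutron port representation; the function name `suspect_ports` is hypothetical, and this is not the nodepool cleanup code.

```python
from typing import Dict, Iterable, List, Optional


def suspect_ports(ports: Iterable[Dict[str, Optional[str]]]) -> List[str]:
    """Return IDs of ports that are DOWN or have no attached device.

    Mirrors "port list and grep for DOWN or Null attached ports";
    the input is assumed to be already-fetched port records.
    """
    return [
        port["id"]
        for port in ports
        if port.get("status") == "DOWN" or not port.get("device_id")
    ]
```

Running this twice, one cleanup period apart, and comparing the results would show whether the suspect ports are going away.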
clarkbmordred: does the sdk release include the changes to taskmanagers?17:11
clarkbmordred: or are we not affected by that until we update nodepool?17:11
mordredclarkb: no - that isn't in yet17:12
Shrewscorvus: no playbook that i know of17:13
corvusShrews: i'm writing one now!17:14
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Add nodepool_restart playbook  https://review.openstack.org/64082217:15
*** agopi has joined #openstack-infra17:15
dmsimardconfig-core: an easy review to add noop-jobs which is a dependency in cleaning up stuff: https://review.openstack.org/#/c/640818/17:16
openstackgerritAdam Coldrick proposed openstack-infra/storyboard master: Add a 'security' flag to Teams  https://review.openstack.org/64082317:16
*** wolverineav has quit IRC17:16
clarkbcorvus: fwiw I think you can stop/start them all together as the ordering doesn't matter. That might make the waiting for timeouts slightly shorter17:16
corvusclarkb: good point17:17
*** markvoelker has joined #openstack-infra17:18
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Add nodepool_restart playbook  https://review.openstack.org/64082217:20
corvusclarkb: something like that? ^17:20
clarkbya that should do it17:20
openstackgerritMerged openstack-infra/system-config master: Only update gitea project settings during creation  https://review.openstack.org/64078917:21
*** sdake has joined #openstack-infra17:22
openstackgerritMerged openstack-infra/system-config master: Add utility playbook for fixing gitea project settings  https://review.openstack.org/64079217:22
corvusokay, i'll get started on the restarts now.  starting with nodepool first17:22
clarkbI'm reading email and can be on standby for helping with things17:23
corvuswow.  the upload recency table in the builder debug log is.... a lot of numbers :)17:23
corvus#status log restarted nodepool launchers and builder at commit 3561e278c6178436aa1d8d673f839a676598ea1717:25
openstackstatuscorvus: finished logging17:25
corvusbuilders even17:25
dmsimardthanks for the quick reviews <317:26
*** jpich has quit IRC17:27
openstackgerritMerged openstack-infra/project-config master: Add noop-jobs to ara-{server,clients,plugins}  https://review.openstack.org/64081817:28
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Add Authorization Rules configuration  https://review.openstack.org/63985517:29
*** dtantsur is now known as dtantsur|afk17:29
*** dpawlik has joined #openstack-infra17:30
corvusokay, that took a little longer than i expected, but i have found a node created and in-use since the restart17:31
mordredcorvus: \o/17:32
*** rkukura has quit IRC17:32
corvusthe only errors i'm seeing are over-quota errors in ord, and that could be restart related17:33
*** dpawlik has quit IRC17:34
corvusthere are 3 dib images being built now, so we should see if they upload later17:35
*** roman_g has quit IRC17:36
corvusone of them just failed with this message: http://paste.openstack.org/show/747250/17:37
corvuspabelanger: ^ ?17:37
clarkbhttp://git.openstack.org/cgit/openstack/windmill-config seems to confirm it is empty17:38
*** roman_g has joined #openstack-infra17:38
fungilogan-: not sure if you're around today, but in case you didn't see earlier, we're experiencing a lot of packet loss to mirror02.regionone.limestone.openstack.org over both ipv4 and ipv6 since 04:00 utc (we took that region out of service in our nodepool config starting around 15:15 utc)17:38
corvusdid we... approve a project creation while the git playbook is broken?17:38
pabelangerthat was approved by ianw this morning17:39
corvusyes, it merged at Mon Mar 4 00:52:02 2019 +000017:39
corvusokay. well, it's not going to exist and image builds aren't going to work until the git playbook runs to completion17:39
corvusi would suggest that we hold off on creating more projects, but at this point, it doesn't actually matter, so i don't think we should issue any guidance for reviewers.17:40
corvuswe'll just have to check back on image uploads later17:40
corvusmeanwhile, i think we're ready to restart zuul now17:40
*** ijw has quit IRC17:41
corvusclarkb, pabelanger: are we using jemalloc on all executors?17:42
clarkbcorvus: we will after this restart17:42
corvuslet me rephrase -- are we configured to do so?  and have we restarted them all with that configuration yet, or is this the first restart?17:42
fungithe change to install it on all of them merged17:42
logan-fungi: o/ was just catching up on that backscroll. I've had an ipv4 mtr open trying to see what's up, but no losses since I started that about 10 minutes ago. I'll start an ipv6 one here in a minute. no network issues reported in the racks where these nodes are, normal traffic levels, etc. i'll do some digging around on the nodes and see if I can find anything useful there too.17:43
corvusclarkb: ok, thanks, i think that's what i was asking :)17:43
clarkbcorvus: we should be configured to do so but we have not restarted with that config anywhere but ze0817:43
pabelangersorry, I haven't been following along on the jemalloc discussions17:43
fungicorvus: this will be the first time except on (i believe) ze08 which was our proving ground17:43
dmsimardIs zuul.o.o broken ? web is not displaying17:44
fungithanks logan-! i can reboot the node and see if it helps, just didn't know if you wanted to look into it first17:44
dmsimardit kind of flashes briefly17:44
corvusdmsimard: zuul is restarting17:44
openstackgerritHervé Beraud proposed openstack-dev/pbr master: Fix error when keywords are defined as a list in cfg  https://review.openstack.org/63966117:44
dmsimardack.17:44
fungii've checked ze03 (chosen at random) and it has the libjemalloc1 package installed and the expected LD_PRELOAD exported in /etc/default/zuul-executor17:45
*** wolverineav has joined #openstack-infra17:46
corvusthe scheduler has loaded its config, but the executors haven't finished stopping yet17:49
corvusre-enqueueing17:49
logan-fungi: thanks, i'll dig around for a bit and see what I can find before we try a reboot.17:49
*** gfidente has quit IRC17:50
fungilogan-: appreciated!17:51
*** markvoelker has quit IRC17:51
corvus#status log restarted zuul at commit d298cb12e09d7533fbf161448cf2fc297d9fd13817:55
openstackstatuscorvus: finished logging17:55
openstackgerritMerged openstack-infra/zuul-website master: Add a promotional message banner and events list  https://review.openstack.org/63987117:58
*** e0ne has joined #openstack-infra17:59
*** derekh has quit IRC18:05
*** mriedem has quit IRC18:05
*** wolverineav has quit IRC18:06
*** ociuhandu has joined #openstack-infra18:11
*** wolverineav has joined #openstack-infra18:14
*** rkukura has joined #openstack-infra18:15
*** e0ne has quit IRC18:17
*** diablo_rojo has joined #openstack-infra18:18
*** dpawlik has joined #openstack-infra18:19
*** electrofelix has quit IRC18:22
*** dpawlik has quit IRC18:24
*** whoami-rajat has quit IRC18:24
*** gfidente has joined #openstack-infra18:25
*** harlowja has quit IRC18:27
*** jpena is now known as jpena|off18:27
*** jamesmcarthur_ has joined #openstack-infra18:32
*** jamesmcarthur has quit IRC18:35
clarkbcorvus: we expect windmill-config to be populated after ansible runs with the fix from mordred to only update settings on config updates?18:35
mordredclarkb: yeah18:35
corvusclarkb: strictly speaking, we expect it to be populated after the description length limit change, but they merged at ~ the same time, so... "yeah" :)18:35
clarkbok, I think ansible is running with that now since HEAD is the utility to fix things on bridge18:36
clarkband according to the log it is going through and parsing project names?18:37
corvusclarkb: yeah, i think we're looking at ansible task overhead18:38
openstackgerritMerged openstack-infra/project-config master: Retire ara-{server,clients,plugins} repos: they've been merged to ara  https://review.openstack.org/64078518:38
fungiso theory is most of the ~2.7 seconds per project is ansible spinning up tasks?18:40
*** jamesmcarthur_ has quit IRC18:41
openstackgerritMerged openstack-infra/storyboard master: Add documentation for private stories  https://review.openstack.org/63623518:41
corvusfungi: it'll be less now that we're not unconditionally doing a POST request, but yes.  the only tasks it's doing right now are string comparison.18:41
*** panda|ruck is now known as panda|ruck|off18:41
corvusat the moment, i would not describe this communication as being faster than light.18:42
fungii know little about ansible internals. does it at least reuse a single python process for those?18:42
fungi(so it's not spending most of its time creating and tearing down python interpreter processes)18:43
corvusnope.18:43
corvusit uses multiprocessing with forks behind the scenes, and at the moment, the subprocs it's using are turning over rather quickly.18:44
fungiso i guess if this were a python script which opened a persistent https socket to the api and hammered it while iterating over the list of projects, it might complete far more quickly18:45
corvusi think this is the same thing we noticed with the plain base playbook too -- where we have so many tasks across all our hosts that ansible bogs down and runs them unusually slowly.  we didn't get far debugging that because at the time, the inventory plugin issue made debugging it impossible.18:46
corvusfungi: yes, though, what this playbook is currently doing is spending 5 seconds "hammering the api", then 1.5 hours doing string comparisons.  then terminating.18:47
fungiaha. wow18:47
*** markvoelker has joined #openstack-infra18:48
corvusi wanted to debug that some more, this may be a really good playbook for that....18:51
corvusso maybe i should spend some time this afternoon poking at that (because if we find something that could be improved, it could make base.yaml run faster).  but if we don't make any headway, we may just need to turn this into a python script :/18:52
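A minimal sketch of what the "python script" fallback could look like for the comparison step alone, assuming the list of existing gitea repos has already been fetched. The function name `projects_to_create` and the `org/name` string format are assumptions for illustration; no gitea API call is made here.

```python
from typing import Iterable, List


def projects_to_create(wanted: Iterable[str], existing: Iterable[str]) -> List[str]:
    """Return wanted project names that do not exist yet, in their original order.

    The whole comparison runs in one process with a set lookup per project,
    rather than one task (and one forked worker) per project.
    """
    existing_set = set(existing)
    return [name for name in wanted if name not in existing_set]
```

Only the names returned here would need a create-and-configure call, which matches the "update settings only when we create the project" decision made earlier in the log.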
*** jamesmcarthur has joined #openstack-infra18:53
fungithat does sound like a good opportunity18:53
*** pcaruana has quit IRC18:56
*** whoami-rajat has joined #openstack-infra18:58
*** jlvillal has quit IRC18:58
*** jamesmcarthur has quit IRC18:59
*** jlvillal has joined #openstack-infra18:59
*** jamesmcarthur has joined #openstack-infra18:59
*** IvensZambrano has quit IRC19:01
*** ricolin has quit IRC19:01
*** electrofelix has joined #openstack-infra19:02
clarkbinfra-root I've done an audit of our current afs volumes and written down a plan for upgrading the afs file servers (but not yet the DBs) at https://etherpad.openstack.org/p/201808-infra-server-upgrades-and-cleanup if you get time to read that over and check it for reasonableness that would be much appreciated.19:02
openstackgerritDavid Moreau Simard proposed openstack-infra/project-config master: Remove openstack-cover-jobs/docs-on-readthedocs templates for ara  https://review.openstack.org/64083719:02
*** ijw has joined #openstack-infra19:03
*** jamesmcarthur has quit IRC19:04
fungiclarkb: did you notice whether the afs02.dfw.openstack.org/main02 cinder volume outage on saturday had any adverse effects?19:07
*** sdake has quit IRC19:07
clarkbfungi: I have not, but also not looked super closely. Just a vos listvldb so far19:08
clarkbthen rendered that into the tabular data19:08
fungiahh, okay19:08
*** ijw has quit IRC19:10
*** electrofelix has quit IRC19:10
*** mriedem has joined #openstack-infra19:11
clarkbI'd be happy to start on steps 1 and 2 ( and 5.1 if we need that ) if this appears like a viable upgrade path19:11
*** ijw has joined #openstack-infra19:12
*** e0ne has joined #openstack-infra19:13
*** wolverineav has quit IRC19:15
*** ijw has quit IRC19:16
*** wolverineav has joined #openstack-infra19:16
*** agopi has quit IRC19:18
*** agopi_ has joined #openstack-infra19:18
*** ijw has joined #openstack-infra19:20
*** markvoelker has quit IRC19:21
*** jamesmcarthur has joined #openstack-infra19:22
*** openstackgerrit has quit IRC19:23
corvusinfra-root: i'm going to comment out the run_all cron job on bridge for manual debugging19:26
corvus(unfortunately, since bridge only has 2G of ram, it swaps even when it runs a single copy of the playbook; there's no way i can debug a second copy with a load average of 10)19:27
clarkbrgr19:27
*** ijw has quit IRC19:27
fungiyikes19:27
*** ijw has joined #openstack-infra19:27
fungithanks for the heads up, and for looking into it19:27
corvusand if anyone has any idea how to make bridge bigger, i'd love to hear it :)19:28
dmsimardcorvus: does it need to be more complicated than "openstack server resize <uuid> --flavor <new-flavor>" ?19:29
corvusdmsimard: no.  does that work on rax?19:31
dmsimardcorvus: it should -- I can test on a throwaway VM to make sure19:31
dmsimardI want to say I remember doing it at least once in rax but if I really did, it was a good while ago so let me double check :D19:32
*** eharney has quit IRC19:33
*** openstackgerrit has joined #openstack-infra19:34
openstackgerritMerged openstack-infra/zuul master: quickstart: web and others wait on mysql to start  https://review.openstack.org/64054819:34
fungiwhat are the main challenges with replacing bridge.o.o with a new instance?19:35
fungiis its ip address trusted and baked into a lot of stuff?19:35
fungior is it just that there's a lot of things on the filesystem which aren't configuration-managed?19:36
mordredfungi: I'd say both of those would be concerns I'd have19:37
dmsimardrabbit holing to find a working openstackclient on bridge, is there one in a virtualenv somewhere ?19:38
dmsimardpython3 -m venv is telling me the python3-venv package isn't installed but pip3 freeze shows that virtualenv is installed... but it's not in /usr/bin or /usr/local/bin19:39
fungidmsimard: ~root/launch-env/bin/openstack19:39
fungithough clarkb was talking about installing it globally19:39
dmsimardfungi: I tried launch-env first, it's complaining about the lack of python datetime lib19:39
openstackgerritRiju Khatri proposed openstack-infra/storyboard-webclient master: adds activity indicator to new worklist modal and detail Task: 29730  https://review.openstack.org/64084519:40
fungidmsimard: ahh, yep. the things i don't notice when i use ~fungi/launch-env/bin/openstack instead19:40
dmsimardthat one works \o/19:41
dmsimardcorvus: is "corvustest" from rax DFW a good candidate for testing a resize ? :D19:41
fungidmsimard: what sort of resize are you wanting to test? afaik rackspace doesn't have server instance resizing allowed (at least for pvhvm flavors)19:42
dmsimardfungi: corvus and I were wondering if resizing worked at all19:43
fungidoesn't hurt to re-validate that previous determination19:43
fungii'll cross my fingers that something there has changed since the last few times we tried19:43
fungii don't recall being able to resize any servers since we moved off the rackspace legacy cloud years ago19:44
fungithey used to have a kb article which said the only way to resize a server was to build a replacement19:46
*** agopi_ is now known as agopi19:50
fungiyeah, at least in their cloud dashboard the "resize" option is greyed out in the action drop-down for bridge.o.o and there's a link in the right-hand sidebar to https://support.rackspace.com/how-to/upgrading-resources-for-general-purpose-or-io-optimized-cloud-servers/19:51
dmsimardtrying a resize on a VM I created, OS-EXT-STS:task_state                | resize_migrating19:52
dmsimardwill let you know if it works or not19:53
fungiwhat flavor did you choose?19:53
*** eernst has joined #openstack-infra19:54
fungisame as bridge? (2 GB Performance)19:54
*** eernst has quit IRC19:54
fungiand what image?19:54
dmsimardfungi: I suspect live resize might (understandably) not work for bare metal servers19:54
fungibridge is a vm as far as i know19:54
dmsimardI created a new VM with the same image and flavor as bridge and I am trying to resize to performance1-819:54
dmsimardstatus has been in "RESIZE" for a few minutes now, not sure if it's doing anything19:55
fungicool. if it works via api and not dashboard that'll be quite the discovery!19:55
dmsimardfungi: the reason why I mentioned bare metal is the wording used in the KB article you linked19:56
dmsimardthe general and i/o optimized nodes are flavor of bare metal nodes iirc19:56
fungii don't see any mention of bare metal or their onmetal service offering there19:56
fungiahh, weird that they would link that from a vm detail view with the words "help me with... resizing my server"19:57
fungidmsimard: looking at their flavor list, the bare metal flavors are the ones prefixed with "onmetal-"19:58
dmsimardit doesn't seem like it works at first glance :(19:58
Shrewsinfra-root: fyi, looks like the new port cleanup code in nodepool launchers has removed only 3 down ports (1 in limestone-regionone, 2 in inap-mtl01). that appears to be working (fyi tobiash)19:59
mordredShrews: \o/19:59
mordredShrews:  so it didn't delete all of the ports in all of the regions19:59
tobiashShrews: \o/19:59
* mnaser builds a bridge on vexxhost :>19:59
dmsimardoh wait, maybe I spoke too soon, the VM stopped pinging19:59
mordredShrews: that's so much better than last time19:59
Shrewsmordred: right?19:59
dmsimardoh, it's in verify_resize now20:00
mordredShrews: I love it when stuff doesn't just delete stuff20:00
dmsimardwith 8GB of RAM \o/20:00
mordreddmsimard: that's exciting20:00
* tobiash goes upgrading nodepool20:00
dmsimardfor some reason it took like a good ~6 minutes to start the resize20:00
fungidmsimard: that is to say, they appear to have vm flavors for general and io so it seems like they intended for it to apply to virtual machines20:01
fungidmsimard: excellent! thanks for proving their documentation incorrect20:01
mordreddmsimard: so, assuming it finishes properly, that seems workable - although we should probably also make a backup of the important things just in case20:01
fungioh, definitely take a snapshot20:01
dmsimardfungi, mordred, corvus: commands I used for the resize: http://paste.openstack.org/show/747257/20:04
fungithanks! this is an excellent discovery20:05
fungimakes me wonder why they disable it in their dashboard and make no mention of it even being possible20:05
openstackgerritGaëtan Trellu proposed openstack/diskimage-builder master: [lvm] Add Ubuntu bionic as supported distro  https://review.openstack.org/64085020:05
dmsimardI would have been sad if it hadn't worked, live resizing VMs is kind of a big feature in OpenStack :/20:06
dmsimardIt's another problem for bare metal flavors entirely but I understand that20:06
fungiwell, rackspace isn't entirely openstack either, as we've seen so many times in the past20:06
dmsimardtrue20:06
fungiyou're able to ssh into that server and free shows the increased ram?20:06
fungihave you tried rebooting it? (or did the resize reboot it?)20:07
corvusfungi, dmsimard, mordred: i'm in favor of taking a snapshot and doing a resize to 4 or 8gb.  if someone wants to do that now, it's a good time -- run_all has stopped and is commented out, and i'm about to get lunch.20:07
dmsimardfungi: for some reason "infra-root-keys" didn't make it, I had to use the adminPass provided at the instance creation. I rebooted once after ~3 minutes to see if that would've kicked something into gear but the VM eventually rebooted by itself for the resize a few minutes later20:08
fungii'm strongly in favor of that. if dmsimard wants to do it, all the better!20:08
*** ijw has quit IRC20:08
dmsimardfungi: output from free before and after http://paste.openstack.org/show/747258/20:09
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Log exception on module failure with empty stdout  https://review.openstack.org/64065020:09
*** wolverineav has quit IRC20:09
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul  https://review.openstack.org/63193020:09
mordredcorvus: I have an idea about a possible playbook optimization - not sure if it's where your brain was already going...20:09
dmsimardcorvus, fungi: I can do it -- do you snapshot using "server image create" or something else ?20:11
clarkbdmsimard: re infra-root-keys do we add that list of keys to the ci account or just the jenkins/zuul account?20:11
clarkbdmsimard: I usually do server image create --name server-fqdn-$date #server20:12
mordredcorvus: http://paste.openstack.org/show/747259/ - if we put the when on the loop, it applies the when to every loop iteration - and maybe we can avoid including the setup-repo.yaml tasks completely when not needed and keep the string comparison inside of the single task/process20:12
fungiclarkb: dmsimard: same here, or sometimes i'll add more detail like ...-before-resize-...20:13
fungijust so we have additional context when looking at the list of snapshots later20:13
dmsimardclarkb: not sure about the nova "infra-root-keys" keypair -- I'm in openstackci-rax dfw and there's just nothing in ubuntu or root's authorized keys20:13
dmsimardI didn't want to rabbit hole into something else so I left it at that20:13
fungidmsimard: i wonder if cloud-init cleared them20:14
mordredyeah - I don't think the keypair stuff will be necessary, because the keys we actually care about are installed by ansible20:14
mordreddmsimard: did the IP stay the same?20:14
dmsimardmordred: the keypair stuff was because I created a new VM for the sole purpose of testing this -- it wouldn't be an issue for bridge20:14
mordreddmsimard: yah. totes20:15
dmsimardeverything stayed the same20:15
mordredsweet20:15
dmsimardkind of the point of resize20:15
fungisounds like a winner to me then20:15
mordredmagical ponies20:15
dmsimardI really like "openstack server rebuild" too, one of my favorite commands :D20:15
mordreddmsimard: yes indeed - but just because it's the thing that makes sense doesn't mean it's reality ;)20:15
mordreddmsimard: there was a team at HP Public Cloud who wanted us to investigate using server rebuild in nodepool instead of delete/create20:16
mordredbut we didn't get around to it - it's hard to ponder - and hp public cloud is gone20:16
fungialso they had an especially reassigning ip addresses20:17
fungier, had an especially hard time20:17
dmsimardI hit rebuild on a development VM at home and it's pinging under 30 seconds from the original image, same IP, keypairs, everything. It's neat.20:17
fungithen again, so has rackspace20:17
clarkbnote rebuild != resize20:17
dmsimardclarkb: yes :)20:18
dmsimardwe don't want a rebuild on bridge haha20:18
*** markvoelker has joined #openstack-infra20:18
dmsimardJust saying like resize, it's another cool feature :)20:18
*** jamesmcarthur has quit IRC20:19
dmsimardok I'll run the snapshot and let you know when I start the resize20:19
*** jamesmcarthur has joined #openstack-infra20:19
*** dpawlik has joined #openstack-infra20:20
clarkbmordred: prior to kubernetes and nginx sarah novotny had a job running game servers on aws that was basically knowing when to reuse vs delete instances. Turns out it is a fun problem20:21
* mordred hands dmsimard a cake20:21
mordredclarkb: ++20:21
clarkb(you have to predict demand and in their case determine if waiting on an existing server to become free is cheaper than spinning up a new server)20:21
*** gfidente is now known as gfidente|afk20:21
*** e0ne has quit IRC20:22
mordredyeah. also - I *believe* there is variability in terms of when rebuild is cheaper that maps to cloud deployment somehow - so it might not actually be cheaper on all of the clouds20:23
mordredso I imagine it would get even more funner for us20:23
*** dpawlik has quit IRC20:25
dmsimardmeanwhile, need a +3 on an easy project-config review: https://review.openstack.org/#/c/640837/20:26
dmsimardsnapshot is still queued, I suppose it might take a while since it's not a volume ?20:26
pabelangerdidn't we have to stop intances first for snapshoting to work properly in rax?20:27
pabelangerinstances*20:28
dmsimardI'll give it a while before giving up, it took almost 10 minutes for a resize to kick in earlier20:28
pabelangeror maybe that was something to do with volumes20:28
pabelanger(nevermind me :))20:28
fungiit will snapshot while the server is running, it just tends to do it more quickly if it's not (because the writes have already reached quiescence)20:29
dmsimardIt just transitioned from "queued" to "saving" so it'll work. I have a ping going to/from bridge to see if there's any blips in availability20:31
mriedemclarkb: http://status.openstack.org/elastic-recheck/data/integrated_gate.html "Generated at: 2019-02-08"20:33
mriedemuh oh20:33
mriedemdid something die?20:34
*** Vadmacs has quit IRC20:34
clarkbmriedem: hrm we've seen that where a run doesn't timeout20:36
clarkbI'll look to see if the lock is being held20:36
corvusmordred: ah, yeah, http://paste.openstack.org/show/747259/ looks like it'd be an improvement.  mostly i wanted to dig into why ansible's task workers weren't being reused which is causing much slowness in general.  so i'd still like to do that, but i have no idea what will come of that.  meanwhile, maybe that change will be enough to get the runtime to something tolerable.20:36
clarkbmriedem: nothing has the lock so it is probably failing. I can run it in the foreground20:37
clarkbmriedem: 2019-03-04 20:37:55,442 [eruncategorized]  WARNING: No failures found in group "integrated_gate". The default ALL_FAILS_QUERY might be broken.20:38
mordredcorvus: kk. I'll push it up as a change20:38
*** auristor has quit IRC20:39
mriedemhmmm20:40
mriedemthat was recently fixed...20:40
openstackgerritMonty Taylor proposed openstack-infra/system-config master: Filter setup-repos loop before include_tasks  https://review.openstack.org/64086120:41
clarkbmriedem: elastic-recheck==0.0.1.dev2223  # git sha 5ce47d0 is what we've got installed20:41
mriedemhttps://github.com/openstack-infra/elastic-recheck/commit/cdf6ee031e9514b6a8751f0684c147dd3d50040420:43
mriedemPOST-RUN END RESULT_NORMAL: [trusted : git.openstack.org/opendev/base-jobs/playbooks/base/post.yaml@master]20:43
mriedemcompared to20:43
mriedem    '(filename:"job-output.txt" AND message:"POST-RUN END" AND message:"project-config/playbooks/base/post.yaml")'  # flake8: noqa20:43
clarkbI'm guessing we broke it again some other way20:44
openstackgerritMerged openstack-infra/project-config master: Remove openstack-cover-jobs/docs-on-readthedocs templates for ara  https://review.openstack.org/64083720:45
mriedemi'll dork with it locally20:45
gmannclarkb: fungi ianw can you check this: setup of stable/stein - https://review.openstack.org/#/c/640641/20:45
*** ijw has joined #openstack-infra20:45
corvusmriedem, clarkb: i haven't chased down all the links yet to find this myself -- but it looks like that query is sensitive to the playbook name, and we recently moved base-jobs to opendev/base-jobs.  has that been updated?20:46
clarkbcorvus: probably not, but that is a likely explanation20:46
*** auristor has joined #openstack-infra20:46
clarkbI wonder if we could just match playbooks/base/post.yaml20:46
clarkbthough we shouldn't be moving it much from now on20:47
mriedemthat's what i'm trying20:47
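The mismatch discussed above can be illustrated with a small Python sketch. It treats each quoted `message:` phrase as a plain substring test, which is only an approximation of how Elasticsearch matches phrases, and the helper name `matches_all` is hypothetical. The log line and the two path fragments are the ones quoted in the discussion.

```python
# Simplified model: every quoted phrase must appear in the log message.
# Real Elasticsearch phrase matching is token-based, so this is only an
# approximation for illustration.
LOG_LINE = (
    "POST-RUN END RESULT_NORMAL: [trusted : "
    "git.openstack.org/opendev/base-jobs/playbooks/base/post.yaml@master]"
)


def matches_all(message, phrases):
    """Return True when every phrase occurs in the message."""
    return all(phrase in message for phrase in phrases)


# Phrases from the existing query, which still names project-config.
old_phrases = ["POST-RUN END", "project-config/playbooks/base/post.yaml"]
# Phrases that only depend on the playbook path, not the repository name.
new_phrases = ["POST-RUN END", "playbooks/base/post.yaml"]

old_hit = matches_all(LOG_LINE, old_phrases)
new_hit = matches_all(LOG_LINE, new_phrases)
print(old_hit, new_hit)
```

Under this simplified model, matching on the repository-independent path keeps working if the playbook's repository is renamed again.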
*** jtomasek has quit IRC20:48
clarkbfungi: AJaeger the docs.old afs volume was there to help facilitate the move from not afs to afs?20:50
clarkbfungi: AJaeger I'm guessing we don't actually need redundant copies of that volume?20:50
*** markvoelker has quit IRC20:51
*** sdake has joined #openstack-infra20:52
dmsimardinfra-root: snapshot for bridge.o.o is finished, I'm briefly testing the image before proceeding with the resize20:52
clarkbdmsimard: I guess the cron is disabled but normally we'd want to be careful doing that to avoid multiple independent runs of ansible happening20:53
clarkbthough I guess we restrict by IP so probably a non concern even then20:53
dmsimardclarkb: yes, I'm doing this with the knowledge that run_all is commented out20:53
openstackgerritMatt Riedemann proposed openstack-infra/elastic-recheck master: Fix ALL_FAILS_QUERY  https://review.openstack.org/64086420:54
mriedemclarkb: corvus: yeah ^ fixes it20:54
mriedem~40% categorization in the gate20:54
fungiinfra-root: in today's exciting episode of "openstack or opendev..." wiki-dev: is it openstack or opendev? i'm gearing up to launch a replacement for it20:54
mordredfungi: oy20:54
mordredfungi: that is the hardest question you've asked20:54
mordredfungi: could I answer by panicking, running around in a circle and then knocking myself unconscious by running in to a light post?20:55
*** dpawlik has joined #openstack-infra20:55
clarkbmriedem: looking20:55
fungimordred: i think when in doubt, maybe we just default to assuming it's not opendev20:56
mordredfungi: I think I'd tend towards thinking openstack ... I don't think we're happy about being in the wiki business in the first place, so I don't think we likely want to expand our user audience20:56
fungipreferable to the lightpost thing anyway20:56
mordredfungi: yeah - what you said but with more words20:56
clarkbI know the wiki is a thing people want, but ya also not sure how much effort we'd be able to invest in it given well experience20:56
clarkbeasy enough to change it later I suppose if we need to20:57
mordred"would I want to excitedly point people towards it as an awesome feature of opendev and why they should use us - or would I prefer to quietly nudge it in to a dark corner and hope nobody notices it's there"20:57
*** IvensZambrano has joined #openstack-infra20:57
fungiclarkb: ianw: gmann: i suppose the reason features.yaml is still in devstack-gate is because we need it to be somewhere branchless?20:57
fungishould that maybe move into openstack-zuul-jobs or something?20:57
openstackgerritMerged openstack-infra/storyboard-webclient master: Sort search results by updated_at by default  https://review.openstack.org/63869020:57
pabelanger+1 no wiki.opendev.org20:57
*** anteaya has joined #openstack-infra20:58
clarkbfungi: the branchless vs not is what makes that complicated aiui. That said we should be able to make it branched if we really want to it just hasn't happened yet20:58
*** jamesmcarthur has quit IRC20:58
clarkbmriedem: I approved the change20:58
*** jamesmcarthur has joined #openstack-infra20:59
*** jamesmcarthur has quit IRC20:59
fungiclarkb: yeah, i believe docs.old was from before the root-marker work and subsequent mass deletion of unmanaged content (and then we recalled a fair amount of more ancient unmanaged content back from it and readded it to the live volume)20:59
*** jamesmcarthur has joined #openstack-infra20:59
clarkbfungi: cool I'll mark that as not needing redundant copies21:00
*** dpawlik has quit IRC21:00
ianwfungi: isn't this just required for legacy jobs at this point?  that was my understanding.21:00
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul  https://review.openstack.org/63193021:00
fungiianw: oh, do we not even use features.yaml in the new devstack jobs? how do we manage the feature matrix? last i recall that was the one bit we hadn't managed to rework yet21:00
fungibut there's every chance i'm just behind the times21:01
corvusfungi, clarkb: yes, features.yaml should be able to be factored out, but it will require some thought.21:01
corvusianw: ^21:01
corvusit *is* used in the new devstack jobs, though it never should have been.21:02
*** wolverineav has joined #openstack-infra21:02
gmannyeah currently it is used in new job.21:03
clarkbinfra-root I've annotated my afs volume ethercalc with a bit more specifics on which volumes will need to get redundant copies given my upgrade plan.21:03
*** whoami-rajat has quit IRC21:04
clarkband with that i have gumbo to eat for lunch21:04
ianwclarkb: maybe put in some reboots in the steps on the etherpad?21:05
openstackgerritMerged openstack-infra/zuul master: Optionally disable disk_limit_per_job  https://review.openstack.org/63859621:05
corvusgmann, clarkb, fungi, ianw: the devstack pre playbook uses the test-matrix role defined in devstack-gate21:05
*** gfidente|afk has quit IRC21:05
clarkbianw can do21:06
*** jamesmcarthur has quit IRC21:06
openstackgerritMerged opendev/base-jobs master: Remove promote playbook from opendev-promote-docker-image  https://review.openstack.org/64056321:07
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul  https://review.openstack.org/63193021:08
openstackgerritMatt Riedemann proposed openstack-infra/elastic-recheck master: Add query for PlacementFixture bug 1818560  https://review.openstack.org/64086921:09
openstackbug 1818560 in OpenStack Compute (nova) "Nova test_report_client uses nova conf when starting placement intercept, causing missing config opts" [Critical,In progress] https://launchpad.net/bugs/1818560 - Assigned to Chris Dent (cdent)21:09
ianwclarkb: otherwise maybe we can paste in the commands; trying to remember the last time i did any of the volume swizzling and i think it was http://lists.openstack.org/pipermail/openstack-infra/2018-May/005949.html21:09
dmsimardinfra-root: the "brief" test of the snapshot is turning out to take a while, sorry about that. VM is still trying to spawn with the image. I guess it takes a while to download the image back.21:10
ianwyeah, it can take quite a while for rax to actually start a vm even when you upload an image21:11
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix missing wait-to-start playbook in quick start  https://review.openstack.org/64087121:11
fungii expect it's the time taken to copy that image to the hypervisor host's cache21:12
dmsimardglance says the image's size is "10170112000" which sounds like bytes, so around 10GB ? not the end of the world but it's not small21:13
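As a quick check of that estimate, assuming the glance figure is indeed in bytes, the conversion can be sketched as follows. The variable names are illustrative only.

```python
size_bytes = 10170112000  # value reported by glance in the discussion

size_gb = size_bytes / 10**9    # decimal gigabytes
size_gib = size_bytes / 2**30   # binary gibibytes

print(f"{size_gb:.2f} GB, {size_gib:.2f} GiB")
```

Either way the image is on the order of 10 GB, which fits the observation that copying it takes a noticeable amount of time.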
ianwclarkb: but other than that, plan LGTM, thanks.  can help if you like21:13
openstackgerritMerged openstack-infra/elastic-recheck master: Fix ALL_FAILS_QUERY  https://review.openstack.org/64086421:25
*** eharney has joined #openstack-infra21:28
*** jcoufal has quit IRC21:28
openstackgerritMerged openstack-infra/elastic-recheck master: Add query for PlacementFixture bug 1818560  https://review.openstack.org/64086921:28
openstackbug 1818560 in OpenStack Compute (nova) "Nova test_report_client uses nova conf when starting placement intercept, causing missing config opts" [Critical,In progress] https://launchpad.net/bugs/1818560 - Assigned to Chris Dent (cdent)21:28
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Fix test race in test_client_dequeue_change_by_ref  https://review.openstack.org/64087821:30
*** yamamoto has joined #openstack-infra21:33
corvusmordred: if i'm reading the ansible source correctly, it starts a new multiprocessing.Process for every task that it runs....21:33
corvusmordred: it looks like that was not previously the way it worked, but something changed in the past few years...21:34
*** ijw has quit IRC21:36
*** jcoufal has joined #openstack-infra21:37
*** jamesmcarthur has joined #openstack-infra21:37
*** yamamoto has quit IRC21:38
corvusmordred, fungi: i *think* this is when ansible switched from a pool of long-running worker processes draining tasks from a queue to running each task in its own worker process: https://github.com/ansible/ansible/commit/120b9a7ac6274c54d091291587b0c9ec865905a121:40
corvusgit tells me that would be ansible v2.121:40
*** sdake has quit IRC21:41
dmsimardVM with the snapshot is still in BUILDING...21:41
dmsimardMakes me question the ability to restore from a snapshot21:41
openstackgerritMerged openstack-infra/storyboard master: Update docs on how to run the tests locally  https://review.openstack.org/64023321:41
dmsimardbe back in a few21:43
*** sdake has joined #openstack-infra21:43
corvusclarkb, fungi: can you +3 https://review.openstack.org/64086121:47
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Manage ansible installations within zuul  https://review.openstack.org/63193021:47
clarkbcorvus: done21:48
corvusi ran a quick test of that in my test setup, and i think it will make the runtime of this playbook tolerable.21:48
fungilgtm21:48
corvusmy guess is just a few minutes total.21:49
*** markvoelker has joined #openstack-infra21:49
*** ijw has joined #openstack-infra21:49
*** mriedem has quit IRC21:50
mordredcorvus: oh joy21:51
*** geguileo has quit IRC21:53
*** smcginnis has quit IRC21:53
*** zzzeek has quit IRC21:54
*** kota_ has quit IRC21:54
*** zzzeek has joined #openstack-infra21:54
*** jamesmcarthur has quit IRC21:57
*** kota_ has joined #openstack-infra21:57
*** jamesmcarthur has joined #openstack-infra21:58
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine  https://review.openstack.org/64088421:59
dmsimardinfra-root: I have two ongoing attempts to spawn a VM from the bridge.o.o snapshot. The first, a rebuild of my resize test VM, has been going for >1hr and the other, a fresh VM, has been going for >30 minutes. The resize looked to work well enough in the test but I'm not sure about the snapshot part.22:01
*** jcoufal has quit IRC22:02
*** ijw has quit IRC22:02
dmsimardHave we successfully used snapshots before ?22:02
clarkbdmsimard: yes we have with the previous lists upgrade server testing22:02
clarkbdmsimard: its probably made a 40GB or bigger image that has to be copied?22:02
dmsimardof course, just as I mention that, the rebuild *just* became active22:02
clarkbianw: I've updated the etherpad with the system level steps. Going to sort out afs management tasks to move RW volume around now22:03
*** wolverineav has quit IRC22:03
*** eernst has joined #openstack-infra22:06
*** wolverineav has joined #openstack-infra22:06
*** eernst has quit IRC22:06
*** eernst has joined #openstack-infra22:07
*** betherly has joined #openstack-infra22:07
*** dave-mccowan has quit IRC22:09
*** maumont has joined #openstack-infra22:10
corvusclarkb: why step 5?22:11
corvuslet me rephrase22:11
*** wolverineav has quit IRC22:11
corvusclarkb: why change where the RW volumes are at all?22:11
clarkbcorvus: it means we don't have to track down every single writer22:11
clarkbso mostly belts and suspenders22:11
corvusclarkb: there aren't that many writers, and you've shut most of them off by disabling those cronjobs22:12
corvusclarkb: the rest can just fail22:12
clarkbcorvus: ya thats true22:12
*** betherly has quit IRC22:12
*** jamesmcarthur has quit IRC22:12
corvusclarkb: personally, i feel that an upgrade plan which doesn't require modifying the vldb or volumes themselves is easier and safer22:13
*** jamesmcarthur has joined #openstack-infra22:13
clarkbcorvus: ok we'll need to update the files sites that only use RW volumes today (but planned to do that anyway)22:13
openstackgerritMerged openstack-infra/system-config master: Filter setup-repos loop before include_tasks  https://review.openstack.org/64086122:14
corvusclarkb: ++22:14
corvusclarkb: we may need to add cronjobs for those22:14
corvusor rather, add them to the docs auto-release cronjob22:15
clarkbcorvus: ya they'll need to be added to the afsrelease cronjob22:15
corvusclarkb: i think you can skip opendev22:15
corvusit's now in the gitea image22:15
clarkboh right22:15
clarkbcorvus: do you know what "service" volume is?22:16
corvusclarkb: nothing mounted under it22:16
clarkbcan probably be skipped too then22:16
dmsimardThe snapshot isn't booting and is falling back to "emergency mode", ctrl+d eventually ends up rebooting the VM: https://i.imgur.com/Wite4dW.png22:16
dmsimardDo we happen to have a root password for bridge.o.o already set ?22:16
clarkbdmsimard: no we don't set root passwords22:17
dmsimardnot sure how to interrupt boot to set a password, I don't see a grub prompt22:18
*** eglute has joined #openstack-infra22:18
*** wolverineav has joined #openstack-infra22:18
clarkbcorvus: ianw: great looks like we have a list of volumes to update and get published properly. I'll probably poke at that starting tomorrow morning as I'll be able to fully page in all the afs and kerberos stuff :) Then maybe wednesday/thursday we try the upgrade on ord?22:18
dmsimardThis has gotten very rabbit holey. Do we want to move forward with the resize ?22:18
clarkbianw: and thank you for the offer of help22:18
clarkbdmsimard: the usual method with openstack is you boot a recovery image that attaches your server image22:19
ianwyeah, graphical boot doesn't help, i thought there was something we put in base puppet ages ago to turn that off22:20
*** markvoelker has quit IRC22:21
clarkbdmsimard: assuming that mordred's ansible fix above helps things we may not need the resize right this moment, but this is still likely to be a useful thing to do overall22:21
clarkbso probably don't want to give up on it yet if we can avoid it22:22
dmsimardI looked on bridge and GRUB_TIMEOUT is set to 0 and GRUB_TIMEOUT_STYLE is set to hidden which would explain why I don't see it22:22
corvuswell, bridge (and puppetmaster before it) has been out of memory, swapping, and ooming for *years*22:24
corvusit's very frustrating, and it's why it often takes half a day or longer for changes to take effect22:25
*** eernst has quit IRC22:25
corvusso upgrading is definitely necessary despite mordred's improvement22:25
corvusbut if we need to do that by redeploying bridge from scratch again some other time, then i guess that's what we'll do22:26
dmsimardI'd love to be confident that our snapshot works before doing the resize, all of this should've been simple and quick enough to do but here we are :D22:27
corvusdmsimard: yeah, it's a good plan.  i don't think we should make more work for ourselves by short-cutting22:28
corvusdmsimard: if the snapshot doesn't boot, let's not proceed22:28
ianwdmsimard: i didn't see it in the saved images list, did you remove it?22:29
clarkbone thing to check with snapshots is that the host flavor is big enough for the disk22:29
dmsimardianw: bridge.openstack.org-20190304-before-resize (faf18644-1c6e-4844-9692-9856b512b8e0)22:29
dmsimardclarkb: tried one with the 8gb flavor (which is the one failing to boot right now) -- the one with the 2gb flavor is still building22:29
openstackgerritAdam Coldrick proposed openstack-infra/storyboard master: Add a 'security' flag to Teams  https://review.openstack.org/64082322:30
dmsimardI can spend some more cycles on this later, need to grab dinner.22:30
clarkbdmsimard: the image min disk is set to 40 and the performance 8gb flavor does have a 40gb root disk22:31
clarkbso ya that should be fine22:31
ianwdmsimard: just taking the liberty of trying emergency mode with that other server, see if we can get any info22:34
dmsimardianw: go for it22:34
clarkbcorvus: should we reenable the cron on the running server and run it with mordred's fix or wait for possible resizing?22:35
corvusclarkb, dmsimard: since dmsimard is going to return to this later, and the next step is verifying the snapshot anyway, i think we can go back into production...22:37
corvusclarkb: where's your zuul-cd change at?22:38
clarkbhttps://review.openstack.org/#/c/604925/ looks like it already conflicts again after recent rebase. I'll rebase again now22:38
*** kgiusti has left #openstack-infra22:40
openstackgerritClark Boylan proposed openstack-infra/system-config master: Add zuul user to bridge.openstack.org  https://review.openstack.org/60492522:40
openstackgerritClark Boylan proposed openstack-infra/system-config master: Manage user ssh keys from urls  https://review.openstack.org/60493222:40
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Run docker-compose pull before docker-compose up  https://review.openstack.org/64088922:41
corvusclarkb: ^ there's the answer to your question from a while ago22:41
corvusclarkb, dmsimard: i will re-enable the cron jobs on bridge22:42
clarkbah the image pull isn't implicit22:42
clarkbbut if you've got new images then the up implies a restart22:42
corvusyep22:42
*** harlowja has joined #openstack-infra22:42
corvusi went ahead and did a git pull on system-config on bridge, and re-enabled the crons22:43
*** agopi has quit IRC22:44
dmsimardcorvus: +122:44
mordredcorvus: ok to +A the docker-compose?22:45
corvusclarkb: is the addition of test_ara in 604925 intentional?  (i didn't see it mentioned in the commit, and it's not obvious why it's needed for this change, but it's fine on its own :)22:45
corvusmordred: yep22:45
mordredcorvus: done!22:46
clarkbcorvus: that appears to be bad rebasing22:46
clarkbcorvus: I can remove it. I just rebased so may as well remove that unneeded function22:46
corvusclarkb: ok either way.  i +2d now, and will be happy to +2 the cleanup.22:47
corvusassuming, of course, that test works :)22:47
openstackgerritClark Boylan proposed openstack-infra/system-config master: Add zuul user to bridge.openstack.org  https://review.openstack.org/60492522:48
openstackgerritClark Boylan proposed openstack-infra/system-config master: Manage user ssh keys from urls  https://review.openstack.org/60493222:48
clarkbcorvus: ^ should be cleaned up now22:48
*** sdake has quit IRC22:50
openstackgerritMatthieu Huin proposed openstack-infra/zuul master: [WIP] Web: plug the authorization engine  https://review.openstack.org/64088422:51
*** slaweq has quit IRC22:53
*** smcginnis has joined #openstack-infra22:54
ianwdmsimard: so it has a command-line "root=/dev/xvdb1" and the partitions seem to be /dev/xvda1 (cloudimg-rootfs) /xvda15 & 14 (efi stuff)22:54
ianwbridge has everything on xvda22:55
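The mismatch described above, a kernel command line pointing at a root device that is not among the partitions present, can be checked with a short sketch. Only `root=/dev/xvdb1` and the xvda partition names come from the discussion; the rest of the command line and the helper name `root_device` are hypothetical.

```python
def root_device(cmdline):
    """Return the value of the root= parameter, or None if absent."""
    for token in cmdline.split():
        if token.startswith("root="):
            return token[len("root="):]
    return None


# Hypothetical command line; only root=/dev/xvdb1 comes from the discussion.
cmdline = "BOOT_IMAGE=/boot/vmlinuz root=/dev/xvdb1 ro console=tty0"
# Partitions reported on the VM booted from the snapshot.
partitions = {"/dev/xvda1", "/dev/xvda14", "/dev/xvda15"}

root = root_device(cmdline)
root_present = root in partitions
print(root, root_present)
```

A root device that is missing from the partition list would be consistent with the boot falling back to emergency mode, though the log does not confirm that this is the only cause.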
*** dpawlik has joined #openstack-infra22:56
*** tkajinam has joined #openstack-infra22:56
ianwdifferent kernel in the /proc/cmdline ... hrm22:57
*** jamesmcarthur has quit IRC22:57
dmsimardianw: bridge.o.o (the real one) has /dev/xvda1, xvda14 and xvda15 (???) as well as an xvde2 for /opt22:58
*** jamesmcarthur has joined #openstack-infra22:58
*** eernst has joined #openstack-infra23:00
*** dpawlik has quit IRC23:00
*** jamesmcarthur has quit IRC23:00
*** slaweq has joined #openstack-infra23:01
*** eernst has quit IRC23:02
openstackgerritLogan V proposed openstack/diskimage-builder master: Add DIB_APT_MINIMAL_CREATE_INTERFACES toggle  https://review.openstack.org/63986523:03
ianwyeah, i got up the initramfs ... closer ...23:04
clarkblogan-: it looks like our ansible run may be having trouble talking to the mirror now23:05
*** slaweq has quit IRC23:05
clarkblogan-: not sure if that helps with debugging (just noticed that the ansible process to that node is slow/not finishing quickly)23:05
corvusfungi: can you +3 https://review.openstack.org/604925 ?  i'm in no rush for the followup if you want to discuss it more23:06
*** eernst has joined #openstack-infra23:06
logan-clarkb: yeah :/ it seems like there's some neutron weirdness going on in the controllers / network nodes. i'm rolling thru them doing openstack minor version upgrades and reboots and then we'll see where things stand23:07
fungiyep, looks good23:07
fungithanks again, logan-!23:08
corvusfungi: and i downgraded my vote in the followup to a +1 to reduce the chances of premature merging23:08
logan-im tempted to leave it disabled this week and bring that cloud up to rocky.23:09
*** eernst has quit IRC23:10
fungicorvus: thanks. i'm open to being convinced this isn't really a risk in practice, but combine it with the fact that we know there's a *.openstack.org cert floating around outside our sphere of control and it just gets that much more questionable23:10
*** rascasoft has quit IRC23:10
*** sdake has joined #openstack-infra23:12
openstackgerritAdam Coldrick proposed openstack-infra/storyboard master: Use ColumnElements instead of strings in migration  https://review.openstack.org/64089223:15
*** betherly has joined #openstack-infra23:16
*** eernst has joined #openstack-infra23:16
*** eernst has joined #openstack-infra23:17
*** markvoelker has joined #openstack-infra23:18
*** eernst has quit IRC23:19
*** betherly has quit IRC23:20
*** eernst has joined #openstack-infra23:23
corvusJpMaxMan, clarkb, fungi, mordred: i think all the zuul stuff is in place for netlify-cms.  see the most recent build of https://review.openstack.org/635924 -- if you follow the link there, it will take you directly to the preview site23:25
corvusthat change can be merged now if we want :)23:26
clarkbneat23:27
*** eernst has quit IRC23:27
*** openstackgerrit has quit IRC23:28
*** tosky has quit IRC23:28
JpMaxManNice! Now what?23:29
clarkbJpMaxMan: basically merge that change then all subsequent changes to that sandbox will get similar previewing builds23:30
clarkbwhich can be used by reviewers to decide if they want to merge those changes or not23:31
*** eernst has joined #openstack-infra23:31
JpMaxManHah yeah I know speaking more macro like do we roll up all the project sites using this ?23:31
clarkbpuppet is going to run on review.o.o nowish. I think this should fix windmill-config which will fix our image builds23:31
*** eernst has quit IRC23:32
corvusJpMaxMan: yeah, i think maybe we're probably ready to either move starlingx or zuul to this for real... maybe we can try to regroup with mordred tomorrow, and jimmy too if he's around?23:35
fungiJpMaxMan: probably once we see it work well for the stx site, i guess so23:35
*** sdake has quit IRC23:35
mordredcorvus: yeah - I think that's a great next step23:35
clarkbhttps://git.openstack.org/cgit/openstack/windmill-config has content now23:35
fungimight at least be nice to go through the paces of a change pushed from netlify and a change pushed directly to gerrit just to make sure it's doing what we expect23:35
mordredcorvus: probably try moving starlingx for real first - since it's already using an appropriate framework and we'll need to do some rework on zuul23:35
funginot that zuul is really using much of a framework. it's just hand-edited html/css which were originally spat out of a templating engine23:36
mordredbut I think we can fast-follow starlingx with zuul pretty easily23:36
*** dpawlik has joined #openstack-infra23:37
corvusmordred: zuul has the publication framework, stx has the vuepress buildout.  they're each halfway there :)23:37
fungigood point23:37
mordredcorvus: indeed. :)23:38
corvusmordred, clarkb: git playbook ran in 10m23:38
mordredcorvus: that's so much better than 1.5 hours23:39
fungithat's a good order of magnitude improvement (down from nearly 100)23:39
*** dklyle has quit IRC23:39
corvusi'm trying to dig the result out :)23:39
JpMaxManfungi: yeah I think some more testing is in order for sure23:40
mordredcorvus: opendev.org seems to have repos all the way to the end of stackforge at least - so seems like it got past healthnmon23:40
corvuszero failures :)23:40
mordredbut that looks like it was from 4 hours ago23:40
corvusoh, hrm23:40
clarkbcorvus: mordred did we end up correcting the setup of the projects that made it in when the db stuff wasn't working?23:40
clarkb(wondering if we want to do an out of band pass with the playbook just for this)23:41
mordredclarkb: all of them should be fine - except for maybe healthnmon23:41
corvusmordred: hrm, we must have run it to completion with the description length fix.  xstatic-angular-animate is the last project in the list.23:41
*** dpawlik has quit IRC23:41
corvusso the state is correct23:41
mordredclarkb: we were doing full re-runs already23:41
mordredyah. oh - and in fact healthnmon should be fine too23:41
corvusyep, looks like it has the correct settings23:42
mordredcorvus: \o/23:42
mordreddo we have that new project that was landed this morning?23:42
clarkbcool23:42
corvushttps://opendev.org/openstack/windmill-config23:42
clarkbopenstack/windmill-config23:42
mordredwoot23:42
clarkbnext step is replication?23:42
mordredso now, in theory, we should be good to land the gerrit replicate patch, assuming we're happy with the playbook23:43
corvusand https://review.openstack.org/#/admin/projects/openstack/windmill-config exists23:43
*** mriedem has joined #openstack-infra23:43
corvusmordred, clarkb: yes on replication23:43
mordredwoot!23:43
clarkbI'll remove my wip23:43
clarkbhttps://review.openstack.org/#/c/640431/ WIP removed23:43
corvusclarkb: frickler had a q23:43
mordredcorvus, clarkb: anybody want to predict an over/under on how long it'll take for replication to catch up?23:44
mordred:)23:44
corvusmordred: i think *less* than 24h :)23:45
clarkbcorvus: frickler I actually did read the docs on that let me find a link to a manpage23:45
*** rlandy has quit IRC23:45
clarkbhttps://git-scm.com/docs/git-push#URLS ssh://[user@]host.xz[:port]/path/to/repo.git/ is the form we are using23:46
* mordred puts €1 on 7hrs23:46
clarkbit is possible we need to explicitly set the ssh:// though23:46
*** maumont has quit IRC23:46
clarkb[user@]host.xz:path/to/repo.git/ is the form we were using which didn't have the ssh://23:46
corvusyeah, but it's gerrit, not ssh, doing this23:47
clarkbhttps://gerrit.googlesource.com/plugins/replication/+doc/master/src/main/resources/Documentation/config.md points to the git push url docs too23:47
clarkbgiven that I'd expect ssh:// to be what we want23:47
corvuswell, i don't think we need ssh://23:48
corvuswe don't have it in any other sections23:48
clarkbya but the other sections use the scp form I think (which doesn't have the ssh://)23:48
clarkbbut with scp form you can't set the port since scp uses : ?23:48
* clarkb double checks23:48
* mordred believes in clarkb23:49
clarkbwe can convert review-dev to the other form really quick and test if we want23:49
corvuswhere does review-dev replicate to?23:49
clarkbgithub iirc23:49
clarkbgtest-org replicates to github23:49
* clarkb double checks that23:50
corvusah, so put a :22 on there23:50
clarkbyup23:50
clarkblet me get that change up23:50
*** eharney has quit IRC23:51
clarkbrereading docs I'm fairly certain we want the ssh:// do we prefer I test what I've already pushed for review.o.o or test ssh:// then update review.o.o change to ssh:// if that works?23:51
*** markvoelker has quit IRC23:51
*** hwoarang has quit IRC23:51
corvusclarkb: whichever you prefer23:51
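The difference between the two remote forms discussed above can be sketched with Python's standard `urllib.parse`. This only illustrates the URL syntax and says nothing about how Gerrit's replication plugin parses its config; the user, host, port, and path values are hypothetical, as is the helper name `explicit_port`.

```python
from urllib.parse import urlsplit


def explicit_port(remote):
    """Return the port from an ssh:// style remote, or None."""
    parts = urlsplit(remote)
    if parts.scheme != "ssh":
        # scp-like "[user@]host:path" has no scheme; the text after the
        # colon is a path, so there is nowhere to put a port number.
        return None
    return parts.port


# Hypothetical user, host, port, and path values for illustration.
url_form = "ssh://git@example.org:222/project/repo.git"
scp_form = "git@example.org:project/repo.git"

print(explicit_port(url_form), explicit_port(scp_form))
```

In the scp-like form the colon separates host from path, which is why a non-default port needs the explicit `ssh://` form.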
*** wolverineav has quit IRC23:52
*** mriedem has quit IRC23:52
*** hwoarang has joined #openstack-infra23:53
*** IvensZambrano has quit IRC23:53
*** openstackgerrit has joined #openstack-infra23:53
openstackgerritClark Boylan proposed openstack-infra/system-config master: Use explicit ssh url with review-dev replication config  https://review.openstack.org/64089623:53
corvusclarkb: can we just do that manually on review-dev real quick?23:54
clarkbcorvus: sure23:54
clarkbI can do that real quick23:54
clarkbreview-dev.o.o has been restarted with that config. Now I'll merge something in gtest and we should see it replicate23:57
clarkbhttps://github.com/gtest-org/test/commit/0e293c8fbf715d3e601be7be6ffcf56be6da10bd is there23:58
clarkbhttps://review-dev.openstack.org/#/c/107956/ I just submitted that change23:58
clarkbcorvus: ^ if you think that looks right I'll update my change for prod23:58
openstackgerritMerged openstack-infra/system-config master: Add zuul user to bridge.openstack.org  https://review.openstack.org/60492523:59
openstackgerritMerged openstack-infra/system-config master: Run docker-compose pull before docker-compose up  https://review.openstack.org/64088923:59
