Saturday, 2020-02-22

fungidonnyd: note that some jobs may still be running on nodes there for a little bit after that change merges00:00
fungiif you wanted to be extra careful you could check whether nodepool has deleted all the server instances first00:00
*** ociuhandu has quit IRC00:00
donnydThe general purpose jobs have been disabled for a week or so00:00
fungiright, so it would only be the "special" nodes anyway00:00
donnydFn has only been running the specialized jobs00:01
fungiof which there very well may be none at this point in the week00:01
*** mattw4 has joined #openstack-infra00:01
donnydBut I will surely check before I pull the plug on anything00:01
fungibecause people smarter than us are out drinking00:01
donnydLol00:01
*** ociuhandu has joined #openstack-infra00:01
clarkbmordred: changes lgtm00:02
fungidonnyd: also give it a few minutes for nodepool to delete the remaining images so it doesn't get stuck thinking they're a thing for all eternity00:02
fungiand keep in mind that nodepool changes aren't applied instantaneously when config changes for them merge00:03
openstackgerritMerged openstack/project-config master: Full Disable of FortNebula  https://review.opendev.org/70925700:03
*** mattw4 has quit IRC00:07
*** ociuhandu has quit IRC00:07
*** ociuhandu has joined #openstack-infra00:09
*** ahosam has joined #openstack-infra00:13
donnydfungi: I'm not actually going to shut down the controller just yet, just need to grab the IPs from the mirrors project00:14
*** ociuhandu has quit IRC00:14
donnydAlso I plan to reuse as much as I can from the existing configs, it seems to be working well00:15
fungiokay, cool00:16
*** rfolco has quit IRC00:17
*** Goneri has quit IRC00:28
*** ociuhandu has joined #openstack-infra00:36
clarkbI didn't manage to get to the pip virtualenv spec but will try to have that top of list on monday00:41
*** ociuhandu has quit IRC00:46
*** ociuhandu has joined #openstack-infra00:47
*** ociuhandu has quit IRC00:48
*** ociuhandu has joined #openstack-infra00:48
*** ijw has quit IRC00:53
*** ijw has joined #openstack-infra00:56
*** ociuhandu has quit IRC00:59
*** ociuhandu has joined #openstack-infra01:00
*** ijw has quit IRC01:01
*** ociuhandu has quit IRC01:05
*** ijw has joined #openstack-infra01:07
*** gyee has quit IRC01:09
*** lbragstad has quit IRC01:16
*** rkukura has quit IRC01:26
*** artom has quit IRC01:28
*** rh-jelabarre has quit IRC01:32
*** rkukura has joined #openstack-infra01:34
fungii've lost count of all the things i didn't get to this week01:42
*** rfolco has joined #openstack-infra01:54
*** ahosam has quit IRC02:02
*** redrobot has quit IRC02:03
*** jamesmcarthur has joined #openstack-infra02:09
*** jamesmcarthur has quit IRC02:43
*** jamesmcarthur has joined #openstack-infra02:43
*** jamesmcarthur has quit IRC02:48
*** jamesmcarthur has joined #openstack-infra02:57
*** auristor has quit IRC02:58
*** jamesmcarthur has quit IRC03:07
*** matt_kosut has joined #openstack-infra03:07
*** roman_g has quit IRC03:09
*** auristor has joined #openstack-infra03:09
*** matt_kosut has quit IRC03:12
*** apetrich has quit IRC03:13
*** auristor has quit IRC03:13
*** auristor has joined #openstack-infra03:17
*** ijw has quit IRC03:37
*** jamesmcarthur has joined #openstack-infra03:42
*** igordc has quit IRC03:45
*** rfolco has quit IRC03:52
*** imacdonn has quit IRC04:45
*** matt_kosut has joined #openstack-infra04:45
*** imacdonn has joined #openstack-infra04:46
*** jamesmcarthur has quit IRC04:48
*** matt_kosut has quit IRC04:50
*** ricolin has joined #openstack-infra04:58
*** ramishra has quit IRC05:13
*** evrardjp has quit IRC05:34
*** evrardjp has joined #openstack-infra05:35
*** ociuhandu has joined #openstack-infra06:15
*** ociuhandu has quit IRC06:20
*** stevebaker has quit IRC06:44
*** stevebaker has joined #openstack-infra06:44
*** dave-mccowan has quit IRC07:18
*** ociuhandu has joined #openstack-infra08:00
*** ociuhandu has quit IRC08:01
*** ociuhandu has joined #openstack-infra08:02
*** slaweq has quit IRC08:09
*** ociuhandu has quit IRC08:15
*** ociuhandu has joined #openstack-infra08:16
*** slaweq has joined #openstack-infra08:20
*** ociuhandu has quit IRC08:22
*** slaweq has quit IRC08:24
*** xek has joined #openstack-infra08:29
*** ahosam has joined #openstack-infra08:30
*** ociuhandu has joined #openstack-infra10:02
*** ociuhandu has quit IRC10:15
*** lumir_ has joined #openstack-infra10:38
*** matt_kosut has joined #openstack-infra10:45
*** matt_kosut has quit IRC10:50
*** yamamoto has joined #openstack-infra11:17
*** yamamoto has quit IRC11:29
*** yamamoto has joined #openstack-infra11:29
*** slaweq has joined #openstack-infra11:30
*** yamamoto has quit IRC11:34
*** slaweq has quit IRC11:35
*** ahosam has quit IRC11:35
*** yamamoto has joined #openstack-infra11:36
*** yamamoto has quit IRC11:40
*** yamamoto has joined #openstack-infra11:42
*** tosky has joined #openstack-infra11:43
*** slaweq has joined #openstack-infra11:47
*** slaweq has quit IRC11:51
*** yamamoto has quit IRC11:55
*** slaweq has joined #openstack-infra12:04
*** slaweq has quit IRC12:13
*** dciabrin has quit IRC12:21
*** ociuhandu has joined #openstack-infra12:22
*** slaweq has joined #openstack-infra12:25
*** ociuhandu has quit IRC12:27
*** slaweq has quit IRC12:29
*** nicolasbock has joined #openstack-infra12:39
*** ociuhandu has joined #openstack-infra12:45
*** ociuhandu has quit IRC12:50
*** eharney has quit IRC13:05
*** nicolasbock has quit IRC13:20
*** yamamoto has joined #openstack-infra13:47
*** yamamoto has quit IRC13:49
*** yamamoto has joined #openstack-infra13:51
*** ociuhandu has joined #openstack-infra14:01
*** Lucas_Gray has joined #openstack-infra14:01
openstackgerritAlbin Vass proposed zuul/zuul-jobs master: Adds role to install hashicorp packer  https://review.opendev.org/70929214:04
*** yamamoto has quit IRC14:05
*** ociuhandu has quit IRC14:06
*** Lucas_Gray has quit IRC14:16
*** surajpatil1 has joined #openstack-infra14:39
*** bdodd has quit IRC14:41
*** surajpatil1 has quit IRC14:42
*** bdodd has joined #openstack-infra14:42
*** rfolco has joined #openstack-infra14:54
openstackgerritMonty Taylor proposed opendev/system-config master: Replace kubectl snap with apt repo  https://review.opendev.org/70925314:59
openstackgerritMonty Taylor proposed opendev/system-config master: Remove snap cleanup tasks  https://review.opendev.org/70929314:59
*** rfolco has quit IRC14:59
mordredclarkb: ^^ there's a fun cleanup15:00
*** rfolco has joined #openstack-infra15:05
*** yamamoto has joined #openstack-infra15:06
openstackgerritMonty Taylor proposed opendev/system-config master: Add jobs to build gerrit 3.1  https://review.opendev.org/70929515:06
*** ociuhandu has joined #openstack-infra15:07
mordredinfra-root: ooh, reading 3.1 release notes: https://www.gerritcodereview.com/3.1.html#the-messageoftheday-extension-point-is-removed ... this mentions a javascript "banner" plugin entrypoint. perhaps we should write a js plugin that interfaces with statusbot - so when we have a global announcement in place it would show up in a gerrit banner in addition to in IRC15:10
*** yamamoto has quit IRC15:10
*** ociuhandu has quit IRC15:11
*** Lucas_Gray has joined #openstack-infra15:14
*** bdodd has quit IRC15:16
corvusmordred: statusbot is ready for that: corvus@eavesdrop01:~$ cat /var/lib/statusbot/www/alert.json15:42
corvus{"alert": null}15:42
corvusi think that used to be served by apache, but looks like it isn't right now15:43
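
[Editor's sketch] The alert.json file corvus shows above could feed a banner in any front end. The snippet below is a minimal illustration in Python rather than the JavaScript plugin mordred suggests, and it assumes that a non-null "alert" value is a plain message string; only the {"alert": null} form appears in this log. The function name banner_text is hypothetical.

```python
import json


def banner_text(raw_json):
    """Return the text a banner should show, or None when no alert is set.

    Assumption: a non-null "alert" value is a plain string. Only the
    {"alert": null} form is confirmed by the log above.
    """
    data = json.loads(raw_json)
    alert = data.get("alert")
    if alert is None:
        # No global announcement is in place, so no banner is needed.
        return None
    return str(alert)
```

A consumer would call banner_text on the fetched file contents and hide the banner whenever the result is None.
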
*** Lucas_Gray has quit IRC15:48
*** Lucas_Gray has joined #openstack-infra15:51
*** Lucas_Gray has quit IRC15:54
*** Goneri has joined #openstack-infra15:55
AJaegermordred: so, if 3.1 *removes* it - why do you want to use it? Still, like the idea, would be great to show alerts on review.o.o and zuul.o.o16:07
*** tosky has quit IRC16:37
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Add pause-buildset-registry role  https://review.opendev.org/70925616:38
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Fix unittests for python2  https://review.opendev.org/70930216:38
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Fix unittests for python2  https://review.opendev.org/70930216:47
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Fix unittests for python2  https://review.opendev.org/70930216:57
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Add pause-buildset-registry role  https://review.opendev.org/70925616:57
openstackgerritMonty Taylor proposed zuul/zuul-jobs master: Fix cleanup of symlink fixtures  https://review.opendev.org/70930616:57
AJaegerinfra-root, did we break Zuul? I'm getting RETRY_LIMITs that I can't locate ;(17:01
AJaegerexample: http://zuul.opendev.org/t/openstack/status/change/709298,117:01
AJaegerbut there are more where I now see "2. attempt"17:01
mordredAJaeger: that certainly doesn't seem awesome17:02
mordredAJaeger: enum34 made a release recently that broke python2.7 tests for zuul - but I doubt that would have the same impact here17:03
openstackgerritTobias Henkel proposed zuul/zuul master: Add foreground option  https://review.opendev.org/63564917:03
openstackgerritTobias Henkel proposed zuul/zuul master: Deprecate -d switch for running in foreground  https://review.opendev.org/70518517:03
openstackgerritTobias Henkel proposed zuul/zuul master: Don't enforce foreground with -d switch  https://review.opendev.org/70518917:03
AJaegerand seems to be kind of random, 709298 after a recheck gets "2. attempt" everywhere17:03
mordredI've got the log streaming for one of those17:04
corvusi'll see if i can find more info about attempt #1 of linters on that change17:04
*** Goneri has quit IRC17:05
mordredcorvus: also - do we know what the deal is with log streaming printing waiting for logger pretty constantly?17:05
mordredoh - nevermind17:06
corvusmordred: no, that's new to me17:06
mordredthat's for shell tasks that run before starting the console daemon17:06
AJaegerthanks, corvus and mordred !17:06
corvuswe have shell tasks before the console daemon?17:06
mordredyup17:06
corvusmordred: did you catch the releasenotes failure?  it just hit retry limit17:06
mordredno - I'm watching http://zuul.opendev.org/t/openstack/stream/fde8063280134fe7a6033e4602c8bcd7?logfile=console.log17:06
corvusaw bad luck17:07
mordredyeah - I made it to RUN on this one - not gonna get a retry17:08
corvusgit fetch error:   stderr: 'fatal: Project not found: openstack/openstack-ansible-rabbitmq_server17:09
clarkbyou can pull up those logs via logstash too17:09
corvusthat's a failing before ansible17:09
clarkbsince we log attempts. you search for attempts 3, then search on that job for that change on attempt 117:09
clarkboh if it is before ansible then logstash may not help17:10
AJaegerheisenbugs - when I trail, it works ;(17:10
corvustobiash: ^ if you're around, this may touch the merger work17:10
corvuslemme collect a log to paste17:10
corvustobiash, clarkb, mordred, AJaeger: http://paste.openstack.org/show/789894/17:12
corvusnote there are 2 builds involved there17:13
tobiashcorvus: that sounds like something severe (oom?) happened to one of the worker processes17:13
AJaegerthis is openstack-manuals, why does it care about openstack-ansible-rabbitmq_server at all?17:14
corvustobiash: an oom happened at 16:1617:14
AJaegerso, an hour ago?17:15
corvusor 45m before this error17:15
tobiashcorvus: the second exception looks weird, but shouldn't be running in the process pool17:16
corvusAJaeger: if i understand tobiash's theory, the oom killed one of the workers and that went unnoticed until the manuals job (build 3565916d689d46be94f957e46cd6651f) ran 45m later, at which point the process pool noticed it, and that caused the job to fail17:16
corvusand yeah, i don't understand the second error yet, or whether or how it might be related to the first17:17
AJaegerthanks for explanation.17:17
tobiashcorvus: so it looks like we should catch concurrent.futures.process.BrokenProcessPool and re-initialize the process pool (or even stop the executor?)17:18
AJaegermany jobs are failing, so do we need to restart executor and the worker?17:18
corvustobiash: yeah, i think we should try the first thing, and if that doesn't work, we can look at the second  (similar to the streamer situation; i still don't think we've merged clarkb's patches for that; we should check on that too)17:18
mordred++17:19
mordredcatching broken process pool seems reasonable17:19
*** ralonsoh has joined #openstack-infra17:19
corvustobiash, AJaeger i see continuing errors, so actually let me revise the hypothesis: that it may not have gone unnoticed for 45m, it may have started failing every job running on that executor.17:20
tobiashthe exception says that the process pool is not usable anymore so I'd expect that every job fails from that point on17:20
mordredcorvus: yeah - if each one is going to throw BrokenProcessPool, that would make sense absent something to rectify the pool - it would also likely let that executor accept a much higher % of the available jobs, since it's failing them quickly17:21
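
[Editor's sketch] The recovery idea discussed above (catch concurrent.futures.process.BrokenProcessPool and re-initialize the pool) can be illustrated with the standard library alone. This is a minimal sketch, not Zuul's executor code: the RecoveringPool class and its run method are hypothetical names, and the choice to re-raise after rebuilding (so the current job still fails but later jobs get a healthy pool) is an assumption about the desired behaviour.

```python
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


class RecoveringPool:
    """Hypothetical wrapper that replaces a process pool once it breaks."""

    def __init__(self, max_workers=2, mp_context=None):
        self._max_workers = max_workers
        self._mp_context = mp_context
        self._pool = self._new_pool()

    def _new_pool(self):
        return ProcessPoolExecutor(
            max_workers=self._max_workers, mp_context=self._mp_context
        )

    def run(self, func, *args):
        """Run func(*args) in a worker process and return its result."""
        try:
            return self._pool.submit(func, *args).result()
        except BrokenProcessPool:
            # A worker died abruptly (for example, killed by the OOM killer).
            # The old pool stays unusable forever, so discard it and build a
            # fresh one before reporting the failure to the caller.
            self._pool.shutdown(wait=False)
            self._pool = self._new_pool()
            raise
```

Without the rebuild step, every later submit on the same pool raises BrokenProcessPool again, which matches the "every job fails from that point on" behaviour described above.
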
corvusthere are many instances of the rabbitmq access rights error too. the first instance of that is at 6:2517:21
corvusoh, wait, log rotation...17:22
corvusthat was happening all yesterday too17:23
corvusmordred: git clone https://review.opendev.org/openstack/openstack-ansible-rabbitmq_server17:23
mordredcorvus: why is something git cloning from review?17:24
tobiashah good, so that one is unrelated17:24
AJaegerOops, 'fatal: remote error: Git repository not found17:24
corvusmordred: it's not.  i'm asking you to confirm that it's fubar in gerrit17:24
mordredah. nod17:24
corvusmordred: (but it was a *fetch* from review by zuul that failed)17:25
corvushttps://review.opendev.org/#/admin/projects/?filter=rabbit17:25
mordredconfirm17:25
mordredI cannot clone that repo17:25
mordredbut I can clone other repos that way17:25
corvusi don't see it there; so it looks like that is a problem with gerrit17:25
AJaegerindeed, I can't find it ;(17:25
corvusi'm going to identify which executors need restarting and will restart them as a temp fix (until we can make the change that tobiash suggested); this is probably something that we can limp along with for a few days at a time until then.17:26
AJaegerlast new repo creation was https://review.opendev.org/70896117:26
corvusif someone wants to dig into the gerrit issue, that's open for the taking17:26
mordredcorvus: it exists in /home/gerrit2/review_site/git17:27
corvus(i wonder if it has a broken refs/meta?)17:27
AJaeger708961 looks ok. Still, did creating it break something?17:27
corvusonly ze02 is broken, so i will restart only it17:28
* clarkb is headed to a kid's birthday party. Sorry, can't help right now17:28
mordredand I can clone it from there locally - so the git repo seems fine-ish17:28
*** ccamacho has quit IRC17:28
*** dklyle has quit IRC17:28
corvusmordred: from the error log: http://paste.openstack.org/show/789895/17:30
mordredcorvus: oh goodie17:30
corvus#status log restarted zuul-executor on ze02 due to process pool failure17:31
AJaegersorry, have to step out as well17:33
corvusinteresting ebdf54221280df20522ab15cea9c9b67c0c03ca4 does exist in that repo17:34
corvus(and it is a change to project.config)17:34
mordredcorvus: maybe it was a hiccup?17:34
corvusmordred: i rsynced a copy of the repo and it fsck's and gc's without error17:34
*** evrardjp has quit IRC17:34
corvusmordred: yeah, we may just want to try restarting gerrit?17:34
*** evrardjp has joined #openstack-infra17:35
mordredcorvus: yeah - or - maybe clearing cache?17:35
corvusoh yeah com.google.gerrit.server.project.ProjectCacheImpl17:35
mordredcorvus: maybe just try a gerrit flush-caches and see if that fixes it?17:36
corvusmordred: should i do "--all"?17:36
corvusor "projects"17:36
mordredmaybe try projects first17:36
corvusokay, i will do that now17:36
corvusnope.  moving on to --all17:37
mordredkk17:37
corvusall is slow17:38
mordredyeah, I'll bet17:39
corvusi'm not sure this is faster than restarting :/17:40
mordred:(17:40
corvusdone17:40
corvusstill erroring17:40
corvusrestart now?17:40
mordreddouble :(17:41
mordredyeah. I think so17:41
corvusokay, i will stop gerrit17:41
mordredfingers crossed17:42
corvusstarting17:42
corvusi also restarted apache to speed up the proxy refresh17:43
corvus\o/ https://review.opendev.org/#/admin/projects/openstack/openstack-ansible-rabbitmq_server  exists17:43
mordredTHANK GOD17:43
corvusand i can git clone that repo directly from gerrit17:43
corvus#status log restarted gerrit to correct cached git object error with openstack/openstack-ansible-rabbitmq_server (repo on disk appears normal)17:44
corvuser, statusbot isn't here17:44
mordredcorvus: all of the systems are taking Mardi Gras weekend off17:46
*** openstackstatus has joined #openstack-infra17:46
*** ChanServ sets mode: +v openstackstatus17:46
corvus#statusbot log restarted statusbot because it disappeared17:46
openstackstatuscorvus: finished logging17:46
corvus#status log restarted gerrit to correct cached git object error with openstack/openstack-ansible-rabbitmq_server (repo on disk appears normal)17:46
openstackstatuscorvus: finished logging17:46
corvus#status log restarted zuul-executor on ze02 due to process pool failure17:47
openstackstatuscorvus: finished logging17:47
corvusokay, i'm going to saturday now :)17:47
mordredcorvus: saturday!17:47
openstackgerritTobias Henkel proposed zuul/zuul master: Recover from broken process pool  https://review.opendev.org/70930717:53
tobiashcorvus, mordred: that should make the executor recover from such a situation ^17:54
paladoxcorvus we've seen that error before.18:40
paladoxjust it didn't impact us as much :)18:40
paladoxcorvus i think gerrit reindexes on startup? (not too sure if it does it under 2.13 though)18:41
*** jamesmcarthur has joined #openstack-infra18:48
*** jamesmcarthur has quit IRC19:01
*** jamesmcarthur has joined #openstack-infra19:02
*** mugsie has quit IRC19:25
*** mugsie has joined #openstack-infra19:27
*** ociuhandu has joined #openstack-infra19:44
*** ociuhandu has quit IRC19:48
*** rfolco has quit IRC20:02
*** rfolco has joined #openstack-infra20:07
*** surajpatil1 has joined #openstack-infra20:19
*** rfolco has quit IRC20:23
*** surajpatil1 has quit IRC20:24
*** slaweq has joined #openstack-infra20:31
*** Lucas_Gray has joined #openstack-infra20:41
*** jamesmcarthur has quit IRC20:44
*** ralonsoh has quit IRC20:49
*** ociuhandu has joined #openstack-infra20:54
*** jamesmcarthur has joined #openstack-infra21:09
*** ociuhandu has quit IRC21:18
*** jamesmcarthur has quit IRC21:19
*** ociuhandu has joined #openstack-infra21:30
*** xek has quit IRC21:35
*** jamesmcarthur has joined #openstack-infra21:54
*** ijw has joined #openstack-infra21:54
*** jamesmcarthur has quit IRC22:00
*** ociuhandu has quit IRC22:01
*** slaweq has quit IRC22:10
*** slaweq has joined #openstack-infra22:22
*** slaweq has quit IRC22:26
*** jamesmcarthur has joined #openstack-infra22:40
*** jamesmcarthur has quit IRC22:45
*** ijw has quit IRC22:46
*** dave-mccowan has joined #openstack-infra23:20
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [wip] uncap hacking  https://review.opendev.org/70933223:26
*** jamesmcarthur has joined #openstack-infra23:41
*** jamesmcarthur has quit IRC23:46
*** jamesmcarthur has joined #openstack-infra23:54
