Thursday, 2020-02-27

*** mattw4 has quit IRC00:00
ianwtrystack.org points to 54.39.56.124 ... which then ends up at openstack.org.  so i'm guessing infra has no involvement in it any more?00:02
*** yamamoto has joined #openstack-infra00:05
*** matt_kosut has joined #openstack-infra00:07
*** yamamoto has quit IRC00:10
fungiianw: yeah, we should be able to just drop the old trystack redirects00:11
*** matt_kosut has quit IRC00:12
*** admcleod has joined #openstack-infra00:13
*** tosky has quit IRC00:21
openstackgerritIan Wienand proposed opendev/system-config master: [wip] static: final redirects  https://review.opendev.org/71016000:28
*** jamesmcarthur has joined #openstack-infra00:35
*** ociuhandu has joined #openstack-infra00:42
*** gyee has quit IRC00:44
*** mattw4 has joined #openstack-infra00:46
*** ociuhandu has quit IRC00:48
*** jamesmcarthur has quit IRC00:51
*** jamesmcarthur has joined #openstack-infra00:52
*** jamesmcarthur has quit IRC00:57
openstackgerritIan Wienand proposed opendev/system-config master: [wip] static: final redirects  https://review.opendev.org/71016001:05
*** Lucas_Gray has quit IRC01:17
*** ramishra has joined #openstack-infra01:18
*** jamesmcarthur has joined #openstack-infra01:22
*** jamesmcarthur has quit IRC01:23
*** jamesmcarthur has joined #openstack-infra01:23
*** ramishra has quit IRC01:26
openstackgerritIan Wienand proposed opendev/system-config master: [wip] static: final redirects  https://review.opendev.org/71016001:29
*** gagehugo has joined #openstack-infra01:36
*** ramishra has joined #openstack-infra01:48
*** jamesmcarthur has quit IRC01:55
*** yamamoto has joined #openstack-infra01:56
*** jamesmcarthur has joined #openstack-infra02:00
openstackgerritIan Wienand proposed opendev/system-config master: [wip] static: final redirects  https://review.opendev.org/71016002:20
openstackgerritMerged opendev/system-config master: static: fix git raw file redirect  https://review.opendev.org/71015102:24
rm_worklolol gerrit: https://review.opendev.org/#/c/549297/  "Updated 1 year, 12 months ago"02:29
rm_worksuch maths02:29
*** jamesmcarthur has quit IRC02:31
*** ijw has quit IRC02:37
*** ociuhandu has joined #openstack-infra02:43
*** jamesmcarthur has joined #openstack-infra02:43
*** ociuhandu has quit IRC02:48
*** jamesmcarthur has quit IRC02:49
*** yamamoto has quit IRC02:49
*** smarcet has joined #openstack-infra02:49
*** smarcet has left #openstack-infra02:52
*** jamesmcarthur has joined #openstack-infra02:53
*** rlandy|bbl is now known as rlandy02:56
*** diablo_rojo has quit IRC02:57
*** xinranwang has joined #openstack-infra03:00
*** jamesmcarthur has quit IRC03:07
*** nicolasbock has quit IRC03:09
openstackgerritIan Wienand proposed opendev/system-config master: [wip] static: final redirects  https://review.opendev.org/71016003:15
*** jamesmcarthur has joined #openstack-infra03:31
*** rlandy has quit IRC03:44
*** rh-jelabarre has quit IRC03:46
*** yamamoto has joined #openstack-infra03:55
*** jamesmcarthur has quit IRC03:58
*** iokiwi has quit IRC03:59
*** iokiwi has joined #openstack-infra03:59
*** yamamoto has quit IRC04:00
*** matt_kosut has joined #openstack-infra04:08
*** matt_kosut has quit IRC04:12
*** yamamoto has joined #openstack-infra04:13
ianwi'm confident the git redirect was just a single typo and https://review.opendev.org/710151 has applied, so i have switched git.openstack.org back to a CNAME to git.04:14
ianwstatic.opendev.org i mean04:14
openstackgerritIan Wienand proposed opendev/system-config master: [wip] static: final redirects  https://review.opendev.org/71016004:34
*** udesale has joined #openstack-infra04:35
*** ociuhandu has joined #openstack-infra04:45
*** imacdonn has quit IRC04:47
*** imacdonn has joined #openstack-infra04:48
*** ociuhandu has quit IRC04:49
*** gagehugo has quit IRC05:03
*** gagehugo has joined #openstack-infra05:03
*** larainema has joined #openstack-infra05:06
*** ykarel|away is now known as ykarel05:08
*** mattw4 has quit IRC05:24
openstackgerritIan Wienand proposed opendev/system-config master: static: implement legacy redirect sites  https://review.opendev.org/71016005:26
ianwAJaeger: ^ i think that's going to be it for static.openstack.org.  files.openstack.org should be idle now too.  thanks for all your help!05:27
*** rcernin has quit IRC05:33
*** rcernin has joined #openstack-infra05:33
*** evrardjp has quit IRC05:34
*** evrardjp has joined #openstack-infra05:35
*** kozhukalov has joined #openstack-infra06:07
*** lmiccini has joined #openstack-infra06:08
*** matt_kosut has joined #openstack-infra06:09
*** matt_kosut has quit IRC06:14
*** rcernin has quit IRC06:24
*** lbragstad has quit IRC06:26
*** Lucas_Gray has joined #openstack-infra06:34
*** ociuhandu has joined #openstack-infra06:45
*** ccamacho has quit IRC06:49
*** ociuhandu has quit IRC06:50
AJaegerianw: thanks for driving that!06:51
*** pgaxatte has joined #openstack-infra06:57
AJaegerianw: commented, many redirects are wrong (not your fault - still we should fix)07:00
*** pgaxatte has quit IRC07:01
*** pgaxatte has joined #openstack-infra07:02
*** ociuhandu has joined #openstack-infra07:02
*** dangtrinhnt has joined #openstack-infra07:06
*** ociuhandu has quit IRC07:07
*** xinranwang has quit IRC07:19
AJaegerianw: let me update 71016007:25
*** ralonsoh has joined #openstack-infra07:28
*** ykarel is now known as ykarel|lunch07:35
openstackgerritAndreas Jaeger proposed opendev/system-config master: Update redirects for legacy sides  https://review.opendev.org/71019507:40
AJaegerianw: here's my proposal ^07:40
AJaegerconfig-core, please review https://review.opendev.org/710072 and https://review.opendev.org/71011207:42
*** tesseract has joined #openstack-infra07:53
*** slaweq has joined #openstack-infra07:53
*** dangtrinhnt has quit IRC07:56
*** dangtrinhnt has joined #openstack-infra07:57
*** jcapitao_off has joined #openstack-infra07:58
*** jcapitao_off is now known as jcapitao08:00
*** dangtrinhnt has quit IRC08:01
*** tkajinam has quit IRC08:02
*** ociuhandu has joined #openstack-infra08:06
*** matt_kosut has joined #openstack-infra08:10
*** ociuhandu has quit IRC08:12
*** raukadah is now known as chandankumar08:13
*** matt_kosut has quit IRC08:14
*** tosky has joined #openstack-infra08:17
*** ccamacho has joined #openstack-infra08:18
*** dangtrinhnt has joined #openstack-infra08:20
*** dchen has quit IRC08:21
*** matt_kosut has joined #openstack-infra08:22
*** ociuhandu has joined #openstack-infra08:28
*** udesale has quit IRC08:31
*** udesale has joined #openstack-infra08:32
*** ociuhandu has quit IRC08:38
*** amoralej|off is now known as amoralej08:43
*** jpena|off is now known as jpena08:51
openstackgerritMerged openstack/project-config master: Add publish job for stackviz  https://review.opendev.org/71007209:00
openstackgerritAndreas Jaeger proposed opendev/system-config master: Update redirects for legacy sides  https://review.opendev.org/71019509:06
*** smarcet has joined #openstack-infra09:07
*** lucasagomes has joined #openstack-infra09:08
*** smarcet has left #openstack-infra09:09
*** rpittau|afk is now known as rpittau09:09
*** pkopec has joined #openstack-infra09:12
*** Lucas_Gray has quit IRC09:12
*** ociuhandu has joined #openstack-infra09:14
*** Lucas_Gray has joined #openstack-infra09:15
*** elod has quit IRC09:19
*** apetrich has joined #openstack-infra09:21
*** Lucas_Gray has quit IRC09:22
*** Lucas_Gray has joined #openstack-infra09:24
fricklerAJaeger: how about I trigger the stackviz periodic job now, so we can look at the results and don't have to wait until tomorrow?09:28
*** ykarel|lunch is now known as ykarel09:30
*** pgaxatte has quit IRC09:33
*** pgaxatte has joined #openstack-infra09:35
*** yamamoto has quit IRC09:35
*** rkukura has quit IRC09:37
*** gfidente|afk is now known as gfidente09:39
*** ociuhandu has quit IRC09:41
*** dangtrinhnt has quit IRC09:43
*** xek__ has joined #openstack-infra09:43
*** dangtrinhnt has joined #openstack-infra09:43
*** auristor has quit IRC09:45
*** dangtrinhnt has quit IRC09:51
fricklerAJaeger: stackviz failed to build, I don't know enough about npm to debug further http://zuul.openstack.org/build/e78ebf7871ae49149fb3cd2dc8b15e7c09:53
frickleralso "The task includes an option with an undefined variable. The error was: 'short_name' is undefined" in rename-latest.yaml09:54
*** dmellado has quit IRC09:59
AJaegerfrickler: thanks! Will fix!10:01
openstackgerritAndreas Jaeger proposed openstack/project-config master: Fix stackviz periodic job  https://review.opendev.org/71021210:03
AJaegerfrickler: These worked in stackviz, so my rework here broke it - and that made it easy to fix. ^10:04
*** ociuhandu has joined #openstack-infra10:12
*** auristor has joined #openstack-infra10:13
*** sshnaidm|afk has joined #openstack-infra10:17
*** carli has joined #openstack-infra10:18
openstackgerritMerged openstack/openstack-zuul-jobs master: Remove puppet-forge jobs  https://review.opendev.org/71011210:22
*** elod has joined #openstack-infra10:29
*** sshnaidm|afk is now known as sshnaidm10:29
fricklerAJaeger: that looks easy enough, I should likely have spotted the first issue in review, too. will trigger another run once the fix merges10:30
carlihello, I'm wondering who I can talk to, because I think I may have done a goof with a script of mine and I wonder if I'm responsible for the bad state in which logstash.openstack.org/ is (it's not showing data anymore), and I would like to know who to warn about it and if I can help unbreak it. I'm not even sure this is the right channel but couldn't find anything that seemed more appropriate10:35
*** ociuhandu has quit IRC10:39
fricklercarli: you have found the right location and I can confirm that the dashboard seems broken, but I don't have time to dig further currently, need to wait for some other infra-root10:39
openstackgerritMerged openstack/project-config master: Fix stackviz periodic job  https://review.opendev.org/71021210:41
carlifrickler: ok, good that i have found the proper place. I think it's my fault because I was using it to get some info and I think I accidentally made requests too often10:44
*** carli is now known as carli|afk10:51
*** roman_g has joined #openstack-infra11:00
*** Lucas_Gray has quit IRC11:05
*** Lucas_Gray has joined #openstack-infra11:09
*** rpittau is now known as rpittau|bbl11:12
*** jcapitao is now known as jcapitao_lunch11:14
*** kozhukalov has quit IRC11:17
*** ociuhandu has joined #openstack-infra11:18
*** ociuhandu has quit IRC11:18
*** ociuhandu has joined #openstack-infra11:19
*** ociuhandu has quit IRC11:20
*** smarcet has joined #openstack-infra11:21
*** kozhukalov has joined #openstack-infra11:21
*** ociuhandu has joined #openstack-infra11:21
*** Lucas_Gray has quit IRC11:24
*** ociuhandu has quit IRC11:26
*** matt_kosut has quit IRC11:26
*** ociuhandu has joined #openstack-infra11:26
*** Lucas_Gray has joined #openstack-infra11:27
*** ociuhandu has quit IRC11:29
*** ociuhandu has joined #openstack-infra11:30
fricklerAJaeger: your fix still didn't work, this playbook overrides the npm_command https://opendev.org/openstack/project-config/src/branch/master/playbooks/javascript/content.yaml#L411:30
*** kozhukalov has quit IRC11:35
*** ociuhandu has quit IRC11:35
*** kozhukalov has joined #openstack-infra11:35
*** ociuhandu has joined #openstack-infra11:36
*** ociuhandu has quit IRC11:36
*** ociuhandu has joined #openstack-infra11:37
*** Lucas_Gray has quit IRC11:37
*** Lucas_Gray has joined #openstack-infra11:38
*** ociuhandu has quit IRC11:39
*** ociuhandu has joined #openstack-infra11:39
*** ociuhandu has quit IRC11:40
*** ociuhandu has joined #openstack-infra11:42
*** yamamoto has joined #openstack-infra11:42
*** kozhukalov has quit IRC11:43
*** yamamoto has quit IRC11:46
*** ociuhandu has quit IRC11:48
*** ociuhandu has joined #openstack-infra11:48
AJaegerfrickler: argh - thx11:52
*** matt_kosut has joined #openstack-infra11:55
*** ociuhandu has quit IRC11:56
*** ociuhandu has joined #openstack-infra11:56
*** ociuhandu has quit IRC11:57
*** ociuhandu has joined #openstack-infra11:57
*** ociuhandu has quit IRC11:58
*** matt_kosut has quit IRC11:58
*** matt_kosut has joined #openstack-infra11:58
*** iurygregory has joined #openstack-infra11:59
openstackgerritAndreas Jaeger proposed openstack/project-config master: fix stackviz publishing  https://review.opendev.org/71023711:59
AJaegerconfig-core, hope that fixes the job, please review ^12:00
*** carli|afk is now known as carli12:02
*** nicolasbock has joined #openstack-infra12:03
*** amoralej is now known as amoralej|lunch12:08
*** udesale_ has joined #openstack-infra12:19
*** gagehugo has quit IRC12:19
*** udesale_ has quit IRC12:21
*** udesale_ has joined #openstack-infra12:21
*** udesale has quit IRC12:22
*** yamamoto has joined #openstack-infra12:25
*** iurygregory has quit IRC12:32
*** jpena is now known as jpena|lunch12:35
*** jamesmcarthur has joined #openstack-infra12:36
*** iurygregory has joined #openstack-infra12:36
*** ociuhandu has joined #openstack-infra12:38
*** kozhukalov has joined #openstack-infra12:39
*** ociuhandu has quit IRC12:40
*** eharney has quit IRC12:45
*** Lucas_Gray has quit IRC12:46
*** Lucas_Gray has joined #openstack-infra12:47
*** iokiwi has quit IRC12:47
*** iokiwi has joined #openstack-infra12:48
*** ociuhandu has joined #openstack-infra12:48
*** rkukura has joined #openstack-infra12:49
*** Goneri has joined #openstack-infra12:50
*** rlandy has joined #openstack-infra12:56
*** jamesmcarthur has quit IRC13:00
*** jamesmcarthur has joined #openstack-infra13:00
Tenguhello there! We need to create a bunch of repositories on opendev.org/openstack - I'm wondering how to do it.... Care to point me to relevant doc, or contact? Thanks !13:03
*** nicolasbock has quit IRC13:03
*** nicolasbock has joined #openstack-infra13:04
fricklerTengu: https://docs.openstack.org/infra/manual/creators.html13:04
*** dmellado has joined #openstack-infra13:05
*** sshnaidm_ has joined #openstack-infra13:05
*** udesale_ has quit IRC13:05
fricklerTengu: feel free to ask us here if something in the docs is unclear or you need further help13:05
*** udesale_ has joined #openstack-infra13:05
Tengufrickler: thanks! hopefully I'll be able to push things until tomorrow.13:05
*** jamesmcarthur has quit IRC13:06
Tengufrickler: as a quick intro: we're wanting to split the existing "tripleo-validations" in a subset of different things, with a new name (validations-*). Not sure in what category this enters :/13:08
*** sshnaidm has quit IRC13:08
*** jamesmcarthur has joined #openstack-infra13:10
Tengufrickler: oh. so I guess I need to bring that up in a discussion on #tripleo to get some ACK and actions - tripleo "head" will do the needed creation if I understand correctly.13:11
mordredTengu: you can make the patches needed - but yes, the PTL will need to ack that it's ok before we merge them13:13
*** Lucas_Gray has quit IRC13:14
Tengumordred: ... stupid question: I didn't see what repository holds that part of the config :/13:14
Tenguaaahh wait - openstack/project-config  apparently.13:14
Tenguwill check that, and prepare things - I'll be off next week, so won't move things too fast right now. Thanks for the inputs!13:16
mordredTengu: yah. the steps starting from Adding the Project to the CI System in that doc are the ones you'll need to follow13:16
mordredTengu: cool! let us know if you need help13:16
*** Lucas_Gray has joined #openstack-infra13:17
*** ociuhandu has quit IRC13:17
*** jcapitao_lunch is now known as jcapitao13:17
*** ociuhandu has joined #openstack-infra13:18
*** amoralej|lunch is now known as amoralej13:18
*** nicolasbock has quit IRC13:21
*** pkopec has quit IRC13:23
*** ociuhandu has quit IRC13:23
*** Lucas_Gray has quit IRC13:24
*** rpittau|bbl is now known as rpittau13:24
*** Lucas_Gray has joined #openstack-infra13:25
*** nicolasbock has joined #openstack-infra13:26
*** ociuhandu has joined #openstack-infra13:26
*** rfolco has joined #openstack-infra13:28
*** rfolco has quit IRC13:29
*** jamesmcarthur has quit IRC13:32
*** jamesmcarthur has joined #openstack-infra13:32
*** jpena|lunch is now known as jpena13:34
*** yamamoto has quit IRC13:35
*** lpetrut has joined #openstack-infra13:36
*** rh-jelabarre has joined #openstack-infra13:41
*** gshippey has joined #openstack-infra13:43
*** iurygregory has quit IRC13:46
*** jamesmcarthur has quit IRC13:47
*** jamesmcarthur_ has joined #openstack-infra13:47
*** matt_kosut has quit IRC13:47
*** matt_kosut has joined #openstack-infra13:48
*** smarcet has quit IRC13:52
*** matt_kosut has quit IRC13:52
*** yamamoto has joined #openstack-infra13:54
*** eharney has joined #openstack-infra13:56
fungicarli: looks like logstash has recovered now13:58
fungii started looking at logs, but didn't actually do anything13:58
*** lbragstad has joined #openstack-infra13:59
carliok, great then14:00
fungithe apache logs for the kibana interface were complaining that calls to the elasticsearch api endpoint (on elasticsearch03.openstack.org) were timing out. the elasticsearch logs on 03 were in turn complaining about timeouts getting data from 0714:00
fungiso i don't know for sure, but elasticsearch07 may have been temporarily unhappy. it logged servicing the queries it received from 03 but didn't mention any errors14:02
* fungi is baffled14:02
*** Lucas_Gray has quit IRC14:06
*** Lucas_Gray has joined #openstack-infra14:08
*** ykarel is now known as ykarel|away14:08
mordredfungi: well, that should keep excess noise down14:11
fungiyup, plenty of baffles for everyone14:12
mordredfungi: in other news, https://review.opendev.org/#/c/709253/ seems like the type of patch you'd enjoy14:14
Shrewsfor some definition of "enjoy"14:14
* mordred baffles Shrews14:15
AJaegerconfig-core, please review https://review.opendev.org/71023714:17
AJaegersmcginnis: could you send a patch to finish the retiring of the bdd plugin repo, please?14:17
*** yamamoto has quit IRC14:18
Shrewsmordred: 2 questions on that change: 1) why the need for 'become: yes' when it wasn't needed before?  2) Removing snapd seems unnecessary maybe? At least, if someone were to look at that role at a later date, they would question why that's being done.14:20
*** Lucas_Gray has quit IRC14:20
fungimordred: also i've left a couple of reminder comments in that change... we may want to think about how we might restrict google's key to be bound only to that package repository, and also how to tell apt to only consider specific packages available for installation from that special repository14:23
mordredfungi: good point14:24
*** Lucas_Gray has joined #openstack-infra14:24
mordredShrews: good point14:25
ShrewsI guess for #2 it's more about cleanup than anything on-going, but I can't think of a better way to do cleanup14:25
fungiwe've been a little lax about that in the past, but the second thing has been possible for years and some recent-ish improvements in apt are supposed to make the former possible now as well (but i don't recall how new, maybe too new still)14:25
mordredfungi: yeah - I think in this case the latter is easy and definitely should be done14:26
mordredShrews: yeah - for 2 it's mostly just cleanup - we only had snapd installed so that we could install that package14:27
mordreds/package/snap/14:27
mordredso if we're not using the snap anymore, seems good to cleanup after ourselves14:27
mordredShrews: that said - there's a follow-up patch to remove the removal - so we can run the removal once then land the cleanup and not confuse ourselves in the future14:27
Shrewsmordred: totes. just seems weird to have that cleanup in there after it's been executed once. i don't have a good solution for that though14:27
Shrewsmordred: oh, that works14:28
mordredme either - it's one of the biggest problems with git-driven ops - what to do with one-off commands. I think the answer is alknasdofnasoernfaoj14:28
Shrewsah, of course14:28
openstackgerritTobias Henkel proposed zuul/zuul master: Optimize canMerge using graphql  https://review.opendev.org/70983614:28
Shrewsmordred: also, sorry for missing the cleanup bit in the related review14:29
openstackgerritMerged openstack/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-full-bdd job  https://review.opendev.org/71006314:29
lennybHi, is https://review.opendev.org functioning ok? I am getting an 'early EOF index-pack failed' error http://paste.openstack.org/show/790075/14:30
Shrewslennyb: you should be cloning from opendev.org (https://opendev.org/openstack/nova.git)14:32
openstackgerritMonty Taylor proposed opendev/system-config master: Replace kubectl snap with apt repo  https://review.opendev.org/70925314:32
mordredfungi: like that ^^ ?14:32
lennybShrews, it's part of CI, so devstack should pull from the review server. Am I wrong?14:33
*** jamesmcarthur_ has quit IRC14:35
Shrewslennyb: where is that error being produced?14:35
mordredlennyb: no- in CI nothing should be cloning - zuul should be providing the needed git repos14:36
mordredlennyb: I agree with Shrews' question - where is this from?14:36
lennybShrews, during running devstack  #./stack.sh . I will remove GIT_BASE var in my local.conf. I am running third-party CI14:38
ShrewsGIT_BASE should be opendev.org14:39
openstackgerritSean McGinnis proposed openstack/project-config master: Retire devstack-plugin-bdd repo  https://review.opendev.org/71028014:40
*** jamesmcarthur has joined #openstack-infra14:40
lennybShrews, mordred. thanks.14:41
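A minimal sketch of the devstack setting Shrews refers to above, for third-party CI operators; the section name and value follow devstack's documented local.conf format and are assumptions, not something confirmed in channel:

    [[local|localrc]]
    # Clone OpenStack repositories from opendev.org rather than the Gerrit server
    GIT_BASE=https://opendev.org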
*** gagehugo has joined #openstack-infra14:47
*** jamesmcarthur has quit IRC14:48
openstackgerritJames E. Blair proposed opendev/base-jobs master: Add test docker image build jobs  https://review.opendev.org/71028314:55
corvusmordred, fungi: ^ there's a redo from yesterday14:56
mnaserjust a heads up14:56
mnaserhttps://www.githubstatus.com/14:56
*** udesale_ has quit IRC14:57
openstackgerritJames E. Blair proposed zuul/zuul master: Stream output from kubectl pods  https://review.opendev.org/70926115:02
*** Lucas_Gray has quit IRC15:05
*** ociuhandu has quit IRC15:08
*** Lucas_Gray has joined #openstack-infra15:09
*** mattw4 has joined #openstack-infra15:10
*** marios|ruck has joined #openstack-infra15:11
marios|rucko/ folks we are seeing a lot of RETRY_LIMIT in tripleo jobs is this a known issue http://tripleo-cockpit.usersys.redhat.com/d/9DmvErfZz/cockpit?orgId=1&fullscreen&panelId=6315:12
marios|ruckoops sorry bad url we have an upstream one15:13
marios|ruckthere http://dashboard-ci.tripleo.org/d/cockpit/cockpit?orgId=1&fullscreen&panelId=6315:13
*** ociuhandu has joined #openstack-infra15:14
openstackgerritTristan Cacqueray proposed zuul/zuul master: executor: blacklist dangerous ansible host vars  https://review.opendev.org/71028715:15
*** mattw4 has quit IRC15:17
*** goneri_ has joined #openstack-infra15:19
*** gyee has joined #openstack-infra15:21
*** goneri_ has quit IRC15:21
fricklermarios|ruck: looks like something is broken, but I cannot tell yet where. doesn't seem to affect tripleo only. infra-root: this looks bad http://zuul.openstack.org/builds?result=RETRY_LIMIT15:22
openstackgerritJames E. Blair proposed zuul/zuul master: Stream output from kubectl pods  https://review.opendev.org/70926115:23
openstackgerritJames E. Blair proposed zuul/zuul master: Add destructor for SshAgent  https://review.opendev.org/70960915:23
fricklernot sure all of it can just be github being broken15:23
corvusi'll track one of those down15:25
corvusRESULT_UNREACHABLE15:26
corvusthat happened while a job playbook was in progress15:26
corvuslet me see if i can find where the nodes were15:26
fricklerone job I was watching failed with that on rax-ord15:27
corvusrax-iad15:28
fricklerslightly related: shouldn't retries be capped at 3? seeing "5. attempt" on some patches in kolla gate, e.g. 710067,215:29
corvusfrickler: what job name?15:29
fricklercorvus: kolla-ansible-centos-source-upgrade15:30
marios|ruckthanks frickler15:30
corvusfrickler: http://zuul.openstack.org/job/kolla-ansible-base has increased the retry attempts to 515:31
corvusi'm not sure if we're seeing a cloud problem, or a zuul/zk/nodepool problem15:31
corvusoh this was a while ago wasn't it?15:34
* frickler needs to be homeward bound now, will try to check again later15:34
fricklercorvus: from the age of the gate queue, might have started about 12h ago, but likely still ongoing15:34
corvusit looks like zuul has lost zk connection 6 times today so far15:35
corvuspossibly due to memory pressure15:37
*** jamesmcarthur has joined #openstack-infra15:38
*** lmiccini has quit IRC15:38
corvusapparently starting on 2-25 at 15:00, which is right around when a zuul-scheduler full-reconfigure was triggered on zuul01 to troubleshoot lack of job matches on openstack/openstack-ansible-rabbitmq_server changes15:39
corvusbut it's also when i was poking around in the repl15:40
corvusthat seems a more likely cause; i may not have cleaned up sufficiently.  sorry about that.15:40
corvuswe should restart the zuul scheduler soon to fix15:40
funginote i also did another full-reconfigure later to get zuul to notice a gerrit branch deletion15:40
corvusi've often wondered if issuing "del" commands in the repl is necessary to clean up memory; i'm leaning toward "the answer is yes" after this15:41
*** chandankumar is now known as raukadah15:41
fungiseems it continues to apply implied branch matchers on a project if it had more than 1 branch but then drops to only 1 through deletion of other branches15:41
fungii've no idea if that should be considered a bug, and/or whether it's worth fixing15:41
fungifull-reconfigure got it to stop applying an implied branch matcher though15:42
corvussounds like a bug15:43
corvusi'd like to see if we can limp through merging 710287 to zuul so we can restart with that15:43
funginoted, i'm in meetings this morning but can be around to help with a restart15:44
*** marios|ruck is now known as marios|out15:50
openstackgerritJames E. Blair proposed zuul/zuul master: Stream output from kubectl pods  https://review.opendev.org/70926115:51
openstackgerritJames E. Blair proposed zuul/zuul master: Add destructor for SshAgent  https://review.opendev.org/70960915:51
*** sshnaidm_ is now known as sshnaidm15:58
*** priteau has joined #openstack-infra16:00
*** diablo_rojo has joined #openstack-infra16:00
*** apevec has joined #openstack-infra16:01
tristanCit seems like lots of jobs are failing in RETRY_LIMIT without logs, is there a known issue?16:02
fungitristanC: yep, we think it's zk fluttering due to scheduler memory pressure16:02
*** marios|out has quit IRC16:03
fungiwe're preparing to restart it as soon as 710287 (hopefully) lands16:03
apevecis that repeating every few days? https://twitter.com/openstackinfra/status/123149717826854502416:04
tristanCfungi: thanks, i didn't realize my irc client wasn't scrolled all the way down and missed the context :)16:04
*** jcoufal has joined #openstack-infra16:05
clarkbapevec: separate issues with common symptoms. The one you note was caused by a bug in zuul's git management iirc16:06
clarkbcorvus: fungi I've got appointment this morning but then will be arpund to help probably 2 hours from now16:06
*** factor has quit IRC16:08
fungiapevec: is what repeating? the retry_limit issue? not that i've seen16:08
*** factor has joined #openstack-infra16:08
*** pkopec has joined #openstack-infra16:08
*** ccamacho has quit IRC16:08
fungiit's just an acute thing which has been happening for the past few hours and will clear as soon as we can restart the scheduler to relieve memory pressure16:09
fungiapevec: the cause on sunday was a problem with some redirects, i believe16:11
*** factor has quit IRC16:11
fungiunrelated, just resulting in a similar symptom16:11
fungi(if you count "zuul gave up trying to rerun this job after x times" a symptom)16:12
*** factor has joined #openstack-infra16:13
apevecfungi, ack, thanks for info16:15
fungicorvus: 710287 isn't going to merge without a retry, zuul-tox-remote failed16:15
*** Lucas_Gray has quit IRC16:15
*** lmiccini has joined #openstack-infra16:15
*** Lucas_Gray has joined #openstack-infra16:16
*** mattw4 has joined #openstack-infra16:17
AJaegerconfig-core, please review https://review.opendev.org/710237 and https://review.opendev.org/710280 (I can approve once Zuul gets restarted)16:17
mordredfungi: I can't find anything about restricting which repos an apt-key can be used for - any thoughts of where I should look?16:20
*** lmiccini has quit IRC16:20
*** ociuhandu_ has joined #openstack-infra16:21
*** factor has quit IRC16:22
*** factor has joined #openstack-infra16:22
*** factor has quit IRC16:23
*** ociuhandu has quit IRC16:25
fungimordred: see signed-by in https://manpages.debian.org/experimental/apt/sources.list.5.en.html16:25
fungii'm checking to see if we have that yet16:25
*** ociuhandu_ has quit IRC16:26
fungilooks like it's in debian/buster at least16:26
fungiyeah, appears in a manpage on an ubuntu/bionic server as well16:27
*** tesseract has quit IRC16:27
fungi"option to require a repository to pass apt-secure(8) verification with a certain set of keys rather than all trusted keys apt has configured. It is specified as a list of absolute paths to keyring files (have to be accessible and readable for the _apt system user, so ensure everyone has read-permissions on the file) and fingerprints of keys to select from these keyrings"16:29
mordredfungi: k. so it looks like what we'd want to do is add signed-by={keyid} to the main apt sources16:29
mordredor - path to the debian keyring file16:29
mordredand ubuntu keyring file16:29
mordreds/and/or/16:29
mordredand import the key for google into its own keyring16:30
fungiyeah, the risk is in adding google's archive key into the apt trusted keyring rather than corralling it in its own keyring and telling apt to trust it for that one repository16:30
mordredmaybe just pinning the additional repo to specific packages is good enough? it's unlikely that google's key is going to sign things we get from upstream ubuntu anyway?16:31
fungilike i said, we haven't been especially careful about this in the past so i'm okay just punting on it for now, but we should keep it in mind16:31
*** carli has quit IRC16:32
mordred++16:32
fungibasically the model of "add any old third-party key to apt's trusted set" opens us up to the possibility that someone who has (or gains) control of that key could masquerade a repository of their own as an official distro package repository16:32
mordredfungi: I like the idea of starting to add pins for any additional repos we're adding16:32
mordredthat seems very managable - and helps us to audit and be aware of and document _why_ we're adding a repo - what we expect to install from it16:33
fungii mean, i mostly trust google's key to not be used that way (oh, wait, they stopped promising to "do no evil" right?) but some stuff we install from external package repos may be signed by keys which aren't as carefully guarded16:33
mordredyah16:34
mordredin system-config in ansible we currently add 4 repos16:35
mordred(including that kubectl patch)16:35
*** lmiccini has joined #openstack-infra16:35
*** sreejithp has joined #openstack-infra16:35
mordredone for docker, one for podman and one for openafs16:35
mordredI think we could add pins for all of them - and then say that as we continue to transition puppet stuff, when we put in apt_key or apt_repo tasks we should have corresponding pin files16:35
mordredshould make it manageable in general16:35
fungiright, it's a pattern i think we should consider switching to as we get time16:36
mordred++16:36
* mordred is about to go get some coffee, but can push up some pin patches for our existing repos when he gets back16:36
mordredinfra-root: ^^ scrollback with fungi and I worth perusing16:36
fungii mean, ultimately, you're still trusting whoever controls that package you're installing not to include some malicious maintscript which apt is going to run with root privs at install time16:37
fungiwhich is why i don't think it's critical that we do it16:37
*** kkalina has joined #openstack-infra16:37
mordredyah. I think the other thing that isn't so much _maliciousness_ but that's still a good idea is accidentally grabbing some other package that happens to be in the repo16:38
mordredthe google repo is a much better example for that than the others, since it's a general purpose repo and might have unrelated packages with newer versions16:38
clarkbyou would still fail right? or will apt fallback to a package it can validate?16:38
mordredwith the pinning, you'd be telling apt to use things from the main repos except for a specific set of packages from the additional repo16:39
mordredso - get docker from docker.io - but not libc16:39
mordredand then if docker decides to ship a docker package that needs a libc updated by them - that would fail, because the deps wouldn't resolve - but that's something we should know about and would want to fail :)16:40
clarkbgot it16:40
*** lpetrut has quit IRC16:40
corvus++16:41
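A rough sketch of the pattern fungi and mordred describe above: keep the third-party key in its own keyring, trust it only for that one repository, and pin so that only the expected package may come from it. The file paths, repository URL, and priority values are illustrative assumptions, not the configuration that was merged:

    # /etc/apt/sources.list.d/kubernetes.list -- trust only the dedicated keyring for this repo
    deb [signed-by=/usr/share/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main

    # /etc/apt/preferences.d/kubernetes.pref -- allow only kubectl from this origin
    Package: *
    Pin: origin apt.kubernetes.io
    Pin-Priority: -10

    Package: kubectl
    Pin: origin apt.kubernetes.io
    Pin-Priority: 500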
mordredcorvus: I'm going to dig in to this when I get back from coffee, but https://review.opendev.org/#/c/704582/ has a failed job that is not showing any logs16:42
mordredon https://zuul.opendev.org/t/openstack/build/e0d0273a851040c89f2d5e6050d0b58316:42
corvusmordred: it's the thing i dug into earlier; swapping killing zk sessions16:43
mordredah16:43
mordredgreat - then I won't dig in to it16:43
corvusi will restart the scheduler soon16:43
mordredneat16:43
fungicorvus: not sure if you saw me mention earlier, but 710287 isn't going to merge without a retry, zuul-tox-remote failed16:44
corvusfungi: yep, i just rejiggered the queue16:44
fungiaha, indeed i just looked back over at the status page16:44
*** dosaboy has quit IRC16:45
*** jpena is now known as jpena|brb16:45
*** smarcet has joined #openstack-infra16:45
*** jamesmcarthur has quit IRC16:45
*** jamesmcarthur has joined #openstack-infra16:46
*** electrofelix has joined #openstack-infra16:46
*** howell has joined #openstack-infra16:48
*** ijw has joined #openstack-infra16:49
*** ijw has quit IRC16:49
*** ijw has joined #openstack-infra16:49
*** ccamacho has joined #openstack-infra16:53
*** lucasagomes has quit IRC16:58
corvusfungi, mordred: can you review https://review.opendev.org/710283 when you have a second?  i'd like to try the provides/requires shift again today, but this time with test jobs17:01
fungilooking17:01
*** dosaboy has joined #openstack-infra17:01
fungicorvus: adding that role to playbooks/docker-image/pre.yaml is the only thing in there which looks possibly risky17:04
fungii guess the docker-image pre playbook is not heavily used?17:05
corvusfungi: it's a new file (doesn't exist now17:05
fungiahh, right, and any job would have to add that playbook explicitly to use it17:06
*** pgaxatte has quit IRC17:14
openstackgerritTristan Cacqueray proposed zuul/zuul master: executor: blacklist dangerous ansible host vars  https://review.opendev.org/71028717:16
smcginnisAnother job failure that looks like the redirect issue from yesterday - https://zuul.opendev.org/t/openstack/build/07930a201c224525b60b24814537609617:17
smcginnisDid that regex fix get applied?17:18
smcginnisfungi, ianw: ^17:18
fungismcginnis: i'll double-check, ianw switched the dns entry back to it last night17:19
smcginnisWe've had other releases this morning that have been fine.17:19
fungihttps://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt is redirecting to https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt and returning content for me17:19
fungii wonder if one of our backends isn't returning the right content17:21
*** njohnston is now known as neutron-fwaas17:21
*** neutron-fwaas is now known as njohnston17:22
*** rpittau is now known as rpittau|afk17:23
*** jpena|brb is now known as jpena17:23
*** Lucas_Gray has quit IRC17:24
fungithey're all returning the correct content for me17:25
fungii wonder if pip isn't doing sni and is getting the default vhost17:25
*** larainema has quit IRC17:26
fungithough i don't see any indication on the old server that we were explicitly setting one as the default17:27
corvusi am able to reproduce with curl17:28
*** Lucas_Gray has joined #openstack-infra17:28
AJaegersmcginnis: the regex fix merged17:28
smcginnisIt had been working so far today.17:28
AJaegerhttps://review.opendev.org/#/c/710151/1 is the regex fix17:29
smcginnisIncluding unit test coverage.17:29
corvussometimes when i run "curl -o - https://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt" i get redirected to The document has moved <a href="https://opendev.org/">here</a>.17:30
corvusother times i get The document has moved <a href="https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt">here</a>.17:30
smcginnisThat seems to be what this job got.17:30
smcginnisOr at least, some type of actual HTML page versus a redirect to the raw file.17:30
corvusthere isn't a load balancer involved, is there?17:31
AJaegerI wonder why that job does not have the requirements repo checked out - looks wrong to me...17:31
* AJaeger fixes17:32
smcginnisThat's what I was confused about last night.17:32
smcginnisI would have expected it to just use a local copy rather than trying to pull it down.17:32
fungiindeed, and yet that redirect is only served from one place17:32
corvusone apache worker process is old17:32
corvusit may have old config data17:33
fungiaha, yep, we've seen that with ssl certs too17:33
corvusi'll kick it17:33
fungisomething holds an open connection to a worker (or the worker doesn't realize a connection has died quietly) and so never terminates17:33
openstackgerritAndreas Jaeger proposed openstack/project-config master: Add requirements repo to publish-tox-docs-releases  https://review.opendev.org/71032217:34
AJaegersmcginnis, config-core, this should fix it ^17:34
*** evrardjp has quit IRC17:35
corvussmcginnis, fungi, AJaeger: i restarted apache on static01, and my highly scientific test of "run curl a bunch" is now returning a consistent redirect17:35
*** evrardjp has joined #openstack-infra17:35
smcginniscorvus, AJaeger: Thank you both!17:35
AJaegercorvus: great. Let's still merge 710322 to avoid this here...17:35
smcginnisThat would be much more efficient.17:35
corvusyes please17:36
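One way to reproduce the intermittent redirect corvus describes is to sample the Location header repeatedly so responses served by a stale Apache worker stand out; a quick diagnostic sketch, not the exact command that was run:

    for i in $(seq 1 20); do
      curl -sI https://git.openstack.org/cgit/openstack/requirements/plain/upper-constraints.txt \
        | grep -i '^location:'
    done | sort | uniq -c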
*** igordc has joined #openstack-infra17:39
smarcetfungi: afternoon i got a weird error here https://review.opendev.org/#/c/710128/17:40
*** priteau has quit IRC17:40
smarcetfungi: No such file or directory: 'docker': 'docker'",17:41
smarcetfungi: should i recheck17:41
smarcet?17:41
fungismarcet: yes, we temporarily broke the opendev-buildset-registry job on accident yesterday17:42
fungiit should be working now17:42
smarcetfungi: ack thx u17:42
*** lmiccini has quit IRC17:45
fungicorvus: so to revisit your earlier question, i guess technically the answer is yes, there is a load balancer involved. apache is balancing load across multiple worker processes17:47
fungii wonder if setting MaxRequestsPerChild to a nonzero value would solve that17:49
openstackgerritJames E. Blair proposed zuul/nodepool master: Fix GCE volume parameters  https://review.opendev.org/71032417:49
corvusfungi: i thought it had some (large) default17:50
fungihrm, it defaults to 1000017:50
fungiyeah17:50
fungimaybe that worker hadn't handled 10k requests since the config updated, or maybe that's just a suggestion but the worker will continue to be used for more requests if some connection is holding on17:51
corvusyeah, i don't know which, and i didn't hit the status url, so lost the data17:51
fungithat same server is hosting docs.openstack.org now, so i'd be surprised if all its workers hadn't gotten 10k requests since the fix merged17:52
clarkbmy time estimate was good. At my desk now17:54
clarkbfungi: corvus we saw similar stale workers with LE cert rotation on one of the mirrors17:54
clarkba global lifespan limit would probably be a reasonable idea17:54
corvus++17:55
fungiclarkb: yeah, i had been contemplating that as a workaround when we hit it with the stale cert17:57
fungias you suggested earlier though, another possible workaround is to switch back to using full restart instead of graceful restart, and just accept that we'll terminate running connections and possibly have a fraction of a second where connections are refused17:58
clarkbis the zuul memory issue something that still needs attention?17:59
corvusclarkb: yes, but i'm still hoping that 710287 lands soon18:00
clarkbok, let me know if I can help18:00
clarkbcatching up on email and scrollback otherwise18:00
corvusyeah, i think we're waiting either for that to land or for the system to go over the cliff18:01
*** igordc has quit IRC18:01
*** jcapitao is now known as jcapitao_off18:02
*** Lucas_Gray has quit IRC18:02
*** gfidente is now known as gfidente|afk18:04
*** igordc has joined #openstack-infra18:04
*** mattw4 has quit IRC18:05
*** mattw4 has joined #openstack-infra18:05
AJaeger710287 has another 30mins, let's keep fingers crossed ;)18:06
clarkbI've WIP'd https://review.opendev.org/#/c/703488/6 because its depends-on has merged but then I guess was reverted?18:09
AJaegerclarkb: yeah, was reverted - mnaser thought it needed different process18:10
*** Lucas_Gray has joined #openstack-infra18:10
AJaegerconfig-core, please review https://review.opendev.org/710237 and https://review.opendev.org/71028018:11
clarkbas confirmation /etc/apache2/mods-available/mpm_event.conf sets MaxConnectionsPerChild   018:14
fungioh!18:14
clarkband mpm event is what we have enabled18:14
fungii should have grepped, i was looking in conf-enabled18:14
*** amoralej is now known as amoralej|off18:14
funginote however that none of those conffiles are linked in mods-enabled18:15
fungioh, wait, they are18:15
fungigrep doesn't follow symlinks18:15
clarkbfungi: ya I had to ls for which mpm was enabled then follow that a bit manually to the config18:15
fungi`grep -i MaxConnectionsPerChild /etc/apache2/mods-enabled/*` does indeed turn up many hits18:16
fungiall setting to 0, which per the docs means indefinite/never expire18:16
fungiokay, so behavior explained there, i suppose18:16
clarkbmod-enabled conf files are loaded well before conf.d conf files so we should be able to drop a file in conf.d/ and override that value18:16
fungiagreed, that's how we've tuned similar values on etherpad18:17
fungiwith a /etc/apache2/conf-enabled/connection-tuning.conf file18:17
fungithough we set MaxRequestsPerChild 0 in etherpad's connection-tuning.conf too18:17
fungi(just as a point of reference)18:18
clarkbmaybe set that value to 8096?18:18
*** rishabhhpe has joined #openstack-infra18:19
clarkb(I don't know what a good balance between forking too much and not actually flushing workers is)18:19
rishabhhpe Hi All, after triggering a zuul job on a nodepool-spawned instance on which i installed devstack .. i am not able to run any openstack command. it says Failed to discover auth URL18:19
fungii suppose "reloading configuration won't take effect on workers held by long-running/defunct clients" is a good reason to reconsider that value18:19
clarkbI was hoping there was a time based rotation option but I don't see one18:19
fungiclarkb: luckily apache has a recommendation for that. docs say default is 10k18:19
fungiseems like as good a number as any18:20
clarkbworks for me18:20
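A minimal sketch of the override discussed above, following the connection-tuning.conf convention fungi mentions for etherpad; the filename and the value of 10000 (the Apache default cited earlier) are assumptions:

    # /etc/apache2/conf-enabled/connection-tuning.conf
    # Recycle workers periodically so a long-lived process cannot keep serving
    # stale configuration (certs, redirects) after a graceful reload.
    MaxConnectionsPerChild 10000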
fricklerclarkb: seems the revert of the revert now has the opendev change as dep, so instead of -W we should likely proceed with 703488? see https://review.opendev.org/71002018:20
fungirishabhhpe: odds are you're lacking a clouds.yaml configured for your devstack deployment18:20
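Two common ways to pick up credentials on a devstack host, relevant to fungi's suggestion above; the paths and cloud names are typical devstack defaults and are assumptions about this particular deployment:

    # Option 1: source the generated openrc for a user/project
    source /opt/stack/devstack/openrc demo demo
    openstack server list

    # Option 2: use the clouds.yaml devstack writes to /etc/openstack/
    export OS_CLOUD=devstack-admin
    openstack server list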
clarkbfungi: hrm I don't think we should land 703488 until we make the openstack change18:21
clarkbmaybe that was part of mnaser's concern?18:21
clarkber frickler ^ sorry bad tab complete18:21
fungiclarkb: i think part of mnaser's concern was that we *hadn't* yet merged our proposed governance18:22
clarkbfungi: right we can't do that until we remove the openstack conflict18:22
clarkbI'm not sure what it would mean to have a non openstack governed project in openstack18:22
fungiand he was reticent to see those repos removed from openstack until we have a clear governance confirmed18:22
clarkbbut to me that doesn't make sense18:22
clarkbdo we land a change that says "this is only in effect if openstack change lands?"18:23
mnaserI think moving forward with a governance is probably good even if we have a gap in time where the reflection of reality isn’t accurate18:23
fungiclarkb: ttx has suggested a possible workaround is for the tc to approve a resolution about it, but i'm not quite clear on what that would entail18:23
clarkbmnaser: I think we can move forward with it, but I don't think we can land it as being in effect18:24
mnaserYeah, I guess having more infra members voting on it too would be a good step too18:24
clarkbbecause that would create a massive conflict18:24
fungiyeah, we can't update our documentation to say it's opendev without it no longer being openstack18:24
clarkb(and I'm not sure how we'd resolve issues within the context of that conflict)18:24
mnaserAt the state when we merged it was 1 infra team vote at least18:24
fungimnaser: i think some of the earlier infra team votes on the openstack governance change got cleared by a new patchset, so not sure if those were also being counted18:25
fungidiablo_rojo had requested a minor adjustment to it18:25
*** jcoufal has quit IRC18:26
fungii can't remember if i got around to reapplying my vote after that18:26
clarkbmnaser: ok would it work to have quorum on https://review.opendev.org/#/c/703488/6 and have that depend on https://review.opendev.org/710020? Then TC can 710020 when sufficient quorum is reached on 703488, which allows us to land them close together but with appropriate ordering?18:26
mnaserFWIW I think if we considered OpenDev as an OIP within the OSF as step 1 then openstack tc would have a resolution saying that the shared infra bits are now being “passed onto and promoted to an OIP” and then the governance draft could start?18:26
openstackgerritMerged openstack/project-config master: Add requirements repo to publish-tox-docs-releases  https://review.opendev.org/71032218:26
clarkb*Then TC can land 71002018:26
clarkbfwiw I've been asking for feedback on this for about 12 weeks now18:27
clarkbAnd I've done my best to respond to all feedback that has been received18:27
fungimnaser: i think the osf would need to be very clear on the fact that opendev is not a project focused on producing and distributing software, and that we need to figure out what confirmation requirements would be under those circumstances18:27
clarkbso I was very surprised when finding out there were additional concerns18:27
fungimnaser: so far all open infrastructure projects are writing software, and the confirmation guidelines the board approved make assumptions related to that18:27
fungiwe'd almost certainly need some of those guidelines ignored or adjusted18:28
corvusi'm pretty sure the infra team members are in favor of clarkb's proposal; i think the lack of votes on that change is likely simply due to having weighed in very early and missed patchset updates18:28
corvus(if my vote was missing on the final rev, that's almost certainly why)18:29
clarkbto resolve the conflict I would update that change to have a warning block that says this only takes effect if 710020 lands. Then have a change to remove that once 710020 lands18:30
clarkbs/would/could/ I think I prefer having strict ordering via gerrit18:30
rishabhhpefungi: when i installed my master devstack .. i did not configure anything extra for the clouds.yaml file18:31
clarkbbut I don't think we should assert two different concurrent governance models as I expect that will only cause confusion18:31
*** mattw4 has quit IRC18:31
*** mattw4 has joined #openstack-infra18:32
clarkbre OSF subproject. I don't have objections to that either, but I agree with fungi that we may have to do some contortioning of rules to make that work (or otherwise update rules so that this applies)18:32
*** jamesmcarthur has quit IRC18:33
*** jamesmcarthur has joined #openstack-infra18:33
*** jpena is now known as jpena|off18:33
mnaserclarkb, corvus, fungi and others: would it make sense to maybe have some sort of call on the bridge we use to maybe iron out some of the discussion18:35
clarkbmnaser: perhaps. I think I'd like to understand why the multiple mailing list threads and gerrit reviews didn't solicit this feedback though18:35
fungimaybe... what clarification is missing? i'm not even clear on that ;)18:35
clarkb(also I've brought it up in most infra meetings over the last 3 months as well)18:36
mnaserI think I totally agree and I’m in favour of OpenDev.  I just don’t know where OpenDev sits post-split18:36
fungii guess what i'm unclear on is why it needs to sit18:36
fungiit's a community with a (forming) governance structure and defined goals18:36
mnaserbecause for things like being an infrastructure donor18:36
fungiwhat are you looking for specifically? some accountability fallback?18:37
AJaegermnaser: opendev will still be around, do we need to solve everything now - or what explicitly is blocking the change?18:37
fungijust trying to better understand what the precise concern is and what the reasonable solutions are which alleviate it18:37
*** mattw4 has quit IRC18:38
* AJaeger expected that the proposal gave us a way forward to solve any open questions...18:38
mnaserI would say when people came on and provided infrastructure to the infra team, they came in looking to provide resources to OpenStack (the project)18:38
mnaserNow they're providing it to OpenDev (and whatever projects we currently provide), it's a bit of a grey line already now but splitting it out completely..18:39
*** mattw4 has joined #openstack-infra18:39
fungiand people active in openstack worked out ways to leverage those resources to provide benefit to openstack, and those people who are involved in openstack are also being approached with resource donations to help other projects18:39
fungibut also those resources are ephemeral in nature, and there's no binding legal contract i'm aware of requiring people who donated resources previously to continue to do so if they don't wish to18:40
fungiplus, clarkb reached out to all the donors to make them aware of the plan we've been formulating for this and to invite them to provide feedback on it18:41
fungigranted it wasn't as solidified at the time, but at least some of them seem to have been following along18:42
clarkbGranted that was back in the march/aprilish timeframe last year so things may have changed18:42
clarkbI guess, my concern right now is I've practically begged people for feedback for 3 months18:42
clarkband I've done my best to address the feedback I have received18:42
clarkband I'm worried we'll be unable to satisfy concerns in general if ^ didn't work18:43
mnaserI agree that it sucks that the follow up hasn’t been right from my side with my concerns and I haven’t seen much movement on that review either18:43
fungii think most of the team provided feedback early in the process, and the proposal's been held open for the sake of anyone who didn't find time to review it yet18:45
fungia lot of the feedback occurred while it was still being drafted, before it hit gerrit18:45
clarkbIt would be helpful for me to hear what exactly people think the next steps are if not what I've proposed and worked at for 3 months18:45
fungiat ptgs, in irc and on the mailing list18:45
mnaseryeah.  I just want to kinda be upfront in that I’m so not against OpenDev, I just don’t want to rush through it so I felt like only 4 votes probably wasn’t enough.18:46
fungiand what landed in the gerrit change attempted to take all of that earlier feedback and discussion into account18:46
fungimnaser: 4 votes on which change? the openstack/governance change or the opendev/system-config change?18:47
mnaserGovernance18:47
* fungi is getting confused with indirect references and multiple changes18:47
fungithanks18:47
fungiand by 4 votes you mean 4 tc rollcall votes or 4 infra core reviewer votes?18:48
mnaserI understand your frustration clarkb and I really hate to be the one who brought up the revert :/ — but this is why I’m trying to personally take ownership and help drive it forward through the tc18:48
clarkbmnaser: right, but now I don't know what I can do to move this forward18:48
clarkbbecause everything I have done hasn't solicited the necessary input18:48
mnaserI think I’m counting the roll calls only off the top of my head. I’m on mobile though so it’s memory based18:48
clarkbcorvus: fungi it looks like that zuul change is hitting the reset problems too18:49
clarkbcorvus: fungi should we consider landing it more forcefully or perhaps after a restart (and plan for a second followup restart)18:50
clarkbI don't know if that change has made it through check testing? I'm assuming not?18:50
fungii'm in favor of directly submitting it in gerrit and making sure it's checked out and installed on the scheduler before we restart18:50
clarkb(so we don't have a full set of test data)18:51
fungibut maybe after it reports and we get to double-check the builds18:51
mnaserclarkb: I will draft up an update to the change with a very small resolution, update the mailing list and personally chase reviews myself and I hope that gets us bigger visibility overall.18:53
clarkbmnaser: thanks. I'm happy to help, but some direction would be useful18:54
mnaserclarkb: cool, sorry for hitting the back button on your progress but I understand the frustration18:56
fungistatus notice Memory pressure on zuul.opendev.org is causing connection timeouts resulting in POST_FAILURE and RETRY_LIMIT results for some jobs since around 06:00 UTC today; we will be restarting the scheduler shortly to relieve the problem, and will follow up with another notice once running changes are reenqueued.19:03
fungiinfra-root: ^ does that look reasonable to send?19:03
clarkbfungi: yes, though maybe lets decide on the plan for the zuul bugfix in case we need to restart twice?19:04
clarkbfwiw I think I'm willing to risk force merging that change, if it fails we can restart on HEAD~119:04
clarkbthen sort out the problem with a happier zuul19:04
fungiwell, i wasn't exactly going to say how many times we're restarting, so we can do twice if we want19:05
clarkbfair enough.19:05
clarkbmordred: looks like we have an review.o*.org cert now \o/19:05
clarkbmordred: has any work been done yet to consume that? If not I can probably give that a go19:06
mordredclarkb: woot!19:06
mordredand no - I mean - other than the ansible work19:06
mordredclarkb: we could consider planning the ansible rollout - to my knowledge we're not really waiting on anything else so could probably get 2.13 in container in a month's time19:07
fricklerfungi: +119:07
*** rlandy is now known as rlandy|brb19:07
clarkbmordred: we're happy with where review-dev is then with webservers and all that?19:08
*** ralonsoh has quit IRC19:08
mordredclarkb: yeah - I think so - it's got the redirects working with the certs19:09
fungi#status notice Memory pressure on zuul.opendev.org is causing connection timeouts resulting in POST_FAILURE and RETRY_LIMIT results for some jobs since around 06:00 UTC today; we will be restarting the scheduler shortly to relieve the problem, and will follow up with another notice once running changes are reenqueued.19:09
openstackstatusfungi: sending notice19:09
mordredclarkb: we might want to think about how manage-projects is working19:09
roman_gHello team. What could be the way to see real reasons behind NODE_FAILURE and RETRY_LIMIT errors for jobs? https://zuul.opendev.org/t/openstack/builds?job_name=airship-airshipctl-gate-test19:10
mordredclarkb: actually - let me take a look at that for a few minutes - but I think we might be better served by banging that out instead of puppeting the LE stuff (since we're close anyway)19:10
clarkbroman_g: today we've had zookeeper connection problems with zuul leading to the retry limits. Your node failures last I checked were due to the new cloud being unable to boot the requested resources19:11
-openstackstatus- NOTICE: Memory pressure on zuul.opendev.org is causing connection timeouts resulting in POST_FAILURE and RETRY_LIMIT results for some jobs since around 06:00 UTC today; we will be restarting the scheduler shortly to relieve the problem, and will follow up with another notice once running changes are reenqueued.19:11
clarkbroman_g: that was what I sent email about to you and robert and jan-erik19:11
fungithough i suppose it's also possible for intermittent zookeeper connection flaps to result in NODE_FAILURE results too, right?19:12
clarkbroman_g: I can check logs really quickly to see if that issue persists with the NODE_FAILURES19:12
roman_gclarkb: thank you. When did you send it?19:12
*** jamesmcarthur has quit IRC19:12
clarkbroman_g: february 10, 202019:12
clarkbfungi: yes I think that is possible if the zk connection dies when processing the node request19:12
roman_gclarkb: thanks. Reaching back to Robert then.19:12
clarkb(but not 100% sure of that)19:13
clarkbroman_g: let me double check logs really quickly just to rule out the other existing issue19:13
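As an aside on roman_g's question about digging into failures: the Zuul REST API can at least list the failed builds for a job, though NODE_FAILURE results usually carry no logs, so the underlying cause still has to come from the launcher logs as clarkb describes. A rough example query (job name as linked above; endpoint and parameters per the Zuul builds API):

    curl -s 'https://zuul.opendev.org/api/tenant/openstack/builds?job_name=airship-airshipctl-gate-test&result=NODE_FAILURE&limit=10' \
      | python3 -m json.tool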
prometheanfiredoes a scheduler restart mean we will need to recheck?19:13
fungiprometheanfire: if the builds have already reported in a change then yes, i plan to mention that in the next notice following the restart19:13
fungiif the change is still enqueued at the time of the restart, we'll be reenqueuing it19:13
corvusfungi, clarkb: is someone restarting zuul or should i?19:14
prometheanfirekk19:14
prometheanfireI thought it might be a common question19:14
fungicorvus: we were just discussing whether we should submit that change in gerrit first and then make sure it's checked out and installed on zuul.o.o19:14
fungido you have any input?19:15
corvusfungi: nah, let's restart zuul on master then let that merge and later restart executors19:15
clarkbroman_g: looking at logs I see at least one successful node request for the larger nodes so robert and jan-erik may have solved the problem19:15
corvusfungi: it has not passed a full test suite yet19:15
clarkbroman_g: might be best to wait for us to solve the zookeeper connection issue (should be fixed shortly) then rerun and see what we end up with19:15
fungiokay, i can get to work on the scheduler restart now in that case19:15
corvusfungi: ok, all yours, thanks19:15
clarkbcorvus: fungi ++ let me know if I can help19:15
*** jcapitao_off has quit IRC19:15
mordred++19:16
*** rishabhhpe has quit IRC19:16
roman_gclarkb: thank you.19:16
fungiusing `python /opt/zuul/tools/zuul-changes.py http://zuul.opendev.org >queue.sh` as root on zuul.o.o to take a snapshot of the current pipelines19:16
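For reference, the queue.sh that zuul-changes.py writes out is roughly a list of zuul enqueue commands, one per change that was sitting in a pipeline; the projects and change numbers below are illustrative, not taken from this restart:

    zuul enqueue --tenant openstack --trigger gerrit \
         --pipeline gate --project openstack/nova --change 710123,4
    zuul enqueue --tenant openstack --trigger gerrit \
         --pipeline check --project openstack/neutron --change 710456,2

Replaying that file with bash once the scheduler is back up re-adds each change to the pipeline it was dequeued from, which is what happens further down.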
fungistopping the scheduler daemon now19:17
fungidebug log says it's "Stopping Gerrit Connection/Watchers"19:18
fungihow long is that expected to usually take?19:18
clarkbfungi: I think no more than a few minutes19:18
fungiokay, it's only been about 1.5 minutes so far. i'll give it a couple more19:18
*** rlandy|brb is now known as rlandy19:19
fungiwe're now at the 3 minute mark since it logged that19:20
clarkbis the process still running?19:20
fungiyes19:20
fungioh, nope19:20
fungiit was19:20
clarkbya I think you are good to start it again19:20
fungibut i guess that's the last thing it logged19:20
*** electrofelix has quit IRC19:20
fungiyeah starting the service now19:20
clarkbthen pause for configs to load before reenqueuing jobs19:20
corvusmay need to restart zuul-web after it returns19:20
fungi2020-02-27 19:24:11,004 INFO zuul.Scheduler: Full reconfiguration complete (duration: 195.238 seconds)19:26
fungii guess i can start enqueuing things now19:26
clarkb++19:26
fungii've restarted, even stopped and started, zuul-web but it's still complaining about the api being unavailable19:27
fungihrm, it's not running19:28
fungistale pidfile i think19:28
fungithere it goes19:28
clarkbya new process straces with activity19:29
clarkband web ui loads for me19:29
fungi`bash -x queue.sh` is running as root now19:29
fungistatus notice The scheduler for zuul.opendev.org has been restarted; any changes which were in queues at the time of the restart have been reenqueued automatically, but any changes whose jobs failed with a RETRY_LIMIT, POST_FAILURE or NODE_FAILURE build result in the past 14 hours should be manually rechecked for fresh results19:29
fungihow does that ^ look for once the reenqueuing is done?19:30
corvus++19:30
AJaegerLGTM19:30
smcginnis+119:30
clarkbfungi: lgtm19:31
*** ahosam has joined #openstack-infra19:35
openstackgerritMerged opendev/base-jobs master: Add test docker image build jobs  https://review.opendev.org/71028319:36
*** jamesmcarthur has joined #openstack-infra19:40
fungireenqueuing is completed, sending notice now19:43
fungi#status notice The scheduler for zuul.opendev.org has been restarted; any changes which were in queues at the time of the restart have been reenqueued automatically, but any changes whose jobs failed with a RETRY_LIMIT, POST_FAILURE or NODE_FAILURE build result in the past 14 hours should be manually rechecked for fresh results19:43
openstackstatusfungi: sending notice19:43
*** eharney has quit IRC19:43
-openstackstatus- NOTICE: The scheduler for zuul.opendev.org has been restarted; any changes which were in queues at the time of the restart have been reenqueued automatically, but any changes whose jobs failed with a RETRY_LIMIT, POST_FAILURE or NODE_FAILURE build result in the past 14 hours should be manually rechecked for fresh results19:44
openstackstatusfungi: finished sending notice19:46
clarkbthe zuul change has all of the jobs that can start running now19:47
clarkbI guess we are stable now? http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=64792&rra_id=all looks happy19:48
* clarkb finds lunch19:48
*** ociuhandu has joined #openstack-infra19:48
clarkbroman_g: ^ if you are still around can you give your airship jobs another attempt?19:49
*** gyee has quit IRC19:49
clarkbroman_g: I think we'll get cleaner data on those now and then we can cross check logs if still failing19:49
*** gyee has joined #openstack-infra19:49
*** Lucas_Gray has quit IRC19:52
openstackgerritMonty Taylor proposed opendev/system-config master: Replace kubectl snap with apt repo  https://review.opendev.org/70925319:58
openstackgerritMerged zuul/nodepool master: Fix GCE volume parameters  https://review.opendev.org/71032420:02
*** nicolasbock has quit IRC20:07
*** ijw has quit IRC20:23
*** kozhukalov has quit IRC20:27
openstackgerritMerged opendev/system-config master: OpenStackId v3.0.4 Deployment  https://review.opendev.org/71012820:28
*** kozhukalov has joined #openstack-infra20:28
fricklerAJaeger: dirk: DNS for opensuse.de/com seems borked, is this something you can influence or have contacts for? http://paste.openstack.org/show/790085/20:38
ianwfungi / clarkb: if you wouldn't mind looking in on https://review.opendev.org/#/c/710160/ it should be ready; implements the redirects in apache20:47
fungii'm headed out for an early dinner but will take a look once i get back20:48
*** ijw has joined #openstack-infra20:48
ianwthanks; and double thanks to you and corvus for looking into the redirect issues from scrollback!20:48
*** Lucas_Gray has joined #openstack-infra20:49
cmurphyfrickler: they're looking into it20:52
*** ijw has quit IRC20:53
AJaegerthanks, cmurphy !20:53
*** ijw has joined #openstack-infra20:53
*** ociuhandu has quit IRC20:53
AJaegerconfig-core, please review https://review.opendev.org/710237 and https://review.opendev.org/71028020:53
*** ijw has quit IRC20:54
*** ijw has joined #openstack-infra20:54
*** jamesmcarthur has quit IRC20:54
AJaegerthanks, ianw20:57
*** kkalina has quit IRC21:01
*** kozhukalov has quit IRC21:08
openstackgerritMerged openstack/project-config master: fix stackviz publishing  https://review.opendev.org/71023721:08
openstackgerritMerged openstack/project-config master: Retire devstack-plugin-bdd repo  https://review.opendev.org/71028021:12
*** smarcet has quit IRC21:14
*** Lucas_Gray has quit IRC21:14
*** slaweq has quit IRC21:15
*** jcapitao_off has joined #openstack-infra21:17
*** Goneri has quit IRC21:19
*** Lucas_Gray has joined #openstack-infra21:19
*** aarents has quit IRC21:23
*** rosmaita has quit IRC21:29
*** mattw4 has quit IRC21:30
*** mattw4 has joined #openstack-infra21:31
*** rosmaita has joined #openstack-infra21:31
*** Lucas_Gray has quit IRC21:38
*** kozhukalov has joined #openstack-infra21:40
*** pkopec has quit IRC21:42
*** dpawlik has quit IRC21:43
*** rcernin has joined #openstack-infra21:44
jrosseri'm seeing a few NODE_FAILURE like here https://review.opendev.org/#/c/709795/21:48
*** xek__ has quit IRC21:53
ianwjrosser: hrm, only centos-7 or random?21:53
*** Goneri has joined #openstack-infra21:53
*** jamesmcarthur has joined #openstack-infra21:54
jrosserianw: i'm going to hazard a guess at centos721:54
jrosserthis has jobs in progress but a centos-7 one has failed https://review.opendev.org/#/c/708097/21:54
ianw2020-02-27 18:03:43,246 DEBUG nodepool.driver.NodeRequestHandler[nl01-8524-PoolWorker.rax-ord-main]: Accepting node request 200-000766029121:57
ianw2020-02-27 21:56:47,855 INFO nodepool.driver.NodeRequestHandler[nl01-8524-PoolWorker.rax-ord-main]: Node request 200-0007660291 disappeared21:57
ianwi've never seen this combo before21:57
jrosserfound a 3rd centos-7 node failure on this too https://review.opendev.org/#/c/710256/21:58
ianwthere is a lot of these messages for rax21:58
*** smarcet has joined #openstack-infra21:59
clarkbianw: yup looking22:00
clarkbianw: jrosser that is due to the memory leak22:01
clarkbwhen the zuul scheduler runs out of memory its zk connection dies. That then causes the znodes that require a connection to be valid to go away22:01
clarkbzuul was restarted around 2000UTC to address this22:01
*** smarcet has quit IRC22:02
openstackgerritMerged zuul/zuul master: executor: blacklist dangerous ansible host vars  https://review.opendev.org/71028722:02
clarkbif that issue is persisting we may need to look in further22:02
jrosseri think these might all have been rechecked after that22:03
ianwclarkb: tailing the logs right now on nl, i'm seeing the disconnect errors22:03
ianwnl0122:03
ianwzuul.openstack.org doesn't seem under any particular memory or cpu pressure, however22:04
clarkbI wonder if some other issue is precipitating the connections failures post restart (and possible before too)22:05
clarkbcorvus: fungi ^ fyi22:05
ianwthe launcher might need a restart too?22:06
clarkbya or maybe zk is unhappy?22:06
ianwthe last message on zk01 at least is from 2018-11-30 22:42:38,18522:07
corvusthe scheduler has not lost a zk connection since the restart22:07
clarkbcacti shows healthy looking zk fwiw22:08
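For anyone following along, a quick way to sanity-check ZooKeeper itself is its four-letter admin commands; zk01.openstack.org is assumed here as the server mentioned above:

    echo ruok | nc zk01.openstack.org 2181   # a healthy server answers "imok"
    echo stat | nc zk01.openstack.org 2181   # client connections, latency, leader/follower mode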
*** jamesmcarthur has quit IRC22:08
corvusianw: that pair of log lines spans the restart i think22:08
clarkbya was just going to mention that22:08
clarkbpossible that we didn't time out the node requests quickly for some reason (perhaps zuul scheduler shutdown didn't properly FIN those tcp connections so had to wait for keepalive?)22:09
ianwcorvus: probably, sorry that was just a random pull.  i am seeing a lot of kazoo.exceptions.NoNodeError just tailing right now22:09
ianwif, however, they relate to something that started before the restart, not sure22:09
corvusrestart was at 19:1522:09
fungiokay, pizza consumed, catching up22:10
corvusis nodepool not cleaning up disappearing requests?22:10
corvusbecause that request,  200-0007660291, keeps showing up in nl01's logs22:11
*** slaweq has joined #openstack-infra22:11
corvusShrews: ^ fyi22:11
ianwok, might be a red herring http://paste.openstack.org/show/790088/22:11
ianwsome nodes have thousands of the same thing22:11
*** jamesmcarthur has joined #openstack-infra22:11
corvusthis does seem like it may be a nodepool bug22:12
corvusif the request disappeared, the launcher should clean it up22:12
Shrewshrm, that's never been an issue before22:12
*** stevebaker has quit IRC22:13
corvusShrews: yeah, it seems weird to me; it's not like we haven't had the scheduler bomb out before22:13
Shrewscorvus: the rax-ord-main thread is paused, so i don't think it can handle cleanup until it unpauses22:14
*** slaweq has quit IRC22:15
Shrewsthey seem to be clearing out?22:16
ShrewsNode request 200-0007660291 disappeared22:16
corvusyeah, that happened 2 hours ago and it's still trying to write to it22:16
corvuser 3 hours even22:16
Shrewsoh yeah22:17
clarkbianw: AJaeger can you check my comments on https://review.opendev.org/#/c/710160/7 I think most can be addressed in a followup but the qa one should probably be fixed early22:17
Shrewsmy buffer got weird22:17
clarkb(in which case fixing all of them is probably smart)22:17
ianwclarkb: ajaeger did do a follow up -> https://review.opendev.org/#/c/710195/222:18
*** slaweq has joined #openstack-infra22:18
corvusShrews: oh, it's the storeNode call that's failing22:18
clarkbianw: oh cool22:18
corvusShrews: so this is happening because not only has the node request disappeared, but so has at least one of the underlying nodes we already assigned to it22:18
clarkbthey'll land together then so less concern about caching bad redirects.22:18
clarkbianw: should I approve the parent change or would you prefer to?22:19
corvusShrews: let's move to #zuul22:19
ianwclarkb: i guess qa.openstack.org is already broken anyway22:19
*** ijw has quit IRC22:19
ianwclarkb: please approve; i've made all the _acme-challenge CNAMEs so it should deploy.  i can update dns when that's active22:19
gmannianw: clarkb there is no qa.o.o yet. so leaving that as it is or redirecting to QA wiki is fine.22:20
clarkbianw: done22:20
ianwgmann: i just copied whatever was currently being done for this transition :)  ajaeger has a change up that modified it @ https://review.opendev.org/#/c/710195/2/playbooks/roles/static/files/50-qa.openstack.org.conf22:22
ianwi'm sure he won't mind if you want to edit that to point to, whatever22:22
*** slaweq has quit IRC22:22
clarkbcorvus: ianw: sounds like restarting the launcher is the expected short term fix22:25
clarkbshould I do that or is someone planning to already?22:25
gmannianw: ok. we thought of building a qa doc site some time back but never got the time to do it. we can update it once we build one.22:25
corvusclarkb: i'm not, go for it22:26
clarkbok I'll restart all 422:27
fungigmann: or once you build a qa doc site you can just use whatever the proper url is for it. would be really nice to be able to retire some of these old vanity domains22:27
clarkbnodepool==3.11.1.dev34  # git sha 5d37a0a is what nl01 has been restarted on22:28
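A rough sketch of that restart-and-verify step, assuming the launcher is installed system-wide and runs under a service named nodepool-launcher (both assumptions; adjust for the actual deployment):

    sudo service nodepool-launcher restart
    pip3 freeze | grep nodepool    # e.g. nodepool==3.11.1.dev34  # git sha 5d37a0a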
fungii tried at one point and then some folks freaked out over a few of them (they used to be hosted on the wiki server, of all places)22:28
clarkbI'm going to watch it for a few minutes before doing the other 322:28
fungithanks clarkb!22:28
*** eharney has joined #openstack-infra22:29
clarkbwe are at quota in a couple regions so there is some noise from that but not seeing any hard errors yet22:30
gmannfungi: yeah, qa.o.o was never the thing. it is fine to just kill it.22:32
fungiianw: ^22:32
clarkbI'm following node 0014870455 which has been building for 3 monites22:32
*** Goneri has quit IRC22:32
clarkbif that one goes ready/in-use I'll see that as the all clear to restart the other three22:32
clarkband it just went ready22:33
clarkbproceeding with nl02-nl04 now22:33
openstackgerritDavid Shrewsbury proposed zuul/nodepool master: Fix for clearing assigned nodes that have vanished  https://review.opendev.org/71034322:33
clarkb#status log Restarted nodepool launcher on nl01-nl04 to clear out state related to a deleted znode. Launchers now running nodepool==3.11.1.dev34  # git sha 5d37a0a22:35
openstackstatusclarkb: finished logging22:35
fungiis a monites more like a month or more like a minute?22:38
fungiahh, from context i'm guessing minute22:38
ianwfungi: filed  38883 to deal with that later22:39
ianwhttps://storyboard.openstack.org/#!/story/200659822:39
fungithanks, i'm a ways down an openstack vmt tunnel at the moment22:42
clarkbfungi: minute22:42
fungiyeah, i worked it out eventually, sorry for the noise22:43
ianweverything that files.openstack.org is serving is now served by static.opendev.org.  i'm going to switch the files.openstack.org CNAME to static.opendev.org and complete the transition.  the server will still run as files02.openstack.org for now22:43
ianwhang on, let me double check the serveralias for files.openstack.org is working on static.opendev.org22:44
*** ijw has joined #openstack-infra22:47
ianwhrm the static.opendev.org cert doesn't seem to cover files.openstack.org for some reason :/22:47
ianwit picked up the change ... [Wed Feb 26 02:21:15 UTC 2020] Multi domain='DNS:static.opendev.org,DNS:static01.opendev.org,DNS:files.openstack.org,DNS:static.openstack.org22:49
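One way to check which names the certificate actually being served covers (host names as discussed above; -servername only sets SNI):

    echo | openssl s_client -connect static.opendev.org:443 -servername files.openstack.org 2>/dev/null \
      | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'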
*** jamesmcarthur has quit IRC22:49
fungiis there a cap on the number of domain names in a le cert?22:49
ianwi think it's like 10022:50
*** sreejithp has quit IRC22:50
clarkband they encourage more names per cert (as it reduces their api overhead)22:50
*** tkajinam has joined #openstack-infra22:51
*** tkajinam has quit IRC22:51
*** tkajinam has joined #openstack-infra22:51
*** ijw has quit IRC22:52
*** ijw has joined #openstack-infra22:52
*** ahosam has quit IRC22:54
mordredianw: did the same thing happen as with review-dev - we already had an old one and didn't get a new one?22:55
ianwmordred: yeah, i'm starting to think this is a bug in acme.sh and its manual dns update mode22:56
mordredianw: if a cert is on disk and the only thing that changes is the addition of a host to the altname list - it erroneously does not get a new cert22:56
mordredthat would be the description of the bug, yeah?22:56
ianwi think it runs, gets the TXT records to commit and updates its db/config file as if it got the new domains22:57
ianwwhen really the validation step hasn't been done22:57
mordredyeah22:57
*** owalsh has quit IRC22:58
ianwwe could write out a .stamp file where we force a renewal when we know an update has happened22:58
*** mattw4 has quit IRC23:02
tristanCdear openstack-infra folks, would it be possible to have a fedora-31 mirror in afs? if so, what would be the place to propose such an addition?23:03
mordredianw: yay makefile tricks!23:04
clarkbtristanC: it is possible, but as I've mentioned to others I think we need to start with f31 images23:04
clarkbtristanC: and for that to happen we need to figure out how to manage dnf's new rpm package format (new builders or image element that runs in container fs context or something) ianw likely has better thoughts on that than me though23:05
*** mattw4 has joined #openstack-infra23:06
ianwfor now i just want to get a containerised nb deployed, and that will "fix" it for now, as the tools will be recent enough23:06
mordredianw: are we blocked on that for anything (other than time)?23:07
mordredianw: and would it be useful for me to help?23:07
ianwmordred: just me doing it.  i *think* i've sorted out all the issues; that's why i spent a bunch of time getting the nodepool container jobs working23:07
mordredok. cool23:08
ianwso in theory, it's just dropping it on a host23:08
ianw... in theory ... :)23:08
mordredI'm happy to page stuff in and help out if you need23:08
mordredianw: but good to know we should be fixed again by our new container overlords23:09
*** slaweq has joined #openstack-infra23:11
ianwhttps://github.com/acmesh-official/acme.sh/issues/2763 filed a bug on acme.sh23:12
*** ociuhandu has joined #openstack-infra23:12
ianwi'd prefer not to run a fork23:12
mordredI agree23:12
*** slaweq has quit IRC23:16
*** owalsh has joined #openstack-infra23:16
clarkbianw: looks like you can pass a --force23:18
ianwclarkb: yeah, the problem is knowing when to pass it :)23:18
clarkbya23:18
clarkbcan we infer that from the txt record var state?23:18
*** ociuhandu_ has joined #openstack-infra23:19
clarkbbasically if this cert has a txt record in ansible then we also need to --force it23:19
ianwyeah, i think we can do something like that23:20
ianwi'm trying to think why i didn't already ... maybe i just missed it23:20
*** ociuhandu_ has quit IRC23:21
clarkbpretty sure we don't have txt records at steady state when we don't need to renew (based on log data)23:21
clarkbbut I think that is one risk, we could renew every hour and then not be able to renew anymore due to rate limiting23:21
ianw  when: acme_txt_required | length > 023:21
*** aarents has joined #openstack-infra23:21
*** ociuhandu has quit IRC23:22
ianwi think that's what i intended to happen23:22
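A minimal sketch of that idea outside of Ansible: force a reissue only when a wanted name is missing from the certificate already on disk. The cert path and domain here are hypothetical, and this is not the handler as actually implemented:

    # hypothetical cert path; only force a renewal if the new altname is absent
    if ! openssl x509 -in /etc/letsencrypt-certs/static.opendev.org/static.opendev.org.cer \
           -noout -text | grep -q 'DNS:files.openstack.org'; then
        acme.sh --renew -d static.opendev.org --force
    fi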
*** stevebaker has joined #openstack-infra23:24
*** jamesmcarthur has joined #openstack-infra23:25
clarkbianw: corvus if you have a moment can you check my comment on https://review.opendev.org/#/c/709236/3 specifically the note about simply publishing the stats data to zuul swift logs to start23:27
clarkbdo you think that is worthy of a new patchset ?23:27
ianwclarkb: i don't think you have to.  with the periodic job working now, i think we have many options23:33
ianwfor example, we can not install anything on the static server, but copy the latest logs to a nodepool node, and run the analysis tool on it from there?23:33
ianwthat way we never run the risk of screwing something up on the static server23:34
clarkbianw: that is an interesting idea. I think in general not moving the data is a good thing but maybe we can do a `ssh static cat *.log | goaccess -` sort of thing23:34
clarkbthat way the logs never end up on disk until they've been sanitized23:34
fungifrom a pii safety perspective, the fewer copies of those files reside on additional systems, the better23:34
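Something along these lines would keep the raw logs off the analysis node entirely; the host, log path and output file are assumptions, and --anonymize-ip speaks to the PII point fungi raises:

    ssh static01.opendev.org 'cat /var/log/apache2/docs.openstack.org_access.log' \
      | goaccess --log-format=COMBINED --anonymize-ip -o report.html -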
*** kozhukalov has quit IRC23:35
ianwyeah, a network stream sounds good.  also less overhead, if goaccess pins the cpu for a few minutes * a few sites periodically, that could be annoying23:37
openstackgerritMerged zuul/nodepool master: Fix for clearing assigned nodes that have vanished  https://review.opendev.org/71034323:38
ianwalthough it looks like it's designed to not do that23:38
clarkbinfra-root ^ I'll try to remember to restart launchers again once puppet updates them with that change23:38
clarkbianw: you can pass it a stdin stream iirc23:39
clarkbthey just don't document it well /me looks again23:39
clarkbianw: see https://goaccess.io/man#examples23:40
ianwyeah, i was looking at the incremental thing with the db ... in theory you could take the db from the last run from a zuul artifact and feed it into the next run?23:42
ianw... oh, although maybe the db isn't sanitised, so can't really publish it23:42
clarkbianw: ya I haven't checked the db. I think to start we can just not use the db23:43
ianw... we could encrypt it with the system-config key.  then decrypt it in the job23:43
ianwso many options :)23:43
clarkbianw: we'll get a new 30 day window (or whatever our rotation is) each periodic run23:43
clarkbthat isn't a regression for the 404 accounting case (we only ever looked at what was present and didn't tally over time beyond that)23:43
openstackgerritGhanshyam Mann proposed openstack/hacking master: DNM: testing nova fix  https://review.opendev.org/71034923:48
*** jamesdenton has quit IRC23:49
*** jamesdenton has joined #openstack-infra23:50
*** dchen has joined #openstack-infra23:53
*** dychen has joined #openstack-infra23:55
*** dychen has quit IRC23:56
*** stevebaker has quit IRC23:59
