Monday, 2019-08-19

*** rcernin has joined #openstack-infra00:02
*** exsdev has quit IRC00:09
*** exsdev has joined #openstack-infra00:10
mnaserinfra-root: is there anyone around that can get me a hold on 677024 ?  it's timing out in gate but works fine for me locally00:12
ianwmnaser: looking ...00:12
ianwmnaser: kue-integration-1-node ?00:14
mnaserianw: sure, any of the two works00:15
mnaserthey both timeout :)00:15
ianw| opendev | opendev.org/vexxhost/kue | kue-integration-1-node | refs/changes/24/677024/.* |   1   | mnaser: timeouts |00:17
mnaserianw: awesome, lemme try again00:17
clarkbmnaser: I'm not sure that zuul can ssh as zuul on any of the nodes00:18
mnaserclarkb: why not? it's just a normal nodepool vm?00:19
clarkbbecause the executor can ssh as zuul but not the test nodes00:19
mnaseroh i see what you mean00:19
mnaserhmm00:19
mnaseri tried to iterate with installing this on the executor00:20
mnaserbut i guess i dont have sudo so i have to do a user install00:20
openstackgerritIan Wienand proposed opendev/system-config master: Convert nested bridge.o.o ARA report to static HTML  https://review.opendev.org/67709600:29
clarkbmnaser: ya you can't install software on the executor00:29
mnasermaybe a user install might be the best idea00:29
clarkbit is pretty limited to reading/writing files to the scratch space and making http requests externally00:29
clarkb(to prevent breakouts)00:29
mnasercould a python user install properly work? or not worth trying?00:30
clarkbno00:30
clarkbnot without being a trusted job00:30
mnaserok so i have to generate and distribute ssh keys to make this work?00:30
clarkbthe issue is in executing arbitrary code not user vs system installs00:30
clarkbyes, that is what multinode devstack jobs do for example00:31
mnaserjust curious how openstack multinode jobs currently do that, is there a role or nah?00:31
clarkbthat I don't know.00:31
clarkbwith the zuulv2 jobs devstack-gate did it I think00:31
clarkbso there may be a role in devstack now that does it for zuulv3 jobs00:32
clarkbhttps://opendev.org/openstack/devstack/src/branch/master/roles/orchestrate-devstack/tasks/main.yaml#L9-L13 I think that is the role you want00:32
mnaseroh awesome00:33
clarkblives at https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/copy-build-sshkey00:33
*** zhurong has quit IRC00:37
*** yamamoto has joined #openstack-infra00:39
clarkbianw: logan- quick check of a centos host in limestone shows it has a private ipv4 addr configured. That issue must not be a 100% failure then. IPv4 is configured via dhcp according to config drive network_data.json and the sysconfig file glean wrote for eth0 on that host00:41
clarkbthat doesn't help narrow down what may be happening there very much, but I think we need to check that dhcp is working and that glean is properly writing out that config every time (maybe we have another NM race ?)00:42
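A minimal sketch of the config drive half of that check, assuming the usual OpenStack `network_data.json` layout (a top-level `networks` list whose entries carry `type` and `link`). The function name and the sample document are hypothetical, and this is not glean's code.

```python
import json


def dhcp_v4_links(network_data):
    """Return the link ids that network_data.json marks for IPv4 DHCP."""
    return [
        net.get("link")
        for net in network_data.get("networks", [])
        if net.get("type") == "ipv4_dhcp"
    ]


# Hypothetical sample, shaped like a config drive network_data.json.
SAMPLE = json.loads("""
{
  "links": [
    {"id": "tap0", "type": "ovs", "ethernet_mac_address": "fa:16:3e:00:00:01"}
  ],
  "networks": [
    {"id": "network0", "type": "ipv4_dhcp", "link": "tap0"},
    {"id": "network1", "type": "ipv6", "link": "tap0", "ip_address": "2001:db8::10"}
  ],
  "services": []
}
""")

print(dhcp_v4_links(SAMPLE))
```

On a held node, the links this returns could be compared against the sysconfig file written for eth0 to see whether the DHCP setting was carried over.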
fungii guess we'll need to hold a node for whatever tripleo job is consistently hitting it00:42
ianwclarkb: umm, i'm guessing i should scroll backwards ... doing that now :)00:42
fungior hold a bunch of nodes and recheck-spam until we get a held one in ls00:43
fungidepending on how likely the job is to also fail for unrelated reasons00:43
clarkbianw: tripleo and kolla have found that some jobs running on centos 7 on limestone don't have ipv4 addrs configured (they are the private only addr in that cloud)00:44
ianwright, so only limestone as far as we know?00:44
ianwthere is also a comment in backscroll that logstash results stop about the time we moved to swift logs00:44
clarkbipv6 is working so job runs there then fails because they need working ipv4 to talk between test nodes in multinode setup ? and ya I think only limestone so far. Possibly because if it happens elsewhere nodepool sees that as a failed boot instead00:44
fungithose have been the only example so far, yeah00:45
ianwok, first just checking on logstash before i query ... logs from at least 2019-08-19T10:45:20.175+10:0000:49
clarkbre logstash I think swift broke it00:49
clarkbthat is the other thing on my list but I don't want to get sucked into that until tomorrow :P00:49
fungirather, our switch to storing job logs in swift00:49
clarkbya00:49
* fungi doesn't blame swift itself at all00:50
clarkbmy guess is that either swift doesn't like our old severity filter parameter on the requests or the volume of logs without that filter being active is just too high and causing it to fall over or maybe even both00:50
clarkbbut there's a good chance it doesn't like the severity parameter and fixing that will then cause it to fall over due to volume (and we'll need to add aggressive filters to replace those we had)00:51
ianwoh right, that was from the os-loganalyze middleware right?00:52
clarkbyup00:52
*** prometheanfire has quit IRC00:52
ianwwhich is now zuul javascript?00:52
clarkbya00:52
clarkbI think what we can do is have logstash's first rule be drop anything that is a debug log00:53
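A minimal sketch of that kind of drop rule, done client-side in Python before lines are shipped. It assumes oslo.log-style lines (timestamp, optional pid, then the level); the regex and function name are hypothetical and this is not the log processor's actual filter.

```python
import re

# Assumed line shape: "2019-08-19 00:53:01.123 4242 DEBUG nova.compute ..."
DEBUG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)? (?:\d+ )?DEBUG\b"
)


def drop_debug(lines):
    """Yield every line whose severity field is not DEBUG."""
    for line in lines:
        if DEBUG_LINE.match(line):
            continue
        yield line


sample = [
    "2019-08-19 00:53:01.123 4242 DEBUG nova.compute [-] noisy detail",
    "2019-08-19 00:53:02.456 4242 ERROR nova.compute [-] something broke",
    "no timestamp here, just mentions DEBUG in passing",
]
for kept in drop_debug(sample):
    print(kept)
```

Only lines that start with a timestamp followed by DEBUG are dropped, so continuation lines of a multi-line debug message would still pass through; a real filter would need to track that state.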
*** prometheanfire has joined #openstack-infra00:56
ianwfungi / clarkb: if you have a quick second for https://review.opendev.org/#/c/677096 to fix ara-reports in the nested system-config jobs ... i'm trying to show them to upstream testinfra as a sales pitch for opendev.org but it's a bit sucky to say "yeah, it's great but we just have to fix this one bug" :)00:57
clarkbianw: are we able to run ara from an untrusted job like that on the executor?01:00
ianwclarkb: well, the results are there; i think because it's all already installed01:01
ianwara is probably a special case01:01
fungilooks surprisingly simple01:02
clarkbhttps://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_96/677096/2/check/system-config-run-base/f8a7f40/bridge.openstack.org/ara-report/ ya that seems to have worked01:02
fungiand yeah, likely relying on our preinstalled ara on the executors01:02
clarkbright but it isn't just being preinstalled, we don't allow you to run commands either?01:02
ianwtrusted roles can?  an ara-report role is in the base jobs01:04
clarkboh that must be it01:04
clarkbthe role itself is trusted01:04
ianwthanks; the first 3rd-party test failed with a timeout, another thing i know we're looking into ... https://github.com/philpep/testinfra/pull/482#issuecomment-52227943401:07
ianwanyway, i can do some re-running now01:07
*** spsurya has joined #openstack-infra01:09
*** yamamoto has quit IRC01:11
*** yamamoto has joined #openstack-infra01:11
*** markvoelker has joined #openstack-infra01:20
*** zhurong has joined #openstack-infra01:22
openstackgerritMerged opendev/system-config master: Convert nested bridge.o.o ARA report to static HTML  https://review.opendev.org/67709601:25
*** markvoelker has quit IRC01:25
*** pkopec has quit IRC01:25
mnaserianw: i think the hold worked, but it might have failed in a different way..01:31
mnaseri have a job currently stuck here again after doing the ssh key distribution01:31
*** redrobot has quit IRC02:23
*** Guest90568 has joined #openstack-infra02:29
*** Guest90568 is now known as redrobot02:32
*** jamesmcarthur has joined #openstack-infra02:37
*** bhavikdbavishi has joined #openstack-infra02:41
*** bhavikdbavishi1 has joined #openstack-infra02:44
*** bhavikdbavishi has quit IRC02:45
*** bhavikdbavishi1 is now known as bhavikdbavishi02:45
*** ramishra has joined #openstack-infra02:45
openstackgerritIan Wienand proposed opendev/puppet-log_processor master: log-gearman-worker: handle deflate encoded values  https://review.opendev.org/67710402:49
openstackgerritIan Wienand proposed opendev/puppet-log_processor master: log-gearman-worker: Remove jenkins streaming workaround  https://review.opendev.org/67710502:49
ianwclarkb: ^ https://review.opendev.org/#/c/677104/1 note dropped a comment on why i think rax is returning deflated data despite us not asking for it ... let me see if i can replicate with curl maybe02:52
ianwooohh, you know this is more likely to be that we've switched to https, isn't it ...02:54
ianwno, so it's not just rax ... i guess that we put the data in as deflate encoded, and then that's how swift serves it back03:00
*** diablo_rojo has joined #openstack-infra03:00
ianwroles/upload-logs-swift/library/zuul_swift_upload.py:                headers['content-encoding'] = 'deflate'03:01
ianwyeah, ok, mystery solved03:01
ianwlet me update the comment ...03:01
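A minimal sketch of handling such a response on the consumer side, assuming the body and its Content-Encoding header value have already been fetched. The function name is hypothetical and this is not the code from the proposed change; the raw-deflate fallback is an extra assumption for servers that omit the zlib wrapper.

```python
import zlib
from typing import Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo a 'deflate' Content-Encoding; pass anything else through."""
    if (content_encoding or "").strip().lower() != "deflate":
        return body
    try:
        # HTTP "deflate" is normally a zlib-wrapped stream.
        return zlib.decompress(body)
    except zlib.error:
        # Fallback (assumption): a raw deflate stream with no zlib header.
        return zlib.decompress(body, -zlib.MAX_WBITS)


compressed = zlib.compress(b"2019-08-19 03:01:00 INFO example log line\n")
print(decode_body(compressed, "deflate"))
```

The result is still bytes; whether it then needs decoding as UTF-8 depends on what the caller does with it next.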
openstackgerritIan Wienand proposed opendev/puppet-log_processor master: log-gearman-worker: handle deflate encoded values  https://review.opendev.org/67710403:07
openstackgerritIan Wienand proposed opendev/puppet-log_processor master: log-gearman-worker: Remove jenkins streaming workaround  https://review.opendev.org/67710503:07
*** exsdev has quit IRC03:10
*** ricolin has joined #openstack-infra03:10
clarkbianw: do you need to decode the zlib decompress output as utf8 too?03:19
clarkbin 10403:19
*** odicha has joined #openstack-infra03:23
*** exsdev has joined #openstack-infra03:24
*** jamesmcarthur has quit IRC03:33
ianwclarkb: no, i don't think so, i remove that in the follow-on03:35
ianwanyway, as you say i think we need to filter the logs too, but this is i think step one in at least getting the logs :)03:35
*** ricolin has quit IRC03:44
*** dave-mccowan has quit IRC03:46
*** dklyle has joined #openstack-infra03:46
*** jamesmcarthur has joined #openstack-infra03:47
clarkb++03:49
*** ykarel has joined #openstack-infra03:51
*** dklyle has quit IRC03:52
*** jamesmcarthur has quit IRC04:00
*** jamesmcarthur has joined #openstack-infra04:00
AJaegerconfig-core, please review a change to test whether we can remove the logurl that is wrong now with swift: https://review.opendev.org/676755 - and also please review these two for preparing to use promote pipeline: https://review.opendev.org/#/c/676624/ and https://review.opendev.org/#/c/67663004:02
*** yamamoto has quit IRC04:15
*** yamamoto has joined #openstack-infra04:15
*** jhesketh has quit IRC04:18
*** jhesketh has joined #openstack-infra04:19
*** jamesmcarthur has quit IRC04:24
*** jamesmcarthur has joined #openstack-infra04:26
*** diablo_rojo has quit IRC04:28
*** udesale has joined #openstack-infra04:30
*** jamesmcarthur has quit IRC04:37
*** jamesmcarthur has joined #openstack-infra04:44
*** jamesmcarthur has quit IRC05:00
*** dpawlik has joined #openstack-infra05:00
*** dchen has quit IRC05:00
*** jamesmcarthur has joined #openstack-infra05:05
*** dchen has joined #openstack-infra05:05
*** trident has quit IRC05:06
*** ykarel has quit IRC05:09
*** ykarel has joined #openstack-infra05:10
*** ykarel is now known as ykarel|afk05:14
*** ociuhandu has joined #openstack-infra05:15
*** trident has joined #openstack-infra05:15
*** ricolin has joined #openstack-infra05:16
*** kopecmartin|off is now known as kopecmartin05:18
*** trident has quit IRC05:21
*** markvoelker has joined #openstack-infra05:23
*** raukadah is now known as chkumar|ruck05:23
*** janki has joined #openstack-infra05:24
*** markvoelker has quit IRC05:27
*** trident has joined #openstack-infra05:27
*** dychen has joined #openstack-infra05:29
*** dychen has quit IRC05:31
*** dychen has joined #openstack-infra05:32
*** dchen has quit IRC05:32
*** ociuhandu has quit IRC05:36
*** jamesmcarthur has quit IRC05:38
*** jamesmcarthur has joined #openstack-infra05:42
*** AJaeger has quit IRC05:42
*** yikun has quit IRC05:43
*** jamesmcarthur has quit IRC05:44
*** AJaeger has joined #openstack-infra05:45
AJaegerianw, frickler, could you review, please? ^05:46
*** ykarel|afk is now known as ykarel05:49
*** jaosorior has joined #openstack-infra05:51
*** dingyichen has joined #openstack-infra05:54
*** dychen has quit IRC05:57
*** ociuhandu has joined #openstack-infra06:02
openstackgerritFelix Schmidt proposed zuul/zuul master: Make direct-push configurable on project-level  https://review.opendev.org/67710906:02
openstackgerritFelix Schmidt proposed zuul/zuul master: Implement push job in merger  https://review.opendev.org/67711006:02
openstackgerritFelix Schmidt proposed zuul/zuul master: Push changes in GerritReporter if direct-push is enabled  https://review.opendev.org/67711106:02
*** jbadiapa has joined #openstack-infra06:05
*** n-saito has joined #openstack-infra06:12
*** dychen has joined #openstack-infra06:12
AJaegerthanks, ianw !06:14
*** dingyichen has quit IRC06:15
*** yamamoto has quit IRC06:15
*** ykarel is now known as ykarel|afk06:21
openstackgerritMerged openstack/openstack-zuul-jobs master: Use opendev-tox-docs for api jobs  https://review.opendev.org/67662406:22
openstackgerritMerged openstack/project-config master: Add promote api-ref/guide jobs  https://review.opendev.org/67663006:23
AJaegerconfig-core, please review https://review.opendev.org/677091 to keep base-test in sync for opendev/base-jobs06:32
*** yamamoto has joined #openstack-infra06:34
*** threestrands has quit IRC06:34
*** threestrands has joined #openstack-infra06:35
*** e0ne has joined #openstack-infra06:37
*** e0ne has quit IRC06:38
*** ykarel|afk is now known as ykarel06:38
*** kjackal has joined #openstack-infra06:39
*** dychen has quit IRC06:40
*** dchen has joined #openstack-infra06:41
*** dchen has quit IRC06:43
openstackgerritMerged opendev/base-jobs master: Remove log_url from emit-job-header (base-test)  https://review.opendev.org/67675506:43
*** dchen has joined #openstack-infra06:48
*** pkopec has joined #openstack-infra06:51
*** ociuhandu has quit IRC06:53
openstackgerritMerged opendev/base-jobs master: Sync base-test jobs  https://review.opendev.org/67709106:58
*** jtomasek has joined #openstack-infra06:58
*** ociuhandu has joined #openstack-infra07:00
*** udesale has quit IRC07:05
*** udesale has joined #openstack-infra07:06
*** dchen has quit IRC07:14
*** jaosorior has quit IRC07:14
openstackgerritIan Wienand proposed opendev/puppet-log_processor master: Debug stripping: remove obsolete GET filter, add local filter  https://review.opendev.org/67712207:15
openstackgerritIan Wienand proposed opendev/puppet-log_processor master: log-gearman-worker: remove obsolete GET debug filter, add local filter  https://review.opendev.org/67712207:15
*** ociuhandu has quit IRC07:16
*** ykarel is now known as ykarel|afk07:16
*** rcernin has quit IRC07:17
*** apetrich has joined #openstack-infra07:17
*** dchen has joined #openstack-infra07:18
*** bhavikdbavishi has quit IRC07:19
*** bhavikdbavishi has joined #openstack-infra07:19
*** threestrands has quit IRC07:19
ianwclarkb: ^ so i think that should be roughly what's required to get logs ingested again ... lightly tested, on logstash-worker01 i have  /home/ianw/ls-test/test.py which has extracted most of the core bits for testing, but it's not our best example of CI based development :)07:21
*** jtomasek has quit IRC07:21
*** ykarel|afk has quit IRC07:23
*** dpawlik has quit IRC07:34
*** rpittau|afk is now known as rpittau07:35
*** takamatsu has joined #openstack-infra07:37
*** yolanda has quit IRC07:42
*** yolanda__ has joined #openstack-infra07:43
*** udesale has quit IRC07:43
*** jpena|off is now known as jpena07:44
*** udesale has joined #openstack-infra07:44
*** ykarel|afk has joined #openstack-infra07:46
*** ykarel|afk is now known as ykarel07:46
*** lucasagomes has joined #openstack-infra07:48
openstackgerritAndreas Jaeger proposed opendev/bindep master: Update openSUSE testing  https://review.opendev.org/67713307:48
*** xarses_ has quit IRC07:53
*** andreww has joined #openstack-infra07:54
*** andreww has quit IRC07:54
*** andreww has joined #openstack-infra07:54
*** dougsz has joined #openstack-infra08:02
*** jaosorior has joined #openstack-infra08:05
*** trident has quit IRC08:11
*** xenos76 has joined #openstack-infra08:15
*** tkajinam has quit IRC08:18
*** trident has joined #openstack-infra08:19
AJaegerHere're two reviews for bindep, please: https://review.opendev.org/#/c/667533/5 and https://review.opendev.org/#/c/667614/08:23
*** ykarel is now known as ykarel|afk08:23
AJaegerfrickler, ianw, could either of you review these, please? ^08:24
*** dchen has quit IRC08:24
*** yolanda__ is now known as yolanda08:28
*** adriant has quit IRC08:29
*** iokiwi has quit IRC08:29
*** adriant has joined #openstack-infra08:31
*** iokiwi has joined #openstack-infra08:31
openstackgerritFelix Schmidt proposed zuul/zuul master: Implement push job in merger  https://review.opendev.org/67711008:32
openstackgerritFelix Schmidt proposed zuul/zuul master: Push changes in GerritReporter if direct-push is enabled  https://review.opendev.org/67711108:32
*** noorul has joined #openstack-infra08:33
*** noorul has quit IRC08:34
*** derekh has joined #openstack-infra08:34
*** ociuhandu has joined #openstack-infra08:37
sshnaidmfolks, seems like limestone completely down with centos, network issues fail all jobs08:38
sshnaidmcan we exclude it somehow from jobs?08:39
*** kjackal has quit IRC08:41
*** ociuhandu has quit IRC08:41
*** e0ne has joined #openstack-infra08:42
*** kjackal has joined #openstack-infra08:43
*** kjackal has quit IRC08:47
*** kjackal has joined #openstack-infra08:49
*** ykarel|afk is now known as ykarel08:51
AJaegersshnaidm: could you help debugging so that it gets fixed? What has changed that it now fails?08:55
sshnaidmAJaeger, according to fungi and donnyd it's problem " centos-7 images there are not correctly obtaining their ipv4 address configuration", not sure how I can help debug this08:56
*** xenos76 has quit IRC09:00
*** dpawlik has joined #openstack-infra09:03
*** roman_g has joined #openstack-infra09:03
*** janki has quit IRC09:17
*** yamamoto has quit IRC09:21
*** n-saito has left #openstack-infra09:21
*** yamamoto has joined #openstack-infra09:23
*** yamamoto has quit IRC09:24
*** yamamoto has joined #openstack-infra09:31
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openSUSE 42.3  https://review.opendev.org/67715809:33
*** markvoelker has joined #openstack-infra09:35
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Remove openSUSE 42.3  https://review.opendev.org/67715809:39
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Add legacy-opensuse-15 nodeset  https://review.opendev.org/67716209:40
*** markvoelker has quit IRC09:40
openstackgerritKartik Sharma proposed openstack/os-performance-tools master: Remove the incorrect bug tracker link  https://review.opendev.org/67716309:50
*** apetrich has quit IRC09:56
chkumar|ruckAJaeger: Is it possible to hold the node from tripleo project patch running on limestone and see what is happening there?10:07
AJaegeryou need an infra-root for this, I don't have those permissions. I expect somebody can help you later10:09
chkumar|ruckAJaeger: thanks!10:09
*** ociuhandu has joined #openstack-infra10:15
*** dougsz has quit IRC10:17
*** ociuhandu has quit IRC10:20
*** kjackal has quit IRC10:28
yoctozeptohey infra, Zuul did something odd to one of our checks10:29
yoctozeptoplease see http://zuul.openstack.org/status10:29
yoctozeptofor 67714410:29
yoctozeptoit queued up jobs that should not run10:29
yoctozeptothey were not in queue at the beginning10:30
*** Wasaac has quit IRC10:30
yoctozeptoit must've happened in the meantime10:30
*** Wasaac has joined #openstack-infra10:31
*** dougsz has joined #openstack-infra10:35
*** ralonsoh has joined #openstack-infra10:37
*** elod is now known as elod_off10:47
yoctozeptoafter a little digging10:51
yoctozeptoit looks like this merged in the meantime: https://review.opendev.org/66663410:52
yoctozeptocould it reset the job in kolla to check all optional jobs?10:52
yoctozepto(even if should not be matched)10:52
yoctozeptolooks like a bug in Zuul10:52
donnydThere is not an issue at FN with centos. I posted links showing that it was functional yesterday10:55
*** xenos76 has joined #openstack-infra10:57
AJaegeryoctozepto: what exactly is the problem? Which job should not run and is run?11:00
yoctozeptoAJaeger: a bunch: e.g. kolla-ansible-ubuntu-source-ironic, kolla-ansible-centos-source-cinder-lvm11:02
yoctozeptothey appeared later than the other, proper ones did11:03
yoctozeptothe only change merged during that time was the one I linked just above11:03
AJaegeryoctozepto: no direct idea. Please see my -1 on the change and adjust, you should never have non-voting jobs in gate queue...11:04
AJaegerbetter uncomment the lines...11:04
*** dave-mccowan has joined #openstack-infra11:06
*** dchen has joined #openstack-infra11:06
yoctozeptoAJaeger: yeah, will switch to commenting out now because ooo has their CI broken11:07
yoctozeptoso no point in running that at all11:07
yoctozeptojust waiting for Zuul to finish11:07
*** xenos76 has quit IRC11:07
yoctozeptoso that I have a proof it ran too many checks... but why reconfig of kolla-ansible would cause this is beyond me :-)11:08
AJaegeryoctozepto: hope somebody else has an idea.11:09
AJaegerYou could add "debug: true" to the check pipeline configuration, that shows after the run why jobs were run or not run.11:10
AJaeger(maybe only why not run)11:10
*** gfidente has joined #openstack-infra11:10
yoctozeptoAJaeger: well, they got added later, not during init, that's suspicious :D11:12
yoctozeptofirst time I saw such behavior11:12
AJaegerweird11:13
*** dchen has quit IRC11:13
*** xenos76 has joined #openstack-infra11:14
*** udesale has quit IRC11:15
*** tesseract has joined #openstack-infra11:15
*** dougsz has quit IRC11:17
*** kjackal has joined #openstack-infra11:27
dirkAJaeger: evrardjp: what is the urgency of the 42.3 job removal? we should still have a 42.3 nodepool image?11:32
frickleryoctozepto: 2019-08-19 10:04:33,230 DEBUG zuul.layout: [e: d383ebbbfc1248a3b59af867de2eaba2] The configuration of job <Job kolla-ansible-ubuntu-source-ironic branches: {MatchAny:{BranchMatcher:master}} source: opendev/base-jobs/zuul.d/jobs.yaml@master#25> is changed by <Change 0x7f185d716908 openstack/kolla 677144,1>; ignoring file matcher11:33
AJaegerdirk: it hasn't been building for two months according to Shrews (see backscroll)11:34
fricklerso it seems that indeed the reconfiguration triggered by 666634 causes new jobs to be added because the file matcher is ignored in this special situation11:34
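A toy model of the behaviour in that debug line, assuming a job is selected either because its file matchers hit or because its configuration is considered changed by the change under test. The function and parameter names are hypothetical; this is not Zuul's implementation.

```python
import re


def job_should_run(file_patterns, changed_files, job_config_changed):
    """Decide whether a job is added for a change (simplified model)."""
    if job_config_changed:
        # Mirrors "ignoring file matcher": a changed job definition always runs.
        return True
    if not file_patterns:
        # No file matcher configured, so the job always applies.
        return True
    return any(
        re.search(pattern, path)
        for pattern in file_patterns
        for path in changed_files
    )


patterns = [r"^ironic/"]            # hypothetical matcher
files = ["docker/base/Dockerfile"]  # hypothetical changed file
print(job_should_run(patterns, files, job_config_changed=False))
print(job_should_run(patterns, files, job_config_changed=True))
```

In this model a reconfiguration that flips the "config changed" flag is enough to add a job that the file matcher alone would have skipped, which matches what yoctozepto saw.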
yoctozeptoit should not touch check queue as it is independent11:35
yoctozeptofeels weird to me11:35
yoctozeptothanks for confirming though11:35
yoctozeptomakes me sound less insane ;D11:35
*** apetrich has joined #openstack-infra11:36
*** markvoelker has joined #openstack-infra11:36
dirkAJaeger: ok, yeah, the dreaded systemd-logger/rsyslog thing. which is still a bug in leap 15.* as well11:37
dirkAJaeger: how about we simply fix that one?11:37
*** jpena is now known as jpena|lunch11:39
AJaegerdirk, evrardjp, yes, that's an option as well. Question still remains what to do with openSUSE 42.3: Should it be removed from master and replaced with 15? And what about old branches where we have experimental jobs sometimes?11:40
aspiersis it just me or is gitea agonisingly slow?11:41
aspiersit's taking 10-20s per page load, e.g. https://opendev.org/openstack/nova/11:41
*** markvoelker has quit IRC11:41
AJaegeraspiers: nova is too large - gitea does some git operations and that slows it down...11:41
aspiersI wonder which git operations11:41
aspierssounds like a performance bug11:42
AJaegeraspiers: I don't remember - it was discussed in backscroll sometimes here, so please check logs - or source code ;)11:42
*** yamamoto has quit IRC11:43
*** rlandy has joined #openstack-infra11:47
*** rlandy is now known as rlandy|rover11:48
aspiersshame that gitea doesn't support URLs with shortened SHA1s11:49
*** chkumar|ruck is now known as chkumar|rover11:49
*** rlandy|rover is now known as rlandy|ruck11:49
*** janki has joined #openstack-infra11:50
*** dougsz has joined #openstack-infra11:55
openstackgerritDirk Mueller proposed openstack/diskimage-builder master: zypper-minimal: install without recommends  https://review.opendev.org/67718811:56
dirkAJaeger: I would replace 42.3 with 15 jobs, what to do on stable branches is a good question..11:57
dirkAJaeger: I can see reasons for removing the jobs as well there (or just keeping them around)11:57
*** markvoelker has joined #openstack-infra11:57
dirkAJaeger: ^^ above should fix the 42.3 problem11:57
*** tdasilva has joined #openstack-infra11:59
*** weshay_pto is now known as weshay12:00
*** rh-jelabarre has joined #openstack-infra12:04
*** yamamoto has joined #openstack-infra12:04
*** ykarel is now known as ykarel|afk12:05
*** yamamoto has quit IRC12:05
*** yamamoto has joined #openstack-infra12:06
*** ociuhandu has joined #openstack-infra12:07
*** ociuhandu has quit IRC12:11
*** jaosorior has quit IRC12:14
*** lucasagomes has quit IRC12:17
*** lucasagomes has joined #openstack-infra12:18
*** apetrich has quit IRC12:20
*** yamamoto has quit IRC12:24
*** xenos76 has quit IRC12:27
*** apetrich has joined #openstack-infra12:28
*** ykarel|afk is now known as ykarel12:28
*** rfolco has joined #openstack-infra12:28
*** odicha has quit IRC12:29
*** yamamoto has joined #openstack-infra12:30
*** jpena|lunch is now known as jpena12:31
*** yamamoto has quit IRC12:35
AJaegerthanks, dirk12:36
ykarelfatal: unable to access 'https://github.com/voxpupuli/puppet-git_resource/': Failed to connect to 140.82.114.3: Network is unreachable12:38
*** jaosorior has joined #openstack-infra12:38
ykarelseeing ^^ in centos7 jobs running in limestone-regioneone12:39
ykarelhttps://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_92/677192/2/check/puppet-openstack-integration-5-scenario001-tempest-centos-7-luminous/7140e5e/job-output.txt12:39
ykarelis the issue already known?12:39
AJaegerykarel: looks like known IPv4 issue with CentOS in limestone12:40
ykarelAJaeger, hmm i see ipv6 address in zuul inventory, http://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_92/677192/2/check/puppet-openstack-integration-5-scenario001-tempest-centos-7-luminous/7140e5e/zuul-info/inventory.yaml12:42
*** spsurya has quit IRC12:43
*** xenos76 has joined #openstack-infra12:44
*** ykarel is now known as ykarel|away12:47
openstackgerritMerged openstack/openstack-zuul-jobs master: Switch to promote jobs for api-ref/-guide  https://review.opendev.org/67663112:47
efriedAJaeger: Thanks for the fixups yesterday12:52
*** ykarel|away has quit IRC12:52
AJaegerefried: now to get it merged ;)12:53
*** yamamoto has joined #openstack-infra12:53
efriedYeah, I think I saw it was still failing, but didn't look into it at all.12:53
AJaegerefried: you can recheck it now since the blacklist change is in - can't you?12:54
AJaegerit showed exactly that blacklisting is needed ;)12:54
efriedAJaeger: Oh, that's weird, I had it based on top of that change before specifically to prove that it would alert us to the problem, but pass once the problem was fixed.12:54
efriedWhen you patched it up, did you rebase it to master?12:54
AJaegerefried: no, didn't rebase on purpose ;(12:55
efriedhm, no, looks like I ripped it out at PS3, not even sure how I did that.12:55
AJaeger;)12:55
efriedokay, so yeah, I'll recheck it now that kombu is quarantined.12:55
AJaegercool12:55
efriedit's possible I edited it from my phone; who knows what happens when I do that :P12:56
*** mriedem has joined #openstack-infra12:57
efriedhm, so I don't need to cite the ensure-python role at all? I admit, I really don't understand how this stuff works.12:57
openstackgerritSorin Sbarnea proposed opendev/bindep master: Expose base python version as an atom  https://review.opendev.org/63995112:58
AJaegerefried: variables are global and not specific to a role12:58
*** dougsz has quit IRC12:59
efriedso I guess that role must be specified somewhere in the job's ancestry.12:59
AJaegeryes13:00
efriedcool13:00
*** dougsz has joined #openstack-infra13:03
*** eharney has joined #openstack-infra13:07
*** janki has quit IRC13:20
*** beekneemech is now known as bnemec13:26
*** dklyle has joined #openstack-infra13:29
*** ociuhandu has joined #openstack-infra13:30
*** ociuhandu has quit IRC13:34
*** xenos76 has quit IRC13:34
*** ykarel|away has joined #openstack-infra13:43
zbrAJaeger: can you please help with https://review.opendev.org/#/c/667614/ ? is blocking other changes.13:44
AJaegerzbr: I'm not a core on bindep13:44
AJaegerinfra-root, could you review https://review.opendev.org/667614 , https://review.opendev.org/667533 , https://review.opendev.org/667694  on bindep, please?13:45
zbrclarkb: mordred fungi  ^13:45
*** ykarel|away is now known as ykarel13:45
zbrthanks. good idea to use the magic word13:45
openstackgerritSorin Sbarnea proposed opendev/bindep master: Expose base python version as an atom  https://review.opendev.org/63995113:46
openstackgerritMerged opendev/base-jobs master: Remove log_url from emit-job-header  https://review.opendev.org/67675613:46
openstackgerritSorin Sbarnea proposed opendev/bindep master: Fix tox python3 overrides  https://review.opendev.org/60561313:49
*** jeliu_ has joined #openstack-infra13:57
*** ociuhandu has joined #openstack-infra13:58
fungisshnaidm: chkumar|rover: i was merely parroting what clarkb had guessed earlier. except then he later checked a centos-7 node in limestone and it had working ipv4 configured so we need to try and debug an actual failure there (it may be the job is doing something to the v4 config, or it may be that only some nodes booted there have this problem)13:59
sshnaidmfungi, maybe we can hold one and debug there14:00
fungiaspiers: AJaeger: the issue seems to be that when you're in file browse mode gitea wants to look up most recent commit info for each file it's displaying, and the larger the repository the longer that ends up taking. corvus suggested some ways that interested folks can help gitea upstream to make that more efficient but so far there have been no takers14:01
*** ociuhandu has quit IRC14:03
openstackgerritJames E. Blair proposed opendev/system-config master: Run a gerrit container (test only)  https://review.opendev.org/63040614:04
fungiaspiers: i also suggested gitea add support for abbreviated commit ids in urls: https://github.com/go-gitea/gitea/issues/645014:05
*** dougsz has quit IRC14:05
AJaegerfrickler: "ERROR: InterpreterNotFound: pypy" ;(14:07
AJaegerfrickler: we should probably install pypy for that job14:08
*** pkopec has quit IRC14:08
fricklerAJaeger: hmm, it looked like it was passing on other patches, but maybe they were even older14:08
AJaegerfrickler: we removed the default bindep-fallback file that was installing pypy everywhere14:09
*** pkopec has joined #openstack-infra14:09
AJaegerso, now it's not installed anymore14:09
AJaegerfrickler: want to +2A 667614 now?14:11
AJaegerWe can add pypy to the bindep.txt file of bindep14:11
*** ociuhandu has joined #openstack-infra14:11
*** xenos76 has joined #openstack-infra14:15
*** pkopec has quit IRC14:15
*** sshnaidm is now known as sshnaidm|bbl14:16
fricklerAJaeger: I agree that the pypy fix is likely independent of the current stack of patches. I still don't feel confident enough on bindep to +A those, rather wait for someone else to give another look at it14:17
AJaegerfungi, it's up to you ;) could you review https://review.opendev.org/667614 , https://review.opendev.org/667533 , https://review.opendev.org/667694  on bindep, please?14:18
*** dpawlik has quit IRC14:20
*** pkopec has joined #openstack-infra14:20
*** eharney_ has joined #openstack-infra14:25
*** dougsz has joined #openstack-infra14:26
*** eharney has quit IRC14:26
openstackgerritAndreas Jaeger proposed opendev/bindep master: Add bindep.txt for pypy  https://review.opendev.org/67721614:29
AJaegerfrickler: here's the proposed fix ^14:29
AJaegerconfig-core, could you review https://review.opendev.org/677162 - to add legacy openSUSE 15 nodeset, please?14:31
*** yolanda has quit IRC14:31
*** SpamapS has quit IRC14:31
*** yolanda has joined #openstack-infra14:36
aspiersfungi: nice, thanks!14:37
AJaegerfungi, I'll add pinning for hacking...14:37
fungiAJaeger: see latest comment i just added14:38
fungitox minversion needs to be set for the basepython conflict option14:38
AJaegerwill add as well...14:38
openstackgerritAndreas Jaeger proposed opendev/bindep master: Some cleanups  https://review.opendev.org/67722014:40
AJaegerfungi: ^14:40
AJaegerand thanks for reviewing!14:40
fungithanks for the patches!14:40
AJaegerfungi: I'll rebase - one more change...14:41
fungicool, i'll still go through the others too14:41
AJaegerArgh, misread - all fine...14:41
AJaegerwe can merge 677220 as is14:41
*** markvoelker has quit IRC14:42
openstackgerritAndreas Jaeger proposed opendev/bindep master: Replace Trusty with Bionic in the testing  https://review.opendev.org/66769414:44
AJaegerconfig-core, two reviews for switching openstack-tox-docs to a promote job: https://review.opendev.org/677008 - promote jobs for tox-docs14:47
AJaegerand https://review.opendev.org/677009 , please14:48
*** noorul has joined #openstack-infra14:48
*** sgw has joined #openstack-infra14:51
clarkbAJaeger: I've +2'd the bindep followon changes if fungi can review those I think that will be done14:55
clarkbnow looking at tox-docs14:55
AJaegerthanks, clarkb - and good morning14:55
*** armax has joined #openstack-infra14:55
AJaegerclarkb: and https://review.opendev.org/677162  as well, please14:55
openstackgerritMerged opendev/bindep master: Use Python 3.x by default for testing  https://review.opendev.org/66761414:56
*** ykarel has quit IRC14:57
openstackgerritJavier Peña proposed opendev/puppet-openstackci master: Add AFS mirror support for RHEL/CentOS  https://review.opendev.org/52873914:57
AJaegertwo more bindep changes for review, please https://review.opendev.org/667694 and https://review.opendev.org/62232514:58
*** SpamapS has joined #openstack-infra14:59
*** josephrsandoval has joined #openstack-infra15:01
*** pkopec has quit IRC15:02
*** markvoelker has joined #openstack-infra15:03
*** jaosorior has quit IRC15:04
openstackgerritMerged openstack/openstack-zuul-jobs master: Add legacy-opensuse-15 nodeset  https://review.opendev.org/67716215:05
*** xenos76 has quit IRC15:09
openstackgerritMerged opendev/bindep master: Switch to opensuse-15 nodeset for bindep testing  https://review.opendev.org/66753315:09
*** xenos76 has joined #openstack-infra15:11
*** pkopec has joined #openstack-infra15:12
zbrfungi: another bindep https://review.opendev.org/#/c/667694/15:14
fungiyep, that's in the list of changes i was already looking at15:15
clarkbinfra-root https://review.opendev.org/#/c/677122/2 the stack that ends there should hopefully get us some indexed logs again (and we can continue to make incremental improvements as described in my comment in that change)15:16
clarkbI can work on that change in a bit too15:16
*** jamesmcarthur has joined #openstack-infra15:19
*** eharney_ is now known as eharney15:20
fungiAJaeger: was there a change to add pypy to bindep's bindep.txt test profile?15:21
fungior was that merely discussed as an option15:21
*** chkumar|rover is now known as raukadah15:24
openstackgerritClark Boylan proposed opendev/puppet-log_processor master: Don't try to get .gz suffixed files in addition to base url  https://review.opendev.org/67723615:26
clarkbcorvus: ^ You may want to double check my assertion in that one15:26
AJaegerfungi: https://review.opendev.org/677216 adds pypy15:27
openstackgerritMerged opendev/bindep master: Replace Trusty with Bionic in the testing  https://review.opendev.org/66769415:27
corvusclarkb: correct15:28
fungithanks AJaeger!15:28
*** noorul has quit IRC15:31
*** noorul has joined #openstack-infra15:31
*** factor has quit IRC15:32
*** factor has joined #openstack-infra15:32
*** kjackal has quit IRC15:32
AJaegercorvus, fungi, https://review.opendev.org/677008 adds a promote job for openstack-tox-docs - could I get a +2A on it, please?15:33
openstackgerritTristan Cacqueray proposed zuul/zuul master: trigger: add job filter event  https://review.opendev.org/63990515:37
*** ociuhandu has quit IRC15:37
*** gyee has joined #openstack-infra15:38
AJaegerthanks, corvus !15:39
openstackgerritMerged opendev/puppet-log_processor master: log-gearman-worker: handle deflate encoded values  https://review.opendev.org/67710415:39
openstackgerritMerged opendev/puppet-log_processor master: log-gearman-worker: Remove jenkins streaming workaround  https://review.opendev.org/67710515:39
openstackgerritTristan Cacqueray proposed zuul/zuul master: webtrigger: add initial driver and event  https://review.opendev.org/55515315:40
*** factor has quit IRC15:40
*** factor has joined #openstack-infra15:41
openstackgerritMerged openstack/project-config master: Add promote-openstack-tox-docs  https://review.opendev.org/67700815:44
AJaegerfungi, commented on the promote job - see https://opendev.org/opendev/base-jobs/src/branch/master/zuul.d/secrets.yaml#L52, this is how the job is defined.15:44
fungiyep15:45
fungithanks15:45
fungiit makes sense that we'd tightly-couple path data with the credentials in this case, just odd more generally to see file paths as part of the credential set15:46
AJaegerit's far easier to configure this way - compared to a separate playbook that we use now for publish jobs15:46
openstackgerritTristan Cacqueray proposed zuul/zuul master: webtrigger: add web route and rpclistener  https://review.opendev.org/55483915:46
openstackgerritTristan Cacqueray proposed zuul/zuul master: web: add build button to trigger job  https://review.opendev.org/63571615:46
*** mattw4 has joined #openstack-infra15:48
*** ociuhandu has joined #openstack-infra15:51
*** josephrsandoval has quit IRC15:52
openstackgerritMerged opendev/puppet-log_processor master: log-gearman-worker: remove obsolete GET debug filter, add local filter  https://review.opendev.org/67712215:53
openstackgerritAndreas Jaeger proposed openstack/project-config master: Remove now obsolete publish-jobs  https://review.opendev.org/67701315:55
openstackgerritAndreas Jaeger proposed openstack/project-config master: Avoid duplication of secret  https://review.opendev.org/67701615:55
openstackgerritAndreas Jaeger proposed openstack/project-config master: Avoid duplication of secret  https://review.opendev.org/67701615:58
openstackgerritJames E. Blair proposed zuul/zuul master: Render the logfile under the manifest  https://review.opendev.org/67684316:00
*** sgw has quit IRC16:02
openstackgerritMerged opendev/bindep master: Change openstack-dev to openstack-discuss  https://review.opendev.org/62232516:02
openstackgerritMerged opendev/bindep master: Add bindep.txt for pypy  https://review.opendev.org/67721616:02
openstackgerritMerged opendev/bindep master: Some cleanups  https://review.opendev.org/67722016:02
AJaegerconfig-core, want to switch openstack-tox-docs to promote ? https://review.opendev.org/677009 is ready now...16:03
*** gfidente has quit IRC16:03
mriedemclarkb: efried was asking why we aren't getting some hits from the console log here,16:03
mriedemhttps://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_27/656027/25/check/openstack-tox-py37/369bc38/16:03
mriedemand i think it's probably b/c the console log is huge, 222MB16:04
mriedemi don't see post-run log compression/publish failures, but guessing we're hitting issues indexing that console log?16:04
mriedemoh ha16:04
mriedemDelay in Elastic Search: Indexing behind by 94 hours16:04
mriedemhttp://status.openstack.org/elastic-recheck/16:04
clarkbmriedem: we stopped indexing things with the switch to swift hosted logs (due to assumptions/bugs in the indexing pipeline), we are pushing up fixes for that now. That said yes huge console logs like that will be a problem too16:04
clarkbmriedem: https://review.opendev.org/#/c/677236/1 is the end of the stack16:05
mriedemclarkb: is there a reason the job-output.txt isn't compressed?16:05
clarkbmriedem: it is compressed in transit (and I think in storage) but shown to consumers uncompressed16:06
mriedemah ok16:06
mriedemhence the 'deflate' patch at the bottom of that series?16:06
*** rpittau is now known as rpittau|afk16:07
clarkbmriedem: for example if you wget the file you get a zlib formatted file (which is annoying because decompressing it is a pain due to missing magic number for gzip)16:07
corvusyeah, we're using content-encoding to do it  (and yes, transit and storage)16:07
clarkbmriedem: so if you wget it you'll need to use python zlib.decompress or add a gzip magic number or some other method of decompressing locally16:07
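For reference, a minimal sketch of the local decompression clarkb describes, assuming the file fetched with wget is a zlib (deflate content-encoding) stream. The function names and paths are illustrative and not part of any infra tooling; `wbits=zlib.MAX_WBITS | 32` asks Python's zlib module to auto-detect a zlib or gzip header.

```python
import zlib

# Auto-detect either a zlib or a gzip header (32 + 15).
AUTO_WBITS = zlib.MAX_WBITS | 32


def inflate(raw: bytes) -> bytes:
    """Decompress an in-memory zlib (or gzip) stream."""
    return zlib.decompress(raw, wbits=AUTO_WBITS)


def inflate_file(src_path: str, dst_path: str, chunk_size: int = 64 * 1024) -> None:
    """Stream-decompress a downloaded log so a very large file
    is never held in memory all at once."""
    decompressor = zlib.decompressobj(wbits=AUTO_WBITS)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for chunk in iter(lambda: src.read(chunk_size), b""):
            dst.write(decompressor.decompress(chunk))
        dst.write(decompressor.flush())
```

Something like `inflate_file("job-output.txt", "job-output.decoded.txt")` would then produce a readable copy; the streaming variant is likely the safer choice for a console log in the hundreds of megabytes.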
efriedokay, so I'm hearing this was a) bad timing, but also b) big logs are a problem -- because why?16:07
*** icarusfactor has joined #openstack-infra16:07
*** tdasilva has quit IRC16:08
efriedI mean, I get that hundreds of megs of log will add up very quickly.16:08
clarkbmriedem: maybe what I need is an alias for gzip called zlib that prepends the magic number then calls gzip16:08
efriedDoes e-r literally copy the logs somewhere? Just to index them or does it keep them?16:09
clarkbefried: they take longer to index, use more disk (the index size is much larger than the input size), they are not easily consumed by humans (try opening that file in your browser; it will probably blow up), but most importantly you get enough of those in the pipeline at once and things really start to get unhappy particularly around memory use16:09
*** factor has quit IRC16:09
mriedemefried: e-r is just a front-end to the infra elasticsearch cluster that runs queries for known bugs16:10
clarkbefried: the basic process is: a job finishes and uploads to "long term" log storage (aka swift now). Then zuul sends notices to a gearman client about which log files were just uploaded. That client checks those files against a list of things it knows it cares about and for matches it submits gearman jobs to index the log files. This then creates one gearman job per file which runs on a gearman worker16:10
clarkbwhich fetches the log file, processes it with logstash, then indexes it in elasticsearch16:10
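The filtering step in that description can be sketched roughly as below. This is only an illustration under stated assumptions: the pattern list, the job dictionary fields, and the function name are all hypothetical and do not reflect the actual gearman client's code or configuration. It only shows the shape of "match uploaded files against a known list, then create one job per matching file".

```python
import re
from typing import Dict, Iterable, List

# Hypothetical list of files worth indexing; the real list lives in
# the deployment's configuration and is not shown in this log.
INTERESTING = [
    re.compile(r"(^|/)job-output\.txt$"),
    re.compile(r"(^|/)screen-[\w-]+\.txt$"),
]


def build_index_jobs(log_url_base: str, uploaded: Iterable[str]) -> List[Dict[str, str]]:
    """Return one (hypothetical) index job per uploaded file that
    matches a pattern the indexer cares about."""
    jobs = []
    for path in uploaded:
        if any(pattern.search(path) for pattern in INTERESTING):
            jobs.append({
                "source_url": log_url_base.rstrip("/") + "/" + path,
                # Tag by basename so path prefixes do not matter.
                "tag": path.rsplit("/", 1)[-1],
            })
    return jobs
```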
*** jamesmcarthur has quit IRC16:11
clarkbwe have to filter out all debug logs to make that fit into elasticsearch and even then still only fit 10 days16:11
*** jpena is now known as jpena|off16:11
mriedem*and* for console logs it's all lines16:11
clarkbthe main reason to not want a 200MB console log though is a human will have a hard time interacting with it. It's just noise16:11
mriedemfor screen logs it's only INFO+16:11
efriedyup, all that makes sense.16:12
efriedSo is it worth trying to "solve" (or at least band-aid) the problem when we have a situation where logs blow way up like this?16:13
efriedLike maybe taking the tail xxMB of the log so we would at least have *something*?16:13
clarkbime the problem is that projects like to warn about things and warn all the time about them16:13
clarkbrather than use the warn once functionality in the warnings lib16:14
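As a small illustration of the warn-once behaviour mentioned here, the sketch below counts how many warnings are actually emitted under the standard library's "always" and "once" filter actions. The `legacy_call` function and its message are made up for the example.

```python
import warnings


def legacy_call() -> None:
    # Hypothetical deprecated function that warns on every call.
    warnings.warn("legacy_call() is deprecated", DeprecationWarning, stacklevel=2)


def count_warnings(action: str, calls: int) -> int:
    """Call legacy_call() repeatedly under the given filter action
    and return how many warnings were actually emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action)
        for _ in range(calls):
            legacy_call()
    return len(caught)


# "always" emits on every call; "once" emits only the first time.
always_count = count_warnings("always", 5)
once_count = count_warnings("once", 5)
print(always_count, once_count)
```

With thousands of unit tests hitting the same code path, the difference between the two actions is roughly the difference between one log line and one log line per call.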
*** mattw4 has quit IRC16:14
mriedemwhich we've used to reduce the noise that causes the subunit parser fail bu16:14
mriedem*bug16:14
*** mattw4 has joined #openstack-infra16:14
mriedemthis kombu thing is new so we don't have a filter in place for it16:14
clarkbfwiw in the past testr kept all that output out for stdout/stderr16:14
clarkbs/out for/out of/16:15
mriedemstill does,16:15
clarkbmriedem: no stestr prints all test names and stderr16:15
mriedemthe problem is most of the tests fail and dump the output16:15
clarkb(for some projects there is a lot of stderr)16:15
efriedright, it was something like 12k unit tests16:15
clarkbefried: mriedem in that case maybe we have stestr limit failure output to like the first 100 failures?16:16
clarkbI could see that being useful locally too16:16
zbrfungi: another bindep https://review.opendev.org/#/c/668740/116:16
efriedclarkb: that would also work16:17
*** lucasagomes has quit IRC16:18
*** igordc has joined #openstack-infra16:18
clarkb100 failures is a lot more manageable than 12k and in the case of 12k failures you likely need only fix one or two issues that have widespread impact caught by those 10016:22
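The idea of capping failure output could look something like the sketch below. It is not stestr's implementation or API, just a hypothetical helper showing how details for the first N failures might be kept while the rest are only counted.

```python
from typing import Iterable, List, Tuple


def summarize_failures(
    results: Iterable[Tuple[str, bool, str]],
    max_details: int = 100,
) -> List[str]:
    """Given (test_id, passed, details) tuples, keep full details for
    the first max_details failures and count the remainder."""
    lines = []
    failed = 0
    for test_id, passed, details in results:
        if passed:
            continue
        failed += 1
        if failed <= max_details:
            lines.append(f"FAIL: {test_id}\n{details}")
    suppressed = failed - max_details
    if suppressed > 0:
        lines.append(f"... output for {suppressed} more failures suppressed")
    return lines
```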
openstackgerritSorin Sbarnea proposed opendev/bindep master: Expose base python version as an atom  https://review.opendev.org/63995116:25
openstackgerritSorin Sbarnea proposed opendev/bindep master: Fix emerge testcases  https://review.opendev.org/46021716:26
*** mattw4 has quit IRC16:28
*** tesseract has quit IRC16:35
*** aaronsheffield has joined #openstack-infra16:38
*** ramishra has quit IRC16:42
*** smarcet has joined #openstack-infra16:42
*** tdasilva has joined #openstack-infra16:45
zbrand finally after dealing with ten other changes, i get my py[23] atom ready again https://review.opendev.org/#/c/639951/16:45
zbri also added support for debian and ubuntu, and documented all platforms supporting the new atoms.16:46
*** smarcet has left #openstack-infra16:47
*** smarcet has joined #openstack-infra16:55
*** jamesmcarthur has joined #openstack-infra16:56
openstackgerritClark Boylan proposed openstack/project-config master: Include ref info on stmp reporter subjects  https://review.opendev.org/67725416:57
clarkbfungi: corvus mgoddard ^ something like that16:58
corvusstmp would have been a great name for the email protocol.  you could pronounce it 'stamp'16:58
clarkbugh I fixed at least one of those typos while writing that commit message. Must've missed another :)16:59
*** smarcet has quit IRC16:59
clarkbcorvus: then you could have stamp collectors17:00
corvusway better than MTAs17:00
fungior philatelist17:00
mgoddardclarkb: assume this is for branches in stable-maint & other emails? Looks reasonable to me17:00
clarkbmgoddard: ya17:01
fungimgoddard: so having it in the subject instead of the url will work for you?17:01
fungiseems like a better place for metadata anyway17:02
clarkbif that works well I'll probably submit a change to zuul to have the default subject include that info too17:03
*** derekh has quit IRC17:03
clarkb(we override anyway so figured I'd start with our overrides)17:03
*** tdasilva has quit IRC17:05
*** ijw has joined #openstack-infra17:05
*** pkopec has quit IRC17:06
*** ijw has quit IRC17:06
*** ijw has joined #openstack-infra17:07
openstackgerritMark Meyer proposed zuul/zuul master: Change branch variable in PR  https://review.opendev.org/67709317:09
openstackgerritMark Meyer proposed zuul/zuul master: Change PR url  https://review.opendev.org/67725717:09
*** e0ne has quit IRC17:10
openstackgerritJames E. Blair proposed zuul/zuul master: Render the logfile under the manifest  https://review.opendev.org/67684317:13
openstackgerritMark Meyer proposed zuul/zuul master: Change PR url to point to the PR not the Repo  https://review.opendev.org/67725717:13
openstackgerritMark Meyer proposed zuul/zuul master: Change branch variable in PR  https://review.opendev.org/67709317:13
*** jamesmcarthur has quit IRC17:15
*** ociuhandu_ has joined #openstack-infra17:16
*** ralonsoh has quit IRC17:16
*** jamesmcarthur has joined #openstack-infra17:17
*** ociuhandu has quit IRC17:19
openstackgerritMerged openstack/project-config master: Include ref info on stmp reporter subjects  https://review.opendev.org/67725417:19
*** ociuhandu_ has quit IRC17:21
*** jamesmcarthur has quit IRC17:22
fungithe STaMP protocol will forever live on thanks to the project-config repo's git log17:24
*** ociuhandu has joined #openstack-infra17:24
openstackgerritClark Boylan proposed opendev/puppet-log_processor master: Fix systemd severity filter input data  https://review.opendev.org/67726017:27
clarkbinfra-root I think ^ will get us to a mostly working spot with log indexing. I have log worker A on worker02 running that17:28
*** ociuhandu has quit IRC17:29
fungiso 677236 was a dead-end?17:30
clarkbfungi: no I think we need to update the job submitter role to not remove the .gz suffix then we can merge 67723617:30
clarkbI was going to look at that next17:31
*** ricolin has quit IRC17:31
fungiahh17:31
fungii couldn't tell whether your comment meant that it wasn't needed, or was simply incomplete17:31
clarkbthe job submitter on the zuul side must be preserving old behavior17:33
*** kopecmartin is now known as kopecmartin|off17:33
clarkbThat isn't necessary anymore with the canonical urls all being correct relative to swift now17:33
clarkb(before we would pretend the ungzipped file was a thing)17:33
*** mattw4 has joined #openstack-infra17:33
*** sthussey has joined #openstack-infra17:43
*** bhavikdbavishi1 has joined #openstack-infra17:52
*** icarusfactor has quit IRC17:53
openstackgerritClark Boylan proposed openstack/project-config master: Stop treating .gz files as special in log handling  https://review.opendev.org/67726517:53
openstackgerritClark Boylan proposed opendev/puppet-log_processor master: Don't try to get .gz suffixed files in addition to base url  https://review.opendev.org/67723617:53
*** bhavikdbavishi has quit IRC17:53
*** bhavikdbavishi1 is now known as bhavikdbavishi17:53
clarkbcorvus: fungi: ^ that should do it17:54
*** jamesmcarthur has joined #openstack-infra17:54
clarkbI'm going to pop out for a bike ride now but will be back to help shepherd in the logstash worker stuff (note those changes will be applied to the server files but services are not automatically restarted so we can safely approve them all and wait for updates then have ansible restart things when ready)17:59
*** jamesmcarthur has quit IRC17:59
clarkbalso I suppose I should start to try and reproduce the centos7 network trouble too17:59
clarkbbut all that in a bit17:59
*** jamesmcarthur has joined #openstack-infra18:00
*** jamesmcarthur has quit IRC18:05
corvusclarkb: is 677265 going to mess up e-r queries which include filenames?18:10
corvusclarkb: (ie, are we going to start having filenames with .gz in ES ?)18:10
corvusclarkb: (at the same time there are e-r queries for "filename:n-api.log" or somesuch)18:11
clarkbhrm ya probably18:14
clarkbmaybe we wait on that change for now and stabilize first then coordinate with e-r to update queries?18:15
*** michael-beaver has joined #openstack-infra18:17
*** e0ne has joined #openstack-infra18:17
openstackgerritMark Meyer proposed zuul/zuul master: Change PR url to point to the PR not the Repo  https://review.opendev.org/67725718:18
openstackgerritMark Meyer proposed zuul/zuul master: Change branch variable in PR  https://review.opendev.org/67709318:18
*** noorul has quit IRC18:21
*** jamesmcarthur has joined #openstack-infra18:23
corvusefried: here's a more polished version of the 'logfiles under manifest' change if you want to try it out: https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_43/676843/4/check/zuul-build-dashboard/6a10611/npm/html/18:33
efried...18:34
efriedcorvus: Not sure if this is related to just your test setup, but browser Back button is giving me 404 every time I use it (which is because I f'ed up a nav and actually want to go Back and try again)18:35
*** SpamapS has quit IRC18:35
efriedcorvus: refinement: only when I f up by clicking through to one of the blue files that takes me out of the dashboard.18:37
corvusefried: ah, yep, that's the test setup; will work in prod18:37
efriedcorvus: I'm having trouble finding a build that has the black file names (is that still how I'm supposed to find an artifact that does this thing?)18:37
corvusefried: also, i have some ui tweaks in mind to make it less likely to f up in that situation18:37
*** strigazi has quit IRC18:37
*** strigazi has joined #openstack-infra18:38
efriedwell, presumably once you have all the files doing the in-app rendering it would be n/a.18:38
corvusefried: oh, sorry, you no longer need to look for black filenames, it's back to the current behavior (they look like links)18:38
efriedoh, then I'm definitely having trouble finding one that "works". Every blue link I've tried so far has taken me to a new page for that file.18:38
efriedwant to walk me through one?18:38
corvusefried: an easy one in every build should be 'job-output.txt'18:39
corvusthat can always render in-app18:39
efriedgot it18:40
efriedyes, seems to work good18:40
fungioof, i picked one at random which i thought would be small because it was just a unit test job, and the job-output.txt took a while to load because... turns out it's >5k lines long18:42
fungii wonder if we can include some indication of file size/length?18:42
fungithat may influence the choice for someone to click through to the rendered version vs the raw link18:43
corvusfungi: yes, we have the file size in bytes, we can add that to the tree view18:46
fungichanging the level filter on a long log takes a while too... does it re-download the logfile when you do that, or is it really just taking that long for my browser to analyze the loglevel for every one of these 5k+ lines?18:46
corvusthe second thing18:46
fungimakes sense18:46
*** tdasilva has joined #openstack-infra18:46
fungioh, hah, and now i see my folly. i thought i was picking a tox job, but in actuality i mis-selected a tripleo-ci job instead :/18:47
corvusthere may be more that can be done to optimize it, however, you'd be surprised how fast our js that changes everything about that actually runs.  it's really quite fast.  but then after we return our final result to react, something happens which takes a long time.  i don't have the react/js knowledge to know what that is, and if there's something we could do that would help it along.18:47
*** ociuhandu has joined #openstack-infra18:48
efriedLet's get a progress meter18:50
fungioh, for when content download/analysis is occurring/.18:50
fungi?18:50
efriedLoading |-----     | 120 of 259Kb18:50
efriedyeah18:50
* efried says glibly, as if it's an easy thing to do.18:50
*** sgw has joined #openstack-infra18:51
tdasilvaI'm wondering if it's possible to pass a list of tags to `opendev-promote-docker-image` gate job. I'm starting to investigate if it's possible to build py2 and py3 swift docker images for a given patch, but I think the last part is where I need the promote job to tag an image something else besides "latest"18:51
tdasilvahave you guys run into this at all for zuul?18:51
fungiefried: i think even just something which indicates the thing you clicked is still loading, even without indication of progress, would help18:52
efriedagreed ^18:52
efriedthough for my cpu cycles, I would just as soon have a static "Loading..." message as a fancy spinner. But I know I'm an old codger where UI is concerned.18:53
*** ociuhandu has quit IRC18:53
corvustdasilva: yes, you can specify tags (we only use latest for zuul, but i think i have another example handy; one sec)18:55
fungihttps://hub.docker.com/r/zuul/zuul/tags also shows a change tag in addition to latest18:55
corvusfungi, efried: there should be a spinner in the top right when it's downloading a file; however, you may have scrolled it out of view after looking at the first file18:55
corvus(so i think we need something bigger that maybe overlays over the file contents area or something)18:55
efriedcorvus: or have the spinner follow18:56
efriedthere's some kind of css thingy to say "keep it in coordinates x,y of the visible window", nah?18:56
fungicorvus: oh the irony, the top-right of my browser monitor is obscured because it's leaned up against a wall waiting for me to send the one i would be using out for repair18:56
efriedhahahahahahahaha18:56
corvusfungi: i, uh, did not plan for that in my ui design.  i am a failure.18:57
efriedactual lol18:57
efriedthat's going in your permanent record18:57
fungiindeed18:57
efriedwhile you're fixing that one, you may as well include "hippie hair obscures field of vision"18:58
* efried growing hair out, will also need ^18:58
corvusefried: oh, i got that one covered; i don't think i've had a haircut since we merged the zuulv3 spec18:58
fungiwell, that technical difficulty aside, i think my browser needs a restart because it's being really slow to register clicks on these links and i'm starting to suspect it's memory problems not the demo site18:59
corvusfungi: yeah, this will eat memory.  it will cache the data from each file you click on in memory, to make it easier to switch back and forth between them.  it should clear that if you close the tab or switch to a different build.  i don't know how effective that is in practice.19:00
corvustdasilva: here we go: https://opendev.org/opendev/system-config/src/branch/master/.zuul.yaml#L172-L17319:00
fungialso possible switching builds isn't actually clearing it because i used the same tab to switch from the giant tripleo-ci console log to a different build19:00
corvustdasilva: that will tag it as 2.13 *instead of* latest (iow, the default for "tags" is ['latest'])19:01
*** bhavikdbavishi has quit IRC19:01
*** SpamapS has joined #openstack-infra19:01
corvusfungi: it should happen even if you switch builds in the same tab19:02
fungiyep, also possible it did that correctly and my browser was already really bloated on memory utilization19:02
*** dougsz has quit IRC19:03
tdasilvacorvus: interesting that I'd expect that to be in the promote job and not the build job. line 186 seems to imply that the promote job would change the tag from 2.13 to latest? sorry if i'm missing something19:04
fungiyeah, after a browser restart i pulled up a <1k-line job-output.txt for a failed keystone pep8 build and it can switch filter levels in less than a second19:05
AJaegerconfig-core, do we want to switch openstack-tox-docs to promote publishing? Then, please review https://review.opendev.org/677009 and https://review.opendev.org/67701319:07
tdasilvaoh, interesting, I guess I'd have an image with tags py2 *and* latest and then another image for the same patch with just a py3 tag19:07
corvustdasilva: all 3 jobs take the same data structure (note we use yaml anchors there to copy it to the upload and promote jobs).  technically i think the build job ignores it, but the upload job does use it -- it forms part of the metadata that get put into zuul so the promote job gets the right images.19:07
*** jamesmcarthur has quit IRC19:08
clarkbAJaeger: all +2 from me19:08
corvusi highly recommend using the same data for all 3 jobs19:08
clarkbcorvus: have a moment for https://review.opendev.org/#/c/677260/1 ? I'll restart all the workers once that is landed and on the nodes19:09
tdasilvacorvus: TIL yaml anchors :)  thanks!19:09
AJaegerthanks, clarkb19:11
corvustdasilva: they're scoped to the individual file in zuul (so nothing outside of that single file will see them).  so far, that's been ideal :)  i think that's the only thing to be aware of.  otherwise, go crazy :)19:11
tdasilvacorvus: thanks! while we are on this subject, we last also talked about the idea of tagging releases. is that possible?19:13
tdasilvacorvus: in this case it would be a one time only tag for when we tag a swift release19:15
AJaegerclarkb: do you want https://review.opendev.org/677265 to merge or not? I wasn't sure from backlog and removed my +A but left a +2. Please self-approve once ready19:18
clarkbAJaeger: ya I think the point corvus made was a good one I will WIP it for now19:19
*** sshnaidm|bbl is now known as sshnaidm19:19
clarkbmriedem: when you get a chance can you read my comment on https://review.opendev.org/#/c/677265/1 and provide input on how bad that would be for e-r? I don't think it would be too bad but there might be a rough patch while we update things19:19
*** dougsz has joined #openstack-infra19:21
fungicould do a mass update via sed probably? but all outstanding changes for e-r queries would need fixing too19:22
clarkbfungi: ya and some files would need the update and others wouldn't I think19:22
clarkbmriedem should have a better sense for that19:22
*** factor has joined #openstack-infra19:22
clarkb(than me)19:22
*** tdasilva has quit IRC19:23
*** tdasilva has joined #openstack-infra19:23
*** eharney has quit IRC19:26
openstackgerritMark Meyer proposed zuul/zuul master: Change PR url to point to the PR not the Repo  https://review.opendev.org/67725719:28
openstackgerritMark Meyer proposed zuul/zuul master: Change branch variable in PR  https://review.opendev.org/67709319:28
*** mriedem has quit IRC19:29
*** mriedem has joined #openstack-infra19:30
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Store autohold requests in zookeeper  https://review.opendev.org/66111419:31
mriedemthere aren't really any outstanding e-r changes that aren't mostly abandoned19:31
openstackgerritClark Boylan proposed zuul/zuul master: Include ref info in smtp reporter subjects  https://review.opendev.org/67728519:33
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Add caching of autohold requests  https://review.opendev.org/66341219:35
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Add autohold-info CLI command  https://review.opendev.org/66248719:35
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Record held node IDs with autohold request  https://review.opendev.org/66249819:35
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Auto-delete expired autohold requests  https://review.opendev.org/66376219:35
openstackgerritDavid Shrewsbury proposed zuul/zuul master: Mark nodes as USED when deleting autohold  https://review.opendev.org/66406019:35
mriedemclarkb: so, "this will change the filenames that e-r queries against", do you mean for the tags?19:41
mriedeme.g. message:"foo" and tags:"screen-n-cpu.txt"?19:41
openstackgerritMerged opendev/puppet-log_processor master: Fix systemd severity filter input data  https://review.opendev.org/67726019:41
mriedemwe don't include .gz in the tags,19:41
mriedembut we'll need to now?19:41
*** smarcet has joined #openstack-infra19:41
mriedemwill the prefix on the filename still be ignored? e.g. looking at this query:19:44
mriedemmessage:"Unable to update the attachment.: MessagingTimeout" AND tags:"screen-c-api.txt" AND voting:119:44
mriedemthat has hits on filename controller/logs/screen-c-api.txt and logs/screen-c-api.txt19:44
*** icarusfactor has joined #openstack-infra19:44
mriedemso would tags need to be: tags:"screen-c-api.txt.gz" or also include the controller/logs OR logs/ prefix?19:44
mriedemthe latter would be pretty annoying, especially for things like grenade jobs that have logs in old/ and new/19:44
*** factor has quit IRC19:46
clarkbmriedem: I'd have to double check on tags I think we might continue to remove the .gz there19:46
clarkbmriedem: but filename would change19:47
clarkbmriedem: the prefixes won't be affected19:47
clarkbmriedem: its all about whether or not we logically treat the file in swift called foo.txt.gz as foo.txt or start referring to it as foo.txt.gz always19:47
clarkbmriedem: with the pre swift stuff we had a webserver smart enough to treat those as the same file19:47
clarkbbut swift would require us to upload the file twice I think (or do two objects) so for swift it is easier to just always refer to it as the foo.txt.gz file19:48
mriedemok we've tried to use tags over filename in queries for a long time so i don't think the impact would be big19:48
clarkbok let me confirm that tags won't be affected19:49
mriedemfilename:"job-output.txt" -> filename:"job-output.txt.gz" yeah?19:52
mriedemwhat i'd probably do is go through and remove old queries with no hits, make the change and then see what's new that doesn't hit to see if i missed something19:53
mriedemthere are only 3 non-test queries that use filename and they all use job-output.txt19:53
openstackgerritAndreas Jaeger proposed openstack/openstack-zuul-jobs master: Add support for building PDFs  https://review.opendev.org/66455519:54
corvustdasilva: sorry, i haven't done any work on tagging releases yet; i agree we should have that :)19:56
openstackgerritAndreas Jaeger proposed openstack/project-config master: Avoid duplication of secret  https://review.opendev.org/67701620:02
AJaegercorvus: so, I don't get the YAML in a state that Zuul is happy about ^ - I'll abandon20:04
*** smarcet has quit IRC20:05
openstackgerritClark Boylan proposed openstack/project-config master: Stop treating .gz files as special in log handling  https://review.opendev.org/67726520:07
clarkbcorvus: mriedem ^ I think that should largely address things for e-r as that will ensure the tag value remains20:07
mriedemi'm working on the e-r side, but finally taking the time to write a script to automatically cleanup old scripts which i've just always done by hand20:09
mriedem*old queries20:09
openstackgerritMerged openstack/project-config master: Add ceph/ceph-ansible to untrusted github projects  https://review.opendev.org/67640220:09
clarkband now to debug centos on limestone. I think my plan there is to boot instances until I catch one with no ipv4? might be a bit slow20:10
*** e0ne has quit IRC20:11
openstackgerritMatthew Thode proposed openstack/diskimage-builder master: update gentoo systemd profile to 17.1 from 17.0  https://review.opendev.org/67729020:14
*** eharney has joined #openstack-infra20:14
tdasilvacorvus: no worries, just wanted to double-check20:18
*** e0ne has joined #openstack-infra20:21
clarkbmaybe that was easier than I expected, clarkb-test on limestone seems to be exhibiting the no ipv4 addr behavior20:22
clarkb2607:ff68:100:54:f816:3eff:fea9:c890 if others want to look too20:22
clarkbI do not immediately see what may be wrong. syslog claims that it is using dhclient and sysconfig says dhcp should be used20:22
clarkbhowever /var/lib/dhclient shows no leases20:22
clarkband there is no dhclient running20:22
*** tdasilva has quit IRC20:24
clarkbianw: ^ maybe you can take a look when your day starts20:25
*** tdasilva has joined #openstack-infra20:25
fungi/etc/sysconfig/network-scripts/ifcfg-eth0 has BOOTPROTO=dhcp20:26
fungiethernet address on eth0 also matches the HWADDR in there20:26
fungiaccording to /var/log/messages it used dhclient for its dhcp-init via networkmanager20:27
*** jamesmcarthur has joined #openstack-infra20:28
*** xenos76 has quit IRC20:29
*** dougsz has quit IRC20:29
fungido working nodes there have dual-stack eth0 or separate v4 and v6 interfaces?20:30
clarkbdual stack eth020:30
fungithis is certainly bizarre20:31
fungiand you'd booted one there over the weekend which got working v4 right? so it's intermittent?20:31
clarkbI just ssh'd into a node nodepool had booted to see if it had ipv4 or not and it did have an ipv4 address20:31
clarkbpretty sure it is intermittent20:31
logan-yeah very intermittent from what ive seen20:32
*** andreww has quit IRC20:33
logan-i'm afk but when i'm back in 1-2 hours i can work on tracing down dhcp packets and check logs etc20:33
clarkbI wonder if this is the inverse of the ipv6 problem we had in fn20:33
*** xarses has joined #openstack-infra20:33
clarkbbasically a race that prevents NM from configuring the ip stack on an interface because it thinks something else is doing it20:33
fungiwith the systemd->networkmanager->dhclient maze i'm not entirely sure where dhclient is going to log its activities20:33
clarkbif ^ is the case I bet it will be happy after manually running dhclient20:33
clarkbfungi: I think in the NM log which I don't see messages for but /me double checks journalctl20:34
fungiyeah, i did `journalctl -u NetworkManager.service` and it said it would use dhclient but i don't see actual dhclient logging20:34
clarkbjournalctl -u NetworkManager is where I would expect such things20:34
fungialso glean's logs say it ran a couple seconds before networkmanager, so no sequencing issue there i don't think20:35
clarkbya the sequencing issue we had with ipv6 was the kernel configuring the interface because it got an RA then NM deciding it shouldn't configure it20:36
clarkbhowever I don't think ipv4 has any kernel built in stuff that might interfere with NM that way20:36
fungilooks like networkmanager doesn't leave dhclient running as a persistent daemon either (or else didn't realize it failed to start?)20:36
clarkbfungi: I think it only runs it when it needs a new lease20:36
fungino, kernel doesn't do v4 autoconfiguration, that's entirely userspace20:37
fungi/var/lib/dhclient/ is empty and created as part of the image but stat says it was last accessed at 20:21:47 which was ~2.5 minutes after boot20:39
clarkbI'm guessing the clues are in the NM logs sequence of events for eth020:39
clarkbfungi: I ls'ed that dir20:39
fungiimmediately after boot?20:39
clarkbfungi: ya it wasn't long after I did server create that I ssh'd in20:39
fungii did an ls on it too but that doesn't seem to have updated the last access timestamp20:39
clarkbI cd'd into it too20:40
clarkbnot sure if that would change it20:40
fungiaha, yeah that'd do it20:40
fungiactually, no, that also didn't update it when i tried20:40
*** jamesmcarthur has quit IRC20:42
*** jamesmcarthur has joined #openstack-infra20:50
clarkbfinally getting to sorting out a meeting agenda. Do we still need to talk about the afs mirroring status of things? hrm seems fedora last updated ~9 days ago so sounds like it20:57
*** mattw4 has quit IRC21:05
*** mattw4 has joined #openstack-infra21:06
*** jamesmcarthur has quit IRC21:08
*** DinaBelova has quit IRC21:09
*** DinaBelova has joined #openstack-infra21:10
*** jamesmcarthur has joined #openstack-infra21:10
*** jamesmcarthur has quit IRC21:10
*** pkopec has joined #openstack-infra21:14
clarkbdid we update our gitea ssh image more recently than the logging change? https://storage.bhs1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_10/676510/1/check/system-config-run-gitea/4fd8551/gitea99.opendev.org/docker/giteadocker_gitea-ssh_1.txt is an odd failure to have21:15
clarkbor wait we actually have to listen on 222 because we use host networking right?21:16
clarkbso that is correct to fail there I think21:16
*** jamesmcarthur has joined #openstack-infra21:17
*** eharney has quit IRC21:18
*** jamesmcarthur has quit IRC21:22
*** mattw4 has quit IRC21:23
*** mattw4 has joined #openstack-infra21:24
*** sshnaidm is now known as sshnaidm|afk21:24
*** tdasilva has quit IRC21:26
*** tdasilva has joined #openstack-infra21:26
*** jamesmcarthur has joined #openstack-infra21:28
*** jamesmcarthur has quit IRC21:29
*** pkopec has quit IRC21:30
openstackgerritMatt Riedemann proposed opendev/elastic-recheck master: Add script to remove queries for fixed bugs  https://review.opendev.org/67730221:31
*** kjackal has joined #openstack-infra21:33
clarkbfungi: fwiw my read of the logs that suse did provide is that their secure mail gateway is killing the email when it gets to that point21:33
clarkband our lists server isn't ever getting a connection for that21:33
clarkband then they mentioned the lack of an MX record, which makes me wonder if that is a known issue with that mail gateway21:34
*** markvoelker has quit IRC21:36
*** tdasilva has quit IRC21:38
*** tdasilva has joined #openstack-infra21:38
fungipossible? if so they ought to fix that21:42
ianwclarkb: ok, just gotta sort a few things then will see if i can help on centos; is it still only limestone?21:43
*** jamesmcarthur has joined #openstack-infra21:44
*** jamesmcarthur has quit IRC21:45
fungiseems that way21:45
fungiunless nodes with similar problems in other providers are manifesting early enough that they get discarded and rebuilt or jobs requeued21:46
fungiafaik we haven't managed to rule that possibility out yet21:46
fungii'm about at a dead-end trying to find where dhcp logging is happening (though this may also mean that the underlying problem is resulting in dhclient never getting invoked)21:47
fungiis it normal for /etc/dhcp/dhclient-eth0.conf to have 'send host-name "<hostname>";'? is that even legal (seems like < and > would be non-rfc-compliant in a hostname string)21:49
fungimaybe networkmanager does some magic to that and treats it as a replacement string?21:49
fungithe dhclient.conf(5) manpage doesn't mention it as a thing, at any rate21:52
ianwhrm, for values of "normal" as in i don't think we changed anything ... but i agree it's not right21:52
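A minimal sketch for fungi's question above about whether `<hostname>` could be a legal hostname, assuming the usual RFC 952 / RFC 1123 label rules (letters, digits and hyphens only, no leading or trailing hyphen, at most 63 characters per label, 253 overall). The function name `is_valid_hostname` is hypothetical. It says nothing about how NetworkManager or dhclient treat that string.

```python
import re

# One DNS label: alphanumeric at both ends, hyphens allowed inside, 1-63 chars.
_LABEL_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_valid_hostname(name: str) -> bool:
    """Return True if name follows RFC 1123 hostname syntax."""
    if not name or len(name) > 253:
        return False
    # Every dot-separated label must match in full; empty labels fail.
    return all(_LABEL_RE.fullmatch(label) for label in name.split("."))
```

Under these rules the literal `<hostname>` is rejected because of the angle brackets, while a name like `clarkb-test` passes.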
ianwis there a /var/run/nm-dhclient-eth0.conf?21:53
fungithere is no /var/run/nm-* at all21:54
openstackgerritMatt Riedemann proposed opendev/elastic-recheck master: Add script to remove queries for fixed bugs  https://review.opendev.org/67730221:54
openstackgerritMatt Riedemann proposed opendev/elastic-recheck master: Remove old queries: 2019-08-19  https://review.opendev.org/67730621:54
openstackgerritJames E. Blair proposed zuul/zuul master: WIP: remove displayedFile from state  https://review.opendev.org/67730721:56
*** jamesmcarthur has joined #openstack-infra21:58
ianwipv4.method:                            disabled21:59
openstackgerritJames E. Blair proposed zuul/zuul master: Add Tristan to Zuul Maintainers  https://review.opendev.org/67730822:00
fungiianw: where's that? a recursive grep of /etc doesn't find the string "ipv4.method"22:01
ianwfungi: ohhh, config in /etc?  yeah right grandpa :)  i got that from ... nmcli c sh 704bbd46-f652-4a65-94f9-9441e1e7bdb7 ... yeah i don't exactly know *why* it's decided that :/22:02
fungithere are days i wish we could go back to expecting to find service and system configuration in plain files within /etc... lately those days are every day22:02
clarkbianw: hrm that was similar to ipv6 and it had decided to leave it alone because kernel had configured things under it22:02
*** trident has quit IRC22:03
*** jamesmcarthur has quit IRC22:03
*** diablo_rojo has joined #openstack-infra22:03
fungiright, i think we don't care that nm is ignoring ipv6 (or even prefer that) if we're doing slaac/ra22:04
fungibut we're relying on it to invoke dhclient in this case22:04
fungifor ipv422:04
clarkbfungi: well it had the side effect of no working ipv6 in that case so we had to increase the kernel timeout before listening to RAs22:04
fungiyeah22:05
clarkbI'm going to restart logstash workers on all the hosts nowish22:05
*** rh-jelabarre has quit IRC22:05
clarkband done. I think we should have logs again22:09
clarkb*indexed logs again22:09
*** jeliu_ has quit IRC22:09
*** jeliu_ has joined #openstack-infra22:10
*** trident has joined #openstack-infra22:11
clarkbwe should keep an eye on queue lengths to ensure we are keeping up with the changes to filtering stuff22:11
*** mriedem has quit IRC22:12
clarkbianw: I seem to recall being able to increase the verbosity of NM logging when I was looking at the ipv6 things. Not sure if that will help here but maybe that is a change we can make to our images?22:16
clarkblooks like our epel mirror has been running vos release since july 2622:17
clarkband the fedora mirror is locked but not running anything22:18
logan-i'm back for a bit. lmk if i can help with anything.22:19
clarkblogan-: I think what ianw found from nmcli points to an issue with our images22:19
logan-ah ok, that lines up well with this only occurring on centos images22:21
clarkbI'm going to grab the fedora lockfile on mirror-update.opendev.org, delete the lock in the vldb, then rerun the rsync without vos release22:21
*** kjackal has quit IRC22:21
*** dave-mccowan has quit IRC22:22
clarkbhrm someone just ran vos unlock against pypi and centos?22:22
clarkbinfra-root ^ is someone else doing afs cleanup?22:22
fungii am not, no22:23
fungilsof the lock?22:23
ianwnot i22:23
clarkbfungi: well its the afs command22:23
fungihuh22:23
ianwcentos would be coming from the new mirror host (rsync)22:23
ianwpypi ... should be disabled?22:23
clarkb  586  2019-08-19T22:18:15+0000 vos unlock mirror.pypi -localauth22:24
clarkbwhat is odd is those come before my commands to unlock fedora a while back22:24
fungimaybe some cron operation we've scripted does a vos unlock?22:24
clarkbso the timestamp there doesn't seem trustworthy22:24
clarkbsince neither of those are fedora I'll proceed with fedora22:24
fungioh! those are in root's shell history?22:25
funginot on mirror-update.opendev.org i guess... where are you seeing that?22:27
fungione of the fileservers?22:27
*** tdasilva has quit IRC22:27
clarkbya afs01.dfw.o.o sorry I wasn't clear about that22:27
clarkbI'm talking about the afs locks in this case not the coordination locks for cron on mirror-update22:28
clarkbI checked history to refresh my memory on the unlock command and found those22:28
fungiyeah history is up to 1123 lines now, so line 586 is relatively ancient22:28
clarkbfedora rsync script without vos release is running on mirror-update.opendev.org now22:28
fungii suspect that's a side effect of turning on command timestamping and how it shows commands which predate it (so likely have no timestamp)22:29
clarkbI expect that mirror has grown too big to reliably update or that release of fedora-30 has caused churn there?22:29
corvus(i did not do any afs things)22:29
clarkbit ran for about a week after I fixed it last time22:29
clarkband then for epel I don't know why those processes have stuck around so long. I'm guessing we have to kill the sync on mirror-update, grab the lockfile, remove the vldb lock, manually run rsync then start it over again like fedora?22:30
fungilooks like command timestamping was probably turned on around 2018-04-18 so any commands run before that are being reported by `history` with today's date and time. i wouldn't be worried about it22:30
clarkbah22:31
corvusha.  so "now" means either "now" or "so long ago it doesn't matter".  take your pick.  practically the same thing.22:31
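A minimal sketch of how the undated entries fungi describes could be spotted, assuming the bash history file format in which a timestamped command is preceded by a line of `#` followed by epoch seconds. The function name `parse_history` and the sample text in the usage note are made up for illustration, and multi-line commands are not handled.

```python
import re
from typing import List, Optional, Tuple

# A timestamp marker line: '#' followed only by ASCII digits (epoch seconds).
_STAMP_RE = re.compile(r"#([0-9]+)")


def parse_history(text: str) -> List[Tuple[Optional[int], str]]:
    """Pair each command with its recorded epoch time, or None if absent."""
    entries: List[Tuple[Optional[int], str]] = []
    pending: Optional[int] = None
    for line in text.splitlines():
        stamp = _STAMP_RE.fullmatch(line)
        if stamp:
            # Remember the timestamp for the command on the next line.
            pending = int(stamp.group(1))
            continue
        if line:
            entries.append((pending, line))
        pending = None
    return entries
```

For example, parsing `"vos unlock mirror.pypi -localauth\n#1566253095\nls\n"` pairs the first command with `None` and `ls` with `1566253095`. Entries paired with `None` are the ones whose displayed date cannot be trusted.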
fungibut definitely surprising, thanks for bringing that to light22:31
clarkbok fedora rsync went quickly actually22:32
clarkbnow going to start a vos release with localauth on afs01.dfw22:32
*** xarses has quit IRC22:36
*** xarses_ has joined #openstack-infra22:36
clarkbcorvus: any idea how bad it would be to kill a vos release process on mirror-update that has been running since july 26th?22:41
*** jamesmcarthur has joined #openstack-infra22:43
*** tdasilva has joined #openstack-infra22:45
ianweth0         704bbd46-f652-4a65-94f9-9441e1e7bdb7  ethernet  eth022:45
ianwSystem eth0  5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03  ethernet  --22:45
ianwthis looks exactly like https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=75520222:45
openstackDebian bug 755202 in network-manager "network-manager: keeps creating and using new connection "eth0" that does not work" [Important,Open]22:45
ianwi think this is because ipv6 kernel autoconf has given it an address before nm starts (which yes, ties into the last issue...)22:46
corvusclarkb: if there's no transaction running, probably completely safe22:47
fungiianw: oh joy, because none of us grew tired of debugging that already ;)22:47
clarkbianw: maybe our timeout isn't long enough in all cases then22:48
clarkbianw: the value I chose was a bit trial and error22:48
corvusclarkb: and i don't see any old transactions22:48
clarkbcorvus: k I'll look at doing that next then (currently writing a patch to further restrict what we pull for fedora atomic)22:48
openstackgerritJeff Liu proposed zuul/zuul-operator master: Add PerconaXDB Cluster to Zuul-Operator  https://review.opendev.org/67731522:49
ianwclarkb: i think this summary sounds saneish -> http://paste.openstack.org/show/760139/22:50
clarkbianw: ya in that case maybe we increase the timeout value we chose22:50
clarkbin fact it should be mostly safe to have that timeout be quite large since NM is configuring interfaces anyway and doesn't care about that sysctl value22:51
*** tdasilva has quit IRC22:51
ianwor disable it?22:51
*** tdasilva has joined #openstack-infra22:51
*** rlandy|ruck is now known as rlandy|ruck|bbl22:52
openstackgerritClark Boylan proposed opendev/system-config master: Add more fedora-atomic mirror exclusions  https://review.opendev.org/67731822:54
clarkbianw: I want to say when I tested disabling it that NM didn't get the RAs22:54
*** jeliu_ has quit IRC22:54
*** EmilienM|pto has quit IRC22:54
clarkbinfra-root 677318 should further reduce the size of the fedora mirror. I'm happy to continue holding the lock for that mirror if we want to get that in and I'll rerun my syncs22:55
*** ijw has quit IRC22:55
fungilooking22:55
*** EmilienM has joined #openstack-infra22:56
fungiwe don't think anyone's relying on us to mirror isos, disk images or bootloaders/payloads?22:56
fungiseems unlikely, yeah22:56
*** lathiat has quit IRC22:57
clarkbthe mirror was set up to mirror qcow2 images for use with libvirt22:57
clarkbwhile they could be using isos or efi configs that would be silly22:57
*** rcernin has joined #openstack-infra22:57
fungiyeah22:57
clarkb(and openstack doesn't pxeboot so I think pxeboot should be safe)22:57
fungioh, the .img files were for installer images22:57
fungirighty-o22:57
*** lathiat has joined #openstack-infra22:57
clarkbya that is why I didn't *.img22:57
fungifire in the hole22:58
clarkbideally magnum would stop using fedora entirely as they are still on 27 iirc22:59
clarkbseems like a platform that doesn't update every 6 months might be more appropriate given that23:00
*** diablo_rojo has quit IRC23:03
fungithe explanation i heard was that since rhel-8 was based on fedora-27 they're using that as a stand-in for centos-823:06
fungiand if that's the case, they ought to be able to switch to centos-8 for those jobs as soon as it becomes available23:06
fungiwhich, i agree, will make a lot more sense23:07
clarkbfungi: that is 2823:07
clarkbwhich is also eol23:07
fungiahh, then it was !magnum i guess... tripleo?23:07
clarkbya tripleo not magnum23:08
fungigot it23:08
clarkbif you deploy a magnum k8s in say vexxhost you get a fedora 27 host23:08
clarkbbecause that is the image their deployer supports23:08
fungiwell, at any rate, yeah i'd push them to switch to centos-8 once we have it23:08
fungiat least that will be receiving security fixes23:08
clarkbnew git adds `git restore` which is a neat helper command23:11
clarkbalright I'm going to kill the epel stuff on mirror-update.opendev.org now23:15
ianwclarkb: ohhh, right ... haha yes so i'm pointing you to the bug you found and notated in the dib change to put the pause in23:22
ianwsorry, picture only just now coalescing in my mind :)23:23
clarkbepel mirror is getting vos released now23:25
*** tkajinam has joined #openstack-infra23:27
clarkbianw: ya maybe we just bump that timeout to be very big?23:29
fungiwithout reading the release notes, i'm going to guess `git restore` has something to do with figuring out and rolling back to previous worktree states from the reflog23:30
clarkbfungi: ya23:30
clarkbsince the checkout commands to do that are so painful23:30
*** dchen has joined #openstack-infra23:30
fungii don't personally find the checkout commands as painful as figuring out what state i wanted from the reflog23:31
fungithen again, i'm steeped in git esoterica every day23:31
*** dychen has joined #openstack-infra23:33
*** jamesmcarthur has quit IRC23:33
*** dchen has quit IRC23:35
clarkbI have to read the manpage every time I want to check out a file from the past23:37
*** e0ne has quit IRC23:40
ianwoh, sorry i just rebooted clarkb-test ... but testing longer timeout in sysctl23:41
clarkbno problem I haven't touched it for a bit (working on afs things)23:41
clarkbfeel free to take it over23:41
clarkbepel vos release failed23:42
clarkbI might need to leave that one there for now23:42
ianwok, it got an ipv4 address now ...23:42
clarkbfedora release is still in progress23:43
ianwi can take a look at epel ... epel 8 was released maybe?  don't know if that is related23:44
clarkbianw: the rsync went quick23:47
clarkbbut the vos release failed after a few minutes23:47
clarkbhttp://paste.openstack.org/show/760140/23:48
ianwnow i put the timeout back to 15, rebooted, and it also got an ipv4 address :/23:48
*** markvoelker has joined #openstack-infra23:48
clarkbif we are really close to the timeout it may come down to load and luck23:48
openstackgerritMerged opendev/system-config master: Add more fedora-atomic mirror exclusions  https://review.opendev.org/67731823:49
ianwMon Aug 19 23:40:47 2019 warning: volume 536870968 recursively checked out by programType id 423:50
clarkbI've got the lockfile held on mirror-update. Could it be the other mirror update causing problems again?23:53
clarkbthat crontab is still commented out23:53
fungithere may be a broad race between start time and the ra timeout such that adjusting the timeout merely reduces or increases the random chance we'll get no ipv4 address23:56
*** jamesmcarthur has joined #openstack-infra23:56
clarkbthat's a good point23:57
clarkbin which case a very long timeout is probably what we want?23:57
