Friday, 2017-09-29

*** baoli has quit IRC00:00
*** Swami has quit IRC00:01
mnaserjeblair do we want to recheck 508336 ?00:01
*** mat128 has joined #openstack-infra00:01
*** baoli has joined #openstack-infra00:01
mordredmnaser: just hit the recheck button00:02
*** lukebrowning has quit IRC00:02
mnaserokay so once that merges, ill restart my test case00:03
mordredyah00:03
mordredand then if that passes on centos nodes we'll know we're good to land the real one00:03
fungiclarkb: i have some inline comments on 508348, one cosmetic and one about the regex00:04
mordredfungi: oops. that first one was my bad :)00:04
clarkbfungi: re regex it's from mordred and so we don't have to escape a $00:04
clarkbwe could write it the other way but meh00:04
* mordred can do a follow up to fix the name00:05
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Fix name of zuul sudo script task  https://review.openstack.org/50836100:07
mordredfungi: ^^00:07
mordredclarkb: you too00:07
fungiokay, was mostly worried that it was matching on the variable's name rather than its value. i guess my ansible-fu is weak and i really have no idea why that regex is there00:07
fungioh! now i get it00:07
*** ekcs has quit IRC00:07
clarkbfungi: the regex is matching contents of a file. If a line has that content it replaces it with the line we provide00:07
fungithis is the ansible equivalent of sed s/00:07
clarkbyes00:07
*** lukebrowning has joined #openstack-infra00:08
*** gildub has joined #openstack-infra00:08
fungiso lineinfile is for inline editing?00:08
clarkbya00:09
clarkbits also super clunky and now I don't like it00:09
fungii can certainly see why00:09
clarkbwould rather just command: sed -e"s/foo/bar/"00:09
fungi-i00:10
fungibut yeah00:10
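For reference, a minimal sketch of the two approaches being compared here; the path, pattern, and replacement are illustrative, not taken from the real role:

```yaml
# In-place edit with lineinfile: regexp is matched against each line of the
# file, and a matching line is replaced with `line`.
- name: Replace a matching line with lineinfile
  lineinfile:
    path: /etc/example.conf      # illustrative path
    regexp: '^foo='
    line: 'foo=bar'

# The sed-style equivalent clarkb mentions, run via the command module.
- name: Do the same edit with sed
  command: sed -i -e 's/^foo=.*/foo=bar/' /etc/example.conf
```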
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix sql reporting start/end times  https://review.openstack.org/50836200:11
jlkI don't like touching files like that at all00:11
jlkbut I can see the allure00:11
jeblairclarkb, mordred: real noop fix ^00:11
clarkbjlk: its far easier to understand imo and more powerful00:12
clarkbjlk: expressing that in lineinfile is much more complicated and error prone00:12
*** lukebrowning has quit IRC00:12
openstackgerritMerged openstack-infra/project-config master: Update fetch-zuul-cloner in base-test  https://review.openstack.org/50833600:12
jlkno  no, I get sed vs lineinfile, I just don't like editing a file with a SCM00:12
jlkI'd rather replace the file with a template or something like that00:12
mordredjlk: totally agree00:12
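And a sketch of the whole-file approach jlk prefers, rendering the file from a Jinja2 template instead of editing it in place; the file names are illustrative:

```yaml
- name: Install example.conf from a template
  template:
    src: example.conf.j2         # illustrative template shipped with the role
    dest: /etc/example.conf
```

Replacing the whole file makes the end state independent of whatever was on the node beforehand, which is why it tends to be less error-prone than line-level edits.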
mnasermordred jeblair i'll recheck in a few minutes, im at stage where tempest is running in current puppet job so i wanna see if it completes00:12
mordredmnaser: \o/00:13
mnaserand if it does then woo, if it doesn't it'll save me waiting another 20-30 minutes till it fails again00:13
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Added legacy-puppet-openstack-integration templates  https://review.openstack.org/50833200:13
jeblairmnaser: wfm00:13
*** lukebrowning has joined #openstack-infra00:14
clarkbjeblair: mnaser https://review.openstack.org/#/c/508334/ the depends on for that merged, so the error there is weird looking00:14
mnaserclarkb i was gonna rebase and repush... even tho im not sure why00:15
mnaserbut i can leave it as is if you want to check00:15
clarkbmnaser: I don't think you need a rebase, I think you can likely just recheck it00:15
clarkbmnaser: gerrit at least doesn't seem to think it is a merge failure00:15
clarkbwhich makes me think some interaction with depends on maybe?00:15
*** thorst has quit IRC00:15
mordredclarkb: gerrit shows merge conflict to me00:16
mnaserbig red "Patch in Merge Conflict" i see in the UI isn't gerrit?00:16
clarkbmnaser: no that is from the ci results00:16
mnaserTIL, so let me throw a recheck i guess00:16
clarkbthe box with owner in it will say merge conflict if gerrit itself says it's a conflict00:16
*** lukebrowning has quit IRC00:18
clarkb508302 is showing that some of the less trivial d-g jobs are passing00:19
*** lukebrowning has joined #openstack-infra00:20
mnaserif my legacy jobs are failing and im ignoring them and writing new jobs, how will i get the .zuul.yaml file merged then (without leaving the project with no ci for that duration?)00:21
clarkbjlk: mordred I think its important to treat ansible as a remote execution language here rather than configuration management. You can definitely clean things up but right now especially for legacy jobs its literally there for "run these scripts"00:21
mnaser(this could be really obvious)00:21
clarkbmnaser: you'd update your .zuul.yaml to run the new jobs and remove the old jobs from the project-config list I think00:22
jlkclarkb: agreed00:22
mnaserclarkb but i cant get my .zuul.yaml merged if my legacy jobs are failing? unless there's some depends-on black magic that would go on00:22
clarkbmnaser: you remove the legacy jobs so they won't run00:22
jlkclarkb: I'm also kinda over touching config on systems after deploy as a whole. But that's just the container koolaid talking00:22
jeblairclarkb: that merge failure may be the "someone pushed up a new patchset of the dependency" error00:23
clarkbjeblair: ya except in this case the dependency merged I think00:23
mnaserremove jobs in project-config, change that adds .zuul.yaml depends-on that one .. maybe that might workaround it?00:23
clarkbjeblair: so no new patchsets00:23
clarkbmnaser: ya00:23
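Roughly, the workflow being described, with made-up project and job names; the real change IDs and job lists would come from the project's own migration:

```yaml
# .zuul.yaml added in the project repository. The commit message of this
# change would carry a footer such as
#   Depends-On: <URL or Change-Id of the project-config change removing the legacy jobs>
- project:
    name: openstack/puppet-example      # illustrative project
    check:
      jobs:
        - puppet-example-integration    # illustrative new in-repo job
    gate:
      jobs:
        - puppet-example-integration
```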
jeblairclarkb: oh00:23
*** lukebrowning has quit IRC00:24
clarkbmriedem_dinner: is nova net expected to work right now http://logs.openstack.org/02/508302/3/check/legacy-tempest-dsvm-nnet/ac16495/logs/devstacklog.txt.gz#_2017-09-29_00_22_37_388 ?00:25
*** alex_xu has joined #openstack-infra00:25
*** alex_xu has quit IRC00:25
*** mrunge_ has joined #openstack-infra00:25
*** alex_xu has joined #openstack-infra00:25
mnaserhttps://review.openstack.org/#/c/508296/00:26
mnaserwoo my first success00:26
mnaserjeblair im rechecking to test base-test00:26
Jeffrey4lhi, where is the /etc/nodepool/provider file? it has disappeared.00:26
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [DNM] gate testing  https://review.openstack.org/50836700:26
*** mrunge has quit IRC00:26
*** lukebrowning has joined #openstack-infra00:26
*** thorst has joined #openstack-infra00:26
clarkbJeffrey4l: the information that was in those files is available in ansible but not necessarily in /etc/nodepool in all cases any longer. What are you trying to accomplish with it?00:27
Jeffrey4lclarkb, need to get the variable and set it in kolla's configuration file to build the image.00:27
clarkbJeffrey4l: right but what is it being used for?00:27
clarkbJeffrey4l: there may be an existing role we can add to your jobs to accomplish the same task00:27
clarkbor otherwise advice on the best way to get that info00:28
Jeffrey4lthis should be added into one jinja2 template. not /etc/yum.repos.d/ files.00:28
Jeffrey4lhow can i see these variables?00:28
jeblairclarkb, Jeffrey4l: if this is for a legacy job, the conversion script may have set the wrong parent for the job00:29
*** rcernin has quit IRC00:29
Jeffrey4lclarkb, it is used for building docker images.00:29
Jeffrey4ljeblair, yes. all jobs in kolla is red now00:29
clarkbJeffrey4l: http://logs.openstack.org/02/508302/3/check/legacy-devstack-gate-tox-run-tests/8971b5f/zuul-info/inventory.yaml it is part of the node inventory now00:29
jlkoh hey, ansible-lint change was merged upstream, and will be released tomorrow.00:30
clarkbJeffrey4l: are you using it to build the mirror info for builds?00:30
clarkbbecause that info is more directly consumable elsewhere00:30
Jeffrey4luse the mirror repo to build docker images.00:30
*** lukebrowning has quit IRC00:31
Jeffrey4lsince they are in ansible variables, how can I access them in my plain bash script?00:31
clarkb/etc/ci/mirror_info.sh iirc00:31
clarkbJeffrey4l: ^ is probably the best way to consume the particular mirror info in a bash script00:31
clarkbyou source it then you get things like NODEPOOL_FEDORA_MIRROR and NODEPOOL_EPEL_MIRROR and so on00:32
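A sketch of consuming that file from a legacy-style playbook step; the shell body is illustrative and only uses variables mentioned above:

```yaml
- name: Use the per-region mirror variables in a job step
  shell: |
    source /etc/ci/mirror_info.sh
    echo "EPEL mirror for this node:   ${NODEPOOL_EPEL_MIRROR}"
    echo "Fedora mirror for this node: ${NODEPOOL_FEDORA_MIRROR}"
  args:
    executable: /bin/bash    # `source` needs bash, not the default /bin/sh
```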
Jeffrey4lok. let me check. thanks.00:32
*** lukebrowning has joined #openstack-infra00:32
clarkb508348 is waiting on a trusty node, its child change in gate got one before it :/00:33
*** mat128 has quit IRC00:34
mnaser ya i noticed that too :X00:34
*** lukebrowning has quit IRC00:37
Jeffrey4lbtw, how is the /etc/ci/mirror_info.sh file added into the image?00:37
mnaserJeffrey4l there is a role that zuul runs before your job starts00:37
*** zhurong has joined #openstack-infra00:37
clarkbJeffrey4l: with zuulv2 it is added by nodepool, with zuulv3 its part of base job setup00:37
Jeffrey4lmind give me a code link?00:38
jeblairclarkb: a different project-config change merged around the same time that the dependent change in openstack-zuul-jobs merged.  it triggered merge-check events for all project-config changes, including 508334.  since the ozj change had landed, it did not include it in the speculative merge anymore.  however, it had not yet performed the reconfiguration needed by the ozj change, so it didn't have the new configuration cached.  so the configuration syntax check failed.  it was actually the merge-check pipeline that reported that error.00:38
jeblairi think that's two strikes against merge-check00:39
openstackgerritMerged openstack-infra/project-config master: Switch puppet jobs to legacy template  https://review.openstack.org/50833400:39
mnaserjeblair it failed even with depends-on and that change merged - http://logs.openstack.org/96/508296/9/check/puppet-openstack-integration-4-scenario001-tempest-centos-7/a614a5e/job-output.txt.gz (change in question - https://review.openstack.org/#/c/508296/)00:39
clarkbJeffrey4l: https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/roles/mirror-info00:39
*** thorst has quit IRC00:40
mnaserjeblair sorry, here's a more easily accessible link - http://logs.openstack.org/96/508296/9/check/puppet-openstack-integration-4-scenario001-tempest-centos-7/a614a5e/job-output.txt.gz#_2017-09-29_00_37_40_08257200:40
clarkbjeblair: I'm thinking maybe we should rely on gerrit for those checks?00:40
Jeffrey4lgot it, thanks a lot.00:40
mnaserjeblair oh shoot, you added base-test to the legacy base which im not using here00:41
mnaserlet me manually add it to my new base job00:41
openstackgerritJames E. Blair proposed openstack-infra/project-config master: Disable merge-check pipeline  https://review.openstack.org/50837100:41
jeblairclarkb: ^00:41
jeblairthat's a quick disable which we can roll forward or backwards later as needed.00:41
*** lukebrowning has joined #openstack-infra00:42
jeblairmnaser: ah yeah, the reproducer you linked earlier was a legacy- job right?  you move fast :)00:42
mnaserjeblair figured it was easier than spamming project-config changes here all day :D00:42
ianwjeblair / mordred : https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/dib-dsvm-functests-python2-centos-7/run.yaml#n37 <- did that happen for a reason, is it safe to convert back to "|" ?00:43
jeblairianw: i have no idea!00:44
clarkbjeblair: and re 508348 should I just be patient for it to find a trusty node?00:45
ianwok, i just remembered something flying by about | format00:45
jlkzuul didn't like that change00:45
fungijeblair: wow, the error on 508371 is amazing!00:46
jlk"Unknown configuration error"00:46
jlkdo we error if a pipeline doesn't have a trigger?00:46
clarkbmight have to delete it entirely or make the trigger unresolvable?00:46
jeblairwhew, i was worried it was going to print out the whole config or something :)00:46
*** lukebrowning has quit IRC00:46
jeblairclarkb: well, there are lots of templates00:46
jeblairi probably need some [] or {} or something00:46
*** thorst has joined #openstack-infra00:47
openstackgerritJames E. Blair proposed openstack-infra/project-config master: Disable merge-check pipeline  https://review.openstack.org/50837100:47
jlktrigger: {}00:47
jeblairjlk: agreed ^ :)00:47
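For clarity, the shape being agreed on, assuming merge-check is otherwise an ordinary independent pipeline; all other attributes are omitted here:

```yaml
# An empty trigger dict keeps the pipeline defined but nothing gets
# enqueued into it, effectively disabling it.
- pipeline:
    name: merge-check
    manager: independent
    trigger: {}
```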
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Fix sql reporting start/end times  https://review.openstack.org/50836200:48
jlklolz00:48
*** lukebrowning has joined #openstack-infra00:48
jeblair362 will need a scheduler restart00:48
clarkblets make sure 508348 doesn't get lost with ^, that fixes a whole ton of jobs00:49
jeblairyeah, wasn't planning on doing it right now00:49
jeblairclarkb: your trusty node is request 100-000004967300:49
ianwJeffrey4l: you might also be interested in https://git.openstack.org/cgit/openstack/diskimage-builder/tree/contrib/setup-gate-mirrors.sh and related in dib, where we use the mirrors00:50
jeblairclarkb: | 0000044323 | rax-dfw | None | ubuntu-trusty | 0e5b9e3d-cbab-40ef-986a-9231c9ea15b2 | building | 00:00:40:54 | locked | ubuntu-trusty-rax-dfw-0000044323 | None | None | None | 22 | nl01.openstack.org-30932-PoolWorker.rax-dfw-main | 100-0000049673 | None | None |00:50
clarkbjeblair: how do I get to ^ command on zuulv3.o.o ?00:51
jeblairwe're going to be a little more exposed to node build times in v3; if that exceeds our tolerance, we may want to decrease our build timeout in nodepool.00:51
Jeffrey4lyes. it is useful. thanks ianw00:51
*** thorst has quit IRC00:51
jeblairclarkb: http://paste.openstack.org/show/622240/00:52
mordredianw: definitely not on purpose00:52
clarkbjeblair: thanks!00:52
*** threestrands has joined #openstack-infra00:52
*** threestrands has quit IRC00:52
*** threestrands has joined #openstack-infra00:52
*** lukebrowning has quit IRC00:52
openstackgerritMohammed Naser proposed openstack-infra/project-config master: Switch to legacy puppet check jobs  https://review.openstack.org/50837300:53
*** erlon has quit IRC00:53
openstackgerritMohammed Naser proposed openstack-infra/openstack-zuul-jobs master: Add legacy puppet check jobs  https://review.openstack.org/50837400:53
*** cuongnv has joined #openstack-infra00:53
*** lukebrowning has joined #openstack-infra00:54
openstackgerritMohammed Naser proposed openstack-infra/openstack-zuul-jobs master: Remove non-legacy puppet check template  https://review.openstack.org/50837500:54
clarkbjeblair: this is work so definitely not expected today or any time soon, but couldn't we have node requests pull nodes off a queue in a fifo manner? So you don't actually assign a node until one is ready?00:54
clarkbzuul -> request -> nodepool -> nodepool gives request queue entry -> nodepool appends build to queue -> as builds complete oldest request in queue gets that node00:55
jeblairclarkb: yeah, we'd need another queue in nodepool (a build queue) distinct from the request queue00:56
*** LindaWang has joined #openstack-infra00:56
mnaserjeblair - http://zuulv3.openstack.org/static/stream.html?uuid=92d92683f8eb46b8bf4ddf1f4339aec8&logfile=console.log - this fix works00:56
mnaserctrl+f => Creating /etc/puppetlabs/code/modules/aodh00:56
mnaserit grabs it from /home/zuul00:56
mnaserand xenial behaviour still works as well00:57
mnaserhttp://zuulv3.openstack.org/static/stream.html?uuid=62758ac8d4d94628a1c763ee9ddee2fb&logfile=console.log00:57
jeblairmnaser: cool, i +3d 508337; the promote base-test to base change00:57
jeblairmnaser: thanks!00:57
mnaserno problem, thank you!00:57
*** thorst has joined #openstack-infra00:58
jeblairclarkb: in the interim, maybe we should look at our median build time for our clouds and set the timeout to some nth percentile over that00:58
clarkbjeblair: ya00:59
*** lukebrowning has quit IRC00:59
openstackgerritmelanie witt proposed openstack-infra/devstack-gate master: WIP Add mysqladmin -v extended-status processlist  https://review.openstack.org/50762600:59
*** thorst has quit IRC01:00
jeblairclarkb: it looks like 10m would be pretty good for rax actually.  maybe 15.01:00
*** thorst has joined #openstack-infra01:00
jeblairhttp://grafana.openstack.org/dashboard/db/nodepool-rackspace?from=1506042021678&to=150664682167801:00
*** lukebrowning has joined #openstack-infra01:00
*** thorst_ has joined #openstack-infra01:01
*** thorst has quit IRC01:04
openstackgerritJames E. Blair proposed openstack-infra/project-config master: Set rackspace launch timeout to 10m  https://review.openstack.org/50837801:04
jeblairclarkb: ^01:04
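Assuming the option involved is the OpenStack provider launch timeout of that era's nodepool (the key name and its exact placement in project-config's nodepool.yaml are assumptions based on the change title), the edit would look roughly like:

```yaml
providers:
  - name: rax-dfw
    # assumed option name; 600 seconds = the 10 minutes discussed above
    launch-timeout: 600
```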
*** thorst_ has quit IRC01:05
*** lukebrowning has quit IRC01:05
jeblairi'm wilting, i think i need to eod.  anything urgent before i do?01:06
clarkbI too am fading fast. I don't think there is anything super urgent other than continuing to go through failures and fix them01:06
*** lukebrowning has joined #openstack-infra01:06
clarkbgrenade is still unhappy, assuming sudo fixes finally get in I'll likely focus on grenade in the morning01:06
clarkbmaybe we should send an update?01:07
jeblairclarkb: probably a good idea01:08
*** wolverineav has quit IRC01:08
*** wolverineav has joined #openstack-infra01:09
fungiwe should prepare for it to be the top message in the "my job is broken... here's what i did" thread which is certain to follow01:10
clarkbsingle node tempest and grenade look happy now though01:10
ianwclarkb: your sudo fixes will fix everything that's failing 508344 right?01:11
*** lukebrowning has quit IRC01:11
mriedem_dinnerclarkb: i don't know what that job is01:11
*** mriedem_dinner is now known as mriedem01:11
mriedemnova-net will only run in a cellsv1 setup01:12
clarkbmriedem_dinner: its the tempest nova net job01:12
mriedemthere is no tempest nova-net job01:12
mriedemthere is the cells job01:12
clarkbmriedem: its possible that it is a bug01:12
mriedemwhich is cells v1 and runs nova-net01:12
clarkband the job should just go away01:12
mriedemit's likely something changed with branch restrictions or something in project-config01:12
* fungi hopes mriedem_dinner brought enough for the whole class01:12
mriedemdinner did not go well01:12
mriedemand thus,01:12
mriedemmy daughter will be having hers for breakfast01:13
clarkbianw: if you look at 508302 it dep'd on the sudo fix. It fixes a good chunk of stuff01:13
*** lukebrowning has joined #openstack-infra01:13
mriedembecause zucchini is terrifying01:13
clarkbianw: but multinode things are not happy if you want to dig into that01:13
*** bobh has joined #openstack-infra01:13
*** wolverineav has quit IRC01:13
ianwclarkb: ok, i'll see :)  i'd like to get the dib gate unblocked, just in case we need to push something out quickish01:13
openstackgerritSam Yaple proposed openstack-infra/bindep master: Remove grammar duplication  https://review.openstack.org/50680301:15
clarkbwoo sudo fix has trusty node finally01:16
*** lukebrowning has quit IRC01:17
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Swap order of sudoers manipulation  https://review.openstack.org/50834801:18
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Fix name of zuul sudo script task  https://review.openstack.org/50836101:18
*** lukebrowning has joined #openstack-infra01:19
clarkbwith that done, is anyone working on an update email yet?01:20
openstackgerritMohammed Naser proposed openstack-infra/project-config master: Switch to legacy puppet check jobs  https://review.openstack.org/50837301:21
*** ggillies has quit IRC01:21
fungii've already cracked open that bottle of saké, it's getting pretty late over here01:21
clarkbjeblair: ^ if you've not started I'll work on something01:22
*** andreww has joined #openstack-infra01:23
*** ggillies has joined #openstack-infra01:23
openstackgerritSam Yaple proposed openstack-infra/bindep master: Simplify grammar  https://review.openstack.org/50680301:24
clarkbhttps://etherpad.openstack.org/p/cvedg2Y74g01:24
SamYaplegotta say, don't totally hate Parsley now that ive been working with it. its kinda nice01:24
*** lukebrowning has quit IRC01:24
*** Apoorva has quit IRC01:25
*** lukebrowning has joined #openstack-infra01:25
*** xarses_ has quit IRC01:26
*** stakeda has joined #openstack-infra01:27
clarkbhow does ^ etherpad look?01:28
portdirectlooks good01:28
portdirectthough I'm seeing some issues with zuul cloner as well i think? http://logs.openstack.org/54/457754/67/check/legacy-openstack-helm-aio-basic-ovs-radosgw/cd21d38/job-output.txt.gz#_2017-09-29_01_23_05_35158301:29
clarkbya I think z-c is still having some corner case issues01:29
jeblairclarkb: ++01:29
clarkbI was focused on other stuff so don't really have z-c details but if someone does feel free to add to etherpad01:29
jeblairclarkb: etherpad ++ i mean :)01:30
*** lukebrowning has quit IRC01:30
jeblairi'm eoding01:30
ianwclarkb: small grammar update by me01:30
clarkbwhat about a note to avoid approving changes until check is shown to work for $project?01:31
clarkbI see a lot of stuff going into the gate that has no hope of passing01:31
*** sbezverk has joined #openstack-infra01:31
ianwclarkb: you might like to say to fix things in openstack-zuul-jobs legacy to start with01:31
ianwand then migrate working jobs into your tree?01:31
ianwif people are wondering what to do01:31
*** Sukhdev has quit IRC01:31
*** lukebrowning has joined #openstack-infra01:32
mnaserif anyone is around that can give the +1'd checks a push to help pave the way to clean up the puppet jobs and move them out of ozj01:33
mnaserhttps://review.openstack.org/#/q/owner:mnaser%2540vexxhost.com+status:open+(project:openstack-infra/project-config+OR+project:openstack-infra/openstack-zuul-jobs)01:33
clarkbanyone have that migration docs link handy?01:33
portdirectfor things that have stalled on checks done this morning, what should we do to get them pushed through?01:34
SamYapleianw: ive jumped straight to migrating, but its fairly low entropy project01:34
mnaserclarkb one thing to note, migration docs are out of date because of failing publish jobs :(01:34
ianwSamYaple: it's a valid path forward.  just thought people should know changes to the legacy jobs are accepted01:34
clarkbmnaser: arg01:34
SamYapleack01:35
SpamapSdoh!01:35
clarkbhttps://docs.openstack.org/infra/manual/zuulv3.html that look right?01:35
clarkblooks right to me01:35
clarkbportdirect: ^ shouldhopefully tell you01:35
mnaserhttp://git.openstack.org/cgit/openstack-infra/infra-manual/tree/doc/source/zuulv3.rst is more up to date, tho less formatted01:35
clarkbportdirect: but basically you'll want to identify the failure then likely work to fix it in openstack-infra/openstack-zuul-jobs/playbooks/legacy01:35
*** yamamoto has joined #openstack-infra01:36
*** bobh has quit IRC01:36
*** mat128 has joined #openstack-infra01:36
portdirectgotcha - reading up now, thanks clarkb01:36
openstackgerritMerged openstack-infra/project-config master: Promote base-test to base  https://review.openstack.org/50833701:36
clarkbok I'm gonna send that out now so I can eod too01:36
*** dprince has quit IRC01:37
portdirectalso thx SamYaple for pointing me to what I suspect is exactly what i need to change :)01:37
SamYapleportdirect: i will pass that thanks onto mnaser (it was the required-projects things)01:37
SpamapSclarkb: is another option porting that to not-legacy and putting the playbook in your own project's repo?01:38
SamYaplebtw to the whole crew here, i know its still pretty bumpy, but im very happy to have zuulv3 and i like it alot from using it so far01:38
clarkbSpamapS: yes, but I suspect for most getting legacy working is quicker and simpler01:38
* SpamapS has not kept up on the state of devstack-gate-not-legacy01:38
*** markvoelker has joined #openstack-infra01:38
clarkbSpamapS: since people want to merge code and stuff01:38
* mnaser is at patchset #1301:40
fungigoal with the legacy jobs was for them to be working enough for projects to get by until they have time to replace them with in-repo versions01:40
SpamapSclarkb: yeah, just wondering about the longer term "free them from project-config" effort.01:40
mnaseralready have all integration jobs passing, working on unit/etc01:40
*** lukebrowning has quit IRC01:41
SpamapSfungi: I figured as much.01:41
clarkbemail sent01:41
fungiso making them work in-place is a reasonable next step for the broken ones01:41
openstackgerritIan Wienand proposed openstack-infra/openstack-zuul-jobs master: Fix dib functional tests  https://review.openstack.org/50838301:42
fungi_however_ the self-testing nature of in-repo replacements does make this a relatively quick thing to iterate on, as mnaser is demonstrating ;)01:42
mnasercome watch/join the fun - https://review.openstack.org/#/c/508296/ :p01:42
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [DNM] gate testing  https://review.openstack.org/50836701:42
clarkbthanks everyone! /me finds a beer and dinner now01:42
*** lukebrowning has joined #openstack-infra01:43
fungithanks for sending that out, clarkb!01:43
fungii also like your priority ordering there (beer and dinner, not t'other way 'round)01:43
* ianw is now hungry and goes to find lunch01:44
*** yamamoto has quit IRC01:44
ianwi'll keep an eye on openstack-zuul-jobs for fixes01:44
openstackgerritLogan V proposed openstack-infra/openstack-zuul-jobs master: Add openstack-ansible required-projects parent job  https://review.openstack.org/50828101:45
SpamapSmnaser: it's kind of magical isn't it?01:45
SpamapSI'm doing the same with my internal zuulv3 here at GoDaddy :)01:45
logan-do child jobs required-projects get merged with the parent jobs? or is it a straight override?01:45
SpamapSiterating on a giant string of patches in a PR by just pushing and watching it run the new playbook.01:45
mnasermerged afaik logan-01:45
logan-i was hoping for that answer01:46
logan-thanks01:46
*** stakeda has quit IRC01:46
SpamapSlogan-: you're thinking dependencies not child/parent01:46
SpamapSlogan-: and AFAIK no, required-projects just affects what gets checked out. Each change is still its own entity, and they go in one by one whether they're in one project or another.01:46
melwittas part of the switchover to zuul v3, is http://status.openstack.org/zuul/ expected not to work anymore?01:47
*** lukebrowning has quit IRC01:47
SamYaplemelwitt: its borked right now01:47
SamYaplezuul.openstack.org01:47
fungimelwitt: try reloading? it should be a redirect now01:47
SamYapleor rather what it redirects to, zuulv3.openstack.org01:48
fungiunless that hasn't merged/applied yet01:48
logan-SpamapS: so if in https://review.openstack.org/#/c/508281/3/zuul.d/zuul-legacy-jobs.yaml, let's use the job legacy-openstack-ansible-os_keystone-ansible-uw_apache as an example01:48
melwittfungi: yeah not redirecting yet01:48
melwittSamYaple: cool, thanks01:48
logan-the parent job has a list of required-projects, and this job has a required-project, what is cloned? both this job's required-projects AND legacy-openstack-ansible-base?01:48
fungimelwitt: thanks for the reminder... i01:48
fungii'll track down what happened to the redirect patch01:49
*** camunoz has quit IRC01:50
*** zhurong has quit IRC01:50
melwittI had been using that one bc it was a lot faster than zuul.openstack.org in the past. but zuul.openstack.org seems to be working fast now01:50
SpamapSlogan-: ahhhhh I see what you're asking01:50
mnaseranyone know how i can pull anymore info when i get a "MODULE FAILURE" :( http://logs.openstack.org/96/508296/13/check/puppet-openstack-module-build/efae1ea/job-output.txt.gz#_2017-09-29_01_48_19_84107901:50
mnaserhttps://review.openstack.org/#/c/508296/13/playbooks/prepare-node-unit.yaml01:51
mnaserthis is where it fails01:51
SpamapSlogan-: I believe the list becomes just the one (so openstack/swift)01:51
fungimelwitt: https://review.openstack.org/507244 should take care of it... i just approved01:51
SpamapSlogan-: but I'll double check the code/api/etc.01:51
clarkbmnaser: look at ara01:51
clarkbmnaser:  it tends to be better at ansible-related fails00:51
melwittfungi: cool, thanks01:51
fungithanks for reminding me it wasn't merged!01:51
melwitt:)01:52
melwittaccidental reminder01:52
logan-thanks SpamapS01:53
*** kjackal_ has joined #openstack-infra01:53
mnaserclarkb perfect!  i should use it more often :p01:53
SpamapSlogan-: I was wrong. It does add them.01:54
SpamapSlike magic01:54
logan-awesome01:54
logan-thanks a lot for checking01:54
SpamapShttps://github.com/openstack-infra/zuul/blob/feature/zuulv3/zuul/model.py#L906-L90901:55
*** lukebrowning has joined #openstack-infra01:55
SpamapSlogan-: as zuul walks the tree it keeps loading the job from the tree and calling that method on it with newly found required-projects stanzas01:55
fungimagic!01:55
SpamapSso that also, I believe, will let you have a child of a parent that changes the branch of a project01:56
* fungi throws fistfuls of glitterfetti into the air01:56
SpamapSwhich is really sweet01:56
logan-interesting01:56
logan-yeah that is cool01:56
SpamapSso you can have gate-mything-others-master and gate-mything-others-newfeaturebranch01:56
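A sketch of the merge behaviour logan- asked about, using the job names from the change under discussion; the specific required-projects entries are illustrative:

```yaml
- job:
    name: legacy-openstack-ansible-base
    required-projects:
      - openstack/openstack-ansible     # illustrative
      - openstack/requirements          # illustrative

- job:
    name: legacy-openstack-ansible-os_keystone-ansible-uw_apache
    parent: legacy-openstack-ansible-base
    required-projects:
      - openstack/keystone              # illustrative
    # Effective checkout list for this job: the union of the parent's and
    # the child's required-projects, per the model.py code linked above.
```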
*** zhurong has joined #openstack-infra01:59
*** lukebrowning has quit IRC02:00
*** lukebrowning has joined #openstack-infra02:01
*** ihrachys has quit IRC02:02
*** thorst has joined #openstack-infra02:02
*** ihrachys has joined #openstack-infra02:02
*** liujiong has joined #openstack-infra02:04
*** lukebrowning has quit IRC02:06
*** ihrachys has quit IRC02:07
*** ihrachys has joined #openstack-infra02:07
*** lukebrowning has joined #openstack-infra02:08
*** mat128 has quit IRC02:09
openstackgerritPete Birley proposed openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs  https://review.openstack.org/50838702:11
Jeffrey4lhow can i get the information about /etc/nodepool/node_private and /etc/nodepool/sub_node_private ?02:11
*** lukebrowning has quit IRC02:12
openstackgerritMohammed Naser proposed openstack-infra/project-config master: Remove legacy-puppet-openstack-integration jobs  https://review.openstack.org/50838802:12
*** hongbin has joined #openstack-infra02:13
*** markvoelker has quit IRC02:13
*** dave-mcc_ has quit IRC02:13
*** lukebrowning has joined #openstack-infra02:14
openstackgerritPete Birley proposed openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs  https://review.openstack.org/50838702:14
*** ijw has quit IRC02:16
openstackgerritMohammed Naser proposed openstack-infra/openstack-zuul-jobs master: Remove all legacy puppet openstack integration jobs  https://review.openstack.org/50839002:16
*** ijw has joined #openstack-infra02:16
*** lukebrowning has quit IRC02:18
*** hongbin_ has joined #openstack-infra02:19
fungianother taste of our own medicine... legacy puppet apply, beaker and logstash filter jobs are broken on system-config so 507244 can't merge02:20
*** lukebrowning has joined #openstack-infra02:20
*** rcernin has joined #openstack-infra02:20
*** hongbin has quit IRC02:21
openstackgerritMerged openstack-infra/irc-meetings master: Earlier meeting time for international attendees  https://review.openstack.org/50820202:22
openstackgerritPete Birley proposed openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs  https://review.openstack.org/50838702:23
mnaserif anyone's around - https://review.openstack.org/#/c/508333/1/zuul.d/zuul-legacy-project-templates.yaml - before last step which allows us to bring jobs in-repo02:24
*** lukebrowning has quit IRC02:25
mnaserand maybe this one too, first step to removing unit/check jobs - https://review.openstack.org/#/c/508374/02:25
*** spotz has quit IRC02:29
*** mudpuppy has quit IRC02:29
*** lukebrowning has joined #openstack-infra02:31
*** lbragstad has joined #openstack-infra02:31
*** ramishra has joined #openstack-infra02:33
*** lukebrowning has quit IRC02:35
*** yamamoto has joined #openstack-infra02:36
*** spotz has joined #openstack-infra02:37
SpamapSmnaser: I think they all passed out ;)02:37
mnaserSpamapS the fun part is i can just chain things with depends-on02:38
mnaserand get on my merry way02:38
mnaserok just threw up a whole chain which will add new jobs, point projects to new jobs and remove old jobs from ozj02:39
fungii haven't passed out _quite_ yet02:41
fungiwill take a look in a sec02:42
*** lukebrowning has joined #openstack-infra02:42
mnaseranyways im gonna go to bed and hopefully everything is +1'd by Zuul tomorrow02:45
* mnaser &02:45
*** lukebrowning has quit IRC02:47
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: OpenStack-Helm: Update legacy jobs  https://review.openstack.org/50838702:48
*** esberglu has joined #openstack-infra02:49
*** lukebrowning has joined #openstack-infra02:49
openstackgerritIan Wienand proposed openstack-infra/openstack-zuul-jobs master: Add diskimage-builder requirements for heat in updown jobs  https://review.openstack.org/50839602:50
SpamapSmnaser: I can tell, zuulv3 and you are going to be BFF's02:50
*** lukebrowning has quit IRC02:53
*** esberglu has quit IRC02:53
*** lukebrowning has joined #openstack-infra02:55
jianghuawFrom the page of http://zuulv3.openstack.org/; it shows "Zuul version: 2.5.3.dev1373". I'm a little confused. Which version of zuul is used for the upstream CI?02:58
clarkbjianghuaw: we havent tagged it as version 3 yet03:00
clarkbbut its running the code that will be tagged version 303:00
*** lukebrowning has quit IRC03:00
SpamapSjianghuaw: if you want to see the code, checkout feature/zuulv303:01
*** lukebrowning has joined #openstack-infra03:01
*** armax has quit IRC03:04
ianwi think with 508396 the devstack gate (i.e. the devstack project) might be ok modulo multinode03:07
openstackgerritIan Wienand proposed openstack/diskimage-builder master: [DNM] gate testing  https://review.openstack.org/50836703:09
*** markvoelker has joined #openstack-infra03:10
prometheanfireianw: is dib ready for new changes?03:11
ianwprometheanfire: that's a joke right? :)03:12
*** zhurong has quit IRC03:12
prometheanfireianw: ya, seems to have been a rough week03:13
ianwkeep an eye on the chain of 50836703:13
prometheanfireI am happy that infra got me v3 as a present today of all days though03:13
*** lukebrowning has quit IRC03:14
fungiprometheanfire: your birthiversary?03:14
prometheanfirefungi: something like that03:15
fungias my friends like to say, congrats on making it another year without dying03:16
prometheanfireanother year closer to my inevitable demise03:16
prometheanfire:D03:16
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Add requirements to all pylint jobs  https://review.openstack.org/50830303:16
ianw^ ok, that's one less problem03:16
*** lukebrowning has joined #openstack-infra03:17
prometheanfireya, that was surprising to me03:17
fungior one more, depending on how you classify pylint03:17
prometheanfirelol, status.openstack.org/zuul is dead03:18
fungiprometheanfire: we need system-config's jobs working enough to be able to merge https://review.openstack.org/50724403:19
fungito solve that03:19
fungibut i've run out of steam for looking into it03:20
*** lukebrowning has quit IRC03:21
*** lukebrowning has joined #openstack-infra03:23
ianwi've got so much in flight i'm losing track.  afk for a bit to let jobs process so i can see where i'm at03:23
prometheanfirefungi: ah, known issue, k03:24
*** baoli has quit IRC03:26
*** lukebrowning has quit IRC03:28
*** hongbin_ has quit IRC03:28
jianghuawSpamapS, thanks for the response. Actually zuul v3 is used although it shows 2.5.3.dev1373 at the bottom of http://zuulv3.openstack.org/?03:29
*** lukebrowning has joined #openstack-infra03:29
clarkbjianghuaw: yes, because we haven't tagged a zuulv3 release yet03:29
clarkbso the version reported by git is 2.5.3.dev137303:30
clarkbjianghuaw: once things settle in we'll tag a 3.0 release and that will update03:30
jianghuawclarkb, got it. thanks for the clarification.03:30
jianghuawThat's cool. I will look at zuul v3 and plan to use it for XenServer CI.03:31
SpamapSwell technically git isn't reporting a version03:32
SpamapSpbr is making 2.5.3.dev137303:32
SpamapSjianghuaw: If you need help deploying, let me know. #zuul is also a zuul-specific channel (though it is mostly dev centric)03:33
SpamapSjianghuaw: I use this for deploying:  https://github.com/BonnyCI/hoist03:33
jianghuawSpamapS, Thanks very much.03:34
*** lukebrowning has quit IRC03:34
* SpamapS disappears to find some sushi03:34
*** lukebrowning has joined #openstack-infra03:36
*** ekcs has joined #openstack-infra03:36
*** rlandy has quit IRC03:37
clarkbso many things need requirements http://logs.openstack.org/48/507148/1/gate/legacy-bifrost-integration-tinyipa-opensuse-423/789afce/job-output.txt.gz#_2017-09-29_03_32_20_67043903:37
clarkbmight be worth a special email just for this particular fail case03:38
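The recurring fix for this failure mode, as in the bifrost and pylint changes elsewhere in this log, is to add openstack/requirements to the affected legacy job's required-projects; a sketch showing only the added attribute:

```yaml
- job:
    name: legacy-bifrost-integration-tinyipa-opensuse-423
    required-projects:
      - openstack/requirements
```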
*** lukebrowning has quit IRC03:40
*** markvoelker has quit IRC03:42
*** lukebrowning has joined #openstack-infra03:43
*** lukebrowning has quit IRC03:48
*** udesale has joined #openstack-infra03:48
*** lukebrowning has joined #openstack-infra03:49
*** ykarel has joined #openstack-infra03:50
*** lukebrowning has quit IRC03:54
*** lukebrowning has joined #openstack-infra03:56
*** lbragstad has quit IRC04:00
*** lukebrowning has quit IRC04:00
*** lukebrowning has joined #openstack-infra04:02
*** links has joined #openstack-infra04:02
*** kjackal_ has quit IRC04:03
*** lukebrowning has quit IRC04:07
*** mat128 has joined #openstack-infra04:07
*** cuongnv has quit IRC04:12
*** cuongnv has joined #openstack-infra04:12
*** lukebrowning has joined #openstack-infra04:13
*** lukebrowning has quit IRC04:18
*** lukebrowning has joined #openstack-infra04:19
*** zhurong has joined #openstack-infra04:19
*** ekcs has quit IRC04:22
*** SumitNaiksatam has joined #openstack-infra04:22
*** lukebrowning has quit IRC04:24
openstackgerritIan Wienand proposed openstack-infra/openstack-zuul-jobs master: Add diskimage-builder/sahara requirements in updown jobs  https://review.openstack.org/50839604:24
*** lukebrowning has joined #openstack-infra04:26
*** namnh has quit IRC04:27
*** namnh has joined #openstack-infra04:28
*** lukebrowning has quit IRC04:30
*** lukebrowning has joined #openstack-infra04:32
ramishrahi guys, any idea why this job is failing after zuul3 migration? http://logs.openstack.org/12/508112/1/check/legacy-heat-dsvm-functional-orig-mysql-lbaasv2/0f6d861/logs/devstacklog.txt.gz#_2017-09-29_02_11_02_71604:32
ramishraIt seems to be installing it from git though http://logs.openstack.org/12/508112/1/check/legacy-heat-dsvm-functional-orig-mysql-lbaasv2/0f6d861/logs/devstacklog.txt.gz#_2017-09-29_01_44_08_25404:33
ramishraianw: Hi, any idea? ^^^04:34
*** lukebrowning has quit IRC04:36
*** esberglu has joined #openstack-infra04:37
*** lukebrowning has joined #openstack-infra04:38
*** markvoelker has joined #openstack-infra04:39
*** mat128 has quit IRC04:40
*** stakeda has joined #openstack-infra04:40
*** coolsvap has joined #openstack-infra04:42
*** esberglu has quit IRC04:42
*** lukebrowning has quit IRC04:43
*** Sukhdev has joined #openstack-infra04:43
*** lukebrowning has joined #openstack-infra04:44
*** Guest50285 has quit IRC04:46
*** lukebrowning has quit IRC04:49
*** lukebrowning has joined #openstack-infra04:50
*** Hal has joined #openstack-infra04:51
*** Hal is now known as Guest1175004:51
*** lukebrowning has quit IRC04:55
*** psachin has joined #openstack-infra04:56
*** lukebrowning has joined #openstack-infra04:57
*** dhajare has joined #openstack-infra04:58
*** lukebrowning has quit IRC05:01
ianwramishra: looking05:01
ianwahh, yeah05:02
ianwwill require something like https://review.openstack.org/#/c/508344/05:02
ramishraianw: Ah, thanks!05:03
ianwthe issue is there's some more devstack issues in the gate too, including multinode.  honestly, your best bet might be just to wait a while at this point as we sort them out05:03
ramishraianw: np, we can wait:)05:03
fricklercan someone check why zuul didn't produce any gate result here? https://review.openstack.org/507798 should I do a recheck to get normal check results for comparison?05:06
*** lukebrowning has joined #openstack-infra05:09
openstackgerritAndreas Jaeger proposed openstack-infra/infra-manual master: Add howto section on migrating legacy jobs to v3  https://review.openstack.org/50829505:10
*** ianychoi has quit IRC05:11
*** ianychoi has joined #openstack-infra05:12
*** markvoelker has quit IRC05:12
*** lukebrowning has quit IRC05:14
openstackgerritAndreas Jaeger proposed openstack-infra/infra-manual master: Add docs about tox jobs and sibling installation  https://review.openstack.org/50832705:14
*** rcernin has quit IRC05:15
*** lukebrowning has joined #openstack-infra05:16
AJaeger_mordred, fixed some trivial problems on your change ^05:17
ianwall indications are that legacy-tempest-dsvm-nnet should not be running on devstack master, but it is.  hmm :/05:18
*** mrunge_ is now known as mrunge05:18
AJaeger_anybody want to +2a zuul v3 doc improvements? ^05:18
AJaeger_Hi ianw !05:18
ianwboth read fine to me, better to iterate on them05:19
AJaeger_yep - thanks05:20
*** lukebrowning has quit IRC05:20
ianwAJaeger_: you seen anything funny with branch regex matches?05:22
*** armax has joined #openstack-infra05:23
*** gongysh has joined #openstack-infra05:23
AJaeger_ianw: no time to dig into anything ;( I'm still travelling and looked for a 5 minute help ;)05:23
ianwAJaeger_: np05:23
*** pcaruana has joined #openstack-infra05:24
openstackgerritIan Wienand proposed openstack-infra/openstack-zuul-jobs master: Wrap legacy-tempest-dsvm-nnet branch match regex  https://review.openstack.org/50840505:25
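For context only: zuul v3 jobs limit where they run with a branches regex, and 508405 adjusts that regex for this job. A hedged sketch of the attribute involved; the pattern shown is illustrative and not the one in the actual change:

```yaml
- job:
    name: legacy-tempest-dsvm-nnet
    # Illustrative only: restrict the job to older stable branches so it
    # no longer matches devstack master.
    branches: ^stable/(newton|ocata)$
```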
*** iyamahat has joined #openstack-infra05:25
*** lukebrowning has joined #openstack-infra05:25
*** pgadiya has joined #openstack-infra05:27
*** pcaruana has quit IRC05:29
*** iyamahat has quit IRC05:30
*** lukebrowning has quit IRC05:30
*** lukebrowning has joined #openstack-infra05:31
AJaeger_team, should we remove the old project-config/jenkins/jobs directory and layout/zuul.yaml files - so that nobody can submit changes anymore and they get conflicts on existing changes?05:35
* AJaeger_ just added comments and -1 to some existing changes that need to be adapted for zuul v3 and noticed changes submitted this week still touching the old files05:36
*** lukebrowning has quit IRC05:36
*** Sukhdev has quit IRC05:37
openstackgerritMerged openstack-infra/infra-manual master: Add howto section on migrating legacy jobs to v3  https://review.openstack.org/50829505:38
openstackgerritMerged openstack-infra/infra-manual master: Add docs about tox jobs and sibling installation  https://review.openstack.org/50832705:38
ianwAJaeger_: for the immediate time i've found it useful to cross-reference against them05:43
ianwbut, in a few days when this has settled down, sure05:43
*** lukebrowning has joined #openstack-infra05:44
AJaeger_ianw: ok, then I'll add some more -1 when new changes come in05:45
*** lukebrowning has quit IRC05:48
*** rcernin has joined #openstack-infra05:52
*** yamamoto_ has joined #openstack-infra05:53
*** lukebrowning has joined #openstack-infra05:55
*** yamamoto has quit IRC05:57
*** lukebrowning has quit IRC05:59
*** lukebrowning has joined #openstack-infra06:01
*** stakeda has quit IRC06:01
*** lukebrowning has quit IRC06:06
*** lukebrowning has joined #openstack-infra06:07
*** markvoelker has joined #openstack-infra06:09
*** lukebrowning has quit IRC06:12
*** hashar has joined #openstack-infra06:12
*** lukebrowning has joined #openstack-infra06:13
*** masber has joined #openstack-infra06:15
*** lukebrowning has quit IRC06:18
*** lukebrowning has joined #openstack-infra06:20
*** kiennt26 has joined #openstack-infra06:22
yolandahi AJaeger_ , ianw , what's the status with infra? jobs broken? i'm seeing errors on my bifrost jobs06:24
*** lukebrowning has quit IRC06:24
*** esberglu has joined #openstack-infra06:25
*** lukebrowning has joined #openstack-infra06:26
*** andreas_s has joined #openstack-infra06:29
*** esberglu has quit IRC06:29
*** lukebrowning has quit IRC06:31
*** lukebrowning has joined #openstack-infra06:32
*** lukebrowning has quit IRC06:37
*** mat128 has joined #openstack-infra06:39
*** markvoelker has quit IRC06:43
*** lukebrowning has joined #openstack-infra06:44
SamYaplecan someone clarify for me how to remove the legacy jobs? They are currently b0rked and we are just going to setup zuulv3 gates from scratch for LOCI06:45
SamYaplefor now we just want to purge the legacy jobs and noop zuulv306:45
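A sketch of what a noop-only setup looks like, assuming the project stanza lives in project-config (or in an in-repo .zuul.yaml once the legacy template entries are removed); noop is zuul's built-in do-nothing job:

```yaml
- project:
    name: openstack/loci
    check:
      jobs:
        - noop
    gate:
      jobs:
        - noop
```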
*** tmorin has joined #openstack-infra06:47
*** lukebrowning has quit IRC06:49
*** gongysh has quit IRC06:49
*** lukebrowning has joined #openstack-infra06:50
*** jtomasek has joined #openstack-infra06:53
*** wewe0901 has joined #openstack-infra06:54
*** lukebrowning has quit IRC06:55
*** lukebrowning has joined #openstack-infra06:56
openstackgerritTony Breeds proposed openstack-infra/openstack-zuul-jobs master: Pin legacy-requirements-python34 to a trusty node  https://review.openstack.org/50842106:59
*** shardy_afk is now known as shardy07:00
* tonyb has no idea if ^^ is right 07:00
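For reference, a pin like that usually amounts to setting the job's nodeset; the nodeset name below is an assumption (whichever trusty nodeset openstack-zuul-jobs defines) and is not checked against 508421:

```yaml
- job:
    name: legacy-requirements-python34
    nodeset: ubuntu-trusty    # assumed name of a single-node trusty nodeset
```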
yamamoto_is the RETRY_LIMIT thing recheck'able?07:01
yamamoto_eg. https://review.openstack.org/#/c/507037/07:01
*** lukebrowning has quit IRC07:01
*** lukebrowning has joined #openstack-infra07:03
*** gildub has quit IRC07:03
*** pgadiya has quit IRC07:04
*** pcaruana has joined #openstack-infra07:04
*** gongysh has joined #openstack-infra07:04
*** lukebrowning has quit IRC07:07
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Fix dib functional tests  https://review.openstack.org/50838307:09
*** lukebrowning has joined #openstack-infra07:09
*** mat128 has quit IRC07:10
*** jpich has joined #openstack-infra07:12
*** ihrachys has quit IRC07:12
*** ihrachys has joined #openstack-infra07:12
*** lukebrowning has quit IRC07:13
*** gk_ has joined #openstack-infra07:13
*** gk_ has quit IRC07:14
*** Guest11750 has quit IRC07:15
*** lukebrowning has joined #openstack-infra07:15
*** gk_ has joined #openstack-infra07:16
*** florianf has joined #openstack-infra07:16
*** gk_ has quit IRC07:17
AJaeger_SamYaple: docs just merged on what to do - see https://review.openstack.org/508327  . Unfortunately those are not published yet07:18
AJaeger_you can read draft which is up07:18
AJaeger_infra-root, infra-manual publishing did fail somehow for 50832707:18
AJaeger_yolanda: I'm not up to speed, see http://lists.openstack.org/pipermail/openstack-dev/2017-September/122834.html for last email on it07:19
*** lukebrowning has quit IRC07:19
yolandayep, seems we hit "Missing inclusion of the requirements repo"07:20
*** kiennt26 has quit IRC07:20
*** lukebrowning has joined #openstack-infra07:21
SamYapleAJaeger_: thanks. ill check it out07:22
SamYapleim having a bit of trouble with openstack/loci right now. it only had a noop job.... but nothing seems to be working with that repo07:23
SamYaplei see it in zuulv3.o.o , but it never reports07:23
*** lukebrowning has quit IRC07:26
*** lukebrowning has joined #openstack-infra07:28
*** shardy is now known as shardy_afk07:30
*** namnh has quit IRC07:31
*** masber has quit IRC07:31
*** namnh has joined #openstack-infra07:32
*** lukebrowning has quit IRC07:32
*** lukebrowning has joined #openstack-infra07:34
SamYaplehttps://review.openstack.org/#/c/508425/ like this, zuul doesnt seem to kick anything off. i can't figure out where to begin looking07:36
*** ccamacho has joined #openstack-infra07:37
*** lukebrowning has quit IRC07:38
AJaeger_SamYaple: sorry, currently travelling and not up to speed - best come back when the US wakes up. Or reply to the email...07:39
*** rpittau has joined #openstack-infra07:39
matbu_SamYaple: yep looks like zuul is not awake : http://status.openstack.org/zuul/07:39
*** jpena|off is now known as jpena07:39
SamYaplematbu_: that page is b0rked still. go to zuul.openstack.org07:40
AJaeger_matbu_: that's zuul v2 - SamYaple had the right URL which is zuulv3.openstack.org07:40
*** markvoelker has joined #openstack-infra07:40
AJaeger_SamYaple: zuul*v3*07:40
SamYapleAJaeger_: it redirects07:40
AJaeger_SamYaple: Ah!07:40
openstackgerritTony Breeds proposed openstack-infra/openstack-zuul-jobs master: Pin legacy-requirements-python34 to a trusty node  https://review.openstack.org/50842107:41
SamYaplek well i should sleep anyway. im just going to hope its all fixed when i wake up :)07:42
openstackgerritPavlo Shchelokovskyy proposed openstack-infra/project-config master: Add separate coverage job for ironic-inspector  https://review.openstack.org/50812907:42
matbu_ha thx better now07:42
*** egonzalez has joined #openstack-infra07:44
*** lukebrowning has joined #openstack-infra07:45
matbu_im wondering why zuul is not kicking in btw here https://review.openstack.org/#/c/487496/07:48
chandankumarAJaeger_: regarding this review https://review.openstack.org/#/c/507038/ do i need to submit changes against something else by following zuulv3 docs?07:48
matbu_with the A+07:48
*** lukebrowning has quit IRC07:49
*** lukebrowning has joined #openstack-infra07:51
*** rossella_s has joined #openstack-infra07:53
*** threestrands has quit IRC07:53
tonybI'm seeing a few jobs fail with something like: http://logs.openstack.org/49/508249/1/check/legacy-cross-nova-func/ad36a73/job-output.txt.gz#_2017-09-29_02_05_54_784418 any ideas?07:54
*** verdurin has quit IRC07:54
*** lukebrowning has quit IRC07:55
*** lukebrowning has joined #openstack-infra07:57
*** armax has quit IRC07:57
*** shardy_afk is now known as shardy07:58
*** chem has joined #openstack-infra08:00
*** chenying_ has quit IRC08:00
*** chenying_ has joined #openstack-infra08:01
*** lukebrowning has quit IRC08:02
*** namnh has quit IRC08:03
*** lukebrowning has joined #openstack-infra08:03
*** namnh has joined #openstack-infra08:04
chemhi, I have a job (https://review.openstack.org/#/c/474967/) that isn't picked up by zuul and http://status.openstack.org/zuul/ is not loading.08:04
chemis that because of the workflow -1 or am i missing something08:04
chem?08:04
*** rossella_s has quit IRC08:05
chemoki, I've checked http://zuulv3.openstack.org/ and cannot find 47496708:06
*** tushar has joined #openstack-infra08:07
*** lukebrowning has quit IRC08:08
tusharHi All, a few hours back the third party CI stopped listening to the gerrit patches08:09
*** lukebrowning has joined #openstack-infra08:09
tusharI think this might be because of changes related to zuul v308:10
*** gongysh has quit IRC08:10
tusharDoes anybody know what changes are required in a third party CI setup?08:11
*** rossella_s has joined #openstack-infra08:11
*** markvoelker has quit IRC08:12
*** esberglu has joined #openstack-infra08:13
*** lukebrowning has quit IRC08:14
*** ykarel is now known as ykarel|lunch08:14
*** esberglu has quit IRC08:18
*** lukebrowning has joined #openstack-infra08:22
evrardjpthis may look like a dumb question but, in the repo retirement process, when does the repo disappear from cgit?08:23
*** e0ne has joined #openstack-infra08:24
*** ralonsoh has joined #openstack-infra08:25
*** lukebrowning has quit IRC08:26
*** gongysh has joined #openstack-infra08:26
fricklerevrardjp: it doesn't, there will only be an empty repo pushed as the last commit, but the history will be kept forever (whatever that may be in terms of os-infra)08:27
evrardjpmmmm08:28
evrardjpwhere is the procedure for renaming a repo then?08:28
*** gongysh has quit IRC08:28
*** lukebrowning has joined #openstack-infra08:28
evrardjpI think I see the end of the tunnel08:28
fricklerevrardjp: https://docs.openstack.org/infra/manual/creators.html#project-renames08:29
evrardjpfrickler: thanks!08:29
evrardjpfrickler: ok let me explain the problem08:30
*** lukebrowning has quit IRC08:33
*** lukebrowning has joined #openstack-infra08:34
evrardjpI don't see any topic project-rename , but I still don't see openstack-ansible-security in cgit08:35
evrardjp(this is part of a bigger issue, but let's say we tackle that one)08:35
*** rossella_s has quit IRC08:36
evrardjpopenstack-ansible-security was retired in favor of ansible-hardening, but I still need old references for old stable branches into this repo08:36
evrardjp(the openstack-ansible-security one)08:36
*** caphrim007 has quit IRC08:37
*** caphrim007_ has joined #openstack-infra08:37
*** jaosorior has joined #openstack-infra08:37
*** alexchadin has joined #openstack-infra08:37
*** sbezverk has quit IRC08:38
*** lukebrowning has quit IRC08:39
openstackgerritAndrea Frittoli proposed openstack-infra/devstack-gate master: Throwaway patch to check subunit file processing  https://review.openstack.org/50817108:39
*** tosky has joined #openstack-infra08:44
evrardjpat least https://git.openstack.org/cgit/openstack/openstack-ansible-security  doesn't seem to exist anymore08:45
evrardjpthat blocks me from releasing anything...08:47
fricklerevrardjp: hmm, seems cgit is a different thing indeed, need to wait for some infra-root with more knowledge, then08:49
evrardjpthanks for the effort and for the help already!08:50
openstackgerritKrzysztof Klimonda proposed openstack-infra/zuul feature/zuulv3: Add zuul supplementary groups before setgid/setuid  https://review.openstack.org/50844408:51
openstackgerritMehdi Abaakouk (sileht) proposed openstack-infra/openstack-zuul-jobs master: Add missing projects to telemetry jobs  https://review.openstack.org/50844808:56
*** bhavik1 has joined #openstack-infra08:57
odyssey4meevrardjp the openstack-ansible-security git mirror is still in github, so it's likely still in git.o.o too?08:57
evrardjpnope that's what I thought08:57
evrardjpthat's why I lost time :p08:57
evrardjphave a look at my link08:58
evrardjpI was always checking github insted of cgit08:58
evrardjpinstead*08:58
*** ykarel|lunch is now known as ykarel08:59
odyssey4megerrit still knows about it09:00
*** pas-ha has joined #openstack-infra09:00
odyssey4megit remote set-url origin https://git.openstack.org/openstack/openstack-ansible-security09:00
odyssey4methat works09:00
*** mriedem has quit IRC09:01
evrardjpyes, and git also09:01
odyssey4mecgit might just be set to not show repositories which are read-only09:01
evrardjpmmm09:01
evrardjpso the ACL on gerrit could have an impact?09:01
odyssey4mewhy is cgit important in this equation?09:01
evrardjpit's what's used in releases09:01
odyssey4meoh really? that's a problem09:01
*** yamamoto_ has quit IRC09:01
*** andreas_s_ has joined #openstack-infra09:01
evrardjpin the validation tooling we are checking the URL on cgit09:01
evrardjpodyssey4me: indeed :p09:02
evrardjpall our releases are broken right now.09:02
odyssey4meah, we'll have to wait for an infra-root to help then09:02
evrardjpif that's ACL I can fix that09:02
evrardjpbut then it will impact other things09:02
evrardjpso I'd rather wait for more experience09:03
AJaeger_evrardjp: repo is retired - that means it's frozen for all branches.09:03
evrardjpthe alternative would be to clone in the release tooling but a comment in code made me think it was tried and not a good idea09:03
AJaeger_evrardjp: if you want to do release on old branches, you shouldn't have retired it...09:03
AJaeger_and that's why it's hidden in cgit09:03
evrardjpfirst, I haven't retired it :p09:03
evrardjpsecond we still want to release, but not on this one09:04
evrardjpso our OLD deliverables for Ocata for example, still contain this retired repo09:04
openstackgerrityolanda.robla proposed openstack-infra/openstack-zuul-jobs master: Add requirements to bifrost jobs  https://review.openstack.org/50845209:05
AJaeger_evrardjp: sorry, need to go offline again and can't help further for now09:05
*** andreas_s has quit IRC09:05
evrardjpAJaeger_: are you suggesting I need to change the release tooling for that case?09:05
evrardjpAJaeger_: no worries :)09:05
evrardjpI will talk in release then09:06
tonybevrardjp: It's a somewhat known issue.  I expect dhellmann and ttx will fix it ASAP09:08
*** mat128 has joined #openstack-infra09:08
*** markvoelker has joined #openstack-infra09:09
*** bhavik1 has quit IRC09:10
evrardjpI will fix it09:10
evrardjpit doesn't look hard09:10
evrardjpI will ping dhellmann and ttx09:10
evrardjpfor reviews09:10
ttxhmmm some jobs look stuck in the queue09:11
ttxsee 504940 for example09:12
*** filler has quit IRC09:14
openstackgerritPavlo Shchelokovskyy proposed openstack-infra/openstack-zuul-jobs master: Require requirements prj for legacy-requirements  https://review.openstack.org/50846009:16
*** filler has joined #openstack-infra09:16
*** iyamahat has joined #openstack-infra09:23
fricklerinfra-root: it seems that at least neutron gate changes are being merged without unit tests (or not merged due to multinode failures); that does seem like a critical bug to me09:23
*** iyamahat_ has joined #openstack-infra09:24
frickleralso swift is running openstack-tox-py27 on trusty instead of openstack-tox-py27-xenial https://review.openstack.org/47480109:25
mikalIs zuul broken or is it just me? status.openstack.org/zuul never finishes loading.09:26
odyssey4memikal it's between things09:27
odyssey4metry http://zuulv3.openstack.org/09:27
mikalOh fancy curved corners!09:27
odyssey4meit looks like the migration to zuul v3 is still somewhat in progress09:27
dmelladowhat about the status page09:27
dmelladois it broken too?09:27
dmelladodid it get migrated to some another url?09:27
dmelladoI wanted to check the status of an ongoing patch and can't see anything :\09:28
odyssey4meno, that status page is likely broken because the back-end it relied on is not yet migrated09:28
*** iyamahat has quit IRC09:28
odyssey4mefor an interim status check http://zuulv3.openstack.org/ I think - once they rename zuulv3 to zuul then the status page will get back online again09:29
*** yamamoto has joined #openstack-infra09:29
odyssey4mewell, that's my understanding from some chat I saw yesterday09:29
*** iyamahat_ has quit IRC09:29
*** iyamahat has joined #openstack-infra09:29
dmelladoodyssey4me: thanks!09:30
dmelladoI'm trying to check the current status and sadly fix broken things09:30
*** owalsh has joined #openstack-infra09:31
*** panda|off is now known as panda09:31
*** yamamoto has quit IRC09:32
*** lukebrowning has joined #openstack-infra09:36
*** yamamoto has joined #openstack-infra09:40
*** mat128 has quit IRC09:41
*** udesale has quit IRC09:42
*** iyamahat_ has joined #openstack-infra09:42
*** sambetts|afk is now known as sambetts09:42
*** iyamahat has quit IRC09:42
*** markvoelker has quit IRC09:43
openstackgerritJens Harbott (frickler) proposed openstack-infra/openstack-zuul-jobs master: Fix grenade multinode job  https://review.openstack.org/50847309:46
fricklerinfra-root: ^^ I think I've located the cause for the multinode post_failures09:46
*** alexchadin has quit IRC09:47
*** lukebrowning has quit IRC09:48
*** alexchadin has joined #openstack-infra09:48
dmelladoanyone also having 'end of stream' errors?09:49
*** lukebrowning has joined #openstack-infra09:50
*** yamamoto has quit IRC09:54
*** iyamahat_ has quit IRC09:54
*** lukebrowning has quit IRC09:55
*** lukebrowning has joined #openstack-infra09:57
*** lukebrowning has quit IRC10:01
*** esberglu has joined #openstack-infra10:01
fricklerso all the openstack-tox-py27 jobs I checked ran on trusty, could this have the same root cause as nova-net running on master?10:01
*** yamamoto has joined #openstack-infra10:01
*** egonzalez has quit IRC10:05
*** esberglu has quit IRC10:06
*** adriant has quit IRC10:06
*** stevemar has quit IRC10:06
*** yuval has quit IRC10:06
*** stevemar has joined #openstack-infra10:07
*** jgriffith has quit IRC10:08
*** lukebrowning has joined #openstack-infra10:08
*** numans has quit IRC10:08
*** ari[m] has quit IRC10:08
*** ari[m] has joined #openstack-infra10:08
*** yuval has joined #openstack-infra10:09
*** numans has joined #openstack-infra10:10
*** lukebrowning has quit IRC10:13
*** dhajare has quit IRC10:13
*** lukebrowning has joined #openstack-infra10:14
*** jgriffith has joined #openstack-infra10:14
*** LindaWang has quit IRC10:14
*** tmorin has quit IRC10:16
*** lukebrowning has quit IRC10:19
*** egonzalez has joined #openstack-infra10:20
*** lukebrowning has joined #openstack-infra10:20
*** derekh has joined #openstack-infra10:22
*** adriant has joined #openstack-infra10:22
*** lukebrowning has quit IRC10:24
ianwfrickler: that one's got me beat for now10:26
ianwnova-net on master ... what are you seeing run incorrectly?10:26
*** lukebrowning has joined #openstack-infra10:26
*** liujiong has quit IRC10:27
openstackgerritTom Barron proposed openstack-infra/project-config master: Update manila tempest job skip conditions  https://review.openstack.org/50848510:27
*** masber has joined #openstack-infra10:28
toskytbarron: ^^ I suspect it's going to be rejected as it is (Zuul v3 migration means new places for job definition)10:30
fricklerianw: openstack-tox-py27 is running on trusty nodes instead of xenial. neutron gate jobs are missing py27 jobs completely10:30
tbarrontosky: ack, guess I need to learn the new places :)10:31
*** masber has quit IRC10:32
*** lukebrowning has quit IRC10:32
*** lukebrowning has joined #openstack-infra10:33
*** shardy has quit IRC10:33
*** LindaWang has joined #openstack-infra10:36
*** zhurong has quit IRC10:36
*** lukebrowning has quit IRC10:37
*** lukebrowning has joined #openstack-infra10:39
*** markvoelker has joined #openstack-infra10:40
*** alexchadin has quit IRC10:41
*** seanhandley has joined #openstack-infra10:42
seanhandleyI'm waiting on `Needs Verified Label` for https://review.openstack.org/#/c/508445/10:42
seanhandleybut I don't see where that approval is coming from10:42
seanhandleyZuul seems to have finished running10:42
seanhandleyam I waiting for Jenkins to get involved ?10:43
toskyseanhandley: no "Jenkins" anymore (it was zuul v2.x)10:43
*** lukebrowning has quit IRC10:43
seanhandleyRight10:44
seanhandleySo Zuul will come back and +2 verify at some point10:44
seanhandley?10:44
*** askb has quit IRC10:44
toskythat's the idea, but there may still be bugs, as -infra people are fixing the last issues of the migration10:45
*** pbourke has joined #openstack-infra10:45
seanhandleyok, thanks :)10:45
*** lukebrowning has joined #openstack-infra10:45
seanhandleyI'll give it a couple of hours10:45
*** namnh has quit IRC10:47
*** lukebrowning has quit IRC10:50
*** lukebrowning has joined #openstack-infra10:52
*** askb has joined #openstack-infra10:52
*** lukebrowning has quit IRC10:56
*** lukebrowning has joined #openstack-infra10:58
*** jesusaur has quit IRC10:59
*** lukebrowning has quit IRC11:02
*** jesusaur has joined #openstack-infra11:03
ianwjeblair (fyi): yolanda has a stuck job on 508452 base-integration-centos-7 : i can see it was assigned node 0000054363 (198.72.124.183); host up for ~2 hours and i can also see "zuul" has never tried to log in.  http://paste.openstack.org/show/622302/11:04
*** rhallisey has joined #openstack-infra11:05
*** lukebrowning has joined #openstack-infra11:09
ianw(of course, where this rates in order of current issues i don't know ;)11:10
*** alexchadin has joined #openstack-infra11:10
*** markvoelker has quit IRC11:12
*** andreas_s_ has quit IRC11:12
openstackgerritDirk Mueller proposed openstack-infra/project-config master: Remove legacy-requirements-python34 job  https://review.openstack.org/50848911:12
ianwfrickler: $ git grep '/ on node' | grep multinode | wc -l11:13
ianw16011:13
ianwi think they probably all want that copy11:13
*** lukebrowning has quit IRC11:13
Shrewsyep, zuul seems wedged. i see lots of locked nodes, but it isn't doing anything with them. there also appear to be zookeeper connection issues again. we'll have to wait for jeblair, i believe11:14
ianwyeah, i don't want to touch anything and destroy anything helpful at this point11:15
Shrewshrm, not wedged, just... ineffective? lots of exceptions about nodes not being locked, which we saw yesterday when the requests got lost11:15
*** lukebrowning has joined #openstack-infra11:15
Shrewsi'm beginning to suspect our load is too much for a single zookeeper node11:15
ianw:/11:16
Shrewswoah, 8G zuul debug log file11:17
ianwi thought the whole idea was it didn't lose stuff11:17
ianwyeah, there seems to be some tight looping11:17
Shrewsianw: the requests for nodes sent through zk are ephemeral. if the zk connection goes away, so does the request11:18
Shrewsperhaps zuul isn't handling that very well? not sure11:18
Shrewsoy, must grab coffee11:19
ianwI'm EOD here in .au ... good luck everyone!11:19
toskyit's also EOW there, I guess :)11:19
*** lukebrowning has quit IRC11:20
ianwtosky: yes, and a holiday monday too!  and daylight savings so i don't have to get up so early for the infra meeting, it's all good :)11:20
toskyuh, daylight saving so early? Interesting11:21
*** edmondsw has quit IRC11:21
*** lukebrowning has joined #openstack-infra11:22
*** adisky has quit IRC11:23
*** alexchadin has quit IRC11:24
*** lukebrowning has quit IRC11:26
*** lukebrowning has joined #openstack-infra11:28
*** tpsilva has joined #openstack-infra11:28
*** alexchadin has joined #openstack-infra11:30
*** lukebrowning has quit IRC11:32
*** lukebrowning has joined #openstack-infra11:34
*** lukebrowning has quit IRC11:39
*** mat128 has joined #openstack-infra11:40
*** alexchadin has quit IRC11:42
*** alexchadin has joined #openstack-infra11:43
*** jpena is now known as jpena|lunch11:43
fricklerI think we might need a status notice that jobs in the integrated-gate are bound to fail currently due to the multinode issues11:50
*** alexchadin has quit IRC11:52
*** baoli has joined #openstack-infra11:53
*** armax has joined #openstack-infra11:53
*** dprince has joined #openstack-infra11:57
*** thorst has quit IRC12:00
*** thorst has joined #openstack-infra12:00
openstackgerritChandan Kumar proposed openstack-infra/project-config master: Add python-tempestconf project  https://review.openstack.org/50850212:01
*** kjackal_ has joined #openstack-infra12:04
ttxFWIW I also have a stuck job on 50494012:08
*** esberglu has joined #openstack-infra12:09
*** markvoelker has joined #openstack-infra12:09
*** baoli has quit IRC12:10
*** alexchadin has joined #openstack-infra12:10
*** cuongnv has quit IRC12:10
*** mat128 has quit IRC12:12
*** edmondsw has joined #openstack-infra12:13
*** mat128 has joined #openstack-infra12:18
*** trown|outtypewww is now known as trown12:20
*** LindaWang has quit IRC12:21
*** LindaWang has joined #openstack-infra12:21
yamamotoshould tox_install.sh style dependencies be in required-projects as well?12:27
*** rlandy has joined #openstack-infra12:28
*** markvoelker has quit IRC12:29
*** markvoelker has joined #openstack-infra12:29
*** wolverineav has joined #openstack-infra12:35
*** hemna_ has joined #openstack-infra12:35
*** lukebrowning has joined #openstack-infra12:35
*** shardy has joined #openstack-infra12:39
*** kiennt26 has joined #openstack-infra12:43
*** camunoz has joined #openstack-infra12:46
*** jpena|lunch is now known as jpena12:47
*** jaypipes has joined #openstack-infra12:48
*** lukebrowning has quit IRC12:48
*** bnemec has joined #openstack-infra12:50
*** lukebrowning has joined #openstack-infra12:51
*** armax has quit IRC12:54
*** baoli has joined #openstack-infra12:56
*** lukebrowning has quit IRC12:56
*** mriedem has joined #openstack-infra12:57
dhellmannI'm looking into a failed job and having some trouble figuring out where the log files are. http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/ Is there still WIP for the zuul update related to logs?12:57
*** lukebrowning has joined #openstack-infra12:58
*** kiennt26 has quit IRC12:58
dmsimardinfra-root: lots of different problems in the zuul queue. It might all tie back to the same issue, but I've had a change queued for 15 minutes without jobs (mergers not processing?), example: 507889 -- ttx also mentioned 504940 which is indeed stuck12:58
fricklerdhellmann: seems like it failed here: http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/job-output.txt.gz#_2017-09-29_10_50_11_61805612:58
*** kiennt26 has joined #openstack-infra12:59
dmsimarddhellmann: if you open up the 'ara' folder, the error should be highlighted -- look for red things12:59
dhellmanndmsimard : I've clicked all around and not found anything that looked like job output logging13:00
dmsimardi.e., http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/ara/ has a red icon and if you expand the tasks panel you'll see the failed task13:00
dhellmannara seems to be helpfully showing me the job definition though13:00
dmsimarddhellmann: click on the 'failed' status13:00
dmsimard(the permalink version is http://logs.openstack.org/82/508482/1/check/legacy-releases-python35/c868102/ara/result/b44ba413-2dce-4b7c-afec-fe02dbec8e33/ )13:00
dhellmannoh, wow13:00
dhellmannso, how do I set up my jobs so I don't need so many clicks to get to the failure logs?13:01
dmsimardthe raw output is what frickler linked, it's the equivalent of the console log13:01
dhellmannhow do I find that starting from a link on a gerrit patch?13:01
*** ralonsoh_ has joined #openstack-infra13:02
dmsimardthe link on the gerrit patch should send you straight to the log root which contains the job-output.txt.gz file13:02
dhellmannoh, nm13:02
dhellmannI found it13:02
dhellmannI just scrolled past the error13:02
*** lukebrowning has quit IRC13:02
dhellmannhow hard is it to add new log files to a job? if I wanted a script to log its output separately for example?13:02
dhellmannthe release review process relies on us reading a report that the job generates when it passes13:03
dhellmannnow it looks like that output is likely to be all mixed in with other data13:03
openstackgerritChandan Kumar proposed openstack-infra/project-config master: Add python-tempestconf project  https://review.openstack.org/50850213:03
*** jaypipes is now known as leakypipes13:04
*** lukebrowning has joined #openstack-infra13:04
dmsimarddhellmann: as far as I know, you have control over what logs are sent through a task that looks like this: https://review.openstack.org/#/c/508296/17/playbooks/upload-logs.yaml13:04
*** jcoufal has joined #openstack-infra13:04
Shrewsdmsimard: yep. we are definitely having issues with zookeeper13:04
*** ralonsoh has quit IRC13:05
dmsimarddhellmann: the 'src' would be the location where you would put your log files in13:05
Shrewsjeblair: mordred: pabelanger: i can't even do a "nodepool list" now. getting zk connection errors. going to poke around zk logs13:05
dhellmanndmsimard : ok, cool, so if I created a separate file in the right place it would be copied up. I just need to figure out how to create the separate file13:05
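A minimal sketch of what dmsimard describes, assuming an Ansible playbook along the lines of the upload-logs.yaml linked above; the paths and task name here are hypothetical, not the actual job's:

    - hosts: all
      tasks:
        # Hypothetical: stage an extra report next to the other logs so the
        # existing upload task (whose 'src' points at this directory) picks it up.
        - name: Stage release report for log collection
          copy:
            src: "{{ ansible_user_dir }}/workspace/release-report.txt"
            dest: "{{ ansible_user_dir }}/workspace/logs/release-report.txt"
            remote_src: true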
Shrewsjeblair: mordred: pabelanger: ah ha. disk full on nodepool.o.o13:06
Shrewswheeeee13:06
dhellmanndmsimard, frickler : thanks for your help!13:06
dmsimarddhellmann: np, happy to help13:06
dmsimardShrews: oh noes13:06
dmsimardShrews: I wish I had at least read only access to the servers :(13:07
*** ralonsoh has joined #openstack-infra13:07
*** kiennt26 has quit IRC13:08
*** ralonsoh_ has quit IRC13:08
*** kgiusti has joined #openstack-infra13:08
*** lukebrowning has quit IRC13:08
*** Goneri has joined #openstack-infra13:09
mordredShrews: there was 4.6G in old puppet reports in /var  - I cleared them out to give some more headroom13:12
Shrewsso, /var/lib/zookeeper seems to be taking quite a lot13:12
mordredShrews: yah13:13
dmsimardis debug logging enabled or something ?13:13
dmsimardthe debug logging in the zuul unit tests is.... intense13:13
mordredyah it is13:14
dmsimardprobably want to toggle that off unless it's necessary13:14
mriedemi'm not sure where this legacy-tempest-dsvm-nnet job came from but it's 100% fail on master since you can't run nova-network outside of a cellsv1 config, and for that we have the cellsv1 job - should i do something to filter out legacy-tempest-dsvm-nnet or is someone already doing that?13:14
*** lbragstad has joined #openstack-infra13:15
*** lukebrowning has joined #openstack-infra13:15
mriedem"f off, that's the lowest priority thing right now" is an acceptable answer13:15
dmsimardmriedem: if there's no patch opened to remove it from openstack-infra/openstack-zuul-jobs then no one is already doing that13:15
mriedemok13:15
mriedemwill look13:15
dmsimardmriedem: to remove it, you'll want to remove the job component and then that job from the project13:15
mriedemhttps://review.openstack.org/#/c/508405/13:16
Shrewsmordred: zuul is outputting stuff like crazy now. maybe it's unwedged?13:16
Shrewsor made worse. i dunno13:17
mriedemdmsimard: i think we just want to restrict it to <ocata13:17
dmsimardmriedem: you can test that patch with a depends-on, which is what ianw did here https://review.openstack.org/#/c/508409/ and it didn't work.. let me check13:17
*** jdandrea_ has joined #openstack-infra13:18
mordredyah - I'm confused as to why that's not restricted to newton - looking now13:19
*** lukebrowning has quit IRC13:19
fricklermordred: I'm guessing something is broken with branch filtering and many of the issues in http://lists.openstack.org/pipermail/openstack-dev/2017-September/122861.html are related to that13:20
mordredfrickler: thanks for that list!13:20
*** efried is now known as fried_rice13:21
*** lukebrowning has joined #openstack-infra13:21
dmsimardmordred: the template is defined in openstack-zuul-jobs but is used in project-config, would that prevent speculative testing ?13:21
*** mat128 has quit IRC13:22
mordrednope - speculative testing of the template should work fine13:22
mordredI'm VERY confused about why neutron doesn't have unittests in its gate jobs13:22
*** eharney has joined #openstack-infra13:22
*** mat128 has joined #openstack-infra13:22
mordredso - for a more general solution based on ianw's patch here: https://review.openstack.org/#/c/50847313:24
*** hashar is now known as hasharAway13:25
*** lukebrowning has quit IRC13:25
*** jamesdenton has quit IRC13:25
Shrewsfrickler: hmm, for the "openstack-tox-py27 is being run on trusty nodes instead of xenial" issue, do you have an example handy?13:26
esbergluIs there an alternative to zuul.openstack.org now? Or is that dashboard just not up?13:26
mordredesberglu: http://zuulv3.openstack.org/13:27
esberglumordred: Ah tx13:27
*** lukebrowning has joined #openstack-infra13:27
tusharHi All ... the third-party CI is not listening to the gerrit patches after the migration to zuul v313:29
tusharand the results page is also not updating - http://ci-watch.tintri.com/project?project=cinder&time=7+days13:29
tusharAre any changes required in the third-party CI setup?13:30
*** slaweq has quit IRC13:30
fricklerShrews: http://logs.openstack.org/38/508438/2/check/openstack-tox-py27/3a6fdfe/13:30
*** stephenfin is now known as finucannot13:30
logan-tushar: fwiw my jenkins is still running 3rd party tests as of this morning with no changes made13:31
*** lukebrowning has quit IRC13:31
Shrewsmordred: i might need to restart the np launchers. their zk sessions seem to be permanently suspended, so nothing is happening13:32
mordredShrews: nod13:32
mordredtushar: yah - nothing should have changed re: third party CI13:32
Shrewsnl02 restarted13:33
*** lukebrowning has joined #openstack-infra13:33
fricklertushar: well, if you are using sos-ci, you may want to change your trigger from "Jenkins +1" to "Zuul +1" or similar13:34
Shrewsnl01 restarted. i think that was the source of the wedge13:34
tusharlogan-, mordred: In /etc/zuul/layout/layout.yaml, there is one approval block13:34
tushar      approval:13:34
tushar        - verified: [1, 2]13:34
tushar          username: jenkins13:34
fungitushar: is it possible some third-party ci systems are configured to only run jobs after the upstream ci reports on those changes? if so, the name of the account reporting changed13:34
tusharafter commenting this out it's working fine13:34
fungitushar: yeah, that looks like what you're doing13:34
*** dansmith is now known as superdan13:34
mordredah. yes.13:34
fungiit's the "zuul" account now13:34
tusharfungi : correct13:35
funginot "jenkins" any longer13:35
mordredfungi: that's worthy of broader communication13:35
mordredbtw - I'm looking at a wider solution to the multinode logs problem based on ianw's patch13:35
*** psachin has quit IRC13:36
tusharfungi: replaced jenkins with zuul, but it still triggers CI for every patch13:37
fungii'm almost to the point where i have caffeine and can start digging in. i'm caught up on scrollback but holy moley there's so much it's hard to decide what's top priority. probably the cleanup post nodepool.o.o filling up its filesystem, followed by the branch exclusion misbehavior (or in particular whatever is causing neutron not to run unit tests and swift to run theirs on trusty)13:37
tusharalso I modified verified: [-1 2], as all the patches have zuul -113:37
*** ijw has joined #openstack-infra13:37
*** lukebrowning has quit IRC13:38
tusharfungi: not sure if we need to replace "jenkins" with "zuul" or some other keyword13:38
mordredfungi: yah - the branch exclusion misbehavior is the thing that worries me the most - but I don't see anything wrong in the config13:39
*** links has quit IRC13:39
*** lukebrowning has joined #openstack-infra13:40
mriedemdoes openstack-zuul-jobs replace project-config now?13:40
fricklermriedem: iiuc it only replaces project-config/jenkins/jobs13:41
Shrewsheh, citycloud-kna1 has two AZs... nova and nova-local13:42
Shrews*sigh*13:42
mordredmriedem: it's a little more complex - there are three main locations for shared jobs13:42
mordredmriedem, frickler: https://docs.openstack.org/infra/manual/zuulv3.html#where-jobs-are-defined-in-zuul-v313:42
mordredthere are currently WAY more things in openstack-zuul-jobs than there eventually will be - because at the moment there are WAY more things defined centrally than there eventually will be13:43
mriedemok so eventually push project-specific jobs to the repos that run them13:44
mordredyes13:44
mriedemwell that seems neataroo13:44
*** lukebrowning has quit IRC13:44
*** gouthamr has joined #openstack-infra13:44
mordredyah - also - those jobs get tested with the patch they're proposed in (as do patches to zuul-jobs or openstack-zuul-jobs but not project-config fwiw) ...13:44
mriedemi have been able to blissfully ignore this zuulv3 business until this week13:44
mordredso iterating on a job is WAY easier - once we get past these initial pain moments13:45
mriedemyeah non-self testing patches to project-config was always annoying, but workaroundable13:45
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs  https://review.openstack.org/50851013:45
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary  https://review.openstack.org/50851113:45
mordredinfra-root: ^^ based on what ianw found about multinode log uploading - I believe that ^^ should fix it systemically13:46
*** lukebrowning has joined #openstack-infra13:46
mordredinfra-root: it's a little bit more of a sledgehammer than is strictly necessary - but I can't think of any specific downside13:46
*** ykarel has quit IRC13:47
dmsimardFYI I'm writing a FAQ-ish email to openstack-dev to give a few pointers on how to troubleshoot the legacy and new jobs13:47
*** Dinesh_Bhor has quit IRC13:48
dmsimardleifmadsen: trying to find the quickstart guide, is it just melded in https://docs.openstack.org/infra/zuul/user/index.html ?13:48
chandankumarAJaeger_: for new repo creation, do we not need to add a zuul layout?13:48
fricklermordred: woa, how did you get that into merge conflict so fast?13:48
mordredfrickler: I'm very talented13:48
seanhandleyIs one of the current Zuul issues related to the +2 Verified and merge step? I've had a sphinx project stuck for a few hours with +2 code review, +1 workflow and a +1 verified from Zuul.13:49
AJaeger_chandankumar: don't know yet ;( Best ask here and sent a patch for the infra-manual, please13:49
seanhandleyI'm not sure if I've done something wrong, or I'm caught up in the wider ongoing issues13:49
mordredseanhandley: yah - we're having some issues with the zuul scheduler and nodepool nodes that jeblair and Shrews are investigating13:49
mordredseanhandley: you have almost certainly not done anything wrong13:49
mordredchandankumar: zuul/layout.yaml is no longer a thing - but we have not updated the project creator's guide yet (thanks for the reminder)13:50
seanhandleythis is the first Gerrit change I've raised on this repo you see :)13:50
mordredseanhandley: oh no!13:50
seanhandleyI'm wondering if I messed up the ACL in project infra perhaps13:50
*** lukebrowning has quit IRC13:50
seanhandleyEither way, it sounds like it's worth discussing more next week when the wider issues have hopefully been fixed13:51
mordredseanhandley: what's the project?13:51
seanhandleyI'm just being impatient :D13:51
chandankumarmordred: https://review.openstack.org/#/c/508502/ is it right?13:51
seanhandleymordred: It's the public cloud WG's doc repo13:51
chandankumarfor new project creation13:51
seanhandleyI'm trying to draft a spec for the Passport Program13:51
openstackgerritMonty Taylor proposed openstack-infra/infra-manual master: Add mention of legacy nodesets to migration instructions  https://review.openstack.org/50851213:52
*** lukebrowning has joined #openstack-infra13:52
AJaeger_mordred, time to merge https://review.openstack.org/508313 - that's the new sudo change - it has three +2s but wasn't approved yet13:52
Shrewsmordred: i'm seeing np handle requests now like crazy, but i don't really see any progress on zuulv3.o.o. maybe the scheduler needs to be kicked? or should we wait for jeblair?13:53
AJaeger_seanhandley: what's the change? Let's look at it in detail, please13:53
seanhandleySure AJaeger_ - thanks. https://review.openstack.org/#/c/508445/13:53
mordredShrews: I think jeblair should be up soon, so let's wait for him13:53
AJaeger_seanhandley: currently on a bus with bad internet - will report back as soon as I can ;)13:54
seanhandleyheh13:54
seanhandleyBeen there before :D13:54
seanhandleyBeen SSH'd into prod boxes there before13:54
mordredfungi, AJaeger_: I'm thinking after I take care of a few of these morning patches I might draft an email to the list letting folks know where we're at ... and we might want to set up a specific place for people to register migration issues13:54
fricklerAJaeger_: seanhandley: that patch is waiting in the gate queue, which is currently backed up 8 hours and counting13:55
AJaeger_mordred: good idea - setting up an etherpad or something like that13:55
mordredyah13:55
AJaeger_frickler: thanks!13:55
seanhandleyfrickler: Ouch! Thanks for checking13:55
mordredor even a storyboard story that people can just add tasks to13:55
AJaeger_seanhandley: so, everything fine, drink a coffee, bake some cookies and ship them to frickler ;)13:55
seanhandleyYup. I will patiently sip coffee and find other things to do while I wait ;)13:55
seanhandleyHe doesn't want to eat cookies I've baked.13:56
seanhandleyNobody ever does :D13:56
mtreinishfungi, mordred, clarkb: if you get a sec can you take a look at: https://review.openstack.org/#/q/topic:restore-name-sanity to try and get openstack-health useable again13:56
mordredAJaeger_, fungi: I'm considering force-merging the sudo fix, since the gate queue is backed up but it affects a large swath of things13:56
*** lukebrowning has quit IRC13:57
AJaeger_mordred: don't ask me for advice, I couldn't follow this week and thus am backed up a bit as well ;) You still have my blessing ;)13:57
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs  https://review.openstack.org/50851013:57
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary  https://review.openstack.org/50851113:57
*** hongbin has joined #openstack-infra13:57
AJaeger_mordred: I'l fix 508512 - the infra-manual change - for you now...13:58
mordredAJaeger_: well - we went live! :)13:58
*** gongysh has joined #openstack-infra13:58
*** jcoufal_ has joined #openstack-infra13:58
AJaeger_mordred: I noticed ;)13:58
* AJaeger_ is happy about that!13:58
fungimordred: missing a legacy-ubuntu-xenial-2-node nodeset in 50851013:58
fungii think13:58
*** lukebrowning has joined #openstack-infra13:58
*** sbezverk has joined #openstack-infra13:59
openstackgerritMatt Riedemann proposed openstack-infra/project-config master: Remove legacy-tempest-dsvm-nnet-ocata  https://review.openstack.org/50851313:59
mordredfungi: oh - yes, you're right - those weren't strictly needed but made global search and replace easier - one sec14:00
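A two-node legacy nodeset of the kind fungi points out would look roughly like this; the node names are assumptions based on the naming pattern being discussed, not the exact content of 508510:

    - nodeset:
        name: legacy-ubuntu-xenial-2-node
        nodes:
          - name: primary
            label: ubuntu-xenial
          - name: secondary
            label: ubuntu-xenial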
*** jcoufal has quit IRC14:00
*** kiennt26 has joined #openstack-infra14:00
fungimordred: which sudo fix are we still missing? i'll check my zuulv3 review dashboard14:01
openstackgerritAndreas Jaeger proposed openstack-infra/infra-manual master: Add mention of legacy nodesets to migration instructions  https://review.openstack.org/50851214:01
mordredfungi: https://review.openstack.org/#/c/508313/14:01
*** srobert has joined #openstack-infra14:02
fungiaha, the topic isn't zuulv3, that explains why i wasn't finding it14:02
openstackgerritAndreas Jaeger proposed openstack-infra/infra-manual master: Add mention of legacy nodesets to migration instructions  https://review.openstack.org/50851214:02
*** ralonsoh_ has joined #openstack-infra14:02
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs  https://review.openstack.org/50851014:03
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary  https://review.openstack.org/50851114:03
*** lukebrowning has quit IRC14:03
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet-ocata  https://review.openstack.org/50851514:03
fungimordred: so, 508313 won't take effect until after we rebuild nodepool images anyway, right?14:03
*** dhajare has joined #openstack-infra14:04
*** camunoz has quit IRC14:05
*** lukebrowning has joined #openstack-infra14:05
ttxfungi: I have a stuck governance change at https://review.openstack.org/#/c/504940/ which prevents my sending of the weekly TC status... Should I somehow cancel and retry it ? Or just let it be ?14:05
mordredfungi: nope - it was done that way to avoid needing to rebuild images14:05
mordredfungi: oh - wait - bother14:05
mordredfungi: the openstack-zuul-jobs version of that is the one that's important14:05
mordredttx: we have a bunch of stuck changes right now pending some issues being investigated14:06
fungik14:06
ttxok, standing by14:06
*** ralonsoh has quit IRC14:06
*** amoralej is now known as amoralej|off14:06
*** amoralej|off is now known as amoralej|lunch14:06
fricklermordred: so with 508511 you would not collect any logs from subnodes, is that correct? should the primary collect from subnodes? see https://review.openstack.org/508473 too14:07
odyssey4meHi everyone - we've had two specs patches stalled in the queue for nearly 5 hours now. https://review.openstack.org/499882 & https://review.openstack.org/499886 - any thoughts on what's going on there?14:07
mordredfungi: so - what do you think - etherpad for reporting migration issues? Or storyboard story and have people add tasks?14:07
mordredodyssey4me: yup. we have a stall issue ongoing14:07
fungimordred: we could go old school and ask them to follow up to the -dev ml thread14:07
odyssey4meok, will hang on and check back later then - thanks14:08
fungii worry that with an etherpad approach we'll just end up with a mess and not enough detail14:08
openstackgerritAndreas Jaeger proposed openstack-infra/infra-manual master: Ectomy Jenkins from the Infra Manual narrative  https://review.openstack.org/43645514:08
openstackgerritAndreas Jaeger proposed openstack-infra/infra-manual master: Add warning about Zuul v2 examples  https://review.openstack.org/50851814:08
mordredfungi: I think that might be the right choice - especially as I cannot log in to #storyboard right now14:08
leifmadsendmsimard: no quickstart guide yet14:08
leifmadsenstill in progress14:08
leifmadsenlooks like the openstack etherpad is down?14:08
leifmadsendmsimard: no quickstart guide yet14:08
leifmadsenstill in progress, but it's available here: https://etherpad.openstack.org/p/zuulv3-quickstart14:08
leifmadsenworking notes14:08
fungileifmadsen: etherpad seems to be running to me14:09
leifmadsenfungi: yea sorry, it was a local issue14:09
*** lukebrowning has quit IRC14:09
*** alexchadin has quit IRC14:09
leifmadsenI didn't think my msg even went through14:09
mordredinfra-root: not that it's the most important thing on our plate, but logging in to storyboard gives me:14:09
fungimordred: i _do_ think an etherpad for us to coordinate what we're working on fixing makes sense, but replying to the ml seems like a better way to solicit detailed feedback14:10
mordredError Code:14:10
mordredinvalid_grant14:10
*** coolsvap has quit IRC14:10
mordredError Description:14:10
mordredNo description received from server.14:10
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet  https://review.openstack.org/50851914:10
mordredfungi: kk. I mostly want to make sure that people can report specific issues and that we can keep track of duplication, whether they're being worked, and status14:10
mordredfungi: this is one of those times where I think we may be served well by a more traditional formal process :)14:11
AJaeger_mordred, fungi, https://docs.openstack.org/infra/manual is not getting updated by the post job - a change merged this morning but http://logs.openstack.org/95/95c4d1433c74ad23894f7296be51a3a23b3c6e56 is empty. That's sad since those merged changes updated content for zuul v3 ;/14:11
mordredmtreinish: awesome ^^ - before you do that patch you should probably remove the mention of that job from zuul.d/projects.yaml in project-config14:12
mordredgah14:12
mordredmriedem: ^^14:12
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet  https://review.openstack.org/50851914:12
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet-ocata  https://review.openstack.org/50852014:12
mriedemmordred: yup already did, just needed the depends-on14:12
mordredmriedem: cool14:13
fungimordred: i just tested storyboard.o.o and i can login, fwiw, so i don't know that we've got broken things to look into there14:13
mordredmriedem: if you feel like it, while you're at it - you could delete ALL jobs from the nova pipeline definition in project-config that are not standard central jobs and add them to .zuul.yaml in the nova repo (don't know how much job jockeying you feel like doing this morning)14:13
fungijust as well ;)14:13
*** bobh has joined #openstack-infra14:13
*** ramishra has quit IRC14:13
jeblairmriedem: or you could leave the old ugly jobs there and add nice new ones to nova14:14
mriedemmordred: i don't have the proper jockey attire on for that14:14
mordredyah14:14
mriedempriority #1 for me is just getting nova unblocked atm14:15
*** hemna_ has quit IRC14:15
*** lukebrowning has joined #openstack-infra14:16
*** ihrachys_ has joined #openstack-infra14:16
*** ihrachys_ has quit IRC14:16
mordredmriedem: totally agree. mostly mentioning it because it MIGHT be worthwhile to do a large 3-patch copy-rename-move sequence and then be able to iterate on nova issues in nova yourself - but it also might not be depending on how many you've got14:17
mriedemi think it's just this nnet job14:17
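The move mordred suggests, sketched as an in-repo .zuul.yaml; the job name, parent and variable below are illustrative, not nova's actual job list:

    # .zuul.yaml carried in the project's own repository
    - job:
        name: nova-functional-example      # illustrative name
        parent: openstack-tox
        vars:
          tox_envlist: functional

    - project:
        name: openstack/nova
        check:
          jobs:
            - nova-functional-example
        gate:
          jobs:
            - nova-functional-example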
mriedembtw, is zuulv3 smart enough to ignore abandoned patches which are dependencies via depends-on?14:17
dmsimardinfra-root: FYI I started an etherpad for FAQs and tips on using and troubleshooting v3 https://etherpad.openstack.org/p/zuulv3-migration-faq14:17
mriedemb/c if not, i'll have to fix a change id14:17
*** jcoufal_ has quit IRC14:17
dmsimardposted to openstack-dev via http://lists.openstack.org/pipermail/openstack-dev/2017-September/122880.html14:17
*** jcoufal has joined #openstack-infra14:18
jeblairdmsimard: please include the infra-manual zuul v3 migration document14:19
dmsimardjeblair: that's what I was looking for, I thought that was leifmadsen's doc14:19
dmsimardjeblair: where is it ?14:19
jeblairdmsimard: especially since starting on line 22 you're starting to rewrite it.  :)14:19
jeblairdmsimard: https://docs.openstack.org/infra/manual/zuulv3.html14:19
AJaeger_jeblair: see my comment above - last publish of infra-manual was 25th September, we need it publishing again...14:20
dmsimardjeblair: argh, I was looking in the zuul docs14:20
*** rbrndt has joined #openstack-infra14:20
jeblairdmsimard: the link has been included in every communication about the zuulv3 migration.  it would be great to stay on-message.  :)14:20
*** camunoz has joined #openstack-infra14:20
jeblairAJaeger_: i agree.  mordred, were you looking into that yesterday?14:20
*** mriedem1 has joined #openstack-infra14:20
dmsimardjeblair: added, thanks14:21
*** lukebrowning has quit IRC14:21
andreafdmsimard: shall we have a link to devstack and tempest roles in devstack-gate as well?14:21
mtreinishmordred: I'm not fluent in ansible what does this mean: http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/job-output.txt.gz#_2017-09-29_14_06_23_46995914:21
dmsimardandreaf: sure14:21
jeblairdmsimard: *please* read the infra-manual doc and help improve it before you start over again.  we spent a lot of time on it.14:22
*** lukebrowning has joined #openstack-infra14:22
dmsimardjeblair: my intention is not to restart it; people have been asking the same questions over and over and I'm specifically targeting those questions14:22
openstackgerritMatt Riedemann proposed openstack-infra/project-config master: Remove legacy-tempest-dsvm-nnet-ocata  https://review.openstack.org/50852414:22
*** kfarr has joined #openstack-infra14:23
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet-ocata  https://review.openstack.org/50852014:23
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet  https://review.openstack.org/50851914:23
jeblairdmsimard: yep.  we need actual documentation for all of those answers.  the best answer to a question is a doc link.14:23
mordreddmsimard, jeblair: maybe it's worth adding a FAQ section to the end as we get FAQs? sometimes a short bullet-point summary can be helpful, with an internal link to the longer section?14:23
jeblairdmsimard: i think the etherpad can be a great stop-gap, especially when we get a new question.  but it should be a staging area for getting info into docs.14:23
*** mriedem has quit IRC14:23
mordredjeblair: andyes - I was looking in to infra-manula publication issues - will pick that up in just a bit14:24
jeblairmordred: ya14:24
fricklermtreinish: I think http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/job-output.txt.gz#_2017-09-29_14_06_23_469145 is the error message for that. you might need similar additions like in https://review.openstack.org/50844814:24
andreafmtreinish: when you have a failure in a role I think you'll have better luck looking at it in ARA http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/ara/14:24
AJaeger_mordred: you'll find a few changes to review and merge for infra-manual once you want to test ;)14:25
jeblairfungi, dmsimard: i note that the status page is a faq on dmsimard's list.  that's because https://review.openstack.org/507244 hasn't merged.  is someone looking into those?14:25
mtreinishandreaf: that doesn't help me; it's too much clicking and I don't know where to look14:26
mordredjeblair, dmsimard, fungi, AJaeger_: I'm also working on an email status update - which will include the suggestion from fungi earlier that we collect specific job migration issues people are having as replies to the thread14:26
andreafmtreinish: heh just look for the red task http://logs.openstack.org/72/508272/1/check/legacy-infra-puppet-apply-3-centos-7/d312021/ara/result/7f8034c2-94e0-4a71-ba9f-51ee2a67c4d0/14:26
mordredinfra-root, dmsimard, AJaeger_: unless we want to suggest a different approach for collecting those14:26
mtreinishfrickler: thanks, ok so now I have to figure out where that job is defined and update it14:26
*** lukebrowning has quit IRC14:27
mtreinishandreaf: right which just gives me the log output14:27
openstackgerritAndreas Jaeger proposed openstack-infra/infra-manual master: Add warning about Zuul v2 examples  https://review.openstack.org/50851814:27
jeblairmordred: i think that sounds fine14:27
mtreinishandreaf: I'd rather just look at the log...14:27
odyssey4mewe'd appreciate some review for https://review.openstack.org/508281 to fix up the required repositories for our jobs if anyone has a moment14:28
*** lukebrowning has joined #openstack-infra14:28
fungijeblair: other than noticing around midnight that we don't have working puppet apply and beaker jobs on system-config, i haven't looked into them yet14:29
jeblairfungi: maybe we should force-merge that change?14:30
fungii'm good with that. it can't really break anything; very limited in scope. i'll do that now14:30
AJaeger_the status patch fails with a cp error, see http://logs.openstack.org/44/507244/1/gate/legacy-infra-puppet-apply-3-centos-7/8156163/job-output.txt.gz#_2017-09-29_01_58_19_30387614:31
jeblairodyssey4me, logan-: the legacy-openstack-ansible-base part of that change looks good, but i'm not sure the linters part is correct.14:31
openstackgerritMerged openstack-infra/system-config master: Add redirect from status.o.o/zuul to zuulv3.openstack.org  https://review.openstack.org/50724414:33
*** lukebrowning has quit IRC14:33
*** jaosorior has quit IRC14:33
*** wewe0901 has quit IRC14:34
odyssey4mejeblair is that just a name issue, or are we misunderstanding how we're supposed to use the template model?14:34
jeblairodyssey4me: maybe both? -- what's the problem you're trying solve there?14:34
*** lukebrowning has joined #openstack-infra14:34
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852614:35
odyssey4mejeblair our linters test does ansible syntax and lint checking, so it needs all the roles in place, so we need all the repositories there14:35
mtreinishAJaeger_: ^^^ I think that will fix it14:35
jeblairmtreinish: awesome, thanks -- i was just noticing those jobs were failing14:35
mtreinishAJaeger_: I'm hitting the same failure on my puppet-subunit2sql patches to fix things in subunit2sql/openstack-health post migration14:35
mtreinishjeblair: I don't know if those are the only missing repos though, that's just where things were complaining on the failures14:36
jeblairmtreinish: you can have the change you're trying to get through Depends-On that change, and it will test it14:36
openstackgerritMatthew Treinish proposed openstack-infra/puppet-subunit2sql master: Ensure that build_names are unique per project  https://review.openstack.org/50825814:37
jeblairodyssey4me: okay, i think i understand; i'll write a suggestion in review comments14:37
mtreinishjeblair: ^^^ ok that'll test it then14:37
odyssey4methanks jeblair14:38
*** lukebrowning has quit IRC14:39
*** jcoufal_ has joined #openstack-infra14:40
jeblairodyssey4me: done.  and it was mostly a naming issue i'd say.  :)14:40
jeblairokay, i need to dig into a zuul issue; i think it's stuck.14:41
*** dizquierdo has joined #openstack-infra14:42
fungii was just about to ask. so i guess approving more job configuration fixes is futile at the moment14:42
*** jcoufal has quit IRC14:42
odyssey4methanks jeblair14:43
*** mriedem1 is now known as mriedem14:43
*** apuimedo has quit IRC14:43
mordredinfra-root, dmsimard, AJaeger_: https://etherpad.openstack.org/p/MnG27fsAhC draft email to the mailing list - I started a list of common/known job issues and what to do about them at the bottom - although I'm thinking that perhaps I should just point to the dmsimard FAQ etherpad - and we should then start cycling those etherpad FAQ entries into a FAQ section on infra-manual once we get infra-manual14:43
mordredpublication working again14:44
fungisounds like a fine plan14:45
beisnerhi all, how's landing bot?14:45
openstackgerritJesse Pretorius (odyssey4me) proposed openstack-infra/openstack-zuul-jobs master: Add openstack-ansible required-projects parent job  https://review.openstack.org/50828114:45
beisnerseems like we have a few things lost with the socks14:45
dmsimardmordred: we can cross link the etherpads or something, or fold them together, I don't have a strong opinion14:46
mriedemmordred: so my openstack-zuul-jobs change keeps failing on a project-config thing even though i have a depends-on the project-config change https://review.openstack.org/#/c/508520/ - does the project-config change need to merge first? assume so14:46
*** lukebrowning has joined #openstack-infra14:46
fungimriedem: yes, project-config additions aren't safe to test directly since they can be abused to expose secrets14:46
mriedemok14:47
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Handle errors returning nodesets on canceled jobs  https://review.openstack.org/50853214:47
AJaeger_mordred: LGTM, send it out14:48
jeblairmordred: etherpad and plan lgtm.14:49
*** shardy is now known as shardy_mtg14:49
*** amoralej|lunch is now known as amoralej14:49
Shrewsmordred: ++14:49
jeblairfungi: once that change lands, i'm going to want to restart zuul; do you think i should try to save queues?14:50
jeblairwe've never tried that with zuulv314:50
*** lukebrowning has quit IRC14:50
fungijeblair: we have quite a few people who have reported they're waiting on queued stuff to land, so maybe?14:50
* clarkb is catching up.14:50
clarkbmordred: do we know yet why trusty is used on openstack-tox jobs? or why some branch exclusions seem to be ignored?14:51
jeblairalso, when i restart zuul, i will remove the 19G debug log :(14:51
*** e0ne has quit IRC14:51
fungiprobably just as well14:51
jeblairclarkb: according to the etherpad mordred just wrote, we don't know that yet14:51
fungiclarkb: we don't know yet14:51
fungiclarkb: i've been mulling over the configs there and haven't spotted anything obviously wrong, but more eyes may help14:52
*** icey has joined #openstack-infra14:52
clarkbin zuul-jobs it looks like unittests <- tox <- various openstack tox jobs. unittests doesn't have a parent specified14:53
clarkbis perhaps some implied parenting breaking us?14:53
*** xarses has joined #openstack-infra14:55
mordredfungi, clarkb: no parent = parent: base ... and base should have nodeset: ubuntu-xenial ... but yah - we need to track down what's up with that14:56
*** wolverineav has quit IRC14:56
*** lukebrowning has joined #openstack-infra14:57
*** ykarel has joined #openstack-infra14:57
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852614:57
fungido we have a good example of a neutron job which skipped unit tests? i see 508438,2 in check right now ran (and passed) openstack-tox-py27 and openstack-tox-py3514:58
fungiStatus: Pass 12888 Skip 1137 http://logs.openstack.org/38/508438/2/check/openstack-tox-py27/3a6fdfe/testr_results.html.gz14:58
clarkb49801314:58
*** wolverineav has joined #openstack-infra14:58
fungithanks14:58
clarkbalso currently in the gate14:58
clarkbit is a change to ocata14:59
fungiaha, so not master14:59
fungimaybe that's the common thread14:59
*** rcernin has quit IRC15:01
mgagneminor issue with grafana and nodepool, some metrics aren't showing since zuulv3, anything I can do? http://grafana.openstack.org/dashboard/db/nodepool-inap15:01
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Always try to unlock nodes when returning  https://review.openstack.org/50853215:01
SpamapSrsync: rename failed for "/var/lib/zuul/builds/f2853eee2bae4e19b8b2a474de36d78b/work/logs/logs/deprecations.txt.gz" (from logs/.~tmp~/deprecations.txt.gz): No such file or directory (2)15:01
*** lukebrowning has quit IRC15:01
*** baoli has quit IRC15:01
AJaeger_logs/logs? Is that the problem?15:02
SpamapSThat's from 488013's post ara15:02
*** ykarel has quit IRC15:02
SpamapS49801315:02
SpamapSthat's the reason for the POST_FAILURE15:02
jeblairmgagne: dmsimard was looking into that15:02
*** egonzalez has quit IRC15:02
*** baoli has joined #openstack-infra15:02
*** baoli has quit IRC15:02
dmsimardyeah there's a patch, I believe it's failing the grafyaml job but haven't had the change to look yet15:02
dmsimardhttps://review.openstack.org/#/c/508349/15:03
fungiSpamapS: was that a multinode job?15:03
*** lukebrowning has joined #openstack-infra15:03
dmsimardError executing: cp -dRl /home/zuul/src/git.openstack.org/openstack-infra/grafyaml/. /home/zuul/src/git.openstack.org/openstack-infra/project-config/.tox/grafyaml/openstack-infra/grafyaml15:03
dmsimardthat doesn't look like a legit error ?15:03
SpamapSfungi: it was15:03
jeblairAJaeger_: logs/logs is probably okay (devstack jobs put logs inside of a directory called logs -- so the first logs us zuul machinery, it's the root of the upload.  the second is devstack machinery, it shows up in the final location)15:03
jeblairs/us/is/15:04
SpamapSdmsimard: rsync will exit non-0 when that happens, but yeah looks like maybe files disappeared while it was running.15:04
SpamapSoh n/m that's your cp error15:04
jeblairdmsimard: needs required-projects grafyaml15:04
dmsimardjeblair: ok, I'll send a patch15:05
dmsimardthanks.15:05
jeblairnp15:05
dmsimardthat should be in the FAQ :D15:05
dmsimardI'll send the patch first though15:05
mordredjust added a specific mention15:05
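jeblair's fix amounts to adding the missing repository to the job's required-projects so Zuul prepares it on the node; roughly (the job name here is assumed, not the exact one dmsimard patched):

    - job:
        name: project-config-grafana      # assumed name
        required-projects:
          - openstack-infra/grafyaml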
SpamapSfungi: is that rsync problem a known issue w/ multinode jobs?15:05
SpamapSIt came from http://logs.openstack.org/13/498013/1/gate/legacy-grenade-dsvm-neutron-multinode/f2853ee/15:05
openstackgerritMatt Riedemann proposed openstack-infra/project-config master: Remove nova-net jobs that are >newton  https://review.openstack.org/50852415:06
mriedemmtreinish: ^ since that changes tempest15:06
fungiSpamapS: yeah, mordred has a proposed fix stack15:06
fungibasically right now we're trying to collect logs from every node in the nodeset rather than just the primary node15:06
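A sketch of the direction the fix takes: run the log-collection play in the legacy post playbook against the primary node only, rather than every node in the nodeset. The workspace path is an assumption; zuul.executor.log_root is the standard Zuul log destination on the executor.

    - hosts: primary
      tasks:
        - name: Copy logs back to the executor
          synchronize:
            src: "{{ ansible_user_dir }}/workspace/"
            dest: "{{ zuul.executor.log_root }}"
            mode: pull
            copy_links: true
            verify_host: true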
clarkbI'm noticing that project-config consumes resources out of ozj like openstack-python-jobs template, but ozj is listed after project-config in the project list. Is that a problem? Comments say order matters15:07
mordredjeblair: I know you're looking at zuul deep issue- but could you look at https://review.openstack.org/#/c/508511/ and https://review.openstack.org/#/c/508510/ real quick- just want to make sure you're not opposed to that approach15:07
mordredclarkb: order matters for job definitions ...15:07
*** lukebrowning has quit IRC15:07
mordredclarkb: so, specifically, a job can't have a parent that was defined after it15:08
*** baoli has joined #openstack-infra15:08
SpamapSfungi: k15:08
mordredSpamapS: https://review.openstack.org/#/c/508511/ and https://review.openstack.org/#/c/50851015:08
*** hemna_ has joined #openstack-infra15:08
* SpamapS looks15:08
mordredclarkb: but that's within a class of config - the various classes are loaded into zuul's config in an order that should make sense - so all the jobs and project-templates are loaded before project definitions are loaded15:09
openstackgerritDavid Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add openstack-infra/grafyaml to the project-config grafyaml job  https://review.openstack.org/50853715:09
dmsimardjeblair: ^15:09
*** lukebrowning has joined #openstack-infra15:09
*** ramishra has joined #openstack-infra15:10
*** apuimedo has joined #openstack-infra15:10
jeblairmordred: oh! because only the primary node was in the inventory in v2.5, right?15:10
openstackgerritDavid Moreau Simard proposed openstack-infra/project-config master: Update Nodepool graphite metric names  https://review.openstack.org/50834915:11
jeblairmordred: both +215:11
clarkbit seems like explicit nodesets are working in the cases I have checked15:13
jeblairdmsimard: +2 on 508537, comment on 50834915:13
clarkbit's just the implicit nodeset that isn't working, for things like openstack-tox jobs15:13
*** lukebrowning has quit IRC15:14
mordredjeblair: yes15:14
mordredclarkb: WEIRD15:14
jeblairokay, i'm going to force-merge the zuul fix then restart zuul now.15:14
fungithanks jeblair!15:15
mtreinishjeblair: so I think for the puppet-apply jobs it's going to want all the puppet repos used by system-config. Is there a way to wildcard that in the job definition?15:15
mordredclarkb, jeblair: so - the only difference I can see is that our base job is using an anonymous nodeset rather than referencing the pre-defined ubuntu-xenial nodeset15:15
*** lukebrowning has joined #openstack-infra15:15
mordredperhaps there is a bug in the anonymous nodeset handling code?15:15
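The difference mordred is pointing at, in sketch form; both spellings are valid Zuul v3 syntax, and the label shown is an assumption:

    # alternative A: base job with an inline (anonymous) nodeset
    - job:
        name: base
        nodeset:
          nodes:
            - name: ubuntu-xenial
              label: ubuntu-xenial

    # alternative B: base job referencing a pre-defined, named nodeset
    - nodeset:
        name: ubuntu-xenial
        nodes:
          - name: ubuntu-xenial
            label: ubuntu-xenial

    - job:
        name: base
        nodeset: ubuntu-xenial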
jeblairmtreinish: no.  but if you need it for more than one job, you can define a parent job which adds all the repos, then have the jobs inherit from that, so the list is only in one place.15:15
jeblairmordred: which bug are we talking about?15:16
mtreinishjeblair: it'll be needed by all 3 infra puppet apply jobs so sure we can do that15:16
mordredjeblair: there is an issue where jobs are running on trusty nodes when they should be running on xenial nodes15:16
mtreinishjeblair: but that's going to be one big list, there are a ton of openstack-infra/puppet-* repos15:16
*** ykarel has joined #openstack-infra15:16
jeblairmtreinish: yah, so maybe 'infra-puppet-apply-base' or something15:16
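jeblair's suggestion in sketch form: a shared parent carries the long required-projects list once and the individual apply jobs inherit it (the repo list below is truncated and illustrative):

    - job:
        name: infra-puppet-apply-base
        required-projects:
          - openstack-infra/system-config
          # ... plus the rest of the openstack-infra/puppet-* repos

    - job:
        name: legacy-infra-puppet-apply-3-centos-7
        parent: infra-puppet-apply-base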
mordredjeblair: we can't find any reason for that - other than that all the instances we've seen of it are only jobs that are getting their nodeset implicitly through the base job15:17
clarkbmordred: I'm trying to confirm that the tox-* jobs run on xenial as expected15:17
clarkbmordred: as they share the same inheritance even15:17
mordredyah15:17
*** ramishra has quit IRC15:17
jeblairmordred: have any links handy?15:17
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content  https://review.openstack.org/50718015:17
mordredjeblair: I do not - clarkb do you?15:17
clarkbjeblair: http://logs.openstack.org/35/507235/2/check/openstack-tox-pep8/4962d18/zuul-info/ is an example15:17
jeblairmordred: i'm going to hold the zuul restart for this in case it's a zuul bug15:17
*** yamamoto has quit IRC15:18
*** vhosakot has joined #openstack-infra15:18
clarkbthat is a change to cinder master and its pep8 job ran on trusty15:18
SpamapSjeblair: do you think these nodes that aren't locked are results of more timeouts?15:18
mnenciaHi, is there any good reason other than lack of interest that stops https://blueprints.launchpad.net/openstack-ci/+spec/jenkins-job-builder-folders from being advanced? How can I help to get it included?15:19
jeblairSpamapS: zookeeper filled up its filesystem; i'm assuming anything after that until we restart zuul is because of that15:19
Shrewsinfra-root: fyi, nodepool.o.o disk usage steadily rising. at 90% now. we should keep an eye on it15:19
*** v1k0d3n has quit IRC15:19
clarkbmnencia: we haven't used launchpad blueprints in years. However, not sure if the JJB team has separately decided to use launchpad for that feature again. You'll want to talk to electrofelix and zaro and zxiiro I think15:20
SpamapSmnencia: jjb has fallen off openstack-infra's radar. Note that the current transition going on in here is the ultimate end of jjb use in OpenStack's CI. :-P15:20
*** v1k0d3n has joined #openstack-infra15:20
SpamapSjeblair: OW15:20
*** lukebrowning has quit IRC15:20
*** trown is now known as trown|brb15:20
mordredclarkb: I see it15:21
clarkbjeblair: Shrews /opt has half a terabyte free on nodepool.o.o we can move the zk data root15:21
fungimnencia: also, the jjb team doesn't really use launchpad since some years, so i doubt anyone's tracking that blueprint there anyway15:21
clarkbmordred: oh good, /me awaits enlightenment15:21
Shrewsclarkb: ++15:21
zxiiromnencia: we discuss jjb stuff in #openstack-jjb now. Looks like the folder plugin has a patch here https://review.openstack.org/#/c/134307/ but it's failing Jenkins so I guess someone needs to at least fix the jenkins error first15:22
SpamapSjeblair: remember when I said it's treacherous running a single node zk? One of the things that killed the ZK in the Copenhagen juju debacle of 2012 was the zk disk filling.15:22
*** lukebrowning has joined #openstack-infra15:22
*** dave-mccowan has joined #openstack-infra15:22
ShrewsSpamapS: :(15:22
clarkbbut if you ran 3 you'd just have 3 with full roots15:22
clarkbits not sharding the data aiui15:22
SpamapSBecause then restarting zk required applying every single transaction from the rather large (filled the disk!) transaction log ;)15:22
SpamapSclarkb: no it's filling because the log snapshots build up for some reason.15:23
SpamapSNow...15:23
SpamapSI thought that problem was fixed.15:23
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Remove broken openstack-tox-pep8 variant  https://review.openstack.org/50854215:23
mordredclarkb: ^^15:23
SpamapSI remember specifically looking at it and there was a specific feature added to help single node ZK's not do that.15:23
SpamapSso this was likely something else.15:23
fungii guess the idea is that when you have 3, they won't fill their disks at exactly the same moment and you can run around cleaning up files and restarting them constantly instead to avoid the outage? ;)15:23
mordredclarkb: the pipeline definition has a trusty variant defined but the branch exclusion it has defined is too broad15:23
clarkbmordred: its not just pep8 fwiw15:23
mnenciazxiiro thanks, it is one of the two PoCs attached to the blueprint. I'm going to ask in #openstack-jjb15:24
SpamapSI assume this filled the disk with actual znodes, not logs?15:24
clarkbmordred: the unittest jobs are in the same boat too aiui15:24
mordredclarkb: right- but we need to keep looking at instances and making sure that they don't have similar issues so we can determine if it's a job config issue or a zuul issue15:24
mordredclarkb: on cinder?15:24
clarkbmordred: http://logs.openstack.org/09/485209/7/check/openstack-tox-pep8/dfe7ff9/zuul-info/ there is nova15:25
ShrewsSpamapS: /var/lib/zookeeper/version-2/log.* and snapshot.* files15:25
SpamapSShrews: :(15:25
clarkber thats pep8 too15:25
SpamapSthat is the exact symptom I saw then. Hrm.15:25
clarkbmordred: http://logs.openstack.org/73/502473/6/check/openstack-tox-py27/48058ce/zuul-info/ py27 cinder is trusty too15:26
clarkbpy35 is not15:26
mordredclarkb: ok. cool - thanks15:26
ShrewsSpamapS: i think the log.* files are the largest15:26
clarkbmordred: seems to be fairly global on pep8 and py2715:26
SpamapS"A ZooKeeper server will not remove old snapshots and log files, this is the responsibility of the operator. Every serving environment is different and therefore the requirements of managing these files may differ from install to install (backup for example)."15:26
*** lukebrowning has quit IRC15:26
SpamapShttps://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html15:26
clarkbsounds like elasticsearch15:27
*** trown|brb is now known as trown15:27
fungifrickler: you were the one to first report the missing neutron unit tests in here... do you happen to know if all occurrences were for stable branch changes perhaps, with master branch changes running expected jobs instead?15:27
clarkbare we expected to clean them out of the fs directly or using some command against the server?15:27
SpamapSclarkb: reading :(15:27
SpamapS java -cp zookeeper.jar:log4j.jar:conf org.apache.zookeeper.server.PurgeTxnLog <dataDir> <snapDir> -n <count>15:27
clarkbthere is stuff from last year in there15:28
SpamapSthe count is the number of snaps to keep15:28
*** lukebrowning has joined #openstack-infra15:28
clarkbwe can probably go to a month's worth and be happy15:28
SpamapSrecommendation is 3 snaps15:28
jeblairShrews, clarkb: i'm about to shut down zuulv3; do we want to take the opportunity to move zk to /opt?  or just keep it running and run the cleanup command?15:28
SpamapS(just in case the most recent logs are corrupted)15:28
clarkbjeblair: I think we can likely get a huge win just with the cleanup command15:28
fungii'm in favor of trusting the cleanup command15:29
clarkbjeblair: since we have almost a year of stuff in that dir15:29
fungiat least in the near term15:29
fungitrust but verify ;)15:29
jeblairk let's start there then; i won't couple zuul restart to that15:29
*** jdandrea_ has quit IRC15:29
SpamapSif we care about being able to recover this data upon server loss, we should run the cleanup after we backup the server15:29
jeblairanyway, i am really going to restart zuul now.  :)15:29
mordredjeblair: when you have a sec, I think you might want to look at the xenial/trusty thing- but I think it can wait til post-restart15:29
mordredjeblair: ++15:29
SpamapSif we could theoretically just recover from a server loss by deleting all nodes and starting with a clean ZK, then we can just run the cleanup every hour.15:29
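An hourly purge along those lines is just the PurgeTxnLog invocation quoted above put on a cron; newer ZooKeeper releases can also do this internally via autopurge.snapRetainCount and autopurge.purgeInterval in zoo.cfg. A minimal Ansible sketch (the jar/config paths and the zookeeper user are illustrative assumptions, not what the puppet-managed server actually uses):

    - name: Purge old ZooKeeper snapshots and txn logs hourly, keeping 3 snapshots
      cron:
        name: zookeeper-purge-txnlog
        minute: "0"
        user: zookeeper
        job: >-
          java -cp /usr/share/java/zookeeper.jar:/usr/share/java/log4j-1.2.jar:/etc/zookeeper/conf
          org.apache.zookeeper.server.PurgeTxnLog
          /var/lib/zookeeper /var/lib/zookeeper -n 3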
*** dhajare has quit IRC15:30
clarkbSpamapS: ya I don't think we care too much about data loss other than for debugging purposes15:30
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Always try to unlock nodes when returning  https://review.openstack.org/50853215:30
SpamapSyeah, seeing as we only have 1... hourly cleanup15:30
*** kiennt26 has quit IRC15:30
jeblairmordred: do i need to do a quick change to expose the inheritance path variable?15:31
mordredjeblair: make sure 508532 is installed before you restart :)15:31
mordredjeblair: maybe so?15:31
*** d0ugal has joined #openstack-infra15:31
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852615:31
mtreinishjeblair: ^^^ hopefully I did that correctly15:31
mordredjeblair: is there anything that will show whether zuul has decided to apply a variant and if so where it got the variant from?15:31
mordredmtreinish: looks good- except for a tab in front of required-projects15:32
*** lukebrowning has quit IRC15:33
mordredmtreinish: you have some things with their own required-projects and some using the base job you defined - was that on purpose?15:33
electrofelixmnencia: we just stopped using blueprints in launchpad to track stuff, combined with other things being more important; drop into the #openstack-jjb channel and we can help there15:34
fungizuul does at least merge the lists together, so you can have a main set you inherit and then add others in the ancestor15:34
lbragstadSpamapS: when you encountered the rsync issue, did you also see a checksum failure?15:34
SpamapSmordred: I think during the reconfig debug output you get some of that.15:34
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Add inheritance path to zuul vars  https://review.openstack.org/50854315:34
mtreinishmordred: yeah, the beaker jobs are also broken, but don't need all of infra's puppet to work15:35
SpamapSlbragstad: no, but the rsync thing is a known problem that is addressed by 50851115:35
mordredjeblair: lgtm15:35
mtreinishmordred: I could split it up into 2 patches I guess, but I figured just fix all the infra jobs at once15:35
SpamapSjeblair: ooooo I like that15:35
lbragstadSpamapS: awesome - reviewing15:35
mordredmtreinish: nah- looks great- just making sure15:35
mordredmtreinish: fix that tab and I think it's good15:35
mtreinishmordred: sigh I was in paste mode and hit tab, respinning one sec15:35
lbragstadSpamapS: i noticed the checksum thing right before the rsync issue in this specific case15:35
lbragstadhttp://logs.openstack.org/57/486757/22/check/legacy-tempest-dsvm-neutron-full/cbf0f1c/job-output.txt.gz#_2017-09-28_23_09_33_94608415:35
jeblairSpamapS: yeah, i'll make it a nice list of dicts later.15:35
*** chlong has quit IRC15:35
jeblairSpamapS: right now it's a string description; so should be enough for us to have a clue what's up.15:36
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852615:36
jeblair(rather, it's a list of strings right now)15:36
clarkbI suppose I should review 5051115:36
clarkb*50851115:36
jeblairSpamapS, clarkb: you want to +2 508543 and i'll force-merge?15:36
jeblairinclude it in the restart15:37
mtreinishmordred: although the beaker jobs still aren't passing, but they run at least: http://logs.openstack.org/58/508258/3/check/legacy-openstackci-beaker-ubuntu-trusty/67395a9/job-output.txt.gz#_2017-09-29_15_19_54_48063815:37
mordredinfra-root: https://review.openstack.org/#/c/508524 for nova, companion in https://review.openstack.org/#/c/50851915:37
jlvillalIs there a Zuul v3 status page? For us to find out if things should be or should not be working?15:37
SpamapSok, time to go find breakfast and the office. AFK for a bit15:37
mordredjlvillal: I just sent an email to the mailing list with an update and some links to some things15:38
*** shardy_mtg is now known as shardy15:38
clarkbjlvillal: done15:38
jlvillalmordred, Great. Thanks.15:38
clarkber jeblair done, sorry jlvillal15:38
jlvillalheh, autocomplete on last used nick15:38
*** bauzas is now known as bauwser15:38
jlvillalI have noticed this job failing again and again with POST_FAILURE: https://review.openstack.org/#/c/508287/15:39
fungijlvillal: it should be redirecting on its own any time now as well15:39
jlvillalfungi, The POST_FAILURE issue?15:40
fungijlvillal: the status page15:40
jlvillalfungi, Ah, thanks15:40
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Add inheritance path to zuul vars  https://review.openstack.org/50854315:40
mordredjlvillal: oh! that's a failure in a nice new shiny v3 job :(15:40
mordredwell - this is another instance of openstack-tox-py27 running on trusty - I wonder if that's related15:41
fricklerfungi: re neutron, yes, there wasn't any change in master since the cutover it seems, all stable/*15:41
mordredclarkb: ^^ see http://logs.openstack.org/87/508287/1/check/openstack-tox-py27/4e00cb9/job-output.txt.gz#_2017-09-29_04_10_40_84734615:41
clarkbmordred: re https://review.openstack.org/#/c/508511/3 I don't think that is a noop since primary isn't a thing unless you are a multinode job15:41
fungihrm, the redirect patch merged over an hour ago (14:33z) and doesn't seem to have applied yet. looking into that real quick15:41
jlvillalmordred, Thanks. I wasn't sure if that was a known issue with the POST_FAILURE15:42
fungifrickler: thanks, that suggests there's something going on with branch exclusions in that case15:42
mordredclarkb: see the parent patch15:42
openstackgerritBen Nemec proposed openstack-infra/tripleo-ci master: Switch cistatus page to zuul v3  https://review.openstack.org/50854615:43
clarkbmordred: derp, clearly too early in the morning15:43
jeblairzuul is stopped15:44
clarkbmordred: ok +2'd both changes but didn't approve as waiting for zuul things to complete15:44
jeblairzuul is starting15:44
mordredclarkb: kk15:44
mordredclarkb, jlvillal: I think I see the bug in tox log collection15:44
jlvillal:)15:45
fungiapparently status.o.o isn't updating because we have a system package conflict for npm/nodejs installation which is tanking the whole manifest15:45
mordredfungi: AWESOME15:45
funginpm depends on newer versions of a bunch of nodejs stuff which isn't being installed (for unspecified reasons). i'll probably have to try by hand to see why15:46
fungiE: Unable to correct problems, you have held broken packages.15:47
*** lukebrowning has joined #openstack-infra15:47
clarkbfungi: if npm is trying to update itself that is known to cause problems15:47
SamYaplecan someone help me with https://review.openstack.org/#/c/508425/ ? zuul doesnt seem to be triggering anything at all and it never returns or responds to the ticket15:47
fungiclarkb: puppet is attempting to install npm, and nodejs is apparently already installed from nodesource15:47
mordredfungi, clarkb: dealing with the javascript stack around zuul is on my todo list for once the dust settles here15:47
openstackgerritDavid Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add missing required-projects for TripleO jobs using devstack-gate  https://review.openstack.org/50854815:48
jeblairSamYaple: wait a few minutes then do a recheck -- zuul was stuck for a while15:48
jeblairSamYaple: i'm restarting it now15:48
dmsimardinfra-root: ^ is the next step to fix the broken tripleo gate, it's sitting under 2 patches from mordred which also need to land15:48
SamYaplejeblair: its been like this since yesterday and all of last night15:48
SamYaplesince zuulv3 cutover it has not responded to any patchset in the openstack/loci namespace15:49
jeblairSamYaple: it may be something else then, but since i just restarted the debug procedure will be the same :|15:49
mordredfungi: short-term, http://paste.openstack.org/show/622318/ is for adding nodesource apt repos for node things15:49
jeblairzuul is up now15:49
openstackgerritAndreas Jaeger proposed openstack-infra/openstack-zuul-jobs master: Fix project-config-grafyaml repos  https://review.openstack.org/50854915:49
clarkbmordred: I think fungi is saying that is already in place on status.o.o15:50
clarkbmordred: there is puppet apt resource management of it iirc15:50
clarkbwe use it in etherpad too15:50
dmsimardclarkb: why did https://review.openstack.org/#/c/508548/ come into merge conflict just now? o_O15:50
fungithis is what's going on for status.o.o: http://paste.openstack.org/show/622320/15:50
dmsimardit's a clean patch on top of the tree15:50
jeblairi'm re-enqueing changes from before i stopped zuul15:50
clarkbdmsimard: it depends on something that failed to merge15:51
clarkbdmsimard: so one of mordreds patches tickled it15:51
dmsimardclarkb: it doesn't depend on anything and the two patches below seem to be fine15:51
SamYaplejeblair: openstack/loci was noop gate only before the cutover, is there a problem with zuulv3 and noop?15:51
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content  https://review.openstack.org/50718015:51
dmsimardmaybe something else merged in the meantime15:51
*** lukebrowning has quit IRC15:51
jeblairSamYaple: there was yesterday; should be fixed today15:51
clarkbdmsimard: also possible that is fallout from the zuul restart15:51
jeblairSamYaple: with the current restart15:51
mordredclarkb: ok, nod15:51
fungistatus.o.o is using "deb https://deb.nodesource.com/node_0.12 trusty main" in its sources.list15:52
mordreddmsimard: I rechecked it just now15:52
dmsimardAJaeger_: commented https://review.openstack.org/#/c/508549/15:52
mordredfungi: nod15:52
SamYaplejeblair: ack15:52
*** derekh has quit IRC15:52
*** lukebrowning has joined #openstack-infra15:53
mordredfungi: oh - I wonder if trusty somehow got a backport of npm/node that's newer than what we're getting from nodesource (since 0.12 is rather old)15:53
AJaeger_dmsimard: great, thanks. Will +2A then ;)15:53
fungimordred: possible15:53
mordredfungi: apt-cache policy says I'm wrong :)15:54
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852615:54
openstackgerritDaniel Mellado proposed openstack-infra/openstack-zuul-jobs master: Fetch python-octaviaclient from pip  https://review.openstack.org/50855015:54
mordredfungi:   Installed: 0.12.14-1nodesource1~trusty115:54
mordred  Candidate: 0.12.18-1nodesource1~trusty115:54
fungilooks like nodejs is pending upgrade (not a security patch so unattended-upgrades doesn't install it automatically)15:54
mordredah15:54
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content  https://review.openstack.org/50718015:55
fungii'm going to `sudo apt-get upgrade` on status.o.o for now and see if that unhinges this15:55
funginope, doesn't help15:56
fungiThe following packages have unmet dependencies: nodejs : Conflicts: npm15:57
*** lukebrowning has quit IRC15:57
fungiaha!15:57
fungithe joys of mixing third-party package repositories with upstream15:58
fungier with distro15:58
*** zzzeek has quit IRC15:58
fungione has a nodejs package which provides npm, the other has npm broken out as a separate package15:58
clarkbinheritance path seems to be working well15:58
clarkbhttp://logs.openstack.org/09/508209/3/check/legacy-cinder-tox-functional/772c444/zuul-info/inventory.yaml15:58
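For readers following the link: that debug info lands in the job's zuul-info/inventory.yaml as a list of strings under the zuul vars, something like the fragment below (the variable name is from memory and the single entry shown just mirrors the format jeblair quotes a bit later in this log; real inventories carry one string per variant applied):

    zuul:
      _inheritance_path:
        - 'inherit from <Job base branches: None source: openstack-infra/project-config/zuul.d/secrets.yaml@master>'
        # ...followed by one entry per variant, ending with the job that actually ran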
*** zzzeek has joined #openstack-infra15:59
SamYaplejeblair: ah perfect. thank you. that fixed it16:00
*** xyang1 has joined #openstack-infra16:00
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852616:00
fungii've removed the nodesource entry from sources.list temporarily, purged the nodejs package, and am reinstalling npm and nodejs packages from ubuntu16:00
jeblairSamYaple: yay!16:01
jeblairthe process to re-enqueue changes is ongoing; things are moving very slowly, largely since so many of the changes in flight are zuul config changes which require dynamic config updates16:02
fungii've now put the nodesource sources.list entry back, updated and upgraded the nodejs package16:02
SamYapleso question.. in project-config, is zuul/layout.yaml still used? or only zuul.d/*16:02
mordredinfra-root: there is a bug in fetch-tox-output where its combination of find: and synchronize: is leading it to try to fetch a thing that doesn't exist ... I'm working on a fix16:02
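The general shape of that kind of fix (not the actual content of the patch that shows up later; the paths and variable names here are assumptions) is to register the find: results and only synchronize: files that actually exist, so an empty match list becomes a no-op:

    - name: Find tox log files on the remote node
      find:
        paths: "{{ ansible_user_dir }}/.tox"
        patterns: "*.log"
        recurse: yes
      register: tox_logs

    - name: Pull back only the logs that were found
      synchronize:
        mode: pull
        src: "{{ item.path }}"
        dest: "{{ zuul.executor.log_root }}/tox/"
      with_items: "{{ tox_logs.files }}"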
jeblairSamYaple: only zuul.d/16:02
*** yee379 has quit IRC16:03
mordredjeblair: I kinda think we should go ahead and land a patch removing layout.yaml and jenkins/jobs ... I've seen several patches to them come through and them not being there any more is a good way to stop those16:03
*** yee379 has joined #openstack-infra16:03
clarkbhttp://logs.openstack.org/01/474801/1/gate/openstack-tox-py27/a55c57a/zuul-info/inventory.yaml inheritance debug info for why trusty is used in pep8/py27 jobs16:03
*** lukebrowning has joined #openstack-infra16:03
mordredhttps://review.openstack.org/#/c/507180/ <-- AJaeger_ updated the patch to do that16:03
fungiokay, problem recreated. so if we're using the nodesource packages, we should not attempt to install the npm package since their nodejs package provides npm on its own, so attempting to install npm directly at that point results in the observed dependency resolution errors16:03
*** sbezverk has quit IRC16:04
openstackgerritSam Yaple proposed openstack-infra/project-config master: Remove loci-jobs from project-config  https://review.openstack.org/50855216:04
SamYaplejeblair: so ^^ is a good patch?16:04
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Update zuul-changes script for v3  https://review.openstack.org/50855316:04
*** gongysh has quit IRC16:04
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852616:04
*** sambetts is now known as sambetts|afk16:05
mordredclarkb: :( -- none of those indicate that it thinks it wants to apply a variant with nodeset ubuntu-trusty16:05
openstackgerritMerged openstack-infra/project-config master: Remove nova-net jobs that are >newton  https://review.openstack.org/50852416:05
clarkbmordred: would it though? I'm mostly confused why there are 6 variants at all16:06
clarkbthe job is defined in one place then used in a single template for swift from what I can tell16:06
AJaeger_mordred: I agree - I gave a -1 on everything proposed the last three days already...16:06
jeblairSamYaple: left comment16:06
AJaeger_mordred: I'll rebase 507180 now16:07
clarkbI wouldn't expect any variants, that job is basically defined specifically for swift/cinder/nova/neutron/etc16:08
jeblairclarkb, mordred: 'inherit from <Job base branches: None source: openstack-infra/project-config/zuul.d/secrets.yaml@master>'  *secrets.yaml* ?16:08
*** lukebrowning has quit IRC16:08
clarkbjeblair: oh huh /me looks16:08
clarkbI don't see a job base in secrets.yaml16:09
jeblairnor do i.  that's funky.16:09
clarkbor any job16:09
*** owalsh_ has joined #openstack-infra16:09
SamYaplejeblair: ive never used Needed-By, thats a thing? I am assuming it is the same as Depends-On, only in the opposite direction?16:09
jeblairSamYaple: yeah.  it's not recognized by tooling, it's just for humans.16:09
SamYapleah i see. will do. thanks16:10
*** lukebrowning has joined #openstack-infra16:10
clarkbjeblair: mordred I think secrets.yaml had all its content deleted and was renamed to that path at some point. Is it possible there is some funky git behavior going on around that?16:10
clarkbjeblair: mordred maybe zuul isn't loading clean content of that file16:10
fungiSamYaple: basically a bookkeeping convenience to signal to reviewers that there's this other change in a different repo depending on it16:10
*** camunoz has quit IRC16:10
jeblairclarkb, mordred: i think we need a tool to manually run a cat job.16:11
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove zuul v2 and jjb content  https://review.openstack.org/50718016:11
*** owalsh has quit IRC16:11
AJaeger_mordred: ^16:11
openstackgerritSam Yaple proposed openstack-infra/openstack-zuul-jobs master: Remove legacy loci jobs  https://review.openstack.org/50855616:12
clarkbAJaeger_: mordred I'd personally like to keep that around a little longer, as cross-referencing while we unbreak the transition has been useful16:12
openstackgerritSam Yaple proposed openstack-infra/project-config master: Remove loci-jobs from project-config  https://review.openstack.org/50855216:12
SamYapleok, i think that should do it16:12
clarkb(I can always checkout old commit though)16:12
clarkbjeblair: that would let us retrieve what zuul is looking at for file contents right?16:13
SamYapleso I even need merge-check in the project-config repo? or can I remove all of that and just do it from the loci repo?16:13
fricklerjeblair: there are five matches for openstack-python-jobs-trusty in p-c/zuul.d/projects.yaml , that would match the 5 variants in your log16:13
clarkbjeblair: if so ++ I think that would be useful16:13
fricklerjeblair: and openstack-python-jobs-trusty sets node: trusty for openstack-tox-py27 unconditionally16:13
fricklerjeblair: so maybe that template isn't applied project-specific?16:13
*** owalsh_ has quit IRC16:14
*** lukebrowning has quit IRC16:14
jeblairclarkb: oh, i think i see the problem; it's a bug in the multi-file parsing; all config objects get the source context of the last file parsed from a repo-branch.  it should not have an adverse effect on security, but it will make debug messages and zuul config error messages look weird.16:14
AJaeger_clarkb: I'm fine with waiting as well - but let's discuss a date/timeframe16:14
*** jascott1 has joined #openstack-infra16:15
clarkbfrickler: that would certainly explain it if the trusty template is somehow getting applied everywhere16:16
*** lukebrowning has joined #openstack-infra16:16
clarkbfrickler: where we apply the -trusty template we also apply the non trusty template which seems odd to me16:17
clarkbperhaps that is causing a collision of some sort that is getting resolved in trusty's favor?16:17
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Add openstack-infra/grafyaml to the project-config grafyaml job  https://review.openstack.org/50853716:17
*** yamamoto has joined #openstack-infra16:18
jeblairclarkb, frickler: the bug i'm seeing would only account for the filename being wrong, not the project16:19
*** owalsh has joined #openstack-infra16:19
clarkblooking at old zuul layout group-based-policy at least wants to run trusty on mitaka branch and xenial on not mitaka16:19
clarkbbut our config doesn't seem to apply a branch restriction to the openstack-python-jobs-trusty variants16:19
AJaeger_508396 just failed with an unrelated error - that looks strange. zuul complained about a syntax error. Could an expert check this, please?16:19
clarkbjeblair: I'm wondering if, since the -trusty variant comes after the non-trusty one and isn't restricted by something like a branch matcher, it is just overwriting the base variant for xenial?16:20
openstackgerritDavid Moreau Simard proposed openstack-infra/project-config master: Update Nodepool graphite metric names  https://review.openstack.org/50834916:20
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet >newton jobs  https://review.openstack.org/50852016:20
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet  https://review.openstack.org/50851916:20
*** LindaWang has quit IRC16:20
clarkbjeblair: basically what happens if you say job-foo, then later job-foo: nodeset: trusty16:20
dmsimardjeblair: ^ with your comment addressed. It's worth exploring a broader rework of the provider-specific dashboards but I'd do that in another patch.16:21
jeblairclarkb: trusty16:21
*** lukebrowning has quit IRC16:21
jeblairclarkb: last wins16:21
clarkbjeblair: mordred frickler ok I think that may explain it then16:21
clarkbwe need to restrict the trusty set to branch ^stable/mitaka$16:21
clarkbor similar16:21
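A sketch of the restriction clarkb is describing (illustrative YAML only, not the actual project-config content; the gate pipeline would get the same treatment): scope the trusty variant with a branch matcher, since, as discussed just above, when two variants of the same job apply the last one wins:

    - project-template:
        name: openstack-python-jobs-trusty
        check:
          jobs:
            - openstack-tox-py27:
                nodeset: ubuntu-trusty
                # without this matcher the variant applies unconditionally
                branches: ^stable/mitaka$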
jeblairclarkb: does swift have the openstack-python-jobs-trusty template?16:22
clarkbAJaeger_: looks like maybe extra whitespace snuck in16:22
clarkbjeblair: no16:22
dmsimardeh, that's weird.. 508548 has finished all its jobs successfully but it appears it's not completing and reporting status to the review and is still on zuulv3.o.o16:22
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet  https://review.openstack.org/50851916:22
*** lukebrowning has joined #openstack-infra16:22
jeblairclarkb: where's the trusty nodeset coming from then?16:22
clarkbjeblair: I think from the projects ahead in config modifying that job by including the -trusty job16:23
clarkbjeblair: I'm making a jump that the nodeset: trusty is side effecting globally after a project loads it16:23
jeblairclarkb: oh, i'm not there yet; that shouldn't happen16:23
AJaeger_clarkb: any idea where exactly?16:24
*** yamamoto has quit IRC16:24
jeblairclarkb: the way it should work is that each variant gets applied to a copy of the job in series.  so they shouldn't affect each other in that way16:24
clarkbAJaeger_: on the blank line between title and paragraph16:24
clarkbAJaeger_: I think16:24
AJaeger_clarkb: will you fix or shall I ?16:25
AJaeger_another thing I don't understand http://logs.openstack.org/39/508539/3/check/legacy-tox-doc-publish-checkbuild/84ac3d4/job-output.txt.gz#_2017-09-29_16_14_48_044446 - why do I get an rsync error here?16:25
clarkbAJaeger_: my local checkout says I'm wrong though, no white space there16:26
*** pcaruana has quit IRC16:26
clarkboh wait it specifically says job base not defined16:26
*** lukebrowning has quit IRC16:26
clarkbwhich is even more confusing16:27
fungicmurphy: clarkb: ianw: okay, i've tracked down the status.o.o updating breakage back to https://review.openstack.org/473136 which seems to be cool for ubuntu system packages on xenial but not for where we're deploying openstack-health on status.o.o running trusty with the nodesource third-party package repository16:27
dmsimardDo we have enough zuul mergers running ? Looks like we're lagging behind16:27
fungicmurphy: clarkb: ianw: https://github.com/voxpupuli/puppet-nodejs#npm_package_ensure suggests the npm_package_ensure option is not intended for use with nodesource's packages16:28
jeblairdmsimard: no; we'll run more when we decommission zuulv216:28
dmsimardjeblair: ack16:28
*** lukebrowning has joined #openstack-infra16:28
clarkbfungi: ah ok sounds like we can just drop that entirely and the nodejs package will give us npm16:30
fungiyep16:30
fungirepo_url_suffix seems to explicitly refer to nodesource sources.list addition16:30
*** trown is now known as trown|lunch16:30
clarkbAJaeger_: rsync: change_dir "/home/zuul//publish-docs" failed: No such file or directory (2) now to see what the directory should be16:30
fungiin http://git.openstack.org/cgit/openstack-infra/system-config/tree/modules/openstack_project/manifests/release_slave.pp we just use repo_url_suffix without npm_package_ensure16:31
clarkbAJaeger_: rsync -a www/static/ publish-docs/www/ that is where it copies to publish-docs, now to see what it is relative to16:32
openstackgerritSam Yaple proposed openstack-infra/openstack-zuul-jobs master: Remove legacy loci jobs  https://review.openstack.org/50855616:32
jeblairinfra-root, AJaeger_: can you keep an eye out for changes to project-config/zuul/* (eg, layout.yaml).  please don't approve those.  they cause full zuul v3 reconfigurations due to puppet even though they don't actually change zuulv3.16:32
AJaeger_clarkb: relative to working dir16:33
*** lukebrowning has quit IRC16:33
AJaeger_jeblair: yeah, we should not touch layout.yaml at all anymore - that change by mriedem was too eager16:33
clarkbAJaeger_: /home/zuul/workspace looks like16:33
clarkbAJaeger_: so we need to update that path16:33
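A sketch of the kind of adjustment clarkb means (not the actual fix that follows; the directory layout and variables are assumptions): perform the copy relative to the job's work dir instead of the old /home/zuul/workspace layout, for example:

    - name: Copy built docs to where the publish step expects them
      shell: |
        mkdir -p {{ ansible_user_dir }}/publish-docs/www
        rsync -a www/static/ {{ ansible_user_dir }}/publish-docs/www/
      args:
        chdir: "{{ ansible_user_dir }}/src/{{ zuul.project.canonical_name }}"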
fungijeblair: full ack16:33
dmsimardmordred: hmm, unless mistaken our patch stack (starting from 508510) is not getting enqueued to gate16:34
AJaeger_jeblair: I'll -1 everything that touches zuul/layout or jenkins/jobs - infra-root, let's treat these as *frozen*16:34
dmsimardmordred: I think 508510 needs a rebase ? says the parent is outdated16:35
*** lukebrowning has joined #openstack-infra16:35
* dmsimard rebases16:35
AJaeger_clarkb: I'll prepare a change...16:35
openstackgerritDavid Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs  https://review.openstack.org/50851016:35
openstackgerritDavid Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary  https://review.openstack.org/50851116:35
clarkbAJaeger_: ok16:35
openstackgerritDavid Moreau Simard proposed openstack-infra/openstack-zuul-jobs master: Add missing required-projects for TripleO jobs using devstack-gate  https://review.openstack.org/50854816:35
dmsimard^ above stack will need fresh +W's16:36
*** panda is now known as panda|bbl16:36
clarkbdmsimard: what precipitated the new patchsets?16:37
clarkbwas it that merge conflict?16:37
dmsimardclarkb: they all passed check queue but were not getting enqueued to gate16:37
*** edmondsw has quit IRC16:37
openstackgerritAndreas Jaeger proposed openstack-infra/openstack-zuul-jobs master: Fix some publishing jobs  https://review.openstack.org/50856216:37
dmsimardI gave it a good amount of time to allow for merger lag to catch up and they were still not getting enqueued, I figured it was perhaps because the parent commit on 508510 was outdated16:38
AJaeger_clarkb: ^16:38
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Make fetch-tox-output more resilient  https://review.openstack.org/50856316:38
clarkbdmsimard: its possible that zuul was just behind too. The queue counts at the top of the status page should give you an idea if it is caught up (queue sizes of 0)16:38
openstackgerritJeremy Stanley proposed openstack-infra/puppet-openstack_health master: Don't set npm_package_ensure  https://review.openstack.org/50856416:38
clarkbdmsimard: but it could also be an out-of-date parent16:38
jeblairclarkb, dmsimard: it is more likely zuul backlog16:39
mordredclarkb, fungi, AJaeger_, jlk: https://review.openstack.org/508563 should fix the issue with failing to fetch tox logs16:39
fungiinfra-root: infra-puppet-core: 508564 should hopefully get puppet going on status.openstack.org again16:39
mordredjeblair, dmsimard: ^^ you too16:39
*** lukebrowning has quit IRC16:39
*** jcoufal has joined #openstack-infra16:40
clarkbya looks like the backlog is ~17 minutes at this point16:41
*** lukebrowning has joined #openstack-infra16:41
mordreddmsimard, clarkb, jeblair: yatin has a comment on 508510 - it seems reasonable to me, but I think maybe a followup16:41
fungiinfra-root: i'm going to hand patch 507244 (the zuul status redirect) onto status.openstack.org in the interim while we wait for 508564 to merge16:41
dmsimardmordred: I thought I fixed that16:42
*** jcoufal_ has quit IRC16:42
dmsimardmordred: hm, it was in one of my previous multinode patchsets but not in the current ones (which will need more rebases T_T)16:43
fungiactually, puppet managed to apply 507244 on its own, but never got far enough into the manifest to reload apache. doing that now16:43
jeblairmordred: either way should work i think16:44
dmsimardmordred: it's a legit fix but can be follow-up. Naming the node that way makes it so you can't use groups['subnodes'] for example.16:44
dmsimardI think devstack-gate uses groups['subnodes'] actually.16:44
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Update ubuntu-xenial-2-node to match centos-7-2-node  https://review.openstack.org/50856816:44
mordredjeblair, dmsimard: ^^16:45
jeblairmordred, clarkb: i'm going to go spend some significant time on the trusty variant thing.  i will be incommunicado for a bit.16:45
dmsimardmordred: yeah subnodes usage: http://git.openstack.org/cgit/openstack-infra/devstack-gate/tree/playbooks/devstack-legacy.yaml#n15 and http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/pre.yaml#n1516:45
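For context, a Zuul v3 multinode nodeset with an explicit group looks roughly like this (the nodeset and node names are assumptions for illustration); defining a group called subnodes is what lets playbooks like the devstack-gate ones linked above address hosts via groups['subnodes']:

    - nodeset:
        name: legacy-ubuntu-xenial-2-node
        nodes:
          - name: primary
            label: ubuntu-xenial
          - name: subnode-2
            label: ubuntu-xenial
        groups:
          - name: subnodes
            nodes:
              - subnode-2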
clarkbjeblair: gl16:45
mordredjeblair: ok. cool - and good, because I'm stumped by it16:45
*** lukebrowning has quit IRC16:45
mordreddmsimard: you said we're going to need another rebase?16:46
dmsimardmordred: I already did it16:46
*** lnxnut_ has joined #openstack-infra16:46
dmsimardmordred: however it's not certain if it was a rebase issue or if the zuul backlog is >10 minutes16:46
dmsimardafter 10 minutes the change was still not enqueued to gate16:46
mordreddmsimard: nod16:46
*** electrofelix has quit IRC16:47
*** lukebrowning has joined #openstack-infra16:47
clarkbmordred: comment on https://review.openstack.org/#/c/508563/116:47
*** rtjure has quit IRC16:49
*** jcoufal_ has joined #openstack-infra16:50
*** mugsie has quit IRC16:51
*** lukebrowning has quit IRC16:52
*** jcoufal has quit IRC16:53
*** lukebrowning has joined #openstack-infra16:53
*** jdandrea_ has joined #openstack-infra16:55
*** lukebrowning has quit IRC16:58
dmsimardinfra-root: zuul/nodepool are not dequeuing fast enough to cope with the load, we're at ~175 nodes in-use right now16:59
*** lukebrowning has joined #openstack-infra17:00
inc0good morning guys, minor thing https://twitter.com/OpenStackStatus <-  charts are all flat since zuulv317:00
dmsimardthere is a bunch of nodepool capacity we're not tapping into17:00
dmsimardinc0: that's jd_17:00
jeblairdmsimard: zuul is backlogged17:01
*** ykarel has quit IRC17:02
dmsimardjeblair: queue length has dropped though17:02
dmsimardjeblair: unless the backlog would be elsewhere17:02
inc0https://twitter.com/OpenStackStatus/status/913473927996936192 <- I like this one;)17:02
jeblairdmsimard: the system isn't stable until it hits zero17:02
*** Swami has joined #openstack-infra17:02
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Make fetch-tox-output more resilient  https://review.openstack.org/50856317:02
mordredclarkb: thanks - fixed ^^17:02
*** ralonsoh_ has quit IRC17:02
dmsimardjeblair: ok, anything I can do to help ?17:03
*** jascott1 has quit IRC17:03
mordredclarkb, fungi, dmsimard: has anybody looked into infra-manual publishing yet?17:03
dmsimardmordred: I haven't, what's the problem ?17:03
mordredwell - it may just be a lost-connection-to-host issue: http://logs.openstack.org/ab/a0e829e5cbd68815cf0b00687a9ac7e5228c56ab/post/publish-openstack-python-docs-infra/9ef3842/job-output.txt.gz17:04
*** lukebrowning has quit IRC17:04
mtreinishjeblair, mordred: any idea what's going on here: http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/4490a3f/job-output.txt.gz#_2017-09-29_16_46_27_131506 ?17:05
openstackgerritMerged openstack-infra/project-config master: Zuul versions of sudo grep checks  https://review.openstack.org/50831317:05
dmsimardmordred: yeah, looks like the host went unreachable midjob -- the SSH key removal task failed as well17:05
jeblairdmsimard: it's not going to get better until we optimize the config loading/parsing.  we had no idea what the config would look like until a few days before the cutover, so we've never seen something like this.  now that we have an at-scale configuration, we can tune for it.  that's going to take a few days -- after we fix all the little fires.17:05
mordreddmsimard: oh - it also might be that the most recent infra-manual patches didn't manage to run the post job17:05
mordreddmsimard: http://logs.openstack.org/95/95c4d1433c74ad23894f7296be51a3a23b3c6e56 doesn't exist and 95c4d1433c74ad23894f7296be51a3a23b3c6e56 is the tip ..17:06
*** lukebrowning has joined #openstack-infra17:06
* AJaeger_ goes offline again - final leg of my journey...17:06
mordredfungi clarkb: feel like +3ing https://review.openstack.org/#/c/436455/2 so that we can see it trigger a post job?17:07
mordredmtreinish: looking17:07
SamYaplewhat am I doing wrong with this job removal? https://review.openstack.org/#/c/508556/17:07
dmsimardjeblair: ok, happy to help if it's something I can lend a hand with17:07
mordredmtreinish: I mean - it looks like puppet-bugdaystats just isn't in required-projects (and the logging is weird/interleaved)17:08
dmsimardSamYaple: I don't believe you can do a depends-on from a patch that is in project-config17:08
fungimordred: done17:08
SamYapledmsimard: yea i was throwing it in there to test. i think youre right17:08
mtreinishmordred: it is though, because above it zuul-cloner pulls it: http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/4490a3f/job-output.txt.gz#_2017-09-29_16_46_27_05689117:08
dmsimardSamYaple: the job content in project-config is 'trusted' which means it contains sensitive things that, if altered, could expose secrets and things like that.17:08
dmsimardSamYaple: zuul doesn't allow you to do speculative (depends-on) testing against trusted repos17:09
dmsimardat least that's my understanding17:09
mtreinishmordred: I've got the patch up adding it to the job definition https://review.openstack.org/508526 and that run was my patch with a depends-on on it17:09
mordredmtreinish: nod - lemme look further then17:09
SamYapleok. that makes sense, but im not sure what the next step is for me to remove the legacy job17:10
mordredmtreinish: so - if you look in http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/4490a3f/zuul-info/inventory.yaml17:10
mordredmtreinish: you can see the list of projects zuul thinks it should be running with17:10
*** lukebrowning has quit IRC17:10
*** gouthamr has quit IRC17:10
mordredmtreinish: oh- that was a run of legacy-infra-puppet-apply-317:11
mtreinishmordred: hmm, none of the things I added to required projects is there17:11
mordredmtreinish: in https://review.openstack.org/#/c/508526/7/zuul.d/zuul-legacy-jobs.yaml legacy-infra-puppet-apply-3 does not have bugdaystats17:11
dmsimardSamYaple: added a comment in https://review.openstack.org/#/c/508552/17:11
mordredmtreinish: so I think legacy-infra-puppet-apply-3 needs legacy-infra-puppet-apply-base in its base17:11
mordredmtreinish: you'll need to move it after the legacy-infra-puppet-apply-base definition of course17:12
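In other words, the shape of the fix is roughly this (a sketch, not the exact patch): give the job the base job as its parent so their required-projects lists are merged, then add the missing repo:

    - job:
        name: legacy-infra-puppet-apply-3
        parent: legacy-infra-puppet-apply-base
        required-projects:
          # merged with whatever the base job already lists
          - openstack-infra/puppet-bugdaystats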
*** lukebrowning has joined #openstack-infra17:12
SamYapleammaaazzinnng. comments from old patchsets now post17:12
dmsimardSamYaple: you need to wait for the project-config patch to land before you can land the o-z-j repo17:12
mtreinishmordred: oh ffs, I messed up that patch again17:12
dmsimardSamYaple: also, you can already start adding jobs to loci, don't need to wait to remove legacy17:12
SamYapledmsimard: got it. i thought jeblair was saying i had to get the o-z-j patch in first17:13
SamYapledmsimard: the legacy job is busted and i cant merge anything. since im rewriting i just want to remove it rather than fix it17:13
openstackgerritMatthew Treinish proposed openstack-infra/openstack-zuul-jobs master: Add missing projects for infra puppet jobs  https://review.openstack.org/50852617:13
dmsimardSamYaple: the project-config job definition uses content from o-z-j17:13
dmsimardso it needs to go in first iiuc17:13
SamYaplemakes sense17:13
mordredSamYaple: mnaser had an idea the other day that might apply well to you here (although it'll be another couple of steps to do it)17:14
openstackgerritSam Yaple proposed openstack-infra/project-config master: Remove loci-jobs from project-config  https://review.openstack.org/50855217:14
SamYaplemordred: yea i reviewed the patches, thought just purging the job was the easier route17:14
*** SumitNaiksatam has quit IRC17:15
jd_dmanchad: inc0: is there a new source for that chart? :)17:15
SamYaplebecause this is a low-entropy, low-priority project that's not stable, i can be fast and loose with gating right now :)17:15
mordredSamYaple: which is that you could move the definition of the project-template loci-jobs to one of your loci repos - then revert https://review.openstack.org/#/c/508552 - and then what jobs are in loci-jobs is under your control - but is defined in one place17:15
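Sketched out, that suggestion would look something like the following in a .zuul.yaml inside one of the loci repos (the noop entries reflect the current noop-only gating and stand in for whatever real jobs get added later):

    - project-template:
        name: loci-jobs
        check:
          jobs:
            - noop
        gate:
          jobs:
            - noop

    - project:
        name: openstack/loci
        templates:
          - loci-jobs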
openstackgerritMerged openstack-infra/project-config master: Set rackspace launch timeout to 10m  https://review.openstack.org/50837817:15
mordredSamYaple: cool17:15
*** jpena is now known as jpena|off17:16
inc0sooo....I'll wait till SamYaple's work merges before I do the same for Kolla;)17:17
SamYapleinc0: you probably dont want to follow my example17:17
mordredinc0, SamYaple: :)17:17
*** lukebrowning has quit IRC17:17
SamYapleim nooping my gates and then redoing all of it from the ground up in loci repo17:17
mordredinc0: mnaser did the dance yesterday - might be a good place to cargo-cult from17:17
inc0nah, it's not like you're stuffing yourself with strong edibles;)17:17
*** ykarel has joined #openstack-infra17:17
SamYapleyea inc0, you want what mnaser did17:17
SamYapleinc0: are you calling me fat?17:18
inc0big boned17:18
dmsimardmnaser's patch for puppet-openstack zuul v3 things is here: https://review.openstack.org/#/c/508296/17:18
*** lukebrowning has joined #openstack-infra17:18
inc0but I was referring to a different kind of edibles17:18
openstackgerritSam Yaple proposed openstack-infra/openstack-zuul-jobs master: Remove legacy loci jobs  https://review.openstack.org/50855617:18
SamYapleinc0: i know ;)17:18
mordredinc0: also, I landed a patch for infra-manual on this: https://review.openstack.org/#/c/508295/ - infra-manual publishing is in flux atm17:19
*** mugsie has joined #openstack-infra17:19
inc0cool, I'll read through that, thanks17:19
inc0also, did you get secrets sorted out already?17:19
mordredinc0: oh yah- secrets totally work and we're using the heck out of them17:20
*** yamamoto has joined #openstack-infra17:20
inc0so I'll need help with that and registry deployment17:20
*** ekcs has joined #openstack-infra17:21
mordredinc0: https://docs.openstack.org/infra/zuul/feature/zuulv3/user/config.html#secret is the section of the zuul docs about them17:21
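A minimal sketch of what that doc section describes, with hypothetical names throughout (the ciphertext placeholder would come from zuul's tools/encrypt_secret.py, and the playbook path is made up for illustration):

    - secret:
        name: loci-dockerhub
        data:
          username: example-user
          password: !encrypted/pkcs1-oaep
            - <output of tools/encrypt_secret.py>

    - job:
        name: loci-upload-images
        post-run: playbooks/upload-images.yaml
        secrets:
          - loci-dockerhub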
inc0so....let me know mordred when I'll be able to borrow your brain for few minutes to discuss that:)17:21
mordredinc0: and yes - as soon as the current issues have settled down my brain is yours17:21
inc0haha, it's not like issues ever settle;) anyway, I'll keep pinging17:22
SamYapledont be greedy!17:22
SamYaplecat mordred_brain > paste.openstack.org17:22
SamYaplesimple17:22
inc0I don't think disks on paste.o.o can handle this amount of stuff17:23
*** lukebrowning has quit IRC17:23
SamYaplei think there is like a 200 line limit. that's plenty17:24
mordredinc0: what - you're saying we can't write None to paste.o.o ;)17:24
SamYaplehaha same wavelength there17:24
*** lukebrowning has joined #openstack-infra17:24
inc0anyway, I'll leave you guys to zuulv3 and I'll start bugging you early next week17:25
inc0thank you!17:25
*** yamamoto has quit IRC17:25
SpamapSinc0: I'm curious what you mean by registry deployment.17:26
honzaWe seem to be having issues with the new legacy-tripleo-ci-* jobs.  It's as if the job isn't even run (no console.html)17:26
mordredneat! I can chromecast a browser tab to my TV17:26
mordredSpamapS: docker docker docker17:26
SpamapSoh cool, like, as a post job to upload to docker?17:26
honzae.g. http://logs.openstack.org/36/508536/1/check/legacy-tripleo-ci-centos-7-undercloud-oooq/dd23de8/17:26
inc0SpamapS: a bit more to that17:26
inc0a semi-official registry run in infra where our images will be published17:27
mordredSpamapS: well - that too - first step is just getting a local registry run in infra that docker build jobs can push stuff to so that we're not copying tarball exports around17:27
honzaIt fails to set up the workspace, I guess17:27
honzarsync: change_dir "/home/zuul/src/*/openstack/ceilometer" failed: No such file or directory (2)17:27
honzaHow can I debug this?17:27
mordredhonza: looking17:27
inc0so kolla deploy gates will have some place to pull stable images17:27
fricklerhonza: job-output.txt.gz is the new console.html17:27
cloudnull^ seeing something similar http://logs.openstack.org/03/508503/2/check/legacy-openstack-ansible-openstack-ansible-aio/cc4d0d6/job-output.txt.gz#_2017-09-29_17_17_14_073816 I think17:27
honzaah!17:28
inc0then, afterwards, we'll pull images and push to dockerhub on a daily basis17:28
mordredhonza: 2017-09-29 15:17:28.848212 | centos-7 | cat: /etc/nodepool/primary_node_private: No such file or directory17:28
mordredhttp://logs.openstack.org/36/508536/1/check/legacy-tripleo-ci-centos-7-undercloud-oooq/dd23de8/job-output.txt.gz#_2017-09-29_15_17_28_84821217:28
*** tosky has quit IRC17:29
mordredwe're not writing out /etc/nodepool/primary_node_private on single-node jobs17:29
*** rbrndt has quit IRC17:29
*** lukebrowning has quit IRC17:29
openstackgerritMerged openstack-infra/project-config master: Disable merge-check pipeline  https://review.openstack.org/50837117:29
mordred\o/ that'll help17:29
mordredhonza: is the info in there something you need on single-node jobs too?17:30
honzamordred: to be honest, i don't even know what that is for17:30
*** lukebrowning has joined #openstack-infra17:31
mordredhonza: well on multi-node jobs /etc/nodepool/primary_node_private is how you find the address you want to use for intra-cloud traffic to talk to the 'primary' node17:31
mordredhonza: (thus why it's not generally relevant for single-node jobs) ... one sec and I'll take a peek at that job itself17:32
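As a rough illustration (a sketch, not the actual legacy playbook; the 'primary' group name and the nodepool hostvars are assumptions based on the standard Zuul inventory), a play along these lines could write that file on multinode jobs; single-node jobs don't run an equivalent step, which is why it's missing there:

    - hosts: all
      become: yes
      tasks:
        - name: Record the primary node's private IP for legacy scripts
          copy:
            content: "{{ hostvars[groups['primary'][0]].nodepool.private_ipv4 }}"
            dest: /etc/nodepool/primary_node_private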
mordredcloudnull: your issue is different- you are missing openstack/ansible-hardening from your required-projects list17:32
cloudnullah !17:32
jeblairclarkb, mordred: i've reproduced the trusty issue in local (enormous) test.  it's definitely looking like a zuul bug.17:33
*** gouthamr has joined #openstack-infra17:34
honzamordred: are you sure that's the cause of the error?  the run continues with a SUCCESS after that, and then it seems to actually fail on the /home/zuul/workspace/devstack-gate/functions.sh: line 180: declare: gate_hook: not found line17:34
honzamordred: or, at least that's the error closest to the FAILURE line17:35
honzaFAILED*17:35
*** lukebrowning has quit IRC17:35
*** florianf has quit IRC17:36
*** lukebrowning has joined #openstack-infra17:37
*** edmondsw has joined #openstack-infra17:37
*** michaelxin has quit IRC17:39
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add helpful error message about required-projects  https://review.openstack.org/50857617:40
*** esberglu has quit IRC17:40
mordredjeblair: excellent news!17:40
*** esberglu has joined #openstack-infra17:40
mordredhonza: I'm not 100% certain - I shall now look at your job content17:40
*** dhellmann has quit IRC17:41
honzamordred: thanks!17:41
*** esberglu has quit IRC17:41
mordredcloudnull, clarkb, fungi: ^^ https://review.openstack.org/508576 - helpful message about required-projects17:41
*** jascott1 has joined #openstack-infra17:41
*** esberglu has joined #openstack-infra17:41
*** dave-mccowan has quit IRC17:41
*** lukebrowning has quit IRC17:41
*** esberglu has quit IRC17:42
*** esberglu has joined #openstack-infra17:42
*** esberglu has quit IRC17:42
*** michaelxin has joined #openstack-infra17:42
*** esberglu has joined #openstack-infra17:42
openstackgerritJulie Pichon proposed openstack-infra/project-config master: Adjust branches for OSC jobs  https://review.openstack.org/50350017:43
*** ihrachys has quit IRC17:43
*** SumitNaiksatam has joined #openstack-infra17:43
*** ihrachys has joined #openstack-infra17:43
cloudnullthanks mordred17:44
*** thorst has quit IRC17:45
mordredhonza: aha! SOOOO17:45
honza:)17:46
*** dhellmann has joined #openstack-infra17:46
*** lukebrowning has joined #openstack-infra17:47
*** esberglu has quit IRC17:47
mordredhonza: you are referencing a file: /opt/stack/new/tripleo-ci/toci_gate_test.sh in your gate_hook17:47
mordredhonza: that is from a repo that isn't in required-projects OR PROJECTS17:49
*** trown|lunch is now known as trown17:49
mordrednow - that may not be the actual issue - still looking ...17:50
SamYaplecan i get an +2+W on https://review.openstack.org/#/c/508552/ when someone gets a chance to unblock my gates. please and thank you :)17:50
*** tosky has joined #openstack-infra17:51
*** lukebrowning has quit IRC17:52
*** ykarel has quit IRC17:52
openstackgerritAlex Kavanagh proposed openstack-infra/project-config master: Change the docs job to a deploy-publish-job  https://review.openstack.org/50829817:53
clarkbmordred: 508563 lgtm now17:53
*** lukebrowning has joined #openstack-infra17:53
*** rtjure has joined #openstack-infra17:53
honzamordred: https://github.com/openstack-infra/devstack-gate/blob/master/devstack-vm-gate-wrap.sh#L9317:54
mordredhonza: can you add Depends-On: I9cdc182ac5800e1566c04e6f21e454956d82ad33 to that patch? (there are several things in the stack ending at https://review.openstack.org/#/c/508548 that will, I think help that job - and it'll be good to see how it does once we've applied those fixes)17:54
openstackgerritMichael Johnson proposed openstack-infra/openstack-zuul-jobs master: Add missing horizon project for octavia-dashboard  https://review.openstack.org/50857917:54
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Create legacy nodesets and switch all legacy jobs  https://review.openstack.org/50851017:54
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Update legacy post playbooks to pull from primary  https://review.openstack.org/50851117:54
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Add missing required-projects for TripleO jobs using devstack-gate  https://review.openstack.org/50854817:54
*** ihrachys has quit IRC17:54
*** lnxnut_ has left #openstack-infra17:55
*** ihrachys has joined #openstack-infra17:55
mordredhonza: nevermind what I said above - you can try re-checking now that that^^ has landed17:55
honzamordred: excellent!17:55
*** david-lyle has quit IRC17:56
*** david-lyle has joined #openstack-infra17:56
mordredclarkb: the job for 436455 seems hung17:57
honzamordred: thanks for the quick help, much appreciated17:57
mordredhonza: sure thing!17:57
*** jpich has quit IRC17:58
*** lukebrowning has quit IRC17:58
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet >newton jobs  https://review.openstack.org/50852017:58
openstackgerritMatt Riedemann proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-tempest-dsvm-nnet  https://review.openstack.org/50851917:58
clarkbmordred: weird, maybe its that ssh timeout thing?17:58
clarkbmordred: so we are waiting for ssh to fail?17:58
*** lukebrowning has joined #openstack-infra17:59
mordredI dunno - I actually don't see any mention on the executors that it's doing anything18:02
*** camunoz has joined #openstack-infra18:03
clarkbmordred: do you want to address fungi's comment at https://review.openstack.org/#/c/508576/1/roles/fetch-zuul-cloner/templates/zuul-cloner-shim.py.j2 ?18:03
mordredclarkb: I do!18:03
*** lukebrowning has quit IRC18:04
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add helpful error message about required-projects  https://review.openstack.org/50857618:04
fungiit was purely cosmetic, but i expect this will be showing up in a bunch of job logs so better to be crystal clear18:04
mordredyah18:05
fungithough now you have whitespace characters on otherwise empty lines, it looks like18:05
*** lukebrowning has joined #openstack-infra18:06
fungiyou shame the whitespace gods with your brazen blasphemy18:06
mordredfungi: shall I fix the blasphemy?18:10
*** lukebrowning has quit IRC18:10
*** robled has quit IRC18:10
mordredI'll need to for the pep8 gods won't I?18:10
clarkbprobably18:10
openstackgerritMonty Taylor proposed openstack-infra/zuul-jobs master: Add helpful error message about required-projects  https://review.openstack.org/50857618:10
mordredclarkb: ok - so - here's what I've got18:10
fungimordred: if pep7 is checking that script, then yeah18:11
fungier, pep818:11
mordredclarkb, fungi: I did "grep 436455,2 debug.log | grep Execute" on the scheduler18:11
mordredwhich gave me:18:11
mordred2017-09-29 17:29:54,493 INFO zuul.ExecutorClient: Execute job build-openstack-sphinx-docs (uuid: 752c4e6329c84ad98fd34837f56611d9) on nodes <NodeSet OrderedDict([('ubuntu-xenial', <Node 0000060407 ubuntu-xenial:ubuntu-xenial>)])OrderedDict()> for change <Change 0x7fc64ae694a8 436455,2> with dependent changes []18:11
mordredthe uuid 752c4e6329c84ad98fd34837f56611d9 is important18:11
*** lukebrowning has joined #openstack-infra18:12
mordredI then did: ansible 'ze0*' -m shell -a 'grep 752c4e6329c84ad98fd34837f56611d9 /var/log/zuul/executor-debug.log'18:12
mordredand found that job running on ze0318:12
clarkbAJaeger_: are you still around? there is a linting problem with your publish-docs fix18:12
clarkbAJaeger_: I'll push the fix if you are already afk for the day18:12
*** robled has joined #openstack-infra18:12
*** robled has quit IRC18:12
*** robled has joined #openstack-infra18:12
mordredon ze03, it has done all the cloning18:12
mordredand created the workdir and everything18:13
mordredbut does not seem to be running ansible18:13
mordredin fact, ze03 is not running ANY ansible18:13
fungiclarkb: it sounded like AJaeger_ was headed into a travel blackhole for a while18:14
fungimordred: ze03 is having a bad problem and will not go to space today18:15
fungi?18:15
clarkbok /me pushes fix18:15
*** boris_42_ has joined #openstack-infra18:15
boris_42_Hi there18:16
openstackgerritClark Boylan proposed openstack-infra/openstack-zuul-jobs master: Fix some publishing jobs  https://review.openstack.org/50856218:16
*** lukebrowning has quit IRC18:16
clarkbmordred: ^ that should fix an error with docs publishing that AJaeger_ ran into18:16
* clarkb stops bothering mordred so that ze03 can be debugged18:16
boris_42_Is there way I can help with fixing Rally jobs that are failing after upgrading to zull v318:16
mnaserif things arent on fire - https://review.openstack.org/#/c/508333/18:16
mnaserjust need a +A to remove project templates which aren't prefixed with legacy- so we can move in-repo18:17
fungiboris_42_: is you have a link to one of the failing jobs, we can probably tell you whether one of the changes in flight is expected to fix that or maybe help you pinpoint what needs adjusting where18:18
*** lukebrowning has joined #openstack-infra18:18
mordredclarkb: it's looking like the executor process is stuck in a read() call18:18
clarkbmordred: you should be able to use lsof to figure out what the fd is18:19
openstackgerritTim Burke proposed openstack-infra/project-config master: legacy-swift-dsvm-functional should be voting  https://review.openstack.org/50858518:21
clarkbwe've also got ssh agents that are 10 hours old according to ps18:22
clarkbon ze0318:22
*** yamamoto has joined #openstack-infra18:22
*** lukebrowning has quit IRC18:22
clarkbI think we might be leaking those ssh-agents18:24
*** lukebrowning has joined #openstack-infra18:24
*** yamamoto has quit IRC18:26
*** lukebrowning has quit IRC18:29
*** rossella_s has joined #openstack-infra18:29
*** rbrndt has joined #openstack-infra18:30
mordredclarkb: ok. I'm coming up stumped18:30
fungithe number of ssh-agent processes on ze06 and 07 is similarly large, the rest are around 100-ish18:30
*** lukebrowning has joined #openstack-infra18:30
*** jdandrea_ has quit IRC18:31
fungiand yeah, ssh-agent processes on ze07 also date back to ~10 hours ago18:31
jeblairmordred, clarkb: try a thread dump?18:31
fungino ansible processes there either18:31
fungisame for 0618:32
fungiso 03, 06 and 07 all seem to be in the same boat18:33
fungistale ssh-agent processes as old as 10 hours, no ansible processes18:33
mordredjeblair, clarkb: remind me how to doa thread dump?18:35
*** lukebrowning has quit IRC18:35
jeblairsigusr218:35
*** lukebrowning has joined #openstack-infra18:37
fungiand then it'll appear in the debug log18:37
openstackgerritKendall Nelson proposed openstack-infra/storyboard master: Add Test Migration Directions  https://review.openstack.org/50250918:38
boris_42_@fungi so there are 2 patches in the gate pipeline, 50727618:38
boris_42_fungi: they have already failed jobs18:38
jeblairi'm going to do the stack dump on ze0318:38
fungiboris_42_: thanks, taking a peek now18:39
*** rossella_s has quit IRC18:39
mordredfungi: ok. I have done poorly at this on 03 and 06 - would you mind doing it for me on 07? I'm having a brain-sad at the moment and don't want to lose things on all three nodes18:39
mordredjeblair: do 0718:39
boris_42_fungi: thank you18:39
mordredjeblair: I borked 3 and 618:39
mordredjeblair: because somehow I can't do basic unix today18:39
jeblairokay i'll do 718:39
jeblairbefore i do that18:40
*** nikhil has joined #openstack-infra18:40
jeblairi notice that both of those hosts are running18:40
jeblairzuul     28614  0.0  0.4 192192 32908 ?        S    08:37   0:00 git-remote-https origin https://git.openstack.org/openstack/glance-specs18:40
jeblairseems plausible that's the read they're stuck on18:40
mordredyah. I agree18:40
*** rossella_s has joined #openstack-infra18:40
jeblairuntil we figure out how to time that out, can probably just kill that18:40
mordredjeblair: k. want me to do that then clean up after myself on 03 and 06 while you look at 7?18:41
jeblairalso, once again, there are 3 zuul-executor processes running on 0718:41
jeblairthere should only be 218:41
mordredexcellent18:41
*** lukebrowning has quit IRC18:41
jeblairit would be nice to know what happened at 03:19 to cause that18:42
mriedemandreaf: jeblair: https://review.openstack.org/#/c/508519/ and below are rebased and passed CI18:42
mriedemthose are blocking nova so would be sweet if we could get them in18:43
mriedemAJaeger_: ^18:43
mriedemsorry andreaf18:43
fungiwe were still merging job configuration changes up to/around 03:1918:43
mordredmriedem: +2 from me18:44
fungibut no restarts that i'm aware of18:44
mriedemthanks18:44
jeblairinteresting... it's a child of the main proc18:44
jeblairor rather, a child of the child18:44
jeblairi wonder if we have a fork sneaking in we don't know about18:45
mordredjeblair: you still want 03 and 06 or you want me to go ahead and clean up there?18:45
jeblairmordred: go for it18:45
*** thorst has joined #openstack-infra18:45
jeblairyeah, it's the git command; the merger thread is holding the lock on the git repos while running that18:47
*** lukebrowning has joined #openstack-infra18:47
jeblair07 seems unstuck now18:48
jeblairafter i killed the leaf git process18:48
mordred03 is restarted and running properly now18:48
openstackgerritTim Burke proposed openstack-infra/project-config master: Make legacy-swift-tox-xfs-tmp-func-ec voting  https://review.openstack.org/50858918:49
Shrewsjeblair: one child should be the LogStreamer (which then forks children for finger requests)18:49
fungifirefox on my workstation can no longer handle the zuulv3 status page. keeps complaining about jquery running too long18:49
*** thorst has quit IRC18:50
Shrewsfungi: fine under chrome. odd18:50
jeblairShrews: yeah, the child i'm expecting is the logstreamer; the one i'm not expecting is one of its children.  i don't know why there often seems to be exactly one of those after some random time period.18:50
jeblairfungi: yeah, the new status page is way less efficient than the old.18:51
*** hasharAway has quit IRC18:51
*** hashar has joined #openstack-infra18:52
*** lukebrowning has quit IRC18:52
mordredjeblair: ok- 06 is restarted - but doesn't seem to be taking on jobs18:52
SamYaplefungi: working fine for me. what version of FF?18:52
jeblairmordred: give it a minute?  zuul is busy reloading its config18:53
mordredthere it goes18:53
*** lukebrowning has joined #openstack-infra18:54
fungiSamYaple: 52.3.0 but i expect it's gotten crufty. i haven't cleared my preferences in years and should probably start fresh at some point18:54
*** jascott1 has quit IRC18:55
jeblairclarkb, mordred: regarding the trusty bug -- i can say it's related to the use of project templates, and there's no reason it should be limited to nodesets.  it's looking similar to what clarkb was surmising -- when project templates are used together on a project, they seem to somehow be modified and combined so that they apply to later projects.  it's complicated and i still haven't discovered the mechanism, but thought the ...18:55
jeblair... additional potential error symptoms may be useful.18:55
SamYaplefungi: ah. i tested on 54 and 55 (55 is my main) and it appears to be working fine18:56
Shrewsjeblair: yuck18:56
*** dizquierdo has quit IRC18:57
fungiboris_42_: looks like you probably need to add openstack/dib-utils (and probably others?) to this job: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-keystone-v2api-rally/044bb96/logs/devstack-gate-setup-workspace-new.txt18:57
jeblairi need to grab some lunch; i'll pick this up again afterwards.18:58
fungiboris_42_: in its required-projects list18:58
*** lukebrowning has quit IRC18:58
fungiboris_42_: and this one probably needs openstack/rally in required-projects? hard to tell since the job isn't collecting the setup-workspace log: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-cli/d2adbe9/job-output.txt.gz#_2017-09-29_17_55_21_87372718:59
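(Illustrative only: a job that fails because a repository isn't checked out on the node usually just needs that repo listed under required-projects in its definition in openstack-zuul-jobs. The parent name and exact layout below are assumptions based on the discussion above, not the actual patch; a minimal sketch might look like:)

  - job:
      name: legacy-rally-dsvm-keystone-v2api-rally
      parent: legacy-dsvm-base
      required-projects:
        - openstack/dib-utils
        - openstack/rally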
kfox1111all the kolla-kubernetes jobs are broken at the moment. :/19:00
kfox1111any idea what this means? http://logs.openstack.org/65/508565/1/check/legacy-kolla-kubernetes-deploy-centos-binary-2-external-ovs/70f5b6a/job-output.txt.gz19:00
kfox1111seems like it breaks before the job even starts.19:00
*** lukebrowning has joined #openstack-infra19:00
fungiboris_42_: and this one looks like maybe we translated some of the shell script fragment incorrectly? /bin/sh is having trouble parsing it: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-verify-light-discover-resources/f455652/job-output.txt.gz#_2017-09-29_17_55_14_38107819:01
*** lbragstad has quit IRC19:03
fungikfox1111: looks like it's probably missing openstack/requirements in the required-projects list for that job: http://logs.openstack.org/65/508565/1/check/legacy-kolla-kubernetes-deploy-centos-binary-2-external-ovs/70f5b6a/job-output.txt.gz#_2017-09-29_17_18_27_58593619:03
fungimordred has a patch up to make the error condition there a lot more user-friendly19:04
kfox1111I tweak that in project-config?19:04
Shrewskfox1111: see http://lists.openstack.org/pipermail/openstack-dev/2017-September/122880.html and the linked etherpad. i think that covers your exact situation19:04
Shrewsinstructions in that etherpad19:04
*** lukebrowning has quit IRC19:04
kfox1111ok. cool. thanks.19:05
kfox1111oh. there is a single ps that will fix it for everyone?19:06
*** lukebrowning has joined #openstack-infra19:06
mnaserhey folks, is there a known issue with debian-jessie images?19:06
mnaserhttp://logs.openstack.org/33/508333/1/gate/base-integration-debian-jessie/6290c4b/job-output.txt.gz19:07
mnaser"No space left on device"19:07
*** lbragstad has joined #openstack-infra19:07
mnaser(this ran on our cloud, could it be possible resize2fs or whatever didnt do its thing?)19:08
Shrewskfox1111: no, no single fix19:08
kfox1111oh. ok. thanks.19:09
jeblairclarkb, mordred, Shrews: oh!  i found the trusty issue.  the problem and fix are both as subtle as you might expect.  i need to write some tests and clean some stuff up before pushing up a patch after lunch.19:10
jeblairbut we should be able to expect that to be in production within a couple of hours19:10
mtreinishmordred, jeblair: ok, now it's a new failure mode for the puppet jobs: http://logs.openstack.org/58/508258/3/check/legacy-infra-puppet-apply-3/3b271d2/job-output.txt.gz#_2017-09-29_17_47_58_63310919:11
*** lukebrowning has quit IRC19:11
kfox1111Shrews: https://review.openstack.org/#/c/508460/1/zuul.d/zuul-legacy-jobs.yaml looks like its tweaking it for all legacy projects?19:12
*** esberglu has joined #openstack-infra19:12
*** esberglu has quit IRC19:12
*** lukebrowning has joined #openstack-infra19:12
*** esberglu has joined #openstack-infra19:13
Shrewskfox1111: that's fixing the legacy-requirements job. the job you mentioned above (legacy-kolla-kubernetes-deploy-centos-binary-2-external-ovs) does not use that as a parent.19:15
Shrewskfox1111: that job is defined here: https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/zuul.d/zuul-legacy-jobs.yaml#n408119:16
*** dhajare has joined #openstack-infra19:16
*** lukebrowning has quit IRC19:17
*** esberglu has quit IRC19:17
*** lukebrowning has joined #openstack-infra19:19
kfox1111ah. ok. so I just need to add the same thing to all of our jobs.19:19
Shrewskfox1111: so you would add the required-projects there. Or if you need it on all of the kolla jobs, do something like what https://review.openstack.org/#/c/508281 does and create a new base job for them, and add it there19:19
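(A rough sketch of the shared-parent approach Shrews is pointing at, modeled on what 508281 does for openstack-ansible; all names here are hypothetical and the real kolla-kubernetes change may differ:)

  - job:
      name: legacy-kolla-kubernetes-base
      parent: legacy-dsvm-base
      required-projects:
        - openstack/requirements

  - job:
      name: legacy-kolla-kubernetes-deploy-centos-binary-2-external-ovs
      parent: legacy-kolla-kubernetes-base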
kfox1111thanks.19:19
*** jascott1 has joined #openstack-infra19:19
fungimnaser: our images should be using the growroot element: http://git.openstack.org/cgit/openstack-infra/project-config/tree/nodepool/nodepool.yaml#n105119:20
fungimnaser: http://git.openstack.org/cgit/openstack/diskimage-builder/tree/diskimage_builder/elements/growroot19:20
smcginnisfungi: Is there a restart still forthcoming?19:20
clarkbjeblair: awesome. I'm taking advantage of lunch time to also cook dinner19:20
mnaserfungi interesting, not sure why it did that... :x19:20
fungismcginnis: i expect so once jeblair has a patch for the inheritance issue19:21
fungismcginnis: probably in a couple hours19:21
smcginnisfungi: ack, thanks19:21
*** ijw has quit IRC19:21
fungimnaser: it's software, so... bugs?19:21
fungimnaser: we'd need logs from early boot (which may be in the journal if we're collecting it)19:22
*** yamamoto has joined #openstack-infra19:23
openstackgerritMonty Taylor proposed openstack-infra/infra-manual master: Update project creators guide with zuul v3 information  https://review.openstack.org/50859619:23
*** lukebrowning has quit IRC19:23
mordredfungi, clarkb:^^ I did a followup to AJaeger_'s patch there19:23
*** lukebrowning has joined #openstack-infra19:25
openstackgerritKevin Fox proposed openstack-infra/openstack-zuul-jobs master: Fix Kolla-Kubernetes missing deps.  https://review.openstack.org/50859719:28
*** yamamoto has quit IRC19:28
kfox1111Shrews: how does that look?19:28
Shrewskfox1111: looks about right  :)  we'll see what zuul says19:29
mnaseris there a big queue or an issue? (taking almost 40-50 minutes for jobs to start?)19:30
mordredkfox1111: lgtm19:30
*** lukebrowning has quit IRC19:30
mordredShrews: do you know if we're stuck in happy fun land again?19:30
*** e0ne has joined #openstack-infra19:30
kfox1111ok. cool. thanks. :)19:31
Shrewsmordred: i do not know. i can check np again19:31
*** lukebrowning has joined #openstack-infra19:31
funginodepool hasn't run out of space on / yet at least19:32
Shrewsmordred: fungi: i see nodepool processing requests, but there are A LOT of requests19:33
Shrewsthat list seems to be slowly declining19:33
mordredShrews, fungi, clarkb, jeblair: there are also a few items at the top of the queues that each seem to have one job that is somewhat stuck or lost19:34
inc0hey, https://review.openstack.org/#/c/508544/ <- patch from 4hrs ago and zuul still queues it :(19:34
*** thorst has joined #openstack-infra19:35
fungiinc0: after the zuul restart it made it into the check pipeline a little over an hour ago19:35
fungiinc0: but yes, we're seeing a significant delay in node assignments to jobs too19:35
*** lukebrowning has quit IRC19:36
inc0no worries, just informing you:)19:36
mordredjeblair: we have 5 executor processes running on ze0819:36
inc0let me know if there is anything I can do to help19:36
*** lukebrowning has joined #openstack-infra19:37
Shrewsmordred: if i knew how to map review # to node requests #, i'd be able to look into if they're waiting for nodepool or not. but i do not know how to do that19:38
Shrewsthat might actually be a good enhancement to the zk data model19:38
*** jcoufal has joined #openstack-infra19:39
mordredShrews: for now - if you grep for the change in the scheduler debug log and then for NodeRequest ...19:39
mordredShrews: like: "grep 508505,4 /var/log/zuul/debug.log | grep NodeRequest"19:39
mordredShrews: which will return things like:19:40
mordred2017-09-29 19:36:11,316 INFO zuul.IndependentPipelineManager: Completed node request <NodeRequest 100-0000065636 <NodeSet devstack-single-node OrderedDict([('primary', <Node 0000063610 primary:ubuntu-xenial>)])OrderedDict()>> for job legacy-tempest-dsvm-neutron-full of item <QueueItem 0x7fc5583f1240 for <Change 0x7fc71cbb4f98 508505,4> in check> with nodes <NodeSet devstack-single-node19:40
mordredOrderedDict([('primary', <Node 0000063610 primary:ubuntu-xenial>)])OrderedDict()>19:40
*** jcoufal_ has quit IRC19:41
Shrewsah, handy19:41
*** jtomasek has quit IRC19:42
mordredShrews, jeblair: also - we have four zuul-executor processes on ze0519:42
*** lukebrowning has quit IRC19:42
*** camunoz has quit IRC19:43
*** lukebrowning has joined #openstack-infra19:43
mordredon both ze05 and ze08 - the extra processes are subprocesses of the one child process we expect - and in both cases they are looped reading the console-log file19:44
mordredand the jobs for which they are stuck reading the console-log file are the jobs I'm seeing at the top of the queues that are hung19:44
*** mat128 has quit IRC19:46
Shrewsmordred: do those console logs still exist?19:46
mordredShrews: yes19:47
mordredOK ...19:48
Shrewsmordred: seems to imply zuul hasn't completed doing "something" for those jobs19:48
mordredso there are two things running at the same time19:48
Shrewswhich is the edge of my knowledge19:48
openstackgerritKevin Carter (cloudnull) proposed openstack-infra/openstack-zuul-jobs master: Add openstack-ansible required-projects parent job  https://review.openstack.org/50828119:48
mordredlike - there are two identical ansible-playbook processes both running the same pre-playbook19:49
*** rossella_s has quit IRC19:49
mordredShrews, jeblair, SpamapS: http://paste.openstack.org/show/622341/19:50
*** rossella_s has joined #openstack-infra19:51
mordredI know that the multiple of anecdote isn't data - but I believe we've seen a few times now cases where things that seem hung wind up having an extra executor19:51
Shrewsoh, they really ARE identical19:52
mordredyah- and this is a pattern that we've seen before but have never been able to understand19:52
Shrewsi can't even begin to speculate as to a cause for such a thing19:53
kfox1111thanks for the help. :)19:53
Shrewskfox1111: no problem! sorry for the hassles19:53
clarkbmordred: Shrews possible two nodes got the job?19:54
kfox1111hmmm. still similar problem: http://logs.openstack.org/43/471843/3/check/legacy-kolla-kubernetes-deploy-centos-binary-2-ceph/df4b6e5/job-output.txt.gz#_2017-09-29_18_46_52_64446019:55
kfox1111Shrews: no worries. zuul3 was a huge change. thanks for working on it. :)19:56
*** vhosakot has quit IRC19:57
Shrewskfox1111: that change either needs to wait for your fix to merge, or else it needs to depend on your fix19:57
kfox1111oh. I thought it did merge.... guess I didn't verify it actually made it through though...19:58
kfox1111sorry. probably jumped the gun.19:58
Shrewsnode request list has dropped from ~1300 to ~90020:01
kfox1111ah. yup. its still in the queue.20:01
fungiShrews: any guess at the spike? still just fallout from the reenqueue after zuul was restarted or something else you think?20:02
Shrewsfungi: no. there are just too many variables right now20:02
fungii figured20:02
openstackgerritAndreas Jaeger proposed openstack-infra/openstack-zuul-jobs master: Remove legacy-requirements-python34  https://review.openstack.org/50859820:02
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove legacy-requirements-python34 job  https://review.openstack.org/50848920:03
jeblairShrews, fungi: i just recalled that node request priority is still a TODO item in zuul20:03
jeblairso gate is going to be starved by check20:03
mordredjeblair: welcome back!20:03
mordredjeblair: dunno if you saw in scrollback - but we've got multi-child-processes again on executors20:04
jeblairmordred: thx20:04
jeblairmordred: i was just looking at that, but i've never seen that before20:04
jeblairmordred: i was only previously concerned with multiple zuul-executor processes20:04
jeblairmordred: multiple bwrap processes for the same playbook is new behavior to me20:04
mordredjeblair: yah. oh - well - I mean, we have multiple zuul-executor processes20:04
mordredjeblair: and the bwrap processes each have a different parent which is one of the z-e processes20:05
mordredjeblair: so - god only knows20:05
mordredit subverts my understanding of, well, unix20:05
jeblairmordred: http://paste.openstack.org/show/622342/20:06
mordredoh - I was reading the number wrong20:06
jeblairya, that looks pretty normal, right?20:06
*** esberglu has joined #openstack-infra20:07
Shrewswhat generated that output?20:07
jeblairmordred, Shrews: now with full lines: http://paste.openstack.org/show/622343/20:07
jeblairShrews: pstree -p -l20:07
mordredjeblair: oh. I maybe wasn't looking at the right thing first20:08
fungiyou can get a sort of similar rendering with `ps afuxww`20:08
fungidoes a parentage tree20:08
fungithe f option does i mean20:08
jeblairfungi: ya, though that is hard to read with bubblewrap cmdlines which tend to be several kB20:09
fungioh, heh indeed line wrapping makes that painful20:09
jeblairi exaggerate.  only like 2kB.20:09
mordredjeblair, fungi: load avg is quite high - and we're swapping- at least on 08 and 0120:09
jeblairmordred: yeah, i think we need more executors.  to be fair, we needed more for zuulv2 as well.20:10
*** xyang1 has quit IRC20:11
jeblairwe also need the executors to have an internal load average limit20:11
mordredyah20:11
mordredhttp://paste.openstack.org/show/622346/20:11
jeblairso they stop accepting new jobs after a certain load average20:11
mordredthat's where we're at right now20:11
*** esberglu has quit IRC20:11
jeblairnicely distributed!20:12
mordredIKR?20:12
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove some pypy jobs that don't work  https://review.openstack.org/50474820:12
mordredjeblair: oh - are we still split with mergers?20:12
jeblairmordred: yes, we only have 420:12
jeblairothers are still zuulv2, in case of rollback20:12
mordredyah20:12
fungiso all the executors are running around 125% of ram... i guess we want a minimum of 4 more?20:13
openstackgerritIhar Hrachyshka proposed openstack-infra/openstack-zuul-jobs master: Removed confusing comments  https://review.openstack.org/50860120:14
jeblairi just emitted suggestions for zuul patches folks can write in #zuul20:14
jeblairi need to go work on the project template fix now20:15
*** e0ne has quit IRC20:15
mordredjeblair: kk20:15
*** e0ne has joined #openstack-infra20:15
fungithanks!20:15
*** e0ne has quit IRC20:16
*** e0ne has joined #openstack-infra20:16
openstackgerritDirk Mueller proposed openstack-infra/openstack-zuul-jobs master: Drop requirements-python34 job  https://review.openstack.org/50860220:16
*** e0ne has quit IRC20:16
*** Goneri has quit IRC20:17
*** e0ne has joined #openstack-infra20:17
*** e0ne has quit IRC20:17
*** jcoufal has quit IRC20:17
*** e0ne has joined #openstack-infra20:17
*** kjackal_ has quit IRC20:17
*** e0ne has quit IRC20:18
*** e0ne has joined #openstack-infra20:18
openstackgerritAndreas Jaeger proposed openstack-infra/project-config master: Remove some pypy jobs that don't work  https://review.openstack.org/50474820:18
*** e0ne has quit IRC20:19
mordredclarkb, fungi: if you have a sec, https://review.openstack.org/#/c/508568/120:22
mordredoh - piddle - there's a spurious change in that ...20:22
openstackgerritMerged openstack-infra/infra-manual master: Ectomy Jenkins from the Infra Manual narrative  https://review.openstack.org/43645520:22
clarkbI'm just about done with lunch and dinner back to reviewing shortly20:23
mordredfungi, clarkb: let me fix that in a followup - that change is needed to unbreak some people's jobs and with queue length I'd hate to send it through another check cycle - whereas the tox.ini change won't really break anything20:24
*** yamamoto has joined #openstack-infra20:24
openstackgerritMonty Taylor proposed openstack-infra/openstack-zuul-jobs master: Remove spurious change to tox.ini  https://review.openstack.org/50860720:25
openstackgerritDirk Mueller proposed openstack-infra/openstack-zuul-jobs master: Remove rpm-packaging-tox-lint legacy job  https://review.openstack.org/50860920:28
*** yamamoto has quit IRC20:29
openstackgerritDirk Mueller proposed openstack-infra/project-config master: Remove legacy-rpm-packaging-tox-lint  https://review.openstack.org/50861020:30
*** jtomasek has joined #openstack-infra20:31
openstackgerritTrevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name  https://review.openstack.org/50719220:32
openstackgerritTrevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name  https://review.openstack.org/50719220:36
openstackgerritTrevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name  https://review.openstack.org/50719220:41
*** jtomasek has quit IRC20:42
*** rhallisey has quit IRC20:42
*** rlandy has quit IRC20:44
openstackgerritClark Boylan proposed openstack-infra/zuul-jobs master: Make fetch-tox-output more resilient  https://review.openstack.org/50856320:45
clarkbmordred: ^ fixed an issue there (that testing caught \o/)20:45
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: Fix bug with multiple project-templates  https://review.openstack.org/50861220:46
jeblairclarkb, mordred: ^ lists are mutable20:46
jeblairclarkb, mordred: i did a bunch of nice inheritance path enhancements as part of tracking that down; i'm cleaning that up now as a follow-up change20:47
clarkbjeblair: shiny20:47
*** slaweq has joined #openstack-infra20:47
*** trown is now known as trown|outtypewww20:47
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority  https://review.openstack.org/50861320:48
*** jdandrea_ has joined #openstack-infra20:48
*** jdandrea_ has quit IRC20:48
*** nikhil has quit IRC20:49
*** Sukhdev has joined #openstack-infra20:49
openstackgerritTrevor McCasland proposed openstack-infra/subunit2sql master: Add subunit2sql CLI option to use non_subunit_name  https://review.openstack.org/50719220:50
*** bnemec has quit IRC20:51
*** jtomasek has joined #openstack-infra20:51
*** jdandrea_ has joined #openstack-infra20:51
clarkbjeblair: mordred each of thoes zuul fixes lgtm20:52
jeblairclarkb: cool, i'm going to -1 mordreds since he's working on tests, but i agree it looks good20:52
clarkbjeblair: can has quick review on https://review.openstack.org/#/c/508563/ ?20:53
mordredclarkb, jeblair: should we start working on spinning up additional executors? and/or should we start converting v2 mergers to v3 mergers? (are we ready for that or do we wanna keep them still)20:54
*** jdandrea_ has quit IRC20:55
clarkbat this point do we see ourselves rolling back? its still doable since everything is in a separate config fs tree (though in some cases shared repo)20:55
clarkbI feel like we've been pretty committed to rolling forward20:55
clarkbso converting old mergers is probably fine20:55
jeblairi've been so heads down i don't feel like i have a great handle on how big the fire is, so will rely on others to evaluate general likelihood of rollback.  i will say that with the project-templates fix, i don't know of any zuul bugs i would consider blockers.  the closest is the high amount of pain that dynamic configs will cause us for the next several days at least until we can optimize that.20:56
*** kgiusti has left #openstack-infra20:57
clarkbjeblair: right now I think the biggest fires are mostly around job config bugs. Like the missing workspace in publish-docs path AJaeger_ fixed and adding requirements to required repos. Basically things that we'd have a hard time fixing if we rolled back20:57
mordredclarkb, jeblair: I think it seems like, while we have fires, they're all mostly roll-forward-and-fix-the-job types of fires, and mostly things (other than config optimization) that are fairly easy to fix once the problem is spotted20:58
clarkbI need to catch up on the situation with multinode jobs too (but again I think its mostly little corner case edges we are finding that automated conversion alone isn't likely to catch)20:58
mordredclarkb: yah - I think we've got most of the systemic multinode issues and have the pile of edge cases left20:58
clarkbif we want to maybe run v2 and v3 concurrently then I'd entertain rollback otherwise I think we press forward and fix bugs20:58
clarkbbasically the only way we find and fix these is by stubbing our toes on them20:59
*** armax has joined #openstack-infra21:00
mordredand at this point, folks have already gone through an amount of disruption - rolling back and rolling forward again later seems like it's likely to pile on with more disruption than fixing bugs as we find them will21:01
SamYapleplease dont introduce v2 again21:01
AJaeger_Let's move forward...21:02
SamYaple+900121:02
clarkbjeblair: one additional possible zuul bug. legacy-tempest-dsvm-nnet has branches set to stable/newton and yet runs against d-g master changes21:02
clarkbSamYaple: AJaeger_ yup I think that is consensus just wanted to make sure we weighed the options properly.21:02
jeblairclarkb: is it used in project-templates at all?21:02
clarkbjeblair: yes21:02
clarkbintegrated-gate-nova-net is the project-template21:02
clarkbjeblair: is that the same bug as the one you fixed?21:03
jeblairclarkb: let's assume it's fixed by my change until we see otherwise21:03
clarkbkk21:03
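(For context, this is roughly what a branch-restricted job definition looks like; if the matcher were applied correctly, a job like the sketch below should only run for stable/newton changes, not master. The exact definition in openstack-zuul-jobs may differ:)

  - job:
      name: legacy-tempest-dsvm-nnet
      branches: stable/newton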
AJaeger_regarding multi-node, dmsimard has quite a few open changes that need review love: https://review.openstack.org/#/q/status:open++topic:zuulv3-multinode21:03
jeblairclarkb: (let's check though :)21:03
clarkbAJaeger_: thanks, will review21:03
clarkbAJaeger_: looks like these are for doing native zuul multinode (should review but likely less urgent for now until we get happy with v3 as is)21:05
AJaeger_clarkb: I see...21:06
* AJaeger_ waves good night21:06
clarkbI've just approved https://review.openstack.org/#/c/508460/121:07
boris_42_fungi: so can i help somehow?21:08
fungiboris_42_: did you see any of the analysis i posted earlier of the several different kinds of job failures in your jobs?21:08
clarkb^.*requirements-py[2,3].txt$ the comma in the regex there doesn't do what I think it thinks it does :)21:09
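(Clarkb's point, sketched out: inside a character class the comma is a literal, so [2,3] matches the characters 2, comma, or 3 rather than expressing "2 or 3". The intended matcher is presumably closer to the following; the YAML key it sits under in that change is an assumption here:)

  files:
    - ^.*requirements-py[23].txt$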
SamYaplequestion.. is it generally better to have long running jobs, or lots of short jobs? For LOCI I will be building openstack projects and was considering a job per project with each distro instead of a job per distro building all projects21:09
boris_42_fungi: yep I saw, but not sure that completely understand part about requriments21:09
mordredclarkb: it really doesn't. :)21:09
boris_42_fungi: why do we need to have rally in requirments?21:09
SamYaples/each distro/all distros/21:10
mordredSamYaple: the answer is a *VERY* firm 'it depends' :)21:10
*** thorst has quit IRC21:10
SamYaplehaha. well more context, each *project* takes about 3 minutes to build. and this way it would allow me to exclude projects should certain files not get changed (something we may or may not do)21:10
mordredSamYaple: lots of shorter jobs means that resources can be recycled and used by other things on a more granular level - but there is still a recycle cost ... also, at the moment, we have single-sized flavors - so you'll get an 8G node whether you need it or not21:11
SamYapleright, so im trying to figure out if that "recycle cost" is high.21:12
SamYapleif i end up building all openstack projects that means a patchset could have 40 or so 3-5 minute gates21:12
fungiboris_42_: one of them needed to add openstack/dib-utils to the required-projects list for that job, it looked like. the other two seemed to be the same sort of shell parsing error, looking back at them now21:13
SamYaplerather than 3-4 50m gates21:13
fungiboris_42_: http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-verify-light-discover-resources/f455652/job-output.txt.gz#_2017-09-29_17_55_14_381078 and http://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-cli/d2adbe9/job-output.txt.gz#_2017-09-29_17_55_21_87372721:13
mordredSamYaple: yah - but, if you put in exclusions, you could wind up with only 3-4 3-5 minute gates much of the time yeah?21:13
SamYaplethats the desire, yes21:13
SamYaplethough certain patches will still trigger them all21:13
mordredyah- but that's life- you'll need to use that amount of resources on such a patch no matter how you split them21:14
SamYaplefair enough21:14
clarkbre recycle cost its the boot and delete of openstack VMs for the most part21:14
mordredSamYaple: I think right now it's probably not a TON of difference because of quota - however, as we roll out the ability to use less resources for a given job, the split jobs may allow you to take advantage of that more21:14
clarkbwhich depending on cloud, region, and load that varies in cost significantly even over the course of a day21:14
mordredit does21:15
mordredit's also worth noting that on some of our clouds the upper bound on quota is number of available ips21:15
SamYapleis this quota per project or for all of infra?21:15
SamYapleoh good point about the ips21:15
mordredSamYaple: all of infra- we currently have one total quota - but we currently only calculate it in terms of number of servers21:15
SamYaplehmmm. well i think ill start with split projects and only setup 6-7 core projects until we can get the exclusions working the way we want21:16
*** vhosakot has joined #openstack-infra21:16
mordredSamYaple: tobiash has written code to look at nova flavors and quota and whatnot and actually allow calculating what the actual quota and actual usage are - which then allows for a 1 G node to count less towards quota than an 8 G node21:16
SamYaplewe can always roll back to a fat gate job21:16
mordredSamYaple: yah - re-organizing it should be easy to do as you poke at it21:16
SamYaplemuch easier now too :)21:17
*** jamesdenton has joined #openstack-infra21:17
SamYaplei think per project gate would be the easy call if i could get a 1GB instance21:18
boris_42_fungi: so where is the job code located now?21:19
boris_42_fungi: does it make sense to move these jobs to our project as we need to fix them in any case?21:20
fungimordred: looking at one of the shell parsing errors for the rally job failures, is it possible $ZUUL_PROJECT is no longer being set? http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/rally-dsvm-verify-light-discover-resources/run.yaml#n3121:20
fungihttp://logs.openstack.org/76/507276/5/gate/legacy-rally-dsvm-verify-light-discover-resources/f455652/job-output.txt.gz#_2017-09-29_17_55_14_38107821:20
*** edmondsw has quit IRC21:20
*** jpena|off has quit IRC21:21
mordredfungi: it should be set by environment: '{{ zuul | zuul_legacy_vars }}'21:21
*** dprince has quit IRC21:21
mordredfungi: although we somehow mis-parsed the shebang line on that one21:21
fungioh!21:22
fungigood eye21:22
fungithat's a broken shebang21:22
*** amoralej has quit IRC21:22
fungiand i bet in the past we just ignored it21:22
mordredfungi: so we should turn line 29: #/bin/bash -xe - into set -x and set -e - and then add executable: /bin/bash21:22
mordredfungi: I believe you're likely right21:22
mordredfungi: so - executable: /bin/bash as a sibling to the chdir: at the bottom of the task - and then expanding it into set commands instead of a shebang should fix that one right up21:23
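(A rough sketch of the shape mordred describes, based on how the migrated legacy run.yaml tasks are laid out; the script body and chdir value are placeholders, not the real job:)

  - shell:
      cmd: |
        set -e
        set -x
        # ...rest of the original script body, with the broken "#/bin/bash -xe" line removed...
      executable: /bin/bash
      chdir: '{{ ansible_user_dir }}/workspace'
    environment: '{{ zuul | zuul_legacy_vars }}'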
fungimordred: boris_42_: looks like that same broken shebang appears in half a dozen different rally job definitions according to git grep21:24
boris_42_fungi: ya most of Rally jobs are broken21:24
fungiboris_42_: http://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/legacy/rally-dsvm-verify-light-discover-resources/run.yaml#n2921:24
fungiboris_42_: the #/bin/bash should be #!/bin/bash21:24
*** ltomasbo has quit IRC21:24
boris_42_fungi: and what about -xe ?21:25
fungiso the error was copied over from the old jobs, but zuul v2's executor wasn't as picky about that typo and just ignored the line and fed it directly to the shell parser21:25
fungiboris_42_: keep the -xe, i just mean the typo in the original jobs (which we copied over into the new job definitions verbatim) was missing a ! after the #21:25
*** yamamoto has joined #openstack-infra21:25
*** slaweq has quit IRC21:26
boris_42_fungi: okay let me propose patch that fixes the things21:26
boris_42_in all places21:26
openstackgerritJames E. Blair proposed openstack-infra/zuul feature/zuulv3: SourceContext improvements  https://review.openstack.org/50862021:26
jeblairclarkb: ^ new shiny21:26
*** slaweq has joined #openstack-infra21:26
fungiboris_42_: `git grep '#/bin/'` in openstack-infra/openstack-zuul-jobs should show you the ones that are missing !21:26
boris_42_fungi: thanks for help!21:27
clarkbFYI I'm digging into why multinode legacy jobs all seem to think they don't have a second node21:27
clarkbor attempting to at least21:27
fungiboris_42_: i think you must have made the mistake in one and then copied it to the others, because only rally jobs seem to have that mistake21:27
boris_42_fungi: I believe so  =(21:27
clarkbhttp://logs.openstack.org/51/500351/18/check/legacy-grenade-dsvm-neutron-multinode-live-migration/365ed5b/ example job (note the inventory looks correct so guessing its something runtime with d-g)21:27
*** ltomasbo has joined #openstack-infra21:28
fungiboris_42_: but separately, i expect legacy-rally-dsvm-keystone-v2api-rally will still need (at least) openstack/dib-utils added to its required-projects list based on the error i saw in that one21:28
*** jpena|off has joined #openstack-infra21:29
*** amoralej has joined #openstack-infra21:29
fungiclarkb: changes to how we store multinode metadata in /etc/nodepool maybe? as in which files we make present in the new primary-less multinode world?21:29
clarkbfungi: ya or not setting the vars that trigger multinode code paths, gonna finish reviewing jeblair's change then hope to dig in properly21:30
*** thorst has joined #openstack-infra21:30
*** slaweq has quit IRC21:31
*** yamamoto has quit IRC21:31
fungiyep, i actually popped in on a break to review the inheritance fix. still mired in food-related stuff for a while yet21:31
openstackgerritBoris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Fix typo in rally jobs #/bin/bash -> #!/bin/bash  https://review.openstack.org/50862221:31
cloudnulllooking at the zuul status page, after a job completes we're seeing something like this "http://zuulv3.openstack.org/legacy-openstack-ansible-linters" - which has posted "node_failure"21:32
cloudnullthe post is testing pr https://review.openstack.org/#/c/508281/21:32
clarkbboris_42_: you'll actually probably want to fix it in the way mordred described21:32
clarkbboris_42_: 21:23:21          mordred | fungi: so we should turn line 29: #/bin/bash -xe - into set -x and set -e - and then add executable: /bin/bash21:32
cloudnullagainst our main repo, any advice?21:32
boris_42_@clarkb sure21:32
*** slaweq has joined #openstack-infra21:34
clarkbcloudnull: I think that means your new base job is leaving out the bits that configure things to use our log server21:35
openstackgerritMerged openstack-infra/zuul-jobs master: Add helpful error message about required-projects  https://review.openstack.org/50857621:35
clarkbcloudnull: I would look at the other base jobs to see what they include related to log server data21:35
*** srobert has quit IRC21:35
*** baoli has quit IRC21:35
*** thorst has quit IRC21:35
clarkbcloudnull: though you use the legacy-base as your parent so now I'm just confused21:36
boris_42_clarkb: just one short question about set -x and set -e: should I put these commands before the shebang or inside it ?21:37
jeblairclarkb, cloudnull: node_failure means unable to get a node from nodepool; bad things happened earlier today; is it very recent?  (since last zuul restart)21:37
*** hemna_ has quit IRC21:38
cloudnulli ran the job like an hour ago21:38
cloudnullmaybe it was pre restart ?21:39
jeblaircloudnull: i think the restart was a few hours ago now; it would probably be good to investigate the failure then.  maybe an infra-root can look at it?21:40
clarkbfungi: http://logs.openstack.org/02/508302/3/check/legacy-tempest-dsvm-neutron-multinode-full/b3ea860/logs/etc/nodepool/sub_nodes_private.txt.gz the empty sub nodes private file is why multinode isn't working21:40
clarkblooking to sort that out now21:40
mordredclarkb: the nodeset change should fix that21:42
mordredclarkb: https://review.openstack.org/#/c/508568/21:43
clarkbmordred: ya just read the when on that and it relies on the group21:43
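(For reference, a minimal sketch of a two-node nodeset with a subnodes group; group membership like this is what that change keys off of when it writes the /etc/nodepool files. Names here are illustrative:)

  - nodeset:
      name: legacy-ubuntu-xenial-2-node
      nodes:
        - name: primary
          label: ubuntu-xenial
        - name: subnode-1
          label: ubuntu-xenial
      groups:
        - name: subnodes
          nodes:
            - subnode-1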
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority  https://review.openstack.org/50861321:43
mordredclarkb, jeblair: now with a test!21:43
mordredShrews: your patch to add change id into the node request would make reading the asserts in that test nicer21:45
jeblairmordred: generally lgtm, question inline21:46
boris_42_mordred: hi there, can you elaborate about  mordred | fungi: so we should turn line 29: #/bin/bash -xe - into set -x and set -e - and then add exectutable: /bin/bash21:46
boris_42_mordred: not sure that I understand why #!/bin/bash -xe won't work ..21:47
clarkbboris_42_: because it is being executed by ansible as a shell script now. And by default it will use sh21:47
mordredboris_42_: sure! it's because putting a shebang line into an ansible shell: block doesn't work21:47
clarkbboris_42_: basically the shebang is no longer interpreted21:47
boris_42_ah ok21:47
jeblairmordred, clarkb: i'm chasing down a branch matcher problem i observed when working on the project-template thing21:48
mordredboris_42_: so if we want it to run under bash, we need to set that in the executable: parameter, and if we want -x or -e we need to use set -e and/or set -x ... we did this on the conversion for most of the jobs already ...21:48
mordredboris_42_: but because the shebang line in that script happened to be misformed, our parser didn't catch it (oops)21:48
mordredjeblair: kk21:48
boris_42_@mordred gotcha21:48
boris_42_going to refactor that piece21:49
openstackgerritMerged openstack-infra/project-config master: Adjust branches for OSC jobs  https://review.openstack.org/50350021:49
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority  https://review.openstack.org/50861321:49
mordredjeblair: ^^ fixed21:49
clarkbbah new patchset before i could post my comments21:50
mordredclarkb: sorry21:50
clarkbin any case not worth a -1 but had a couple things21:50
*** esberglu has joined #openstack-infra21:50
*** mat128 has joined #openstack-infra21:52
*** thorst has joined #openstack-infra21:52
*** hashar has quit IRC21:54
openstackgerritBoris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Do not use shebang in rally legacy jobs  https://review.openstack.org/50862221:55
boris_42_okay fixed it ^21:55
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority  https://review.openstack.org/50861321:56
*** esberglu has quit IRC21:56
mordredclarkb: I went ahead and addressed your review comments - and reordered the changes so that sequence and priority were in a different order to make it clearer21:56
*** thorst has quit IRC21:56
mordredboris_42_: looks great!21:57
openstackgerritDean Troyer proposed openstack-infra/project-config master: Remove python-aodhclient as g-r.txt has aodhclient  https://review.openstack.org/50862621:57
openstackgerritBoris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Remove unused shebang from legacy jobs  https://review.openstack.org/50862721:59
clarkbmordred: approved, thanks21:59
* clarkb reviews boris_42_'s change21:59
openstackgerritMonty Taylor proposed openstack-infra/infra-manual master: Update project creators guide with zuul v3 information  https://review.openstack.org/50859622:00
clarkbmordred: ^ reminds me, is docs publishing working yet?22:00
*** ijw has joined #openstack-infra22:01
*** Swami has quit IRC22:01
*** ijw has quit IRC22:02
*** bobh has quit IRC22:05
SamYapleis zuul down right now?22:05
SamYaplezuulv3.openstack.org/status.json just hangs22:05
*** tpsilva has quit IRC22:05
mnaserSamYaple i can confirm that behaviour on my side as well22:07
*** slaweq has quit IRC22:08
clarkbmordred: comments on 50859622:09
* clarkb goes to look at zuulv3.o.o22:09
clarkbI confirm only zuul-web is running22:09
clarkbjeblair: ^ you aren't in the process of restarting zuul are you?22:09
jlvillalOn the POST_FAILUREs I see on: https://review.openstack.org/508287  Should I wait until next week for a fix?22:10
clarkbjournalctl and the zuul logs don't seem to know why zuul-scheduler isn't running22:10
SamYaplenow its 503'ing22:11
SamYapleso progress!22:11
clarkbSamYaple: I think that means apache has noticed22:11
clarkbjlvillal: yes there is a fix for that specific issue22:11
clarkbjlvillal: just a matter of getting it merged22:11
mnaserclarkb systemctl status might say reason for service exit?22:11
mordredclarkb, jlvillal: http://paste.openstack.org/show/622352/22:12
mordredgha. jeblair ^^ ... don't know if that's fatal or not- but it's the most recent error in the log22:12
mordredclarkb, jeblair: nope - that happens from time to time22:14
clarkbmnaser: no luck from that, has the web server and nothing about scheduler22:14
jlvillalclarkb, mordred Thanks22:14
clarkbmordred: jeblair thinking we might consider merging those zuul patches, then get scheduler running again?22:14
mnaserclarkb22:14
clarkbbut ya I'm not finding anything in the logs about why it stopped running. oh, /me looks at syslog, maybe it was oom22:14
mnaseroops, sorry, early enter, not sure then :(22:14
clarkbyup OOM22:15
*** slaweq has joined #openstack-infra22:15
mordredawesome22:15
boris_42_clarkb: mordred is this just a race? http://logs.openstack.org/76/507276/5/check/legacy-rally-dsvm-manila-multibackend/3de3226/logs/devstack-early.txt.gz#_2017-09-29_19_05_07_119 install_from_lib doesn't seem to work ..22:15
clarkbboris_42_: ianw was working to fix that22:16
clarkbboris_42_: I don't understand all the details but I think there is a mailing list thread22:16
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Protect against builds dict changing while we iterate  https://review.openstack.org/50862922:16
clarkbwe don't have swap on zuulv3.o.o22:17
mordredclarkb: yes - that is correct - I believe we spun that server up before fixing launch node to do the fix-swap dance22:18
fungiclarkb: yeah, i mentioned that earlier in the week but at the time it wasn't using more than 50% of its available ram22:18
clarkbmaybe we should fix that then possibly merge some zuul fixes then start it again?22:18
openstackgerritBoris Pavlovic proposed openstack-infra/openstack-zuul-jobs master: Export missing projects in Rally legacy jobs  https://review.openstack.org/50863022:18
*** lukebrowning has quit IRC22:18
SamYaplecan I statically define the path for the playbook with zuulv3?22:18
fungilooks like the zuul-scheduler process was up over 18gib when the oom-killer decided to take action22:18
clarkbSamYaple: yes I think there should be examples of that22:18
boris_42_@clarkb okay thanks, it's good that someone is working on it22:19
mordredwe actually don't have /dev/xvde1 mounted anywhere22:19
SamYapleok ill search for it22:19
boris_42_@clarkb i fixed a bit more of jobs22:19
clarkbSamYaple: ya look in openstack-zuul-jobs and grep for playbook22:19
fungimordred: yes, i believe the server was built back when swap setup wasn't working with our launch script22:19
mordredfungi, clarkb: we don't really need much disk on that box - we could just turn all of /dev/xvde1 into swap22:19
SamYapleclarkb: found it in the docs https://docs.openstack.org/infra/manual/zuulv3.html#ansible-playbooks22:20
mordred(as an easy way to deal with that, since we're down anyway)22:20
clarkbarg /me has to deal with kids for a bit. But ya I think we should work on the rough plan above (not sure how safe those changes are; did any pass unit tests before it crashed?)22:20
* fungi has no idea what the modern limits of a swap partition size are, nor what negative repercussions there might be if you gave the kernel that much22:20
clarkbif we want to do a bunch probably 2x memory is plenty22:20
mordredclarkb, fungi, jeblair: also - I think https://review.openstack.org/508629 may help in some cases since we're seeing that exception from time to time in the logs22:20
mordredfungi: should we just run our normal swap script?22:21
*** mat128 has quit IRC22:21
fungii believe so, yes22:21
mordredclarkb: also - I agree re: landing the zuul patches - especially the one from jeblair22:21
mordredok. I'm going to do the swap bit right now22:21
fungithough may want to move /opt contents out of the way before running that22:21
fungiand then put them back after22:21
fungimordred: ^22:22
*** jtomasek has quit IRC22:22
mordredfungi: ++22:22
fungifewer surprises22:22
mordredoh - the script does that for us22:22
fungiahh, nice22:22
*** jtomasek has joined #openstack-infra22:23
*** d0ugal has quit IRC22:23
fungifor some reason i thought it ended up blowing away the contents of /opt on one of the mergers? executors? recently when we were fixing those up22:23
mordredfungi: that was a different dance - on the executors we want the volume mounted on /var/lib/zuul instead of in /opt/22:23
fungimaybe that behavior got fixed22:23
fungiahh22:23
funginm22:23
mordredok. swap enabled - volume mounted on /opt and fstab updated22:24
fungiso we have 1x ram as swap, which is fine since each is around 16gb22:24
*** lukebrowning has joined #openstack-infra22:24
SamYaplefrom .zuul.yaml is there a way for me to pass variables in some form to the jobs?22:25
mordredSamYaple: yup. it's "vars" on the job ...22:25
jeblairclarkb, mordred: that doesn't look like an especially sustainable line on the zuul memory graph22:26
mordredSamYaple: https://docs.openstack.org/infra/zuul/feature/zuulv3/user/config.html#attr-job.vars22:26
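(A minimal sketch of per-job variables, with hypothetical LOCI-style names; the vars become available to the job's playbooks as ordinary Ansible variables:)

  - job:
      name: loci-build-nova
      parent: base
      vars:
        distro: ubuntu
        project: nova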
SamYaplebeautiful. exactly what i was hoping for22:26
*** lukebrowning has quit IRC22:26
*** armax has quit IRC22:26
mordredjeblair: agree22:27
fungimemory growth looks like it's been pretty steady today, yeah http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=63979&rra_id=all22:27
*** lukebrowning has joined #openstack-infra22:27
*** yamamoto has joined #openstack-infra22:27
*** jtomasek has quit IRC22:27
fungihard to tell from the curve there whether it would have topped out around 16gb were there room to page some stuff out22:28
jeblairthe bug that i'm working on is another serious flaw related to project-templates; it will certainly produce erroneous configuration22:28
jeblairit's not going to be fixed by adding a colon22:28
superdanI think we still need this https://review.openstack.org/#/c/508519 to unblock nova, right? looks like it has had two +2s today, just not at the same time.. :)22:28
*** dhajare has quit IRC22:28
*** gouthamr has quit IRC22:29
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Remove broken openstack-tox-pep8 variant  https://review.openstack.org/50854222:29
superdanI hear +2s are happiest in pairs22:30
jeblairmordred, fungi, clarkb: in the column of "reasons to roll back" i think we have: 1) project-template bug #2; 2) dynamic reconfiguration is very slow; 3) unsustainable memory use (leak?)22:30
jeblairboth #2 and #3 are things that would benefit from a period of zuulv3 running check jobs only22:30
fungiif we roll back, what is the interim plan? would we unfreeze project-config?22:31
*** ijw has joined #openstack-infra22:31
jeblair#1 can probably be fixed in a day or so22:32
fungii worry that there are at least some fixes which have been made directly to migrated jobs now, which rerunning the migration script will lose22:32
*** baoli has joined #openstack-infra22:32
jeblairfungi: notably required-projects22:32
fungiyup, and things like rally's typo'd shebangs22:32
*** yamamoto has quit IRC22:32
mordredfungi, jeblair: yah- I'm pretty sure re-running the migration script would cause way more problems than it would solve22:33
jeblairit would cause a lot of problems22:33
SamYaplei would take hourly restarts of zuulv3 over rolling back. people are just coming around to how to do things again22:33
boris_42_fungi: actually we can work on moving Rally jobs to Rally repo22:33
fungiso we should likely leave zuul v2 configuration/jobs frozen if we roll back?22:33
boris_42_fungi: and fixing them there22:33
jeblairSamYaple: hourly restarts aren't long enough to merge a single change22:33
SamYaplejeblair: for you maybe! my jobs take 5 minutes to run! (but point taken)22:34
*** baoli_ has joined #openstack-infra22:34
jeblairSamYaple: for *us*22:34
mnaserfwiw i have an empty weekend and i'd be more than happy to pick up any work to help clear this out and avoid a zuulv2 rollback :>22:34
SamYaple:)22:34
mordredjeblair: it was actually steady memory wise, it seems, until the most recent restart - so we may have a fairly small amount of things to examine to find the memory leak22:35
clarkbya I think if we rolled back we'd run v2 and v3 side by side22:35
jeblairmordred: i believe the steady state was due to zookeeper being broken and nothing happening22:35
clarkbv3 doing check only possibly22:35
jeblairmordred: i think as long as zuulv3 is active it's leaking22:35
clarkband not do a migration script again just roll both sides forward22:36
fungii'm not super keen on rollback for fear of lost traction, but also don't want people losing their weekends to this22:36
jeblairi will not be able to work on this over the weekend22:36
*** baoli has quit IRC22:37
fungiright. we've been hitting this pretty hard all week, so letting v2 take back gating and do check side by side with v3 into early next week sounds compelling22:37
jeblairit looks like we got 6 hours of use out of v3 before it ran out of ram22:38
superdan5.5 hour cron restart and call it good!22:38
jeblairswap will reduce the likelihood of oom killer, but it will still tank performance enough for us to need to restart22:38
jeblairit's also worth noting that it took a couple of hours (because of the reconfig slowness) to re-enqueue changes, so expect a backlog of a couple hours after every restart.22:40
fungii'm leaning toward rollback at this point so we can focus on fixing what we know is broken, at least22:42
mnaserwouldn't there be a lot of resource starvation as both systems attempt to run jobs22:42
fungibut i still think v2 configuration needs to remain basically frozen22:42
mnasersplitting resources across both nodepools22:42
fungimnaser: yeah, we'd probably have to give ~2/3 of the quota to v222:43
mordredfungi: I think both configs will largely need to remain frozen - at least in terms of what's in what pipeline22:43
jeblairmnaser: v3 would use less since we'd be doing check only22:43
jeblairand i'd expect us to allow it to get fairly backlogged22:43
mordredfungi: well - I take that back - iterating forward on v3 jobs should be fine22:44
fungiif anything, a bit of backlog there probably helps expose issues22:44
SamYaplewon't a rollback right now break projects that converted their stuff to v3?22:44
jeblair(actually running jobs is probably not strictly necessary to expose these issues)22:44
clarkbSamYaple: no all your old jobs are still there22:44
fungiunless we approved changes to delete the v2 jobs, which i doubt22:44
*** rbrndt has quit IRC22:44
*** esberglu has joined #openstack-infra22:45
SamYapleah i suppose that makes sense22:45
clarkbhow difficult would it be to run v3 as check only?22:45
mordredSamYaple: not realy - the v2 version fo the jobs ... yah - that clarkb said - and you should still be able to iterate on your local versions of the jobs - it just won't gate on them22:45
mnaseronly thing to keep in mind is to make sure zuul doesnt vote even if it runs on check22:45
jeblairmnaser: it's okay for it to vote, it's a different user than v322:45
mordredmnaser: zuul voting in check is fine - yah, that22:45
mnaserso jenkins can leave +1 and zuul -1 and it'll allow it to merge? dont know acls much but hey if it'll be okay you know better :)22:46
mordredclarkb: so - that's the part I'm a smidge concerned about ... because we can write a script to remove pipeline config for things that aren't check ...22:46
*** xarses has quit IRC22:46
fungithough this is going to trip up those third-party-ci users who had configured their systems to only run jobs after zuul voted, since we're toggling the account name back again22:46
mordredclarkb: but if people make changes to check jobs, re-applying the removed non-check entries will be very hard22:46
fungimnaser: correct, zuul v2 ("jenkins" account) is only configured to care about its own verify +122:47
fungiso it'll ignore the zuul v3 ("zuul" account) votes22:47
mordred(I mean, not impossible, but the re-applied gate entries will not reflect any changes made to check entries)22:47
jeblairif folks want to keep limping along on v3, that's fine.  i just wanted to be up front that if we were considering flipping the switch right now, i would not do it.  and i'd say we probably need 2 weeks to deal with the performance/memory issues.  they will not be easy to find or fix.22:48
SamYapleas long as i can hammer out my new zuulv3 jobs this weekend, i suppose im ok with a solution that is easiest22:48
*** slaweq has quit IRC22:48
fungiSamYaple: with v3 voting in the check pipeline alongside v2, you can iterate on most jobs (aside from release, post, periodic) pretty easily22:49
clarkbmordred: you mean in layout type stuff as we'd have to reconcile the delta?22:49
SamYapleyea thats what it sounds like. and for this particular repo its noop on v2, so meh. i guess i dont have much of a stake in this race22:49
mordredjeblair: my biggest concern with rolling back is the logistical flux that I think will be hard for people to reason about. but I also agree that finding and fixing the memory and performance issues is not likely to be straightforward or easy22:49
fungiSamYaple: assuming v3 is running well enough to execute jobs and report back at any given point in time that is22:49
SamYaplefungi: fair point22:50
*** esberglu has quit IRC22:50
fungileaving v2 job configuration frozen for a couple weeks will be a tough sell, but i expect the community will understand the reasons22:51
cloudnullI'm at a bit of a loss regarding what I'd need to change here: https://review.openstack.org/#/c/508281 test commits are still resulting in "node_failure", i have no idea why.22:51
jeblairthere is a possibility that it's not so much of a memory leak as just using an inordinate amount of memory.  maybe it will level off at 20G.22:51
mordredwhich is to say - I think I lean *slightly* toward limping forward, but not so much that i'd try to persuade anyone to change their mind if they leaned the other direction22:51
mordredjeblair: this is an excellent point22:52
mnaserwill it actually really slow down if it swaps out a lot? it could just be idle memory that is largely unused :X22:52
fungii'm more concerned about the correctness of job selection, with whatever new bug it is jeblair has spotted which will take significant engineering to fix22:52
jeblairfungi: expect a fix for that within a day; i wouldn't let it drive a rollback decision.22:53
*** thorst has joined #openstack-infra22:53
clarkbI am booked with family stuff this weekend and am traveling to a conference late next week22:53
clarkbbut otherwise happy to keep rolling forward22:53
mordredwe could leave it in place over the weekend and see how memory grows or doesn't with the swap in place, and stage a rollback if needed early monday (since we'll need to figure out cleanly splitting out the projects.yaml file anyway)22:53
clarkbmy biggest concern is making sure the rest of openstack is able to get their work done too22:54
mordredyah22:54
mordredI agree with that22:54
clarkbmordred: thats a good point since we are quiet over weekend for most part22:54
fungii'm willing to check in periodically over the weekend and restart/reenqueue zuul if memory pressure reaches the danger zone, but can't commit to much more... and also wonder what the load is going to look like on it again when monday rolls around22:54
clarkbits not a huge loss to delay any rollback based on more data gathering22:54
SamYaplecan we add swap and see if memory levels out over the weekend?22:54
mordredyah - I don't intend to WORK on things over the weekend, but I can check in periodically for a restart if needed if it'll help us gather data22:54
mordredSamYaple: we have added the swap - so that's in place22:55
fungiSamYaple: mordred added swap moments ago22:55
jeblairmordred: how much?22:55
SamYaplegot it. so that will be good data22:55
jeblair16 i see it now22:55
mordredjeblair: 16G22:55
fungi16gb but we can up it22:55
mordredjeblair: we have plenty of disk on the volume we could reallocate if we wanted22:55
SamYapleany reason to not throw the entirety of /dev/xvde1 at swap?22:55
fungithough if it burns through most of that then i don't think more swap is the answer anyway22:55
jeblairyeah, let's actually do that.  i normally would avoid this, but, if it turns out we're leaking idle memory and it doesn't impact performance, we'll get more data and save a debugging cycle22:56
*** yee379 has quit IRC22:56
mordredfungi: well - it might be if we learn that zuul levels off at 45G and we just need to boot a REALLY big server22:56
*** yee379 has joined #openstack-infra22:56
SamYapleright that's my point22:56
jeblairobviously not a long term answer, but may help answer questions faster and possibly keep limping longer22:56
fungii can buy the moar datas argument22:56
mordredjeblair: by "that" do you mean "just add the whole volume as swap" ?22:56
fungiyeah22:56
jeblairmordred: well, more, maybe not whole22:56
*** wolverineav has quit IRC22:57
fungiit's, what, an 80gb device?22:57
*** mat128 has joined #openstack-infra22:57
jeblairmordred: we're not using it for anything else?22:57
SamYapleive had 1TB swap before... more is fine surely22:57
*** dizquierdo has joined #openstack-infra22:57
mnaseruh if this server is at rax i'd suggest grabbing swap from the local drive because that wont go over the network22:57
mordred133G22:57
mnaser(maybe make a swap file?)22:57
mordredjeblair: we are not22:57
mordredjeblair: it was completely unmounted before we ran the swap script22:57
jeblairmordred: okay sure, whole thing i guess :)22:57
mordredok. let me do that real quick22:58
*** mat128 has quit IRC22:58
jeblairmnaser: i think this is a local volume22:58
jeblairmnaser: i think this is the "ephemeral volume" you get with rax servers22:58
mnaserokay cool, that's ideal22:58
fungiwe'll want to relocate the content for /opt back off it before we destroy that volume though22:58
fungi(obviously)22:59
mnaserim sure this is really obvious but i guess this memory leak started since the latest restart of zuul?22:59
SamYaplemnaser: memory leak OR lots of memory usage.22:59
mnaserhttp://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=63979&rra_id=all22:59
mnaserwell it seemed to be pretty stable22:59
SamYapleone of those is easier to solve :)22:59
mnaseruntil the latest restart which changed the pattern of memory usage significantly23:00
fungimnaser: one interpretation is that it started once we got it under significant load by fixing the zk timeout issues23:00
clarkbto make sure it's clear, the rough plan is: increase swap on zuulv3.o.o, use the weekend to collect data with minimal impact, and decide Monday-ish if a rollback is necessary?23:00
fungiclarkb: that sounds right to me23:00
SamYaple+123:00
mordredok. we have 150G of swap now23:02
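For anyone reconstructing this later, here is a minimal sketch of how a spare, unmounted ephemeral volume can be dedicated to swap as discussed above; the device name /dev/xvde1 comes from the conversation, and the exact commands are an assumption rather than the actual swap script that was run:

    # assumption: /dev/xvde1 is the unmounted ephemeral volume mentioned above
    sudo mkswap /dev/xvde1                     # write a swap signature to the device
    sudo swapon /dev/xvde1                     # enable it immediately
    swapon --show                              # confirm the new swap size
    # optional: persist the swap device across reboots
    echo '/dev/xvde1 none swap sw 0 0' | sudo tee -a /etc/fstab

As mnaser notes, keeping swap on the local ephemeral disk rather than a network-attached volume means paging never has to cross the network.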
fungiand did we want to go ahead and merge any of the pending zuul patches before restarting? all of them? specific ones?23:02
*** dizquierdo has quit IRC23:02
mordredfungi: I would like to merge jeblair's patch at least23:03
clarkb++23:03
fungithe [:] patch? i agree that one's pretty critical23:03
jeblairheh, it will use (a little) more memory :)23:03
mordredhttps://review.openstack.org/#/c/508629 and https://review.openstack.org/#/c/508613 are worth looks too23:03
mordredas is https://review.openstack.org/#/c/508620/23:04
*** slaweq has joined #openstack-infra23:04
mordredbut I defer to jeblair on those - I've got the elevated bit - want me to merge the jeblair change?23:05
fungiyeah23:07
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Fix bug with multiple project-templates  https://review.openstack.org/50861223:07
jeblair629 is a good fix but not having it won't cause any problems; let's leave it for normal process23:08
jeblairother two are good candidates for force-merge before restarting23:08
fungi508553 would also be nice for any restarts, but can always use a local fork install in a venv with that23:08
fungier, i guess it doesn't even need installing. standalone utility script23:09
*** hongbin has quit IRC23:09
fungiso can just let it merge normally23:09
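The zuul-changes utility fungi mentions is a standalone script; purely as an illustration (not the real script), the same save-and-re-enqueue idea can be sketched with curl and jq. The status URL, the JSON layout, and the enqueue flags below are assumptions based on the v3 client of the time, so check the real tools/zuul-changes.py before relying on this:

    # dump the running pipelines as zuul enqueue commands (illustrative only)
    curl -s http://zuul.example.org/status.json |
      jq -r '.pipelines[] | .name as $p | .change_queues[].heads[][] |
             "zuul enqueue --tenant openstack --trigger gerrit" +
             " --pipeline \($p) --project \(.project) --change \(.id)"' \
      > reenqueue.sh
    # after the scheduler restart, replay the saved items
    sh reenqueue.sh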
superdanhttps://review.openstack.org/#/c/508519/623:10
superdanoops23:10
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority  https://review.openstack.org/50861323:10
openstackgerritMonty Taylor proposed openstack-infra/zuul feature/zuulv3: Protect against builds dict changing while we iterate  https://review.openstack.org/50862923:10
mordredhad to rebase the pipeline precedence one - we both added tests23:10
jeblairmordred: rebase looks good23:11
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Map pipeline precedence to nodepool node priority  https://review.openstack.org/50861323:11
openstackgerritMerged openstack-infra/zuul feature/zuulv3: SourceContext improvements  https://review.openstack.org/50862023:12
fungido we need to restart executors for any of those too, or just the scheduler?23:12
openstackgerritMerged openstack-infra/zuul feature/zuulv3: Update zuul-changes script for v3  https://review.openstack.org/50855323:12
jeblairjust sched23:12
mordredk. I think that's everything - shall I do a kick from puppetmaster?23:13
jeblairmordred: sounds good23:13
fungithat does seem like the next step23:13
mordredrunning puppet23:13
jeblairi have rm'd the pidfile and run 'service zuul-scheduler stop' so that systemd will not be confused23:14
jeblairmordred: you should be able to just start it normally when ready23:14
fungigood call23:14
mordredok. it's done23:15
fungisad that systemd gets confused by services dying23:15
mordredstarting ...23:15
clarkbfungi: its because zuul manages its own pid file iirc23:15
fungiclarkb: yeah, and backgrounds itself23:15
clarkba sigterm handler likely would be good23:15
clarkbbut later :)23:16
mordredzuul scheduler started23:16
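A rough recap of the restart sequence above, for reference; the pidfile path and service names are assumptions about the local layout rather than exact commands from the log:

    # the scheduler already died in the OOM event, so the pidfile is stale
    sudo rm -f /var/run/zuul/zuul.pid          # assumed pidfile location
    sudo service zuul-scheduler stop           # let systemd record the unit as stopped
    # ...run puppet from the puppetmaster to deploy the force-merged fixes...
    sudo service zuul-scheduler start          # only the scheduler needs restarting
    # the executors can keep running; none of the merged patches touch them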
fungithat was ~70 minutes from the oom event at 22:0723:19
jeblairi think we've exceeded cacti's swap limit; i will update it23:20
fungiin case anyone's looking back at the timeline later23:20
SamYaplefungi: i run systemctl restart for that reason. i think that was a design decision23:20
*** bobh has joined #openstack-infra23:21
openstackgerritAndrea Frittoli proposed openstack-infra/devstack-gate master: Basic processing of test results  https://review.openstack.org/50798023:27
openstackgerritAndrea Frittoli proposed openstack-infra/devstack-gate master: Throwaway patch to check subunit file processing  https://review.openstack.org/50817123:27
*** rossella_s has quit IRC23:28
*** yamamoto has joined #openstack-infra23:29
clarkbwe should recheck various fix changes23:30
clarkbmordred groups one and the more reliable tox logs one23:30
*** bobh has quit IRC23:31
*** rossella_s has joined #openstack-infra23:32
*** yamamoto has quit IRC23:35
*** lbragstad has quit IRC23:35
*** slaweq has quit IRC23:36
*** bobh has joined #openstack-infra23:37
mnaserhate to be the bearer of bad news but i think no new jobs are getting queued?23:37
superdanmnaser: they are for me23:38
mnasersuperdan i swear as you said that my jobs appeared23:38
mnaserso uh, thanks!23:38
superdanmnaser: you. are. welcome.23:38
fungiit's his superpower23:38
*** esberglu has joined #openstack-infra23:39
*** fried_rice is now known as efried_thbagh23:40
*** esberglu has quit IRC23:44
*** bobh has quit IRC23:44
clarkbhttp://logs.openstack.org/18/505418/13/check/openstack-tox-py27/0b6f094/zuul-info/ says xenial \o/23:45
mordredclarkb: WOOT23:48
clarkbthough nnet job still running against master apparently23:48
clarkbmordred: before I edit https://review.openstack.org/#/c/508519/6/zuul.d/zuul-legacy-project-templates.yaml can you check my comment there that it is sane?23:49
*** slaweq has joined #openstack-infra23:50
clarkbmordred: looking at http://logs.openstack.org/12/506312/7/check/legacy-tempest-dsvm-nnet/54fc679/zuul-info/inventory.yaml it didn't apply the branch restriction23:51
clarkbit says branches: {MatchAny:{BranchMatcher:master}} so guessing that is another configuration bug?23:51
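A quick way to repeat the check clarkb is doing here, i.e. confirming which branch matcher a job variant actually carried, is to pull the published zuul-info inventory for the build and look at the branches field; the grep below is just a sketch against the log URL from the conversation:

    # fetch the build's inventory and show the job's effective branch matcher
    curl -s http://logs.openstack.org/12/506312/7/check/legacy-tempest-dsvm-nnet/54fc679/zuul-info/inventory.yaml |
      grep -B1 -A2 'branches:'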
*** zhurong has joined #openstack-infra23:52
SamYapleit cannot find my custom playbook. https://review.openstack.org/#/c/508625/ , im at a loss23:53
mordredclarkb: yes - re https://review.openstack.org/#/c/508519/6/zuul.d/zuul-legacy-project-templates.yaml23:53
*** caphrim007_ has quit IRC23:54
*** caphrim007 has joined #openstack-infra23:54
mordredclarkb: yes! this is, in fact, part of the problem with the thing jeblair is now working to fix23:55
jeblairyeah, i've tracked it down to project-templates getting branch matchers when they shouldn't23:55
clarkbok, I'll work on getting mriedems workaround in place then23:56
jeblairthis is probably also the cause of the earlier neutron not running stuff on stable...23:56
jeblairwhat workaround?23:56
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Fix Kolla-Kubernetes missing deps.  https://review.openstack.org/50859723:56
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Drop non-legacy Puppet project templates  https://review.openstack.org/50833323:56
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Update ubuntu-xenial-2-node to match centos-7-2-node  https://review.openstack.org/50856823:56
openstackgerritMerged openstack-infra/openstack-zuul-jobs master: Remove spurious change to tox.ini  https://review.openstack.org/50860723:56
clarkbwell it may not workaround it now that I think about it23:56
clarkbjeblair: but 50851923:56
clarkbjeblair: it cleans things up around that job23:56
jeblairclarkb: hrm, does the nnew-newton job have a branch matcher?23:57
jeblairnnet-newton23:58
clarkbjeblair: on the job itself I think23:58
clarkbrather than on a template23:58
jeblairclarkb: hrm, i don't see one23:58
*** caphrim007 has quit IRC23:59
*** Sukhdev has quit IRC23:59
