Tuesday, 2014-05-27

bknudsonI thought the backup time was going to be changed so gerrit didn't hang at this time.00:00
anteayaI am trying to vote +1 but the screen is frozen00:01
bknudsonok, it's back00:01
anteayayeah, I hit it too00:01
mordredwoot. now I can bother jhesketh for reviews on my things00:11
mordredjhesketh: review all of my things!00:11
jheskethmordred: this is a two way street my friend ;-)00:11
mordredjhesketh: blast. you actually know how this works :(00:12
jheskethI have plenty of unmerged patches :-)00:12
mordredjhesketh: really? I actually see none from you in my review queue...00:12
anteayajhesketh: look at that +2, very nice00:12
anteayasince you are both here: https://review.openstack.org/#/c/92475/00:12
anteayathat is the initial commit for the infra-manual repo00:13
anteayawhich has licensing stuff in it00:13
jheskethmordred: https://review.openstack.org/#/q/owner:%22Joshua+Hesketh%22+status:open,n,z00:13
* mordred wonders if he is missing project watches00:13
anteayayou don't have to review it now, but I will keep bugging you00:13
jheskethheh, anteaya also knows how this works...00:14
anteayawhen the crowd gathers...00:14
mordredthe diff view shows images00:14
mordredhas it always done that?00:14
anteayaI don't think so, no00:14
anteayano we used to have to download the patch and open with a browser00:15
mattoliverauWow, awesome! Now I can submit all my code in pictures!00:15
anteayaha ha ha00:15
anteayaI like pictures00:15
* mattoliverau wonders if it also supports animated gifs... this could be fun ;P00:16
jheskethmordred: I saw the patches to allow imgs, it was cool00:18
anteayawe need to ban this person from all #openstack channels00:18
anteaya(s)he was in -meeting earlier00:19
mordredhrm. I smell an accessbot feature :)00:19
anteayajhesketh: do you have links?00:19
jheskethanteaya: links to which now?00:19
mordredanteaya: there is a small flaw in that patch - I had to -1 it - if you wanted to correct it and resubmit, I betcha it would not be offensive00:20
anteayajhesketh: the patches to allow images00:20
jheskethoh they merged a while back00:20
jheskethit was just a config change to gerrit00:20
anteayaI'll nuke the blank space I noticed as well00:20
ianwfungi: yeah, my fault from https://review.openstack.org/#/c/93862/ (Handle Workflow in comment matching) i think00:24
openstackgerritAnita Kuno proposed a change to openstack-infra/infra-manual: Initial commit  https://review.openstack.org/9247500:26
*** matsuhashi has joined #openstack-infra00:26
mordredjhesketh: did you see the "Hide CI comments in gerrit" mailing list message?00:27
mordredthere is a javascript thing: https://gist.github.com/rgerganov/35382752557cb975354a00:27
jheskethmordred: yes I did00:27
mordredmakes me wonder if perhaps it might be the sort of thing we might want to just include in our local javascript header for gerrit in general00:27
anteayathere is developer support for it00:28
anteayabased on feedback from the summit00:28
jheskethmordred: that was what I was thinking00:28
anteayafolks were also requesting a separate column for ci reports00:28
jheskethit's a good first pass and probably the easiest solution, but it'd be neat to pull the comments to the side in the future00:29
jheskethalthough that probably requires quite a bit of UI redesign00:29
mordredI'm going to respond to the list00:29
*** nati_uen_ has quit IRC00:30
jheskethmordred: if my JS patches ever land I can get CI status' overlaying the pages00:31
* jhesketh is less than subtle before coffee sorry00:31
mordredjhesketh: oh. well00:31
mordredjhesketh: it seems your patch is WIP'd00:32
jhesketheh? which one...00:32
mordredyour zuul status overlay00:33
*** matsuhashi has quit IRC00:33
jheskethoh right, yes, that's WIP until all the zuul javascript patches are merged00:34
*** matsuhashi has joined #openstack-infra00:34
mordredoh. gotcha00:34
jheskethmordred: Here's the tail https://review.openstack.org/#/c/91316/00:35
*** nati_uen_ has joined #openstack-infra00:35
mordredjhesketh: I will work through those ... you may find https://review.openstack.org/#/c/90565/ and its ancestors interesting reading...00:35
anteayajenkins is happy with the first infra-manual commit: https://review.openstack.org/#/c/92475/200:36
anteayaanyone know what the zanata demo server is?00:36
jheskeththanks mordred00:36
*** zhiyan_ is now known as zhiyan00:38
*** nati_uen_ has quit IRC00:39
mordredanteaya: yes. pleia2 is working on it - it's step one in trying out zanata as a replacement for transifex00:41
anteayaah cool00:46
anteayaI've invited Carlos A. Munoz to join us on irc and chat about zanata00:48
anteayahe emailed the infra ml00:49
*** gokrokve_ has quit IRC00:51
*** gokrokve has joined #openstack-infra00:52
*** nati_uen_ has joined #openstack-infra00:52
*** nati_ue__ has joined #openstack-infra00:54
*** gokrokve has quit IRC00:56
mordredjhesketh: stack reviewed00:57
mordredanteaya: I have rebased my email response on top of yours00:57
*** igor_ has quit IRC00:57
*** nati_uen_ has quit IRC00:57
*** nati_ueno has joined #openstack-infra00:57
jheskethmordred: awesome, thanks :-)00:59
*** nati_ue__ has quit IRC00:59
anteayamordred: thanks for doing so00:59
*** nati_ueno has quit IRC00:59
*** nati_ueno has joined #openstack-infra01:00
openstackgerritA change was merged to openstack-dev/hacking: Include rule numbers in HACKING.rst  https://review.openstack.org/9347001:00
openstackgerritA change was merged to openstack-dev/hacking: Add Installation section to the readme  https://review.openstack.org/9347101:01
openstackgerritA change was merged to openstack-dev/hacking: Drop 'not in' and 'is not' tests from HACKING.rst  https://review.openstack.org/9347201:01
openstackgerritA change was merged to openstack-dev/hacking: update Commit Message guidelines  https://review.openstack.org/9347301:01
*** nati_ueno has joined #openstack-infra01:09
anteayamordred: thanks01:10
*** yaguang has joined #openstack-infra01:11
openstackgerritA change was merged to openstack-infra/zuul: Add in sparklines to status page pipelines  https://review.openstack.org/8492201:14
openstackgerritJoshua Hesketh proposed a change to openstack-infra/zuul: Fix up fetching jquery.visibility  https://review.openstack.org/9131601:15
openstackgerritJoshua Hesketh proposed a change to openstack-infra/zuul: Move status dom into js app for easy reuse  https://review.openstack.org/9504901:15
lifelessmordred: since you are around01:26
lifelessmordred: I want to have a CI job that runs devtest.sh --build-only01:26
lifelessmordred: this requires root (since it builds images) - I presume that makes it a dsvm job ?01:27
lifelessmordred: also have you read https://etherpad.openstack.org/p/infra-no-floating-ip-slaves ?01:28
*** david-lyle has quit IRC01:30
*** matsuhashi has quit IRC01:33
mordredlifeless: "dsvm" stands for DevStack VM - you just want a bare node01:33
mordredlifeless: and no - looking01:34
*** matsuhashi has joined #openstack-infra01:34
mordredlifeless: first response - we don't want to add any more features that take advantage of any features of jenkins, since we're making jenkins go away. I don't think that's a problem - just pointing it out since the text talks about teaching jenkins about jumphosts01:35
mordredlifeless: I _think_ we already have some related work done though that you might be able to build off of01:36
mordredlifeless: specifically, the work done to enable multi-node devstack01:36
mordredlifeless: I don't see any general issues with the strategy though01:37
mordredlifeless: "By setting a ProxyCommand in ~/.ssh/config for all the hosts that connect to slaves, we can transparently trigger jump host use for openssh use. " seems potentially weird01:38
mordredlifeless: how do you see that working? when we nova boot a thing, we're going to get back the "public" ip and we're going to want to connect to that - how would we manage that? have each cloud have its own 10.x subnet for its 'public' addresses?01:40
mordredand then have an entry for that cloud's 10.123.* with the jump host config?01:40
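The mapping question above (which jump host serves which cloud's address range) can be sketched in a few lines. This is only an illustration, assuming each cloud has its own private range: the subnets, the jump host names, and both function names are hypothetical and do not come from nodepool or the etherpad. `ProxyCommand ssh -W %h:%p` is standard OpenSSH syntax.

```python
import ipaddress

# Hypothetical mapping of each cloud's private range to its jump host.
# Neither the subnets nor the host names come from the discussion.
JUMP_HOSTS = {
    "10.123.0.0/16": "jump.cloud-a.example.org",
    "10.124.0.0/16": "jump.cloud-b.example.org",
}


def jump_host_for(ip, jump_hosts=JUMP_HOSTS):
    """Return the jump host whose subnet contains ip, or None."""
    addr = ipaddress.ip_address(ip)
    for cidr, host in jump_hosts.items():
        if addr in ipaddress.ip_network(cidr):
            return host
    return None


def ssh_config_stanza(pattern, jump_host):
    """Render a ~/.ssh/config entry that routes pattern via jump_host."""
    return (
        "Host {pattern}\n"
        "    ProxyCommand ssh -W %h:%p {jump}\n"
    ).format(pattern=pattern, jump=jump_host)


if __name__ == "__main__":
    print(jump_host_for("10.123.4.5"))
    print(ssh_config_stanza("10.123.*", "jump.cloud-a.example.org"))
```

Keeping the table in one place (for example next to the provider definitions) rather than only in ~/.ssh/config would address the concern about the two files drifting apart, at the cost of generating the ssh config from it.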
*** oomichi has joined #openstack-infra01:40
*** alugovoi has joined #openstack-infra01:41
fungiianw: thanks for the refresher, actually--your incomplete change points out that my completion is also incomplete! ;) (fixing now)01:42
mordredlifeless: also, if you want to cry, you may want to look at: http://git.openstack.org/cgit/openstack-infra/nodepool/tree/nodepool/provider_manager.py#n4401:43
*** gokrokve has joined #openstack-infra01:44
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Broaden recheck/reverify pattern for vote matching  https://review.openstack.org/9561101:44
*** nosnos has joined #openstack-infra01:44
fungiianw: ^01:44
ianwfungi: not sure if it's worth inbuilt tests because it doesn't change much, but even just a comment saying "make sure the regex matches ...." and some examples might help01:45
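The "comment with some examples" idea could look roughly like the sketch below. The pattern here is hypothetical and is NOT the regex from openstack-infra/config; it only shows the shape of keeping should-match and should-not-match examples next to a comment filter so they can be checked mechanically.

```python
import re

# Hypothetical pattern for illustration only; the real comment filter
# in openstack-infra/config may differ.
RECHECK = re.compile(
    r"^(Patch Set \d+:( [\w-]+[+-]\d)*\n\n)?\s*"
    r"(recheck|reverify) (no bug|bug #?\d+)\s*$",
    re.IGNORECASE,
)

# Comments the filter is expected to accept.
SHOULD_MATCH = [
    "recheck no bug",
    "reverify bug 1234",
    "Patch Set 3: Code-Review+1\n\nrecheck bug #1234",
    "Patch Set 2: Workflow-1 Code-Review+2\n\nrecheck no bug",
]

# Comments the filter is expected to ignore.
SHOULD_NOT_MATCH = [
    "please recheck this later",
    "recheck",
    "Patch Set 1: Code-Review+1",
]


def check_examples(pattern=RECHECK):
    """Return a list of (expectation, text) pairs that the pattern gets wrong."""
    failures = []
    for text in SHOULD_MATCH:
        if not pattern.search(text):
            failures.append(("should match", text))
    for text in SHOULD_NOT_MATCH:
        if pattern.search(text):
            failures.append(("should not match", text))
    return failures


if __name__ == "__main__":
    for expectation, text in check_examples():
        print(expectation, repr(text))
```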
mordredlifeless: we have clouds that do not give us an indication as to what network range we should use to talk to them01:45
mordredactually ...01:46
mordredfungi, corvus: then WE get to give it names for things, and that means we can stop guessing about which ip we should be connecting to, because we'd be creating and managing all of that01:47
mordredwe'd need to keep a special case for rackspace until they expose a neutron API - but actually going through and making the primary control path very neutron aware might-could get us to a pretty sane place01:47
clarkbmordred: yup. I think the tricky bit will be scaling down01:48
mordredclarkb: yup01:48
*** gokrokve has quit IRC01:48
clarkbsince we will fragment and cant really defrag01:48
lifelessmordred: not entirely sure01:49
mordredclarkb: yah. although maybe scaling down networks isn't as important?01:49
fungiianw: honestly, i should have just looked at all the comment_filter lines01:49
clarkbbut we can just avoid it entirely01:49
mordredclarkb: since we're not using them throwaway01:49
clarkbmordred exactly01:49
lifelessmordred: but the jump host for jenkins would be on the same private network as the nodes01:49
clarkbbut people that pay for resources may want scale down01:49
mordredlifeless: right. that makes sense01:49
lifelessmordred: and whether that is a unique private network or a cloud-wide one is irrelevant01:49
mordredlifeless: I'm just asking about how nodepool itself knows to connect to a cloud's jump host when it gets a nova instance back if the jump hosts are configured in .ssh/config01:50
lifelessmordred: I think we'd want a 'use jump host' option in nodepool which could trigger simpler codepaths throughout01:50
mordredlifeless: we've got non-floating-ip code paths too - I'm not worried about that - although I agree with you - it's purely the "how do we map a host to the appropriate jump host" problem I'm concerned with01:51
mordredlifeless: and a mild concern that having that in .ssh/config and not nodepool.yaml might get ... confusing01:51
mordredlifeless: still working on specs repo01:52
mordredlifeless: I mean, specs repo exists- but the initial commit is not landed01:52
*** saschpe has quit IRC01:53
mordredlifeless: however, https://review.openstack.org/#/c/94440/ already has some children01:53
*** igor_ has joined #openstack-infra01:53
lifelessmordred: reworked the work items in the etherpad.01:53
mordredclarkb: now that I'm thinking about using both glance and neutron in nodepool, I REALLY want python-openstacksdk to exist01:53
*** alugovoi has quit IRC01:54
*** saschpe has joined #openstack-infra01:55
lifelessmordred: should I move this to the specs repo now ?01:56
mordredlifeless: sure!01:56
mordredthat way we can capture some of this01:56
lifelessthe one that isn't in the openstack namespace01:57
*** igor_ has quit IRC01:57
*** nati_ueno has joined #openstack-infra02:00
openstackgerritlifeless proposed a change to openstack-infra/infra-specs: Make use of IP per slave optional.  https://review.openstack.org/9562502:08
*** nati_ueno has quit IRC02:10
lifelessmordred: ^tada02:11
anteayalifeless: whitespace on line 2302:12
lifelessanteaya: thanks, but I'm sure there are many more issues than that02:14
anteayafair, that was the one I spotted02:16
*** nati_ueno has joined #openstack-infra02:19
lifelessjhesketh: you should hang in #tripleo :)02:19
jheskethlol, then I'd just have more things to do02:19
*** nati_ueno has quit IRC02:25
lifelessjhesketh: ... and? :)02:26
*** alugovoi has joined #openstack-infra02:26
mayu_can anyone tell me how jenkins tells a slave node to git pull the specific patch set?02:27
*** zhiyan is now known as zhiyan_02:29
*** david-lyle has joined #openstack-infra02:31
*** nati_ueno has joined #openstack-infra02:33
*** david-lyle has quit IRC02:35
*** lcheng_ has joined #openstack-infra02:35
jheskethlifeless: I'm not that gullible02:38
*** gokrokve has joined #openstack-infra02:41
*** gokrokve has quit IRC02:46
*** zhiyan_ is now known as zhiyan02:46
mordredmayu_: zuul prepares a set of repo states and passes those refs as env vars - then we have a script called "gerrit-git-prep" which reads those env vars and does the appropriate git actions02:50
mayu_thanks, mordred02:51
*** zhiyan is now known as zhiyan_02:52
*** zhiyan_ is now known as zhiyan02:52
mayu_@mordred: are there any references for this process?02:53
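The flow mordred describes (zuul exports refs as environment variables, a prep script turns them into git actions) can be sketched as follows. This is a simplified illustration and not the actual gerrit-git-prep script: the variable names ZUUL_URL, ZUUL_PROJECT and ZUUL_REF are assumptions that are not stated in the conversation, and the function only builds the commands instead of running them.

```python
import os


def git_prep_commands(env=None):
    """Build the git commands that check out the ref zuul prepared.

    Assumes the environment carries ZUUL_URL, ZUUL_PROJECT and ZUUL_REF
    (names assumed for this sketch, not taken from the chat).
    """
    env = os.environ if env is None else env
    remote = "{}/{}".format(env["ZUUL_URL"].rstrip("/"), env["ZUUL_PROJECT"])
    ref = env["ZUUL_REF"]
    return [
        ["git", "fetch", remote, ref],             # pull the prepared ref
        ["git", "checkout", "FETCH_HEAD"],         # switch to it
        ["git", "reset", "--hard", "FETCH_HEAD"],  # drop any local changes
    ]


if __name__ == "__main__":
    # Example values are made up for illustration.
    example = {
        "ZUUL_URL": "http://zuul.example.org/p",
        "ZUUL_PROJECT": "openstack-infra/config",
        "ZUUL_REF": "refs/zuul/master/Zexample",
    }
    for command in git_prep_commands(example):
        print(" ".join(command))
```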
*** signed8bit has joined #openstack-infra03:04
*** dims has quit IRC03:16
*** unicell has joined #openstack-infra03:29
*** unicell has quit IRC03:29
*** unicell has joined #openstack-infra03:29
*** david-lyle has quit IRC03:36
*** david-lyle has joined #openstack-infra03:38
*** signed8bit has quit IRC03:40
*** gokrokve has joined #openstack-infra03:42
*** gokrokve has quit IRC03:46
*** unicell has quit IRC03:47
*** bhuvan has joined #openstack-infra03:55
openstackgerritSteve Baker proposed a change to openstack-infra/devstack-gate: Enable dib service by default  https://review.openstack.org/9563604:01
*** zhiyan is now known as zhiyan_04:14
*** zhiyan_ is now known as zhiyan04:21
*** david-lyle has joined #openstack-infra04:24
*** nati_ueno has quit IRC04:31
*** nati_ueno has joined #openstack-infra04:32
*** gokrokve has joined #openstack-infra04:43
*** Longgeek has joined #openstack-infra04:45
jheskethmordred: ping04:46
*** nosnos has joined #openstack-infra04:46
*** gokrokve has quit IRC04:48
*** gokrokve has joined #openstack-infra04:48
*** e0ne has joined #openstack-infra04:48
*** igor_ has joined #openstack-infra04:55
*** ildikov has quit IRC04:56
*** lcheng_ has quit IRC05:01
*** hdd_ has quit IRC05:08
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Run checklang gate only on master  https://review.openstack.org/9564205:11
*** vkdrao has joined #openstack-infra05:12
*** gokrokve has quit IRC05:17
*** gokrokve has joined #openstack-infra05:18
*** gokrokve has quit IRC05:22
*** starmer has joined #openstack-infra05:28
stevebakersdague: building an image during devstack gate adds ~11 minutes http://logs.openstack.org/17/95617/3/check/check-tempest-dsvm-neutron-heat-slow/7ef53b4/logs/devstacklog.txt.gz#_2014-05-27_04_15_46_07305:31
*** wenlock has joined #openstack-infra05:35
*** zhiyan is now known as zhiyan_05:40
*** ildikov has joined #openstack-infra05:40
*** zhiyan_ is now known as zhiyan05:41
*** lcheng_ has joined #openstack-infra05:43
*** Ryan_Lane has joined #openstack-infra05:44
*** wenlock has quit IRC05:46
*** _nadya_ has joined #openstack-infra05:46
*** gokrokve has joined #openstack-infra05:48
*** yfried__ has joined #openstack-infra05:49
*** lcheng_ has quit IRC05:52
*** gokrokve has quit IRC05:54
*** igor_ has joined #openstack-infra05:56
*** Ryan_Lane has quit IRC05:57
*** yfried__ has quit IRC06:01
*** yfried__ has joined #openstack-infra06:01
*** yfried__ has quit IRC06:02
*** yfried__ has joined #openstack-infra06:03
*** yfried__ has quit IRC06:04
*** yfried__ has joined #openstack-infra06:04
*** yfried__ has quit IRC06:04
*** yfried has joined #openstack-infra06:05
*** _nadya_ has quit IRC06:08
*** yfried has quit IRC06:11
*** _nadya_ has joined #openstack-infra06:11
*** W00dy_ has quit IRC06:18
*** dstanek_zzz is now known as dstanek06:24
*** W00dy_ has joined #openstack-infra06:42
*** gokrokve has joined #openstack-infra06:49
*** rgerganov has joined #openstack-infra06:52
*** camunoz has joined #openstack-infra06:54
*** gokrokve has quit IRC06:54
*** david-lyle has quit IRC06:54
*** alugovoi has quit IRC06:55
*** igor_ has joined #openstack-infra06:57
*** bhuvan has quit IRC06:58
*** jhesketh has joined #openstack-infra06:59
*** igor_ has quit IRC07:01
*** afazekas has joined #openstack-infra07:08
mattoliverauNight all, have a great night/day everyone!07:16
*** jcoufal has joined #openstack-infra07:17
*** salv-orlando has joined #openstack-infra07:17
*** nati_ueno has quit IRC07:18
*** nati_ueno has joined #openstack-infra07:19
*** hashar has joined #openstack-infra07:19
*** Longgeek has joined #openstack-infra07:22
*** skolekonov has joined #openstack-infra07:22
*** nati_ueno has quit IRC07:23
*** wenlock has joined #openstack-infra07:25
*** flaper87|afk is now known as flaper8707:26
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Extract translations for log messages  https://review.openstack.org/9537707:27
*** praneshp has quit IRC07:28
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Extract translations for log messages  https://review.openstack.org/9537707:29
*** markmcclain has quit IRC07:35
*** wenlock has quit IRC07:35
*** talluri has joined #openstack-infra07:37
*** ihrachyshka has joined #openstack-infra07:38
*** matrohon has joined #openstack-infra07:43
*** Clabbe has joined #openstack-infra07:48
*** amotoki has joined #openstack-infra07:49
*** gokrokve has joined #openstack-infra07:49
*** nati_ueno has joined #openstack-infra07:50
*** Longgeek has quit IRC07:54
*** gokrokve has quit IRC07:54
*** nati_ueno has quit IRC07:54
*** jhesketh has joined #openstack-infra07:56
*** jpich has joined #openstack-infra08:02
*** jgallard has joined #openstack-infra08:04
*** rdopiera has joined #openstack-infra08:07
rdopierahello, I'm wondering what is the process for adding a package to the global requirements -- is it enough to send the patch for review, or should I also attend some meeting or write a bug or e-mail?08:08
fifieldtrdopiera, normally in addition to patches, I see emails on the -dev mailing list for those kind of things08:09
fifieldtbut I'm no expert :)08:09
*** talluri has joined #openstack-infra08:09
StevenKI added os-cloud-config to the global requirements with only a review and no discussion08:10
StevenKDepends what you're proposing to add, I guess.08:10
*** Hal_ has joined #openstack-infra08:10
*** talluri_ has joined #openstack-infra08:12
*** pblaho has joined #openstack-infra08:13
*** pblaho has joined #openstack-infra08:14
*** talluri has quit IRC08:16
*** derekh_ has joined #openstack-infra08:17
*** talluri_ has quit IRC08:18
*** Hal_ has quit IRC08:19
*** Hal has joined #openstack-infra08:20
*** Hal is now known as Guest5154408:20
rdopierafifieldt, StevenK: thank you08:23
*** talluri has joined #openstack-infra08:28
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Extract translations for log messages  https://review.openstack.org/9537708:35
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Do not run unneeded update_catalog  https://review.openstack.org/9568708:39
*** jamielennox is now known as jamielennox|away08:40
*** igor_ has joined #openstack-infra08:41
*** mrda is now known as mrda_away08:45
*** habib has joined #openstack-infra08:45
*** habib has quit IRC08:48
*** gokrokve has joined #openstack-infra08:49
*** nati_ueno has joined #openstack-infra08:50
*** doude has joined #openstack-infra08:53
*** nosnos has quit IRC08:53
*** gokrokve has quit IRC08:53
*** nati_ueno has quit IRC08:55
*** habib has joined #openstack-infra08:56
*** Longgeek has joined #openstack-infra09:00
*** Longgeek has quit IRC09:00
*** Longgeek has joined #openstack-infra09:00
*** habib has quit IRC09:07
*** jp_at_hp has joined #openstack-infra09:08
chmoueljogo, mordred: you guys in paris?09:09
*** jooools has joined #openstack-infra09:10
*** amotoki has quit IRC09:22
*** andreykurilin_ has quit IRC09:30
*** andreykurilin_ has joined #openstack-infra09:30
*** nosnos has joined #openstack-infra09:35
*** salv-orlando_ has joined #openstack-infra09:35
*** salv-orlando has quit IRC09:37
*** salv-orlando_ is now known as salv-orlando09:37
*** zhiyan is now known as zhiyan_09:39
openstackgerritA change was merged to openstack-infra/storyboard-webclient: Textareas now autoresize their height.  https://review.openstack.org/9293909:42
*** gokrokve has joined #openstack-infra09:49
*** gokrokve_ has joined #openstack-infra09:51
*** mayu_ has quit IRC09:52
*** gokrokve has quit IRC09:53
*** nati_ueno has quit IRC09:55
*** gokrokve_ has quit IRC09:55
*** ihrachyshka has joined #openstack-infra09:56
*** andreykurilin_ is now known as andreykurilin10:06
*** jgallard has quit IRC10:08
*** jgallard has joined #openstack-infra10:08
*** jgallard has quit IRC10:13
*** markmc has joined #openstack-infra10:16
*** ominakov has joined #openstack-infra10:24
*** salv-orlando has quit IRC10:29
*** vkdrao has quit IRC10:29
*** talluri_ has joined #openstack-infra10:29
openstackgerritSergey Lukjanov proposed a change to openstack-infra/config: Add sahara-specs repo  https://review.openstack.org/9571510:30
*** talluri has quit IRC10:33
*** talluri_ has quit IRC10:38
openstackgerritRadomir Dopieralski proposed a change to openstack-infra/config: Add XStatic-* projects with packaged static files for Horizon  https://review.openstack.org/9571610:38
*** Alexei_987 has left #openstack-infra10:46
openstackgerritRadomir Dopieralski proposed a change to openstack-infra/config: Add XStatic-* projects with packaged static files for Horizon  https://review.openstack.org/9571610:47
*** gokrokve has joined #openstack-infra10:49
*** nati_ueno has joined #openstack-infra10:52
*** nati_ueno has quit IRC10:56
*** Ajaeger has joined #openstack-infra10:56
*** yjiang has quit IRC10:58
*** e0ne has quit IRC11:04
*** zhiyan_ is now known as zhiyan11:17
openstackgerritRadomir Dopieralski proposed a change to openstack-infra/config: Add XStatic-* projects with packaged static files for Horizon  https://review.openstack.org/9571611:18
openstackgerritChristian Berendt proposed a change to openstack-infra/gerritbot: replace dict.iteritems() with six.iteritems(dict)  https://review.openstack.org/9572711:21
anteayachmouel: mordred is in palo alto still, he is sick11:23
anteayachmouel: I'm not sure where jogo is11:24
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard-webclient: Fix Unknown events in timeline  https://review.openstack.org/9572911:26
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Small fix to a method name  https://review.openstack.org/9573011:28
sdaguestevebaker: that's probably still ok for now, especially if it leads to more repeatable results11:32
sdaguehttp://logs.openstack.org/17/95617/3/check/check-tempest-dsvm-neutron-heat-slow/7ef53b4/logs/devstacklog.txt.gz#_2014-05-27_04_24_18_771 - grub install seems to take 2 minutes11:32
sdaguewhich is interesting11:32
*** yamahata has quit IRC11:36
*** mburned_out is now known as mburned11:42
openstackgerritJoão Cravo proposed a change to openstack-infra/jenkins-job-builder: Add support for reverse build trigger  https://review.openstack.org/9573411:45
openstackgerritRadomir Dopieralski proposed a change to openstack-infra/config: Add XStatic-* projects with packaged static files for Horizon  https://review.openstack.org/9571611:45
*** salv-orlando has joined #openstack-infra11:49
*** e0ne has joined #openstack-infra11:49
*** gokrokve has joined #openstack-infra11:49
*** gokrokve has quit IRC11:53
*** jgallard has joined #openstack-infra11:54
*** nati_uen_ has joined #openstack-infra11:55
*** _nadya_ has joined #openstack-infra11:55
*** yfried has joined #openstack-infra11:57
openstackgerritAntoine Musso proposed a change to stackforge/python-jenkins: Speed up job existence tests by fetching less info  https://review.openstack.org/8958911:58
*** nati_uen_ has quit IRC11:59
*** _nadya__ has joined #openstack-infra12:00
*** _nadya_ has quit IRC12:00
openstackgerritA change was merged to openstack-dev/pbr: Permit pre-release versions with git metadata  https://review.openstack.org/8085712:01
*** Ajaeger has quit IRC12:02
*** ArxCruz has joined #openstack-infra12:02
openstackgerritA change was merged to openstack-infra/devstack-gate: Modify horizon log copy for Fedora  https://review.openstack.org/9325112:03
*** lcostantino has joined #openstack-infra12:06
*** salv-orlando has quit IRC12:07
*** mwagner_lap has quit IRC12:08
*** flaper87 is now known as flaper87|afk12:11
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Remove unnecessary files  https://review.openstack.org/9574112:13
*** dstanek_zzz is now known as dstanek12:14
*** yaguang has quit IRC12:14
*** e0ne has joined #openstack-infra12:16
*** dims has quit IRC12:20
openstackgerritRadoslav Gerganov proposed a change to openstack-infra/config: Add button that shows/hides CI comments in Gerrit  https://review.openstack.org/9574312:20
*** ok_delta has joined #openstack-infra12:21
*** afazekas has joined #openstack-infra12:21
rgerganovhi folks12:24
rgerganovI am trying to add a piece of javascript in GerritSiteHeader.html that will add "Toggle CI" button12:25
rgerganovin order to show/hide CI comments12:25
rgerganovyou can see the patch above12:25
*** dprince has quit IRC12:25
rgerganovmy question is how can I test such a change? I can install it as userscript in my browser but now I am trying to push this on the server side12:27
*** salv-orlando has joined #openstack-infra12:27
*** yfried_ has quit IRC12:30
hasharrgerganov: Openstack has a Gerrit dev box so they can probably fetch your change there and try it out12:30
*** yfried_ has joined #openstack-infra12:30
rgerganovhashar, thanks. who would be the proper contact for this?12:31
hasharI have no clue :-]12:31
hasharrgerganov: and during the summit there was apparently a discussion to normalize the name of third party CI bots.12:31
rgerganovhashar, yes, that would be nice12:32
hasharI am sure I have seen a mail about normalization12:32
hasharbut can't find it :D12:32
*** yfried has quit IRC12:33
rgerganovI guess I can test my change by hacking an http proxy and inserting the script behind the scenes12:34
rgerganovbut I am looking for something easier :)12:34
hasharthey will wake/show up in a few hours12:34
hasharor you can ping the openstack-infra mailing list http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra12:35
rgerganovhashar, ok, thanks for the info12:35
*** yamahata has joined #openstack-infra12:42
*** yfried_ has quit IRC12:42
*** dims has joined #openstack-infra12:45
*** andreykurilin has quit IRC12:46
*** habib has joined #openstack-infra12:48
*** habib has quit IRC12:49
*** gokrokve has joined #openstack-infra12:49
*** habib has joined #openstack-infra12:49
*** dstanek_zzz is now known as dstanek12:50
*** jgrimm has quit IRC12:53
*** gokrokve has quit IRC12:54
*** nati_ueno has joined #openstack-infra12:55
*** radez_g0n3 is now known as radez12:56
*** nati_ueno has quit IRC13:00
dhellmannmordred: is the intent for the wheel publishing change to stop publishing tarballs? I don't see anything in https://review.openstack.org/#/c/56760 that's doing that, it seems to just be adding wheel publishing13:00
Alex_Gaynordhellmann: this is for our mirror, not pypi right?13:01
Alex_Gaynor(pip already prefers wheels, so it seems like adding them is more backwards compatible)13:01
dhellmannAlex_Gaynor: the scripts use twine so I thought it was pypi13:01
dhellmannAlex_Gaynor: https://review.openstack.org/#/c/56760/7/modules/openstack_project/files/jenkins_job_builder/config/pypi-jobs.yaml13:01
dhellmannalso "pypi-jobs"13:02
*** yfried_ has joined #openstack-infra13:02
*** yfried_ has quit IRC13:03
*** yfried__ has joined #openstack-infra13:03
fungirgerganov: dhellmann some of the reasoning behind building and uploading wheels was that we could publish wheels to pypi for prereleases (but not tarballs) since versions of pip which grok wheels also don't install prerelease version patterns by default13:03
*** _afazekas_mtg has quit IRC13:04
fungirgerganov: er, sorry, runaway tab completion there13:04
dhellmannfungi: that's what I thought, but that's not what mordred's change (which I'm trying to rebase) seems to be doing13:04
dhellmannfungi: it calls both upload scripts from the pypi-upload job13:05
*** julim has joined #openstack-infra13:05
fungidhellmann: well, it calls both scripts from the tarball builder... still digging to see whether anything besides the wheel publisher does anything with the results13:06
*** jcoufal has quit IRC13:07
*** jcoufal_ has joined #openstack-infra13:07
fungidhellmann: ahh, yeah it does seem to build tarballs too. i suspect we need a tarball-pypi uploader and a wheel-pypi uploader separate13:07
dhellmannfungi: makes sense, and I had actually already made that split to prevent tarball upload issues from breaking wheel upload issues13:08
*** pblaho has quit IRC13:08
fungidhellmann: yeah, i just meant they probably need to be separate jobs so we can add both to the release pipeline, but only wheels to the pre-release pipeline13:09
dhellmannah, right13:09
*** pblaho has joined #openstack-infra13:10
*** signed8bit has joined #openstack-infra13:11
dhellmannfungi: the {name}-tarball job template builds both tarballs and wheels, what do you think about renaming that {name}-dists or something similar?13:11
*** yfried__ is now known as yfried13:12
fungidhellmann: seems fine to me. i'm trying to think about what implications this has for my tarball validation and signing plan, but i haven't had enough coffee yet so i've got nothing13:12
dhellmannthe tarball builder does both, too13:12
fungibuilding both, and even publishing both on tarballs.o.o, seems okay13:13
dhellmannit looks like he piggybacked on the existing tarball stuff pretty heavily13:13
*** changbl has quit IRC13:13
dhellmannfungi: ok, I'll leave the builder alone13:13
fungiit's mainly which we upload to pypi that i'm concerned about being able to split up13:13
Alex_Gaynordhellmann: for pypi we definitely still want to upload sdists13:14
*** oomichi has quit IRC13:14
dhellmannAlex_Gaynor: sure13:15
dhellmannfungi: I'm having some trouble figuring out how to express the pipeline in the layout file13:15
dhellmanndo I want both the tarball-upload and wheel upload to run before the post-mirror-* jobs?13:16
*** _afazekas_mtg has joined #openstack-infra13:16
dhellmannand if so, how would I express that with the tree structure? maybe that's why he kept both uploads in one job?13:16
fungidhellmann: hrm, yeah right now zuul doesn't allow a child job with multiple parents. we've discussed the possibility of being able to set dependencies on job groups, but that's still on the drawing board13:17
fungier, not job groups in the jjb sense, but some as of yet not implemented grouping structure in zuul13:17
dhellmannfungi: how about this: http://paste.openstack.org/show/81690/13:17
dhellmann(the changes are in the pre-release and release sections)13:17
dhellmannI'll need to define separate "both-upload" and "wheel-upload" jobs13:18
fungidhellmann: yep, that would work13:18
dhellmannfungi: great, thanks13:18
*** _afazekas_mtg is now known as afazekas13:19
anteayahashar rgerganov that is the 4th item on today's infra meeting agenda: https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting13:20
anteayahashar rgerganov it would be great to have your input for that item13:20
hasharah I knew I see it somewhere :-]13:20
fungiAlex_Gaynor: you wouldn't happen to have any ideas on how to go about validating/reversing a wheel build to confirm whether it was built from tampered sources, given the wheel file and the tagged git repo, would you? i know how to roughly go about doing that for a tarball, but...13:20
anteayahashar: :D13:21
*** _nadya__ has quit IRC13:21
hasharanteaya: rgerganov wrote a short JS that would let one hide the reviews proposed by CI system. So I am sure normalizing the names will help a lot !13:21
*** andreykurilin has joined #openstack-infra13:21
Alex_Gaynorfungi: build a wheel from the sources and diff with the wheel? I'm not sure if that process is deterministic, but it's a good starting point /cc dstufft13:21
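A minimal sketch of the build-and-diff idea above, assuming the published wheel and a wheel rebuilt from the tagged git sources are both available as bytes. A wheel is a zip archive, so this compares per-member content hashes rather than raw file bytes, which sidesteps zip timestamp noise. The helper names (`wheel_digests`, `diff_wheels`, `make_archive`) are hypothetical, and whether a real wheel build is reproducible enough for the diff to come back empty is exactly the open question raised in the channel.

```python
import hashlib
import io
import zipfile


def wheel_digests(wheel_bytes):
    """Map each archive member name to the SHA-256 of its contents."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as archive:
        return {
            name: hashlib.sha256(archive.read(name)).hexdigest()
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        }


def diff_wheels(published, rebuilt):
    """Return the sorted member names whose contents differ or are missing."""
    a = wheel_digests(published)
    b = wheel_digests(rebuilt)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


def make_archive(files):
    """Build an in-memory zip that stands in for a real wheel (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()
```

An empty result from `diff_wheels` would mean every member matches; any listed name is a file to inspect by hand.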
anteayayes, I am following the email thread, there are many folks who would use that feature gladly13:22
*** _nadya_ has joined #openstack-infra13:32
fungidhellmann: http://git.openstack.org/cgit/openstack-infra/config/tree/tools/run-layout.sh13:33
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Name fields checked with regex  https://review.openstack.org/9576313:33
dhellmannfungi: thanks13:33
fungidhellmann: we run it in the gate-config-layout job with "./tools/run-layout.sh openstack-infra config"13:33
*** habib has joined #openstack-infra13:33
*** dripton_ is now known as dripton13:35
openstackgerritValeriy Ponomaryov proposed a change to openstack-infra/config: Enabled pylint job for manila project  https://review.openstack.org/9576513:36
fungidhellmann: and to the other question, i don't see any current examples of a project-template definition invoking another template, so probably not supported (yet anyway)13:36
dhellmannfungi: this is what I was going to try as a template: http://paste.openstack.org/show/81698/13:36
*** _nadya_ has quit IRC13:37
*** sballe_ has joined #openstack-infra13:37
fungidhellmann: looks sane enough13:37
*** habib has quit IRC13:38
*** habib has joined #openstack-infra13:39
*** sballe has quit IRC13:39
*** dkliban_gone is now known as dkliban13:40
mkodererdhellmann: ping13:41
*** sballe_ has quit IRC13:42
dhellmannmkoderer: pong13:42
*** flaper87|afk is now known as flaper8713:44
*** msabramo has joined #openstack-infra13:44
mkodererdhellmann: about your comment on https://review.openstack.org/#/c/95411/13:45
mkodererdhellmann: you are pointing to a merged patch that wasn't merged when I initially uploaded the patch13:46
mkodererhow can this be related? is the wrong sphinx version installed?13:47
*** prad_ has joined #openstack-infra13:47
*** jgrimm has joined #openstack-infra13:47
*** wenlock has joined #openstack-infra13:47
*** liyuezho has joined #openstack-infra13:48
*** liyuezho has quit IRC13:48
dhellmannmkoderer: time passes, it wasn't merged when I left the comment13:48
mkodererdhellmann: ah ok, so I reran the test but it's still failing13:48
dhellmannmkoderer: I think if you rebase your patch and resubmit it, it should work13:49
dhellmannmkoderer: hang on, let me check if my other stuff landed13:49
mkodererdhellmann: ok13:49
*** gokrokve has joined #openstack-infra13:49
*** jcoufal has joined #openstack-infra13:49
dhellmannmkoderer: yeah, https://review.openstack.org/#/c/95343/ landed so yours should work if you rebase13:49
mkodererdhellmann: k cool thx13:50
*** jcoufal_ has quit IRC13:52
*** doug-fish has joined #openstack-infra13:53
*** gokrokve has quit IRC13:53
*** igor_ has quit IRC13:54
*** homeless has joined #openstack-infra13:55
*** nati_ueno has joined #openstack-infra13:56
*** pblaho has joined #openstack-infra13:57
*** sballe has quit IRC13:59
*** annegentle has joined #openstack-infra13:59
*** Longgeek has quit IRC14:03
*** wenlock has quit IRC14:05
*** lcheng_ has joined #openstack-infra14:06
*** hashar has quit IRC14:06
*** Longgeek has joined #openstack-infra14:07
*** gokrokve has joined #openstack-infra14:07
*** heyongli has quit IRC14:07
*** gokrokve_ has joined #openstack-infra14:08
*** duran has joined #openstack-infra14:08
*** yfried has quit IRC14:09
*** gokrokve has quit IRC14:11
*** pblaho has quit IRC14:12
Alex_GaynorIs some part of zuul having troubles? Just sent a thing to gerrit and it hasn't produced a job14:12
fungiAlex_Gaynor: what thing?14:12
Alex_Gaynorfungi: https://review.openstack.org/#/c/95777/14:12
*** rdopiera has quit IRC14:13
fungiAlex_Gaynor: i do see "Queue lengths: 37 events, 317 results." on http://status.openstack.org/zuul/ so it may be dealing with a config reload or something... checking14:13
Alex_Gaynorfungi: doh, I forgot those lengths were there; thanks!14:13
fungiAlex_Gaynor: the results queue seems to be falling, fwiw, so it will probably right itself here shortly14:14
Alex_Gaynorfungi: yeah, sorry about the noise14:14
*** Kiall_ is now known as Kiall14:14
*** habib has quit IRC14:15
fungiwell, those swift changes in the gate seem to be cycling through devstack-precise nodes at a steady clip, and we did just hit our nodepool image rebuild time14:15
fungiplaying "spot the broken" now14:15
corvusfungi: i'm online if you need a hand14:16
Alex_GaynorDid a jenkins worker die?14:16
*** corvus is now known as jeblair14:16
*** habib has joined #openstack-infra14:17
fungidoesn't seem likely to be a new image causing issues... none have been building long enough to become ready and spawn new nodes. must be something more external14:18
*** msabramo has quit IRC14:18
*** yaguang has joined #openstack-infra14:20
*** lcheng_ has quit IRC14:22
*** lcheng_ has joined #openstack-infra14:24
*** timrc is now known as timrc-afk14:24
*** habib has quit IRC14:25
*** habib has joined #openstack-infra14:25
*** james_li has joined #openstack-infra14:26
jeblairfungi: zuul just lost its gearman connection14:26
fungiand aborted all running jobs14:26
jeblairwhich, on the one hand is bad, but on the other, hopefully it means we have enough log entries this time to debug it14:26
*** ihrachyshka has quit IRC14:26
fungithat's why i'm not finding any new jobs exhibiting an obvious problem14:26
jeblairfungi: that _just_ happened though; i'm not sure if that's related to anything prior14:27
jeblairi'm going to eat breakfast then dive into debugging that14:28
*** sileht has quit IRC14:28
fungisounds good. i'll see if i can dig up anything14:28
*** pcrews has joined #openstack-infra14:29
*** habib has quit IRC14:29
*** gokrokve has joined #openstack-infra14:33
*** atiwari has joined #openstack-infra14:35
openstackgerritMatt Riedemann proposed a change to openstack-infra/elastic-recheck: Add query for Neutron SSH EOFError bug 1323658  https://review.openstack.org/9578214:35
uvirtbotLaunchpad bug 1323658 in neutron "SSH EOFError - Public network connectivity check failed" [Undecided,New] https://launchpad.net/bugs/132365814:35
*** gokrokve_ has quit IRC14:36
*** otherwiseguy has joined #openstack-infra14:36
phschwartzmorning infra14:37
anteayamorning phschwartz14:37
*** sileht has quit IRC14:37
openstackgerritDoug Hellmann proposed a change to openstack-infra/config: Create and upload wheels  https://review.openstack.org/5676014:39
openstackgerritDoug Hellmann proposed a change to openstack-infra/config: Make it possible to run zuul layout test locally  https://review.openstack.org/9578314:39
*** vhoward has left #openstack-infra14:39
*** sileht has joined #openstack-infra14:39
*** vhoward has joined #openstack-infra14:39
dhellmannfungi: ^^ changes related to what we were discussing earlier14:39
jeblairfungi: ah that might explain the event backlog then14:39
openstackgerritSergey Lukjanov proposed a change to openstack-infra/config: Add sahara-specs repo  https://review.openstack.org/9571514:40
fungiseeing what else of note preceded it in the log14:40
*** Longgeek_ has joined #openstack-infra14:41
SergeyLukjanovanteaya, thanks for the tip14:41
*** msabramo has joined #openstack-infra14:42
anteayaSergeyLukjanov: np14:44
*** rgerganov has quit IRC14:44
* anteaya reviews again14:44
anteayaSergeyLukjanov: thanks14:44
SergeyLukjanovanteaya, thx ;)14:45
*** sballe_ has quit IRC14:45
*** habib has joined #openstack-infra14:45
*** alugovoi has joined #openstack-infra14:46
*** habib has quit IRC14:46
*** habib has joined #openstack-infra14:47
*** skolekonov has quit IRC14:48
*** sileht has quit IRC14:49
*** wenlock has joined #openstack-infra14:49
mordreddhellmann: woot!14:49
*** timrc-afk is now known as timrc14:50
cody-somervilleclarkb: mordred: Can you review https://review.openstack.org/#/c/93870/ s'il-vous-plait?  :)14:51
dhellmannmordred: those layout changes are a little scary, so please look them over closely -- I did manage to run some tests locally that make me think I have them right, but still.14:52
* signed8bit didn't mean to type that... wrong focus14:52
*** thedodd has joined #openstack-infra14:53
jpichHello! Is there a way to block or report LP users? Someone filed a couple of bugs containing only spam. Marking them as Invalid is fine for now but it's going to get irritating14:55
mordredcody-somerville: lgtm14:55
fungijpich: either try to get someone's attention in #launchpad or file a bug against "launchpad itself" noting the problem behavior14:55
jeblairfungi, jpich: opening a "question" on the "launchpad itself" may be better than filing a bug?14:56
ttxjeblair: yes, they are actually reactive on "questions"14:56
ttxwhereas bugs... not so much14:57
fungioops, yes i forgot it was lp answers not bugs they used for support requests14:57
*** nati_ueno has joined #openstack-infra14:57
*** otherwiseguy has quit IRC14:57
jpichfungi jeblair ttx: Fair enough, will do. Thanks!14:57
*** otherwiseguy has joined #openstack-infra14:58
openstackgerritDan Prince proposed a change to openstack-infra/config: Add yum.openstack.org lightweight Fedora 20 mirror  https://review.openstack.org/9087514:58
openstackgerritDan Prince proposed a change to openstack-infra/config: Import puppet-yum project  https://review.openstack.org/9087414:59
openstackgerritDan Prince proposed a change to openstack-infra/config: Install the openstackci-yum module.  https://review.openstack.org/9578714:59
*** Longgeek_ has quit IRC15:00
*** alugovoi has quit IRC15:00
*** Longgeek has joined #openstack-infra15:01
*** lascii is now known as alaski15:01
*** nati_ueno has quit IRC15:01
*** malini is now known as malini_afk15:02
*** malini_afk is now known as malini15:03
*** yaguang has quit IRC15:03
*** zhiyan_ is now known as zhiyan15:05
*** malini has left #openstack-infra15:07
*** terryw has joined #openstack-infra15:10
Alex_Gaynorfungi: I think CI jobs actually aren't starting this time :-)15:10
*** otherwiseguy has quit IRC15:11
*** andreykurilin has quit IRC15:11
*** BadCub has joined #openstack-infra15:12
*** talluri has joined #openstack-infra15:13
mordreddhellmann: the changes look good so far!15:14
*** zhiyan is now known as zhiyan_15:14
annegentlettx: around?15:15
mordreddhellmann: why not have publish-wheels for openstack/sahara?15:15
dhellmannmordred: I was trying to replace the jobs that were there, without making decisions about adding new ones. Did they have a pre-release publish job before?15:16
dhellmannmordred: ah, they have a tarball but they weren't doing any mirror syncing15:17
*** che-arne has joined #openstack-infra15:17
*** rfolco has joined #openstack-infra15:18
*** talluri has quit IRC15:18
*** afazekas has quit IRC15:18
fungiAlex_Gaynor: it looks like we've had zuul internal gearman timeouts between 14:06 and 14:44, so we may also have considerable node starvation from all the job restarts15:20
Alex_Gaynorfungi: but it'll heal itself?15:20
fungiAlex_Gaynor: not sure yet--still digging but it seems to have stopped flailing15:20
Alex_Gaynorfungi: ok, good to know -- I won't bother folks in the future if patience is all it takes15:21
annegentleHi all, I'm trying to push a new tag for openstack-doc-tools. It's 0.15, which doesn't exist on github (https://github.com/openstack/openstack-doc-tools/tags) and I don't see how to view tags on gerrit web view... any ideas for me?15:21
openstackgerritMatt Riedemann proposed a change to openstack-infra/config: Index logs/tempest.txt for logstash queries  https://review.openstack.org/9579615:21
mriedemsdague: mtreinish: clarkb: ^15:21
annegentleThe error I'm seeing is "error: src refspec 0.15 matches more than one."15:21
ttxannegentle: multiplexing, but yes15:21
annegentlettx: ok I was looking for guidance on the defcore task, I've worked on many of those line items but wanted to know what we want to do in the tc meeting15:22
annegentlettx: is it research for the red rows?15:22
ttxWe want to verify scores on the "TC direction" column15:22
ttxannegentle: especially the 0.515:23
*** gyee has joined #openstack-infra15:23
ttxannegentle: the red lines are first for PTLs to fill, we'll step up if it's blocked15:23
annegentlettx: ok got it, and does a 1 mean "matches TC direction"15:23
fungiAlex_Gaynor: no, please do bother us ;)15:23
annegentlettx: ok good15:23
annegentlefungi: oh good then I'll bother you with my refspec 0.15 multi match! :)15:23
ttxannegentle: yes, 1 means we care15:23
ttx0 means it's probably deprecated tomorrow15:24
ttx0.5 is "will die some day"15:24
fungiannegentle: i have no idea what that means, but sure, why not15:24
annegentlettx: and 0.5 means they didn't know?15:24
ttxat least that's how I'd score it15:24
clarkbmriedem: would it be mad of me to -1 that on grounds of that log file being huge and full of noise?15:24
annegentlefungi: oh right it's an error I'm seeing when trying to push a tag to gerrit for openstack-doc-tools (scroll up)15:24
annegentleI'm multitasking too much :)15:24
clarkbINFO started http connection over and over and over15:24
annegentlettx: do we get to use "die in a fire"15:24
annegentlettx: actually I don't feel that strongly about any of those to mark them diaf15:25
ttxheh, yes15:25
fungiannegentle: aha. i missed that earlier15:25
ttxannegentle: I found the scores mostly correct imho15:25
mriedemclarkb: is there a way to configure it such that it only indexes on certain log levels?15:25
ttxannegentle: but we'll discuss them later15:25
annegentlettx: yeah I think so too (and I helped with some of them so I hope I remain consistent ha)15:25
mriedemclarkb: like WARNING and higher?15:25
annegentlettx: sounds good thanks15:25
*** mkerrin1 has quit IRC15:26
clarkbmriedem: it does INFO and higher15:26
mriedemclarkb: i know, but wondering if i can tell the tooling to ignore everything below WARNING15:26
mriedemuntil we can clean up the logging15:26
annegentlefungi: I definitely have a 0.15 locally15:26
clarkbI don't think you can configure that today15:26
annegentlefungi: just can't figure out why git thinks there's already one remotely15:26
clarkbbut its just python so that can be changed15:26
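A rough sketch of the "ignore everything below WARNING" filter being discussed, not the actual logstash tooling. It assumes each log line carries a standard Python level name somewhere in its text and drops lines with no level token; both are assumptions. `keep_line` and `filter_log` are hypothetical names.

```python
import re

# Numeric values mirror the standard Python logging levels.
LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}
LEVEL_RE = re.compile(r"\b(DEBUG|INFO|WARNING|ERROR|CRITICAL)\b")


def keep_line(line, minimum="WARNING"):
    """True if the first level token on the line is at or above the minimum."""
    match = LEVEL_RE.search(line)
    if match is None:
        return False  # assumption: lines without a level token are not indexed
    return LEVELS[match.group(1)] >= LEVELS[minimum]


def filter_log(lines, minimum="WARNING"):
    """Keep only the lines that would be indexed at the given threshold."""
    return [line for line in lines if keep_line(line, minimum)]
```

With a threshold of WARNING, the repeated "INFO started http connection" lines would be skipped without touching tempest's own logging.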
fungiannegentle: what command are you running that you're getting that error? 'git tag ...' or 'git push ...' or something else?15:28
annegentlefungi:  git push gerrit 0.1515:28
mriedemclarkb: let me take a look at the tempest log at INFO level and see if there should also be a change in tempest, i.e. INFO started http connection to debug15:28
annegentlefungi: first, git tag -s 0.1515:28
mriedemclarkb: then i can get all the deps lined up15:28
*** mrodden has quit IRC15:29
fungiannegentle: so that error seems to imply that you have more than one local ref named "0.15"15:29
fungiannegentle: probably a branch?15:29
mriedemclarkb: feel free to -1 though15:29
mriedemuntil that happens15:29
fungiannegentle: does 'git branch -a | grep 0\.15' give you something?15:30
*** mkerrin has joined #openstack-infra15:30
annegentlefungi: yep, I have two branches with 0.15 in the name15:30
*** Ajaeger has joined #openstack-infra15:30
annegentlefungi: one from the first time I tried and got the error, then deleted the tag locally, then tried again with the second branch15:30
fungiannegentle: try renaming those with 'git branch -m old_branch new_branch'15:31
annegentlefungi: ok those are renamed now15:32
annegentlefungi: trying again15:33
fungiannegentle: basically git is complaining that it doesn't know whether you're wanting to push a tag named 0.15 or a branch named 0.1515:33
annegentlefungi: ha that did it. Dangit.15:33
fungisince you had both15:33
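A toy model of the ambiguity described above, assuming a short name is tried against a fixed list of ref prefixes and refused when more than one ref matches. This is an illustration only, not git's implementation; the prefix list and the function names are assumptions. Besides renaming the branch, spelling out the full ref (for example `git push gerrit refs/tags/0.15`) is another way to remove the ambiguity.

```python
# Assumed lookup prefixes for a short ref name (illustrative, not git's code).
PREFIXES = ("", "refs/", "refs/tags/", "refs/heads/", "refs/remotes/")


def match_short_name(name, refs):
    """Return every known full ref that the given name could expand to."""
    candidates = {prefix + name for prefix in PREFIXES}
    return [ref for ref in refs if ref in candidates]


def resolve_for_push(name, refs):
    """Resolve a push source, refusing to guess when several refs match."""
    matches = match_short_name(name, refs)
    if len(matches) > 1:
        raise ValueError("src refspec %s matches more than one" % name)
    if not matches:
        raise ValueError("src refspec %s does not match any" % name)
    return matches[0]
```

With both `refs/heads/0.15` and `refs/tags/0.15` present, the short name `0.15` raises, while the full name `refs/tags/0.15` resolves to the tag.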
annegentlefungi: thanks much! Something about memorial day made me not think about that.15:33
annegentlefungi: :)15:33
Ajaegerfungi, great that you could solve the mystery!15:33
fungiannegentle: lots of things about memorial day just made me not think, so i understand15:34
fungiannegentle: Ajaeger: yep, i see your new tag now at http://git.openstack.org/cgit/openstack/openstack-doc-tools15:35
*** unicell has joined #openstack-infra15:35
*** NithyaG is now known as NithyaG_afk15:35
Ajaegerfungi, in that case I have a few more favors to ask ;) Could you review https://review.openstack.org/95414, please?15:36
AjaegerWe're getting rid of the special asciidoc handling - and openstack-doc-tools 0.15 contains a corresponding change.15:36
*** pdmars has joined #openstack-infra15:36
*** alugovoi has joined #openstack-infra15:37
AjaegerAnybody else around that can review https://review.openstack.org/95414 as well, please?15:37
*** msabramo has quit IRC15:39
*** jgrimm has quit IRC15:39
anteayaAjaeger: +115:39
*** zhiyan_ is now known as zhiyan15:39
Ajaegerthanks, anteaya15:40
mriedemclarkb: bug 1323726 - 17880 hits of tempest.common.rest_client in one log file, so yeah, a bit excessive15:46
mriedemi'll make a change there15:46
*** markwash has joined #openstack-infra15:49
*** zhiyan is now known as zhiyan_15:51
mordreddhellmann: I think the new idea in my head is that we should go ahead and start doing the wheels for pre-releases and tarballs for releases for everything now ... but we can also take that as a second pass15:51
dhellmannmordred: yeah, that makes sense15:51
mordreddhellmann: like, if we have access to the pypi account for something, we should just across the board do the pre-release/release pattern15:51
* dhellmann nods15:52
*** jgrimm has joined #openstack-infra15:52
*** zehicle_at_dell has quit IRC15:52
*** msabramo has quit IRC15:53
mordreddhellmann: in fact, I wonder if maybe we shouldn't just collapse your two layout templates into a single publish-to-pypi one- so that if we publish to pypi for someone, we do it one way15:53
mordred(or two - publish-to-pypi and publish-2only)15:54
Alex_Gaynormordred, dhellmann: I know we said that uploading wheels of pre-releases would be fine, but when Django tried to do it, we hit some snag, trying to remember what it was /cc dstufft15:54
dhellmannmordred: that would be a clean way to do it; should I do it in this patch?15:54
mordredAlex_Gaynor: oh, that would be a good thing to know15:54
fungidigging into zuul's gearman debug logs, the first ERROR entry is about an unknown job at 14:06, right when we got the first gearman timeouts in the zuul daemon log15:55
Alex_Gaynormordred: I'm hoping donald will remember, in his role as "Keeper of packaging lore"15:55
mordreddhellmann: yeah - let's do it in this patch - I just said the same thing in a review too15:55
*** msabramo has joined #openstack-infra15:55
dhellmannmordred: ok15:55
*** hashar has joined #openstack-infra15:56
mordreddhellmann: but overall, the patch looks awesome15:56
*** zz_gondoi is now known as gondoi15:59
*** alexpilotti has joined #openstack-infra15:59
*** IvanBerezovskiy has left #openstack-infra15:59
*** Ryan_Lane has joined #openstack-infra16:00
*** jlibosva has quit IRC16:02
*** gondoi has quit IRC16:04
*** doude has quit IRC16:09
ildikovI found scripts and gate job templates in the infra config repo, but I'm not 100% sure I found the place where a change could be applied to solve this issue, and I'm also not familiar with how this could be tested locally16:09
anteayaildikov: link to what you found?16:10
ildikovanteaya: https://github.com/openstack-infra/config/blob/master/modules/openstack_project/files/slave_scripts/run-docs.sh16:10
ildikovanteaya: the run-docs script seems to run the sphinx build command and the other one is the template for the docs gate job16:11
jeblairmordred: okay, gertty HEAD is now a good place; though the series at 95769 is also good if you want to try out the hyperlink stuff i've been working on16:11
*** marcoemorais has joined #openstack-infra16:11
*** Ryan_Lane has quit IRC16:11
anteayaildikov: so far so good16:12
ildikovanteaya: I cannot really see the mapping between these two, and it's also not 100% clear to me where I should change the script to check the logs of the sphinx build16:12
anteayaildikov: this is the template for many jobs to run the docs job16:12
anteayathe question is why is it breaking for ceilometer doc jobs16:13
anteayado you have a patch url that has a broken docs job?16:13
ildikovanteaya: it does not fail in case of errors or at least for one specific error for sure16:13
ildikovanteaya: one sec, I will try to find the last occurrence16:13
*** nati_ueno has joined #openstack-infra16:14
anteayaildikov: for reference: http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/zuul/layout.yaml#n63216:15
ildikovanteaya: https://review.openstack.org/#/c/92365/6/doc/source/measurements.rst16:15
*** zz_gondoi is now known as gondoi16:16
anteayaildikov: note that python-jobs are defined here: http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/zuul/layout.yaml#n23816:16
*** markmc has quit IRC16:16
openstackgerritJames E. Blair proposed a change to stackforge/gertty: Handle (ignore) no-diff renames  https://review.openstack.org/9576916:16
ildikovanteaya: the patch is corrected now, but for instance in patch set 6 the table for Cinder was not correctly formed16:16
openstackgerritJames E. Blair proposed a change to stackforge/gertty: Add patchset selection in diff  https://review.openstack.org/9576816:16
openstackgerritJames E. Blair proposed a change to stackforge/gertty: Correct a problem with tables at very small widths  https://review.openstack.org/9576716:16
openstackgerritJames E. Blair proposed a change to stackforge/gertty: Add hyperlinks  https://review.openstack.org/9576616:16
morganfainbergi need to poke at gertty more.16:16
ildikovanteaya: and in this case when I ran sphinx build locally, it throws an Error message with malformed table text16:16
ildikovanteaya: but the docs gate job reports success and the result is that the affected table is missing from the generated doc16:17
anteayaildikov: here is the docs build for patchset 6: http://docs-draft.openstack.org/65/92365/6/check/gate-ceilometer-docs/cc989fb/doc/build/html/16:17
anteayacan you show me the broken parts?16:17
*** amotoki has joined #openstack-infra16:17
ildikovanteaya: http://docs-draft.openstack.org/65/92365/6/check/gate-ceilometer-docs/cc989fb/doc/build/html/measurements.html#volume-cinder16:18
openstackgerritDoug Hellmann proposed a change to openstack-infra/config: Create and upload wheels  https://review.openstack.org/5676016:18
openstackgerritDoug Hellmann proposed a change to openstack-infra/config: Make it possible to run zuul layout test locally  https://review.openstack.org/9578316:18
clarkbhttp://logs.openstack.org/65/92365/10/gate/gate-ceilometer-docs/0e286b8/console.html.gz is the build log. ceilometer docs isnt failing on warnings. not sure if this would help16:18
dhellmannmordred: ^^16:18
clarkbbut there are no errors in the build16:18
ildikovanteaya: this is the result of the comment I added in the previous link of that patch16:18
*** ihrachyshka has joined #openstack-infra16:19
anteayaildikov: I see tables16:19
*** talluri has quit IRC16:19
ildikovanteaya: the Cinder table is missing16:19
anteayayeah I confirm the cinder table is missing16:20
ildikovanteaya: in the correct docs it looks like this: http://docs-draft.openstack.org/65/92365/10/gate/gate-ceilometer-docs/0e286b8/doc/build/html/measurements.html#volume-cinder16:20
mordredjeblair:     raise MultipleInvalid([e])16:20
ildikovanteaya: ah, ok16:20
mordredvoluptuous.MultipleInvalid: expected a dictionary16:20
clarkbso you want python setup.py build_sphinx to catch that16:20
*** habib has quit IRC16:21
mordredjeblair: I moved my config to yaml as per instructions - and started gertty and got that - known issue? or should I debug?16:21
jeblairmordred: unknown issue16:21
*** thedodd has quit IRC16:21
*** thedodd has joined #openstack-infra16:21
mordredjeblair: I see it. my bad16:21
anteayaildikov: as clarkb pointed out the build log for the docs job throws no warnings16:22
anteayaildikov: according to the log the build succeeds16:22
ildikovclarkb: and if I make it catch this issue then it will mark the doc job as failed, right?16:22
clarkbildikov: anteaya if you can get tox -evenv -- python setup.py build_sphinx to catch that then it will be gated on16:22
mordredjeblair: I missed the top level "servers"16:22
*** e0ne has quit IRC16:22
ildikovclarkb: anteaya: I use sphinx build to check my doc related patches, so it throws an error, I never tried to run it in any other way16:23
clarkbanteaya: it has 10 warnings no errors16:23
*** e0ne has joined #openstack-infra16:23
clarkbildikov it may be related to the version of sphinx16:23
ildikovclarkb: hmm, ok, I will check that also16:24
anteayaclarkb: looking at the testenv for docs: http://git.openstack.org/cgit/openstack/ceilometer/tree/tox.ini16:25
clarkbalso it should gate on warnings if using pbr properly16:25
*** ihrachyshka has quit IRC16:25
anteayaclarkb: would a change need to happen in the tox.ini for ceilometer?16:25
clarkbmordred ^16:25
*** ihrachyshka has joined #openstack-infra16:25
clarkbanteaya no we dont use a docs venv16:25
openstackgerritDoug Hellmann proposed a change to openstack-infra/config: Add zuul template for rtfd jobs  https://review.openstack.org/9582516:25
*** jgallard has quit IRC16:26
clarkbwe use tox -evenv -- python setup.py build_sphinx16:26
anteayaclarkb: is the ceilometer docs venv in the tox.ini file mis-leading?16:26
*** derekh_ has quit IRC16:26
hashariirc the docs/doc testenv in repository is just a convenience for devs16:27
*** e0ne has quit IRC16:27
hasharso they can easily generate doc by tox -edocs16:27
clarkbpotentially since the venv virtualenv does some extra things16:27
clarkbit doesnt do that typically but ceilometer appears to have a snowflake16:28
anteayamorning zaro16:29
*** ArxCruz has quit IRC16:29
mordredclarkb: well, wow16:29
ildikovhmm, does the venv have anything to do with the false success?16:29
mordredthat means that ceilo docs jobs in our stuff are going to be trying to start mongo16:29
ildikovsorry for the silly question, I'm not the expert of this part16:30
clarkband who knows what else16:30
fungijeblair: i think this could be where things started to go wrong, but i don't see any smoking gun in the gearman-server.log (this was a fraction of a second before the first ERROR in that log): http://paste.openstack.org/show/81722/ (trying now to correlate with the 5 other disconnects we saw after that one during the time of troubles)16:30
ildikovhmm, anyhow that sounds bad :S16:30
mordredclarkb: I'm not 100% sure what the right choice is here - but I think we might need to dive in16:30
mordredI'm not sure I believe it should be tox's job to start mongodb - but I will admit I have spent all of 30 seconds thinking about this16:31
anteayamordred: where do you see tox starts mongodb?16:31
anteayaI am not seeing that16:31
jeblairfungi: thx16:31
mordredanteaya: in setup-test-env.sh16:32
clarkbanteaya: the setup env thing16:32
anteayaah thanks16:32
*** jcoufal has quit IRC16:32
ildikovanteaya: testenv:venv would be my vote16:32
clarkbfungi: did sys or kern log log anything ?16:33
fungiclarkb: crickets16:33
clarkbI wonder if the host had a sad16:33
clarkbildikov does tox -e docs do the right thing?16:34
clarkbif not then the setup test env thing is probably not to blame16:34
*** gokrokve has quit IRC16:34
*** wenlock_ has joined #openstack-infra16:35
clarkbjust the venv one matters for this16:36
clarkbthe docs section is unused by the gate16:36
*** terryw is now known as otherwiseguy16:36
ildikovclarkb: yeap, you're right, I messed it up a bit16:36
jeblairfungi: ~corvus/logs has just the 14:xx hour of log entries16:38
*** BadCub has left #openstack-infra16:38
ildikovclarkb: is the deps section needed there, for instance? I mean in the 'venv', as it is already added in testenv at the beginning16:38
clarkbI think it may need to be there if you override something16:39
fungijeblair: thanks! much faster. i should have done something similar16:39
clarkbbut otherwise it should be fine without it16:39
*** wenlock_ has quit IRC16:40
jeblairfungi: it's interesting that the packets that zuul times out on are eventually received by the gearman server16:41
fungiclearly communication isn't kaput, just taking too long16:42
*** zhiyan_ is now known as zhiyan16:42
fungifocusing on the conversation leading up to each disconnect, i don't see any commonalities whatsoever... http://paste.openstack.org/show/81728/16:43
fungiall in various states16:43
*** lakshmiS has joined #openstack-infra16:44
clarkbarg being hauled to breakfast with family before returning to northern lands16:44
*** msabramo has joined #openstack-infra16:44
clarkbback before meeting16:44
fungihave fun, clarkb16:44
ildikovclarkb: hmm, I'm not 100% sure what is overridden where, so I will leave it as is for the first round16:44
ildikovclarkb: anteaya: thanks for the help and the pointers, I will try to play a bit with tox then and see what happens, I guess there should be the solution somewhere16:46
* anteaya nods16:46
fungithis is leading me to conclude there was something environmental affecting zuul's local performance around that time, slowing it down just enough that some local connections exceeded the 30-second timeout by ~10%16:46
*** pdmars has quit IRC16:47
anteayaI hope you find a solution16:47
fungihttps://status.rackspace.com/ looks sort of bad, but nothing obvious there which would impact dfw16:48
*** olaph has joined #openstack-infra16:48
*** w_ has joined #openstack-infra16:48
ildikovanteaya: I will come back with some new questions if not, but hopefully it will not be needed, I'm a bit confused now with that venv section, but anyway I'm ready to play with it a bit :)16:48
ildikovanteaya: so thanks again :)16:49
*** yfried has joined #openstack-infra16:49
fungioh! https://status.rackspace.com/index/viewincidents?start=1401163200 "10:59 AM EDT Our Engineers have identified an issue with one of the storage devices in the DFW1 data center which is causing some sites to experience slow response or timeouts."16:49
fungimaybe? about the right timeframe16:49
*** yfried has quit IRC16:50
*** yfried has joined #openstack-infra16:50
anteayaildikov: questions are always welcome16:51
jeblairfungi: i'm seeing geard logging mostly idle during those timeframes too; so either it's affected by the 'paused host', or it's spending 35 seconds doing something it's not logging16:52
zaroclarkb: would you be able to comment? https://review.openstack.org/#/c/92773/16:53
jeblairfungi: (i wonder what a live migration actually looks like from the host pov)16:53
fungijeblair: omg, it's full of clouds16:53
*** nati_ueno has quit IRC16:53
*** dkliban_brb is now known as dkliban16:54
jeblairfungi: http://paste.openstack.org/show/81729/  that's a call/response pair to a jenkins master separated by 35 seconds16:54
clarkbzaro: yes use the macro in the defaults16:54
*** yamahata has quit IRC16:55
mordreddhellmann: your patches make me happy16:55
jeblairfungi: do you feel that /var/log/jenkins/jenkins.log on jenkins07 should be greater than zero bytes?16:55
mordreddhellmann: they also raise the question of what we should be doing with rtfd again - especially since there are two repos using rtfd that may be able to become "official"16:56
mordredwhich means it might be time for us to actually take a stance16:56
dhellmannmordred: yeah, some of those oslo libs have docs published there as legacy urls; I don't think it hurts to keep them.16:57
*** ildikov has quit IRC16:57
*** hogepodge has joined #openstack-infra16:57
*** dizquierdo has quit IRC16:58
mordredAjaeger: looking16:58
jeblairmordred: it uses names now16:58
Alex_Gaynormordred: you don't need IDs, anymore, the project name is enough16:58
fungijeblair: ooh, good catch16:59
*** alugovoi has quit IRC16:59
dhellmannjeblair: does it still require a project to be registered manually? I guess that's no different than pypi.16:59
Ajaegerthanks, mordred16:59
mordredoh! neat16:59
Alex_Gaynordhellmann: yeah, but, like, you can write some software16:59
mordreddhellmann: I've been meaning to add pypi project registration to manage-projects too16:59
Alex_Gaynordhellmann: hell, it'd probably even be easy to add an API end point to rtd, I'm sure they'd take it16:59
dhellmannAlex_Gaynor: true16:59
openstackgerritBen Nemec proposed a change to openstack-infra/config: Add dib-utils project  https://review.openstack.org/9028116:59
jeblairmordred: i'm in favor of not using rtfd because it's an unnecessary extra thing to deal with17:00
*** _nadya_ has joined #openstack-infra17:00
fungijeblair: /var/log/jenkins/jenkins.log on jenkins07 is owned by a nonexistent user/group17:01
fungijeblair: which would explain why the daemon can't write to it17:01
mordredjeblair: I think the thing I'm least in favor of is both dealing with it and also not just having it be done everywhere17:01
jeblairfungi: i'm going to see if any of the other failures had a similar interaction with a different jenkins17:02
fungijeblair: yeah, i think the zero-byte log on jenkins07 is merely an unfortunate coincidence. i'll fix the ownership on it to something sane and consistent with the other masters17:03
fungilooks like it's probably been broken since january 26 (was that when we built it?)17:04
fungilooks like it17:04
jeblairfungi: istr some package/puppet conflicts around users; maybe we didn't do some needed cleanup on that host17:04
fungilast modified time on /etc/hosts is from the same week anyway17:04
fungiseems quite likely17:05
fungiit's owned by jenkins:jenkins now, but will probably need a restart before it writes to it again17:05
mordredjeblair: I think my argument in favor of rtfd is that, like pypi, it's a thing that "python projects use" - and I think that where we can consistently engage with python ecosystem might help mitigate the perception that we're over off in the corner. but I don't feel strongly enough about it to die on a hill or anything17:06
fungidoing a sweep of that host for anything else under the previous user/group and will clean that up too17:07
anteayamordred: don't die on a hill for that17:07
Alex_Gaynormordred: Do you feel strongly enough to die in a place that isn't a hill?17:07
fungiso presumably jenkins got started under the wrong uid/gid and the logs it created were left behind with incorrect ownership17:08
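The sweep fungi describes (looking for anything else on the host still owned by a uid/gid that no longer exists) could be sketched roughly as below. This is only an illustration, not what was actually run on jenkins07; the function name `find_orphaned` and its injectable `uid_exists`/`gid_exists` checks are assumptions made for the example.

```python
import grp
import os
import pwd


def _uid_exists(uid):
    """True if the uid resolves to a passwd entry on this host."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def _gid_exists(gid):
    """True if the gid resolves to a group entry on this host."""
    try:
        grp.getgrgid(gid)
        return True
    except KeyError:
        return False


def find_orphaned(root, uid_exists=_uid_exists, gid_exists=_gid_exists):
    """Yield (path, uid, gid) for entries below root whose owner or group is unknown.

    The root directory itself is not checked, only what is beneath it.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if not uid_exists(st.st_uid) or not gid_exists(st.st_gid):
                yield path, st.st_uid, st.st_gid
```

Something like `for path, uid, gid in find_orphaned("/var/log/jenkins"): print(path, uid, gid)` would then list candidates for a chown; the actual ownership fix is left as a manual step.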
mordredAlex_Gaynor: I feel strongly that I should only die on hills17:09
fungi(mordred of the hill people)17:11
zaroclarkb: using macro in defaults will not work the same way because the default timeout sets timeout value to 30, while the macro sets it to {timeout}.  if we use the macro in the default (as you suggest) then no default timeout value would be set.17:11
anteayamordred: stay away from hills17:12
clarkbzaro: you set the timeout with the macro to 3017:12
*** sarob has quit IRC17:13
zaroclarkb: so macro timeout is set to static '30' how do jobs override this value?  I thought17:15
zaro{timeout} was the thing that lets the override happen?17:15
* JayF increments "number of times jay has been unsub'd from openstack-dev@" to 417:16
fungiJayF: you need one of those workplace safety posters for your wall... "0 days since last unsolicited unsubscribe from an openstack mailing list"17:16
jeblairfungi: the packet that caused the timeout in the second instance was processed 6 minutes after it was sent17:17
fungijeblair: okay, that's certainly a little more than 30 seconds17:17
fungii could not for the life of me make out any system performance problems exhibiting on cacti graphs for zuul during the problem period17:18
mordredjeblair: autoabandon (or the current lack thereof) came up on another channel, and it seems some teams really miss it - which made me wonder about re-thinking it into a thing similar to the channel logging...17:18
mordredjeblair: that is, something with a yaml config somewhere where a ptl/core-team could opt-in if it's the sort of thing that's important for them17:19
*** esker has quit IRC17:19
Ajaegerthanks, fungi!17:19
mordred(or if their solution would be "write our own bot that does the same thing")17:19
morganfainbergmordred, jeblair, ++ I know i've heard at least the question about autoabandon in 2 projects now.17:19
fungimordred: btw, the current lack of the old behavior would be fixed by https://review.openstack.org/9288417:19
clarkbzaro use the variable in the defaults17:19
jeblairmordred: i think it's a really bad idea not to be able to predict which of the patches you submit to openstack will be automatically abandoned by the system17:19
*** ArxCruz has joined #openstack-infra17:20
clarkbzaro: the same way you do in a job17:20
Ajaegermorganfainberg: https://review.openstack.org/92884 AFAIK17:20
jeblairmorganfainberg: how about instead we figure out why the things that are supposed to make it unnecessary aren't working and try to fix that17:20
morganfainbergjeblair, perhaps an autowip?17:20
morganfainbergjeblair, and yes we should be doing that as well17:20
fungimordred: looks like jeblair and clarkb already +2'd my fix for auto-abandon. any objections to restoring it to working order?17:21
morganfainbergAjaeger, thanks :)17:21
jeblairfungi: that would cause it to start working again?17:21
*** gokrokve has quit IRC17:21
anteayafungi: I have no objections17:21
jeblairi just -2d it17:22
*** ArxCruz has quit IRC17:22
fungifair enough17:22
*** _nadya_ has quit IRC17:22
mordredfungi: I think we should have a more comprehensive answer to what should be happening- and I actually appreciate it not happening on infra things now17:22
jeblairit was our intent to stop it with the gerrit upgrade; something we confirmed at the summit17:22
morganfainbergwait, cores can abandon any change directly, right?17:23
fungimorganfainberg: yes, or wip17:23
morganfainbergfungi, then no need to have a bot do it. if the code really is defunct the core team can cleanup.17:23
jeblairso if we re-enable it, it should not be in the mode of correcting a puppet error, but as a more deliberate thing17:23
jeblairmorganfainberg: ++17:23
mordredjeblair: ++17:23
fungimorganfainberg: core wip was around before the upgrade, but core abandon/unabandon is new17:23
mordredmorganfainberg: perhaps, instead of thinking about it as a solution to stale reviews17:23
jeblairfungi: i'm not even sure core wip was universal17:23
openstackgerritA change was merged to openstack-infra/config: Create common translation functions  https://review.openstack.org/9534517:24
mordredmorganfainberg: we should have people get into the habit of WIP-ing patches as they review them if it's something that really does need a new version17:24
mordredlike, be more aggressive with using that feature up front17:24
morganfainbergmordred, ++++++++++17:24
fungijeblair: well, it wasn't necessarily universal, but it was possible (though i think i/someone had a large patch to turn it on pretty much everywhere earlier this year)17:24
anteayalike if the infra-manual initial commit lands, I can create a patch for that suggesting that for core reviewers17:25
zaroclarkb: that doesn't make sense to me.  why would you set a variable in defaults?  shouldn't default values be static?17:25
mordredso that rather than the problem being "how do we deal with stale patches" it turns in to "how do we communicate more effectively to people that we expect them to come back with more work"17:25
fungijeblair: anyway, i'll abandon 92884 in favor of a patch to properly turn off auto-abandon in that case17:25
mordredthat way, I can -1 something if I don't like it but want more feedback from other people17:25
jeblairso maybe we should write a message to the list describing how core-WIP, core-abandon and the dashboards can be used effectively17:25
*** ArxCruz has joined #openstack-infra17:26
mordredbut -1 + -1 WIP something if I don't like it, don't want to put a blocking -2 on it - but know it needs to be fixed without question17:26
clarkbzaro it is static you set it there17:26
jeblairmordred: that's a huge timesaver for other cores17:26
clarkbzaro the point is there are other options we want to be the same everywhere and the macro does that for us17:26
*** praneshp has quit IRC17:26
clarkbthe timeout variable is a variable but the other options arent17:27
morganfainbergmordred, i like that messaging. though, the big scary red X for WIP is historically a hard block, we should make sure to be clear when people see that more17:27
mordredmorganfainberg: the big scary red X for WIP is my least favorite thing about the current impl17:27
morganfainbergmordred, wonder if we could CSS that to something more distinct17:28
anteayado we have any control over the colour of the wip X? can it be yellow?17:28
morganfainberganteaya, ++ my thought exactly17:28
zaroclarkb: it seems like what i already have is what you are describing.  let me do etherpad of your suggestion.17:28
sdaguemorganfainberg: it's unfortunately not an easy css fix iirc17:30
morganfainbergsdague, yeah i'm finding that out w/ inspection of the elements17:31
*** praneshp has joined #openstack-infra17:31
morganfainbergsdague, it's a raw data load it looks like. ugh.17:31
sdagueyeh - http://paste.openstack.org/show/81737/17:31
sdaguewe'd have to see if zaro wanted to hack us a class in there instead in the gerrit src17:32
mordredwhat if we made a -2 workflow status ... but didn't give anyone access to set it17:32
mordredjust so that gerrit wouldn't render a -1 as an X17:33
sdagueit wouldn't block the change then without prolog hacking17:33
jeblairmordred: you'd want to make sure you can make -1 block17:33
morganfainbergmordred, yeah i wouldn't want it to be a soft looking -117:33
morganfainbergi like the X, just... the color17:33
mordredjeblair: yeah. it would need testing for sure17:33
sdaguebecause the default prolog rules are = 'at least one of the largest value, and none of the smallest value'17:33
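The rule sdague quotes can be modelled in a few lines, which also shows why adding an unused -2 to the workflow label would stop a -1 from blocking. This is a toy model of "at least one of the largest value, and none of the smallest value", not Gerrit's actual Prolog; the function name and return strings are made up for the example.

```python
def label_status(votes, min_value, max_value):
    """Evaluate one label under the max-with-block style rule.

    votes: iterable of integer votes cast on the label.
    Returns "reject" if anyone voted the smallest value, "ok" if no one did
    and someone voted the largest value, otherwise "need".
    """
    votes = list(votes)
    if min_value in votes:
        return "reject"  # a vote at the bottom of the range blocks
    if max_value in votes:
        return "ok"      # top-of-range vote present and nothing blocking
    return "need"        # still waiting for a top-of-range vote


# With a workflow range of -1..+1, a -1 (WIP) is the smallest value and blocks:
print(label_status([-1, 1], min_value=-1, max_value=1))  # reject

# Extend the range to -2..+1 and the same -1 is no longer the smallest value:
print(label_status([-1, 1], min_value=-2, max_value=1))  # ok
```

In this model the only way to keep -1 blocking after widening the range is a custom rule, which is the "prolog hacking" mentioned above.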
mordredor we can just grow used to it and wait for vinz to solve the world17:34
morganfainbergmight be able to "fix" it w/ a dirty jquery hack17:34
jeblairthere _is_ a workinprogress plugin, but it's missing some minor functionality on the 2.8 change screen; apparently works with 2.9.17:34
jeblairso before we start hacking gerrit to make the currently hacky thing better, we might want to focus on that instead.  :)17:34
sdagueso realistically it would just be nice if all the check / x markers were actually themable instead of inline data.17:34
morganfainbergsdague, ++17:35
mordredjeblair: ++17:35
sdaguebecause inline base64 png files is... ug17:35
mordredsdague: that would probably be a nice upstream patch17:35
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Disable Gerrit auto-abandon  https://review.openstack.org/9583617:35
fungijeblair: ^17:36
anteayaI wouldn't suggest a prolog hack17:36
anteayawhen might we upgrade to 2.9?17:36
zaroclarkb: https://etherpad.openstack.org/p/change-8463717:36
morganfainbergjeblair, would we want to change "WIP" mechanism again in any reasonable timeframe? I'd be concerned about changing workflow too much.17:37
morganfainbergnot sure how the WIP plugin works though (tbh)17:37
mordredjeblair: oh wow. that actually gets rid of the final use of launchpadsync?17:37
fungimordred: yep17:37
fungione more reason to be in favor ;)17:38
mordredclarkb: the 1.1 regions in hp seem to be unhappy again17:38
mordredclarkb: if I'm reading the graph right17:38
jeblairmorganfainberg: the wip plugin is based on our old wip patch we were running on 2.4, so it should be similar to what we were using before17:39
fungimordred: though the lp sync credentials are still used for update bug/bp17:39
fungii believe17:39
fungibut the cron bits are no more17:40
jeblairfungi: the jenkins logs aren't quite detailed enough for me to know when it responded to the NO_JOB packet in the second instance17:40
zarosdague, morganfainberg : i think the scary red X is an image.  which would mean it would be a change to the image, not just the color.17:40
morganfainbergjeblair, hm. i think i like this method better (the workflow), but that is purely personal bias. i wouldn't argue not going back to the old system if it made life easier17:40
morganfainbergzaro, if it was a themeable element, it would solve the issue. and it's a raw b64 png dataload, not just CSS or similar.17:41
jeblairmorganfainberg: [switching to ux feedback collection mode] why do you like it better?17:41
morganfainbergjeblair, strictly because it uses the same mechanism for marking WIP as reviewing. the WIP button and the odd "status" never sat well with me.17:42
morganfainbergjeblair, also, part of the approved column at a glance centralizes the information on WIP/Non WIP, CRV17:42
morganfainbergjeblair, like i said, pure personal preference, but not strong enough to jump on a "don't go back" train.17:43
sdagueto make it something which doesn't look like a -217:43
sdagueI've had multiple people ask me how to remove the X from their code, because they know it's bad17:43
fungiorange "under construction" sign17:43
sdaguewhen the only X was the WIP they set themselves17:43
sdaguefungi: yeh, that would be much better17:44
morganfainbergfungi, can we have that be animated too?17:44
* morganfainberg goes back to web 1.0 days and blink tags.17:44
fungiand swap the approved green checkmark with a thumbs-up17:44
sdaguefungi: yeh, it's kind of sad they didn't do it in unicode :)17:45
fungibeer mug and snowman17:45
morganfainbergsdague, -2 can become a table-flip ascii guy then!17:45
anteayano blinking17:46
openstackgerritArun Kant proposed a change to openstack/requirements: Adding ldappool module dependency as needed by keystone bug #1320997.  https://review.openstack.org/9584217:46
uvirtbotLaunchpad bug 1320997 in keystone "Common Ldap handler connection pooling" [Medium,In progress] https://launchpad.net/bugs/132099717:46
sdaguethe thing that's also weird is the meta data is completely lost on vote columns that are +2 or -217:46
anteayaI like snowmen17:46
morganfainberg<blink>Whats wrong with blinking</blink> :P17:46
sdaguethey don't have the negative or positive class on them17:46
morganfainbergsdague, yeah it's a little odd17:47
morganfainbergsdague, that was my hope so we could just CSS it up.17:47
*** vhoward has joined #openstack-infra17:49
jeblairmorganfainberg: cool, thanks for the feedback on status vs approval category17:50
morganfainbergjeblair, happy to help.17:50
morganfainbergfungi, nice topic on disabling autoabandon bot.  just noticed17:52
openstackgerritMichael Krotscheck proposed a change to openstack-infra/infra-specs: Added specification for storyboard subscription (Story:96)  https://review.openstack.org/9530717:52
*** pblaho has quit IRC17:54
mroddenmordred: https://review.openstack.org/#/c/93986/ fixes the bash8 ignore thing17:56
mroddenhasn't been released yet; i should probably do that since others were asking about fixes for it17:57
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Added search icon to typeahead fields  https://review.openstack.org/9427317:57
mordredjeblair, morganfainberg, sdague: https://etherpad.openstack.org/p/M5qQPxxPyy17:59
mordredthere's a draft email to send out about using WIP more aggressively17:59
openstackgerritA change was merged to openstack-infra/config: Remove special handling of high-availability manual  https://review.openstack.org/9541418:00
morganfainbergmordred quick glance looks good, need to hop over to keystone irc meeting though18:00
openstackgerritArun Kant proposed a change to openstack/requirements: Adding ldappool module dependency as needed by keystone bug #1320997.  https://review.openstack.org/9584218:00
uvirtbotLaunchpad bug 1320997 in keystone "Common Ldap handler connection pooling" [Medium,In progress] https://launchpad.net/bugs/132099718:00
anteayamordred: you can have turquoise I have switched to purple18:01
jeblairmordred: made some teensy changes; do you want to mention core-abandon in this?18:02
*** esker has joined #openstack-infra18:03
*** esker has quit IRC18:03
fungimordred: on a related note, you could mention that core reviewers can now abandon/restore patches for their projects18:03
*** alexpilotti has quit IRC18:03
fungier, what jeblair said :/18:03
*** esker has joined #openstack-infra18:03
mroddenmriedem: were you still looking for a bash8 release for that python26 fix?18:04
mriedemmrodden: at some point, but not anytime soon18:04
mriedemnow that i have other things to do after the long break18:04
mriedemafter the > 2 day weekend18:05
*** david-lyle has joined #openstack-infra18:05
mrodden'll probably do a 1.118:05
mroddenignore is nice to have :)18:05
mroddenit's not even implemented in the devstack version which i found really surprising18:06
*** sarob_ has quit IRC18:07
Ajaegerclarkb: I'm working on the translation scripts to add extraction of log level messages and thus stumble about some odd things. I'd like to know whether we really need to run the update - could you review https://review.openstack.org/95687, please?18:08
*** _nadya_ has joined #openstack-infra18:09
*** hashar has joined #openstack-infra18:09
sdaguemordred: did you see my analysis of -1ed merged code last week?18:10
sdaguehonestly, I think this is going to just confuse things. And people are already -1ing meaning -1 and WIP today anyway18:11
clarkbAjaeger lgtm will +2 when I sit with laptop18:11
Ajaegerclarkb: Thanks! Then I can code easier on the log level extraction ;)18:11
*** krtaylor_ has quit IRC18:12
*** nati_ueno has joined #openstack-infra18:14
*** rfolco has quit IRC18:14
jeblairfungi: in the third instance, the geard gap was bracketed by two 'receive packet' log entries.  that makes it seem unlikely that geard was blocked on sending network traffic18:15
*** jp_at_hp has quit IRC18:16
jeblairfungi: during all three gaps, geard continued to get new workers connecting to it periodically18:16
jeblairfungi: it's starting to look like all network traffic on existing connections was stopped, then resumed18:17
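The kind of gap hunting jeblair is doing by hand in the geard logs can be approximated with a short script. The sketch below assumes each log line starts with a Python-logging style timestamp such as `2014-05-27 14:03:12,345` and uses a 30 second threshold; both the format and the threshold are assumptions for illustration, not details confirmed in the discussion.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # assumed log timestamp layout
TIMESTAMP_LENGTH = len("2014-05-27 14:03:12,345")


def parse_timestamp(line):
    """Parse the leading timestamp of a log line, or return None if absent."""
    try:
        return datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
    except ValueError:
        return None  # continuation line (traceback etc.), no timestamp


def find_gaps(lines, threshold=timedelta(seconds=30)):
    """Return (earlier, later, delta) for consecutive entries further apart than threshold."""
    stamps = [ts for ts in map(parse_timestamp, lines) if ts is not None]
    gaps = []
    for earlier, later in zip(stamps, stamps[1:]):
        delta = later - earlier
        if delta > threshold:
            gaps.append((earlier, later, delta))
    return gaps
```

Running `find_gaps(open(path))` over the same window of the geard, zuul and jenkins logs and comparing the reported intervals would be one way to check whether the pauses line up across hosts.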
fungijeblair: this would be consistent with something like a live migration while rackspace tried to deal with block storage backend issues in that region18:17
*** rfolco has joined #openstack-infra18:18
fungiphschwartz: do you happen to know if there was anything going on related to the block storage issues in dfw earlier today which might have caused one of our instances there to get paused for brief periods?18:18
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Extract translations for log messages  https://review.openstack.org/9537718:19
*** jpich has quit IRC18:20
Ajaegerdhellmann: now it should look much nicer ^18:20
jeblairfungi: same pattern for fourth occurrence18:21
*** krtaylor has quit IRC18:22
Ajaegerdhellmann: good catch, thanks!18:24
Ajaegerwill update later18:25
anteayahello reed18:26
anteayaare you in austin?18:27
clarkbjeblair: the new gertty HEAD does it have anything new that I don't already have?18:27
jeblairclarkb: nope, new stuff is in review18:27
mtreinishArxCruz, rfolco: is there a reason you guys are reporting running the tempest unit tests on powerkvm ci? I'm not sure there is much value in doing that.18:28
phschwartzfungi: I have not heard of anything. Let me look at notifications and see what needs to get looked at.18:28
mtreinishArxCruz, rfolco: oh and the log links for those jobs are dead...18:28
clarkbjesusaurus: your jenkins changes are happy now?18:28
reedanteaya, on my way, I'll have an answer for you tomorrow18:28
mordredsdague: so - I think we might be talking about different concerns or problem spaces?18:29
phschwartzfungi: oh, and we don't currently do live migrations of active customer instances.18:29
sdaguemordred: that's possible18:29
mordredsdague: the thing the WIP and/or auto-abandon is trying to solve is slimming down or focusing the set of things I should be looking at18:29
fungiphschwartz: between 1300 and 1400 utc (which was shortly before they posted an impact notice for dfw) we saw multiple periods of inexplicable time skips for zuul.openstack.org which resulted in timeouts for some of its internal communication, so just curious whether there was any possible correlation18:29
mordredand isn't really about whether things get merged or not18:29
jesusaurusclarkb: yep :)18:29
sdaguemordred: sure, but you are doing that by offloading more work to other reviewers18:30
fungiphschwartz: er, actually between 1400 and 150018:30
phschwartzfungi: Let me look in that time frame18:30
*** che-arne has quit IRC18:30
sdaguewhen i think most of them are already signaling you that same info with a -1 today18:30
mordredsdague: what more work? just WIP something when you -1 it then18:30
openstackgerritA change was merged to openstack-infra/config: Do not run unneeded update_catalog  https://review.openstack.org/9568718:30
jeblairfungi, phschwartz: http://paste.openstack.org/show/81770/  are the exact times where we either saw gaps in expected network traffic, or geard was doing something very unexpected18:30
mordredit's on the same screen18:30
mordredsdague: the problem is - non-core folks also -1 things18:31
anteayareed: great, thanks, safe journey18:31
sdaguemordred: sure18:31
mordredsdague: and those -1's are different18:31
mordredtake: https://review.openstack.org/#/c/90234/18:31
mordredsdague: zaro's original -1 vote is a valid thing for him to express and I'm glad he did. but he's not core, which means that this patch still needs to be reviewed by the cores18:32
sdaguemordred: sure, but that was the point on my merge bit.18:32
anteayamtreinish: any dates selected for qa mid-cycle meetup?18:32
phschwartzdo any of you have the uuid for the zuul instance handy?18:33
jeblairphschwartz: i'll get it18:33
fungiphschwartz: getting it for you18:33
sdaguemordred: ok, so at least in the nova team that would trigger zaro coming back and saying "oh, gotcha, I was wrong +1"18:33
clarkbmordred: yes 1.1 looks really unhappy18:33
mordredsdague: right. I think the nova team might be different than many of the other teams18:33
clarkbmordred: and we are using 100 nodes per router:network now18:34
clarkbmordred: so I think that debunks the theory18:34
mordredclarkb: grumble18:34
sdagueit could be, we do the same ish thing on tempest18:34
*** shivharis has joined #openstack-infra18:34
jeblairfungi: i agree.  :)18:34
phschwartzfungi: ty18:34
clarkbmordred: so uh ya18:34
mordredsdague: so I think the thing is - it's a tool that's available to people if they want to use it18:34
mordredthere are teams actively asking for the auto-abandon bot18:34
fungiphschwartz: thank YOU for weighing in ;)18:34
sdaguemordred: yes, and I'm one of them :)18:34
mordred17:26:59        lifeless | I hate autoabandon when it happens to my branches18:34
mordred17:27:12        lifeless | I love it when it happens to those of drive-by contributors18:34
clarkbmordred: its a bit frustrating to get all of this third hand18:35
mordredthere's a good example ^^18:35
mordredclarkb: let me see if I can do that18:35
sdagueto the point that I might just cron it for projects I feel entitled to18:35
mordredsdague: right. which is why I was suggesting that if people are about to start doing that, we should re-think our current bot - but we'd like to explore not needing it if possible18:35
*** rlandy_ has joined #openstack-infra18:36
mordredsdague: if autoabandon is a thing that's useful, then perhaps the "-1 == WIP" you're talking about actually _isn't_ working18:36
mordredother than that a -1 is an effective block to merging18:36
mordredbut it may not actually be serving a job as an effective workflow flag18:36
sdaguesure, the issue is -1 not meaning - please respin this, is the exception18:36
sdagueso the fact that it's more work to do the default case, is annoying, and means I'd probably never do that workflow18:37
sdagueif I'm not sure, I leave a 0 review18:37
sdagueor I ask someone18:37
clarkbmordred: it appears to be load related18:37
clarkbmordred: like if we change the nodepool rate limit it might be happier but with less throughput18:37
clarkbmordred: and that should be enforced on their end imo18:38
mordredclarkb: as in, related to number of requested instances?18:38
clarkbmordred: ya18:38
phschwartzfungi: hmm, nothing looking out of whack with the hv, will have it watched, but if it happens again, lets open a ticket and get it moved to a different hv so we can see if it is truly an hv issue.18:38
jeblairclarkb: are you sure it's the rate and not the total number of instances?18:38
clarkbjeblair: no, it may also just be the totals18:38
fungiphschwartz: definitely, and thanks. we see similar issues occasionally, but only recently got additional logging in place to have a better understanding of what could be triggering it18:39
openstackgerritAndreas Jaeger proposed a change to openstack-infra/config: Extract translations for log messages  https://review.openstack.org/9537718:39
jeblairsdague: could an alteration to the dashboard queries you use that ignores changes with a negative review that are older than a certain age help?18:39
jeblairsdague: essentially, i think that a good dashboard system should make auto-abandon unnecessary18:39
phschwartzfungi: as it is seen more, I would love to get as much log info as possible so we can troubleshoot it.18:40
sdaguejeblair: maybe, we've still got the issue of iterating on dashboards18:40
jeblairsdague: istr you wrote a patch to support that18:40
Ajaegerdhellmann, clarkb: The above patch (95377) is ready for review and merge now. It's not urgent (unless dhellmann is really eager ;) but if there are questions, I'm here for discussion.18:40
sdaguejeblair: 7 days no reviews :)18:41
openstackgerritA change was merged to openstack-infra/storyboard-webclient: Added search icon to typeahead fields  https://review.openstack.org/9427318:41
mordredclarkb: I have copied you on an email where I have also requested someone from HP show up in channel to talk to us18:43
clarkbmordred: thank18:44
*** markwash has quit IRC18:44
clarkbmriedem: sdague: so behavior I have noticed is that on "mondays" logstash indexing falls behind18:45
clarkbthough that may be related to the network things that fungi is debugging for zuul18:45
clarkbwill need a larger sample size to be confident in blaming monday rechecks18:45
*** chuckC has joined #openstack-infra18:46
jeblairmordred, sdague: i _think_ adding something like " -(age:14d label:code-review-1)" to the dashboard queries would get you the same filtering effect as auto-abandon18:47
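The effect of the suggested `-(age:14d label:code-review-1)` clause can be illustrated with an in-memory filter: hide a change only when it both carries a -1 and has not been updated for 14 days. The dictionary layout below (`updated`, `code_review_votes`) is a hypothetical stand-in for illustration, not Gerrit's data model.

```python
from datetime import datetime, timedelta


def hide_stale_negative(changes, now, max_age=timedelta(days=14)):
    """Drop changes that have a -1 Code-Review vote and no update within max_age.

    changes: list of dicts with "updated" (datetime of last activity) and
    "code_review_votes" (list of ints); both keys are assumptions for this sketch.
    """
    kept = []
    for change in changes:
        stale = now - change["updated"] >= max_age
        negative = -1 in change["code_review_votes"]
        if not (stale and negative):
            kept.append(change)
    return kept
```

Because the filter keys off the last update time, any new comment makes the change visible again, so the result tracks activity on the change rather than activity by its author.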
sdaguejeblair: not quite18:48
sdaguebecause if people continue to comment on it, the age gets reset18:48
sdagueand even a 0 comment will trigger clean check18:48
jeblairsdague: sure, but if they comment on it, it's clearly not abandoned18:48
sdagueit might not be them18:48
sdagueit might be a helpful other person coming through saying "you need to handle x"18:49
sdaguewhich I've seen18:49
jeblairthat still sounds like it's not abandoned18:49
mtreinishanteaya: sorry was eating late lunch, planning on the week of the 14th. But working through all the details, nothing set in stone yet.18:49
mtreinishthat will probably come later this week :)18:49
sdaguejeblair: maybe, i feel like it still is18:49
fungiclarkb: i believe the six gearman disconnects followed by all jobs getting restarted chewed through a lot of additional nodes and probably spiked logs as well (since they kept running, zuul merely lost track of them)18:50
sdaguethe problem I've got with people posting long lived, not being worked, patches in gerrit is it's a mutex18:50
jeblairsdague: i think the thing is that we really need to start thinking of abandoned as the Really Big Deal that it is.18:50
clarkbfungi: ah that would do it18:50
sdague"should I work on this... oh, mordred has an active set of patches on it, so no"18:50
clarkbfungi: so I will wait on a larger sample size before I blame anything specific18:50
sdaguewhen they really aren't active18:50
mordredsdague: right, but I might have an active train of dev on a thing that sits with a -1 on it for 8 weeks18:51
sdagueit's like assigning a bug in progress to yourself18:51
jeblairsdague: yeah, but you have all the info you need to work out if it's really active or not, and how to follow up with the author.18:51
fungiclarkb: yeah, we're still trying to catch up from that i think (noting time-in-queue for a lot of changes is still a bit higher than it should be)18:51
jeblairsdague: if those patches are abandoned then that is all lost18:51
openstackgerritA change was merged to openstack-infra/config: Run checklang gate only on master  https://review.openstack.org/9564218:51
anteayamtreinish: sorry to disturb your lunch, thanks, any idea how many days you will meet?18:51
sdaguejeblair: sure, I'm not saying the other way is unicorns and rainbows18:51
jeblairsdague: it definitely has spiky horns18:52
*** hasharOut is now known as hashar18:53
mtreinishanteaya: well it'll be a shared infra/qa midcycle so it'll be 4 or 5 days18:53
mordredclarkb: what are our current node spinup timeouts?18:53
mordredclarkb: still 5 minutes?18:54
openstackgerritA change was merged to openstack-infra/config: Reindex Gerrit after project move/rename  https://review.openstack.org/9560318:54
clarkbmordred: its not that18:54
clarkbthey are coming back with an error status18:54
mordredoh! they are? awesome18:54
clarkbmordred: /var/log/nodepool$ grep 'LaunchStatusException launching node id' debug.log | grep hpcloud-b | wc -l18:54
anteayamtreinish: ah cool it is the shared infra meetup, great18:54
clarkb2413 when I ran it18:54
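For anyone who wants the same count broken out per provider without chaining greps, a rough Python equivalent of clarkb's pipeline is sketched below. It only does plain substring matching on each line, exactly like the two greps; the function name is made up for the example.

```python
def count_launch_errors(lines, provider):
    """Count lines containing both the launch exception text and the provider name."""
    needle = "LaunchStatusException launching node id"
    return sum(1 for line in lines if needle in line and provider in line)
```

For example, `with open("/var/log/nodepool/debug.log") as f: print(count_launch_errors(f, "hpcloud-b"))` should print the same number as the `grep ... | grep hpcloud-b | wc -l` command above.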
*** zhiyan is now known as zhiyan_18:55
anteayareed so if we can steer our event for the week of June 30th or the week of July 7th, that would be the best for me18:55
anteayaI'll be disappointed to miss Canada Day, but oh well18:55
reedanteaya, ACK18:55
mordredoh - wait18:55
mordredclarkb: sorry, I suck18:56
mordredclarkb: those aren't hard errors - that's the exception that gets thrown after a timeout18:57
clarkbmordred: no it specifically says that the node went into ERROR18:57
mordredclarkb: nod. thank you18:57
clarkbI can read code to be double sure but status: ERROR implies to me that it went into ERROR18:58
jeblairfungi: options: a) increase the geard debug level.  we will need a cinder volume (we're already at 40G/day, i expect that to 4x)  b) increase the zuul timeout to 300s and carry on.18:59
*** nati_ueno has joined #openstack-infra18:59
*** arnaud__ has joined #openstack-infra18:59
clarkbmordred: ya that looks like hard error, link with code incoming18:59
jeblairfungi: a.1) add packet logs to the mix to help with debugging18:59
*** arnaud__ has quit IRC18:59
zaroclarkb: i don't think your suggestion would work because there's a ton of jobs that still use the default 'timeout'.  I think your suggestion would require a big refactor to change a bunch of jobs to use 'build-timeout'.  is this correct?18:59
jeblairmeeting time in #openstack-meeting19:00
clarkbmordred: https://git.openstack.org/cgit/openstack-infra/nodepool/tree/nodepool/nodepool.py#n388 it happens after wait for server but it is an explicit check of the status19:00
clarkbmordred: so pretty sure waitforserver returns because node went into hard error19:00
clarkbzaro: yes, but that is a different change19:00
clarkbzaro: this first change is just setting up the default appropriately19:01
*** derekh_ has joined #openstack-infra19:02
*** Lingo is now known as BadCub19:05
*** melwitt has joined #openstack-infra19:06
*** pdmars has joined #openstack-infra19:06
*** lcheng_ has joined #openstack-infra19:08
dhellmannshould I write up the cross-test blueprint in infra-specs, or is that repo for experimentation still?19:08
*** james_li has joined #openstack-infra19:12
openstackgerritA change was merged to openstack-infra/config: Add PostgreSQL integration testing for Gnocchi  https://review.openstack.org/9546319:19
openstackgerritA change was merged to openstack-infra/config: Gate Gnocchi on Python 3  https://review.openstack.org/9554319:20
mordredwow. what's gnocchi?19:20
Alex_Gaynormordred: it's a type of pasta19:20
morganfainbergAlex_Gaynor, ++ beat me to it!19:21
mordredAlex_Gaynor wins. and now I'm hungry19:21
rcarrillocruza pasta that can be hard to swallow :P19:21
mordredrcarrillocruz: ++19:22
openstackgerritValeriy Ponomaryov proposed a change to openstack-infra/config: Enabled pylint job for manila project  https://review.openstack.org/9576519:22
ArxCruzmtreinish: hey, I'm looking for the unit tests results19:24
ArxCruzmtreinish: we started to listen to tempest, and it was a good idea to start unit tests as well :)19:24
fungidhellmann: you can propose it as a child of the initial commit change which is still up for review (there are several which have done so already)19:25
SpamapSis heat-specs misconfigured?19:29
SpamapShttp://paste.ubuntu.com/7531507/ ...19:29
SpamapSgit review -s tries to grab 'orchestration-specs'19:29
clarkbSpamapS: yes it was renamed19:30
clarkbsee .gitreview19:30
mordredSpamapS: I submitted a patch19:30
SpamapSwhich was abandoned?19:31
SpamapSlet me just wave that through19:31
SpamapSmordred: no reason given for the abandon. ?19:32
*** markmc has joined #openstack-infra19:33
SpamapSah and I see because there's no tox.ini19:33
SpamapSand then there's no specs19:33
dhellmannfungi: ok, I'll do that, I just wasn't sure if you were actually using the repo for regular changes or not19:33
SpamapSshouldn't we like, test that a repo has a working gate, before adding the not-noop gate to it? :-P19:34
clarkbyou just make it work on your first commit19:34
SpamapSaight, let me just do that then19:35
mtreinishArxCruz: for example: on https://review.openstack.org/#/c/95843/ , http://dal05.objectstorage.softlayer.net/v1/AUTH_3d8e6ecb-f597-448c-8ec2-164e9f710dd6/pkvmci/95843/1/gate-ibm-tempest-python27/1e8e096 is broken19:35
ArxCruzmtreinish: yes, I'm checking what's wrong, it's the swift script I will let you know when it's fixed :)19:35
mtreinishArxCruz: the unit tests shouldn't be any different on PowerKVM vs anything else, it's just python code that doesn't call anything outside the tree19:35
mtreinishso you guys are just wasting resources running them again19:35
ArxCruzmtreinish: hmmm, not really, we found some tests broken in nova for example, due to the lack of IDE support on Power19:36
fungiSpamapS: for example https://review.openstack.org/9444019:36
SpamapSsphinx.errors.SphinxWarning: /home/clint/src/heat-specs/doc/source/index.rst:9: WARNING: toctree glob pattern u'specs/*' didn't match any documents19:36
SpamapSbut specs/* has files19:36
*** yfried has quit IRC19:37
mordredSpamapS: https://review.openstack.org/#/c/95297/19:38
mtreinishArxCruz: the nova unit tests may be different. I'm saying that you won't ever hit a unit test issue like that. The tempest unit tests don't make any external calls.19:38
SpamapSmordred: yes, that is the error I'm trying to fix on that review19:38
mtreinishif you don't believe me look, there are <500 of them19:38
mordredSpamapS: gotcha. so, it turns out that the cookiecutter repo is bs19:38
mordredand doesn't have a good set of content in it19:38
mordredlook at the infra-specs patch19:38
ArxCruzmriedem: oh, okay. I will talk with kurt, and remove the test :)19:39
mordredI had to 'fix' what cookiecutter did there19:39
mriedemArxCruz: looking for mtreinish?19:39
ArxCruzdamn, again!19:39
mordredat least, I thought I did that for infra - maybe I did it for heat and didn't push?19:39
ArxCruzmtreinish: ^19:39
mordredI fixed this somewhere19:39
jeblairmordred: what did you have to fix for infra?19:39
jeblairmordred: (i think you're making things up since _I_ wrote the init patch for infra; you must have done it somewhere else)19:40
dhellmannfungi: the change I want to make is in openstack-infra/config, which doesn't appear as an option in storyboard. Should I just say that I want to change the specs repo?19:40
fungidhellmann: probably19:41
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Added field restrictions and error messages to project forms  https://review.openstack.org/9587319:41
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Added field restrictions and error messages to project forms  https://review.openstack.org/9587319:42
*** SumitNaiksatam has joined #openstack-infra19:44
*** dprince has quit IRC19:45
*** zhiyan_ is now known as zhiyan19:46
mordredSpamapS: the tip of heat-specs works for me19:46
mordredSpamapS: oh! I think I'm missing a git add19:47
mtreinishArxCruz: were you talking about: https://review.openstack.org/#/c/93621/ on nova?19:47
SpamapSmordred: tip of heat-specs fails because there's no tox.ini19:47
*** lcheng_ has quit IRC19:47
mordredSpamapS: I meant my change19:48
mordredSpamapS: I just pushed an updated version19:48
mordredsorry - I forgot to git add a file19:48
*** lcheng_ has joined #openstack-infra19:50
mtreinishArxCruz: because the bug there isn't specific to your env, it's an any-non-default-arch bug. It's still open IMO; you should be mocking the get_arch stuff like on L2030 or L907 of the same file.19:51
*** james_li has quit IRC19:51
mordreddhellmann: btw- see above re: specs-cookiecutter19:51
mordreddhellmann: it does not produce specs repos that can actually build via build_sphinx - I copied what nova did for heat, but I'm not sure it's the _right_ thing19:52
*** dims has quit IRC19:54
mtreinishmordred: did I screw up something in specs-cookiecutter? Patches are welcome :)19:54
*** zhiyan is now known as zhiyan_19:55
dhellmannmordred: yeah, I had to make a bunch of changes for oslo-specs, too. I should submit those back to the cookiecutter repo.19:55
mordreddhellmann: same here19:56
*** signed8bit has quit IRC19:56
mtreinishdhellmann: yeah I made the mistake of doing qa-specs before I did the cookiecutter. So I didn't have to go through the pain firsthand...19:57
*** gyee has quit IRC19:57
openstackgerritAndreas Jaeger proposed a change to openstack/requirements: Update openstack-doc-tools to at least 0.15  https://review.openstack.org/9587919:57
*** saschpe has quit IRC19:58
openstackgerritAndreas Jaeger proposed a change to openstack/requirements: Update openstack-doc-tools to at least 0.15  https://review.openstack.org/9587919:59
*** saschpe has joined #openstack-infra19:59
*** nati_ueno has quit IRC20:00
openstackgerritDoug Hellmann proposed a change to openstack-infra/infra-specs: Add spec for adding cross-project unit test jobs  https://review.openstack.org/9588520:01
*** melwitt has quit IRC20:02
SergeyLukjanovjeblair, fungi, clarkb, mordred, anteaya, sorry folks, I'm not very active in IRC last several days and probably will not be active till the end of week - /me totally destructed by jet lag and backlog20:02
anteayaSergeyLukjanov: yes, I understand jet lag20:02
clarkbmordred: and that is what I meant by making it config. because encoding it in source is ugh20:02
mordredclarkb: ECONTEXT20:02
anteayaSergeyLukjanov: get some sleep20:02
clarkbmordred: test-matrix20:02
mordredclarkb: ah20:02
anteayaSergeyLukjanov: glad to have you back when you are rested20:02
mordredclarkb: so - I think the main thing where we're missing each other20:02
clarkbmordred: the .py file shouldn't know that hp/* exists or */foo20:02
lifelessthe tripleo HP region20:02
mordredclarkb: is that I think there is an expansion of the algorithm which does not need to know about hp/*20:03
clarkbmordred: instead it should be given a list of prefixes (maybe regexes with a substr map)20:03
lifelessI'm trying to investigate the errors there20:03
clarkbmordred: I grok that20:03
mordredclarkb: and does not need for hp/* (or redhat/*) to be put in the config20:03
clarkbmordred: but that algorithm reduces our checks upstream20:03
SergeyLukjanovanteaya, the issue is that I'm sleeping ok just permanently tired due to the feeling of the incorrect timezone ;) that's really funny to feel 8h diff tz20:03
clarkbmordred: we lose a feature by doing that20:03
anteayaSergeyLukjanov: yup20:03
anteayaand dizziness and headaches20:03
clarkbmordred: a couple really20:03
clarkbmordred: first we map feature/* to master20:04
anteayasleep my friend20:04
clarkbmordred: and second we check that you don't have hp/* branches upstream20:04
mordredclarkb: ok - maybe configurable prefix regexes will do what I'm talking about20:04
*** shivharis has quit IRC20:05
mordredI just want to be able to carry the logic encoded by test-matrix in wholesale except with the existence of a namespace- so $prefix/stable/grizzly should be able to pick up the feature matrix as it is there20:05
mordredfrom stable/grizzly20:05
mordredclarkb: we may be getting close to an understanding...20:05
lifelesscan I get someone to tell me what nodepool thinks the current status is ?20:05
derekh_on a related note has nodepool stopped talking to the other tripleo region in the last few minutes ?20:06
*** krtaylor has joined #openstack-infra20:06
*** marcoemorais has quit IRC20:07
*** marcoemorais has joined #openstack-infra20:07
clarkbmordred: right so I think  you do a branchmap:\n  ^.*/(stable/.*)$: \120:08
mikaljeblair: can you please add me to nova-coresec?20:08
clarkbmordred: maybe thats gross to write and we can make the config of it less gross20:08
clarkbmordred: but I think that describes what you want20:08
mordredclarkb: where do you do that?20:08
clarkbmordred: in the test-matrix config20:08
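The branchmap idea above can be sketched in Python. This is a minimal illustration only, not test-matrix's actual code or config format: the `BRANCH_MAP` dict and `map_branch` helper are hypothetical names, and the `feature/*` entry is assumed from the earlier "we map feature/* to master" remark.

```python
import re

# Hypothetical branch map: regex pattern -> replacement template.
# The first entry is the pattern quoted in the discussion; the second
# is an assumed encoding of "map feature/* to master".
BRANCH_MAP = {
    r"^.*/(stable/.*)$": r"\1",
    r"^feature/.*$": "master",
}


def map_branch(branch, branch_map=BRANCH_MAP):
    """Return the upstream branch whose feature matrix should be used."""
    for pattern, replacement in branch_map.items():
        match = re.match(pattern, branch)
        if match:
            # expand() substitutes backreferences such as \1.
            return match.expand(replacement)
    # Branches that match no pattern are used unchanged.
    return branch


print(map_branch("hp/stable/grizzly"))
print(map_branch("feature/foo"))
print(map_branch("stable/grizzly"))
```

With a map like this, a downstream prefix such as hp/ or redhat/ never appears in the source or the config; only the generic pattern does.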
sdagueon the list of patches I've got outstanding, I'd like to get opinions on - https://review.openstack.org/#/c/91799/ - which is removing all the pypy jobs20:08
*** BadCub has quit IRC20:08
clarkbsdague: pypy should be fine now20:08
sdagueit's still got a 20% fail rate in the gate20:09
clarkbexcept for az2 and we can remove the pypy nodes from there20:09
clarkbya its hpcloud hating us20:09
mordredclarkb: I'd love to have a way to have some overlay config, so that forking/patching the test-matrix config was not needed20:09
clarkbmordred: but that's the whole point20:09
clarkbit's config, it is needed20:09
clarkbotherwise we have the problem jeblair described20:10
clarkbwhich is we have no way to test and unbreak downstreams20:10
sdagueclarkb: so does someone else have the alternate patch up to make it voting again20:10
clarkblifeless: I have no idea what is going on20:10
clarkbsdague: yes Alex_Gaynor has a patch to make it voting20:10
jeblairmikal: done20:10
mikaljeblair: thanks20:10
mordredclarkb: I'm sorry - I'm very dense - I do not know what you mean by that. I'm basically just talking about some way to run-parts a config, or have a second config that can be there and optional or something, so that a downstream can consume the upstream config unedited20:11
clarkbmordred: we could do what jjb does20:11
sdagueclarkb: cool20:11
clarkband combine all the yaml docs20:11
*** mbacchi has quit IRC20:12
*** nati_ueno has joined #openstack-infra20:12
anteayajeblair: we talked about making me gerrit admin at the summit20:12
sdaguemordred: can you sketch out what you need after the tc meeting? I only have half the context here20:12
mordredsdague: yah. let's come back to it after meeting20:12
lifelessclarkb: I see three vms in state error20:12
anteayalike do you want to post to the ml20:12
*** hashar has quit IRC20:12
jeblairanteaya: yeah, i've been looking into it and am not quite ready with a proposal yet20:12
anteayaokay great20:13
clarkblifeless: looks like we have a bunch of nodes in delete and building states20:13
anteayado you need to see something more from me?20:13
*** chuckC has quit IRC20:13
clarkblifeless: and they have been that way for 7 hours20:13
lifelessclarkb: and a f20 template in running state20:13
anteayajeblair: do you need to see something more from me?20:13
clarkblifeless: I would never bet against borked networking in your cloud >_>20:14
lifelessclarkb: and thus need to reset tcp again?20:14
mordredlifeless: it's a firewall. it is a glitch by design :)20:14
lifelessclarkb: said firewall is at the rackspace end AFAICT20:14
lifelesspretty please?20:14
clarkbhow did we bork the firewall in rackspace?20:14
clarkbespecially since rax and hp and rh endpoints not in that DC are fine20:15
lifelessclarkb: hypothesis - too long a period with no response on a socket and it forgets the socket exists, then when you should be getting RST from your packets, you don't20:15
*** radez is now known as radez_g0n320:15
*** dhellman_ has joined #openstack-infra20:15
lifelessclarkb: oh yes, I *know* the fundamental issue is our end, but the firewall that prevents TCP's state machine doing its thing is separate to the cause of the issue.20:16
lifelessso this is an interesting change in the -22 build of linux20:16
lifeless  * vlan: Set correct source MAC address with TX VLAN offload enabled20:16
lifelesstx-vlan-offload: on [fixed]20:17
lifelessand we have a vlan20:17
lifelessso thats a candidate for the issue20:17
derekh_lifeless: is it possible you've done something in R1 that would cause nodepool to stop talking to R2 ?20:18
*** primeministerp has joined #openstack-infra20:18
lifelessderekh_: No.20:18
*** otherwiseguy has quit IRC20:19
clarkbso what connection needs killing? (ESTABLISHED) that one?20:19
derekh_lifeless: ok20:19
clarkbmordred: also your plea to get people on IRC didn't work at all20:19
clarkbunless folks are lurking here?20:20
*** primeministerp has quit IRC20:20
*** primeministerp has joined #openstack-infra20:20
mordredclarkb: it did not - although they are investigating errors on their end20:20
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard-webclient: Added simple logged-in dashboard  https://review.openstack.org/9266220:21
jeblairlifeless: why do you think this is a firewall issue?  i believe our understanding of the problem was lack of keepalive support in novaclient combined with the tripleo controller dropping its connection with no termination packet.20:21
ArxCruzmtreinish: okay, I will talk with kurt :)20:21
*** boris-42 has quit IRC20:21
*** boris-42 has joined #openstack-infra20:21
openstackgerritAndreas Jaeger proposed a change to openstack/requirements: Update openstack-doc-tools to at least 0.15  https://review.openstack.org/9587920:22
*** reed has quit IRC20:23
lifelessjeblair: I'm convinced that the reason it doesn't self correct is a firewall somewhere.20:25
*** afazekas has joined #openstack-infra20:25
lifelessjeblair: tcp keepalive on would allow it to self correct without the RST from the target system.20:25
clarkbzaro: ok20:25
lifelessjeblair: removing the firewall(s) that prevent the RST being triggered/received would also allow it to self correct.20:25
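For reference, turning on TCP keepalive for a client socket looks roughly like the sketch below. It uses only the standard `socket` module; the `enable_keepalive` helper and its timing defaults are illustrative assumptions, and how (or whether) novaclient and requests expose the underlying socket is not shown here.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Enable TCP keepalive so a silently dead peer is eventually noticed.

    The idle/interval/count values are arbitrary illustrative defaults.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning knobs below are platform-specific (present on Linux),
    # so only set the ones this platform's socket module defines.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # seconds of idleness before first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the connection drops
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

With keepalive on, an idle connection that is only waiting for a response still sends probes, so the local stack can detect the dead peer without relying on an RST arriving.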
jeblairlifeless: i would not expect an RST without an outgoing packet; in these cases, nodepool is not transmitting, it's waiting for a response20:26
*** dims has joined #openstack-infra20:26
lifelessjeblair: I am fairly sure the tcp state machine on the nodepool end is in fact trying to get an ACK for the last request sent20:27
zaroclarkb: i think your approach works as well.  mine was to do both in 1 change (but has not been done yet).20:27
lifelessjeblair: so there are outgoing packets, but we can check this easily enough20:27
clarkbzaro: by using the macro from the start any changes to macro end up everywhere20:28
clarkbis port 13774 the nova endpoint?20:29
*** pdmars has quit IRC20:29
lifelessclarkb: yes20:29
jeblairlifeless: at the summit i was told the underlying network problem was mellanox related.  do you still think that's the case?  when do you expect it to be fixed?20:30
lifelessjeblair: its clearly more than just that.20:30
clarkbI can kill (ESTABLISHED) whenever people are ready20:30
lifelessclarkb: please do20:30
lifelessjeblair: I've asked internally for firmware patches for the hardware20:30
zaroclarkb: yeah, i see the point.  it's a little confusing with the changes that i've already pushed (84637 & 92773) for review.  So i think i'll just abandon those and redo with the approach you suggested.  probably will work out better.20:31
lifelessjeblair: and hopefully we'll be able to redeploy with an HA control plane very very soon20:31
lifelessgreghaynes is on that feature along with jprovazn and one of the dmitry's20:31
clarkbfungi: jeblair: do you want to do any other debugging before I `killcx`20:32
zaroclarkb: did we restart gerrit over the weekend?20:32
clarkbzaro: yesterday fungi did it to reindex20:32
lifelessclarkb: oh you haven't? lets look for traffic20:33
lifelessclarkb: tcpdump -i $whatever host
lifelessclarkb: or similar20:33
zaroclarkb, fungi : darn, gerrit still wrapping files into zip on download https://review.openstack.org/#/c/93108/20:34
lifelessclarkb: whats your external IP for the nodepool machine ? I will dumpon the server side20:34
clarkblifeless: addr:
lifelessohhhh ho ho ho this is interesting20:35
lifelessI just spotted what looks like an egregious misconfiguration in the vlan setup20:35
clarkbI see nothing going in or out with host
lifelessbear with me while I tickle that20:36
openstackgerritAndrea Frittoli  proposed a change to openstack-infra/devstack-gate: Allow to configure git base URL  https://review.openstack.org/9590120:39
lifelessthe ip address was not added to the vlan, but instead to the external bridge20:41
lifelessits a wonder its working at all20:41
*** BadCub has joined #openstack-infra20:41
clarkbits a frame tag20:42
*** BadCub has quit IRC20:42
lifelessclarkb: it is, but ovs is generally a lot better about strictness than linuxbridge20:45
openstackgerritAndrea Frittoli  proposed a change to openstack-infra/devstack-gate: Make the master branch configurable  https://review.openstack.org/9590420:45
lifelessclarkb: you are not meant to process traffic for not-your-vlan on a raw interface20:45
fungiunless it has some equivalent of cisco's default<->native translation20:46
fungibut generally correct20:46
fungiclarkb: so tcpdump shows no outgoing packets matching that destination address+port?20:47
fungiif that's the case (not even retransmits), then yes you don't need a firewall silently discarding things to see this behavior20:48
openstackgerritSergey Lukjanov proposed a change to openstack/requirements: Bump min hacking version to 0.8.1  https://review.openstack.org/8233920:48
lifelessits a bit of a thread-the-needle condition, but yes.20:49
lifelesskeepalive ftw20:49
funginow someone just needs to fix the requests library to support it correctly20:49
lifelessit does support it20:49
clarkbtcpdump -i eth0 host whateverip20:49
lifelessthere's just some horrid mess somewhere in the stack20:50
*** lcheng_ has quit IRC20:50
*** marcoemorais has quit IRC20:51
fungii thought there was something with the underlying calls from requests to urllib not setting it up, but i forget the details now20:51
*** marcoemorais has joined #openstack-infra20:51
clarkbfungi: yes see jeblair's comments20:51
lifelessthat may be the mess20:51
*** andreykurilin has quit IRC20:51
fungioh, right, and this is bubbling up through novaclient20:51
*** gokrokve_ has quit IRC20:52
clarkbfungi: lifeless: in any case let me know if/when you want the existing connection to be shot20:55
clarkbjeblair: can you review 95302 as an alternative to 9527720:56
*** zhiyan is now known as zhiyan_20:56
lifelessclarkb: will do20:58
lifelessclarkb: I'm stomping on this bug first20:58
*** markwash has joined #openstack-infra20:58
jeblairclarkb: can you respond to jhesketh on https://review.openstack.org/#/c/95302/1 ?21:00
*** hashar has quit IRC21:01
clarkbjeblair: hrm hard to as there is no explanation to why21:01
*** julim has quit IRC21:01
*** alexpilotti has joined #openstack-infra21:01
clarkbbut I will respond with what I discovered and see what jhesketh says21:01
jeblairclarkb: i thought your commit message was pretty clear21:02
*** gyee has joined #openstack-infra21:02
anteayajeblair: jhesketh has a -1 on your infra-manual initial commit: https://review.openstack.org/#/c/92475/21:02
anteayaI'd really like to see it merged this week, if possible21:02
derekh_nodepool still doesn't seem to be talking to tripleo-test-cloud-rh1 (for about an hour now), anybody got any idea why?21:03
*** bhuvan_ has joined #openstack-infra21:03
*** hashar has joined #openstack-infra21:03
clarkbjeblair: done21:03
lifelessclarkb: ok, I've sorted my head out now. Looks good - please kick.21:04
*** ArxCruz has quit IRC21:04
jeblairclarkb: zuul config change +221:04
Alex_GaynorAre new jenkins builds not starting again?21:05
*** bhuvan has quit IRC21:05
derekh_Alex_Gaynor: +1 , could explain why I'm not seeing any new instances being spawned on tripleo-test-cloud-rh121:06
fungijeblair: back on the zuul gearman disconnects, i worry that a 300s timeout might still occasionally get tripped, especially for downstream users who may run it on more resource-constrained systems than we have. i'm happy to add a 0.25t cinder volume at /var/log/zuul (we've got available quota) and take the next quiet opportunity for a quick zuul restart21:06
clarkbAlex_Gaynor: yes that appears to be the case21:06
clarkbwhich would explain not talking to rh121:07
*** chuckC has joined #openstack-infra21:07
*** eharney has quit IRC21:07
*** gyee has quit IRC21:07
clarkblifeless: actually looks like we have building nodes to your cloud now21:07
* clarkb looks at zuul21:08
lifelessso we really need to fix this21:08
clarkblifeless: we being you?21:08
fungiclarkb: i'm checking now to see if we ended up with new disconnects21:08
clarkbfungi: did zuul disconnect again?21:08
*** nati_ueno has quit IRC21:08
clarkblifeless: I didn't kill anything btw21:08
lifelessSpamapS: still around? Did we file a bug over novaclient not doing keepalive properly?21:08
clarkbso ya I don't know21:08
lifelessclarkb: ?!21:08
lifelessclarkb: no, I see the same nodes before21:09
lifelessclarkb: or do you mean the rh1 region ?21:09
lifelessclarkb: you do? using what tool21:09
fungiclarkb: are you sure the builds which started 8 hours ago didn't just time out (they were due for it) and get replaced with new builds which are also stuck?21:09
derekh_clarkb: yup, thats the rh region, lifeless is concerned with the hp region21:10
clarkblifeless: nodepool list21:10
clarkbfungi: oh that could be21:10
fungiand yes, we've seen two more local gearman disconnects on zuul. getting timestamps now21:10
clarkbderekh_: its not the rh region21:10
clarkbbut what fungi describes may be the case21:10
clarkblifeless: I will kill connection now21:10
lifelessclarkb: sec21:10
lifelessclarkb: check for tcp traffic again21:10
jeblairfungi: i think we should go for a full 1tb21:10
lifelessclarkb: if nodepool wasn't actively trying anything, that would explain no traffic :)21:11
*** e0ne has quit IRC21:11
fungijeblair: we can. we've got just over 2t of quota open at the moment21:11
fungii'll get it spinning up21:11
*** markmc has quit IRC21:11
jeblairfungi: oh we have quota!  neat.21:11
jeblairclarkb: for geard debug level logs21:11
derekh_clarkb: ok, fair enough, at the same time nodepool has started creating new instances on the rh region (after a 1hr window of silence)21:12
openstackgerritKhai Do proposed a change to openstack-infra/config: Add a build-timeout macro  https://review.openstack.org/9591221:12
fungijeblair: according to cinderclient, we're using 22628 of our 25600 maxTotalVolumeGigabytes21:12
fungiso nearly 3tb quota open in fact21:12
jeblairclarkb: also, istr you said you thought you could get logrotate working with python logging; that would probably really help here21:12
*** aysyd has quit IRC21:12
jeblairfungi: hopefully this is temporary21:12
clarkbjeblair: yes we do it with logstash workers21:12
* clarkb finds a link21:12
*** lcheng_ has joined #openstack-infra21:13
fungibtw, the new gearman disconnects were at 19:31:58 and 19:44:44 in the logs21:14
*** e0ne has quit IRC21:14
fungiso ~1.5 hours ago21:14
lifelessclarkb: nothing ? if so reset please21:14
*** dkliban is now known as dkliban_afk21:15
clarkblifeless: nothing, resetting connection now21:15
jeblairclarkb: would you be up for making such a change?21:15
clarkbjeblair: sure21:15
clarkblifeless: done21:15
lifelessboom traffic21:16
jeblairclarkb: maybe we can land that and the swift change, add fungi's volume, and then restart zuul21:16
*** markwash_ has joined #openstack-infra21:16
clarkbjeblair: ++21:16
clarkbjeblair: do I need to worry about merger logs too?21:16
*** markwash has quit IRC21:17
* fungi finds it disappointing that nova volume-attach understands server display names but not volume display names21:17
jeblairclarkb: probably best to be consistent21:17
jeblairclarkb: so yes?21:17
mordredfungi: there are so many things I find disappointing21:17
*** gyee has joined #openstack-infra21:18
clarkbjeblair: actually looks like merger logs are rotated with python already21:19
*** dhellman_ has quit IRC21:19
*** lcostantino has quit IRC21:19
jeblairclarkb: er, isn't the goal to rotate with logrotate (so we can compress)?21:20
jeblair(and stop rotating with python)21:21
fungijeblair: okay, we have a 1tb filesystem on zuul:/dev/main/logs21:21
*** duran has quit IRC21:21
*** markwash_ has quit IRC21:21
clarkbjeblair: oh I think I missed the wanting compression goal21:21
*** markmcclain has quit IRC21:21
clarkbbut now that I know that I will do all the things21:21
*** markwash has joined #openstack-infra21:21
fungijeblair: i'll add it to /etc/fstab as /var/log/zuul21:21
*** e0ne has joined #openstack-infra21:21
lifelesswhee some terrifying things happen in client libs21:21
jeblairclarkb: cool21:21
uvirtbotLaunchpad bug 1297796 in python-novaclient "nova python client is not process safe " [High,Fix committed]21:21
fungioh, nm, we have /opt/log/zuul right now21:22
jeblairfungi: yeah; i think we want to swap it out21:22
jeblairfungi: so maybe we need to stop;umount;mount;start -- and all that should wait for clarkb's change to merge21:22
fungiokay, so yeah, replace the bindmount with a device mount after we stop the service21:22
*** pcm__ has quit IRC21:22
fungii'll just hold off editing fstab for the moment21:22
*** markwash has quit IRC21:23
clarkbjeblair: fungi: how many days do you want? 30?21:24
jeblairclarkb: let's go to 7 for now; these will be huge21:24
fungifor the gearman debug logs a week is probably more than sufficient21:24
fungior what jeblair proposes21:25
jeblairyeah, actually...21:25
jeblairi don't think we need to make zuul larger21:25
jeblairso i think we can stick with 30 days for zuul itself, and 7 days for the gearman server which we will drop to debug21:25
fungiso have the gearman logs go to a new sub-tree?21:25
fungior do you still want them in the same dir?21:26
clarkbfungi: they are already in a different location21:26
*** prad_ has quit IRC21:26
fungiclarkb: right now they're in different files but in the same dir21:26
*** arnaud__ has joined #openstack-infra21:26
fungijust wondering if we want the gearman debug logs isolated to a new volume, or move all zuul logging to the new volume21:27
derekh_fungi: another gearman disconnect ? looks like zuul is running nothing status.openstack.org/zuul/21:27
fungieasily done21:27
fungiderekh_: we're about to restart it i think21:27
derekh_fungi: ahh ok, never mind me :-)21:28
fungibut yes it does seem to have completely given up running things this time21:28
jeblairzuul seems to be stuck in a loop due to an error21:28
*** mrda_away is now known as mrda21:28
fungii know how it feels21:28
jeblairi've stopped zuul21:29
fungii'll get to work setting up the new mount point21:29
*** e0ne has quit IRC21:29
openstackstatusjeblair: sending alert21:30
fungiclarkb: your cwd is in /var/log/zuul... can you switch to /opt/log/zuul and look at the files from there so i can umount the bindmount?21:31
openstackgerritClark Boylan proposed a change to openstack-infra/config: Rotate zuul logs with logrotate  https://review.openstack.org/9591521:31
jeblairfungi: i was in that dir; changed21:31
fungijeblair: clarkb: thanks!21:31
fungisomeone has a process as root less'ing the debug log still21:32
clarkber was that me?21:32
jeblairfungi: that was me21:32
fungiall better--thanks!21:32
clarkbno not me21:32
jeblairfungi: let me know when it's safe for me to find out why i had to stop zuul.  :)21:32
-openstackstatus- NOTICE: Zuul is offline due to an operational issue; ETA 2200 UTC.21:32
*** ChanServ changes topic to "Zuul is offline due to an operational issue; ETA 2200 UTC."21:32
openstackgerritA change was merged to openstack-infra/config: Pass tenant_name to zuul config.  https://review.openstack.org/9530221:33
openstackgerritMatthew Treinish proposed a change to openstack-infra/elastic-recheck: Add query for bug 1308715  https://review.openstack.org/9591821:33
uvirtbotLaunchpad bug 1308715 in nova/icehouse "Deadlock on quota_usages" [High,In progress] https://launchpad.net/bugs/130871521:33
fungijeblair: want me to copy the logs from /opt/log/zuul into /var/log/zuul or just leave them there?21:34
*** eharney has joined #openstack-infra21:34
jeblairfungi: i think we can leave em21:34
openstackstatusjeblair: finished sending alert21:34
fungiweird, filesystem is reported as the wrong size. investigating21:35
openstackgerritCedric Brandily proposed a change to openstack-infra/git-review: Add --submit-immediately/-S command to submit immediately after push  https://review.openstack.org/9395221:35
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Switch zuul geard to debug  https://review.openstack.org/9592021:36
*** nati_ueno has quit IRC21:37
jeblairmordred: do you want to review https://review.openstack.org/#/c/95915/1 and the one after?21:37
*** Ajaeger has quit IRC21:38
fungiokay, resized up to 1tb. i got mixed up between --size and --extents on the lvcreate :/21:39
fungiresize is still going but should finish in moments21:39
fungidf now reports 1008G avail on /var/log/zuul21:42
*** nati_ueno has joined #openstack-infra21:43
*** marcoemorais has quit IRC21:43
*** zzelle has joined #openstack-infra21:43
*** SumitNaiksatam has quit IRC21:43
fungido zuul's daemons need a hup signal to flush and close old file descriptors and open fresh ones, and if so does logrotate take care of that by default or does it need to be added as a service notify?21:44
*** davidlenwell_ has quit IRC21:44
*** marcoemorais has joined #openstack-infra21:44
jeblairclarkb: ^21:44
clarkbno clue21:44
clarkboh wait21:44
clarkbcopytruncate is the trick21:44
fungiaside from not knowing the answer to that, the change lgtm21:44
clarkbit copies the files then truncates the existing one21:44
clarkbso you never have to change file descriptors21:45
*** marcoemorais has quit IRC21:45
jeblairbut i guess you briefly have 2x the current data on disk21:45
fungiand the pythonlogging implementation knows to reset its pointer to 0 when that happens?21:45
openstackgerritChristian Berendt proposed a change to openstack-infra/git-review: replaced unicode() with six.text_type()  https://review.openstack.org/9592521:45
*** marcoemorais has joined #openstack-infra21:45
*** SumitNaiksatam has joined #openstack-infra21:45
clarkbfungi: yes21:45
*** dims has joined #openstack-infra21:45
clarkbfungi: it works with logstash workers21:45
fungii've definitely seen some services deal poorly with truncating open file descriptors in append21:45
fungiokay, awesome21:45
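A small sketch of why copytruncate can coexist with Python logging, assuming a POSIX system and that the handler is a plain `logging.FileHandler`, which opens its file in append mode by default. The `demo_copytruncate` helper and logger name are hypothetical; the truncate call stands in for logrotate's truncate step, and this is not zuul's actual logging setup.

```python
import logging
import os
import tempfile


def demo_copytruncate(path):
    """Log, truncate the file behind the handler's back, then log again."""
    handler = logging.FileHandler(path)  # default mode is 'a' (append)
    logger = logging.getLogger("copytruncate-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.info("before rotation")
        # Stand-in for logrotate's copytruncate: the copy step is omitted,
        # the live file is simply cut back to zero bytes while still open.
        os.truncate(path, 0)
        logger.info("after rotation")
    finally:
        logger.removeHandler(handler)
        handler.close()
    with open(path) as log_file:
        return log_file.read()


log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
print(repr(demo_copytruncate(log_path)))
```

Because append-mode writes always go to the current end of the file, the handler needs no signal or reopen after the truncate; the cost, as noted in the discussion, is briefly holding two copies of the data on disk.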
fungithe 2x one log file isn't too worrisome given how much breathing room we have for it21:46
openstackgerritA change was merged to openstack-infra/config: Rotate zuul logs with logrotate  https://review.openstack.org/9591521:46
clarkbya and it may do the compress as it copies21:46
jeblairfungi: i would not be surprised if we end up with a 200G log file.21:47
fungiprocessing that will be fun21:47
fungiheck, logrotate very well may struggle to compress that21:47
clarkbwe should sahara boot a cluster21:47
*** davidlenwell has joined #openstack-infra21:47
clarkbfungi: ya, I would be more worried about the size of things than python logging getting confused21:48
*** dangers is now known as dangers_away21:48
*** zhiyan_ is now known as zhiyan21:48
openstackgerritA change was merged to openstack-infra/config: Switch zuul geard to debug  https://review.openstack.org/9592021:48
SergeyLukjanovclarkb, /me reading scrollback21:48
jeblairoh no you woke up SergeyLukjanov! :)21:48
jeblairrunning puppet on zuul hosts21:49
jeblairfungi: ready for me to start zuul?21:50
fungijeblair: any time you're ready21:51
SergeyLukjanovstoring logs in hdfs, mmm, then we could process TBs of them21:51
clarkbSergeyLukjanov: yeah and logstash + ES can use hdfs as a long term store21:51
jeblairImportError: No module named FileHandler21:52
clarkbI even read the docs21:52
*** mwagner_lap has quit IRC21:52
clarkbjeblair: arg its in the root module21:52
clarkbjeblair: can you remove the .handlers and see if that fixes it?21:52
clarkbshould be logging.FileHandler I fail21:53
jeblairclarkb: then we get: TypeError: __init__() takes at most 5 arguments (33 given)21:54
jeblairclarkb: i think your tuple was missing a ','21:54
fungiahh, yep21:54
fungiet cetera21:55
fungiit was trying to enumerate a string21:55
jeblairclarkb: it works with those 2 corrections21:55
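
[Editor's sketch, not from the channel] The second of the two corrections, the missing ',' in the tuple, is easy to reproduce. This is a minimal illustration rather than the actual config from the change under review; the path and the `build_handler` helper are hypothetical. Without the trailing comma the parentheses yield a plain string, so unpacking it passes one argument per character to `logging.FileHandler`.

```python
import logging
import os
import tempfile

# Hypothetical log path, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.log")

bad_args = (path)    # parentheses only: still a str, not a tuple
good_args = (path,)  # trailing comma: a one-element tuple


def build_handler(args):
    # Hypothetical helper: calls the handler class with unpacked args,
    # similar in spirit to how a logging config loader applies them.
    return logging.FileHandler(*args)


try:
    build_handler(bad_args)
    bad_result = "no error"
except TypeError as exc:
    # Each character of the path was passed as a separate argument.
    bad_result = "TypeError"
    print("bad args:", exc)

handler = build_handler(good_args)
handler.close()
print("good args opened:", os.path.exists(path))
```

The first correction concerns where the class lives: `FileHandler` is defined in the top-level `logging` module, as clarkb notes above.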
jeblairclarkb: want to push up a change to fix that and we'll go ahead and merge it and start with that?21:56
clarkbok new patch coming21:56
*** harlowja_ is now known as harlowja_away21:57
openstackgerritClark Boylan proposed a change to openstack-infra/config: Fix python FileHandler loggers  https://review.openstack.org/9593121:57
fungiin good news, this has given the logstash job queue time to catch back up!21:57
SergeyLukjanovI'd like to ask you folks for the infra root world tour this week21:57
*** zhiyan is now known as zhiyan_21:57
jeblairi haven't tracked down what was causing the 'already submitted errors', though the timing suggests it could be an edge case with gearman disconnects21:57
*** harlowja_away is now known as harlowja_21:57
jeblairer, 'already reported'21:57
openstackgerritA change was merged to openstack-infra/config: Fix python FileHandler loggers  https://review.openstack.org/9593121:58
openstackgerritKhai Do proposed a change to openstack-infra/config: Simply jobs by using the build-timeout macro  https://review.openstack.org/9593321:59
zzelleif you want to use logrotate, you should use WatchedFileHandler not FileHandler21:59
jeblairzzelle: oh neat22:00
clarkbzzelle: looks like that would be a way around copytruncate22:00
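
[Editor's sketch, not from the channel] zzelle's suggestion can be shown with the standard library's `logging.handlers.WatchedFileHandler`, which checks the log file before each emit and reopens it if it has been moved or removed. The sketch assumes a POSIX system; the paths and logger name are hypothetical, and `os.rename` stands in for a logrotate run without copytruncate.

```python
import logging
import logging.handlers
import os
import tempfile

# Hypothetical paths, used only for this illustration.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "service.log")
rotated_path = log_path + ".1"

logger = logging.getLogger("rotation-demo")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

handler = logging.handlers.WatchedFileHandler(log_path)
logger.addHandler(handler)

logger.info("before rotation")

# Stand-in for logrotate moving the file aside (no copytruncate).
os.rename(log_path, rotated_path)

# The handler notices the path changed and opens a fresh file.
logger.info("after rotation")

logger.removeHandler(handler)
handler.close()

with open(rotated_path) as f:
    old_content = f.read()
with open(log_path) as f:
    new_content = f.read()

print(repr(old_content))
print(repr(new_content))
```

With a plain `FileHandler` the second message would follow the renamed file, which is why a signal or copytruncate is otherwise needed.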
mikalIs there some way for me as PTL to override a core's -2 on a patch?22:00
fungi136M /var/log/zuul/gearman-server.log already22:00
jeblairmikal: it's never come up before22:00
clarkbI can prep another change that will use watchedfilehandler22:00
zaroclarkb: i hope this is what you were looking for.. https://review.openstack.org/9593322:01
mikalOk, I've emailed the core and asked him to tweak his vote22:01
jeblairclarkb: might be worth it; fungi what do you think?22:01
mikalThe problem being he is on vacation22:01
mikalSo if he doesn't reply in a couple of days I might ask for some help22:01
jeblairmikal: okay.  i think we'll want a good paper trail for something like that.22:02
clarkbjeblair: also did swift config make it in?22:02
clarkbjeblair: and it worked this time?22:02
jeblairclarkb: yes22:02
fungijeblair: clarkb: if the current configuration is tested and working for the logstash workers, then i'm fine with considering that an improvement for another day22:02
mikaljeblair: I don't think its contentious, the -2 was "please land the spec first", which is now done.22:02
*** wenlock_ has joined #openstack-infra22:02
mikaljeblair: its just the core involved isn't around to remove the -222:02
mikaljeblair: but like I said, I emailed him and maybe he'll notice22:02
clarkbjeblair: woot22:02
jeblairmikal: ok cool, so probably leave a note on the review asking us to do it w/explanation, and then ping us22:03
*** hashar has quit IRC22:04
mikaljeblair: cool, thanks22:04
fungijeblair: clarkb: probably worth evaluating anywhere else we need to be using logrotate and double-checking that we do it consistently22:04
jeblairstarted mergers and reloading gate queue22:04
openstackgerritClark Boylan proposed a change to openstack-infra/config: Use WatchedFileHandler to avoid copytruncate  https://review.openstack.org/9593522:04
sdaguemordred: you about to explain what you are running into on the feature matrix?22:05
*** nati_ueno has quit IRC22:05
jeblairAlex_Gaynor: things that were in the check or gate queues at shutdown will be restored; but not changes during the downtime22:05
clarkbwe should probably test that though, the current setup is tested in logstash workers22:05
Alex_Gaynorjeblair: k, thanks.22:05
*** otherwiseguy has quit IRC22:06
*** yamahata has joined #openstack-infra22:07
fungiclarkb: looks like the only other place we're obviously using that pattern is the log_processor module22:07
jeblair#status ok Zuul is started and processing changes that were in the queue when it was stopped.  Changes uploaded or approved since then will need to be re-approved or rechecked.22:07
openstackstatusjeblair: sending ok22:07
fungiclarkb: so probably worth fixing that one too22:07
clarkbfungi: yup, and we can test there first with less effort22:08
openstackgerritA change was merged to openstack-infra/storyboard: Small fix to a method name  https://review.openstack.org/9573022:08
*** ildikov has joined #openstack-infra22:08
*** thedodd has quit IRC22:08
fungistuff looks like it's testing/merging, so i'm gonna vanish for a bit22:09
mordredsdague: one sec22:09
jeblairfungi: enjoy rolling in your dough22:09
-openstackstatus- NOTICE: Zuul is started and processing changes that were in the queue when it was stopped. Changes uploaded or approved since then will need to be re-approved or rechecked.22:09
fungiindeed i shall22:09
*** ChanServ changes topic to "Discussion of OpenStack Developer Infrastructure | docs http://ci.openstack.org | bugs https://launchpad.net/openstack-ci/ | https://git.openstack.org/cgit/openstack-infra/config/tree/"22:09
*** wenlock_ has quit IRC22:10
openstackstatusjeblair: finished sending ok22:11
clarkbok I need to drive back to seattle before it gets too late22:12
clarkbjeblair: mordred anything you want me to do first?22:12
jeblairclarkb: drive safely!22:12
mordredclarkb: nope. except for the drive safely. We just hit 6 infra core- I don't want to fall back down to 5...22:13
*** gabriel-bezerra has joined #openstack-infra22:13
clarkbya I don't want that either22:13
mordredola mattoliverau !22:14
gabriel-bezerraHi guys. Is there a way in DevStack's scripts to get the Apache or Ubuntu version that it is running on?22:14
gabriel-bezerraI'd like to check whether devstack is running on Ubuntu with Apache 2.2 or Apache 2.4, so I can configure the scripts with the right names.22:14
openstackgerritA change was merged to openstack-infra/storyboard-webclient: Fix Unknown events in timeline  https://review.openstack.org/9572922:14
anteayamorning mattoliverau22:15
mattoliverauhola mordred and anteaya, have started reading scroll back, but as a cheat, anything interesting happen while I slept? :)22:17
jeblairfungi, clarkb: i think i found why zuul was stuck in that loop...22:19
*** SumitNaiksatam has quit IRC22:19
jeblairfungi, clarkb: the submit job packet for a jenkins "describe" job to update the description for a build was the one that timed out; i think it may have been the final description update and may have affected removing the reported change from the queue22:20
anteayamattoliverau: well zuul just got restarted22:21
anteayamattoliverau: that is the biggest most recent fire22:21
jeblairmattoliverau: we're seeing an unusually high number of incidences of zuul disconnecting from geard because geard is unresponsive.  we've enabled an obscene amount of debugging to try to learn why.22:22
mattoliverauOK, thanks anteaya, so we may get people complaining about how long it will take for their changes to merge then :)22:22
*** SumitNaiksatam has joined #openstack-infra22:23
anteayaor people reporting that their patch isn't being tested22:24
anteayalike jeblair said, due to zuul disconnecting from geard22:25
jeblairwow, it happened again.  that didn't take long.22:25
anteayanever seen it that high22:26
mattoliveraujeblair: so we might have finally seen a limit to the current geard infrastructure then. Could we throw in another geard server and start moving different type of servers on it, like the mergers etc. to lower the load?22:26
*** jgrimm has quit IRC22:27
jeblairmattoliverau: perhaps, but i'm not at all sure it's that simple.  geard has long periods of gaps in its logs where it is receiving new connections but apparently not receiving gearman packets otherwise22:27
jeblairmattoliverau: we don't know what's happening during those periods, thus the logging increase22:27
*** melwitt has quit IRC22:30
mattoliveraujeblair: hmm, annoying, fair enough, receiving new connections but not getting packets.. is the network link saturated? If the debug logs aren't giving much away, and also have large gaps, maybe the packets aren't reaching the application layer.22:31
anteayaanything new showing up in the logs yet?22:31
anteayaor too early?22:31
*** arnaud__ has quit IRC22:31
*** melwitt has joined #openstack-infra22:31
mattoliveraujeblair: sorry, just thinking out loud22:31
*** melwitt has joined #openstack-infra22:31
*** nati_ueno has joined #openstack-infra22:32
*** melwitt has quit IRC22:32
*** melwitt has joined #openstack-infra22:33
jeblairwe definitely have more data; trying to sort through it now22:33
mattoliveraujeblair: traffic on zuul, does seem a little abnormal on eth1, but this might be normal, or due to all the restarts: http://is.gd/hXo4Sv22:36
anteayamorning jhesketh22:36
anteayajhesketh: zuul is unhappy due to geard issues22:37
jheskethhmm, anything I can do to help?22:37
anteayaprobably, try the backscroll from the last 90 minutes22:37
anteayathat should give you the tl;dr22:37
*** marcoemorais has quit IRC22:38
*** marcoemorais has joined #openstack-infra22:38
*** Sukhdev has joined #openstack-infra22:38
*** marcoemorais has joined #openstack-infra22:39
anteayaqueue length 1180 results22:41
*** nati_uen_ has joined #openstack-infra22:42
*** nati_ueno has quit IRC22:42
jeblairi've started a tcpdump too22:42
hemnaanyone familiar with the gerritlib/gerritbot code ?22:42
anteayahemna: somewhat, what is on your mind?22:43
hemnaanteaya, I'm trying to plug a local gerritbot into a locally installed gerrit.   I'm getting json errors22:43
hemnaanteaya, pastebin.com/1DtG3SRY22:43
jeblairgeard does not log admin requests, which makes it a little tricky to determine if nodepool's interactions with it are having an effect.22:43
hemnamy local gerrit install is version 2.8.422:44
hemnanot sure if there is something that has to be configured on the gerrit side to allow this ?22:44
hemnaI'm using a non-admin user in my gerritbot config22:45
anteayaokay on local gerrit can you ssh and stream events?22:45
anteayathe error is about being unable to ssh22:45
anteayacan you do so manually?22:45
hemnaI can ssh in manually22:45
anteayaor consuming the ssh stream22:45
hemnagerrit drops me out22:45
hemnasaying interactive shells are disabled....22:46
anteayawhat do you mean, gerrit drops me out22:46
anteayaso you can't stream events via your shell?22:46
anteayasounds like a gerrit permissions error22:46
anteayayou have to give your gerrit account permissions to read the stream22:46
hemnathat's a gerrit config option ?22:47
anteayain gerrit 2.8.4 stream events is limited22:47
anteayanot sure where it is set, zaro would know22:47
*** arnaud has joined #openstack-infra22:47
anteayawe made everyone able to stream events for our gerrit22:47
anteayabut that is a change from 2.4 to 2.822:47
*** changbl has quit IRC22:47
anteayaeveryone used to be able to read the stream in 2.422:47
anteayanow your account has to have permission to read the stream22:48
*** zhiyan_ is now known as zhiyan22:48
jogochmouel: no paris for me22:49
*** derekh_ has quit IRC22:51
*** jhesketh has quit IRC22:51
mordredjeblair: is there anything I can do to help?22:52
*** jhesketh_ has joined #openstack-infra22:52
*** dstanek is now known as dstanek_zzz22:52
*** jhesketh_ is now known as jhesketh22:52
zaro hemna : i believe 'registered users' are allowed to view gerrit stream events.  which means you must at least have a gerrit account.22:54
jeblairmordred: not atm...22:54
jeblairmordred, jhesketh, mattoliverau: https://etherpad.openstack.org/p/XrzCW0EARb22:54
jeblairthose are interesting log entries i'm collecting22:54
*** atiwari has quit IRC22:54
hemnazaro, and I do.22:55
jeblairin the last two events, i've seen a correlation with nodepool commands22:55
anteayazaro: that was in 2.422:55
anteayazaro: in 2.8 they have to have express permissions22:56
zarohemna, anteaya : look at global capabilities.. https://review.openstack.org/#/admin/projects/All-Projects,access22:56
anteayafungi said that we gave them to all registered users for our gerrit to save time with all the third party ci22:56
*** msabramo has quit IRC22:56
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard: Added sort parameters to API  https://review.openstack.org/9595922:56
anteayayeah that is us22:56
anteayahemna: do you have such a page for your gerrit?22:57
anteayawhat do you see under global capabilities > stream events22:57
*** zhiyan is now known as zhiyan_22:58
jheskethjeblair: were there no geard logs corresponding to the submit job request?22:58
jeblairjhesketh: added; it is logged after zuul times out22:59
*** signed8bit has quit IRC22:59
hemnatrying a different user to my local gerrit to see if that works (different privs)22:59
*** andreaf has quit IRC22:59
jeblairjhesketh: this time it was pretty close, previous incidences have it considerably later22:59
jheskethwhat's the load on the machine like?22:59
anteayahemna: kk22:59
jeblairjhesketh: cacti should tell you23:00
anteayahemna: can you navigate to the All-Projects,access page in your gui for your gerrit? is that an option?23:00
hemnaI can't with my normal user.23:01
anteayawho can?23:01
jheskethjeblair: so it looks like the memory is full and the load is reasonably high23:04
jheskethare you able to tell which process is chewing all the memory?23:04
jeblairjhesketh: there's 29G of ram free23:05
jheskethlol, it's because I can't read graphs23:05
jheskethignore me23:05
mattoliveraujeblair: Zuul's connection errors all seem to relate to lo packets, tcpdump output looks like it is only showing eth* packets (i.e. nodepool), can you tcpdump -i lo to show loopback packets.. even though you would think lo packets wouldn't have any issues.23:07
*** mwagner_lap has joined #openstack-infra23:08
mattoliverauLoopback traffic did spike.. but that might just be zuul being reconnected (multiple times).23:10
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard: Added sort parameters to API  https://review.openstack.org/9595923:10
jeblairmattoliverau: started;23:10
jeblairmattoliverau: if it happens again we'll have those logs23:10
zzellehemna, look at https://review.openstack.org/#/admin/projects/All-Projects,access23:13
*** yamahata has quit IRC23:16
openstackgerritA change was merged to openstack-infra/config: Publish api-site for DE and JA  https://review.openstack.org/9545123:16
openstackgerritA change was merged to openstack-infra/config: Name the integrated queue  https://review.openstack.org/9504623:16
*** rfolco has joined #openstack-infra23:17
openstackgerritJames E. Blair proposed a change to openstack-infra/gear: Log admin requests  https://review.openstack.org/9596423:18
jeblairi'm going to restart zuul with that change23:20
jeblairand it just happened again; this time the timed-out packet went over eth123:20
mordredjeblair: I support that decision23:20
jeblairer eth023:21
jeblairnevermind that's wrong23:21
jeblairi'll checkout the lo tcpdumps after the restart23:21
*** SumitNaiksatam has quit IRC23:23
*** SumitNaiksatam has joined #openstack-infra23:24
*** david-lyle has quit IRC23:25
tchaypoI hate to nag, but https://review.openstack.org/#/c/86746/ has been sitting waiting for some love for a long time. It has several +1s, a +2, and jenkins is happy...23:29
jeblairi've been updating the etherpad23:29
jeblairi think that's confirmation the status command took 34 seconds to run23:29
*** michchap has joined #openstack-infra23:32
*** dstanek_zzz is now known as dstanek23:33
mattoliveraujeblair: yup 35 seconds to run the admin request.. is it just me, or is that a rather long time? How long is the resulting status list?23:34
anteayatchaypo: hey there23:34
*** wenlock has quit IRC23:36
jeblairmattoliverau: 244901 bytes and it typically takes a few tenths of a second23:37
jeblair(it's not super efficient, but it's not _that_ inefficient)23:37
mordredjeblair, clarkb: btw- the hp folks looking into the 1.1 issues have done some digging and seem to think the errors are actually related to the nova scheduler23:38
jeblairmordred: i love helping the project!23:38
mordredjeblair: I'll let you know more things when I learn them23:39
mordredjeblair: it's entirely possible that it has something to do with requesting that many 30G nodes all at once and is a real scheduler issue23:39
mordredas in - like - the scheduler is unable to schedule $blah23:39
jeblairmordred: let's increase the large ops value!23:40
mordredjeblair: or, I mean, say it with me ... we could just fix this with AFS23:40
* mordred has no idea how ...23:40
jeblairmordred: heh23:40
jeblairmordred: the zuul/geard problem seems to be happening every few minutes now23:40
jeblairso i think we're dead until we fix it23:41
mordredjeblair: awesome23:41
jeblairit has occurred to me that all the extra logging could be an issue23:41
mordredoh. well yes. it could be23:41
*** gokrokve has joined #openstack-infra23:41
mordredjeblair: how much logging does it write to perform that command?23:42
jeblairmordred: about 4 lines23:42
mordredoh. well, that's not terrible.23:42
mordreddoes the logger serialize globally?23:42
mroddeni think it does23:42
mroddenwe have had issues with it and eventlet i know...23:43
mroddencomstud said it tries to acquire a lock on the file stream or something?23:43
anteayahemna: did we lose you?23:43
mroddenpython logging23:43
anteayamorganfainberg: yay23:47
morganfainberganteaya, hi :)23:47
jeblairthere's only about 15 log entries between the start/end of handling the status command23:47
mattoliveraujeblair: good call on the extra logging, extra logging = more time = more timeouts.. Still something is timing out causing zuul to disconnect. So the output of the status command doesn't seem anything out of the ordinary then, it isn't extra large or anything?23:48
jeblairso even if the logging is having an impact, it doesn't seem to be doing so during the time period when it's dealing with the admin command23:48
mordredjeblair: hrm. that seems not huge - I wouldn't expect it to take 41 seconds of blocking to write 15 lines23:48
mgagneanyone ever encountered a redirect loop with gerrit after login? (not on review.o.o)23:48
anteayamorganfainberg: you need to do a gitdm patch: https://review.openstack.org/#/q/project:openstack-infra/gitdm,n,z23:48
morganfainberganteaya, correct it looked like that was in config? or is that in gitdm directly?23:49
anteayain gitdm directly23:49
mgagnenvm, found it23:49
morganfainberganteaya, hmm.23:49
anteayait has its own repo23:49
anteayayou have a few samples to choose from23:49
mgagneI had the great idea to set a secure cookie on an insecure url23:49
morganfainberganteaya, right. but i saw other commits about gitdm in config :P so i was confused where to put it. i'll get that posted today once i get my keystone spec (well 1st spec) written up23:50
anteayaBadCub01_: there you are23:51
openstackgerritMarc Abramowitz proposed a change to openstack-infra/jenkins-job-builder: Add tox "coverage" target  https://review.openstack.org/8738223:51
jeblairmordred: i'd like to propose the following "go home" solution:23:52
jeblairmordred: set gear log levels to WARNING and the zuul gear timeout to 30023:53
*** marcoemorais has quit IRC23:53
jeblairmordred: and then work on a way to track this down out of production23:53
*** arnaud has quit IRC23:55
mordredjeblair: ++23:56
mordredjeblair: I support that solution23:56
mattoliveraujeblair: +1. I think you deserve some sleep! Thanks for staying at it so long. I'm sorry I can't really get in there and help so it isn't all on you in the middle of the night.23:56
jeblairi think tomorrow i'll try to reproduce locally; i have managed to scale well past our environment on my workstation; i'll try to do that and see if i can get a really slow status command23:57
mattoliveraujeblair: let me know if there is anything I can do. If there are zuul issues during my day, I'll alert the devs who come in channel to complain :)23:58
openstackgerritMark Sturdevant proposed a change to openstack/requirements: Remove hp3parclient from global-requirements  https://review.openstack.org/9597123:59
jheskethalso happy to help if I can23:59
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Reduce gearman logging level  https://review.openstack.org/9597223:59

