Wednesday, 2014-03-05

openstackgerritA change was merged to openstack-infra/config: Remove neutron isolated jobs
openstackgerritMichael Krotscheck proposed a change to openstack-infra/config: Add NPM mirror
jeblairwenlock: i think this is one for openstack-infra@lists.o.o00:01
kevinbentonsomeone had a cool animation of the speculative merge process in the gate, does anyone have a link to that?00:02
clarkbkevinbenton: yes one sec00:02
clarkbjeblair wins00:03
kevinbentonclarkb, jeblair: thanks00:03
jeblairkevinbenton: is the start of the talk00:03
jeblairkevinbenton: and is a bunch of talks00:03
*** NikitaKonovalov has joined #openstack-infra00:04
kevinbentonjeblair: thanks00:04
*** jhesketh__ has joined #openstack-infra00:05
*** SumitNaiksatam has quit IRC00:07
krotscheckok, clarkb, I think I've addressed all your concerns.00:07
clarkbGheRivero: for 1287975 how are you installing the pypy interpreter?00:08
clarkbGheRivero: because vanilla precise doesn't do pypy iirc00:09
GheRiverofrom the ppa that I found in the infra manifests00:09
clarkbah ok, so pypy is there from the ppa thanks00:09
clarkbAlex_Gaynor: is that something you are familiar with?00:10
Alex_Gaynorclarkb: nope, never seen that error before in my life00:11
clarkbAlex_Gaynor: GheRivero: it is an import error00:12
clarkbAlex_Gaynor: does lxml work with pypy?00:13
clarkbthat is a C binding right?00:13
Alex_Gaynorclarkb: I don't believe so, no, there's a fork that does though.00:13
GheRiveroit works. I couldn't reproduce that bug in a vanilla precise, so i guess it works00:14
clarkbjeblair: jhesketh_ is my comment there sane?00:17
clarkbwenlock: for the escaped $, does puppet escape that in a single quoted variable?00:20
jhesketh__clarkb: yes, that sounds good to me and I share the same opinion. I would go so far to say I don't like the name of the option "git_avoid_http". (I had made similar comments on previous patchsets)00:20
jeblairclarkb, jhesketh__: i think that all may be in service of the code we discussed possibly getting rid of this morning...00:21
wenlockclarkb, trying to recall that one... 1 sec00:21
clarkbjeblair: oh right, good point00:22
jeblairjhesketh__: i think that code is all about making sure that gerrit replication has happened before considering a change merged00:22
jhesketh__jeblair: I missed that discussion, which code are you proposing we remove00:22
jeblairjhesketh__: we're starting to time out on that sometimes, and i think it's actually probably irrelevant for us now...00:22
jeblairjhesketh__: because we expect builders to get time-sensitive repo updates from zuul mergers, and mergers get them directly from gerrit00:23
jeblairjhesketh__: so replication shouldn't be a factor for us00:23
jhesketh__jeblair: I'm not sure I follow what that has to do with using ssh instead of https?00:23
jeblairjhesketh__: i believe _getInfoRefs is only used in that check00:24
*** krotscheck has joined #openstack-infra00:24
openstackgerritMonty Taylor proposed a change to openstack-infra/storyboard: Remove Branch and Milestone legacy tables
jhesketh__ah, so you're saying this patch doesn't matter because that's going away?00:25
*** jcoufal has quit IRC00:25
wenlockclarkb, i recall this one now, i had no way to test it, but the intent was to try to get geppetto syntax check on that line to not throw complaints about the $ being misinterpreted as a value for a variable...   Adding an escape fixes it, and still results in the same string output from puppet when testing with puppet -e00:26
jeblairjhesketh__: it will have a really good commit message when i propose it.  :)00:26
jhesketh__heh, okay00:26
jhesketh__sounds good to me00:26
clarkbwenlock: yeah it does00:26
jhesketh__makes sense for the mergers to be the one caring about the repos00:26
clarkbwenlock: which is really weird to me because $ in '' is meaningless00:26
jhesketh__jeblair: but what about the mergers fetching the change over ssh?00:26
clarkbwenlock: lgtm00:26
jeblairjhesketh__: did this change actually alter that?00:27
clarkbit didn't00:27
clarkbmy change addresses that00:27
clarkbwhich I will write a test for as soon as I am caught up on other things00:27
* jhesketh__ might be confusing the patches00:28
*** mayu has joined #openstack-infra00:30
*** cadenzajon has quit IRC00:32
clarkbmayu: you should look at the devstack logs00:32
*** sarob_ has quit IRC00:32
*** sarob_ has joined #openstack-infra00:33
*** Alexandra has joined #openstack-infra00:35
*** arborism is now known as amcrn00:40
*** nati_ueno has joined #openstack-infra00:41
clarkbjeblair: will that make changes unmergable? it will right? we need a gate pipeline with at least one job in it00:48
jeblairclarkb: i think so00:49
fungiseeing what i've missed...00:52
clarkbit feels good to sit down and do a proper afternoon of review00:52
anteayapleasantly so00:55
clarkbanteaya: yes very strange00:56
anteayaenjoy your walk01:00
fungiclarkb: jeblair: wenlock: yes to meeting agenda01:01
*** yongli has joined #openstack-infra01:01
clarkbwoot success if I adjust the backlight brightness then the screen comes back on after switching back to a different display01:04
*** wchrisj has quit IRC01:11
openstackgerritJoshua Hesketh proposed a change to openstack-infra/zuul: Add configurable footer-message reports
geekinutahso just randomly tracking down a bug I ran into this
geekinutahthe errors that this ignored seem related to the bug I am tracking, thought it was interesting that it causes checks to fail in nova but the gate will let it through for tempest01:15
fungigeekinutah: we have a script which is checking the various service logs for errors on tempest runs, but there are currently so many false negatives from it that we aren't enforcing that yet01:18
*** jcoufal has joined #openstack-infra01:19
jeblairfungi: i think in the project meeting today sdague said he planned on tackling that again after i301:19
geekinutahI see, there does seem to be a ton of variance for errors related to trying to set tags on non-existent tap devices01:19
geekinutahI think I have tracked down what appear to be at least 3 separate services pulling the rug out from under the q-agt01:20
fungii hope so--that was a huge potential benefit for making the logs useful not only for diagnosing test failures but even moreso in production01:20
openstackgerritA change was merged to openstack-infra/config: increase conference extensions range
fungiif that nova change at the head of the gate doesn't take too long merging, there are 9 more changes already with successful job completion ready to merge right behind it01:23
clarkbjogo is this ml thread on scheduler testing be done with fake libvirt like largeops01:25
fungianteaya: yep. 72368,8 (nova) seems to have done the same as we've been witnessing earlier, with 77215,4 (python-keystoneclient) and all the other 9 changes whose testing had completed behind it are being retested now01:26
fungianteaya: looks like it01:26
anteayaincluding 3 neutron01:26
anteayais it just after a nova patch?01:26
fungithe last remaining job on 72368,8 completed successfully and it seems to have merged01:26
fungianteaya: seems like nova is currently the only project with a large enough git repository to cross the timeout threshold01:27
jogoclarkb: I must have missed that thread01:27
jogoI think that was just for local testing01:27
jogonot sure01:27
jogowhats the thread title01:27
clarkbjogo: let me find it for you01:29
*** rwsu has quit IRC01:29
clarkbjogo: It seems to be right up what you did's alley01:29
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Add a job to check IRC channel access
jeblairclarkb, fungi, pleia2: ^ that's about half the work needed to manage irc channel access in the config repo01:34
jeblairmaybe 80%01:34
*** jcoufal has quit IRC01:35
jogoclarkb: so I assumed that thread wasn't about testing in gate01:36
jogobut locally01:36
*** vkozhukalov has joined #openstack-infra01:36
jogoso ignored it01:36
clarkbjogo: it may be, but they are floundering and suggesting silly things like btrfs and containers01:36
clarkbwhen none of that is necessary01:36
*** wchrisj has joined #openstack-infra01:37
geekinutahclarkb: I am interested in what the un-silly way to go about this is01:38
clarkbgeekinutah: do what jogo did01:39
clarkbthen you don't have to touch anything IO bound01:39
clarkbor use unstable filesystems01:39
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Add a job to check IRC channel access
*** yamahata__ has joined #openstack-infra01:39
geekinutahso yeah, that's what I did, used the fake driver01:39
jogoclarkb: so that mostly works01:39
anteayajeblair: commented01:39
geekinutahin tons of greenthreads01:39
anteayajeblair: two typos01:39
geekinutahand it mostly works... :-)01:40
jogoso if that is too heavy, just test the algorithm via unit tests01:40
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Add a job to check IRC channel access
jeblairanteaya: thx01:40
jogogeekinutah: get a bigger box?01:40
clarkbgeekinutah: what doesn't work?01:40
clarkbif fake driver is too heavy containers and btrfs will only make it worse01:41
geekinutahit's not too heavy I don't think01:41
geekinutahI was simulating thousands of compute nodes01:41
geekinutahthe problem was in greenthreads really and the points of concurrency01:41
*** vkozhukalov has quit IRC01:42
jogogeekinutah: ahh so it's an eventlet related issue01:42
geekinutahwhich is why if I had to do it again I would spin up a few containers to run 500 or so threads each instead of 10k greenthreads01:42
geekinutahjogo: yeah essentially, also just a normal context switching problem I think01:43
geekinutahkernel can only handle so many of those, and 10k greenthreads was encroaching on the ridiculous zone01:43
clarkbgeekinutah: I guess I am still missing how containers help01:43
geekinutahclarkb: I haven't proved this at all, but my thinking is that we will be able to avoid the limitations of eventlet (not that you need a container to do that)01:44
geekinutahcontainers are pretty light, so it seems like less work to me01:44
clarkbcompared to running multiple processes?01:45
geekinutahI suppose just spinning up more python procs achieves the same thing, I just haven't thought through all the implications01:45
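The split discussed above (a few hundred workers per process rather than 10k greenthreads in one) can be sketched roughly as follows. This is a minimal illustration only, assuming plain Python threads as a stand-in for greenthreads and a trivial placeholder for each fake nova-compute service; `CHILD` and `simulate` are hypothetical names, and none of this is the code anyone in the channel actually ran.

```python
import subprocess
import sys

# Child program (hypothetical): start N threads, each a stand-in for one
# fake compute service, then print how many of them ran.
CHILD = """
import sys, threading
n = int(sys.argv[1])
done = []
threads = [threading.Thread(target=done.append, args=(i,)) for i in range(n)]
for t in threads: t.start()
for t in threads: t.join()
print(len(done))
"""


def simulate(total_workers, per_process):
    # Split the workers into batches and give each batch its own Python
    # process, so no single process has to schedule all of them.
    batches = [
        min(per_process, total_workers - start)
        for start in range(0, total_workers, per_process)
    ]
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", CHILD, str(size)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for size in batches
    ]
    # Collect the count reported by each child process.
    return sum(int(p.communicate()[0]) for p in procs)
```

Separate processes share nothing by default, which is the isolation containers were being proposed for; whether the real services have shared synchronization points is the open question raised in the channel.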
clarkbI feel like we are taking a simple problem and making it hard01:45
clarkbdavidlenwell: ^^ speaking of, imo the refstack thingy shouldn't be containerized either01:45
jeblairStop! Think! There must be a harder way!01:45
clarkbtempest installs in a virtualenv, just pip install tempest and run it01:46
davidlenwellclarkb: I don't disagree with you in general ..01:46
geekinutahie. I'm not sure if there are synchronization points that all sub-processes would share01:46
geekinutahwhich is why I thought containers01:46
clarkbdavidlenwell: I mean feel free to use containers to run it when you run it01:46
clarkbbut making docker a dependency of refstack is a bit much01:47
davidlenwellclarkb: agreed .. thats why the script runs inside or outside of docker01:47
*** wenlock has quit IRC01:47
davidlenwellmy plan for the official running copy of refstack is to use gearman jobs.. the docker container is meant as an easy way for operators to deploy tests in an environment we have a little control over .. but behind their firewall01:48
openstackgerritMichael Krotscheck proposed a change to openstack-infra/storyboard: Remove Branch and Milestone legacy tables
jogobacking up for a second, what did you do01:48
jogodevstack with fake virt and 500 nova compute instances?01:48
clarkbdavidlenwell: why not pip install tempest?01:48
davidlenwellclarkb: I've also screamed up and down that I don't wanna use docker at all because its too immature01:48
anteayato whom?01:48
davidlenwellanteaya: the defcore and refstack team .. we had a f2f on monday01:49
jogodavidlenwell: wait WAT what do they need docker for?01:49
jogogeekinutah: ^^01:49
jogojust use cloud instances?01:50
jogofor refstack01:50
* clarkb feels bad for ditching the conversation now but is going to do that to play video games01:50
geekinutahjogo: think devstack with fake virt and 500 nova compute greenthreads01:50
davidlenwellthanks clarkb01:50
anteayaclarkb: happy video games01:50
geekinutahjogo: each greenthread sees itself as a nova-compute service01:50
jogogeekinutah: 500 in one process?01:50
clarkbdavidlenwell: its south park humor + obsidian should be great01:50
geekinutahthey don't do much since it's the fake backend01:50
geekinutahbut they exercise the DB and the message bus01:50
jogowhy not do x in greenthread and x processes01:50
davidlenwellclarkb: .. sounds like fun01:51
geekinutahjogo: did that also, works better01:51
jogobecause otherwise you aren't taking advantage of the many CPUs you have01:51
anteayafrom my seat in the peanut gallery it looks to me like refstack and defcore are doing their own thing, and separating away from openstack, rather than positioning themselves to define it01:51
davidlenwellso to be clear .. refstack is not dependent on docker ..01:51
jogogeekinutah: so what was the issue?01:51
clarkbdavidlenwell: ok, good to know I must've misread that that was the official way to run it01:52
davidlenwellwe've made a docker thing that makes it easy to deploy our tests and fetch results .. but thats just some python code that the docker file runs .. that code works in or out of a docker container01:52
jogodavidlenwell: sounds very un-openstack01:52
geekinutahjogo: well, it mostly worked for scheduler testing I think, because it produced the load that we wanted it for (message bus and DB)01:52
anteayajogo: +101:52
clarkbright but what is the documented supported way of doing the thing?01:52
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Add a job to check IRC channel access
geekinutahjogo: but the hacky way we did it broke other things, I never got around to making it right01:52
*** krotscheck has quit IRC01:53
jogogeekinutah: ahh, well what you did sounds like the right way (lightest weight and closest to reality)01:53
davidlenwelljogo: who defines what is openstacky01:53
jogodavidlenwell: the APIs openstack provides01:53
jogorefstack should run on openstack01:53
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Add a job to check IRC channel access
davidlenwellI'm sorry jogo .. we've not been formally introduced..01:53
geekinutahjogo, clarkb: now that I think about it I think the only thing containers could buy you is the ability to simulate "nodes" without spinning up another VM?01:54
jogodavidlenwell: I guess not, hi I am Joe Gordon01:54
davidlenwellsounds like you have a lot of opinions on this topic and I'd love for you to join the conversation about it01:54
geekinutahmaybe just a tad bit lighter than a full VM?01:54
jogodavidlenwell: well I think defcore is insane01:54
jogobut I would be happy to rant01:54
davidlenwelljogo: no disagreements01:54
jogoif you point me where01:54
anteayadavidlenwell: well we do have a way of doing things01:54
clarkbgeekinutah: right01:54
clarkbgeekinutah: I don't think it makes parallelization of resources any easier01:54
clarkbgeekinutah: but would simulate cross host communication of processes01:55
davidlenwellanteaya: we do have a way of doing things .. but there is no mandate that new projects have to follow those guidelines01:55
anteayaand this doesn't seem yet to embrace what is currently in existence in the openstack programs01:55
anteayathat doesn't sound like embracing what is currently going on, to me01:55
davidlenwellanteaya: please don't rush to judgment .. its still very new01:55
geekinutahclarkb: beyond the scheduler test that this thread mentions, there are peer to peer cases that we wanted to test (think zmq)01:56
davidlenwellanteaya: and still very incomplete01:56
geekinutahit's just been so long since I last thought about this seriously.....sigh01:56
anteayaI'm not judging, I'm stating what I see01:56
anteayaa ship pulling away from the harbour may not be at its destination01:56
anteayabut you can tell from whether it is facing east or west where it might end up01:56
davidlenwellSo the mechanism for triggering tests will look a lot like what infra does with stuff01:56
*** sparkycollier has quit IRC01:56
clarkbdavidlenwell: anyways +1 to not requiring docker. and fwiw the problems with docker are they don't actually make it easier for folk in many cases. You need a very new kernel and until recently needed fs drivers that weren't in tree01:57
anteayawhat do you see infra doing with stuff01:57
clarkbthis is getting better with docker which makes me happy but folks that are likely to have firewall trouble are also more likely to be running rhel501:57
*** thuc has quit IRC01:57
jogogeekinutah: what do you want to test in zmq case?01:58
*** thuc has joined #openstack-infra01:58
geekinutahjogo: I was trying to simulate the gearman style scheduler that soren and I talked about in HK01:58
*** cody-somerville has joined #openstack-infra01:59
*** cody-somerville has quit IRC01:59
*** cody-somerville has joined #openstack-infra01:59
geekinutahjogo: I was able to validate parts of that story, but needed better isolation of nova-computes to really answer some of my curiosities01:59
jogogeekinutah: you have a big cloud right?02:00
jogogeekinutah: spin up a bunch of really tiny VMs on your cloud02:00
davidlenwellanteaya: So as I have stated .. the docker portion is just a bonus.. the final , official tester will be triggered with gearman workers and work very much the way that you guys trigger tempest runs and digs through the subunit output .. in fact I've borrowed code from infra stuff to do that02:00
geekinutahjogo: I have a big cloud that has most of its resources claimed02:00
jogogeekinutah: use really tiny vms02:01
jeblairdavidlenwell: i'm assuming this will eventually be run in infra, yeah?02:01
davidlenwelljeblair: that is my hope ..02:01
davidlenwellbut it also will allow private cloud operators to run it on their own systems02:01
*** thuc has quit IRC02:02
davidlenwellwhich is why we made the docker thing.. specifically dell and ibm want to start showing off the interop features in their private cloud offerings02:03
jogogeekinutah: so there are some other ways to make the gearmany style schedule testing easier too02:03
geekinutahjogo: I think I could do that maybe..... I really need to look at how much extra capacity we have at any given moment02:03
jogogeekinutah: I assume there are a few parts to this, perf and functionality02:03
jeblairdavidlenwell: makes sense02:03
davidlenwelland jogo.. to answer your question about why we don't just pip install tempest .. we need to be able to get very version specific with our tests .. don't want to run icehouse tempest against a grizzly cloud02:03
jogodavidlenwell: just install the right version then02:04
davidlenwellsometimes the right version is a specific commit to the git repo ..02:05
davidlenwellisn't packaged02:05
jogowho is running tempest you or someone else?02:05
jogoso we can make releases of tempest so you can pip install02:06
davidlenwellwe'll run it .. and anyone else can run it and collect data using refstack02:06
jogoor however we want to02:06
davidlenwellwe prefer straight from git02:06
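The two positions above are not mutually exclusive: pip can install straight from a git repository at a specific commit using a VCS requirement of the form `git+<url>@<revision>#egg=<name>`. A minimal sketch of building such a requirement, assuming a placeholder repository URL and commit hash (both hypothetical; the log does not say which repository or commit refstack pins):

```python
import sys


def pinned_requirement(repo_url, commit, name):
    # Build a pip VCS requirement that pins the install to one git commit.
    return "git+{0}@{1}#egg={2}".format(repo_url, commit, name)


def pip_install_command(requirement):
    # Command line that would install the requirement with the current
    # interpreter's pip; returned as a list, not executed here.
    return [sys.executable, "-m", "pip", "install", requirement]


# Placeholder URL and commit, for illustration only.
req = pinned_requirement("https://example.org/tempest.git", "abc123", "tempest")
cmd = pip_install_command(req)
```

Passing `cmd` to `subprocess.check_call` inside a virtualenv would perform the install; that step is left out here so the sketch has no side effects.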
jogothis sounds a lot like  how we do tempest02:06
*** dkliban has joined #openstack-infra02:06
jogodavidlenwell: you will be using tempest for the tests though right?02:06
jogogood thats what I meant02:07
geekinutahjogo: we should talk about this more sometime, I gotta run02:07
jogogeekinutah: I like the gearman scheduler idea a lot02:07
davidlenwelljogo: I don't see why we'd use something other than tempest to test openstack .. thats what tempest is for02:07
davidlenwellrefstack just collects data about lots of clouds and creates interop maps02:08
jogodavidlenwell: just making sure02:08
davidlenwelljogo: here is an early blog post about how it works .. we've deviated from this greatly .. but it gets the general point across02:09
*** mgagne1 has joined #openstack-infra02:10
*** bhuvan_ has quit IRC02:11
*** bhuvan has quit IRC02:11
*** mgagne has quit IRC02:11
*** rfolco has quit IRC02:12
*** dcramer__ has quit IRC02:12
jogodavidlenwell: cool02:12
jogoanyway done too many code reviews today02:12
jogotime to go AFK02:12
*** gokrokve has joined #openstack-infra02:16
*** prad has quit IRC02:17
*** harlowja has joined #openstack-infra02:17
*** dcramer__ has joined #openstack-infra02:24
*** CaptTofu has joined #openstack-infra02:29
*** wchrisj has joined #openstack-infra02:31
*** jcooley_ has quit IRC02:36
openstackgerritMorgan Fainberg proposed a change to openstack-infra/config: Add GerritBot to openstack-keystone
*** nati_ueno has quit IRC02:43
*** changbl has joined #openstack-infra02:50
*** ctracey has joined #openstack-infra02:57
*** dcramer__ has joined #openstack-infra03:03
*** gokrokve has joined #openstack-infra03:05
*** SumitNaiksatam has joined #openstack-infra03:06
openstackgerritMorgan Fainberg proposed a change to openstack-infra/config: Add GerritBot to openstack-keystone
*** thuc has joined #openstack-infra03:08
*** sweston has quit IRC03:09
*** Will has joined #openstack-infra03:21
*** Will is now known as Guest5815303:21
*** weshay has quit IRC03:27
jogopleia2: are you caltraining tomorrow?03:28
*** Shrews has joined #openstack-infra03:32
*** mgagne has joined #openstack-infra03:33
*** mgagne1 has quit IRC03:34
*** chandan_kumar has joined #openstack-infra03:42
*** wchrisj has joined #openstack-infra03:47
*** wchrisj has quit IRC03:49
dkehn_anteaya, any ideas on what version of gerrit you all are running?03:51
*** CaptTofu has quit IRC03:54
*** Ryan_Lane has joined #openstack-infra03:54
anteayadkehn_: one second03:58
dkehn_anteaya, np03:58
anteayadkehn_: 2.403:58
anteayaa forked version of 2.403:58
*** CaptTofu has quit IRC03:58
anteayaand we are working on an upgrade03:59
dkehn_anteaya, thx, just curious, twiddling around with setting one up03:59
anteayaif you want it03:59
dkehn_anteaya, are you going to 2.8 or somewhere else03:59
anteayadkehn_: 2.8 I do believe03:59
anteayathose are my notes on setting up gerrit with the intention of running manage-projects04:00
*** sweston has joined #openstack-infra04:00
anteayathe setting up gerrit I can do04:00
dkehn_anteaya, so trying to educate myself04:00
anteayathe run manage projects, I can't yet04:00
anteayadkehn_: makes sense to me04:00
anteayafeel free to add if you want, I might have missed something04:01
anteayathe top part are the steps I have confidence in the bottom part is my scribbles04:01
*** khyati has joined #openstack-infra04:01
dkehn_will do04:01
*** gyee has quit IRC04:03
dkehn_anteaya, off to bed we will probably be talking about it, seems I volunteered for it04:03
*** esker has quit IRC04:03
znsAnyone know how I can modify the groups for stackforge/satori in gerrit?04:04
anteayadkehn_: good sleep to you04:05
anteayazns are you the ptl for satori?04:06
znsanteaya: yes04:06
anteayathe ptl has to be added in gerrit by gerrit admins04:06
anteayawhat is your gerrit username?04:06
znsWe have satori-ptl and satori-core groups (in gerrit and launchpad).04:07
anteayawhat is your gerrit username?04:07
anteayagreat, when one of the gerrit admins show up, they will read the backscroll and add you04:08
anteayathen they will ping you if you are still in channel04:08
anteayaafter that you can add your core members to core04:08
znsanteaya: great! Thank you :-)04:08
anteayait will probably be in about 10 hours from now, unless one of them drops in before that04:09
*** cody-somerville has quit IRC04:15
pleia2jogo: I ended up getting a room down here for a couple nights04:22
pleia2jeblair: love the irc channel reg check04:23
*** cody-somerville has joined #openstack-infra04:25
anteayaokay I'm off04:26
*** thuc_ has joined #openstack-infra04:38
*** jp_at_hp has quit IRC04:43
*** lcheng_ has joined #openstack-infra04:52
jogopleia2: ahh smart doesn't caltrain stink04:53
*** thuc has joined #openstack-infra04:57
*** thuc_ has joined #openstack-infra04:59
*** ctracey has quit IRC04:59
*** lcheng_ has quit IRC05:09
*** wchrisj has joined #openstack-infra05:11
*** resker has joined #openstack-infra05:17
*** esker has quit IRC05:20
*** thuc has joined #openstack-infra05:26
*** amcrn_ is now known as amcrn05:30
*** thuc_ has quit IRC05:30
*** esker has quit IRC05:33
*** lcheng_ has joined #openstack-infra05:36
*** chandan_kumar has joined #openstack-infra05:38
*** amotoki has joined #openstack-infra05:46
*** alexpilotti_ has joined #openstack-infra05:49
alexpilotti_jeblair: hi05:50
*** nicedice has quit IRC05:55
*** unicell has joined #openstack-infra05:58
*** esker has joined #openstack-infra05:59
*** esker has quit IRC06:04
*** resker has joined #openstack-infra06:11
*** esker has quit IRC06:13
*** krotscheck has quit IRC06:14
*** resker has quit IRC06:16
*** ctracey has joined #openstack-infra06:32
*** denis_makogon has joined #openstack-infra06:34
*** thuc has joined #openstack-infra06:37
*** gokrokve has quit IRC06:37
*** jcooley_ has joined #openstack-infra06:38
*** alexpilotti has quit IRC06:39
*** thuc has quit IRC06:41
*** ctracey has quit IRC06:48
ttxclarkb: still around ?06:51
SpamapSman.. hopefully hardware issues squashed on tripleo-check cloud :-/06:53
*** jcooley_ has quit IRC06:56
* ttx likes when he goes to bed and finds the same reviews at the top of gate he left 10 hours ago06:56
* ttx tries to find the answer in backlog06:57
*** vkozhukalov has quit IRC07:00
*** esker has joined #openstack-infra07:02
*** saju_m has joined #openstack-infra07:02
*** esker has quit IRC07:07
*** yolanda_ has joined #openstack-infra07:08
*** afazekas has joined #openstack-infra07:09
SergeyLukjanovttx, AFAIR there were no issues last night07:11
ttxSergeyLukjanov: weird that the top of gate was enqueued > 10 hours ago though07:12
ttxwhen I left the queue was not THAT deep07:12
ttxmust have been a lot of resets07:12
SergeyLukjanovttx, yup, see it, I think that it was ok when I fell asleep...07:12
SergeyLukjanovttx, there were free slaves (due to the slaves graph) a few hours ago07:13
SergeyLukjanovttx, oh, probably it was resets due to the nova pep8 fail (check_autoupdate), can't remember anything more07:15
*** flaper87|afk is now known as flaper8707:15
*** jcooley_ has quit IRC07:16
*** jcooley_ has joined #openstack-infra07:16
*** jcooley_ has quit IRC07:21
*** ildikov_ has quit IRC07:21
*** ildikov_ has joined #openstack-infra07:22
*** afazekas is now known as afazekas|food07:23
*** afazekas|food is now known as afazekas07:28
*** amcrn has quit IRC07:36
*** wenlock has quit IRC07:37
*** zns has quit IRC07:37
*** melwitt has quit IRC07:42
*** alexandra is now known as Alexandra07:45
*** nati_ueno has joined #openstack-infra07:47
*** e0ne has joined #openstack-infra07:47
*** khyati has quit IRC07:50
*** Alexandra is now known as alex-out07:53
*** jcooley_ has joined #openstack-infra07:53
*** CaptTofu has joined #openstack-infra07:56
*** basha has quit IRC07:56
*** jcooley_ has quit IRC07:57
*** mgagne1 has joined #openstack-infra08:07
*** e0ne has quit IRC08:11
*** nibalizer has quit IRC08:11
*** plomakin_ has quit IRC08:11
*** SumitNaiksatam_ is now known as SumitNaiksatam08:11
zhiweiclarkb: hi09:02
zhiweifungi: hi09:02
*** fbo_away is now known as fbo09:05
*** jpich has joined #openstack-infra09:05
*** talluri has quit IRC09:12
*** talluri has joined #openstack-infra09:12
kashyapzhiwei, Just post your message, they'll pick it up when they're awake. They maybe in a different timezone.09:14
*** rlandy_ has joined #openstack-infra09:14
zhiweioh, thanks.09:14
*** cody-somerville has quit IRC09:14
*** rossella_s has quit IRC09:15
*** rossella_s has joined #openstack-infra09:15
*** yassine has joined #openstack-infra09:16
*** markmc has quit IRC09:19
*** saju_m has quit IRC09:19
*** rossella-s has joined #openstack-infra09:19
*** andreaf has joined #openstack-infra09:20
*** dizquierdo has quit IRC09:32
openstackgerritAkihiro Motoki proposed a change to openstack-infra/config: Job to push Horizon translation to Transifex
*** saju_m has joined #openstack-infra09:34
*** johnthetubaguy has joined #openstack-infra09:37
*** ociuhandu has quit IRC09:39
*** alexpilotti has joined #openstack-infra09:41
*** markmcclain1 has quit IRC09:43
*** jooools has joined #openstack-infra09:44
*** alexpilotti has quit IRC09:45
*** denis_makogon has quit IRC09:48
*** rossella-s has quit IRC09:48
ttxhrrm, looks like we have routine resets at top of gate, every time I look we added another hour to the top of gate process time09:50
*** oubiwan__ has joined #openstack-infra09:50
*** rossella-s has joined #openstack-infra09:51
*** hashar has joined #openstack-infra09:51
*** zhiwei has quit IRC09:52
*** oubiwan__ has quit IRC09:54
*** ociuhandu has joined #openstack-infra10:08
*** dstanek has quit IRC10:08
*** rossella_s has joined #openstack-infra10:09
*** rossella_s has quit IRC10:10
*** ihrachys|wfh is now known as ihrachys10:15
*** tnurlygayanov has joined #openstack-infra10:16
tnurlygayanovHi there :)10:16
tnurlygayanovi have a small question about the global-requirements gates for all projects10:17
*** rlandy_ has joined #openstack-infra10:17
tnurlygayanovWe have some problems with global requirements now and we plan to fix it, but now we can see that this gate fails for any commit (this gate is 'non-voting')10:18
*** rossella-s has quit IRC10:19
tnurlygayanovand we want to have the ability to ignore some known issues with requirements and check new changes in requirements.txt for our repositories10:21
*** rossella-s has joined #openstack-infra10:22
*** bada has quit IRC10:22
tnurlygayanovwe have custom scripts for this, but, probably we can change infra-script for gates jobs to make this script more intelligent10:22
*** alexpilotti has joined #openstack-infra10:24
*** rlandy_ is now known as rlandy10:29
*** jhesketh_ has quit IRC10:31
*** jhesketh__ has quit IRC10:31
*** katyafervent is now known as katyafervent_awa10:33
*** rlandy has quit IRC10:33
*** sarob_ has joined #openstack-infra10:40
sdaguetnurlygayanov: what are your problems with global requirements?10:42
sdaguelike what are these known issues10:42
sdaguettx: zuul has a bug10:43
sdaguewhere it requires the git mirroring to complete before a job is considered merged10:43
sdaguewhich can time out under load10:43
ttxsdague: ah, was suspecting something weird was going on but couldn't find explanation in backlog10:43
sdagueI'd been seeing it for a while, fungi finally figured out what it was yesterday10:44
sdagueor monday10:44
sdaguejeblair: said it's caused 4 - 7 resets a day over the last couple of days10:44
ttxsdague: i think that stat just went up10:44
*** sarob_ has quit IRC10:44
sdaguebut they didn't want to risk the fix until after the milestone10:44
sdagueyeh, definitely could be10:44
ttx13h49 at top of gate right now10:45
sdaguethere are some real fails in the resets as well10:45
ttxlet's see if the pile of successes at top of gate right now makes it10:45
sdaguebut yeh, I think it might be the more overwhelming reason for delay at the moment10:46
sdaguehonestly, I only caught the behavior by staring at zuul10:46
ttxyes, been staring all morning10:46
ttxtrying to get WTF was going on10:47
sdagueyeh, there is some scrollback here about it10:47
sdaguefrom a couple of days ago10:47
ttxsdague: btw to facilitate your task next week, been collecting FFEs and expectations on an etherpad10:47
ttxsdague: since anyone can add to rc1 milestone it lets us keep track of... suspicious additions10:48
sdagueI see you went with the incredibly easy to remember url10:49
ttxit has VIM in it :10:49
ttxok ok, will dump it somewhere else10:50
*** ociuhandu has quit IRC10:51
ttxsdague: only Heat was discussed with PTL at this point10:51
sdaguecrap, we just reset again10:51
sdagueon this issue10:51
sdaguenova changes seem to be more susceptible to it10:51
ttxyay, 14:25 at top of gate now10:52
*** rlandy_ has joined #openstack-infra10:53
ttxsdague: well, nova git repo is the largest10:53
sdaguealso our git servers are pretty loaded10:53
sdagueevery time we fix one scaling issue we move to a new one :)10:54
*** gokrokve_ has joined #openstack-infra10:54
ttxsdague: what happens to those fails ? Do they get reenqueued ?10:54
ttxthat would explain why the queue remains deep10:54
sdagueso actually, the thing that "fails" is the one that merged10:54
sdaguebut a bad return is given10:54
sdagueso zuul thinks it failed10:54
sdagueso restarts the queue10:55
sdagueit didn't, it landed10:55
ttxhah. That explains what I've been seeing10:55
ttxit's like, a perfectly working thing gets suddenly delayed10:55
sdaguethere is a fix waiting10:55
ttxgenerally *after* a nova merge10:55
*** rlandy has quit IRC10:55
sdagueand the whole stack is at a pretty high level of reliability right now10:56
tnurlygayanovsdague: now our project has some external requirements, which are not present in global requirements10:56
ttxOK, looks like we'll have to wait a bit before I3 after all10:56
openstackgerritTimur Nurlygayanov proposed a change to openstack/requirements: Added several pacakges to global requirements
sdaguettx: yeh. Honestly, it might be worth figuring out how disruptive the zuul restart would be when fungi gets up.10:56
*** gokrokve has quit IRC10:57
sdaguettx: no, definitely, in the class of fails, this is actually not bad10:57
ttxwe are getting better at fails10:57
*** katyafervent_awa is now known as katyafervent10:58
ttxit's actually not failing at all. Just way slower than it should be10:58
openstackgerritThomas Leaman proposed a change to openstack/requirements: Move to newer version of python-swiftclient
sdaguettx: yeh, well we got a ton of eyes on real bugs in January11:04
sdagueafter the melt down11:04
sdagueand we've been actually not terrible since then11:04
*** zhiwei has joined #openstack-infra11:07
sdaguewe even dealt with an upstream library breaking us twice yesterday, and recovered :)11:07
sdagueeven though that upstream library was us11:07
kashyapI read on a thread where there's a plan to move from launchpad -> StoryBoard11:08
kashyapIs that only for Blueprints? Or for bugs too?11:08
ttxkashyap: for bugs too11:09
ttxbut first it needs to exist.11:09
kashyapttx, Noted. Does it have a CLI client? (Please say yes :-) )11:09
ttxkashyap: it has a REST API, a webclient and will have a CLI/lib11:10
kashyapThank you. (And presumably supports some modern Markup languages?)11:10
kashyapComing by bugzilla land, one of the things I dearly miss with launchpad is bug query/manipulation from CLI11:11
kashyapAlthough inter-webs throw some references to certain CLI tools, etc.11:11
*** jp_at_hp1 has joined #openstack-infra11:11
alexpilottibut looking at I get "Loading..."11:12
zigosdague: I'm lost in the series, and can't fix the "Depends on commit <sha256> which has no change associated with it." thing, what's the way?11:12
*** jp_at_hp has quit IRC11:12
ttxalexpilotti: the console support in horizon ?11:12
kashyapttx, Is that what you're referring to? --
alexpilottittx yep11:12
sdaguealexpilotti: it's loading fine here.11:12
ttxalexpilotti: #411:12
ttxexpected ETA 33min11:12
sdaguealexpilotti: what browser?11:12
alexpilottisdague: chrome11:13
sdagueI'd just try a reload11:13
sdagueit might take a little bit to load the data11:13
sdagueit's all js async11:13
alexpilottisdague: safari came up ok11:13
ttxalexpilotti: waiting on it before cutting horizon i311:13
alexpilottittx: thanks!11:13
sdaguezigo: this the migrate series?11:13
ttxalexpilotti: so unless it gets dumped at bottom of queue for some reason, it should be in11:14
zigosdague: Yeah.11:14
*** unicell has joined #openstack-infra11:14
sdaguezigo: one sec, a patch underneath maybe got rebased11:14
zigosdague: Which is what I'm not able to fix, yes. :/11:14
alexpilottisdague: firefox is also ok, only chrome has this issue11:15
sdaguettx: barbican... so close - - but now they seem to have just accidentally disabled their tests11:15
*** ociuhandu has joined #openstack-infra11:15
alexpilottisdague: let me try some housecleaning11:15
zigosdague: Let me know what I'm doing wrong please! :)11:15
ttxkashyap: has the lp-shell command11:16
alexpilottisdague: a cache cleaning fixed it on chrome as well, sorry for the false alarm11:17
sdaguezigo: so in situations like this I tend to just grab at the top of the patch series and do the rebase from there. let me try to clean this up11:17
kashyapttx, Thanks for looking up. (/me is on a slow USB internet.)11:18
zigosdague: That's what I tried. :/11:18
*** unicell has quit IRC11:19
zigosdague: git-review insists on committing again "UniqueConstraint named and escaped twice" which is already in master, then fails on me.11:19
sdagueyeh, so it looks like there was a base rebase issue, then the patches ended up reordered at some point.11:21
sdaguezigo: so why did you do this -
sdagueinstead of a hasattr?11:21
zigosdague: hasattr?11:21
*** dizquierdo has joined #openstack-infra11:21
sdaguelet me redo that patch, because I'm concerned on that one11:22
zigosdague: Well, that's not the last version.11:22
zigosdague: It's there in another patch as well.11:22
zigosdague: Is there a way to use python-coverage?11:23
sdaguezigo: what's the last version?11:24
jpichttx: There are a couple more horizon patches that were approved yesterday that haven't merged yet (e.g. , , )11:24
sdagueok, let me push the fixed rebase I've got11:24
sdaguethen we can talk about that patch11:24
ttxjpich: yes, waiting for those11:25
*** chandan_kumar has quit IRC11:25
jpichttx: Ah, ok! Cool, thank you11:25
sdaguezigo: so I still haven't managed to fully connect the trees, but I did push a fixed version of this the ansisql.py11:30
sdagueI also think we *really* should not merge this code until we get an SQLA expert to look at it. We should poke jaypipes today to get his view on this. As I'm concerned just making the tests work could cause some security exposures here.11:31
*** unicell has joined #openstack-infra11:33
*** CaptTofu has joined #openstack-infra11:35
sdaguealso, it's bad form to +2 your own patches :P11:38
sdagueI just fixed the queue so it's linear again11:38
*** unicell has quit IRC11:39
*** rossella-s has joined #openstack-infra11:40
*** rossella-s has quit IRC11:41
*** rossella-s has joined #openstack-infra11:42
anteayattx sdague yes you figured it out, the rest is due to a repo confirmation which we don't need anymore and are just going to remove11:44
*** fbo is now known as fbo_away11:44
anteayasince the addition of zuul mergers make that repo check obsolete11:45
anteayaand it only happens on a nova patch, the nova patch will fail to be acknowledged as merged even though the tests pass and the gate resets11:45
anteayaand usually only happens when we have a long line of patches that can be merged in quick succession,11:46
anteayaat least those are the observed conditions11:46
*** dripton_ has joined #openstack-infra11:47
anteayathe reason it is nova is due to the nova's size, at least that is the supposition11:47
*** persia__ has joined #openstack-infra11:47
*** mtreinish_ has joined #openstack-infra11:48
*** gaelL_ has joined #openstack-infra11:48
*** dkliban_ has joined #openstack-infra11:50
*** simonmcc_ has joined #openstack-infra11:50
*** dachary_ has joined #openstack-infra11:52
*** gokrokve has joined #openstack-infra11:52
zigosdague: I just tried to +2 my own patch because it didn't work, and wanted to test. :)11:52
zigoThanks for fixing, though I'd really like to know what I should have done.11:52
anteayazigo: what do you need to test with a +2 on a patch?11:53
zigoanteaya: Test if it would merge or not, and I was strongly guessing it wouldn't.11:53
anteayado I plan on fixing -migrate?11:55
*** gaelL has quit IRC11:55
*** mtreinish has quit IRC11:55
*** dkliban has quit IRC11:56
*** wchrisj_ has quit IRC11:56
*** sbadia has quit IRC11:56
zigoOoops! :(11:58
zigosdague: I'm not touching anything then.11:58
sdagueno worries, I'll just push over top11:58
*** rossella-s has quit IRC11:59
zigoanteaya: FYI, nobody is really familiar with that code, OpenStack had to become the new upstream, because upstream stopped working on it.11:59
anteayazigo: sounds like you know more about it than myself12:01
*** rossella-s has joined #openstack-infra12:01
anteayaand thanks for sharing that information, which I did not know12:01
*** Ryan_Lane has joined #openstack-infra12:02
*** rossella-s has quit IRC12:03
*** rossella-s has joined #openstack-infra12:03
sdaguezigo: yeh, we do need to make sure we do this right though and not rush it. Because a lot of people will be impacted if we don't.12:04
sdagueso I collapsed the 2 quote changes into one12:05
sdaguezigo: so I think - is currently obsoleted by me merging those. So you should abandon that patch, it will make looking at the series clearer12:06
*** ArxCruz has joined #openstack-infra12:08
*** CaptTofu has quit IRC12:09
*** CaptTofu has joined #openstack-infra12:09
*** basha has quit IRC12:10
*** chandan_kumar has joined #openstack-infra12:11
*** CaptTofu has quit IRC12:14
*** CaptTofu has joined #openstack-infra12:16
*** rossella_s has joined #openstack-infra12:21
*** rossella_s has quit IRC12:21
ruhesdague: re (new dependencies in global-requirements). if a package is not present in ubuntu archives, should we contribute it to ubuntu-cloud-archive or to the main package repository?12:23
sdagueruhe: both ubuntu and red hat really need to be willing to support the package (if it's not already in there)12:25
sdagueso the big question is asking folks from there what their feeling is on the proposed package12:26
ruhesdague: ok. thank you12:26
sdaguebut even then, if the package isn't being regularly maintained, it's off the list. Because we can't add new dependencies that aren't python3 compatible12:27
sdagueso like I said, that throws out 3 of 4 in that list12:27
zigosdague: I agree about the "not rushing", though I have 20 Debian packages waiting for -migrate to get fixed. :/12:28
sdaguezigo: that's fine, but we're fundamentally touching sql quoting here12:28
sdaguewhich is *exactly* how most sql exploits happen12:29
sdagueso care is really warranted12:29
zigosdague: That's not really runtime though, is it? I mean, migrations would happen only at setup time...12:29
sdaguesome of the migrations end up as data migrations, and actually manage internal data12:30
sdaguebut you are right, the surface is less dangerous12:30
anteayayay 3 merged12:30
sdagueespecially as it will impact upgrade12:30
*** salv-orlando has quit IRC12:31
ruhesdague: we're going to get rid of all 3 of them. but we have our own library used internally. in case we make it well-documented, python3-compatible, available on pypi, do you think we'll be able to submit DEB and RPM specs to Ubuntu and Fedora respectively? Would it be enough to add this dependency to global-requirements?12:31
sdagueruhe: I think those projects would need to decide12:31
sdagueespecially if it's reusable12:32
sdaguehowever, you will get push back if the functionality could be handled with existing requirements. So make sure you've ruled those out first12:33
ruhesdague: i've got a direction, thank you very much :)12:33
* zigo just understood what "hasattr" does reading the patch12:34
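For readers in the same position as zigo, here is a minimal sketch of the "hasattr instead of a version check" idea sdague raised earlier. It is not the sqlalchemy-migrate patch itself: `OldStyleItem`, `NewStyleItem`, and `wants_quoting` are hypothetical names invented for illustration, and the `quote` attribute only stands in for whatever differs between library releases.

```python
class OldStyleItem:
    """Hypothetical stand-in for an object from an older library release."""

    def __init__(self, name, quote):
        self.name = name
        self.quote = quote


class NewStyleItem:
    """Hypothetical stand-in for a newer release that dropped the attribute."""

    def __init__(self, name):
        self.name = name


def wants_quoting(item):
    """Feature detection: ask the object what it has, not which version it is."""
    if hasattr(item, "quote"):
        return bool(item.quote)
    # Attribute is absent on this release, so fall back to a default.
    return False
```

The same check can be written as `getattr(item, "quote", False)`; either way the code keeps working on both shapes of object without parsing version strings.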
*** sarob_ has joined #openstack-infra12:41
*** rossella-s has quit IRC12:41
*** pcm_ has joined #openstack-infra12:43
ttxsdague: new reset12:43
ttxafter an horizon merge this time12:44
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Make token storage configurable
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Auth Token Middleware
*** lcostantino has joined #openstack-infra12:45
sdaguezigo: oh, it looks like I did the logic backwards12:46
*** sarob_ has quit IRC12:46
*** hashar has quit IRC12:46
anteayattx I think a horizon patch just failed, was it one of the ones you were waiting on?12:49
anteayattx did that horizon patch merge?12:49
ttxanteaya: I suspect it did. It's the reset-after-merge bug12:49
anteayanever seen it after horizon12:50
anteayalast one to merge was a cinder patch12:50
*** CaptTofu has quit IRC12:50
*** CaptTofu has joined #openstack-infra12:51
ttxanteaya: hmm, checking12:51
*** wchrisj has joined #openstack-infra12:51
anteayathat is the most recent entry for openstack/openstack12:51
*** gokrokve has joined #openstack-infra12:52
ttxanteaya: was watching a pile of all-green successes and it turned into a bunch of "queued"12:53
ttxcan't find trace of a failed test12:53
anteayaI was concerned about the nova patch beneath12:53
anteayaand -qa doesn't have the fail bot reporting anymore12:54
zigosdague: yup12:54
ttxanteaya: maybe that was
*** CaptTofu has quit IRC12:55
anteayano it was a horizon patch12:55
anteayaI don't have the number, I should have got it12:55
ttxanteaya: no recent test -1 at,n,z12:56
sdaguezigo: ok, redone with a function12:56
*** gokrokve has quit IRC12:56
ttxanteaya: "Submitted, Merge Pending"12:56
anteayathat must be it12:57
anteayaI saw submitted, merge pending for the first time last night12:57
ttxsdague: ever seen one that before ?12:57
ttxthat one12:57
sdaguettx: nope12:58
sdaguethat might be a new failure mode here12:58
ttxanteaya: that's fine. What's not fine is that it's been 16min12:58
anteayaso yeah, it merged or should and also reset the gate, you are correct12:58
anteayaso not limited to nova after all12:58
ttxsdague: sounds more nefarious though12:59
sdaguewell, there is not much to be done until fungi is awake12:59
ttxI fear that this half-done state might break the integrity of the gate12:59
sdagueyeh, don't know13:00
*** rossella-s has joined #openstack-infra13:00
ttxsdague: At least it's the only one stuck there:,n,z13:00
anteaya13 patches in the queue before the next horizon patch13:01
anteayahopefully we can address that patch's status before then13:01
ttxat the current rhythm, gives us about 4 hours13:02
anteayafungi should be around in an hour and a half13:02
anteayaand yeah, slow rhythm13:02
*** smarcet has joined #openstack-infra13:05
*** jgallard has joined #openstack-infra13:06
sdaguegah, I screwed up this migrate patch set again.... grrrr. ok finally setting up local unit testing13:06
fungii'll see whether i can dig up that change, if it exists yet... but i also think i realized another way that can break us if we really do stop checking it13:08
openstackgerritA change was merged to openstack-infra/storyboard: Load superusers from a yaml file
anteayamorning fungi13:08
fungihas anyone witnessed any changes besides nova changes triggering this pattern?13:09
anteayathis triggered the latest reset13:09
anteayattx found it13:09
fungiwe've seen that from time to time when gerrit gets confused13:10
anteayaI saw submitted, merge pending last night, which changed to merged when I refreshed the page13:10
ttxfungi: we got a lot of top-of-gate resets, which added up to that 15hours queue length now13:10
anteayalast night I saw a reset and it was due to a nova patch13:10
ttxfungi: most were the reset-after-merge issue, although the latest was
fungithe change itself successfully merges, but for some reason gerrit doesn't update its database to reflect that, so the ui will list that change as "submitted, merge pending" until we restart gerrit13:11
ttxfungi: the queue is currently sluggish, about 4 per hour all morning13:11
fungiwas every 4th change a nova change? ;)13:11
ttxfungi: mayyybe.13:11
fungithough i guess if you average in legitimate resets for actual job failures, that seems likely13:11
ttxfungi: not sure it's just a DB issue. Change doesn't appear merged13:12
*** fbo_away is now known as fbo13:13
anteayahorizon doesn't show it:
ttxfungi: not mirrored to github13:13
*** eharney has quit IRC13:13
sdaguefungi: so is there a hot fix on the merge timeout issue? like a way to renice some process to make it less likely to fail13:13
ttxnor on cgit13:13
ttxfungi: so it's not just a status issue. It's actually not merged.13:14
fungittx: agreed... checking gerrit's queues now to see if it's wedged13:14
sdaguettx: is it not merged in gerrit?13:14
sdagueremember gerrit is a git server as well13:14
ttxsdague: how can I check that ?13:14
sdaguethat's a good question... and I forget how13:14
fungi(is the easiest way)13:14
fungi'git pull gerrit master' in your clone of horizon13:15
jaypipessdague: can I help with something?13:15
jpichThere's a comment on that "submitted" horizon review reading "Change cannot be merged due to unsatisfiable dependencies", not sure if that's been mentioned13:16
sdaguejaypipes: yes, I'd like your reviews on the sqlalchemy-migrate patches for 0.9 compatibility, to make sure we didn't entirely f this up13:16
*** dstanek has joined #openstack-infra13:16
ttxsdague: gerrit has 3e4f269591feefc6335b51e17de7797fc8da8a8d on head13:16
sdaguejpich: yep, scrollback :)13:16
fungii do see this in progress in gerrit's queue... a7689930                       push
jaypipessdague: link for me pls?13:16
sdaguejaypipes: yep, let me fix this unit test first :)13:16
ttxso apparently not merged in gerrit either13:16
jaypipesno worries13:16
jpichsdague: Ok, thank you! I'm not good at effectively skimming13:16
fungioho, jpich good eye!13:17
sdaguejpich: no worries13:17
anteayait isn't on gerrit13:17
ttxjpich has the laser eyes13:17
anteayagit clone ssh://
sdagueoh, damn, I missed the comment context13:17
fungiso i wonder why we have that change depending on commits which aren't present... investigating13:17
sdaguefungi: because of the break?13:17
ttxfwiw that caused a reset alright13:18
sdagueor an earlier merge race issue13:18
fungitwo of the commits it mentions aren't even approved in gerrit yet13:18
fungiand the third doesn't seem to exist at all13:18
anteayajpich: I'm glad you mentioned it, I must have missed it the first time13:18
fungii wonder whether gerrit somehow missed associating dependent changes13:18
fungii'll clone that change and see what i get13:18
sdagueit's 5 patches, the final one is just a tox change to fix things13:18
jpichanteaya: Cool, it wasn't immediately obvious to me either13:19
sdaguefungi: so there was definitely some total weirdness with the migrate patches this morning in the same way13:19
*** esker has joined #openstack-infra13:19
sdaguewhich I ignored13:19
*** talluri has quit IRC13:20
ttxargh, new reset. Real fail this time13:20
ttxlooks like I won't be able to cut I3 today :/13:21
fungiit doesn't look like we're pushing the gerrit server *too* hard... load average is sometimes spiking up high enough that it reads 10 over a 5-minute sample, but mostly seems to hang around 213:21
anteayathat patch passed earlier13:21
sdaguejaypipes: missed a file13:21
sdagueso jaypipes:,n,z  is repushed13:21
anteayaboth neutron test failures on az213:23
*** dcramer__ has quit IRC13:24
sdagueanteaya: is az2 the one that has wonky half working ipv6?13:24
*** talluri has quit IRC13:25
fungii don't know whether we ever narrowed down where we were occasionally seeing that ipv6 issue in hpcloud13:25
anteayasdague: I know it can't build images very well13:26
anteayasdague: I don't know its ipv6 status13:26
fungiit seemed to subside after i added some extra debugging output to d-g to help us pinpoint it13:26
anteayait is a networking error13:27
anteayachecking logs13:27
fungiand yeah, the nodepool images in az2 are stale by a couple weeks. there's a case open with hpcloud support to look into why we get an eof on the ssh connection to them when trying to build a template server13:27
anteayaneutron-rootwrap timeout error:
sdagueanteaya: ok, this is the ovs timeouts issue13:28
sdagueyeh, so that's just normal fails13:29
sdagueI don't think related to infrastructure really13:29
ttxyep, it still fails the old way from time to time13:30
sdagueyeh, there are still races in there13:32
*** hashar has joined #openstack-infra13:32
*** CaptTofu has joined #openstack-infra13:32
sdaguewe have not yet fixed all the openstack bugs13:32
fungiokay, i think whoever put together the stack of changes which triggered the 73433 merge problem must have done something really, really crazy13:32
anteayasame error in both:
fungicheckout the 73433 change from gerrit and you can see the commits it mentions are in fact the ones which aren't approved yet13:33
fungithe first of which doesn't even exist in gerrit, but looks like a never-pushed version of
ttxjpich: ^13:34
*** rlandy_ has quit IRC13:34
* jpich looking13:34
sdaguefungi: so how did that get gate enqueued?13:35
anteayafungi: where are these dependencies coming from, since that patch lists no dependencies13:35
jpichI pinged the author but he's not online yet - I know it was a series that had to be rebased a few times, dependent patches can get a bit hairy after a while13:35
fungiso i think somehow gerrit allowed someone to push a dependent series with one of the commits not actually existing in gerrit, and that allowed gerrit to consider 73433 disassociated from all those other changes13:35
sdaguefungi: interesting13:35
fungieven though when it came time to merge, gerrit saw that it depended on them and refused13:35
sdaguewell, at least it did the right thing13:36
sdaguein the end13:36
fungiso i strongly suspect this is a corner case in gerrit's handling of dependency identification13:36
*** rlandy__ is now known as rlandy13:36
anteayaso how does this answer why the gate was reset after this patch left the gate?13:37
anteayabecause it couldn't be merged?13:37
fungianteaya: because that change can't actually be merged, yes13:37
sdagueanteaya: right, exactly13:38
sdaguethat's actually working as designed13:38
anteayayay gerrit13:38
sdaguefungi: so on the phantom merge resets13:38
anteayaso how to unwind 73433?13:38
jpichTo resolve it, should I suggest to the author to resubmit the whole series, double-checking the ChangeId are present/correct etc?13:38
sdagueis there something we can do to either provide priority to the mirror job (nice / ionice) or increase the timeouts13:39
fungijpich: right, they could probably cherry-pick each change onto the next, starting from the tip of master, and then push the resulting topic branch13:39
fungisdague: maybe... how do you do that to specific java processes within a common jvm?13:39
sdaguebecause that seems to represent an actual majority of resets right now13:39
sdaguefungi: it's all inside one process?13:39
sdagueyeh, bummer13:40
fungiwell, it depends on which server you're talking about13:40
fungiwithin the ubuntu virtual machine it's all one process13:40
jpichfungi: Thanks, I'll pass on the suggestion13:40
fungiwithin the java virtual machine there are many processes13:40
sdaguefungi: yeh, os level process :)13:40
sdaguebasically something nice would work on13:40
fungisdague: right. someone thought it would be amusing to write a large code review system in java13:40
fungijoke's on us apparently13:41
jaypipessdague: can't find any issues with any of those patches. you and zigo did nice work. :)13:41
sdaguejaypipes: thanks much13:41
sdaguezigo / dripton_: so with jaypipes +1, I'm cool with all that being pushed in13:41
*** pdmars has joined #openstack-infra13:41
anteaya* cue Ryan_Lane and phabricator13:41
zigojaypipes: I was fearing that we miss some of the table.quote things not covered by the unit tests.13:41
jaypipeszigo: not that I could see.13:42
zigoAnd yeah, thanks.13:42
sdagueanteaya: yeh, then we just trade known issues for unknown issues13:42
zigoLet's wait until Jenkins tests it ...13:43
sdaguezigo: yep13:43
fungisdague: so we have a couple options to patch zuul for the "phantom resets". one is to just increase the timeout, which kicks the can down the road (but maybe far enough?) and increases the time it takes zuul to decide whether a change really merged to gerrit. the other is to stop checking whether the change replicated to gerrit's local mirror, but after pondering that some more i think there's a new13:43
fungirace condition that increases our chances of hitting13:43
anteayayeah, java for php13:43
sdaguefungi: so I honestly think increasing the timeout is fine13:44
sdaguewhat's the current timeout?13:44
fungi60 seconds13:44
sdagueso given that git review often takes me 35 seconds13:44
sdagueI'm not entirely surprised that is hitting that from time to time13:45
anteayaif we make a change is it a gerrit restart?13:45
sdaguemy instinct would be to bump that to 300 seconds13:45
sdagueand call it a day13:45
*** nosnos_ has quit IRC13:45
fungiapparently from the time zuul virtually "clicks" the submit button in gerrit to the time gerrit actually replicates that merged ref to its local mirror in /var/lib/git can take more than 60 seconds for nova commits13:45
sdaguebecause losing a minute here or there is much better than a reset13:45
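The trade-off under discussion (a fixed 60-second wait versus a longer one) can be sketched as a poll-until-deadline loop. This is only an illustration in Python, not Zuul's actual code: `wait_for_replication`, `FakeMirror`, and the injectable `clock` and `sleep` parameters are hypothetical names, and the 90-second replication delay is an assumed figure.

```python
import time
from typing import Callable


def wait_for_replication(
    ref_is_replicated: Callable[[], bool],
    timeout: float = 300.0,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the merged ref is visible in the mirror or the timeout expires."""
    deadline = clock() + timeout
    while True:
        if ref_is_replicated():
            return True   # merge confirmed, no reset needed
        if clock() >= deadline:
            return False  # caller would treat the change as failed
        sleep(interval)


class FakeMirror:
    """Hypothetical mirror that only catches up after `delay` seconds."""

    def __init__(self, delay: float):
        self.now = 0.0
        self.delay = delay

    def clock(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds  # advance simulated time instead of really waiting

    def replicated(self) -> bool:
        return self.now >= self.delay
```

With a mirror that needs 90 simulated seconds, a 60-second timeout reports failure even though the change did land, while a 300-second timeout confirms the merge after waiting only as long as replication actually took.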
*** krtaylor has quit IRC13:46
sdagueor is that going to just cause chaos13:46
anteayaif we bump to 300 seconds can we time how long it takes, so that after the next day or so, we can fine tune?13:46
sdaguealso, when was the last gc on the nova tree there?13:47
dstanekwhat is the litmus test for whether a check belongs in hacking or upstream in pep8?13:48
*** akscram has joined #openstack-infra13:48
sdaguedstanek: if you can get it into upstream pep8 :)13:48
dstaneksdague: so basically asking Johann if he'll accept?13:49
*** nosnos has quit IRC13:49
sdaguedstanek: what's the change in question13:49
*** thuc has joined #openstack-infra13:49
*** nosnos has joined #openstack-infra13:50
dstanekthis one triggered my question -
*** thuc_ has joined #openstack-infra13:50
openstackgerritJeremy Stanley proposed a change to openstack-infra/zuul: Increase replication timeout to 5 minutes
fungisdague: so that's what a change to the hard-coded replication timeout looks like ^13:51
dstanekbut i'm creating a bunch more -
fungisdague: ideally if we go that route, we should make it configurable13:51
sdaguedstanek: so if it's in pep8 the document13:51
sdagueit should be in pep8 the code13:51
dstaneksdague: i think we're talking about adding an additional 10 - 15 checks into keystone, but i'd like to have a path to transition those things to pep8 or hacking13:52
*** sarob_ has joined #openstack-infra13:52
*** dkranz has joined #openstack-infra13:52
fungisdague: what we discussed yesterday was dropping the replication check entirely, but i'm starting to wonder if that would increase the chance that jobs updating (non-zuul, branch) refs from remote git mirrors test on something older than actual master13:52
sdaguefungi: oh, so this requires a zuul restart?13:52
fungisdague: absolutely13:52
fungisdague: which is why we were putting it off until after i-313:52
*** julim has joined #openstack-infra13:52
sdagueI wasn't sure if this was zuul timing out, or gerrit timing out13:52
fungiall zuul all the time13:53
sdaguedoes zuul have a config file that is continuously reparsed?13:53
sdagueso we could land a timeout adjustment without a restart13:53
*** zehicle_at_dell has joined #openstack-infra13:53
fungisdague: we could land a change to make this value configurable, restart zuul, then adjust it from configuration in the future13:54
*** nosnos has quit IRC13:54
*** thuc has quit IRC13:54
*** alaski_ is now known as alaski13:54
sdaguebecause if it's always going to be a zuul restart, then it's cleaner, but not more useful13:54
fungiif we stick with doing this check, then yes i think we should add a knob for it and make sure it's adjusted on reconfig13:54
dstaneksdague: if some of the changes are not accepted into pep8 should i try hacking? or just keep them in keystone?13:55
fungisdague: 78222 is just a straw man mostly intended to point out where this value is buried13:55
sdaguefungi: yep13:55
sdaguedstanek: honestly, if you can't get upstream pep8 folks to accept pep8 things, there may be a reason13:56
*** sarob_ has quit IRC13:56
sdagueso, anyway, I'd say start there13:56
sdaguethen reevaluate13:56
*** gokrokve has quit IRC13:57
*** eharney has joined #openstack-infra14:00
*** mfink has joined #openstack-infra14:00
*** salv-orlando has joined #openstack-infra14:03
sdaguefungi: so unless you are checking on something else on fire - would be good to get in. So I don't break hooks again by accident14:07
*** jeckersb_gone is now known as jeckersb14:08
sdaguealso, any thoughts on if the gc would help?14:09
*** amotoki has quit IRC14:09
*** lahoucine_ has quit IRC14:12
sdagueok, under / over, does the nova change cause a reset?14:14
*** dhellmann_ is now known as dhellmann14:14
*** saju_m has quit IRC14:15
* ttx watches too14:15
*** zns has joined #openstack-infra14:15
sdaguebecause I think if we are really resetting on every nova change14:15
sdaguewe need to take a zuul restart14:15
SergeyLukjanovsdague, re
ttxsdague: the next nova change is pretty far down the queue though14:15
anteayanope it went in14:15
sdagueit seems to have made it14:15
SergeyLukjanovsdague, will devstack start anyway?14:16
sdagueSergeyLukjanov: yes14:16
sdagueSergeyLukjanov: no, we replaced gate_hook14:16
znsHi - any gerrit admins around? Could you add ziad-sawalha to satori-core?14:16
anteayahi zns14:17
anteayawe have a few very important items on the go right now14:17
anteayabut when there is a moment and we have some time to breathe, someone will read the backscroll and add you as satori-ptl14:17
SergeyLukjanovsdague, looking on devstack-gate now, the line I'm worried about is and then just "else" and exec under else14:18
sdagueI'm just excited my devstack changes that might fix sqlalchemy tempest run is in there14:18
*** bknudson has quit IRC14:18
*** mriedem has joined #openstack-infra14:18
znsanteaya: ok, thanks.14:19
ttxnice pile of greens at top of gate right now, if those two tempest changes pass14:19
sdaguewe'll never run devstack-vm-gate.sh14:19
SergeyLukjanovsdague, got it, thx for explanation14:19
fungisorry, was busy annotating the replication timeout change14:19
SergeyLukjanovsdague, => your change lgtm14:20
fungisdague: running check-dg-hooks-dsvm voting but only in the check pipeline?14:22
ttxback under 15h, phew14:22
sdaguefungi: that should be enough to block issues14:22
anteayafungi: for later when lahoucine returns to the channel he is after some gerrit db clean up on his account:
fungisdague: gc where... for gerrit's repos and local mirrors? we already repack them weekly14:22
sdaguethough I'm not sure if that was intentional or not :)14:22
sdaguefungi: right, but it's been a busy week14:23
sdagueand I wonder if doing that on nova in /var/lib/git might speed things up14:23
sdagueas a work around14:23
sdagueor hot fix14:23
sdaguejust to reduce the chance of phantom reset14:23
ttxnext nova change in #8 slot14:25
*** gokrokve has joined #openstack-infra14:25
fungisdague: well, i'll do a catch-up repack on /var/lib/git and pack-refs too (we do the latter on the remote mirrors but not on the local mirror)14:25
sdaguefungi: cool14:26
*** gokrokve has quit IRC14:28
anteayaI was planning on being here all morning though I had to leave this afternoon for an appointment, that is until I broke a filling on a tooth last night14:29
anteayagood news is I got an appointment this morning to fix it14:29
anteayabad news is I got an appointment this morning to fix it14:29
ttxfungi: the post pipeline seems to be struggling for resources14:29
fungianteaya: teeth are nothing to trifle with. the sooner the better14:30
ttxfungi: is that expected ?14:30
fungittx: the post pipeline has a lower priority than the gate pipeline14:30
ttxmy scripts timeout after &5 minutes waiting for a tarball build14:30
ttxfungi: ok14:30
fungittx: post and check get to fight it out for resources, gate trumps them14:30
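The ordering fungi describes, where gate work is served first and post and check compete on equal footing, can be sketched with a priority queue. This is a rough Python illustration and not how Zuul or nodepool actually allocate nodes: the `PRECEDENCE` values, the `NodeRequestQueue` class, and the job names are all assumptions made for the example.

```python
import heapq
import itertools

# Assumed precedence values: lower number is served first.
PRECEDENCE = {"gate": 0, "check": 1, "post": 1}


class NodeRequestQueue:
    """Hypothetical queue handing out test nodes by pipeline precedence."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # keeps FIFO order within a precedence

    def submit(self, pipeline, job):
        entry = (PRECEDENCE[pipeline], next(self._counter), pipeline, job)
        heapq.heappush(self._heap, entry)

    def next_job(self):
        _, _, pipeline, job = heapq.heappop(self._heap)
        return pipeline, job
```

Under this model a tarball build submitted to post waits behind every queued gate job, which matches the slow post pipeline ttx observed while the gate was deep.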
anteayaso I am disappearing and won't be back all day14:31
anteayasorry about that14:31
anteayafungi: thanks for understanding14:31
fungianteaya: good luck!14:31
*** gokrokve has joined #openstack-infra14:33
openstackgerritA change was merged to openstack-infra/config: add job for testing that hooks work
*** thuc has quit IRC14:34
openstackgerritsahid proposed a change to openstack-infra/config: Configure check and gate to stackforge/warm
jomarahi, i am the guy that broke gerrit with a 5-deep nested horizon patch set14:38
*** ildikov_ has quit IRC14:38
fungijomara: it looks like one of the commits didn't make it onto gerrit. were you using git-review? any special command-line options (maybe -R)?14:39
jomarafungi: just basic git review14:39
jomarafungi: no special options that i remember14:39
jomarafungi: i had to fix some things in patch 1 & 2 (where 1 is the first patch), so i got a fresh master, cherry picked them one at a time (fixing them along the way, where necessary), and git-reviewed each time14:40
*** dizquierdo has quit IRC14:40
jomaraso master -> cherry pick a.1 -> modify to a.2 -> git review -> cherry pick b.1 -> modify to b.2 -> git review, etc14:40
*** dizquierdo has joined #openstack-infra14:42
jomarafungi: jpich recommended that i do that process again, which i am happy to do14:43
*** sarob_ has joined #openstack-infra14:43
jpichjomara: when you cherry-picked b.1, were you still on top of a.2?14:44
fungijomara: here's what i see in the git log when i checkout the one which got approved but failed to merge
mordredmorning all14:45
fungijomara: that second commit (the parent) somehow didn't make it onto gerrit14:45
fungijomara: the commit sha (530ecf882dd89a8058a9f78fb01cf2fe953658d8) doesn't correspond to any of the patchsets for change id I87ebd7feaeddcfe7fe76c92b075c53de7016846b14:46
jomarathat is...odd14:47
jomaralet me walk through the history14:47
SergeyLukjanovmordred, o/14:47
fungijomara: that suggests that you somehow managed to modify your commit for I87ebd7feaeddcfe7fe76c92b075c53de7016846b after you ran git review on it14:47
*** sarob_ has quit IRC14:47
*** rlandy_ has joined #openstack-infra14:48
fungiso probably either a commit --amend or rebase or something along those lines14:48
mordredSergeyLukjanov: how're things?14:48
SergeyLukjanovmordred, mostly fine, trying to make a checklist for renaming ;)14:48
jomarafungi: i was rebasing and amending quite a bit to fix some stuff, so it's very possible that i made a mistake14:49
SergeyLukjanovmordred, but gate is quite slow14:49
jpichjomara: FWIW when I have to push dependent patches, I use git review only once at the end. It first confirms with you that you're about to upload several patches. Perhaps it would help in a similar situation in the future14:49
jomarajpich: the problem is that i worked on these over a several week period14:49
fungijomara: i always do the same thing jpich is suggesting, so i agree14:49
*** jnoller has joined #openstack-infra14:50
*** rlandy has quit IRC14:50
*** mgagne has joined #openstack-infra14:51
fungizns: you're now the initial member of,members and,members14:51
jpichjomara: Yes, of course. When you start updating one patch in the middle though, the others usually need a freshening up too or their dependency tree shows as "Outdated" - so most will need to be uploaded too14:51
*** thomasem has joined #openstack-infra14:52
jomarajpich: in this case, i started from the first patch though14:53
jomarawhat if i :14:53
*** dcramer__ has joined #openstack-infra14:53
jomaracheckout master again14:53
jomaraapply 1-5 fixed patches14:53
jomaragit review *once*14:53
jomarado you think that will fix the situation?14:53
mordredSergeyLukjanov: woot! I love it when gate is quite slow at FF14:53
jpichjomara: Then the dependencies should be handled correctly by gerrit - IMO. Make sure the ChangeId all remain the same :)14:54
*** ildikov_ has joined #openstack-infra14:54
jomarajpich: the changeid should be in the commit message and not get changed when i cherry pick it right? how would it get changed?14:54
SergeyLukjanovmordred, 13h is probably too slow :)14:54
jpichjomara: Magic! But you're right there's no reason it should change.14:55
fungijomara: you're right, they shouldn't change unless you do something to change them by editing the commit message14:55
fungijomara: so if you do modify commit messages on them, just make sure not to alter/delete the change-id line and make sure it stays in the final paragraph of the commit message14:56
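The rule fungi describes (keep the Change-Id line, and keep it in the final paragraph of the commit message) can be checked mechanically. A minimal sketch, assuming the usual "Change-Id: I" plus 40 hex characters format seen earlier in this log; the helper name is hypothetical and is not part of git-review or Gerrit:

```python
import re

# Assumed format: "Change-Id: I" followed by 40 lowercase hex characters.
CHANGE_ID_RE = re.compile(r"^Change-Id: I[0-9a-f]{40}$")


def change_id_in_final_paragraph(message: str) -> bool:
    """Return True if the last paragraph of the message has a Change-Id line."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", message.strip()) if p.strip()]
    if not paragraphs:
        return False
    last = paragraphs[-1].splitlines()
    return any(CHANGE_ID_RE.match(line.strip()) for line in last)
```

Running this over each commit message in a topic branch before git review would flag a Change-Id that was deleted or pushed out of the footer while editing.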
*** rlandy__ has joined #openstack-infra14:56
mordredSergeyLukjanov: oh. nevermind. wrong kind of slow :)14:56
SergeyLukjanovmordred :)14:56
*** zns has joined #openstack-infra14:58
*** rlandy_ has quit IRC14:58
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Load storyboard projects from projects.yaml
*** homeless has joined #openstack-infra14:58
mordredNikitaKonovalov, SergeyLukjanov: ^^ that look right to you?15:00
jomaraok, im going to walk through this right now15:00
SergeyLukjanovmordred, looking now15:00
jomarafungi: would you mind giving me a sanity check once i have it ready15:00
*** jeckersb is now known as jeckersb_gone15:00
*** resker has joined #openstack-infra15:00
znsfungi: thank you!15:00
SergeyLukjanovmordred, should we add "depends on" ```exec { 'load-projects-yaml':``` and something else... like storyboard starting?15:02
fungijomara: i can, though gerrit will provide you with one as well... just look at the "dependency" section in the webui and follow the links backwards and forwards to make sure the relationships of your changes are reflected appropriately15:02
jomarafungi: ok so15:03
jomaramy 1st change is already merged15:03
jomarashould i do this with just 2-4?15:03
fungijomara: it didn't actually merge did it? which change?15:03
mordredSergeyLukjanov:  don't think we need it started ... but I do think we should probably run migrations first if we need to run them15:03
jomarafungi: i just pulled from ... - my commit is the latest15:04
*** eharney has joined #openstack-infra15:04
jomara2014-01-14 10:25 Jordan OMara       o │ [h1] [gerrit/master] [jsomara/heat-stack-update] Heat Stack update view/fo~15:04
fungijomara: oh, okay, not the one which failed to merge then... a different one15:04
jomarafungi: it must have been the 2nd or 3rd one15:04
jomarafungi: i'll just start with 2, and do 2-515:04
jomarasame process though15:04
fungijomara: so yes, i suggest just cherry-picking the others which aren't in master yet15:04
SergeyLukjanovmordred, yup, looks like migrations should be run first to not15:05
denis_makogonGuys, goodday, i need someone who could help me with issue explanation15:07
openstackgerritMonty Taylor proposed a change to openstack-infra/config: Load storyboard projects from projects.yaml
jeblairfungi: good morning. there's a lot of scrollback; anything i should know?15:08
denis_makogon |||| - i'm seeing this issue15:08
mordredmorning jeblair15:10
sdaguedenis_makogon: interesting15:10
fungijeblair: is a straw man change for increasing the replication timeout value in zuul, along with some musings on why removing that check could be detrimental after all15:11
sdaguefungi: check out that logstash15:11
denis_makogonsdague, how is this even possible?15:12
sdaguethat's a good question15:12
openstackgerritA change was merged to openstack-infra/gerrit: add a new .gitreview file
SergeyLukjanovmordred, #2 lgtm15:13
fungisdague: what's the concern there?15:13
SergeyLukjanovmordred, # considering my poor puppet knowledge15:14
*** sandywalsh has joined #openstack-infra15:14
sdaguefungi: so what is otherwise a successful run -
sdaguedies on a sudo issue at the end in the jenkins cleanup15:15
jeblairsdague: works as intended then15:15
jomarafungi: if you ACK me i will git-review that15:15
*** jeckersb_gone is now known as jeckersb15:15
jeblairsdague: it's never okay for a unit test suite to "sudo".15:15
fungijomara: i'm hoping i have a moment to take a look once i'm done dealing with broken things15:15
jomarafungi: np15:15
sdaguejeblair: is that what it's catching?15:16
jeblairsdague: yep.15:16
sdagueok, interesting15:16
sdagueI guess it's hard to understand that from the output15:16
*** CaptTofu has joined #openstack-infra15:16
jeblairsdague: people try to do it so often (especially neutron) that we have a special test for it in the tox wrapper15:16
fungisdague: the only thing i find mildly concerning is that there were three hits for it in the gate pipeline for the past week, in gate-trove-python27 running "sudo innobackupex --stream=xbstream --incremental --incremental-lsn=54321 /var/lib/mysql"15:17
sdaguejeblair: do you think the unit test actually did that?15:17
jeblairsdague: it's supposed to output a message if it fails15:17
sdagueyeh, it's just dumping a line without an explanation15:18
fungisdague: given that this is trove tests and sudoing commands having to do with mysql server management, i'm pretty certain the unit tests are trying to do that15:18
jeblairsdague: i think it might be because of a "set -e" further up in the script15:19
denis_makogonso, it means that unit test trying to perform manipulations over real server ?15:19
fungidenis_makogon: yeah, it looks like something one or more of the trove unit tests do ends up calling sudo15:20
jeblairsdague: so it aborts before printing the error15:20
openstackgerritsahid proposed a change to openstack-infra/config: Configure check and gate to stackforge/warm
denis_makogonfungi, and since jenkins is a java-based application, "/usr/bin/killall java" breaks it15:21
openstackgerritSean Dague proposed a change to openstack-infra/config: be more explicit about why we are failing on sudo check
sdagueso how about that15:23
*** pblaho has joined #openstack-infra15:23
sdaguejust to be more explicit15:23
sdaguedenis_makogon: no, it means the jenkins user can't sudo15:23
fungidenis_makogon: yeah, that (if it were allowed, which it isn't because the jenkins user doesn't have sudo access on the machines where that ran) would in theory kill the jenkins slave agent running there, disrupt communication with the jenkins master and cause weird nasty jenkins errors instead of getting you a real error from the slave, possibly15:24
sdagueand we fail tests that try to15:24
denis_makogonsdague, you're awesome =)15:24
jeblairsdague: no you don't understand...15:24
jeblairsdague: there already is an explanation15:24
jeblairsdague: someone broke it by adding a "set -e" higher up in the script15:24
jeblairsdague: in the "" script15:24
sdaguejeblair: oh15:24
jeblairsdague: that's what needs to be fixed15:24
ociuhanduhi all15:24
ociuhanduIn the last couple of days we started seeing in the hyper-v CI a high occurrence of the following error in nova-compute.log: "RequiredOptError: value required for option: lock_path" (on tempest runs)15:24
ociuhanduthe only thing we found related to this error is
fungioh, jeblair beat me to it15:26
ociuhanducan anyone help us on this?15:26
*** yamahata has joined #openstack-infra15:27
*** krotscheck has quit IRC15:27
SpamapSfungi: hi. So we think we finally chased down our hardware issues to a combination of the bad motherboard _AND_ a buggy NIC driver....15:27
*** coolsvap has joined #openstack-infra15:27
fungiociuhandu: you might try asking in #openstack-nova15:28
sdaguefungi: yep15:28
ociuhandufungi: ok, thanks15:28
sdagueso did clarkb add that to catch a testr fail?15:28
openstackgerritSean Dague proposed a change to openstack-infra/config: remove inline set -e that is preventing explanations
denis_makogonsdague, so, this issue related to project or to the jenkins ?15:29
sdaguedenis_makogon: your project15:29
sdagueit needs to not call sudo in unit tests15:30
denis_makogonhm, ok, thanks15:30
fungijomara: that looks fine, i suppose. the real test will be checking what gerrit does with it once you run git review15:30
jomarafungi: ok, lets do this!15:32
jomaraok, this looks correct15:32
jomara4 patches15:32
openstackgerritSergey Kolekonov proposed a change to openstack-infra/jenkins-job-builder: Ruby metrics plugin support added
*** mdenny has joined #openstack-infra15:35
*** mgagne1 is now known as mgagne15:35
*** krotscheck has joined #openstack-infra15:37
openstackgerritA change was merged to openstack-infra/storyboard: Auth controller
jeblairfungi: responded to the zuul patch (i still think it's safe)15:37
jeblairto remove15:37
jomarafungi: this is weird, i think the dependency chain is correct, but it added back some -1 reviews to one of the patches that have been resolved15:38
sdaguefungi: so on I think if set -e has other side effects, those need to be caught. Because it won't do what is really intended15:38
fungijeblair: oh, i should review devstack-gate. for some reason i thought we pulled zuul refs from the zuul mergers and other refs from the git mirrors15:38
jeblairfungi: we do -- but in what circumstance is that wrong?15:39
*** wenlock has joined #openstack-infra15:39
fungijeblair: i tried to explain the race there in my comment, but maybe i wasn't clear (or maybe i'm just wrong)15:40
jeblairfungi: that's what i meant by "if it's participating in the gate stack" we'll pull it from zuul... anything that isn't participating in the gate stack we won't, but it doesn't matter because that means it's not about to change in a racy way15:40
jeblairfungi: oh, it's about the window15:40
fungijeblair: more about how we might start testing a change which depends on a change in another project which just merged, but there are no other changes ahead for the same project so there's no zuul ref which incorporates that now-merged change15:41
fungiwhich might not yet be on the mirrors when we pull the branch from them15:41
*** krotscheck has quit IRC15:42
fungiand the most common example of "depends on a change in another project" is when we have an integration-test-impacting bug fix which we've promoted in the gate15:42
*** ryanpetrello has joined #openstack-infra15:43
fungiwe might see changes behind it which haven't started testing yet (either because they're beyond the window or because they coincidentally got approved at the same moment) fail on the bug for which we just merged a fix15:43
jeblairfungi: understood15:44
*** david-lyle has joined #openstack-infra15:44
fungii'm not saying that's necessarily likely, nor that it can't happen already now under the right circumstances, just pointing out that by removing the check we *in theory* increase our chances of losing that race15:45
*** pcrews has joined #openstack-infra15:45
fungior winning, for some definitions of winning15:45
jeblairfungi: yeah, since we're checking 1 out of 7 mirrors we replicate to, basically we're increasing the chances because we're removing a "sleep" statement.15:46
jeblairbecause the check itself is ineffective15:46
jeblairit's just the time the check takes that may reduce the chances of what you describe15:46
*** resker has quit IRC15:47
*** esker has joined #openstack-infra15:48
jeblairfungi: i do believe the window affects this because it synchronizes the start of a new change with the merging of an old one; so it actually _increases_ the chance of this15:48
dtroyerclarkb: here yet? really is wedged on the no-newlines in the existing repo.  / looks like it checks the existing file and fails (the current situation) rather than just using what is in the repo and checking the update.15:49
dtroyerto refresh:  the original requirements.txt and test-requirements.txt files didn't end with newlines.  any attempt to change them fails in / with a 'file doesn't end in newline' error15:50
jeblairfungi: one solution might be to keep the merged change around for a little bit and continue to use it to compose changes15:50
*** Sukhdev has joined #openstack-infra15:51
dtroyercan we push in that fix manually or shall I disable that job temporarily?15:51
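The condition dtroyer is hitting (a requirements file whose last line lacks a trailing newline) is simple to detect and repair. A minimal sketch with hypothetical helper names; it is not the actual check script used by the job, and the package line in the usage note is made up for illustration:

```python
def missing_final_newline(text: str) -> bool:
    """Return True for non-empty text that does not end with a newline."""
    return bool(text) and not text.endswith("\n")


def with_final_newline(text: str) -> str:
    """Return text with exactly one newline appended if it was missing."""
    return text + "\n" if missing_final_newline(text) else text
```

For example, with_final_newline("example-package>=1.0") returns the same line terminated by a newline, while text that already ends in a newline is returned unchanged.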
fungijeblair: agreed on the window making this more likely. as for it being purely a sleep statement, not really. we at least know we waited long enough for gerrit to process the submit and start replicating, so it's a sort of dynamic/adaptive sleep15:51
fungisince as we've seen it varies depending on the complexity of the project to which we merged the commit15:52
jeblairfungi: we always wait for it to process the submit -- the check only causes us to wait until 1 out of the 7 replicas completes updating15:52
jeblairfungi: it might be the first or the last of the 715:52
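jeblair's point about checking 1 of the 7 replicas can be made concrete. A minimal sketch, assuming a mapping from mirror name to the branch head each mirror currently reports; the mirror names and the function are hypothetical and this is not zuul's code:

```python
def replicated_on(mirror_heads: dict, sha: str, mirrors: list) -> bool:
    """Return True if every named mirror already reports sha as its head."""
    return all(mirror_heads.get(name) == sha for name in mirrors)


# Hypothetical state just after a merge: only one mirror has caught up.
heads = {"git%02d" % i: "old" for i in range(1, 8)}
heads["git03"] = "new"
```

With this state, replicated_on(heads, "new", ["git03"]) is true while replicated_on(heads, "new", list(heads)) is false, so a job that pulls from any other mirror can still see the old head even though the single-mirror check passed.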
fungijeblair: oh, i completely agree15:52
fungijeblair: and yes, keeping merged changes around for a little while and making zuul refs out of them would likely mitigate this to a great extent while improving gating performance, though it sounds non-trivial to implement15:53
*** zhiyan is now known as zhiyan_15:53
jeblairfungi: yeah, a bit.  just brainstorming.15:53
fungianyway, we know what we are doing now is suboptimal. we're weighing unknowns on drawbacks for the solutions, so i suppose we can just try something and see if it's better/worse than what we're doing now15:55
*** thedodd has joined #openstack-infra15:55
*** esker has joined #openstack-infra15:55
jeblairi'm certainly okay increasing the timeout and restarting immediately if we think this is impacting i315:55
*** krotscheck has joined #openstack-infra15:56
fungithat's a question for sdague/ttx... ^15:56
jeblair(also, i think that the exact problem you describe could happen on a false gate reset, because again, it's synchronizing job starts with a merge-related replication race)15:56
ttxjeblair: maybe we could do that just after the next  reset15:56
fungijeblair: yep, no argument there. it definitely could15:57
ttxjeblair: I have a number of changes coming to the top of the gate that would trigger new I3 cuts when they complete15:57
*** chandan_kumar has joined #openstack-infra15:57
*** rlandy__ is now known as rlandy15:58
*** amotoki has joined #openstack-infra15:58
*** skraynev is now known as skraynev_afk15:58
openstackgerritThierry Carrez proposed a change to openstack-infra/storyboard: Remove Branch and Milestone legacy tables
ttxjeblair: looks like we just had a reset though16:01
ttxjeblair: so i guess restarting now would have limited impact ?16:01
jeblairttx: we haven't approved the change yet, so we're still about 30 minutes from being able to restart it with it.  was mostly looking for whether you wanted to do it at all.16:02
openstackgerritJeremy Stanley proposed a change to openstack-infra/zuul: Stop waiting for Gerrit commit replication
fungiso there's ^ the alternate approach16:02
ttxjeblair: i blame the reset-after-merge for having built that gigantic backlog16:03
ttxjeblair: would definitely like to see it gone, and pushing it in just after a reset should limit the delays16:03
*** jaypipes has quit IRC16:03
*** jergerber has joined #openstack-infra16:04
jeblairfungi: i think if we're in a hurry, then we should approve the 300sec change to minimize functional changes right now16:04
fungijeblair: i could go either way, but if you have a more immediate preference for the timeout increase then let's move forward with it for now16:05
jeblairfungi: yeah, that will change less of zuul's behavior and give us time to think about whether there's a better way to solve your concern16:06
openstackgerritA change was merged to openstack-infra/storyboard-webclient: Auth support
*** krotscheck has quit IRC16:06
*** krotscheck has joined #openstack-infra16:08
*** jcoufal has joined #openstack-infra16:10
jeblairdtroyer: should probably make that job nonvoting for now16:14
dtroyerjeblair: roger16:14
*** freyes has quit IRC16:16
*** mwagner_lap has joined #openstack-infra16:17
*** sandywalsh has quit IRC16:17
*** zehicle_at_dell has quit IRC16:21
*** krotscheck has quit IRC16:21
*** vkozhukalov has joined #openstack-infra16:21
*** rossella-s has quit IRC16:21
*** esker has quit IRC16:23
*** esker has joined #openstack-infra16:24
*** rfolco has quit IRC16:24
*** rfolco has joined #openstack-infra16:25
*** rossella-s has joined #openstack-infra16:25
*** andreaf has quit IRC16:25
openstackgerritDean Troyer proposed a change to openstack-infra/config: Temporarily no-vote the requirements check for openstacksdk
*** resker has joined #openstack-infra16:28
*** resker has quit IRC16:28
dtroyerjeblair: is this the correct way to set that job non-voting for only openstacksdk?
*** esker has quit IRC16:28
jeblairdtroyer: no i don't think that'll work.  maybe just remove the check-requrements template completely for now.16:29
dtroyerok, will do16:29
jeblairdtroyer: then merge the fix then add it back16:29
openstackgerritDean Troyer proposed a change to openstack-infra/config: Temporarily no-vote the requirements check for openstacksdk
*** melwitt1 has joined #openstack-infra16:31
*** jcoufal has quit IRC16:31
*** sandywalsh has joined #openstack-infra16:32
*** krotscheck has joined #openstack-infra16:33
*** melwitt1 is now known as melwitt16:34
*** vogxn has joined #openstack-infra16:36
jraimsdague: ping16:36
*** esker has joined #openstack-infra16:38
*** sandywalsh has quit IRC16:38
sdaguejraim: pong16:41
*** vkozhukalov has quit IRC16:42
openstackgerritZiad Sawalha proposed a change to openstack-infra/config: Add pypy to satori checks (it's already in gates)
jraimsdague: we fixed our gate issue. Turns out it was caused by a merge of the newer oslo.logging code that broke something in a way our tests didn't catch16:42
jraimsdague: here is the fix -
sdague"2014-03-05 02:02:35.969 | Ran 0 tests in 0.000s"16:44
jraimour tests are all CloudCafe stuff right now, which doesn't run in the gate. Our plan is to start moving things over, but my understanding is that was for graduation16:44
*** sarob_ has joined #openstack-infra16:44
jraimsince it is a non-voting job right now16:44
sdaguejraim: but that doesn't actually demonstrate the service actually responding16:44
jraimsdague: or are you just looking for 1 or 2 sanity tests?16:44
sdaguejraim: right, the point of this is to know the service can come up do something16:45
sdagueand right now there is actually no validation of that16:45
jraimwe included some checks in our devstack exercises and such, just nothing in tempest16:45
sdaguethat gate job doesn't actually tell you if barbican is running16:45
sdaguejraim: this doesn't need to be tempest16:46
*** thuc has joined #openstack-infra16:46
sdaguethe post_test_hook is in your tree16:46
jraimoh okay, just something as part of the job16:46
sdagueright, show me where in that job barbican is sanity checked to be functioning16:46
jraimwe do have some checks in our exercises, we'll port those over so they run as part of the gate job16:46
*** krotscheck has quit IRC16:46
jraimokay, that seems reasonable16:46
openstackgerritA change was merged to openstack-infra/config: Temporarily no-vote the requirements check for openstacksdk
jraimcool, I'll ping you when we get it moved over16:47
*** gyee has joined #openstack-infra16:47
sdaguejraim: the reason for this whole exercise is that we have integrated projects that are barely demonstrating functioning with the rest of OpenStack, and have taken *many* cycles to get into the gate16:47
*** esker has quit IRC16:47
sdagueso we want to make sure new services, while young, are starting in a functioning way with the rest of the stack16:47
sdagueso this isn't a giant surprise that they actually need to work in this environment16:48
jraimsdague: that makes sense to me16:48
sdagueyou guys are the first test of the process, so we'll figure out how to make it more clear in the future on expectations16:48
sdagueso appreciate your patience in working through this16:48
*** openstackstatus has joined #openstack-infra16:48
jraimno problem, you guys have had a lot of patience with our changes - we appreciate it :)16:49
*** sarob_ has quit IRC16:49
jeblair (for reference�16:49
*** mrodden has quit IRC16:49
*** sandywalsh has joined #openstack-infra16:50
*** jlibosva has quit IRC16:51
jeblair (f16:52
jeblair (for referenceÃ16:52
jeblairthat was it16:52
jeblairttx: killed statusbot earlier with a unicode error.  :)16:52
jraimsdague: so poking at this a bit more, we are doing an 'it's running test' here
jraimsdague: it's being called from the post_test_hook16:54
jraimbut it doesn't seem to output anything16:54
*** dklyle has quit IRC16:54
sdaguejraim: yeh, so outputting something would be good, so we have some idea if it worked or not16:54
fungijeblair: yeah, i've got the tracebacks for a couple of ways in which encoding issues have been killing it in the open bug16:54
jraimsdague: okay, I'll poke at it16:55
sdaguejraim: and honestly, can we do at least minimal payload check16:55
fungijeblair: probably just filtering the input for potential encoding translation issues would solve that?16:55
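The input filtering fungi suggests could look roughly like the following. This is a sketch under Python 3 semantics with hypothetical helper names, not statusbot's actual code; it uses the standard "replace" error handler so that bad or truncated byte sequences never raise:

```python
def safe_text(raw: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for undecodable bytes."""
    return raw.decode("utf-8", errors="replace")


def ascii_only(text: str) -> str:
    """Replace every non-ASCII character with '?' for ASCII-only sinks."""
    return text.encode("ascii", errors="replace").decode("ascii")
```

A message cut off in the middle of a multi-byte character then decodes to text ending in a replacement character instead of raising an exception and killing the bot.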
sdaguelike our horizon exercise in devstack scrapes the page and looks for "Login"16:55
sdagueto make sure the return wasn't a stack trace16:55
jraimseems like we could patch that in pretty easily16:56
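The minimal payload check sdague describes (look for an expected string so a stack trace does not pass as success) might be sketched as below. The function name and the default marker are assumptions for illustration; the response body would come from whatever request the post_test_hook already makes:

```python
def payload_looks_healthy(body: str, marker: str = "Login") -> bool:
    """Return True if body contains marker and no Python traceback header."""
    if "Traceback (most recent call last)" in body:
        return False
    return marker in body
```

Printing the result, or exiting non-zero when it is false, would also give the job the visible output that is currently missing.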
ttxjeblair: hold steady, potential nova reset17:00
*** reaper has joined #openstack-infra17:00
*** jgrimm has quit IRC17:00
ttxsacrificing a chicken to zuul17:00
ttxalthough damage is limited, there was a reset two down17:01
fungittx: if the repack and pack-refs i did on review's /var/lib/git mirror a little while ago helps, then maybe it will replicate quickly enough17:01
ttxI think it passed17:01
*** dkliban has quit IRC17:03
fungiin unrelated news, the resize2fs for the filesystem just finished, so we have a few more terabytes of breathing room which should carry us through at least a couple more months17:03
fungiright now we're holding onto about 8.5tib of job logs from the past 6 months17:04
*** markmcclain1 has joined #openstack-infra17:04
StevenKfungi: We used to have those discussions about the Launchpad Librarian. "We have 1.5T of free space, which will hold us for 1.5 weeks."17:04
*** mrodden has joined #openstack-infra17:05
*** mriedem has quit IRC17:05
*** sabari2 has joined #openstack-infra17:05
StevenKWhen I left, we had 16TiB for the Librarian17:05
ttxfungi: what does the SKIPPED mean on the cinder change in slot #6 at the gate right now ?17:05
fungiStevenK: we can push to about 12.5 tib before we hit the kernel/xen limit on the number of virtual block devices which can be added to a guest17:05
ttxfungi: is it a sure FAIL at this point ?17:06
fungittx: hover over the colored dot next to it17:06
*** markmcclain has quit IRC17:06
ttxfungi: OK, so that one will have to go to the bottom of queue again, right17:06
ttxjust trying to keep track of what may land and what i should be waiting on17:06
*** amotoki has quit IRC17:06
fungittx: it probably merge-conflicts with another cinder change which merged ahead of it in the gate after the last time it had check jobs run17:07
*** SumitNaiksatam has quit IRC17:07
*** harlowja has joined #openstack-infra17:07
ttxthat means i can cut cinder now17:07
*** hogepodge has joined #openstack-infra17:07
*** jcoufal has joined #openstack-infra17:07
*** esker has joined #openstack-infra17:07
ttxfungi: ok, was just wondering why it did not return the failure yet17:07
*** e0ne_ has quit IRC17:08
ttxsince by now it can't be saved17:08
fungittx: zuul isn't currently smart enough to pay attention to the fact that it now merge-conflicts with the branch tip, and eject it immediately17:08
fungittx: i think it's on the to-do list17:09
*** zhiwei has quit IRC17:09
*** esker has quit IRC17:09
*** sabari3 has joined #openstack-infra17:09
*** sabari2 has quit IRC17:10
zarosdague: working on BUILD_TIMEOUT env var ->
*** sabari3 is now known as sabari17:13
*** dangers_away is now known as dangers17:13
*** dizquierdo has quit IRC17:13
*** esker has quit IRC17:14
*** derekh has joined #openstack-infra17:14
*** amcrn has joined #openstack-infra17:15
*** chandan_kumar has quit IRC17:16
*** browne has joined #openstack-infra17:16
*** amcrn has quit IRC17:17
*** browne has left #openstack-infra17:17
*** cadenzajon_ has quit IRC17:17
*** amcrn has joined #openstack-infra17:18
*** reed has joined #openstack-infra17:23
*** KurtMartin is now known as kmartin17:24
*** CaptTofu has quit IRC17:25
*** CaptTofu has joined #openstack-infra17:26
*** whoops has joined #openstack-infra17:28
*** krotscheck has joined #openstack-infra17:30
*** krtaylor has joined #openstack-infra17:30
*** CaptTofu has quit IRC17:30
*** melwitt has quit IRC17:31
*** SumitNaiksatam has joined #openstack-infra17:32
*** zns has quit IRC17:33
*** jpich has quit IRC17:33
*** vkozhukalov has joined #openstack-infra17:34
*** dkliban has joined #openstack-infra17:34
*** krotscheck has quit IRC17:36
*** esker has joined #openstack-infra17:36
*** dims has joined #openstack-infra17:37
*** blamar has joined #openstack-infra17:39
jogowho is the pypy expert around these parts again?17:45
clarkbAlex_Gaynor: ^17:45
Alex_GaynorWhat's up17:46
jogoAlex_Gaynor: oh right thanks17:46
jogoAlex_Gaynor: I just commented on [openstack-dev] [neutron][rootwrap] Performance considerations, sudo?17:46
Alex_Gaynorjogo: I'll take a look, I don't read -dev regularly17:47
jogoAlex_Gaynor: want to make sure I didn't get anything wrong in
Alex_Gaynorjogo: So we don't generally recommend RPython for anything besides writing interpreters, it's a slightly prickly language; but rootwrap is relatively constrained, so it actually might make sense17:48
*** sabari has quit IRC17:49
*** krotscheck has joined #openstack-infra17:49
jogo[Btly m17:49
*** hashar has quit IRC17:50
*** Sukhdev has quit IRC17:50
jogoAlex_Gaynor: exactly (ignore my ssh timeout above)17:51
fungi...NO CARRIER17:51
Alex_Gaynorjogo: I imagine the biggest issue that will creep up is that almost all of the stdlib is not accessible in RPython17:51
jogoif you know a little RPython ( I don't) you can take a look at and see for yourself17:53
*** mrodden has quit IRC17:53
Alex_Gaynorlogging is probably the biggest thing that jumps out; rpython has no logging library17:54
jogothat shouldn't be a big issue17:56
clarkbfungi: jeblair: catching up on sb. Is run-unittests still broken? that isn't entirely clear to me (I can hack on a fix if it needs it). And did we make a decision on what to do with zuul?17:57
fungiclarkb: weigh in on for fixing17:58
jeblairclarkb: we decided to increase the timeout in zuul, but the change hasn't merged yet17:58
*** esker has quit IRC17:59
fungiclarkb: decision on zuul is plus a restart at the soonest opportunity, then either or a fancier approach not yet fleshed-out17:59
*** jgallard has quit IRC17:59
*** CaptTofu has joined #openstack-infra18:00
*** thuc has quit IRC18:02
clarkband it looks like there are window related races because changes behind will pull tip of master which updates post merge when the window shifts?18:02
clarkbthat is a fun one18:02
*** beagles_brb is now known as beagles18:02
fungithough we basically already have that now, we just haven't seen it in action i don't think18:02
*** thuc has joined #openstack-infra18:02
fungiat least haven't witnessed it first hand anyway18:03
*** khyati has joined #openstack-infra18:03
*** resker has joined #openstack-infra18:03
* clarkb looks at git logs18:04
fungii should have done a git blame on it18:04
clarkbthen I had to readd it again later iirc18:05
jogoAlex_Gaynor: so your "this might work" would be appreciated on the ML18:05
clarkbfungi: there was no set -e in the old portion that did a similar check18:07
mgagneCould someone give a kick to gerrit bot?18:07
*** fbo is now known as fbo_away18:07
fungi1732 <-- openstackgerrit ( has quit (Ping timeout: 244 seconds)18:08
fungirestarting it18:08
mgagnefungi: thanks!18:08
fungilooks like gerritbot is still running...18:09
fungigerrit2  19137  0.1  0.0 313684 20076 ?        Sl   Mar04   4:23 /usr/bin/python /usr/local/bin/gerritbot /etc/gerritbot/gerritbot.config18:09
clarkbsdague: ^18:09
fungistrace says "futex(0x18011c0, FUTEX_WAIT_PRIVATE, 0, NULL" and just sits there18:09
fungihaving a look in the logs18:10
clarkbfungi: however that may only be necessary for the old foundcount variable18:10
*** zns has joined #openstack-infra18:11
clarkbfungi: I have +2'd sdague's change. Inline comment explains why I Think it is ok18:11
clarkbthat said18:11
clarkbhow did set -e trip if that worked as expected?18:12
fungiclarkb: okie dokie. i will likely follow suit once i read it18:12
clarkbgah I should think these things before I +218:12
clarkbfungi: what caused set -e to errexit in the first place?18:12
clarkbwe probably need to guard against that18:12
clarkbso ya +218:12
jogosdague: have you been staring at the gate recently?18:12
fungihere's the last thing i find in gerritbot's log...
fungiseeing if there's anything else more enlightening in the debug log18:13
clarkbI hear a mordred around the corner18:13
jomarafungi: i think everything is good with my set of patches EXCEPT a mistake i made merging patch 4 (of 5); i need to fix it and reupload (just a conflict resolution error), but *WHAT* should i run git review on? P5, with P4 correct in its history?18:13
fungijomara: if you still have the topic branch you were using, just rebase -i <commit_before_the_one_you_want_to_change>18:14
fungijomara: then change the "pick" next to the change you need to modify to "edit" instead18:15
fungijomara: then make the modifications you need to that change and do git commit -a followed by git rebase --continue18:15
fungijomara: after that you can 'git log -p' to see whether the changes look the way you intend18:16
jomarafungi: oh awesome18:16
*** Ryan_Lane1 has quit IRC18:16
*** bhuvan_ has joined #openstack-infra18:16
*** bhuvan has joined #openstack-infra18:16
fungijomara: and if they look right, run 'git review' and say yes to the multi-change prompt. it will list all the changes which are still outstanding but will actually only update gerrit with the ones you changed (the edited change and any after it)18:17
fungijomara: that will properly preserve the dependency chain in gerrit18:17
*** malini has joined #openstack-infra18:18
jomarafungi: wow! excellent18:19
fungijomara: in general, git rebase -i is a great tool for safely adjusting the order or content of changes in a topic branch (you can even squash, remove or insert changes that way too)18:19
clarkbjust be careful you don't end up with an extra goto fail >_>18:19
*** Ryan_Lane1 has joined #openstack-infra18:19
fungijomara: experiment with it when you get time, but yeah it's advanced git voodoo18:20
jomarathat probably would have been superior to my method of "rebasing"18:20
jomarahad i done that from the start18:20
fungisince those are the best way to save your bacon when you mess up18:21
jomarafungi: to be 100% clear so i dont make mistakes, i rebase -i from the TOP commit, or from the commit in question(that needs edit)18:21
clarkbfungi: re gerritbot, should we catch ServerNotConnected in run() and just reconnect if that happens?18:21
fungiclarkb: i think that would be a good idea18:21
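The idea clarkb raises here (catch the "not connected" error in the bot's run loop and reconnect) can be sketched roughly as below. This is a minimal illustration and not gerritbot's actual code: `ServerNotConnectedError` is a hypothetical stand-in for whatever exception the IRC library raises, and `process_once`, `connect`, `max_attempts` and `delay` are assumed names invented for the example.

```python
import time


class ServerNotConnectedError(Exception):
    """Hypothetical stand-in for the IRC library's 'not connected' error."""


def run_with_reconnect(process_once, connect, max_attempts=5, delay=0.0):
    """Call process_once() until it succeeds, reconnecting on failure.

    Returns the number of reconnects that were needed. Once max_attempts
    reconnects have been tried the error is re-raised, so a server that
    never comes back cannot cause an infinite loop.
    """
    reconnects = 0
    while True:
        try:
            process_once()
            return reconnects
        except ServerNotConnectedError:
            if reconnects >= max_attempts:
                raise
            reconnects += 1
            time.sleep(delay)  # optional pause before trying again
            connect()
```

Bounding the retries is a design choice for the sketch; a long-running bot might prefer to retry forever with a growing delay instead.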
*** mriedem has joined #openstack-infra18:22
fungijomara: rebase on the parent sha of the commit you want to change18:22
malinihello..I am trying to add a new job for marconi in devstack's experimental queue. can somebody point me to a good example?18:22
*** gokrokve has quit IRC18:22
clarkbmalini: will this job need to do things like patch devstack/tempest before the job runs?18:22
clarkbs/before the job/before the tests/18:23
*** mriedem1 has quit IRC18:23
fungijomara: also, read all the comments and output messages you get from those git commands very carefully. they contain pertinent instructions, but they're very easy to ignore18:23
*** reed has quit IRC18:23
*** reed has joined #openstack-infra18:23
fungigit is quite happy to warn you about how to proceed and then let you shoot yourself in the foot when you don't heed its warnings18:24
*** Ryan_Lane has quit IRC18:24
*** Ryan_Lane1 is now known as Ryan_Lane18:24
*** Ryan_Lane has quit IRC18:24
*** Ryan_Lane1 has joined #openstack-infra18:24
maliniclarkb: We already have an experimental job that runs tempest on devstack. I am looking for something that can be used to see the impact, on any devstack changes that I make18:25
malinihope tht makes sense18:26
*** openstackgerrit has joined #openstack-infra18:26
clarkbmalini: it does, but to find a good example I need to know if you are going to run stock devstack with stock tempest or if you need to manipulate those projects before running tests18:26
*** jcooley_ has joined #openstack-infra18:27
fungiclarkb: also worth looking into for gerritbot, i think its initscript may not be looking in the right place for the pidfile because stop and restart always complain that no pidfile was found18:27
maliniclarkb: when you say stock devstack, will it include changes in any proposed patch?18:27
clarkbmalini: yes18:28
maliniclarkb: & what would be an example of manipulating the project?18:28
jomarafungi: ACK, thanks18:28
clarkbmalini: some projects add stuff to devstack/lib/projectname since devstack won't carry those things in tree18:29
clarkbmalini: others add tests to tempest that tempest won't carry in tree18:29
fungiclarkb: actually, skimming the initscript it does seem to be checking the right path for the pidfile, and the current pidfile contains the pid of the process, so maybe something is disappearing the pidfile. i'll try to remember to check for it *before* restarting next time18:29
clarkbfungi: ok18:29
clarkbpypi is down to its static mirror18:30
maliniclarkb: Sounds like stock tempest + stock devstack is all I need,18:30
*** resker has quit IRC18:30
clarkbdstufft: ^ you are probably aware of anything going on but if you aren't18:30
clarkbfungi: ya18:30
clarkbmalini: great, in that case your job probably just needs to enable and disable specific features?18:31
*** johnthetubaguy has quit IRC18:31
*** esker has joined #openstack-infra18:31
clarkbthat already exist in devstack/tempest18:31
fungii *could* pull up my maildir for the python packaging list, but EEFFORT18:31
dstufftpingdom hasn't said anything yet18:31
maliniclarkb: yes..I need to enable marconi18:32
*** amcrn_ has joined #openstack-infra18:32
*** jcoufal_ has joined #openstack-infra18:33
maliniclarkb: some background.. marconi is not already part of tempest..We are seeing some weird issues in tempest-marconi experimental jobs. it was suggested tht having an experimental job in devstack queue for marconi will make troubleshooting easier18:33
*** jcoufal has quit IRC18:33
dstufftah there's pingdom18:33
clarkbmalini: openstack-infra/config c515a55a2d40d87510b3b6d10915b426de8f5c5d is probably pretty close18:33
dstuffthaving a service that lots of people depend on is way more responsive than pingdom18:34
clarkbmalini: that is ironic's change to add tests18:34
*** Ryan_Lane1 has quit IRC18:34
clarkbdstufft: glad I could help :)18:34
maliniclarkb: thanks a lot!! appreciate your help!18:34
*** resker has joined #openstack-infra18:34
*** esker has quit IRC18:35
fungidstufft: we keep trying to tell people that too, but they seem to think setting up an actual monitoring system would be less embarrassing ;)18:35
markmcclain1would it be possible to get a review promoted on next reset?18:36
fungimarkmcclain1: probably. what are the reasons?18:36
clarkbdstufft: its back18:36
clarkbor at least useable18:36
dstufftfungi: I don't think we've ever had the monitoring tell us before someone else did18:36
fungimarkmcclain1: looks like it's probably a stability fix?18:36
markmcclain1fungi: we encounter a race, so added better error handling18:36
dstufftyea one of our PGpool nodes restarted and when it came back up it got confused18:36
markmcclain1fungi: yep18:36
dstufftwe're bypassing the pool right now and connecting directly to the DB18:37
fungimarkmcclain1: right on. in it goes on the next reset18:37
markmcclain1fungi: thanks18:37
pleia2fungi: could use your security brain for a chat at some point re:
pleia2fungi: modules/openstack_project/files/git/bin/ is kind of scary and awful, and selinux will never love it18:37
fungipleia2: i left it in my other jacket, but i'll see what i can do18:38
pleia2fungi: I'm wondering if we should come up with a saner solution than what ships with cgit ( is slightly modified from upstream) or just live with it and turn off selinux18:38
pleia2we could also allow apache user to execute bash scripts, but at that point we might as well turn off selinux because all is lost18:39
clarkbfungi: so we already have reconnect logic in the bot18:40
fungiclarkb: okay18:40
maliniclarkb: what do jobs like 'gate-climate-devstack-dsvm' do?18:40
maliniclarkb: just enable climate in devstack ?18:40
clarkbmalini: they use a pre test hook to patch devstack, which is then run and if successful the devstack exercises are run last18:41
clarkbmalini: which is why I asked if you would need to patch devstack or tempest because if you do then the test hook stuff becomes important18:41
maliniclarkb: what is a test hook?18:42
fungimarkmcclain1: if you're lucky and there's no gate reset on the 9 changes being tested ahead of it, 78077,2 is estimated to land in about 20 minutes anyway. but i'll bump it ahead if we do get a reset before then18:42
markmcclain1fungi: a little luck would be good18:43
maliniclarkb: I know thts a lot of questions..But we already have the tempest + devstack job similar to ironic18:43
*** UtahDave has joined #openstack-infra18:43
maliniclarkb: But marconi crashes when tht job runs..We are trying to figure out what causes tht crash18:44
*** bhuvan_ has quit IRC18:44
*** bhuvan has quit IRC18:44
fungipleia2: i'm re-reviewing 60375 now... i'll have some pointers for you here in a bit, hopefully18:44
clarkbmalini: in the test runner framework we have a project called devstack-gate. This is the thing that configures a localrc appropriate for each job. It then runs devstack and tempest in the typical configuration. Because some projects may need to do other things like patch devstack or tempest, devstack gate has a pre test hook in which you can do stuff before devstack and tempest run18:44
clarkbmalini: it also has a test hook if you want to do tests that are not tempest and it has a post test hook if you want to run more tests after tempest or clean up particular things or whatever18:44
*** talluri has quit IRC18:44
pleia2fungi: thank you18:44
clarkbmalini: they are just hooks into the various steps of running integration tests18:45
*** talluri has joined #openstack-infra18:45
clarkbfungi: aha! we need a new gerritbot release18:46
clarkbfungi: the retry logic isn't tagged yet18:46
* fungi facepalms18:47
*** Sukhdev has quit IRC18:47
fungilooking at the delta since the last tag...18:47
fungioh, wow... lotsastuff18:48
fungi1.5 years worth of fixes18:48
fungiwhoever first said "release early and often" should not look at this repo :/18:49
*** khyati has quit IRC18:49
*** gema has quit IRC18:49
*** e0ne has joined #openstack-infra18:49
clarkbfungi: now we can move onto the next thing :)18:49
fungiclarkb: for our own sanity, let's shoot for tagging the tip of master as something like 0.2.0 on monday18:49
clarkbfungi: ++18:50
wendarIs logs/devstacklog.txt indexed?18:50
maliniclarkb: thanks for the clarification! let me dig in a lil deeper on what will help us the most!18:50
*** talluri has quit IRC18:50
clarkbwendar: typically I could answer that question, but I am behind on that stuff. Let me check18:50
*** afazekas has quit IRC18:50
clarkbwendar: yes it should be18:51
*** pblaho has quit IRC18:51
clarkblooks like it is18:51
clarkbI have almost a quarter of a million hits for filename:"logs/devstacklog.txt" over the last 15 minutes18:52
fungimarkmcclain1: in good news, all the neutron changes ahead of 78077,2 have completed and passed all their tests, so at least they're not going to cause the next reset (if there is one)18:52
*** resker has quit IRC18:53
*** esker has joined #openstack-infra18:53
* clarkb moves onto catching up on zuul stuff then normal reviews18:54
*** krotscheck has quit IRC18:55
*** rlandy has joined #openstack-infra18:56
*** gema has joined #openstack-infra18:56
*** briancurtin has quit IRC18:56
maliniclarkb: what does 'dsvm' stand for?18:58
clarkbmalini: devstack VM18:58
*** esker has quit IRC18:58
clarkbmalini: it's a string we can key off of to make sure we do bookkeeping around those machines properly18:58
SpamapSohai.. I'm wondering if we can have bumped in the zuul queue (once it passes checks) ... The bug it fixes is causing all of TripleO CI to fail.18:58
fungiclarkb: or at least was, but now that single-use is default, it's not really as important as it was18:59
wendarclarkb: great, thanks!18:59
*** skraynev_afk is now known as skraynev18:59
clarkbfungi: good point18:59
maliniclarkb: thanks! now that I know it, it should have been obvious :)18:59
*** jeremyb has quit IRC18:59
*** jcooley_ has joined #openstack-infra18:59
fungimalini: we used to spell it out, but job names got way, way, way too long18:59
fungiSpamapS: 78310 isn't going to make it into the gate until 77920 does19:00
fungiSpamapS: see gerrit change dependencies there19:01
sdaguejogo: just got back from lunch, what's the question?19:01
sdaguewendar: yes, logs/devstacklog.txt started getting indexed last week some time19:01
*** dripton_ is now known as dripton19:02
clarkbsdague: I am going to jinx it, but the cluster seems to have kept up after that initial slowdown on monday. I added a few additional gearman workers but other than that haven't touched it19:02
wendarsdague: thanks!19:03
*** nati_uen_ has joined #openstack-infra19:03
clarkbsdague: I think we are just very near the event horizon when things really pick up. I will spend more cycles on autoscaling those nodes when I have time to sit and hack on stuff19:03
jogosdague: gate backed up 15 hours, have you been looking at it? if not thats fine19:03
jogoI am working with wendar on it19:03
fungipleia2: looking at 60375, i don't immediately see why the selinux rules for it would be particularly challenging to write... should just be able to allow it to run /usr/local/bin/rst2html and things in /usr/libexec/cgit/filters/html-converters right?19:03
*** jcooley_ has quit IRC19:04
*** jcooley_ has joined #openstack-infra19:04
sdagueafter that, it's mostly the normal races that we hit19:04
sdaguea few neutron ones19:04
sdaguegrenade issue where services don't shut down on old side19:04
*** svpress has joined #openstack-infra19:05
fungisdague: jogo: and i should check the zuul logs to see if it's happened since i did a repack and pack-refs on the gerrit local mirror19:05
sdaguefungi: yeh, that would be cool19:05
sdaguejogo: though the classification rate has dropped a lot -
*** jcooley_ has quit IRC19:05
fungiit may no longer be biting us19:05
*** mrodden has joined #openstack-infra19:05
sdaguefungi: definitely possible19:05
sdagueI think it was a big reason for not much overnight progress though19:06
pleia2fungi: so the trouble I'm running into is that is calling bash to run those, which selinux isn't happy with (I can run it again and grab the selinux errors)19:06
fungipleia2: that might help narrow it down19:06
sdaguejogo: but honestly, ignore everything older than the 21st19:06
*** jeremyb has joined #openstack-infra19:06
*** nati_ueno has quit IRC19:06
pleia2fungi: ok, will do19:06
sdaguebecause we were in a weird place on log indexing then19:07
*** zns has quit IRC19:07
*** zns has joined #openstack-infra19:07
fungipleia2: one (probably crazy) option would be to run a shell compiler on it at install time, but that's really just papering over the underlying issue i think19:07
jogosdague: ack,  wendar has a patch to ignore old hits19:09
*** packet has joined #openstack-infra19:09
sdagueright, I was +2 on that19:09
sdagueso you should approve it :)19:09
*** zns has joined #openstack-infra19:10
fungimarkmcclain1: it merged on its own without needing help19:11
*** mbacchi has quit IRC19:11
clarkbfungi: woot19:11
markmcclain1fungi: awesome19:11
sdaguejogo: we actually started at 75 at the beginning of the day19:11
clarkb74144 is an unhappy change though19:11
sdagueso the gate is shrinking19:11
sdaguewhich is good19:11
*** mrodden has quit IRC19:12
sdaguemarkmcclain1: whats up with neutron patch failing in the gate?19:13
sdaguealembic explode?19:13
fungisdague: first guess is that it functionally conflicts with one which was approved ahead of it19:13
sdagueit's alembic19:13
clarkbfungi: thinking about the window race. Wouldn't the merger already have the refs for the things that merged so that the zuul ref would appropriately be based on that?19:13
fungibut doesn't merge-conflict19:13
wendarclarkb: 74144, 77927, 74156, and 67862 are all the same error pattern19:14
fungiclarkb: any zuul ref would be based on that, sure. it's non-zuul refs i'm concerned with (integration tests pulling from our git.o.o farm)19:14
clarkbfungi: actually no because we may have any number of mergers19:14
clarkbfungi: right but those don't matter19:14
markmcclain1sdague: yep.. we had a race for a down revision19:14
fungiclarkb: why don't they matter?19:14
clarkbfungi: anything else has been static from zuul's perspective19:15
sdaguethat's something we probably need a better resolution on doing before moving all the other projects to alembic, because in "theory" it supports this nice dependency graph, but in practice it gets funky when more than one in flight19:15
fungiclarkb: _integration_ tests19:15
clarkbfungi: right19:15
*** mbacchi has joined #openstack-infra19:15
markmcclain1sdague: longer term I'm thinking we might need to make a prep script that auto sequences the migrations19:15
clarkbfungi: lets say we are testing nova nova cinder swift19:15
*** mrodden has joined #openstack-infra19:15
fungiclarkb: the change at issue may not be for the project being tested, or for any currently in the queue ahead of it19:15
clarkbfungi: glance won't matter because it was never changed19:15
markmcclain1sdague: unless the patchset declares something else specific19:15
sdaguemarkmcclain1: yeh, that makes sense19:15
sdagueok, I'm going to hide from computers for a bit. I was up super early this morning.19:16
clarkbfungi: we are only concerned with nova cinder and swift, and since they have been involved in the merger their refs should be up to date. However this may not be true with multiple mergers19:16
clarkbfungi: now say the first two nova changes merge and we add a neutron change. then we have cinder swift nova. they should still get the proper nova refs because the zuul ref for nova will exist right?19:17
clarkbmaybe thats the bit I am missing, I thought zuul refs are created for projects as long as they were in the gate at some point19:17
clarkbfungi: and maybe ^ is the fix if they don't19:17
sdaguejogo: ok, well I'll let you handle it, I'm going to run away for a bit :)19:17
fungiclarkb: what if the change which just merged was for glance, and fixes an integration bug, then the window slides to start testing a swift change (and there are no swift changes ahead of it) and it pulls glance from the git.o.o mirror because there are no zuul refs for glance19:17
*** markwash has joined #openstack-infra19:17
clarkbfungi: I think there should be a zuul ref for glance19:17
clarkbfungi: because glance was in the pipeline prior19:18
jogosdague:  sounds good, well I'm really just showing wendar how to take care of it19:18
fungiclarkb: are you suggesting that we add that capability to zuul? i don't believe it currently adds a zuul ref for projects which don't have changes ahead of your change19:18
clarkbfungi: I think zuul already does this19:18
clarkbfungi: but only in the case where that project had a change processed at some point in the running memory of zuul19:19
clarkbif you haven't been tested since the last restart it won't do that19:19
clarkbif however zuul doesn't have this behavior then I guess I am proposing this is the way zuul deals with the race19:19
clarkbthis is easy enough to check, /me waits for a window shift19:20
fungiclarkb: git refs are built for the change being tested... so i don't believe it will build a ref for the change being tested on another project which has no changes ahead in the queue, but i could of course be wrong19:21
fungithe setup-workspace.log would have far fewer fallbacks to the git farm if that were really the case19:21
*** MarkAtwood has joined #openstack-infra19:21
*** svpress has quit IRC19:22
clarkbfungi: you are correct19:23
clarkbgit fetch refs/zuul/master/Za2b6a99161fb4dceb2d336d2ff55fca4 fails19:23
jeblairclarkb: zuul builds refs for all projects ahead of (and of course including) the changes in the queue19:23
*** briancurtin has joined #openstack-infra19:23
jeblairclarkb: at the time it builds the refs19:23
*** gokrokve has joined #openstack-infra19:23
clarkbjeblair: ooohhhhh now I see why the window is important19:23
jeblairclarkb: so in fungi's situation, the refs for change #20 in the queue, just after the queue has shifted up, do not include the project that just popped off the queue cause it merged19:23
*** coolsvap has quit IRC19:24
jeblairclarkb: yeah, it actually forces the timing of the situation to be near worst-case19:24
fungiclarkb: technically we could also hit it without the window behavior, but it increases the chances due to synchronization19:24
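The window race discussed above can be illustrated with a toy model. This is a sketch under stated assumptions and not Zuul's implementation: it encodes only jeblair's description that refs are built for the projects of the changes ahead of (and including) the change under test, at the time the refs are built. The function names and the change labels (`G1`, `N1`, `S1`) are hypothetical.

```python
def projects_with_zuul_refs(queue, index):
    """Projects that get a Zuul ref when refs are built for queue[index].

    queue is a list of (project, change) tuples ordered from the head of
    the gate; only changes at or ahead of index contribute a project.
    """
    return {project for project, _change in queue[: index + 1]}


def source_for(project, queue, index):
    """Where a job testing queue[index] would fetch the given project from."""
    if project in projects_with_zuul_refs(queue, index):
        return "zuul-ref"
    # No Zuul ref exists, so the job falls back to the tip of master on the
    # git mirror, which may not yet include a change that has just merged.
    return "mirror"


# Before the shift: a glance change is ahead of the swift change.
before = [("glance", "G1"), ("nova", "N1"), ("swift", "S1")]
# After the glance change merges it pops off the queue, and refs built now
# for the swift change no longer cover glance.
after = before[1:]

print(source_for("glance", before, 2))
print(source_for("glance", after, 1))
```

In this model the swift job gets glance from a Zuul ref before the shift and from the mirror afterwards, which is the situation fungi describes with the glance integration fix.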
fungisdague: i repacked the gerrit local mirror at 16:29 and since then we hit the replication timeout once,
fungiso that's one occurrence in 3 hours. not great, but not terrible19:27
jeblairfungi: i don't expect repacking to help significantly; that's why i did those timing tests yesterday19:27
fungialso, very tiny sample size, so ymmv19:28
jeblairthe timeout change has propagated so we can restart at will19:29
*** bhuvan has joined #openstack-infra19:29
fungiprobably best to at least let these next 6 which are about to merge make it in... none are nova changes19:29
*** bhuvan_ has joined #openstack-infra19:29
clarkbthere are about 4 changes that can potentially merge shortly19:29
*** ociuhandu has quit IRC19:29
clarkbmaybe after those we do the zuul restart?19:29
clarkbfungi: you win19:29
fungior alternatively wait for the next reset19:30
clarkbnah everything behind those that may merge have an unknown ttm19:30
*** rlandy has quit IRC19:30
*** krotscheck has joined #openstack-infra19:31
fungiokay, the gate just did some very weird reshuffling and restarted testing on things19:31
devanandairc bots down?19:31
hub_capdevananda: that or they dislike u19:31
hub_capid say its 50/5019:32
clarkbfungi: jeblair: I think now is a good zuul restart opportunity19:32
jeblairi'll restart now19:32
fungioh yay! 1902 <-- openstackgerrit ( has quit (Ping timeout: 240 seconds)19:32
fungijeblair: clarkb yes, good time for a zuul restart19:32
jeblairit is stopped19:32
*** jcooley_ has joined #openstack-infra19:33
jeblairi'm attempting to set used nodes to delete19:33
fungithough figuring out what that was it just did while it was merging those last few changes would be nice. it seemed to spontaneously start retesting changes behind others it merged19:33
clarkbfungi: I think the ceilometer change failed19:34
jeblairstarting zuul19:35
jeblairi only managed to change about 75 nodes to delete due to row locks19:35
jeblairstill it's something19:35
fungiclarkb: if it did, then why didn't it restart all the changes immediately behind it? seemed it merged several changes after the ceilo change19:35
clarkbfungi: the failing neutron changes were in that gap19:36
clarkbcould be a change after that gap failed too19:36
clarkbbut I thought those had completed19:36
fungiclarkb: the one right before 77719,1 was a tempest change19:36
clarkbjeblair: so process is dump queues with zuul tool, stop zuul, use nodepool to delete used nodes, start zuul, use tool to rebuild queues?19:36
fungibut it restarted tests on changes further down before it was done reflecting that several changes ahead of the tempest change merged. i'm digging in the debug log for enlightenment19:37
jeblairclarkb: yeah.  nodepool is optional, it's just an optimization19:37
jeblairclarkb: then restart zuul mergers because they have a bug19:37
clarkboh right19:37
jeblairwhich i have also just done19:37
*** ildikov_ has joined #openstack-infra19:40
*** jcoufal_ has quit IRC19:41
clarkbjeblair: is that bug something you have debugged yet? would you consider it relatively important to fix (eg should I take a look at it?)19:43
jeblairclarkb: i have not, and it's probably relatively important.  it's certainly annoying.19:43
*** rlandy has joined #openstack-infra19:43
jeblairclarkb: it might be in the zuul merger server, or it might be gearman19:43
jeblairclarkb: it only re-registers one of the 2 functions it has on reconnection19:44
fungifirst breadcrumb... 2014-03-05 19:30:00,991 DEBUG zuul.DependentPipelineManager: Canceling jobs for change <Change 0x7fda28fc5410 77719,1>, behind change <Change 0x7fda29c17750 75588,6>19:45
fungi75588,6 got sniped by a new patchset as it was in the process of getting merged19:46
clarkbfungi: nice!19:47
jeblairah, it's a requirements update patch19:47
fungiif we have those in a long gate and then a requirements change merges, they will all get sniped19:48
*** jcooley_ has quit IRC19:48
jeblairso maybe the proposal job should not propose if the existing change is approved, and instead it should base a new change on that patch and propose it.19:48
fungithat's precisely what i was about to suggest19:48
fungifairly lhf bug on that script19:48
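jeblair's suggestion for the requirements proposal job amounts to a small decision rule, sketched below. This is only an illustration: the function name, the dictionary shape and the returned action strings are hypothetical, and the "upload a new patchset" branch assumes that is what the job does today for an open change.

```python
def plan_proposal(existing_change):
    """Decide how the proposal job should publish an update.

    existing_change is None when no open proposal exists, otherwise a
    dict with an 'approved' flag (a hypothetical shape for this sketch).
    """
    if existing_change is None:
        return "propose-new-change"
    if existing_change["approved"]:
        # A new patchset would knock the approved change out of the gate,
        # so stack a fresh change on top of it instead.
        return "propose-new-change-on-top"
    return "upload-new-patchset"
```

With a rule like this, a change that is already approved and in the gate is left alone, so it is not sniped by a new patchset while merging.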
*** jcooley_ has joined #openstack-infra19:48
fungibut also, things working as designed, if not optimally designed ;)19:49
jeblairgate-python-openstackclient-python33: NOT_REGISTERED (non-voting)19:49
jeblairit looks like there are no current 33 or pypy jobs registered19:50
jeblairpossibly just because nodepool hasn't gotten around to creating those nodes19:50
*** kashyap has quit IRC19:51
*** hashar has joined #openstack-infra19:51
StevenKThe check-tripleo and experimental-tripleo queues are empty since the zuul restart.19:51
clarkbjeblair: that sounds correct. Maybe we should have gearman plugin register all jobs it knows about regardless of slaves that are connected19:51
StevenKjeblair: Intended behaviour?19:51
jeblairStevenK: yes; you can recheck if it's important19:52
ttxjeblair: checking in, did you manage to restart zuul ? Looks like you did, given the enqueue timers in the gate19:52
*** ociuhandu has joined #openstack-infra19:52
StevenKjeblair: Fair enough.19:52
jeblairttx: just now19:53
ttxjeblair: cool19:53
jeblairclarkb: then it could end up with a job it can't run19:53
jeblairclarkb: i think that would be worse19:53
*** vkozhukalov has quit IRC19:53
clarkbjeblair: hrm good point19:53
fungi62626,3 (in the gate) managed to get a py3k-precise node for a pypy job19:59
*** rossella-s has quit IRC19:59
jeblairpvo: how's that rate limit increase on /servers coming?  we get to run 1853 more jobs and then we're done for the day.  :(19:59
*** rlandy has quit IRC19:59
fungiTook 2 min 2 sec on py3k-precise-rax-iad-217158119:59
fungijeblair: when we hit the limit, it's time to go out for beer20:00
jeblairpvo: could we perhaps switch back to unlimited until the new class is set up?20:00
jeblairfungi: we've hit dfw, iad has 33 left, ord has 182020:00
fungibeer's close then!20:00
jeblairi'm going to manually set ord to 0 in the nodepool config so it stops spinning its wheels20:01
fungijeblair: what do you think about considering NOT_REGISTERED status as fodder for a job restart?20:02
*** ociuhandu has quit IRC20:02
jeblairfungi: this is very rare; mostly it hits when there really isn't a job for something.20:02
fungigood point20:02
jeblairfungi: (which is hard in our environment, but easily possible elsewhere)20:02
clarkbwe have node minimums to at least 1 right? this is just a race around actually getting that node up20:04
fungiperhaps the jenkins-gearman plugin could remember what it had registered prior to reconnect and re-register those? but it's probably not worth the effort to implement at this stage20:04
jeblairfungi: that's what clarkb was suggesting, but registering a job means you can run it.  if you register a job you might get it assigned.20:05
*** rossella-s has joined #openstack-infra20:05
jeblairclarkb: yeah, not being able to build nodes in rackspace hurts our ability to bring up such a node.20:05
jeblairclarkb: since they are only spec'd for rox.20:05
fungiclarkb: right, i suspect that if nodepool prioritized the first node of each type even more so than just meeting the minimum in general, we could mitigate that somewhat20:06
fungiclarkb: since right now py3k-precise nodes are much lower minimums than other node types20:06
*** skraynev is now known as skraynev_afk20:06
pleia2fungi: selinux fun!
pleia2fungi: so now I actually think it's an issue with /tmp20:07
fungiso they probably get shorted a little if nodepool has to turn over a ton of new nodes all at once20:07
fungipleia2: selinux forces you to learn all sorts of behaviors you never really wanted to know about random programs20:08
pleia2fungi: yeah :\20:08
pleia2like "all the things that write to /tmp by default"20:08
mriedemanyone here ever seen this kind of failure in devstack-gate-cleanup-host? looks like a problem reporting test results with nose?20:11
mriedemsqlalchemy-migrate non-voting tempest job has been failing with that ever since we got it up and running this week20:11
*** zns has joined #openstack-infra20:11
fungipleia2: yep, so it does look like it's actually trying to use normal tempfile/tempdir patterns to formulate its output20:11
*** vogxn has quit IRC20:11
*** rlandy has joined #openstack-infra20:11
jeblairthere goes iad20:12
clarkbfungi: should we be adding those images to the hpcloud as well?20:13
fungilights winking out one by one20:15
fungiclarkb: seems fine--we've just not tested them with hopcloud's base images20:15
jeblairwow, so dfw just jumped up to 5000 remaining20:15
*** Ryan_Lane has quit IRC20:15
jeblairso i put it back in20:15
*** ArxCruz has quit IRC20:15
*** akerr has joined #openstack-infra20:15
clarkbbut that means hipchat20:16
fungijeblair: i'm guessing pvo rolled our counter over! ;)20:16
*** ociuhandu has joined #openstack-infra20:16
fungiclarkb: i would suggest a shower afterward, but really there is no water hot enough for that20:16
jeblairiad is still at 0 though20:17
akerrmaybe I'm missing something, but was approved about 7 hours ago but Jenkins has not yet started gate checks.  Any ideas why?20:17
wendarjeblair: I've got a question about the version of javascript we're running in elastic-recheck, jogo suggested you'd be a good person to ask?20:18
clarkbakerr: good question, nothing is readily apparent to me. May need to dig in logs20:19
jeblairwendar: i'm not sure we get to control the version of javascript.  i'm pretty sure that's determined by browsers.20:19
clarkbbut first I need to hipchat20:19
wendarjeblair: library versions, specifically jQuery20:19
wendarjeblair: it's old, and uses functions that are deprecated in modern browsers20:20
fungiakerr: i think it was approved when 78093 was still a draft?20:21
*** rf01 has joined #openstack-infra20:21
jeblairwendar: it's what is in ubuntu lts;  it's likely to be upgraded when the new lts comes out.20:21
fungiakerr: infrequently encountered zuul bug. if any changes which depend (directly or indirectly) on your change are a gerrit draft, zuul won't enqueue them20:21
wendarjeblair: okay, so avoid deprecated functions in any way possible until sometime after April this year20:22
akerrfungi: so is there a way to tickle jenkins to start the gate?20:22
fungiakerr: one of the reasons we wish we could just turn off gerrit drafts entirely (but in this case it's actually triggering a legitimate bug in zuul we should fix some time when we're not busy... hah, hah)20:22
fungiakerr: recheck no bug on your change20:22
wendarjeblair: I'll see if I can override the default behavior to avoid running that function.20:23
fungiakerr: or get a core to reapprove it20:23
fungiakerr: or you could reverify against the bug we have open for zuul for that corner case. let me find it20:23
akerrfungi: already did recheck no bug.  Will that also continue on into the gate process after the check pipeline finishes?20:24
fungiakerr: yes20:24
akerrfungi: great, thanks20:24
fungiakerr: for reference, bug 126083620:24
akerrfungi: good to know.  thanks for the help :)20:24
pleia2fungi: thanks for having a look, I'll grab one of the redhat guys here at the sprint to have a browse to see how to solve this20:24
fungiand if uvirtbot were around, she would retort:
*** Ryan_Lane has joined #openstack-infra20:25
*** arborism has joined #openstack-infra20:25
fungipleia2: you bet. there's probably a canned selinux pattern/recipe for "expect this to use tempfiles and tempdirs in /tmp"20:25
pleia2fungi: it calls a few scripts aside from just rst2html, md2html and a few others as well, I'll have a look around20:27
pleia2finding a lot about audit2allow, which doesn't really help a whole lot in puppet land20:27
pleia2(and makes me realize lots of people don't grok selinux)20:27
fungipleia2: yes, people don't grok computers (okay, well, technology in general)20:27
*** jcoufal has joined #openstack-infra20:28
fungipleia2: if you ever need a reminder of that, just take a quick stroll through ubuntu forums ;)20:28
*** markmcclain1 has quit IRC20:29
fungii have to a lot of my web searches just to filter out the awful, awful advice20:29
*** akerr has left #openstack-infra20:30
clarkbfungi: use sudo20:30
clarkbfor everything20:30
fungiclarkb: sudo bash. that makes it all work20:30
pleia2and chmod 77720:32
*** jcooley_ has quit IRC20:32
fungipleia2: definitely. how else can you anonymously ftp scripts to your webserver and then run them?20:32
fungipleia2: protip... run apache as root and serve up the contents of /etc so you can back it up more easily!20:34
pleia2this is why we can't have nice things20:35
pleia2our bot has disappeared20:35
fungioh, right, i was going to restart it _again_20:35
fungithis time i'll see what happened to its pidfile first20:35
clarkbfungi: hold on20:35
fungiclarkb: will do20:36
clarkbfungi: what do you think about tagging a release now?20:36
clarkbfungi: and using that?20:36
fungiclarkb: i s'pose we could... but we might then find ourselves rolling it back if something in those 1.5 years of commits is weird20:36
jeblairclarkb: ok i guess20:36
jeblairfungi: you've convinced me :)20:36
clarkbfungi: yeah I am almost positive we will but fixing problems when we need to upgrade seems silly20:37
jeblairi did upgrade irclib on that host, but i expected that to make it better, not worse.20:37
*** lcestari_ has joined #openstack-infra20:37
fungiclarkb: so, interesting to note, the pidfile is indeed missing now. it was there when i first started the bot, for at least several minutes thereafter20:37
*** Ryan_Lane has quit IRC20:38
jeblairiad has mysteriously jumped to 5000; so i'll put it back in20:38
*** rf01 has quit IRC20:38
jeblairand restarted puppet20:38
fungithese are the sorts of mysteries i enjoy20:39
*** jp_at_hp1 has quit IRC20:39
*** lcestari has quit IRC20:39
fungimysterious extra quota20:39
clarkbfungi:  for and the associated bug how did you intend on soliciting opinions?20:41
clarkbfwiw I agree with you, just use a .gitreview20:42
fungiclarkb: i was going to ask you, jeblair, mordred and SergeyLukjanov once firefighting slowed a bit20:42
clarkbbut the code doesn't seem particularly fragile for those of us that use a .gitreview so I am not worried about merging the patch20:43
fungiclarkb: agreed, my concerns are more around whether this will lead to other confusion around what workflows are expected to be supported by the utility20:44
fungiwe already have enough people opening bugs against git-review because they read bad workflow advice in a wikimedia forum about how to use it, and it didn't work as described/elicited errors/broke20:46
*** denis_makogon_ has joined #openstack-infra20:46
*** yolanda_ has quit IRC20:46
clarkb:( this is true20:46
*** fbo_away is now known as fbo20:46
fungii worry that the more flexible we make it, the more it will confuse new users (particularly outside the openstack developer community)20:46
* SergeyLukjanov returns back from reading scrollback20:47
SergeyLukjanovfungi, agree with making git-review simple and hopefully well tested20:47
clarkbfungi: so I hadn't considered that angle and I think it is an important one20:48
SergeyLukjanovfungi, clarkb, are there known usages of git-review outside of OpenStack?20:48
SergeyLukjanovI mean not single person but projects with recommendation to use it20:49
fungiSergeyLukjanov: but wmf is the biggest use outside openstack which i'm currently aware of20:49
*** krotscheck has quit IRC20:51
clarkbfwiw someone is apparently looking at the ticket for me now. hopefully get an answer back about stuff20:51
*** krotscheck has joined #openstack-infra20:51
jeblairttx: yay ff!20:51
*** gokrokve has quit IRC20:51
jeblairttx: any chance i can convince you to link to git.o.o instead of github? :)  eg
fungiclarkb: sorry about the hipchat taint, but it's for a good cause20:53
jeblair(i'm very proud of our git server farm, and it's so much prettier and openstacky than github)20:53
clarkbpythoneers will that work as expected? The decorator will give us a differently named closure in each case?20:53
fungiclarkb: i've had continuous loops on nodepool.o.o trying (and failing) to build new bare-precise and devstack-precise images in hpcloud-az2 for over 36 hours straight now20:53
*** rpodolyaka has joined #openstack-infra20:54
rpodolyakaHey, all! Has anyone seen this issue with python33 gates?
fungirpodolyaka: yeah, that was a side effect of the zuul restart. should hopefully have only affected a handful of changes being checked20:56
*** markmcclain has joined #openstack-infra20:56
fungirpodolyaka: you can just recheck no bug on that20:56
*** mrda_away is now known as mrda20:56
fungirpodolyaka: zuul tried to start that job before there were any available nodes to run it, so no workers had registered its existence20:57
*** markmcclain has quit IRC20:57
*** markmcclain has joined #openstack-infra20:57
rpodolyakafungi: cool, thanks! just didn't want to force recheck in case it wasn't supposed to work :)20:57
clarkbfungi: jeblair: in a nodepool image-list which number in the row maps to the server instance uuid?20:58
fungirpodolyaka: it was supposed to work. we didn't expect it, but figured it out fairly quickly when we spotted it20:58
clarkbor do I need to do a nodepool list?20:58
fungiclarkb: nodepool list doesn't give you the template nodes, i don't think20:59
*** denis_makogon has quit IRC20:59
*** denis_makogon_ is now known as denis_makogon20:59
clarkbdoesn't appear to. Is it server ID and az2 doesn't do uuids?20:59
SergeyLukjanovclarkb, it'll exec addOnException for each attach_on_exception usage and  ( it'll be triggered only in case of exception20:59
fungiclarkb: i think you'll have to couple nodepool image-list column 4 with nova list and be quick about it21:00
clarkbfungi: :(21:00
fungiclarkb: i'm checking to be sure21:00
clarkbfungi: I think you are correct21:00
clarkbfungi: because the building image state doesn't come with a Server ID21:01
clarkbfungi: actually I have a server id now21:01
fungiclarkb: oh, the server id column does get it for you (you can see in the nodepool image-list output the rax ids are uuid and hpcloud are int)21:01
SergeyLukjanovclarkb, hm, looks like addDetail could be used21:01
*** bhuvan has quit IRC21:02
clarkbSergeyLukjanov: yeah that might be cleaner but the code as is should work fine21:02
clarkbSergeyLukjanov: I think21:02
fungii like having people with fancy python learnin' around. i start to get lost with decorators still21:02
*** yassine has quit IRC21:02
SergeyLukjanovclarkb, addDetail is already used inside the attach_file21:02
clarkbSergeyLukjanov: yup21:02
*** yassine has joined #openstack-infra21:03
clarkbSergeyLukjanov: the explicit addDetail is for catching nonexistent files21:03
*** mrodden has quit IRC21:03
SergeyLukjanovclarkb, yup21:04
rpodolyakahmm, it seems that openstack_citest MySQL user still has not enough permissions to actually use created databases -
clarkbrpodolyaka: on some nodes that is correct21:05
SergeyLukjanovclarkb, oh, I've just found the content_from_file in testtools.content21:06
clarkbwhich that test ran on21:06
clarkbSergeyLukjanov: oh heh I approved the change already though :)21:06
*** MarkAtwood has quit IRC21:06
SergeyLukjanovclarkb, I think it's good enough21:06
SergeyLukjanovclarkb, oh, nevermind content_from_file is used under attach_file and looks like couldn't handle lack of file21:07
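On clarkb's question above about whether the decorator yields a separate closure per use: a minimal sketch, assuming a decorator factory shaped roughly like the attach_on_exception discussed here. The body below is hypothetical and does not use testtools; it records the filename in a plain list where the real code would call addOnException/addDetail. Each application of the factory builds its own wrapper with its own captured filename, and functools.wraps keeps the original function name.

```python
import functools


def attach_on_exception(filename):
    """Hypothetical decorator factory; every call captures its own filename."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's __name__
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # Reached only when the wrapped function raises.
                wrapper.recorded.append(filename)
                raise
        wrapper.recorded = []  # stand-in for attaching the file as a detail
        return wrapper
    return decorator


@attach_on_exception("a.log")
def passes():
    return "ok"


@attach_on_exception("b.log")
def fails():
    raise ValueError("boom")
```

Calling passes() leaves its list empty, while fails() re-raises after noting "b.log"; the two wrappers share no state.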
jeblairfungi: thanks for taking care of the static volumes; the new space is showing up in cacti now21:08
rpodolyakaclarkb: is there something we can do to make it work on those nodes?21:08
fungijeblair: sure, my pleasure21:08
*** mrodden has joined #openstack-infra21:08
*** Ryan_Lane has joined #openstack-infra21:08
*** aysyanne has quit IRC21:08
*** MarkAtwood has joined #openstack-infra21:09
fungirpodolyaka: we have a case open with hpcloud support to fix whatever has caused us to no longer be able to build new server images in that zone21:09
*** rf0 has joined #openstack-infra21:09
fungirpodolyaka: at the moment our options are to either wait, or stop using that zone (which reduces our test capacity by about 200 servers)21:09
*** andreaf has joined #openstack-infra21:09
*** bhuvan has joined #openstack-infra21:09
*** bhuvan_ has joined #openstack-infra21:09
rpodolyakafungi: got it. Thanks for the info!21:10
jeblairso on the subject of using the new hpcloud regions... do you think we can specify a kernel command line parameter?  if so, we could reduce the available ram on them21:10
*** thomasem_ has joined #openstack-infra21:11
jeblair(so we can use 30g nodes or whatever but only have linux allocate 8g)21:11
*** jeremyb has quit IRC21:11
*** jeremyb has joined #openstack-infra21:11
sdaguemriedem: did not21:12
krotscheckAnyone know where I can find an expert on sqlite and alembic? I'm hitting a wall and need some pointers.21:12
*** SumitNaiksatam_ has joined #openstack-infra21:12
*** thomasem has quit IRC21:12
*** Ryan_Lane has quit IRC21:12
mriedemsdague: the only thing i can think is there is something missing in the job config for how it's supposed to run tests and gather results?21:13
mriedembut that's just based on the nosetests*.xml errors21:13
rpodolyakakrotscheck: what's the problem you are hitting?21:13
sdaguemriedem: devstack is dying in a weird point, and I don't know why21:13
mriedemsdague: oh on setup?21:14
mriedemthe ceilometer thing?21:14
clarkbjeblair: that doesn't solve the problem of them not being useful for tempest but could make them usable for unittesting21:14
clarkbjeblair: however for unittests we can probably just get away with normal 8GB nodes21:14
*** cody-somerville has joined #openstack-infra21:14
jeblairclarkb: why doesn't that make them useful for tempest?21:14
*** bhuvan_ has quit IRC21:14
*** jcoufal has quit IRC21:14
_david_zaro: Forking Gerrit? Haven't we said we don't want to do it?21:15
krotscheckrpodolyaka: Some of my tests are hitting  a UserWarning during a particular migration step, which seems to make the given migration repeat itself. I've got a paste of two relevant log statements here:
*** jcoufal has joined #openstack-infra21:15
clarkbjeblair: they are too slow21:15
*** jhesketh has joined #openstack-infra21:15
jeblairclarkb: i thought the 30g nodes were fast enough21:15
clarkbjeblair: when sdague last pulled numbers they weren't21:15
krotscheckrpodolyaka: Essentially, the warning caused by migration 21645ef1040f is making alembic try to repeat the migration in the subsequent test.21:16
jeblairclarkb: afaik when sdague last pulled numbers they were21:16
*** rpodolyaka has quit IRC21:16
*** SumitNaiksatam has quit IRC21:17
*** lcostantino has quit IRC21:17
*** yassine has quit IRC21:17
*** yassine has joined #openstack-infra21:18
clarkbwoot should have more infos about this az2 thing by tomorrow morning21:18
*** MarkAtwood has quit IRC21:19
sdaguejeblair: do we have some nodes in rotation? I can scrape again21:20
jeblairsdague: yeah, we always have a few running check21:20
jheskethclarkb, jeblair: I have a few patches for zuul if you guys have time to look at them whenever's convenient please :-) (includes the requested footer-message)21:20
clarkbI just pulled up jobs that are running on 5 region-b nodes21:20
*** MarkAtwood has joined #openstack-infra21:20
*** MarkAtwood has quit IRC21:20
clarkbwhen they finish I should have a small sample21:20
*** MarkAtwood has joined #openstack-infra21:21
sdaguefaster than hp azs by a little, slightly slower than rax (though those are running 2x cores)21:23
sdaguehow many cores in region-b?21:23
sdagueand check-rax is the only performance node region?21:24
clarkbsdague: yes21:24
jeblairsdague: all of rax are performance nodes now21:24
clarkbjeblair: are they?21:24
sdaguejeblair: really21:24
jeblairclarkb: my goodness i hope so21:24
sdaguecheck-rax is definitely different than the others21:24
*** yassine has quit IRC21:24
jeblairi'm not sure what check-rax is21:25
*** yassine has joined #openstack-infra21:25
*** yassine has quit IRC21:25
clarkbsdague: I remember now21:25
sdaguethis is a scrape of what's in jenkins21:25
clarkbsdague: its the only pvhvm image21:25
clarkbeverything is performance but only that image is pvhvm21:25
sdagueok, that makes sense21:25
fungiclarkb: yeah, we wanted to see whether the pvhvm images worked and were any more performant for our jobs21:25
jeblairclarkb: oh, you mean devstack-precise-check in iad.  yes.21:25
clarkbso we should switch to pvhvm for everything too, but need to sort out the oddness in images21:25
sdagueso the fact that we're running cores / 2 workers in tempest21:26
sdagueputs region-b into the useful camp21:26
sdagueat static 2 way, it was slow, IIRC21:26
clarkbsdague: gotcha21:26
clarkbjeblair: yup21:26
sdaguejeblair: ok, so just need the pvhvm image21:27
sdaguethat will speed things up 15 - 20% in rax21:27
jeblairrax nodes are 8 cores too21:27
sdaguejeblair: yep21:27
clarkbfungi: what were the problems with pvhvm again?21:27
sdaguecore for core they are slower than the old hp nodes21:27
sdaguebut they make up for it with the cores / 2 approach21:27
fungiclarkb: no substantial problems once we worked out things like missing some packages we took for granted on the normal images21:28
jeblairso, the question is, can we set "mem=8G" on the kernel command line in hpcloud region b and have that give us what we want21:28
fungiclarkb: the bigger challenge i think is the transition from non-pvhvm to pvhvm... i assume we'll need separate node labels and a couple changes to job configuration to swap them around21:29
jeblairi think we should probably focus on that before pvhvm...21:29
clarkbfungi: why is that?21:29
clarkbfungi: can't we change the base image name and have nodepool transition over time?21:29
fungiclarkb: oh, nevermind. we don't need a different flavor to use the pvhvm base images do we?21:29
clarkbjeblair: as an alternative we can have the root cgroup set a system wide memory limit21:30
jeblairfungi: it's a different base image but not a flavor21:30
clarkbfungi: I don't think so21:30
fungiit was just to go from non-performance to performance that we needed to use a separate flavor21:30
fungiand separate image for that flavor21:30
clarkbjeblair: it probably is more hairy but gives us more flexibility too if we need it21:31
jeblairclarkb: how so?21:31
clarkbjeblair: doing something in the nodepool script to update grub seems simpler21:31
clarkbjeblair: we can set hard limits and soft limits and restrict based on user21:31
*** rfolco has quit IRC21:31
sdaguejeblair: so the concern is we use too much memory for guests, right21:31
clarkbjeblair: which probably don't help this use case much21:31
jeblairclarkb: yeah, but the point is to try to make them as close to rax as possible21:31
jeblairsdague: yeah, i don't want us to slip in a change that makes it so we can only run tests on nodes with 30g of ram21:31
jeblairsdague: then we basically lose rax, and, frankly, sanity at that point.21:32
*** bhuvan has joined #openstack-infra21:33
*** bhuvan_ has joined #openstack-infra21:33
fungiwho still has sanity here? i thought there was a sign saying to check it at the door21:33
clarkbjeblair: I think we should give your kernel parameter a shot21:33
clarkbjeblair: we should be able to do that in the nodepool scripts for all nodes21:33
jeblairclarkb: yeah21:33
clarkband start with testing it on the region-b nodes21:33
fungiclarkb: jeblair: are hpcloud's new regions using an in-image bootloader configuration?21:34
fungifor pvgrub or something similar?21:34
clarkbfungi: I have no idea, I haven't looked at image internals much21:35
sdaguejeblair: or just allocate a giant tmpfs to eat up the memory :)21:36
fungiclarkb: the only real challenge to that plan is likely to be if they use an external bootloader/config21:36
jeblairsdague: clever :)21:36
clarkbfungi: jeblair: give me a couple minutes and I will have a node for us to experiment on21:36
clarkbsdague: and put our git repos in it21:36
fungisdague: a giant tmpfs with a 22gb copy of /dev/zero on it21:36
*** e0ne has quit IRC21:36
*** zns has quit IRC21:37
*** sweston has quit IRC21:37
sdaguefungi: sure, I actually wonder if we could tweak nova scheduler for this in a sane way as well.21:37
fungisdague: does the api present ways to control that? and does novaclient support them?21:41
sdaguefungi: no. I'm just thinking aloud21:42
sdaguewhich I should stop doing, as I have to run away shortly for our lug meeting21:43
jeblairmy understanding is that the new hpcloud regions have cpu throttling that differs from the old one21:43
sdaguejeblair: interesting21:43
sdaguethat sort of makes sense, that came in on flavors in grizzly21:44
clarkbimage is booting now, I will add keys to the ubuntu account when it is up21:44
sdaguethe cgroups bits21:44
*** jooools has quit IRC21:44
*** MarkAtwood has quit IRC21:44
clarkbjeblair: right. which would be fine if they still allowed bursting but that doesn't seem to be the case21:44
jeblairit's also possible that just the vcpu/ram ratios are simply other than what we would like.  or both of those things.21:44
clarkbyou get an allocation and thats it21:44
* fungi is about to disappear soon for dinner21:45
fungibut will catch up and help experiment on returning21:46
clarkbfungi: but you'll love the name of this machine :)21:46
* fungi can't wait21:47
clarkbfungi: jeblair ubuntu@
clarkbsomeone should start a root screen we can attach to21:47
clarkbbut not me because I never get terminal sizes right21:48
nibalizerclarkb: ^A F21:48
nibalizerresizes screen to your current windows size21:49
funginibalizer: oh! good tip21:49
clarkbwell I started one21:49
nibalizeralso thats awesome that you guys use screen -x, one of my favorite things to do21:49
jeblairyay i joined21:49
nibalizeralso ^A * to see who all is attached21:49
*** rf0 has quit IRC21:50
*** arborism is now known as amcrn21:50
jeblairfungi: want to look at anything before rebooting?21:51
fungijeblair: nope, checked scrollback21:52
fungimostly just curious to see whether the bootloader gives a crap that there's a grub config or goes all honeybadger on it21:52
clarkbI think the rounding happened a little funny but it booted with 8G of memory just fine21:52
jeblairwe could go for exact # of bytes from rax21:53
clarkbjeblair: I think its close enough this way and a lot more readable21:53
fungilather, rinse, repeat, but yeah this looks like it's a way forward21:53
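The "mem=8G" kernel parameter jeblair proposed, applied by a nodepool script that updates grub as clarkb suggested, could look roughly like the sketch below. It is an assumption that the images keep their kernel command line in GRUB_CMDLINE_LINUX_DEFAULT in a Debian/Ubuntu-style /etc/default/grub; the log does not say which file was edited on the test node. The helper add_kernel_param is hypothetical and only transforms the file's text; writing it back and regenerating the grub config (update-grub on Ubuntu) are left out.

```python
import re


def add_kernel_param(grub_defaults, param="mem=8G"):
    """Return grub_defaults text with param added to GRUB_CMDLINE_LINUX_DEFAULT.

    Assumes a Debian/Ubuntu-style defaults file; idempotent if param is present.
    """
    pattern = re.compile(
        r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")$', re.MULTILINE)

    def _append(match):
        current = match.group(2).split()
        if param not in current:
            current.append(param)
        return match.group(1) + " ".join(current) + match.group(3)

    new_text, count = pattern.subn(_append, grub_defaults)
    if count == 0:
        # No existing setting found: add one at the end of the file.
        new_text = (grub_defaults.rstrip("\n") +
                    '\nGRUB_CMDLINE_LINUX_DEFAULT="%s"\n' % param)
    return new_text
```

Keeping the edit idempotent matters if the same setup script runs again on an image built from a snapshot that already carries the parameter.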
jeblairclarkb: ok.  so next, want to take a snapshot, boot a new 30g from it and make sure the new one comes up with 8g?21:54
clarkbjeblair: yup I can do that now21:54
SpamapSwhy does this show "NOT_REGISTERED" for checks?
clarkbI am using the horizon ui because I haven't sorted out all the neutron stuff in region b fwiw21:54
funginibalizer: i also love using screen as a serial terminal emulator. as much as i prefer tmux these days, that's something it doesn't (and won't, according to the devs) support21:54
*** vkozhukalov has quit IRC21:54
fungiSpamapS: recheck no bug21:54
*** thomasem_ has quit IRC21:55
jeblairSpamapS: fungi: 1 sec21:55
*** esker has joined #openstack-infra21:55
nibalizerfungi: screen /dev/ttyUSB0 is basically how we got our robot working in undergrad21:55
SpamapSalt-tab fail21:55
clarkbjeblair: I will also try booting an 8GB image and maybe a 4GB image to see how the kernel reacts when closer to the limit21:55
fungiSpamapS: a few of the jobs which wanted py3k-precise nodes got started before there were any built to run them when we restarted zuul earlier21:55
jeblairwe restarted zuul and there are 0 tripleo nodes online21:55
* SpamapS checks for repeat of quota fail21:55
jeblairSpamapS: so zuul doesn't think the jobs exist21:55
*** mfer has quit IRC21:56
fungijeblair: SpamapS: oh! i should read more closely. my brain threw python33 in there21:56
jeblairSpamapS: once those nodes come online, they will be registered and zuul will resume its regularly scheduled behavior of waiting indefinitely21:56
fungijeblair: good point21:56
* SpamapS feels kinda funny now21:56
fungiSpamapS: did you let mordred buy your drinks again? bad idea21:57
*** jgrimm has joined #openstack-infra21:57
*** jgrimm has quit IRC21:57
*** Sukhdev has quit IRC21:57
SpamapSfungi: he _BUYS_ them ?21:58
*** MarkAtwood has joined #openstack-infra21:58
SpamapSquotas are in fact inaccurate...21:58
SpamapSbut there should be space available21:59
jeblairSpamapS: we're probably within nodepool's 8 hour window21:59
fungiSpamapS: we've got a ton of nodes stuck in a building state for various lengths of time from when your cloud was doing weird things21:59
jeblairSpamapS: so 8 hours after that^ nodepool should start deleting those22:00
jeblairwhich may be as soon as 30 mins from now22:01
fungithere are a few building close to the 8 hour mark, so those will likely start to get turned over in greater and greater numbers22:01
SpamapSfungi: many have deleted now that we kicked nova-compute on all of the nodes22:01
*** fbo is now known as fbo_away22:01
SpamapSOk quotas are accurate now22:01
*** mbacchi has quit IRC22:02
clarkbfungi: jeblair ubuntu@ booted off a snapshot of the first node22:02
clarkbfree reports proper values. moving on to booting off the same snapshot using smaller flavors to see how that goes22:02
ttxjeblair: been trying but then that makes lines > 78char22:02
*** bhuvan__ has joined #openstack-infra22:02
*** bhuvan___ has joined #openstack-infra22:02
jeblairttx: drat!22:03
fungiSpamapS: you also had about 100 stuck in delete state, so i'm manually deleting those quickly22:03
SpamapSfungi: I show 5 in deleting22:03
jeblairclarkb: awesome!22:03
fungiSpamapS: delete state in nodepool, rather than nova22:03
ttxok, time to cut ceilo22:03
fungiSpamapS: nodepool doesn't realize a lot of these don't exist, so it's probably being polite and trying to avoid overrunning the quota22:04
*** thomasem_ has joined #openstack-infra22:04
SpamapSfungi: makes sense.22:04
clarkbhorizon is pretty shiny22:05
fungiSpamapS: it should eventually get around to re-trying to delete them, but i'm just speeding up the process22:05
*** bhuvan_ has quit IRC22:06
*** bhuvan has quit IRC22:06
*** thomasm_ has joined #openstack-infra22:06
*** esker has quit IRC22:07
*** mfink has quit IRC22:08
*** thomasem_ has quit IRC22:09
clarkbbut now I can't find my floating ips22:09
*** ociuhandu has quit IRC22:09
clarkboh right under access and security not manage networks22:10
clarkbfungi: jeblair ssh ubuntu@ and ssh ubuntu@
clarkbI think setting the limit like that is safe22:11
clarkbregardless of flavor size on hpcloud region b precise images22:11
jeblairclarkb: too bad it didn't give us more ram.  ;)22:12
*** sarob_ has joined #openstack-infra22:12
*** pdmars has quit IRC22:12
clarkbshould I go ahead and write a patch for nodepool scripts that does this and do it only for region b nodes ?22:12
jeblairclarkb: why don't we see if it is safe for rax as well...22:13
jeblairclarkb: i'll spin up a server22:13
sdaguettx: can you send a blanket email to the list about the criteria for FFE22:13
sdaguewe're seeing at least some people assume this just means "I didn't get my feature done yet"22:13
clarkbjeblair: can you test with pvhvm?22:14
*** jnoller has quit IRC22:14
clarkbjeblair: since we should move to those images anyways imo22:14
jeblairclarkb: i was thinking we should test w/o pvhvm so we don't have to tie these together.  we can test both tho.22:15
clarkbjeblair: ok22:15
*** flaper87 is now known as flaper87|afk22:15
*** oubiwan__ has joined #openstack-infra22:16
ttxsdague: hm, I must have posted that in the past22:16
*** jeckersb is now known as jeckersb_gone22:16
ttxsdague: will try to find something22:16
sdaguettx: yeh, but lots of new folks22:16
sdaguesee the cisco one on their scheduler22:16
sdaguefor nova22:16
ttxsdague: will post something. Tomorrow.22:17
sdaguettx: sounds good22:17
fungiclarkb: did the floating ip quota in the new hp regions get addressed already?22:17
clarkbfungi: not that I am aware of. We will need to deal with quota for region b and region a separately22:18
fungiclarkb: vaguely remembering you said it was very, very low22:18
clarkbfungi: but if we are using 30GB nodes * 50 that may be what we get who knows :)22:18
*** JayF has joined #openstack-infra22:18
fungiyou can always bring mordred to the plate on that one ;)22:18
*** dkranz has quit IRC22:18
jeblairi believe mordred suggested we run lots of 30g nodes22:19
devanandahi guys! question about an ironic "agent" -- from your perspective, what should determine whether it lives in a separate project or not?22:19
jeblairotherwise i wouldn't be proposing this22:19
clarkbjeblair: it is asking for my password on
jeblairclarkb: oh, root@22:19
clarkbah thanks22:19
mordredrun lots of 30g nodes22:19
fungidevananda: is it likely to be installed and run outside of the systems which are installing and running ironic?22:19
mesteryanteaya: Linux Foundation just got Jenkins running for the OpenDaylight Neutron plugin, just wanted to confirm the voting is being blocked on the OpenStack end since we just started.22:19
clarkbjeblair: those both look good to me22:20
*** wchrisj_ has quit IRC22:20
fungidevananda: will the people developing the agent differ substantially from the set of people developing ironic?22:20
jeblairclarkb: i haven't done anything yet22:20
clarkbjeblair: oh did you boot 8GB nodes then?22:20
devanandafungi: it is likely to be _only_ used within environments running ironic-api and ironic-conductor22:20
jeblairclarkb: yes, since that's what we'd be doing anyway22:21
clarkbjeblair: oh right duh22:21
jeblairclarkb: so we should see mem go from 8 something to 7something22:21
devanandafungi: the people developing them will be closely coupled, but possibly different teams22:21
*** jroll has joined #openstack-infra22:21
jeblairclarkb: i have screen on the regular one22:21
fungidevananda: then it's probably not warranted to split it out. the other determining factor is if you want ironic and the agent to have different release cadences/models22:21
jeblair(i'm installing screen on pvhvm)22:21
*** alex-out is now known as Alexandra22:21
devanandafungi: at the moment, a release cadence more akin to the client cadence is more appealing22:22
devanandafungi: it is not a long-lived service which requires periodic big releases22:22
fungidevananda: yeah if you want to release ironic like nova but release the agent like python-novaclient then you want separate projects22:23
jeblairclarkb: screen is running on the pvhvm one22:23
devanandafungi: ack, thanks22:23
ttxyay, all done. Phew22:23
fungi(not saying that the ironic agent has anything to do with novaclient, just an example of the linear tagging and releasing from master model)22:23
clarkbjeblair: yup caught the tail end on the pvhvm one22:23
jeblairclarkb: screen back on the 1st one22:24
jeblairclarkb: looks like that doesn't work on rax22:24
jeblairclarkb: on the other hand, that's not terrible because at least it doesn't mess anything up22:24
fungibut does it break rax?22:24
clarkbjeblair: ya we can just set it and it will be ignored22:25
*** reed has quit IRC22:25
jeblairclarkb: it _does_ work on pvhvm22:25
jeblairtake a look22:25
*** mrodden1 has joined #openstack-infra22:25
fungii think if it either works or is a no-op on rax then there's no reason to try to separate out the handling for it22:25
clarkbfungi: yeah22:25
jogoI thought keystone had a fix for that22:25
jeblairso basically, it's not going to break perf, and we should smoothly upgrade to it when we start using pvhvm22:25
clarkband it looks like that is the case22:25
jeblairfungi, clarkb: agreed22:26
clarkbjeblair: did you want to snapshot those images and try booting from snapshot?22:26
jeblairlet me image and build from images too22:26
clarkbjeblair: :)22:26
*** mfink has joined #openstack-infra22:26
*** smarcet has quit IRC22:27
morganfainbergjogo, *reading up* fix for? cert?22:27
*** mrodden has quit IRC22:28
*** markmcclain has quit IRC22:28
*** spm has left #openstack-infra22:28
jogomorganfainberg: yup22:28
jogojust saw it in gate22:28
morganfainbergjogo, hmm.22:29
morganfainbergjogo, i think this is the fix
morganfainbergjogo, i _think_22:29
morganfainbergjogo, but just got in, so not released yet22:30
*** sabari has quit IRC22:30
*** bhuvan__ has quit IRC22:30
jogomorganfainberg: we don't use trunk clients22:30
jogoclarkb: ^22:30
morganfainbergjogo, aye22:31
jogoI thought we used to, why not?22:31
*** bhuvan___ has quit IRC22:31
clarkbwe do use trunk clients in integration testing22:31
morganfainbergjogo, i thought we only used release clients22:31
jogomorganfainberg: in that case that fix didn't fix it22:31
morganfainbergclarkb, hmm22:31
morganfainbergjogo, was that error ~3hrs ago?22:32
morganfainberg*looks at time*22:32
*** pcm_ has quit IRC22:32
morganfainbergjogo, it looks like that merged ~3h ago22:32
*** mdenny has quit IRC22:32
morganfainberghm. yeah i would guess that didn't fix it22:32
jogomorganfainberg: too bad :/22:33
*** _david_ has quit IRC22:33
*** sabari has joined #openstack-infra22:33
*** ociuhandu has joined #openstack-infra22:34
*** rpodolyaka has joined #openstack-infra22:35
*** hashar has quit IRC22:35
clarkbfungi: jeblair
fungii guess openstackgerrit is gone again22:35
jeblairfungi: do you know what the issue is?22:35
jogomorganfainberg: I take it you can keep digging on this one22:36
fungijeblair: which issue?22:36
morganfainbergjogo, this might be a case where shutil.move would have been a better use.22:36
StevenKfungi: We really really do miss it when it just ignores the world.22:36
jeblairfungi: openstackgerrit22:36
morganfainbergjogo, vs. os.rename22:36
morganfainbergjogo, the rest of the logic seems sound22:36
morganfainbergjogo, but i am looking at what might be going on22:36
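For context on morganfainberg's shutil.move versus os.rename remark: os.rename raises OSError when source and destination sit on different filesystems, whereas shutil.move falls back to copying and then removing the source. A minimal sketch, assuming the cert is written to a temporary file and then moved into place; install_cert and its arguments are hypothetical and are not taken from the fix under discussion.

```python
import os
import shutil
import tempfile


def install_cert(cert_text, dest_path):
    """Write cert_text to a temp file, then move it to dest_path.

    Hypothetical helper: the temp file lives in the system temp dir, which
    may be a different filesystem from dest_path, so shutil.move is used
    rather than os.rename.
    """
    fd, tmp_path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as tmp:
        tmp.write(cert_text)
    shutil.move(tmp_path, dest_path)
    return dest_path
```

When both paths are on one filesystem, creating the temp file next to the destination and calling os.rename keeps the replacement atomic on POSIX; the copy fallback trades that away for portability.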
fungijeblair: oh, yes it's apparently losing the irc server connection and throwing a traceback into its log, then sitting there22:36
jeblairfungi: we could downgrade irclib22:37
fungijeblair: clarkb pointed out that we're running gerritbot release there, and we haven't released in 1.5 years, but there's a reconnect on disconnect patch rotting on master post the last tag22:37
clarkbthat will need testing on centos6 too22:37
mordredmorganfainberg: we'd LIKE to run a copy of the tests that also run with the released clients ... but there are some issues we haven't figured out yet22:37
* clarkb fires up a centos6 image in hpcloud region b22:38
fungijeblair: so clarkb suggested releasing gerritbot and seeing if master works22:38
jogomorganfainberg: thanks!22:38
morganfainbergmordred, ++22:38
jeblairfungi: okay.22:38
morganfainbergmordred, I remember this conversation actually22:38
morganfainbergmordred, :) i look forward to the day we get both22:38
jeblairfungi, clarkb: rax images work as expected22:38
fungijeblair: i don't see any evidence to suggest for certain that this issue is new and stems from the irclib upgrade22:38
clarkbjeblair: we should spot check centos6 as well22:39
clarkbjeblair: I am doing that in hpcloud land22:39
jeblairclarkb: ack22:39
fungicould just be another bad day on freenode, or in rax-dfw or something i suppose22:39
*** rossella-s has quit IRC22:39
*** mdenny has joined #openstack-infra22:40
*** mriedem has quit IRC22:40
clarkb2014-03-05 22:42:12,550 -[WARNING]: '' failed [82/120s]: url error [[Errno 113] No route to host] apparently this isn't an az2 only problem22:42
* clarkb deletes and tries again22:43
clarkboh it might have recovered22:43
jeblairclarkb: was that region-b?22:43
clarkbjeblair: ya22:43
*** bhuvan has joined #openstack-infra22:43
mordredmorganfainberg: hrm. I think I just figured out how to do it easily...22:43
*** bhuvan_ has joined #openstack-infra22:44
morganfainbergmordred, cool!22:44
clarkbclarkb: the node brought up sshd but it isn't letting me in22:44
clarkbjeblair: ^ whoops. it wants a password22:44
clarkbso deleting it and trying again22:44
jeblairclarkb: yeah, that probably means it didn't recover from the cloudinit thing22:45
*** david-lyle has quit IRC22:46
*** SumitNaiksatam has quit IRC22:46
*** dangers is now known as dangers_away22:46
*** mdenny has quit IRC22:47
clarkbwoo happened again22:48
morganfainbergjogo, i think we need some extra logging here.22:48
*** SumitNaiksatam has joined #openstack-infra22:49
* clarkb tries a different flavor to get scheduled on a different hypervisor hopefully22:49
morganfainbergjogo, since i can't tell what may or may not have failed.22:49
*** jcooley_ has joined #openstack-infra22:49
*** rossella_s has joined #openstack-infra22:49
*** thomasem has quit IRC22:49
*** rossella_s has quit IRC22:49
*** thomasm_ has quit IRC22:50
morganfainbergjogo, i'm thinking we should log a specific error if the cert is "invalid" and include what data is in the cert.22:50
morganfainbergjogo, it'll at least make it a bit more clear. i'll work on brewing up something to help with chasing this down.22:51
*** sabari has quit IRC22:51
jogomorganfainberg: thanks!22:51
*** sabari has joined #openstack-infra22:51
morganfainbergjogo, but afaict this _should_ work as is22:51
*** rpodolyaka has quit IRC22:51
*** bada_ has joined #openstack-infra22:51
*** Ryan_Lane has joined #openstack-infra22:52
morganfainbergso it is something else going on.22:52
jogomorganfainberg: so if you think you have a fix that won't break things, you can land it and wait 24 hours and see if bug went away22:52
fungiokay, i'm disappearing for a while to go grab dinner... back later22:52
morganfainbergjogo, won't fix it, but will log more data22:52
*** sweston has joined #openstack-infra22:52
*** oubiwan__ is now known as oubiwann-ef22:53
morganfainbergjogo, so we can see if there is something else going on that isn't apparent.22:53
clarkblol amazon. They just refunded 3 cents to my card22:53
morganfainbergjogo, all i can tell you is that the cert file and ca files exist and we had a cert load error22:53
*** bada has quit IRC22:53
clarkbprobably cost more to process the refund22:53
jogoin that case more debug logs FTW22:53
morganfainbergjogo, yep!22:53
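A minimal sketch of the extra logging proposed above: when a certificate fails to load, log a specific error that includes what was actually in the file. This is not the change that was written; `load_cert` and the `parse` callable are hypothetical, and `parse` is assumed to raise ValueError on invalid data.

```python
import logging

LOG = logging.getLogger(__name__)


def load_cert(path, parse):
    """Read a cert file and parse it, logging the raw data on failure.

    Hypothetical helper: `parse` is a caller-supplied callable that
    raises ValueError when the certificate data is invalid.
    """
    with open(path) as handle:
        data = handle.read()
    try:
        return parse(data)
    except ValueError:
        # Record what was on disk so an empty or truncated file is
        # obvious from the log alone.
        LOG.exception("Invalid certificate in %s (%d bytes): %r",
                      path, len(data), data)
        raise
```

Re-raising after logging keeps the existing failure behavior and only adds diagnostic data.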
clarkbjeblair: booting centos nodes is hard.22:54
clarkbI have had 3 in a row fail. I passed UUIDS back to hpcloud hoping this helps them debug22:54
jeblairregion-x may not be the fix to the az2 problem we had hoped22:55
*** vkozhukalov has quit IRC22:55
jeblair(though it's still good future proofing since mordred keeps threatening that azx will go away)22:56
swestonchmouel: do I have to activate the account?  I am getting authentication errors :-(.  Or should I give it some time?22:56
clarkbI am trying to boot an 8GB image now to see if things get placed differently22:56
clarkbjeblair: yup22:56
*** vkozhukalov has joined #openstack-infra22:56
*** morgabra has joined #openstack-infra22:56
clarkbno luck I am going to give up for now22:57
clarkbwe don't actually boot centos nodes in hpcloud yet22:57
clarkband judging by this experience won't be able to22:58
jogoclarkb: I am still getting 502s from logstash22:59
*** markmcclain has joined #openstack-infra22:59
jeblairclarkb: ack22:59
clarkbjogo: still as compared to what? I haven't had any trouble since the move23:00
*** dcramer__ has quit IRC23:01
*** eharney has quit IRC23:01
clarkbjogo: is the query malformed?23:02
jogoclarkb: nope23:02
jogoa refresh fixed it23:02
clarkbjogo: what query are you running?23:02
jogo((message:"+ echo \'The following services are still running:") OR (message:"Error: Service" AND message:"is not running")) AND filename:"console.html" AND NOT build_name:"check-grenade-dsvm-partial-ncpu" AND build_queue:"gate"23:02
clarkbyou might be hitting timeouts if you are doing something massive23:02
*** reed has quit IRC23:02
*** reed has joined #openstack-infra23:02
*** openstack has joined #openstack-infra23:03
jogono changes needed23:04
clarkbjogo: what time period are you searching over?23:04
reedI know it's bad but I need to remove spam from planet; I'll auto-approve this if nobody does it in the next few seconds
jogo24 hours23:04
*** UtahDave has quit IRC23:05
clarkbjogo: hrm not sure then23:05
clarkbI can dig through kibana/apache logs but those are typically not very helpful let me see what es logs say23:06
*** bradm1 is now known as bradm23:07
*** rpodolyaka has joined #openstack-infra23:07
*** SumitNaiksatam has quit IRC23:09
clarkbjogo: there are no tracebacks in ES log for that query23:09
clarkbjogo: I think it was something in kibana/apache that derped23:09
*** rpodolyaka has quit IRC23:10
jogoclarkb: ahh no worries23:10
*** SumitNaiksatam has joined #openstack-infra23:11
*** ryanpetrello has quit IRC23:11
*** dkliban has joined #openstack-infra23:13
*** adrian_otto has joined #openstack-infra23:13
*** sabari has quit IRC23:14
adrian_ottoSorry guys, I am seeing something perplexing. On Jenkins voted −1, but the only tests that are not green are showing as NOT_REGISTERED. I don't understand what that means. Any clues?23:14
adrian_ottoI'm expecting to find a link to a console log that indicates what's wrong.23:15
*** jcoufal has joined #openstack-infra23:15
*** ryanpetrello has joined #openstack-infra23:15
*** ryanpetrello has quit IRC23:15
*** sabari has joined #openstack-infra23:15
clarkbadrian_otto: the job was/is not registered with Gearman. It happened because we restarted zuul when jenkins did not have slaves to run some jobs23:16
adrian_ottoso I should do a "recheck no bug"?23:16
clarkbthe jenkins plugin will only register jobs if it has slaves to run those jobs23:16
clarkbadrian_otto: you can23:16
clarkblet me look in scroll back to see if there is a bug23:16
adrian_ottoshould I expect that to work now?23:16
*** jamielennox|away is now known as jamielennox23:17
clarkbadrian_otto: yes23:17
clarkbrecheck no bug looks appropriate23:17
*** whoops has quit IRC23:19
adrian_ottothanks clarkb23:19
*** Ryan_Lane has quit IRC23:22
*** rpodolyaka has joined #openstack-infra23:22
*** bada_ has quit IRC23:25
*** bada has joined #openstack-infra23:25
*** rpodolyaka has quit IRC23:25
*** rcleere has quit IRC23:27
*** bhuvan has quit IRC23:27
*** bhuvan_ has quit IRC23:28
*** vkozhukalov has quit IRC23:29
*** yamahata has quit IRC23:29
*** Ryan_Lane has joined #openstack-infra23:30
*** bhuvan has joined #openstack-infra23:32
*** bhuvan has quit IRC23:32
*** bhuvan_ has joined #openstack-infra23:32
*** bhuvan has joined #openstack-infra23:33
*** CaptTofu has quit IRC23:33
morganfainbergjogo, at least this will give us extra logging in failure cases.  this failure should be rare, but when it happens it'll give us something23:34
*** mrodden1 has quit IRC23:36
jogomorganfainberg: woot23:36
*** bhuvan__ has joined #openstack-infra23:37
*** bhuvan___ has joined #openstack-infra23:38
*** sabari has quit IRC23:38
*** bhuvan_ has quit IRC23:39
*** bhuvan has quit IRC23:40
*** rpodolyaka has joined #openstack-infra23:40
*** weshay has quit IRC23:42
*** bhuvan has joined #openstack-infra23:42
SpamapSfungi: could you check nodepoold <-> tripleo-cloud is talking properly? I misconfigured networking last night which would have led to "many problems" :-P23:43
dhellmannmordred: does anything actually use oslo.version?23:44
*** lcostantino has joined #openstack-infra23:44
*** thedodd has quit IRC23:44
*** jhesketh__ has joined #openstack-infra23:45
*** jhesketh__ has quit IRC23:46
*** stevebaker has quit IRC23:46
*** stevebaker has joined #openstack-infra23:46
dhellmannclarkb: you're psychic now? ;-) -- thanks23:47
clarkbdhellmann: no mordred is sitting behind me23:47
dhellmannclarkb: I'm thinking of moving there instead of a lib on its own23:47
dhellmann(I figured)23:47
* dhellmann tries to keep mordred in sight at all times23:47
*** lcostantino has quit IRC23:48
*** krotscheck has quit IRC23:48
*** bhuvan_ has quit IRC23:51
*** bhuvan_ has joined #openstack-infra23:53
*** jcoufal has quit IRC23:55
SpamapSclarkb: hey, since fungi is not responding, can you check if nodepoold can reach the tripleo check cloud?23:56
* anteaya is finally caught up with scrollback *whew*23:58
clarkbSpamapS: I see a bunch of nodes in building state and delete state23:58
anteayamestery: confirmed:,members23:58
*** krotscheck has joined #openstack-infra23:58
anteayamestery: OpenDaylight Jenkins is currently in the non-voting group for 3rd party ci23:58
*** jcoufal has joined #openstack-infra23:59
anteayamestery: new ci accounts now start in this group by default23:59
anteayamestery: you can comment on changes (one comment per patchset please)23:59

