fungithe graphs seem to be recovering00:16
fungisparklines too00:16
fungidims_: yeah, we had several slaves go off the rails and need restarts. if it was precise7, 9 or 37 those should all be okay again now00:33
fungiugh, spotted another havana change devs keep blindly reverifying even though grenade is *not* going to succeed00:37
dims_cool. is there a way i can add that precise7/9/37 as a field? (in logstash) right now i have to search for it in the logs. we may be able to tell if certain jobs fail on certain hosts00:38
clarkbdims if you add that info to the zmq event publisher plugin for jenkins00:39
fungiwhich means touching java, so be sure to wear gloves00:40
dims_clarkb, which repo? i can give it a shot00:41
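Once a node name is indexed with each record, the per-host question dims_ raises above could be answered with a simple tally. This is only a minimal sketch: it assumes each record is a dict, and the field names `build_node` and `build_status` are hypothetical, not taken from the log or from the actual logstash schema.

```python
from collections import Counter


def failures_by_node(records):
    """Return the fraction of failed builds per node.

    `records` is an iterable of dicts; the keys "build_node" and
    "build_status" are assumed names used for illustration only.
    """
    total = Counter()
    failed = Counter()
    for rec in records:
        node = rec.get("build_node", "unknown")
        total[node] += 1
        if rec.get("build_status") == "FAILURE":
            failed[node] += 1
    # Every node in `total` has at least one record, so no division by zero.
    return {node: failed[node] / total[node] for node in total}
```

A node whose ratio sits well above the others would be a candidate for the kind of restart fungi describes.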
marunthe dumb stuff continues with 3rd party jobs f'ing up review comment logs:
marunis there anyone else seeing this is a problem and considering better ways of integrating advisory testing than review comments?00:53
fungiwow. crazy comment loop00:53
fungimarun: i think our "proprietary technology plugin" architecture is an attractive nuisance, but i'm probably not representative of a majority of the community there00:54
marunfungi: I have had similar thoughts, but I've been convinced it's worth it.00:55
fungicompanies want to sink time into writing driver shims rather than collaborating on standard protocols00:55
marunfungi: The alternative is those vendors doing things entirely outside of openstack and then we don't know enough to know what different vendors share (with an eye to factoring it out for reuse).00:56
marunfungi: But to your point, it's api/protocol stability vs just having the source and doing whatever.00:56
fungiyes, that has its own fragmentation risks, and none of us who survived the unix wars wants to relive it00:56
marunfungi: the former is certainly preferable, but requires a certain maturity that I'm not sure any openstack project really has yet.00:56
marunfungi: so, in the meantime, I think we need a better way of reporting advisory testing results.00:57
marunfungi: something explicit, ideally.  say, a per-patch section reporting the results of different testing mechanisms00:58
fungito your original question, weigh in on where we're at least trying to wrangle it a bit00:58
marunfungi: (in a perfect world)00:58
marunfungi: ah, ok.00:58
fungimarun: part of the problem, i think, is that this was originally expected to support a handful of smokestack-like systems, and we went from that to 30+ in the span of a few months00:59
fungii agree that if every neutron change is going to start having 20 +/-1 vrfy votes on it, this becomes a bit of an interface usability issue01:00
marunfungi: review comments are already a poor substitute for a mailing list discussion, and introducing spam just makes it impossible01:01
fungimarun: i concur. there are some things gerrit lets you do which aren't really as easy on a mailing list, but it has its own drawbacks as well01:02
marunfungi: i was initially a pretty big fan of tools like reviewboard and gerrit, but the gloss has come off for me.01:03
marunfungi: the side-by-side colorized diffs are definite pluses but I think the lack of coherency to a review conversation may negate its advantage01:03
fungioverall, having done development in a ml-driven community and in openstack's gerrit setup, i prefer the latter for scalability. if we were to try to review the change volume for all of openstack's subprojects on an ml, i think it would be untenable01:03
fungihaving the ability to add comments in context and subthread discussions around different chunks of a change and easily browse between them is pretty helpful01:04
dims_clarkb, jeblair has already added the node name in the zmq publisher (
marunfungi: fair enough01:05
fungihaving humans organize similar review discussions between multiple lists and threads gets nasty (lkml nasty)01:05
marunfungi: in any case, regardless of merit, gerrit isn't going away01:06
marunfungi: I'm hoping to prevent a regression in utility that these advisory jobs appear poised to deliver.01:06
fungimarun: so far that proposed doc update is the best rallying point we have around this discussion, though we could consider changing venue if it needs more actual discussion and less general document review01:07
marunfungi: I don't think policy is going to be enough - I think we need a different reporting mechanism.01:08
fungimarun: you very well may be right. just suggesting it needs to be raised as a point to a wider audience. right now the interested audience has been gathering on that doc review01:08
marunfungi: ah, fair enough.  I'll raise the issue there then.01:09
fungiit could make for a great -dev ml thread though. so far there have been individual discussions within neutron, nova, infra and so on01:09
fungithe ideas and experiences to date need to merge into a larger debate01:10
*** ArxCruz has quit IRC01:10
marunfungi: Hmmm, and I guess I can point to a ml thread on the review in any case.01:11
fungion the infra plus side, the recent onslaught of third-party testing requirements has gained numerous entities a much deeper understanding of our test infrastructure, so i think that's a great outcome01:11
marunDefinitely a good thing01:14
marunOpenStack efforts pretty much live and die by the quality of the CI effort, whether upstream or down.  Better not to have everyone reinvent the wheel.01:15
openstackgerritAuston McReynolds proposed a change to openstack-infra/reviewstats: Update Trove Core
clarkbfungi: I just marked precise4 offline on jenkins02, it was very quickly failing things :( you don't happen to still be about do you?01:39
fungii am01:39
clarkbfungi: did you want to poke at it or should I? I will probably just reboot it then reconnect it to jenkins which isn't very informative01:39
fungii'll see if i can shepherd it back in, but we've been losing slaves like that right and left on both jenkins01 (old version) and 02 (newer) so it's not a new bug i guess01:40
clarkbthat is annoying01:41
fungii think we're just grinding them into dust with the current gate volume01:41
fungi(the old masters, i mean)01:41
fungithey've both been restarted in the past 48 hours, so it's not an uptime thing either01:43
*** prad has joined #openstack-infra01:46
openstackgerritDavanum Srinivas (dims) proposed a change to openstack-infra/config: Add jenkins slave name to the logstash records
clarkbfungi: fyi, the logstash processors are slightly behind. We may need to add logstash-worker05-08 back to the mix01:53
clarkbfungi: but I think it is ok for now01:53
fungiokay, i'll keep that in mind. thanks01:53
dims_clarkb, added a review ( thanks for the pointer01:53
clarkbdims_: thank you01:54
fungiclarkb: looks like maybe but it says remoting 2.33 should be used in 1.542 (jenkins02 is on 1.543)01:59
fungimy eyes still have a tendency to gloss over when i look at java tracebacks. i think it must be a subconscious aversion of some kind02:00
clarkbsurvival instinct02:00
fungicould be02:00
fungii'm going to do what i've done with the other for now, which is ssh into the slave, sudo poweroff, connect to the rackspace dashboard, hard reset the vm, then bring it back up in jenkins once it boots and watch the next few jobs it runs02:04
*** yaguang has joined #openstack-infra02:12
*** fallenpegasus has joined #openstack-infra02:12
*** fallenpegasus has joined #openstack-infra02:18
harlowjawow, zuul is on fire, 70 active reviews02:34
harlowja*or jenkins is on fire, one of the above, lol02:34
fungiit's actually catching up. we were well over 100 earlier this morning north-american time02:36
fungiand it's not like people just stop approving things02:37
jeblairfungi: i disabled precise3702:37
*** mriedem has quit IRC02:37
fungii just noticed it myself02:38
fungisame error pattern02:38
fungireally looks to me exactly like the backtraces in
fungiexcept that we're also seeing it on a master which should have the fix02:39
fungiprecise37 was rebooted and started working properly again a mere 4.5 hours ago, so it's either recurrent on the same machines (in which case this is the first one i've seen so soon after) or this is a coincidence that it was 37 again02:42
jeblairfungi: are we sure that's supposed to have the fix?02:42
fungier, unless i'm misreading02:43
fungii'm not entirely clear on the relationship between jenkins and the remoting lib02:43
jeblairi'll try to dig into that02:43
fungibut we're seeing it with slaves on jenkins02 as well and it's running a newer rev than the supposed fix-carrying version02:44
jeblairfungi: btw, jenkins02 has some 9h old jobs "running" on centos slaves02:44
fungihere's a supposition... perhaps the agent on the 02 slaves continued running after we updated it?02:44
fungihadn't spotted the overdue centos jobs. that's unfortunate02:45
jeblairfungi: says 1.543 uses remoting 2.33 which bug report says has the fix02:45
*** llu has joined #openstack-infra02:46
fungigot it. and jenkins{-dev,02,03,04} is on 1.54302:46
fungijenkins.o.o and 01 are still on 1.52502:47
*** hcc has joined #openstack-infra02:47
*** hcc is now known as hdd_02:47
jeblairswitched to 2.33  in 1.54002:47
*** nati_ueno has quit IRC02:50
jeblair./maven/org.jenkins-ci.main/remoting/pom.xml:    <tag>remoting-2.28</tag>02:52
jeblair(that's unpacked slave.jar from precise37)02:52
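The check jeblair describes (unpack slave.jar, look at the remoting pom.xml) can be done without unpacking to disk. A minimal sketch, assuming the jar contains an entry whose path ends in `remoting/pom.xml` and that the version appears in a `<tag>remoting-X.Y</tag>` element as in the line quoted above; the function name is made up for illustration.

```python
import re
import zipfile

# Matches the SCM tag seen in the log, e.g. <tag>remoting-2.28</tag>
TAG_RE = re.compile(r"<tag>remoting-([^<]+)</tag>")


def remoting_version(jar):
    """Return the remoting version recorded in a slave.jar, or None.

    `jar` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            if name.endswith("remoting/pom.xml"):
                text = zf.read(name).decode("utf-8", "replace")
                match = TAG_RE.search(text)
                if match:
                    return match.group(1)
    return None
```

Running this against the slave.jar on a misbehaving slave and on a healthy one would show whether they actually differ.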
fungiso maybe that was a good guess02:52
jeblairbut the timestamp on that file is Jan  9 00:5502:53
fungiperhaps on master upgrades we should reinstall slave agents?02:53
clarkbThat is a plugin iirc maybe core and remoting differ?02:53
jeblairoh but precise37 is jenkins01 which is still old02:53
fungioh, precise37. that one's on jenkins01 which runs 1.52502:53
jeblairwhat's a slave on 02 that died?02:54
fungithough i've rebooted it, i haven't relaunched the agent on 04 yet02:54
jeblaircool, hold off on that for a bit02:54
jeblairJan  9 02:31 slave.jar02:55
fungihuh, so it reinstalled on reboot02:55
jeblairdid it reconnect?02:55
fungiaccording to jenkins02, "Ping response time is too long or timed out."02:56
fungidoesn't seem to be communicating, at any rate02:56
jeblairweird.  it's connected and it did install a new slave.jar02:57
*** changbl has joined #openstack-infra02:58
*** jasondotstar has quit IRC02:59
*** reed has quit IRC02:59
fifieldtfungi, if you get a moment in the next couple days, could I get a look at a grep of that log? (my guess is it'll be prepended with "welcome_reviews" or something)02:59
jeblairjenkins4 and jenkins6 have remoting 2.3302:59
*** prad has quit IRC02:59
jeblairjenkins6 looks like it's been online for ~24 hrs02:59
jeblairso this is looking like a dead end03:00
jeblairfungi: yes, s/jenkins/precise/03:00
fungiif we could find an evenly-numbered jenkins slave which hasn't been restarted since the jenkins02 upgrade a couple weeks ago, that would be worth looking at03:01
*** fallenpegasus has quit IRC03:01
lifelessclarkb: q: why isn't on status.../zuul ?03:02
fungififieldt: getting late over here, but i'll try to check in a bit03:02
jeblairlifeless: "Queue lengths: 238 events, 46 results."03:02
jog0I think we have a rabbit problem03:02
jeblairlifeless: it's probably in the queue03:02
fifieldtdoesn't have to be today03:02
fifieldtjust queuing the request :)03:02
fungijog0: more lettuce03:02
*** sarob has joined #openstack-infra03:02
fungijog0: s/lettuce/carrots/?03:02
lifelessjeblair: thanks03:03
jog0we keep connecting to rabbit03:03
jog0so we must keep getting disconnected too?03:04
jog0fungi: not sure how to confirm, don't know how to read rabbit logs03:05
fungiNo distributions at all found for oslo.messaging>=1.2.0a11 in ./.tox/pep8/lib/python2.7/site-packages (from glance==2014.1.dev105.gd80aa3c)03:08
fungiseen in a glance pep8 job just now03:08
fungiin the gate03:08
funginevermind. looks like it was a change which should never have been approved03:09
fungiwow. we have bred a monster03:10
fungisome devs will look for *any* excuse to reverify their changes once they're approved03:11
jog0fungi: whoa that is scary03:11
jog0all those reverifies03:12
fungiif this were the only change it'd be one thing03:12
jeblairevery now and then we're reminded why we have a gate03:13
fungii've been playing whack-a-mole with havana patches all day asking devs to stop robo-reverifying them since grenade won't pass until the grizzly devstack exercises thing gets fixed03:13
jeblairfungi: so i think jenkins02 was restarted a few days ago, and it's likely that they all have that version03:14
fungithey just pick random bugs which have nothing at all to do with the same damn failure (in the case of the havana reverifies)03:14
jog0fungi: sigh03:14
*** blamar has joined #openstack-infra03:14
jog0this is why we can't have nice things03:15
fungijeblair: okay, that answers my question then. i wondered whether slave agents hanging around from pre-upgrade with old remoting versions explained the bug continuing03:15
jeblairi'm pretty sure jenkins copies it over each time it connects, so it should be all updated03:16
fungijeblair: so we're left with two remaining likelihoods... 1) that jenkins bug isn't completely fixed or, 2) we have a different bug with the same backtrace03:16
jeblairand looking in the logs, the slave outputs a version number: 2.33, so i'm assuming that's in sync with the remoting lib03:16
fungimakes sense03:17
fungififieldt: bad news03:22
fungi(is better than no news at all?)03:22
fungi[2014-01-09 03:22:08,632] INFO : hook[patchset-created] output: timeout: failed to run command `/usr/local/bin/welcome-message': No such file or directory03:22
fifieldtgreat :D03:23
fungii can confirm that executable dne03:23
fungiprobably missing an entrypoint for it in the jeepyb setup.cfg?03:24
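A missing console script like this can be caught before deployment by reading the `[entry_points]` section of setup.cfg. A minimal sketch, assuming the usual multi-line `console_scripts` layout; the helper name is hypothetical, and any module path used to exercise it is a placeholder rather than jeepyb's real layout.

```python
import configparser


def console_scripts(setup_cfg_text):
    """Map console script names to their 'module:function' targets."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(setup_cfg_text)
    # fallback covers both a missing section and a missing option
    raw = parser.get("entry_points", "console_scripts", fallback="")
    scripts = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        name, target = line.split("=", 1)
        scripts[name.strip()] = target.strip()
    return scripts
```

If `"welcome-message"` is absent from the returned dict, no `/usr/local/bin/welcome-message` would be generated at install time, which matches the hook error above.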
jeblairi wonder if we should write a quick daemon that behaves like nodepool and we offline all nodes after completion, and then have this tool disconnect and reconnect the static slaves03:24
fifieldtthat'd be it03:24
jeblairi guess the question is would that be faster, or would it be better to rush ahead with all-single-use slaves?03:25
*** sarob has quit IRC03:25
fungijeblair: i vote for the latter. the pain is likely similar, but it gets us where we want to be sooner03:25
*** sarob has joined #openstack-infra03:26
openstackgerritTom Fifield proposed a change to openstack-infra/jeepyb: Add entrypoint for welcome_message
*** fallenpegasus has joined #openstack-infra03:28
openstackgerritTom Fifield proposed a change to openstack-infra/jeepyb: Add entrypoint for welcome_message
fifieldtthere we go03:28
fungififieldt: second patchset lgtm!03:30
fifieldtcheers fungi, sorry for the sloppiness03:30
fifieldtI must have assumed jeepyb was just magic :D03:30
fungififieldt: no worries, that's an easy bit to forget03:30
fungififieldt: technically, the magic there is in pbr03:30
*** julim has quit IRC03:35
*** sarob has quit IRC03:38
*** sarob has joined #openstack-infra03:39
*** sarob has joined #openstack-infra04:02
*** dcramer_ has quit IRC04:04
*** sarob has quit IRC04:07
*** amotoki has joined #openstack-infra04:07
*** harlowja_away is now known as harlowja04:14
*** yaguang has joined #openstack-infra04:15
*** prad has joined #openstack-infra04:32
*** fallenpegasus has quit IRC04:33
*** dstanek has joined #openstack-infra04:42
*** banix has quit IRC04:45
*** wenlock has joined #openstack-infra04:45
fungiokay, knocking off early for the night... back in 7 or 8 hours04:46
*** changbl has joined #openstack-infra04:57
*** praneshp has joined #openstack-infra04:58
*** banix has quit IRC05:06
*** morganfainberg has quit IRC05:07
*** fallenpegasus has joined #openstack-infra05:10
clarkbfungi: good night05:10
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Create more bare-precise nodes
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Allow nova to use bare-precise nodes
openstackgerritA change was merged to openstack-infra/config: Use performance rax nodes in the gate
*** chandankumar has joined #openstack-infra05:27
openstackgerritA change was merged to openstack-infra/zuul: Move gear import to a safe place
*** praneshp_ is now known as praneshp05:35
*** prad has quit IRC05:36
*** gyee_ has quit IRC05:39
*** dstanek has joined #openstack-infra05:40
*** morganfainberg has joined #openstack-infra05:41
*** nati_ueno has joined #openstack-infra05:43
*** coolsvap has joined #openstack-infra05:53
*** dstanek has joined #openstack-infra05:56
*** fallenpegasus has joined #openstack-infra06:05
*** wenlock has quit IRC06:06
*** dstanek has quit IRC06:08
*** jamielennox is now known as jamielennox|away06:13
clarkbpleia2: anteaya: were you guys having a day off?06:13
StevenKI did wonder that myself.06:15
*** hdd_ has quit IRC06:21
*** talluri has joined #openstack-infra06:23
*** chandankumar has quit IRC06:25
*** harlowja is now known as harlowja_away06:25
pleia2clarkb: yeah, I seem to have a cold :( decided to take it easy06:27
clarkbget better06:27
pleia2we're meeting in the lobby soon to play with some testing stuff though06:27
clarkbI feel like i am coming down with a cold too, going to skip the dinner thing tonight and try to get a decent amount of sleep06:28
pleia2good idea06:28
openstackgerritA change was merged to openstack-infra/publications: Update sysadmin-codereview from Oct presentation
pleia2if someone feels like tagging that, I can get my latest changes in too: git tag -s -m "Bay Area LUG, 2013" 2013-balug-sysadmin-codereview06:31
jeblairpleia2: to confirm, i'll tag the commit that just merged with that ^ ?06:32
pleia2jeblair: correct, october was my balug talk06:32
*** fallenpegasus has quit IRC06:35
jeblairpleia2: pushed06:35
pleia2jeblair: thanks!06:35
jeblairno prob!06:36
openstackgerritA change was merged to openstack-infra/config: Remove devstack-precise-check rax images
openstackgerritA change was merged to openstack-infra/config: Increase hpcloud ssh timeout to 180
openstackgerritA change was merged to openstack-infra/config: Create more bare-precise nodes
*** pblaho has joined #openstack-infra06:48
openstackgerritElizabeth Krumbach Joseph proposed a change to openstack-infra/publications: Add a couple services we manage and checks
*** oubiwann has quit IRC06:53
*** morganfainberg has quit IRC07:00
*** fallenpegasus has joined #openstack-infra07:00
*** morganfainberg has joined #openstack-infra07:01
*** fallenpegasus2 has joined #openstack-infra07:06
*** fallenpegasus has quit IRC07:06
*** fallenpegasus2 has quit IRC07:10
mozawaHi, can someone please restore this one ?
mozawaIt's not owned by me but by Jenkins. So I can not restore it.07:18
*** obondarev has joined #openstack-infra07:19
mozawaBecause this change is in abandoned state, I got the following error in git review (dependency error).07:19
mozawa[mozawa@mubuntu unit (bp/s3-multi-part-upload)]$ git review07:19
mozawaYou have more than one commit that you are about to submit.07:19
mozawaThe outstanding commits are:07:19
mozawa56be241 (HEAD, bp/s3-multi-part-upload) Implemented S3 multi-part upload functionality07:19
mozawa6dace29 Updated from global requirements07:19
mozawaIs this really what you meant to do?07:19
mozawaType 'yes' to confirm: yes07:19
mozawaremote: Resolving deltas: 100% (13/13)07:19
mozawaremote: Processing changes: refs: 1, done07:19
mozawaTo ssh://
mozawa ! [remote rejected] HEAD -> refs/for/master/bp/s3-multi-part-upload (change 63708 closed)07:19
mozawaerror: failed to push some refs to 'ssh://'07:19
mozawa[mozawa@mubuntu unit (bp/s3-multi-part-upload)]$07:19
*** fallenpegasus has quit IRC07:20
*** jamielennox|away is now known as jamielennox07:23
*** yolanda has joined #openstack-infra07:23
*** yolanda has quit IRC07:26
*** yolanda has joined #openstack-infra07:31
*** fallenpegasus has joined #openstack-infra07:32
*** skraynev has joined #openstack-infra07:41
*** coolsvap has joined #openstack-infra07:44
*** fallenpegasus has quit IRC08:01
*** dpyzhov has joined #openstack-infra08:09
*** pblaho has quit IRC08:11
*** fallenpegasus has joined #openstack-infra08:27
*** fallenpegasus has quit IRC08:30
*** jpich has joined #openstack-infra08:49
*** fallenpegasus has joined #openstack-infra08:59
*** jroovers has joined #openstack-infra09:25
*** SergeyLukjanov has joined #openstack-infra09:25
*** fallenpegasus has joined #openstack-infra09:43
*** coolsvap has quit IRC09:46
*** yassine has quit IRC09:46
*** yassine has joined #openstack-infra09:46
*** praneshp has quit IRC09:47
openstackgerritDarragh Bailey proposed a change to openstack-infra/jenkins-job-builder: Use yaml local tags to support including files
openstackgerritDarragh Bailey proposed a change to openstack-infra/jenkins-job-builder: Add tests for YamlParser and patch 2.6 minidom
anteayawhen someone who can access the gerrit db is awake and able, can you check account id # 9832 username mayu and let me know what the db has as ssh keys?09:57
anteayaI am looking to confirm this public key: for that account09:57
*** nati_ueno has joined #openstack-infra09:58
*** plomakin has joined #openstack-infra10:01
*** nati_ueno has quit IRC10:02
*** mozawa has quit IRC10:02
clarkbanteaya: can the user confirm it themselves?10:12
ttxsdague: did you propose a date for the gate bugs day ?10:17
* ttx is slightly out of touch with crappy Internet connection10:18
ttxtethering on cell data right now10:18
*** fallenpegasus2 has joined #openstack-infra10:22
*** SergeyLukjanov has quit IRC10:30
*** fallenpegasus2 has quit IRC10:35
*** boris-42 has joined #openstack-infra10:50
*** yassine has quit IRC10:51
*** saschpe has quit IRC10:57
*** hashar has joined #openstack-infra11:01
*** jcoufal has quit IRC11:14
*** jcoufal has joined #openstack-infra11:15
*** markmc has joined #openstack-infra11:16
*** yaguang has quit IRC11:18
*** ruhe is now known as _ruhe11:18
*** _ruhe is now known as ruhe11:23
sdaguettx: I have not yet, I was going to run it by the ptls on friday. Honestly, I was trying to get through some of this analysis first11:28
*** boris-42 has quit IRC11:29
ttxsdague: ack. Was about to suggest it as a response to jog0's call for help, but figured I should ask you first11:30
*** pblaho has quit IRC11:34
*** che-arne has joined #openstack-infra11:37
*** boris-42 has joined #openstack-infra11:38
*** pblaho has joined #openstack-infra11:46
*** pblaho has quit IRC11:46
*** dstanek has quit IRC11:48
*** rakhmerov has joined #openstack-infra11:51
*** rakhmerov1 has joined #openstack-infra11:52
*** cody-somerville_ has quit IRC11:52
*** rfolco has joined #openstack-infra11:57
*** weshay has joined #openstack-infra11:57
*** coolsvap has quit IRC12:02
*** tma996 has joined #openstack-infra12:03
*** ruhe_away is now known as ruhe12:08
*** jooools has quit IRC12:09
*** cody-somerville_ is now known as cody-somerville12:10
openstackgerritDavanum Srinivas (dims) proposed a change to openstack-infra/config: Add jenkins slave name to the logstash records
*** dims_ is now known as dims12:11
*** ruhe is now known as _ruhe12:23
*** hashar has quit IRC12:26
*** saschpe has joined #openstack-infra12:29
*** yassine has joined #openstack-infra12:32
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Introducing basic REST API
openstackgerritNikita Konovalov proposed a change to openstack-infra/storyboard: Introducing basic REST API
*** dstanek has quit IRC12:47
*** _ruhe is now known as ruhe12:50
*** fallenpegasus has joined #openstack-infra12:53
*** heyongli has joined #openstack-infra12:54
*** mozawa has joined #openstack-infra12:55
*** NikitaKonovalov has quit IRC12:57
*** oubiwann has joined #openstack-infra12:58
*** pblaho has joined #openstack-infra12:59
fungimozawa: i've restored 63708 as you requested12:59
*** johnthetubaguy1 is now known as johnthetubaguy13:00
fungianteaya: checking into mayu's account now13:00
*** dkranz has joined #openstack-infra13:01
fungianteaya: it looks like mayu pasted their ssh key into gerrit the same way it's pasted into that paste... incorrectly13:03
fungianteaya: it should not have embedded newlines, but rather be all one line with no breaks13:04
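The fix fungi describes (one line, no breaks) can be applied mechanically before pasting a key into gerrit. A minimal sketch, assuming the stray line breaks fell inside the base64 body rather than between fields; the function name is made up, and the sanity check relies on the standard OpenSSH public key layout where the decoded blob begins with a length-prefixed copy of the key type.

```python
import base64
import struct


def normalize_pubkey(pasted):
    """Collapse a line-wrapped OpenSSH public key to a single line."""
    # Remove line breaks without inserting spaces, so a body that was
    # wrapped mid-string is rejoined exactly.
    joined = "".join(pasted.splitlines()).strip()
    parts = joined.split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64> [comment]'")
    key_type, blob = parts[0], parts[1]
    comment = parts[2] if len(parts) > 2 else ""

    # Sanity check: the blob must decode and must name the same key type.
    raw = base64.b64decode(blob, validate=True)
    if len(raw) < 4:
        raise ValueError("key body too short")
    (length,) = struct.unpack(">I", raw[:4])
    if raw[4:4 + length].decode("ascii", "replace") != key_type:
        raise ValueError("key body does not match declared type")

    return " ".join(p for p in (key_type, blob, comment) if p)
```

A key that fails the check was probably truncated or wrapped at a field boundary, and is better regenerated or re-copied than repaired by hand.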
*** pblaho has left #openstack-infra13:04
*** pblaho has joined #openstack-infra13:04
mozawafungi: Thank you very much !13:06
fungilooks like precise26 has gone rogue. marking offline13:06
*** dkranz has quit IRC13:08
*** jcoufal has quit IRC13:11
openstackgerritThierry Carrez proposed a change to openstack-infra/config: Notify openstack-operators on UpgradeImpact
*** oubiwann has quit IRC13:17
*** jroovers has joined #openstack-infra13:18
*** jroovers has quit IRC13:18
*** jcoufal has joined #openstack-infra13:18
*** jroovers has joined #openstack-infra13:18
*** alexpilotti has joined #openstack-infra13:25
*** prad has joined #openstack-infra13:25
*** jcoufal_ has joined #openstack-infra13:28
*** jcoufal has quit IRC13:28
ruhei'm trying to set up a zuul server, and constantly get the following error when attempting to enqueue a job:13:28
ruheJob <gear.Job 0x1f4df50 handle: None name: build:job_name unique: feb7436652964357bfeb1d1db8443478> is not registered with Gearman13:28
ruheis it jenkins who has to register the job with gearman?13:28
fungiruhe: yes, or more specifically, the jenkins-gearman plugin installed and configured on the jenkins master13:28
ruhefungi: does it happen by itself when i create a new job? (assuming plugin is installed and enabled)13:30
fungiruhe: it should, yes13:30
fungiassuming correct configuration and connectivity13:30
ruhefungi: thank you. now i know where to dig13:30
fungiyou're welcome. there's also an admin protocol for gearman, so if you want to debug there you can connect to the gearman service and interrogate it13:31
openstackgerritEli Klein proposed a change to openstack-infra/jenkins-job-builder: Added rbenv-env wrapper
*** sandywalsh has joined #openstack-infra13:31
fungiruhe: with gearman admin commands you can do things like see connected workers and registered jobs13:31
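The admin protocol fungi mentions is plain text over the gearman port: send `status` or `workers` followed by a newline and read until a line containing only a dot. A minimal sketch, assuming the server listens on the default port 4730 and is reachable from where the script runs; both helper names are made up for illustration.

```python
import socket


def gearman_admin(host, command, port=4730, timeout=5.0):
    """Send one admin command (e.g. "status") and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii") + b"\n")
        buf = b""
        # Replies to "status" and "workers" end with a line holding only "."
        while not (buf == b".\n" or buf.endswith(b"\n.\n")):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
    return buf.decode("utf-8", "replace")


def parse_status(text):
    """Parse a "status" reply: name, queued, running, available workers."""
    jobs = {}
    for line in text.splitlines():
        if line == ".":
            break
        fields = line.split("\t")
        if len(fields) != 4:
            continue
        name, total, running, workers = fields
        jobs[name] = {
            "queued": int(total),
            "running": int(running),
            "workers": int(workers),
        }
    return jobs
```

For ruhe's error, calling `parse_status(gearman_admin("localhost", "status"))` and checking whether `build:job_name` appears with a nonzero worker count would show whether the jenkins plugin registered the job at all.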
*** jasondotstar has quit IRC13:38
*** prad has quit IRC13:38
*** jasondotstar has joined #openstack-infra13:39
*** CaptTofu has joined #openstack-infra13:40
*** yamahata has joined #openstack-infra13:49
*** yamahata has quit IRC13:50
*** yamahata has joined #openstack-infra13:51
openstackgerritA change was merged to openstack-infra/config: Allow nova to use bare-precise nodes
*** mriedem has joined #openstack-infra14:00
*** dprince has joined #openstack-infra14:00
fungii'm keeping an eye on nova jobs and will revert that ^ if it starts to cause trouble14:01
*** med_ has quit IRC14:01
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: use pandas DataFrames for new check_success_pd
*** talluri has quit IRC14:08
*** thuc has joined #openstack-infra14:08
*** amotoki has quit IRC14:09
*** talluri has quit IRC14:13
fungii've taken precise34 offline now too, same problem14:14
*** prad has joined #openstack-infra14:22
*** SergeyLukjanov has joined #openstack-infra14:23
SergeyLukjanovhey folks14:24
SergeyLukjanovare there any bugs we can reference for jenkins slave problems?14:25
SergeyLukjanovespecially in case when this slave was already disabled14:26
fungiSergeyLukjanov: we think it may be but some of the failing slaves are on jenkins02 which is supposed to have that fixed, so we're not yet sure14:26
*** michchap_ has quit IRC14:26
*** michchap has joined #openstack-infra14:27
SergeyLukjanovI see another error on precise3414:27
SergeyLukjanovit was already disabled by you too14:27
fungiSergeyLukjanov: however, we've merged as the latest tack in our progress toward not reusing any general-purpose slaves any longer, and i'm seeing what looks like success on the nova changes in the gate so far14:27
fungiso maybe this will be mostly behind us within the next day or so14:28
fungipython 2.6 and 3.3 unit tests are going to be a bit more work to knock out though14:28
fungisince we'll need additional images for them, and possibly additional scripting14:29
SergeyLukjanovin savanna I see mostly pep8 and py27 failed14:29
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: add pandas infrastruce into elastic recheck
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: use pandas DataFrames for new check_success_pd
fungiSergeyLukjanov: yeah, we run a lot more tests on regular ubuntu precise, while the centos6 and precisepy3k nodes are pretty much only used for python 2.6 and 3.3, so we have many more regular precise slaves as a result and so i expect to see them fail (and have) with greater frequency14:30
SergeyLukjanovfungi, which bug is better to use to verify CRs failed on precise34?14:31
fungiSergeyLukjanov: there have been a few one-off openstack-ci bugs opened for the individual slaves which failed, but they're starting to happen with such regularity now that i'm not sure continuing to open new bugs for each of them is worthwhile this week. maybe just take over one of those and mark the others as a dupe14:32
*** ryanpetrello has joined #openstack-infra14:32
SergeyLukjanovfungi, that's why I'm asking14:32
SergeyLukjanovI'll take a look on them14:33
*** prad has quit IRC14:33
fungithe uptick in slave agent failures seems to correspond with the rise in test load, but also possibly even more sharply with the zuul upgrade. i suspect as we made zuul more efficient, it has put further strain on the jenkins masters14:33
*** rakhmerov has quit IRC14:35
luqasHi guys, I'm seeking for a service account for 3rd party testing14:35
luqasi've written to open-infra mail list but got no answer so far14:35
fungiluqas: i was just in the process of handling that one, actually14:36
luqasoh perfect14:36
fungiluqas: i'll reply to your message here shortly14:36
*** mrodden has joined #openstack-infra14:36
fungiwell, i'll reply to your message *there* shortly (i've already replied here) ;)14:36
luqasthanks a lot, we may be changing the contact e-mail if possible14:36
fungiluqas: i can change it for you later if needed, but it's not something you'll be able to change directly yourself since it needs to happen in the database14:37
SergeyLukjanovfungi, yup, it looks like we could see a much bigger load on the jenkins masters if zuul processes events faster14:37
luqasfungi, ok, it's ok right now14:37
*** markmcclain has quit IRC14:39
*** dkliban has joined #openstack-infra14:39
SergeyLukjanovfungi, I see only one open bug about jenkins agent init failed14:41
SergeyLukjanovthat was created by me for precise3714:41
*** ilyashakhat has quit IRC14:41
*** eharney has joined #openstack-infra14:41
fungiSergeyLukjanov: okay, that works. we'd previously been marking them fix-released shortly after disabling/restoring the affected slave14:42
fungiwhich is probably why you don't see others with a normal search14:42
SergeyLukjanovfungi, yup, I mean that all the others were already closed14:42
SergeyLukjanovfungi, I think that it'll be better to rename it to "Jenkins agent init failed" and use it for recheck/reverify14:42
SergeyLukjanovand probably track failed slaves list in description14:43
SergeyLukjanovto not create new bugs each time14:43
*** GheRiver1 is now known as GheRivero14:43
*** thuc has quit IRC14:43
*** thuc has joined #openstack-infra14:44
SergeyLukjanovfungi, ok, I'll rename it, update the description and send the bug to os-dev so that people can recheck/reverify failed CRs14:45
fungiSergeyLukjanov: thanks! that's a hude help14:46
fungier, huge too14:46
fungiSergeyLukjanov: in positive news, these nova jobs, which previously would have been run on long-running reusable slaves, ran successfully on nodepool bare nodes:
fungiso i think i may try turning it on for a few more projects here shortly14:47
*** thuc has quit IRC14:48
*** CaptTofu has quit IRC14:48
SergeyLukjanovbare nodes are single-use?14:49
fungiSergeyLukjanov: for now at least, all nodepool-managed nodes are single-use14:49
*** dcramer_ has joined #openstack-infra14:50
SergeyLukjanovfungi, oh, that's surprising ;)14:50
SergeyLukjanovI was thinking that long-running nodes are created by nodepool14:50
fungiwe may decide to instead set an upper-bound on reuse for them so we don't incur as much build/beat/burn overhead, but we'll approach that once we see it's an actual problem14:51
SergeyLukjanovbtw we're now using nodepool to build non single-use nodes for savanna-ci14:51
fungiSergeyLukjanov: awesome!14:51
SergeyLukjanovfungi, zuul/nodepool/gearman on trunk OpenStack with Neutron are used now14:52
fungioh wow, good work14:52
*** banix has joined #openstack-infra14:53
SergeyLukjanovthere are very small changes to support Neutron14:53
SergeyLukjanovlike adding network id AFAIK14:53
SergeyLukjanovwe'll return back with patches when it'll be fully checked14:53
*** malini_afk is now known as malini14:54
fungiso just to confirm, you have a continuously-deployed openstack-neutron cloud, and you have nodepool adding/deleting slaves within it? definitely appreciate the patches14:55
fungithat'll put us ahead of the curve for when our cloud providers catch up to that14:57
*** marun has quit IRC14:57
*** thuc has joined #openstack-infra14:57
*** prad has joined #openstack-infra14:57
*** herndon_ has joined #openstack-infra14:58
*** CaptTofu has joined #openstack-infra14:58
*** rakhmerov has joined #openstack-infra14:59
*** rakhmerov has quit IRC14:59
ruhefungi: what do you mean by "continuously-deployed" ? our (savanna-ci) cloud is based on stable/havana with neutron15:00
*** oubiwann has quit IRC15:00
fungiruhe: aha, i misread. when SergeyLukjanov said "on trunk" he meant trunk zuul, nodepool... not trunk openstack15:01
SergeyLukjanovfungi, yup, we have an OpenStack cluster with Neutron manually installed (using devstack)15:01
SergeyLukjanovfungi, and nodepool creates slaves in this cluster15:01
fungistill sounds good, and definitely excited to get patches to support that15:02
*** thuc has quit IRC15:02
*** CaptTofu has joined #openstack-infra15:02
*** thuc has joined #openstack-infra15:02
ruhefungi: we're also working towards deploying it all with openstack-infra/config puppet scripts (that's where i've got stuck with zuul not being able to find jobs)15:03
openstackgerritSean Dague proposed a change to openstack-infra/config: make a dedicated page for gate status
fungiruhe: makes sense. if you find the puppetry or documentation is missing something significant, please open bugs and/or submit patches for that. we're always looking to improve and make it easier15:04
SergeyLukjanovfungi, I hope that we'll eventually adopt os-infra/config to install all our savanna-ci related infra15:06
*** rcarrillocruz1 has joined #openstack-infra15:07
fungiSergeyLukjanov: i hope so too!15:07
*** oubiwann has joined #openstack-infra15:07
*** thuc has quit IRC15:07
SergeyLukjanovfungi, here is the bug for agent failures
SergeyLukjanovfungi, could you please take a look at it, and then I'll send an email to dev15:07
*** rcarrillocruz has quit IRC15:07
fungiSergeyLukjanov: definitely, having a look now15:07
SergeyLukjanovfungi, thx15:08
*** marun has joined #openstack-infra15:12
openstackgerritSean Dague proposed a change to openstack-infra/config: add in the optional ; everywhere
sdaguefungi: can you give a look. That will let me drop that piece off the ER webpage15:15
fungiSergeyLukjanov: okay, bug looks good. i added some additional status detail on what we're doing to solve it15:16
fungisdague: sure15:16
*** senk has quit IRC15:17
*** senk has joined #openstack-infra15:22
*** krotscheck has joined #openstack-infra15:23
*** rwsu has joined #openstack-infra15:23
*** dims has quit IRC15:23
*** dims has joined #openstack-infra15:24
*** senk1 has joined #openstack-infra15:25
*** senk has quit IRC15:26
*** dkranz has joined #openstack-infra15:27
*** kraman has joined #openstack-infra15:27
*** mfer has joined #openstack-infra15:29
*** rakhmerov has joined #openstack-infra15:30
*** jorisroovers has joined #openstack-infra15:34
shardyHi all, seeing this failure which seems to be a network outage or something:15:34
shardyIs it safe to reverify no bug, or should I raise one?15:35
*** rnirmal has joined #openstack-infra15:35
*** rakhmerov has quit IRC15:35
*** mfink has left #openstack-infra15:36
*** rakhmerov has joined #openstack-infra15:38
fungishardy: reverify bug 126736415:39
*** dpyzhov has quit IRC15:40
fungishardy: i'm starting to suspect that the increased approval volume following everyone's return from the holidays, plus our recent upgrade of zuul to a beefier server, has started to strain jenkins in ways we hadn't previously seen with this regularity15:41
*** dpyzhov has joined #openstack-infra15:41
pasquier-shi, I've got a review that's been approved but the gate jobs seem to be lost:15:41
pasquier-sany hint?15:42
fungipasquier-s: looking15:42
pasquier-sfungi, thanks!15:42
*** mfink has joined #openstack-infra15:42
openstackgerritA change was merged to openstack-infra/reviewstats: Reformat heat.json
*** jgrimm has joined #openstack-infra15:44
*** mfink has quit IRC15:44
fungipasquier-s: yeah, it looks like that may have happened right when we were restarting zuul yesterday, possibly between when jeblair dumped the queue and stopped the service, so it didn't get reenqueued with the others. i'll add it back to the gate for you in just a moment15:44
*** mfink has joined #openstack-infra15:44
*** mfink has quit IRC15:45
fungipasquier-s: openstack/python-heatclient 65269,2 has been enqueued into the gate pipeline and appears on now15:51
*** pblaho has quit IRC15:52
*** AJaeger has quit IRC15:56
*** markmcclain has quit IRC16:00
*** rakhmerov has quit IRC16:01
*** gothicmindfood has joined #openstack-infra16:02
*** markmcclain has joined #openstack-infra16:10
*** medberry has joined #openstack-infra16:12
*** medberry has joined #openstack-infra16:12
*** rcarrillocruz has joined #openstack-infra16:12
*** rcarrillocruz1 has quit IRC16:13
*** tma996 has quit IRC16:22
*** pcrews has joined #openstack-infra16:26
*** thuc has joined #openstack-infra16:28
*** thuc_ has joined #openstack-infra16:30
*** thuc has quit IRC16:30
*** jcoufal_ has quit IRC16:31
*** rakhmerov has joined #openstack-infra16:31
*** mozawa has quit IRC16:34
*** rakhmerov has quit IRC16:36
*** mozawa has joined #openstack-infra16:39
*** CaptTofu has quit IRC16:44
*** UtahDave has joined #openstack-infra16:45
*** medberry is now known as med_16:47
openstackgerritJoão Vale proposed a change to openstack-infra/jenkins-job-builder: Add support for parameters in pipeline publisher.
*** che-arne has joined #openstack-infra16:50
*** senk1 has quit IRC16:50
*** ^d has joined #openstack-infra16:57
*** senk has joined #openstack-infra16:57
fungii've taken precise39 offline17:00
*** derekh has quit IRC17:02
SergeyLukjanovthe same problems?17:02
SergeyLukjanovjenkins01 again17:03
fungiit seems to be fairly evenly distributed between jenkins01 and 0217:04
*** markmc has quit IRC17:04
fungii need to add an account in their jira and comment on that bug that we're seeing the same backtraces in 1.543 (possible regression? different issue?)17:04
SergeyLukjanovfungi, yup17:04
*** NikitaKonovalov has quit IRC17:04
*** AaronGr_Zzz is now known as AaronGr17:06
*** praneshp has joined #openstack-infra17:07
*** alexpilotti has joined #openstack-infra17:09
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow cinder to use bare-precise nodes
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow glance to use bare-precise nodes
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow keystone to use bare-precise nodes
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow heat to use bare-precise nodes
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow horizon to use bare-precise nodes
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow ceilometer to use bare-precise nodes
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow swift to use bare-precise nodes
fungidone as individual changes to make them easier to roll in or revert as we see issues17:17
*** rcarrillocruz has quit IRC17:17
*** markmcclain has joined #openstack-infra17:21
fungiand i want to wait until there are other infra cores around to confirm this is a sane direction17:23
SergeyLukjanovfungi, I'm proposing to make savanna able to use bare nodes to test it, what do you think about it?17:24
fungisounds great!17:24
fungibase it on the tip of master though, so we can merge it without needing to wait on my stack17:25
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Allow swift to use bare-precise nodes
fungithat one ^ i had the wrong project on (accidentally put it on the client when i meant to do the server)17:26
*** ruhe is now known as _ruhe17:27
openstackgerritSergey Lukjanov proposed a change to openstack-infra/config: Allow savanna to use bare-precise nodes
*** yassine has quit IRC17:30
*** praneshp has quit IRC17:31
fungiSergeyLukjanov: we can merge that if you're feeling fairly confident it won't break for you (or if you don't mind rechecking once we revert)17:32
*** rakhmerov has joined #openstack-infra17:32
*** reed has joined #openstack-infra17:33
SergeyLukjanovfungi, I don't think that it will break anything in savanna, we're not using any specific jobs17:33
SergeyLukjanovfungi, let's try to merge that and wait for a day to collect results17:34
SergeyLukjanovfungi, is it possible to search in console logs using logstash?17:34
fungiyes, just no wildcard searches17:35
*** praneshp has joined #openstack-infra17:35
*** mancdaz is now known as mancdaz_away17:35
*** rakhmerov has quit IRC17:37
*** NikitaKonovalov has joined #openstack-infra17:37
fungiso far, all of the nova jobs in the gate which ran on nodepool bare-precise nodes (and which were allowed to complete without being cancelled/aborted) have succeeded. also, we haven't previously seen any impact from the infra jobs we moved to that node type either17:41
fungii just jinxed myself...
SergeyLukjanovfungi, I'll monitor savanna jobs for any issues17:42
SergeyLukjanovfungi, oh, the same error at bare node slave @ jenkins0117:43
fungithough that's an expected behavior. jenkins will still have occasional broken slave agents regardless, because of whatever the jenkins bug is17:43
fungithe up side is that the node will get thrown away and not reused over and over in a rapid-fire loop failing every job it's given17:44
SergeyLukjanovfungi, yup and it's much better I think than manually disable slaves each time17:44
SergeyLukjanovat least...17:44
fungiand as intended, if you follow the link to where it ran, that slave has already been deregistered and deleted17:44
fungiso the broken slave killed a job, but just *that one* job17:45
SergeyLukjanovfungi, yup, see it17:45
SergeyLukjanovwe've used 534 jenkins for a long time for savanna-ci w/o such problems17:46
SergeyLukjanovj02 is 543 and we have another agent error on it17:46
SergeyLukjanovzaro, you need to rebase one to another17:47
fungizaro: easy solution, git review -d change1 && git review -x change2 && git review -x change3 && ...17:47
SergeyLukjanovfungi, j03 and j04 are 543 too17:48
SergeyLukjanovfungi, are there any problems in updating jenkins?17:48
fungiSergeyLukjanov: we upgraded -dev from 1.525 to 1.543, ran it through some tests, then upgraded 02 to 1.543, ran into a bug in one of our plugins so rolled that back, tracked down/fixed it and upgraded again17:49
SergeyLukjanovfungi, oh, got it17:50
fungiabout the time we determined it was at least as stable as 1.525 we rushed into building 03 and 04 with 1.543 before we got around to upgrading 0117:50
fungijenkins.o.o is similarly still on 1.52517:50
mgagnecould someone explain to me the fundamental differences between precise and bare-precise ?17:51
SergeyLukjanovzaro, you can't do it w/o pushing a new patch17:51
fungimgagne: "precise" nodes are long-running nodes which get reused over and over17:51
openstackgerritJoão Vale proposed a change to openstack-infra/jenkins-job-builder: Add support to specify GitLab version.
fungimgagne: bare-precise are our new nodepool-managed single-use slaves17:51
mgagnefungi: bare-precise are throw away managed by nodepool?17:51
zaroSergeyLukjanov: ok. thanks.17:51
mgagnefungi: alright, thanks17:51
fungimgagne: "bare" in this case meaning "not dsvm"17:52
mgagnefungi: oh17:52
*** dmsimard has joined #openstack-infra17:52
zarofungi: seems like a very overloaded term :)17:52
dmsimardfungi: Thanks for deactivating .. was confused about the build failures :)17:52
fungimgagne: the expectation being that we may wind up with more types of precise nodes in nodepool (for example, maybe py3k-precise)17:53
fungidmsimard: i should start linking in the deactivation messages17:54
*** luqas has quit IRC17:54
mgagnefungi: sure. I would have suggested renaming precise to dsvm-precise then :P17:54
dmsimardfungi: Is there a way to ask jenkins another run without submitting another patch set ?17:55
fungidmsimard: yes, was the patch already approved or just being checked?17:55
dmsimardfungi: Just being checked, not approved yet17:56
mgagneleave a comment: recheck bug 126736417:56
*** NikitaKonovalov has quit IRC17:56
openstackgerritKhai Do proposed a change to openstack-infra/jenkins-job-builder: make scm test as the examples
SergeyLukjanovdmsimard, you can find more info about it here -
*** afazekas has quit IRC18:00
fungidmsimard: in this case, bug 1267364 is a bug about failing jenkins slave agents18:00
*** BobBall is now known as BobBallAway18:00
fungidmsimard: the recheck bug ###### syntax is so that we can try to keep track of what bugs are causing devs to need to re-test their changes18:01
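The comment convention described above ("recheck bug NNNNNN", "reverify bug NNNNNN", or "... no bug") lends itself to simple tallying. What follows is only a rough sketch in Python: the pattern and the helper names `parse_retest` and `tally_bugs` are made up for illustration, and the real trigger expression in the zuul configuration may well differ.

```python
import re
from collections import Counter

# Hypothetical pattern for illustration only; not the expression zuul uses.
RETEST_RE = re.compile(
    r"^\s*(recheck|reverify)\s+(?:bug\s+#?(\d+)|(no bug))\s*$",
    re.IGNORECASE,
)


def parse_retest(comment):
    """Return (action, bug_number or None), or None if not a retest comment."""
    match = RETEST_RE.match(comment)
    if match is None:
        return None
    action = match.group(1).lower()
    bug = int(match.group(2)) if match.group(2) else None
    return action, bug


def tally_bugs(comments):
    """Count how often each bug number is cited in retest comments."""
    counts = Counter()
    for comment in comments:
        parsed = parse_retest(comment)
        if parsed is not None and parsed[1] is not None:
            counts[parsed[1]] += 1
    return counts
```

Fed a list of review comments, `tally_bugs` gives a per-bug count, which is the kind of tracking the bug-number syntax is meant to enable.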
openstackgerritA change was merged to openstack-infra/config: Allow savanna to use bare-precise nodes
*** CaptTofu has joined #openstack-infra18:01
*** sparkycollier has quit IRC18:01
fungiSergeyLukjanov: in about 30 minutes you can recheck an open change and see whether that ^ worked18:01
dmsimardfungi: Yeah, I caught on - thanks18:01
*** NikitaKonovalov has joined #openstack-infra18:01
*** jorisroovers has joined #openstack-infra18:01
fungimgagne: we already have dsvm-precise nodes (those are created by nodepool to run "devstack vm" jobs)18:02
fungimgagne: the "precise" nodes are just the ones which aren't managed by nodepool at all18:02
mgagnefungi: oh, now that's getting confusing =)18:02
mgagnefungi: thanks for the info18:03
fungimgagne: i don't think it should be confusing... "precise" is legacy long-running general-purpose workers. "precise-bare" and "precise-dsvm" are two types of precise nodes built and managed by nodepool18:03
*** moted has joined #openstack-infra18:04
fungier, "bare-precise" and "dsvm-precise" i mean18:04
mgagnefungi: understood18:04
fungiwe'll likely soon also grow some "py3k-precise" and "bare-centos6" node types too18:04
*** jroovers has quit IRC18:06
*** praneshp has quit IRC18:07
*** harlowja_away is now known as harlowja18:07
*** jpich has quit IRC18:08
*** thuc_ has quit IRC18:12
*** karpukhina has joined #openstack-infra18:12
*** sdake_ has quit IRC18:12
SergeyLukjanovfungi, I'll check it18:12
*** thuc has joined #openstack-infra18:13
*** karpukhina has quit IRC18:13
*** sparkycollier has quit IRC18:14
*** johnthetubaguy has quit IRC18:15
openstackgerritYanis Guenane proposed a change to openstack-infra/config: New project request: eDeploy
*** thuc has quit IRC18:17
*** morganfainberg is now known as needscoffee18:17
*** thuc has joined #openstack-infra18:20
SergeyLukjanovfungi, heh, I need to update my zuul changes to easily determine the time when zuul config updated :)18:21
*** thuc has quit IRC18:21
*** thuc has joined #openstack-infra18:21
*** thuc has quit IRC18:22
*** thuc has joined #openstack-infra18:22
SergeyLukjanovfungi, do you know anything about new project creation?18:27
SergeyLukjanovfungi, I mean about how it gets created18:28
SergeyLukjanovwill it be the empty repo or the python project now?18:28
fungiSergeyLukjanov: which new project, specifically?18:29
openstackgerritJerry Zhao proposed a change to openstack-infra/config: Add compass project to stackforge
fungiSergeyLukjanov: it can be either, depending on your configuration18:30
*** dizquierdo has quit IRC18:30
SergeyLukjanovfungi, for example, if i'd like to add new project to the stackforge18:31
SergeyLukjanovfungi, w/o upstream18:31
fungiSergeyLukjanov: what determines that behavior is whether you provide an "upstream" (poorly named, we should probably eventually change that)18:31
SergeyLukjanovfungi, I remember some work on using cookiecutter for new projects creation18:31
SergeyLukjanovbut can't find it atm18:31
fungiwithout "upstream" specified, you get an empty repo with a solitary commit adding a correct .gitreview file18:32
mgagneSergeyLukjanov: like this one:
jog0so I don't think the rax high perf are fast enough18:32
fungiSergeyLukjanov: which, if you don't have much code to put into the repo yet, is probably a good bet18:32
SergeyLukjanovmgagne, yup, I know, but I was thinking that this behaviour was changed18:33
*** rakhmerov has joined #openstack-infra18:33
mgagneSergeyLukjanov: oh, I don't know. All I know is there was issues with project creation at one time, don't know if it got fixed ^^'18:33
fungiSergeyLukjanov: you can check out and use openstack-dev/cookiecutter to create a templated project and add that as your next commit fairly easily if you don't already have a repo you want to import18:34
fungijog0: that may explain why large-ops has been causing so many resets recently18:34
jog0fungi: yup18:34
SergeyLukjanovfungi, yup, I know about it, just was confused about possible default behavior change, thank you18:34
jog0jeblair: ^18:36
jog0fungi: how should we handle this?18:36
SergeyLukjanovfungi, the question was because I'm thinking about moving savanna-ci jjb/zuul configs to the stackforge to be able to review/manage them, what do you think about this?18:36
jog0I think this means rax will have more timeouts on other tests too18:37
fungigah, a bare-precise node ran into a java io exception and then continued grabbing several jobs before it was deregistered...
dimsjog0, should help figure out problems with specific slaves18:37
SergeyLukjanovjog0, do we have timeouts only on rax nodes now?18:38
jog0SergeyLukjanov: I don't know how to confirm that18:38
*** rakhmerov has quit IRC18:38
jog0we don't store that data  in logstash18:38
dimsjog0, see above :)18:38
fungijog0: the review dims just linked18:38
fungi(would add that)18:38
jog0dims: \o/18:38
dimsjog0, what else will help?18:39
*** thuc has quit IRC18:39
dimswhat else should we log that will help?18:39
mferSergeyLukjanov mgagne the issue with new project creation is still open...
fungizaro: would you mind looking at ? it's a one-liner addition to the zmq publisher plugin which would help with job failure diagnostics18:40
*** thuc has joined #openstack-infra18:40
jog0dims: that was the big thing missing18:40
jog0so spot check of large-ops failures points to rax18:40
fungimfer: SergeyLukjanov: yes, i'm cycling back around on new project requests to get one in shape to test the current assumed fix for new project creation... i'll try to make another pass here in a bit18:40
mferfungi i'll be around if there is anything i can do to help18:41
jog0fungi: can we revert the rax high perf patch for now? or is that a bad idea18:41
*** thuc_ has joined #openstack-infra18:42
fungijog0: i need to look back and see whether there's just the one change to revert, or whether we have to go through a cycle of transitioning back to non-performance images et cetera18:43
*** thuc_ has quit IRC18:43
*** thuc_ has joined #openstack-infra18:44
jog0fungi: thanks. the other option is to have a separate large_ops number for rax nodes18:44
*** thuc has quit IRC18:44
jog0its 100 on hp18:44
jog0down from 150 to avoid these issues18:44
jog0turns out rax is just hella slow18:44
*** thuc_ has quit IRC18:45
*** thuc has joined #openstack-infra18:45
*** che-arne has quit IRC18:45
*** dstanek has joined #openstack-infra18:45
fungijog0: yeah, so looking at the patch series, we'd need to take 65237, 65246 and 65619 into account if we're going to revert 6523618:46
*** herndon_ has quit IRC18:46
*** ^d has quit IRC18:47
*** dmsimard has left #openstack-infra18:47
fungii think it's mainly 65246 which would need to be undone first, then get the images back, then undo 65237 (but we'd need to make sure the other two i mentioned don't have any implications on that)18:47
fungijog0: the main difference 65236 brings is that it's now using them for gate jobs as well as check jobs, but these failures should have been apparent in the weeks that we ran check jobs on them if so18:49
*** sarob has joined #openstack-infra18:49
jog0large-ops only runs on gate nodes18:50
jog0even in check queue18:50
jog0because of this issue18:50
jog0the perf aspect to it18:50
jog0which is something we don't like18:50
*** dstanek has quit IRC18:50
fungialso i mis-pasted above. it's 65237 which removed the images, and which we'd need to undo and wait on before reverting 6523618:51
*** _ruhe is now known as ruhe18:52
*** jerryz has joined #openstack-infra18:52
fungijog0: how much lower would the quantity need to go, do you think, to work on rackspace performance nodes?18:52
fungijog0: or should we think about upping the job timeout?18:53
jog0fungi: timeout is in nova18:53
jog0and not sure how much worse rax hi perf are18:53
jog0would have to experiment18:53
fungioh, got it, so the timeout we're hitting on those isn't the overall d-g timeout, right18:54
*** beagles has quit IRC18:56
fungijog0: well, i can add the devstack-precise-check images back, get those building, then tear the devstack-precise images back out of rax, but it'll be a little while to complete and will reduce our capacity again18:57
fungijog0: or i can see about adding a new label just to the hpcloud nodes and switch the large-ops job to that (but again, that's a sort of ugly workaround)18:58
*** dstanek has joined #openstack-infra18:59
jog0fungi: correct, nova is timing out.19:00
jog0fungi: and solution wise: its your call19:00
*** b3nt_pin has joined #openstack-infra19:01
*** b3nt_pin is now known as beagles19:01
*** NikitaKonovalov has quit IRC19:02
fungijeblair: clarkb: mordred: if you're near an internet, opinions would be appreciated. basic summary, we were never running large-ops jobs on rackspace, even in check. now that we got rid of the check-specific nodes, large-ops jobs are running on rackspace and timing out19:02
sdaguemriedem: did you start in on adding other job support into er? if not I was going to work on that, because I need to step away from the data analysis bits for a while or I'm going to break my computer19:03
*** dstanek has quit IRC19:04
mriedemsdague: which other job support? i thought there was a tempest one that wasn't getting hit for large ops yesterday but turned out that it was19:04
fungijeblair: clarkb: mordred: long-term, i think jobs like large-ops whose success or failure is determined by the performance of the underlying provider (global job timeouts aside) need to do some sort of benchmarking prior to starting the job so they know how far it's safe to scale. short term, our options are somewhat more limited19:04
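The benchmark-before-scaling idea above could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function names, the idea of timing a small probe workload, and the linear scaling rule are not taken from devstack-gate or any existing job.

```python
import time


def time_probe(probe):
    """Run a small probe callable once and return the elapsed seconds."""
    start = time.perf_counter()
    probe()
    return time.perf_counter() - start


def scaled_instance_count(baseline_count, baseline_seconds, measured_seconds, floor=1):
    """Scale the instance count down in proportion to how slow the node is.

    baseline_count: count known to work on a reference node (e.g. 100).
    baseline_seconds: probe time on that reference node.
    measured_seconds: probe time on the node about to run the job.
    The result never exceeds baseline_count and never drops below floor.
    """
    if measured_seconds <= 0 or baseline_seconds <= 0:
        raise ValueError("probe times must be positive")
    scaled = int(baseline_count * baseline_seconds / measured_seconds)
    return max(floor, min(baseline_count, scaled))
```

With these made-up numbers, a node whose probe takes twice as long as the reference would be asked for half as many instances.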
mriedemsdague: so no19:04
mriedemsdague: this is a bug i started looking at though, some notes in there:
sdaguemriedem: grenade still needs to be added, as well as the unit tests19:05
fungijeblair: clarkb: mordred: options i see are to bring back the check nodes and stop putting normal nodes in rackspace again, or try to find some way to designate the hpcloud nodes and switch the large-ops job to only run on those, or scale down the large-ops job to the point where it's probably ineffective on hpcloud19:05
mriedemsdague: ah, no, didn't dig into that. got some stuff i needed to work on before being out next week19:06
mriedemi spent about 2 days doing reviews and infra only stuff this week so got a bit sidetracked19:06
*** alexpilotti has quit IRC19:06
sdagueyep, no worries, I just didn't want to dive into it if you had.19:06
*** ^d has joined #openstack-infra19:08
*** alexpilotti has joined #openstack-infra19:09
*** praneshp has joined #openstack-infra19:10
*** ^d has quit IRC19:12
*** SergeyLukjanov has quit IRC19:13
*** rcarrillocruz has joined #openstack-infra19:14
*** rossella_s has quit IRC19:14
jerryzfungi: is it that large-ops jobs end up timing out waiting for instances to become Active?19:14
fungijeblair: clarkb: mordred: i'm going to work on bringing back the devstack-precise-check nodes for now, and then stop building devstack-precise nodes on rackspace19:15
*** rossella_s has joined #openstack-infra19:15
fungijerryz: i don't have the error details handy. it's the one which tries to spin up 100 instances at once19:15
*** hogepodge has joined #openstack-infra19:16
jerryzfungi: i used to run into that kind of error on my own test cloud provider. remember me asking you about flavor choice on your cloud providers? i ended up having to raise the number of cpus for devstack slaves19:17
sdaguejog0: so is there a way, like in the turbo hipster case, where we can isolate large-ops? Because I'm concerned we're going to run into another issue where it's actually a performance test, and 2 std deviations is not enough19:18
*** melwitt has joined #openstack-infra19:19
fungijerryz: yes, i believe this is probably similar to what you were encountering in your cloud19:19
jog0sdague: not sure what you mean?19:20
*** rcleere has quit IRC19:22
sdaguejog0: large-ops is basically a performance test19:24
sdaguethat's got a large-ops value set to complete within a timeout, otherwise fail19:25
sdaguebut the variability in the cloud envs mean performance tests are hard19:25
sdaguebecause the timing is all over the map19:25
jog0the variance within a single cloud hasn't been much of an issue19:26
fungijog0: depends on what you call a single cloud19:26
fungivariance between hpcloud east and west is huge19:27
jog0and while this is a performance test, it's just to make sure things aren't terrible, so dropping the number globally is fine with me19:27
jog0we used to not be able to boot 30 instances at once19:27
jog0fungi: true19:27
jog0I am fine with dropping large-ops to 5019:27
jog0that's still enough of a test to catch things like rootwrap regressions etc19:28
fungijog0: if you want to try that first, i can promote it to the head of the gate asap for some quick relief19:28
*** thuc has quit IRC19:29
*** thuc has joined #openstack-infra19:30
openstackgerritJoe Gordon proposed a change to openstack-infra/devstack-gate: Drop large-ops test down to 50 instances from 100
jog0fungi: ^19:30
fungirolling back the changes which shifted more of our load onto rackspace is going to be a more involved transition, so i'd like to confer with the rest of infra before we do that (but i can get the changes and a basic plan of attack drafted up to save us some time in case we decide we should)19:30
jog0fungi: sounds like a plan to me19:31
jog0hopefully this will work instead though19:31
*** ^d has joined #openstack-infra19:32
Ajaegerfungi: did you read ?19:34
AjaegerMonty Taylor's laptop was stolen and he asks to have his ssh-keys disabled...19:34
*** rakhmerov has joined #openstack-infra19:34
fungiAjaeger: haven't seen it yet. irc has kept me away from -email19:34
fungiwill do19:34
*** thuc has quit IRC19:34
Ajaegerfungi: that's why I asked ;) Thanks for taking care.19:35
*** thuc has joined #openstack-infra19:35
fungijog0: okay, it's promoted to the head of the gate19:36
*** rakhmerov1 has joined #openstack-infra19:36
*** rakhmerov has quit IRC19:36
jog0fungi: thanks19:37
*** rcleere has joined #openstack-infra19:38
*** oubiwann has quit IRC19:39
*** rnirmal has joined #openstack-infra19:39
*** _david_ has left #openstack-infra19:40
openstackgerritJeremy Stanley proposed a change to openstack-infra/config: Remove SSH key for Monty Taylor (mordred)
*** rakhmerov1 has quit IRC19:41
*** sparkycollier has joined #openstack-infra19:42
*** markmcclain has quit IRC19:42
*** _david_ has joined #openstack-infra19:43
fungiapproved that ^19:43
fungithanks Ajaeger!19:43
fungialso removing his key from the gerrit database19:44
fungii've disabled precise 12. it just decided to go on a rampage and fail a ton of jobs19:44
*** ruhe is now known as _ruhe19:44
fungijog0: 65760 won the coin toss and is self-gating on a rax node...
_david_fungi, zaro have you tried that guy on gerrit-dev.o.o:
Ajaegerfungi, thank you for fast action, I suggest you followup via email and tell mordred about it.19:49
*** pliszka has joined #openstack-infra19:49
fungiAjaeger: no need--i can reply19:49
fungiAjaeger: thanks a ton for bringing it to my attention quickly19:50
*** sarob has quit IRC19:50
*** sarob has joined #openstack-infra19:51
*** sarob has quit IRC19:53
Ajaegerfungi: thanks a lot for holding the infrastructure together ;)19:53
*** sarob has joined #openstack-infra19:53
Shrewsfungi: he just wanted an excuse for the new x24019:53
*** needscoffee is now known as morganfainberg19:54
*** blamar has quit IRC19:54
Shrewsjokes on mordred, though, b/c the HD display version isn't out yet19:55
*** david-lyle_ has joined #openstack-infra19:55
fungiShrews: maybe it's an excuse to not have to work while he's out19:56
*** markmcclain has joined #openstack-infra19:56
Shrewsalso possible19:57
fungiShrews: i used to take the battery out of my pager when i got really tired of work beeping me19:57
fungi"battery must have ran dry, you called it so many times!"19:57
*** CaptTofu has quit IRC19:59
fungi_david_: i saw the commit title but haven't made it that far through my review queue yet. that's awesome that they integrated it. one more item for the list of code we can stop maintaining20:03
*** mrmartin has joined #openstack-infra20:05
fungiit's in my starred list of patches, looks good at first glance but we definitely don't want to merge that into production of course until we upgrade20:05
*** vipuls is now known as vipuls-away20:05
*** vipuls-away is now known as vipuls20:05
_david_sure, sure, i just wonder what would be the best way to integrate it in config site. Is this solely manual step (what i assume)?20:06
fungimrmartin: hi there! re what?20:06
mrmartinhi fungi20:07
mrmartinI have a quick question. I want to add some check / gating scripts for the community portal.20:07
fungi_david_: i think what we'd want to do is wrap the command in the hook with a conditional on the role id and then only set it empty on review-dev20:07
mrmartinWhat do you think, what is the shortest way to support php platform somehow?20:07
fungi_david_: that way we could merge it as is and not break 2.4.4 in production20:08
*** boris-42 has quit IRC20:08
_david_fungi, make sense20:08
fungimrmartin: do you have any example code for your tests? usually you would put them in your repository, probably in a tests subdirectory, then we can run them automatically to test proposed changes20:09
*** SergeyLukjanov has joined #openstack-infra20:09
mrmartinfungi: ok, but for running the test you need to deploy some php environment, right?20:09
*** ^d has quit IRC20:10
*** pliszka has left #openstack-infra20:11
mrmartinok 12.04 lts is supported, so it won't be a problem. And one additional thing, finally I want to build a snapshot tarball from the output of a drush make command.20:11
fungimrmartin: could the tests be run directly under a php interpreter (with no separate webserver process)? if so, that would probably be pretty easy to implement in a job20:11
*** alexpilotti has quit IRC20:11
mrmartinfungi, yes the tests can run without a browser, we are not yet using any selenium type testing.20:11
*** SergeyLukjanov has quit IRC20:11
fungimrmartin: hashar, when he's around, also may have some suggestions. at wikimedia they test a *ton* of php using mostly the same tools we do20:12
mrmartinok, so if I prepare a test and write some notice about the required environment, and execution you could help me to integrate it into ci process.20:13
fungimrmartin: as for the custom tarball job, that's probably easy to add as well. our normal tarball jobs are specific to python project packaging/tooling ( sdist stuff) but we have other custom tarball jobs. storyboard-webclient has a change proposed for something similar20:13
fungimrmartin: sure thing, i'd love to help20:13
mrmartinWhat I want to achieve is to run the tests first, then create a tarball as part of the commit. Beyond that, I want to upgrade the staging scripts and create prod puppet manifests that use those tarballs for site deploy / upgrade.20:14
fungimrmartin: here's how storyboard-webclient is thinking about doing their tarballs...
fungimrmartin: i'm going to guess that what you're referring to as a snapshot tarball is going to be more of a milestone/release and not something you're going to want to install in production for every single approved commit to the git repository, right?20:16
*** sdake has joined #openstack-infra20:16
*** sdake has quit IRC20:16
*** sdake has joined #openstack-infra20:16
mrmartinhow difficult would it be to deploy a zuul / gerrit environment for local testing/development?20:16
mrmartinfungi: yes, snapshot can go to staging anytime, but prod must be linked for some git tags.20:17
*** boris-42 has joined #openstack-infra20:17
fungimrmartin: others have done it and documented it fairly thoroughly...
fungimrmartin: so for the tarballs, our usual workflow on other projects is similar. we have per-branch tarballs that get replaced each time a commit is merged, so that they always reflect the tip of their respective branches, and then individual tarballs built from tagged commits which get kept around for ever20:18
mrmartinoh great, thank you, I'll try it, and tell you when I have a question, or something is ready.20:18
fungimrmartin: we can also distinguish between pre-release and release version numbers in tags, and take different steps accordingly20:19
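The tarball workflow fungi describes (one tarball per branch that is replaced on every merge, plus permanent tarballs for tagged commits, with pre-release and release tags handled differently) can be sketched roughly as follows. This is only an illustration: the tag pattern, the function names `classify_ref` and `tarball_name`, and the file naming are assumptions, not the actual job configuration.

```python
import re

# Hypothetical tag convention: plain X.Y.Z is a release; a suffix such as
# a1, b2 or rc1 marks a pre-release. The real jobs may use different rules.
TAG_RE = re.compile(r"^(\d+(?:\.\d+)*)((?:a|b|rc)\d+)?$")


def classify_ref(ref):
    """Return (kind, label) for a git ref name."""
    if ref.startswith("refs/tags/"):
        tag = ref[len("refs/tags/"):]
        match = TAG_RE.match(tag)
        if match is None:
            return ("ignored", None)
        kind = "pre-release" if match.group(2) else "release"
        return (kind, tag)
    if ref.startswith("refs/heads/"):
        branch = ref[len("refs/heads/"):]
        # Branch tarballs are overwritten on every merge, so the label is
        # the branch name rather than a version number.
        return ("branch-tip", branch.replace("/", "-"))
    return ("ignored", None)


def tarball_name(project, ref):
    """Return the tarball file name for a ref, or None if none is built."""
    _kind, label = classify_ref(ref)
    if label is None:
        return None
    return "{}-{}.tar.gz".format(project, label)
```

With this sketch a staging deploy could follow the `branch-tip` tarball while production only picks up tarballs whose kind is `release`.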
openstackgerritDavanum Srinivas (dims) proposed a change to openstack-infra/elastic-recheck: Add query for bug 1261182
*** hogepodge has quit IRC20:19
mrmartinok, first I want to do gating scripts and tarball creation.20:20
fungimrmartin: sounds great20:21
*** vipuls is now known as vipuls-away20:21
fungijog0: i re-promoted 65760,1 to reset the gate, since precise12's shooting spree was going to take out the next half dozen changes if it merged20:23
*** yolanda has quit IRC20:25
*** jecarey has joined #openstack-infra20:26
fungijog0: and both ran on rax and succeeded, if you want to do any early evaluation on those20:26
*** ryanpetrello has quit IRC20:27
*** alexpilotti has joined #openstack-infra20:27
*** rfolco has quit IRC20:28
*** malini has left #openstack-infra20:29
*** gothicmindfood has quit IRC20:30
*** eharney has quit IRC20:30
*** _david_ has quit IRC20:31
*** _david_ has joined #openstack-infra20:31
*** ^d has joined #openstack-infra20:36
*** rakhmerov has joined #openstack-infra20:37
zaro_david_: i haven't set that up yet.  we will try it out though.20:38
_david_zaro, that would be great, but it would surprise me if it wouldn't work, because it works on gerrit-review ;-)20:39
*** mrmartin has quit IRC20:40
zaro_david_: hey, why is this invalid?20:40
_david_zaro, you claimed that you can only change groups from the UI and that this prevents you from automating project configuration. I explained in the description why i think it's not true.20:41
*** rakhmerov has quit IRC20:42
zaro_david_: yeah i see that. but my point in that bug is that you should be able to do what i was describing without having to manually add change owner to the groups file.20:43
openstackgerritA change was merged to openstack-infra/elastic-recheck: Add some documentation on wildcard limitations in queries
zaro_david_: i think it's a valid bug.20:43
*** ^d has quit IRC20:45
_david_zaro, anyway i can change it to enhancement, or something but with current situation you can achieve what you want: set up and configure new gerrit project from a python script or something by editing two files20:46
zaro_david_: sounds good to me.20:49
_david_zaro, Done: maybe you can clarify what you want there20:51
*** denis_makogon_ has joined #openstack-infra20:52
*** denis_makogon_ is now known as denis_makogon20:52
*** dprince has quit IRC20:53
*** sandywalsh has quit IRC20:54
*** ^d has joined #openstack-infra20:54
fungizaro: _david_: if that project.config gets pushed through gerrit's git interface, it will create the groups file reference automatically, right? (it does the same for any other group you mention in an acl)20:55
_david_zaro, "if that project.config gets pushed through gerrit's git interface" you mean through UI?20:56
zaro_david_: yes, it works when using the UI.20:56
fungi_david_: through git+ssh (or whatever other protocols new gerrit supports)20:56
fungi_david_: right now, when we push project.config via gerrit's git service, gerrit is smart enough to create groups and groups file references20:57
fungiand it also syntax-checks the project.config and rejects the push if it's non-parseable20:58
fungizaro: were you pushing locally on the filesystem instead?20:58
_david_OK, i didn't check the old version, only the master.20:58
_david_on master git push doesn't create a reference in the groups file, but the UI does, at least for system groups20:59
zarofungi: no i have All-Projects cloned to my laptop. then pushing to review-dev.oo20:59
openstackgerritDavanum Srinivas (dims) proposed a change to openstack-infra/elastic-recheck: Add query for bug 1264755
fungizaro: so your git remote is ssh://
fungiand you're pushing to that?21:00
fungizaro: yeah, then that does definitely do the right thing under 2.421:00
*** dstanek has joined #openstack-infra21:01
fungiwe frequently add acl entries specifying new groups, or referring to existing groups not previously referenced in the acl, and gerrit adds the line to the groups file automatically21:01
* fungi has to step away for just a moment... will brb21:01
*** hogepodge has joined #openstack-infra21:01
zaroyeah. it would be really bad for upgrade situations where there are already lots of existing groups.21:01
*** herndon has joined #openstack-infra21:02
*** SEJeff_work has quit IRC21:02
_david_zaro, how about describing it this way? Maybe it sounds different then ;-)21:04
*** eharney has joined #openstack-infra21:05
anteayafungi: thanks for checking the ssh key, I will advise them21:06
zaro_david_: were you suggesting a new bug or just an update to the existing one?21:06
_david_zaro i would suggest a new one.21:07
_david_zaro, like fungi mentioned; can you verify that it is a regression?21:07
*** sandywalsh has joined #openstack-infra21:07
_david_and that it's not only related to the system groups?21:08
anteayaclarkb: the user in question is the motivation for
fungizaro: _david_: right, we should see whether adding an existing system group to an acl in 2.4.4 automatically adds an entry in that project's groups file21:08
*** vipuls-away is now known as vipuls21:08
zarofungi: were you asking me to review the config change,,  or the actual zmq change ?21:08
zarolooks like 41814 already merged.21:09
fungizaro: oh, right, you're right. i linked the wrong one, and it's already merged21:09
fungizaro: good review! ;)21:09
zarofungi: the commit message should probably link to review.o.o, no?21:10
zaro_david_, fungi : i can verify behavior on 2.421:10
fungizaro: the commit message could just mention the change id in fact (I1cf2aee446c1e51c8eb15f7d84c3e828f3716cce)21:11
zaro_david_: i've already verified that in 2.8 you cannot assign any groups to permissions that are not in the groups file.21:11
*** fbo is now known as fbo_away21:12
_david_zaro, to be more precise, you can't do this through command line, OK21:12
openstackgerritA change was merged to openstack-infra/devstack-gate: Drop large-ops test down to 50 instances from 100
fungijog0: ^21:13
zaro_david_: that is correct, you cannot do it from cmd line unless you manually add the group to the groups file first.21:14
fungijog0: (and a massive string of green behind it)21:14
funginumerous passing large-ops tests21:14
_david_zaro, are you on Master or on 2.8?21:14
zaro_david_: on 2.821:15
_david_zaro, because /me is on master:
_david_this is the only big change on group system that i am aware of21:16
jog0fungi: woot21:17
*** jamielennox is now known as jamielennox|away21:17
zaro_david_: ahh. that is a big change.21:18
_david_zaro, that's the change dborowitz mentioned, as you've asked21:18
zaro_david_: however this probably won't fix it for upgrade situations.  on upgrade existing groups are still in the db correct?21:18
zaro_david_: that would mean you still will not be able to assign non-system groups to permissions?21:19
_david_zaro, you mean without changing the group manually?21:19
zaro_david_: yes21:20
_david_zaro, let me check it21:20
*** julim has quit IRC21:22
* zaro steps away, will brb21:23
*** sandywalsh has quit IRC21:27
*** praneshp has quit IRC21:28
*** praneshp has joined #openstack-infra21:30
openstackgerritDavanum Srinivas (dims) proposed a change to openstack-infra/config: Add jenkins slave name to the logstash records
*** sandywalsh has joined #openstack-infra21:41
*** thuc has quit IRC21:44
*** thomasem has quit IRC21:45
*** masayukig has joined #openstack-infra21:45
*** DennyZhang has joined #openstack-infra21:47
*** herndon has quit IRC21:48
*** banix has quit IRC21:48
*** banix has joined #openstack-infra21:49
_david_zaro, Have you verified against 2.4.2?21:50
*** sarob has quit IRC21:50
zaro_david_: not yet.  will start in a few moments.21:50
*** DennyZhang has quit IRC21:53
*** sarob_ has joined #openstack-infra21:53
openstackgerritSean Dague proposed a change to openstack-infra/elastic-recheck: parse the failed jobs in stream
_david_zaro, checked21:54
*** Ajaeger has quit IRC21:54
_david_zaro, it always worked this way, and it is even documented:21:54
_david_In order to reference a group in +project.config+, it must be listed in21:54
_david_the +groups+ file.  When editing permissions through the web UI this21:54
_david_file is maintained automatically, but when pushing updates to21:54
_david_+refs/meta/config+ this must be dealt with by hand.  Gerrit will refuse21:54
_david_+project.config+ files that refer to groups not listed in +groups+.21:54
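The rule quoted above (every group named in `project.config` must also be listed in the `groups` file before pushing to `refs/meta/config`) lends itself to a local pre-push check. The sketch below is a heuristic, not Gerrit's own validator: it assumes permission values end in `group <name>` and that `groups` entries are a UUID and a name separated by a tab, and all function names are made up for illustration.

```python
def referenced_groups(project_config_text):
    """Collect group names mentioned as 'group <name>' in config values.

    Heuristic only: any value containing ' group ' is treated as a group
    reference, so free-text values could produce false positives.
    """
    names = set()
    for raw in project_config_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;[" or "=" not in line:
            continue
        value = line.split("=", 1)[1].strip()
        # Values look like "group Administrators" or
        # "-2..+2 group Project Owners"; pad so both forms match.
        padded = " " + value
        marker = " group "
        idx = padded.find(marker)
        if idx != -1:
            names.add(padded[idx + len(marker):].strip())
    return names


def listed_groups(groups_text):
    """Collect group names from a groups file (assumed '<UUID><TAB><name>')."""
    names = set()
    for raw in groups_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 1)
        if len(parts) == 2:
            names.add(parts[1].strip())
    return names


def missing_groups(project_config_text, groups_text):
    """Return group names used in project.config but absent from groups."""
    referenced = referenced_groups(project_config_text)
    return sorted(referenced - listed_groups(groups_text))
```

Running `missing_groups` on the two files before `git push` would show which entries still need to be added to `groups` by hand.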
*** david_lyle has joined #openstack-infra21:55
*** lcestari has quit IRC21:55
*** sarob has quit IRC21:56
*** nati_ueno has quit IRC21:56
_david_zaro, the code that refuses it:$ConfigValidator.onCommitReceived(
*** nati_ueno has joined #openstack-infra21:59
*** david-lyle_ has quit IRC21:59
*** dklyle has quit IRC22:00
*** david_lyle has quit IRC22:01
*** nati_ueno has quit IRC22:03
*** mfink has quit IRC22:04
fungi_david_: zaro: i stand corrected! we automated that apparently...
zaro_david_: thanks for checking.22:05
* fungi apologizes profusely for the confusion22:05
_david_fungi, zaro n. p.22:06
fungifor some reason i really thought gerrit had magicked that into existence, but it looks like we just brute force it in with db queries and writes to the fs22:06
_david_zaro, after thinking about it more, i think your enhancement request makes sense: harmonize the behavior between UI & git push22:08
zaro_david_: ++22:08
fungi_david_: that's part of why i was confused. gerrit already does some spooky magic to syntax-check the projects.config when you git push it to refs/meta22:09
fungiso it clearly has somewhere this could get hooked in22:09
_david_zaro, so may be opening a feature request is really a good idea, but as always with open source it would be even better to contribute a patch ;-)22:09
_david_fungi, jepp22:10
*** smarcet has quit IRC22:11
*** praneshp has quit IRC22:12
*** jgrimm has quit IRC22:13
*** CaptTofu has joined #openstack-infra22:14
zaro_david_: of course :))22:16
zaro_david_: will review javamelody today.22:16
_david_zaro, Have you noticed? It was merged22:17
*** alexpilotti has quit IRC22:17
zaro_david_: ohh wow, that was fast.  awesome!22:17
_david_zaro, so basically, forget The preference is gerrit-review22:18
_david_ ohh wow, that was fast.: that because /me have +2 ;-)22:18
*** mestery has joined #openstack-infra22:19
zarohaha! cores are powerful over there.22:21
*** ^d has quit IRC22:21
*** gokrokve has joined #openstack-infra22:23
jeblairfungi: morning; how's the large-ops issue?22:25
fungijeblair: i just watched half a dozen changes in series pass the gate (including a couple of nova changes running pep8/docs/py27 jobs on nodepool bare-precise nodes), so i think we're in *much* better shape again22:27
_david_zaro, don't quite understand your question, sorry22:27
zaro_david_: the h22:27
fungijeblair: i need to step away for a moment to cook dinner, but have a series of changes to incrementally move more projects to bare-precise nodes if you think it's a good idea... see
zaro_david_: the "history" used to say ..change was merged successfully.  but now gerrit-review.o.o doesn't do that.22:28
fungijeblair: i've been offlining precise nodes which go into rapid-fire fail and not bringing them back online (there have been 4 or five so far today, i lost count)22:28
jeblairfungi: i just realized my change to do that for nova is wrong -- it neglects consideration of OFFLINE_NODE_WHEN_COMPLETE22:28
openstackgerritRussell Bryant proposed a change to openstack-infra/devstack-gate: Allow concurrency to be tweaked for tempest
jeblairfungi: (so basically, i think we can't do 'precise || bare-precise' and have to completely switch a project at a time, along with a zuul change to set that value)22:29
openstackgerritRussell Bryant proposed a change to openstack-infra/devstack-gate: Cut tempest concurrency in half
*** weshay has quit IRC22:29
jeblairrussellb: what's that for ^?22:29
fungijeblair: i did see a couple of bare-precise nodes rapid-fire fail a handful of jobs within 3-5 seconds before they got deregistered... would that be the cause?22:29
jeblairfungi: yes22:30
fungijeblair: okay, i'll rework my changes and fix nova after dinner22:30
_david_zaro, yes, never bothered me22:30
*** flaper87 is now known as flaper87|afk22:30
jeblairfungi: i'll go ahead and propose the nova fix22:30
jeblairfungi: enjoy dinner22:30
zaro_david_: not a problem.  just that i've been trained to look at history for the merge.  got me confused on new UI.22:31
russellbjeblair: yeah so ... a group of us have been diving deep into failures, and at least half of them are performance problems that we theorize would improve if we just toned down the test load22:31
jeblairrussellb: roger, thx!22:31
fungithanks jeblair!22:33
_david_zaro, i think it's not really old/new change UI related, only merge firehouse was shut down22:34
*** mfer has quit IRC22:35
_david_ha, gerrit-review seems to have removed the old change screen?22:37
*** sparkycollier has quit IRC22:37
*** mriedem has quit IRC22:39
*** mfink has joined #openstack-infra22:40
zaro_david_: gone for good?22:43
_david_zaro, that's the question ;-)22:43
zaro_david_: did you ever push a change to secure monitoring?22:44
_david_zaro, i've spent months writing this change:
_david_it wasn't merged and the old change screen has disappeared ;-)22:44
_david_zaro, i didn't do it for a reason: do you really think it is necessary?22:45
*** rwsu has quit IRC22:45
*** thuc has quit IRC22:47
*** thuc has joined #openstack-infra22:47
zaro_david_: i'm not real familiar with javamelody, but clarkb seems to think it's absolutely necessary.22:47
*** kraman has quit IRC22:48
_david_zaro, OK, then we should definitely do that22:48
zaro_david_: apparently you can do stuff if it's left open.22:48
*** kraman has joined #openstack-infra22:48
*** mfink has quit IRC22:48
_david_zaro what are we missing besides that?22:48
_david_I think we are waiting for Gerrit 2.9? Right?22:48
clarkbzaro you can kill threads and stuff22:49
clarkbcould be used to dos us or whatever22:49
jeblairor create a heap dump22:49
zaro_david_: i think what we absolutely needed was core plugins+monitoring.  requirements are in the etherpad:
*** rockyg has joined #openstack-infra22:50
zaro_david_: the current thought is to upgrade to 2.8 first.22:51
*** ryanpetrello has quit IRC22:51
*** praneshp has joined #openstack-infra22:52
openstackgerritJames E. Blair proposed a change to openstack-infra/config: Move nova/savanna to only use bare-precise nodes
*** kraman has quit IRC22:52
_david_zaro, I don't understand what you mean with plugins+monitoring22:53
_david_and reading the link i don't understand that commit message:22:53
_david_zaro, first line of this document states:22:54
_david_Gerrit 2.8 ships 4 core plugins, that must be installed to be full functional. Particularly replication and download-commands plugin are vital. There are number of ways how those plugins can be installed. Unattended mode is supported with --batch and --install-plugin foo option. This option must be provided multiply time to install all core plugins22:54
*** nati_ueno has quit IRC22:54
*** markmcclain has quit IRC22:54
zaro_david_: try refreshing.22:55
_david_zaro, ah, missed it, because it was listed...22:55
zaro_david_: because the current documentation states that --install-plugin is not supported.22:55
_david_zaro where?22:56
*** nati_ueno has joined #openstack-infra22:56
*** dcramer_ has quit IRC22:57
*** nati_ueno has quit IRC22:57
*** nati_ueno has joined #openstack-infra22:57
*** jorisroovers has quit IRC22:57
_david_zaro, it's the wrong link22:58
_david_zaro, this is the right one: <zaro> _david_: i think what we absolutely needed was core plugins+monitoring.22:58
_david_zaro, check --install-plugin option22:58
*** thuc has quit IRC22:59
zaro_david_: yeah. i got it now that's why i abandoned the change.22:59
_david_anyway, what do you mean with "i think what we absolutely needed was core plugins+monitoring"?22:59
*** burt has quit IRC22:59
sdaguethat fixes a gate fail22:59
*** UtahDave has quit IRC22:59
sdagueso next time there is a reset, popping it to the top of the list would be cool23:00
jeblairsdague: i will do it now23:00
jeblairsdague: the current head is a dead change walking23:00
sdaguejeblair: thanks23:01
*** sarob_ has quit IRC23:01
zaro_david_: i assume you mean content on the doc is incorrect. is there a patch to fix that?23:01
*** sarob has joined #openstack-infra23:01
zaro_david_: were you asking me about what else is needed for javamelody specifically?  i thought you mean what's needed for us to upgrade.23:02
jeblairsdague: gah, it was a race with my command and jenkins result; i don't know which won.23:02
_david_zaro, nope, this documentation you pointed to is correct, but it's for something different! That's why it's called automatic:23:03
*** DennyZhang has quit IRC23:03
jeblairjenkins did, it seems.  oh well.23:03
_david_zaro, use case: you have nothing: no database, nothing. and you want to set up in one run.23:03
_david_zaro, Has nothing to do with what we are doing: we have everything23:04
_david_zaro i was asking what is needed for us to upgrade ?23:04
_david_not javamelody specific, but generally Also what?23:04
*** sarob has quit IRC23:06
zaro_david_: the gerrit 2.8 doc says.. "Installation of plugins during the site creation/initialization is not yet supported". you can do this right?23:06
*** denis_makogon has quit IRC23:07
_david_zaro wrong, can you please totally ignore this document:
zaro_david_: so to upgrade, we need to answer all of the outstanding questions in the etherpad and then do additional testing.23:08
*** jecarey has quit IRC23:08
_david_zaro, found it:
*** wenlock has quit IRC23:10
_david_zaro, this was the change that introduced the feature that got you confused; again, it has nothing to do with what we are doing23:11
jeblairsdague, fungi: turns out i haven't done it yet as it hasn't quite made it into the queue23:11
* zaro wipes out memory of that doc.23:13
_david_zaro: Gerrit upgrade + Plugin installation must be done unattended, right?23:13
sdaguejeblair: ok, well I'm about to step out for a bit. When you can, getting it in would be good23:13
zaro_david_: ohh, yeah that too. but i don't think that can be done for 1st upgrade.23:14
jeblairsdague: yep.  i'll keep checking back.  should be there in a few minutes23:14
*** yamahata has quit IRC23:14
zaro_david_: due to CLA configuration thing mentioned in the etherpad23:15
_david_zaro, basically one command would do that: java -jar <path-to-gerrit.war> init -d <path-to-gerrit-site> --batch  --install-plugin foo  --install-plugin bar  --install-plugin baz23:15
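For an unattended upgrade driven from a script, the one-line command above can be assembled programmatically. This is a minimal sketch: the helper name `gerrit_init_command` and the example paths are hypothetical, while the `init`, `-d`, `--batch` and `--install-plugin` arguments are taken directly from the command quoted in the chat.

```python
def gerrit_init_command(war_path, site_path, plugins):
    """Build the argument list for a batch-mode Gerrit init run.

    One --install-plugin option is emitted per plugin, matching the
    command line quoted in the conversation.
    """
    cmd = ["java", "-jar", war_path, "init", "-d", site_path, "--batch"]
    for name in plugins:
        cmd += ["--install-plugin", name]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; the plugin names are the two core plugins the
    # etherpad text calls vital.
    command = gerrit_init_command(
        "/opt/gerrit/gerrit.war",
        "/opt/gerrit/site",
        ["replication", "download-commands"],
    )
    print(" ".join(command))
```

The resulting list can be handed to `subprocess.check_call` so that a non-zero exit from `init` aborts the upgrade script.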
_david_zaro, Except for that CLA problem, which sounds like a bug to me23:16
*** dstufft has joined #openstack-infra23:17
*** rcleere has quit IRC23:19
_david_zaro, how's that? it is adding some groups and not others?23:19
_david_to the database?23:19
jeblairsdague: fungi: 63365,5 is at the head23:20
zaro_david_: the upgrade does not add the "CLA Accepted - ICLA" to the db23:20
sdaguejeblair: thanks23:20
fungijeblair: in other news, mordred needs to get familiar with luks/dm-crypt23:20
sdaguefungi: yeh, he wasn't luksed?23:21
zaro_david_: that is the problem.23:21
jeblairfungi: yeah, i saw that.  :(23:21
_david_zaro, may be i have misread your bug description23:21
_david_you have three CLA sections right?23:21
_david_with 3 different groups, right?23:22
zaro_david_: yes.23:22
_david_now you are saying that only one group was missing in the database?23:22
_david_and the other two were correctly inserted?23:22
zaro_david_: sorry i still can't get that doc out of my mind. since was merged why is that doc still referencable?23:23
fungi_david_: the other groups already existed. we have three clas, two of which are group-based and one of which was not. during upgrade the new group was added to the groups file in all-projects but not to the db23:23
_david_fungi, zaro so i would like to reproduce the problem with the missing group:23:25
_david_1. set up 2.4.2 with CLA not group based.23:26
_david_2. upgrade to 2.823:26
zarofungi, _david_ : what _david_ is saying is correct.  1 missing, other 2 were inserted correctly.23:26
fungi_david_: the utopian expectation is that the new group which got automatically added to the groups file in all-projects would also be added to the various related groups tables in the db, and the new group members would be populated based on a query against the account_agreements table23:26
_david_3. add new group to the all-projects23:26
fungizaro: the other two groups were already in the database. those were group-based clas to begin with23:27
_david_fungi so you have added a new CLA during migration? i still don't get it, sorry23:27
zarofungi: ohh i see.23:28
*** slong has joined #openstack-infra23:28
_david_if you only had two group based CLA in 2.4.2 then the migration was successful and all is fine?23:28
zaro_david_: this is what's in the db after migration:
fungi_david_: one of our clas, the problem one, was not a group-based cla. it was the sort which was enforced with autoverify which, in older gerrit, added an entry for each account_id to an account_agreements table in the db. in new gerrit this is replaced by a mechanism wherein autoverify adds accepted cla accounts to the indicated group23:29
*** senk has joined #openstack-infra23:29
fungiso it *seems* like there's a missing migration step there in the upgrade. we can script it as a follow-on, but it's not a seamless upgrade23:30
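The follow-on migration fungi mentions (populating the new CLA group from the old `account_agreements` rows) might look roughly like the sketch below. It is an assumption-heavy illustration: it uses an in-memory SQLite database as a stand-in for the real Gerrit database, the two-column table layouts and the `account_group_members` table name are simplified guesses, and the ids are invented.

```python
import sqlite3


def migrate_cla_to_group(conn, cla_id, group_id):
    """Add every account that accepted agreement cla_id to group group_id.

    Returns the number of memberships inserted; existing members are
    left untouched so the function can be run more than once.
    """
    rows = conn.execute(
        "SELECT DISTINCT account_id FROM account_agreements WHERE cla_id = ?",
        (cla_id,),
    ).fetchall()
    added = 0
    for (account_id,) in rows:
        exists = conn.execute(
            "SELECT 1 FROM account_group_members"
            " WHERE account_id = ? AND group_id = ?",
            (account_id, group_id),
        ).fetchone()
        if exists is None:
            conn.execute(
                "INSERT INTO account_group_members (account_id, group_id)"
                " VALUES (?, ?)",
                (account_id, group_id),
            )
            added += 1
    conn.commit()
    return added


def demo():
    """Run the migration against a tiny hypothetical data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE account_agreements (account_id INTEGER, cla_id INTEGER);
        CREATE TABLE account_group_members (account_id INTEGER, group_id INTEGER);
        INSERT INTO account_agreements VALUES (1, 2), (2, 2), (3, 9);
        INSERT INTO account_group_members VALUES (1, 50);
        """
    )
    added = migrate_cla_to_group(conn, cla_id=2, group_id=50)
    members = [
        row[0]
        for row in conn.execute(
            "SELECT account_id FROM account_group_members"
            " WHERE group_id = 50 ORDER BY account_id"
        )
    ]
    return added, members
```

A real script would also have to create the group row itself, which is the part reported as missing after the upgrade; that step is left out here because the group tables' layout is not described in the conversation.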
zarofungi: probably would not be in puppet right? since it's just a 1 time deal.23:31
_david_fungi, sorry, this is the wrong way to treat the problem. The right way is to fix it upstream23:31
_david_apply the patch or wait for the fix and make it seamless upgrade23:31
fungi_david_: i agree. that's why it's open as a bug23:32
_david_fungi, i can not start to fix the problem based on this bug description23:32
_david_what i need to fix it:23:32
_david_Exact 10 steps to reproduce it23:32
_david_1.set up 2.4.2 Gerrit23:32
_david_2. set up non group based CLA:23:33
_david_2.1, 2.n23:33
_david_9 upgrade23:33
_david_10 Error23:33
*** dkranz has quit IRC23:33
fungi_david_: sure, the bug report could definitely use a more reproducible test case if the gerrit devs don't have intimate familiarity with changes they've made to the cla bits, no question23:33
_david_fungi, the problem is: the dev who made that change would not be the same one who is going to fix that problem.23:34
fungi_david_: i'm hoping one of us gets time to dig into gerrit source and identify the cause23:34
zarosorry about that.  i didn't know how we setup the CLA.23:34
_david_the bug described working parts: CLAs that were correctly migrated23:34
_david_zaro, i don't know it either. let us find it out and fix it23:35
fungii saw that bug report as a placeholder, pending a more detailed report23:35
*** UtahDave has quit IRC23:35
*** alexpilotti has joined #openstack-infra23:35
fungi(and also in hopes someone would just pipe up with "oh, yeah, we fixed that on trunk" or something)23:36
*** prad has quit IRC23:36
fungi:( i ran him off23:38
zaro_david_: while you're here.  the Change Owner owner_group_id is set to 0 in the DB.  that seems wrong to me.23:38
*** mozawa has quit IRC23:39
zarodarn! just when i was about to ask another question.23:39
fungizaro: that doesn't seem wrong to me without more context, at least. in older gerrit, "administrators" is group_id 0 and owns most system groups23:41
zarofungi: ohh, well now it makes total sense why the code assigned 0. but that may not work since admin group_id looks like it's 1 now.23:42
*** CaptTofu has quit IRC23:43
fungioh, maybe it was admins=1 i was thinking of. not sure what 0 is in that case (maybe undefined? change owners wouldn't have a static list of accounts like a normal group anyway, right?)23:44
*** CaptTofu has joined #openstack-infra23:45
zarofungi: nope.23:46
zarofungi: not sure if i need to patch that up or not.23:46
fungiyeah, i'd be curious to hear the meaning of group_id=0 then, since the owner_group_id always seemed to correspond to another group_id in the table23:48
openstackgerritEmilien Macchi proposed a change to openstack-infra/devstack-gate: Enable Firewall as a Service plugin
*** thuc has joined #openstack-infra23:54
*** vipuls is now known as vipuls-away23:56
*** jasondotstar has quit IRC23:56
*** vipuls-away is now known as vipuls23:58
jeblairfungi: is 0 self-owned?23:58

