Tuesday, 2015-04-28

clarkbjogo: yup and the interrupting deals with that (not necessarily well but it does make people aware and care)00:00
lifelessI don't think infra would be held in the same lights-on respect if its approach is to hold everything hostage until thing X is done00:00
jogoclarkb: so as a nova-core this is the first time I am hearing about this ipv6 issue00:00
clarkbjogo: https://review.openstack.org/#/c/168701/ it just merged today00:00
jogoclarkb lifeless: I am not interested in having a theoretical conversation about how to handle this aspect of dependency management. We can experiment once the bits are in place and just see00:01
clarkbjogo: but missed the kilo train00:01
jogoclarkb: oh I reviewed that this morning :)00:01
jogoclarkb: tell me more about the general issue00:01
jogoclarkb: we want to make devstack have ipv6 enabled in the gate by default?00:01
jogowhat other bits are needed to test it etc.00:02
clarkbjogo: general issue is nova in < kilo will attach floating IPs to ipv6 private addrs00:02
clarkbjogo: so kilo ipv6 doesn't work with neutron and floating IPs00:02
jogowe can backport the fix right away00:02
clarkbjogo: this means we can't turn on tests for it (because they fail) and it is a bad user experience00:02
clarkbjogo: or we could've met the problem head on and just fixed it00:02
*** zz_dimtruck is now known as dimtruck00:03
*** xyang1 has quit IRC00:03
asselin_is etherpad down?00:03
clarkbasselin_: its up for me00:03
asselin_nevermind, just loaded...took a while00:03
clarkbasselin_: I think we do DB backups at 0000UTC00:04
clarkbwhich just happened00:04
jogoclarkb: so patch landed what is missing in tempest conf to turn it on?00:04
asselin_ok I see00:04
lifelessclarkb: I understand your frustration, and as you know I'm a big fan of reducing concurrency and churn on project effort,00:04
clarkbjogo: we need to enable it by default in tempest now, that change has been rechecked with the depends on merged00:04
clarkbjogo: then we should backport this fix and the devstack change to kilo00:04
lifelessclarkb: but I can't get behind stopping everyone else's genuine unrelated efforts unnecessarily00:04
clarkbjogo: sorry enable by default in devstack00:05
clarkblifeless: ya I just worry that everyones unrelated efforts usually take precedence00:05
clarkblifeless: and the actual user facing bugs get ignored00:05
lifelessstop the line is useful only because its stopping everyone to fix the flow00:05
clarkblifeless: the gate is not a perfect analog to user facing issues but it does a pretty good job00:05
jogoclarkb: ahh https://review.openstack.org/#/c/160856/00:05
lifelessthings that are within the flow don't invoke that00:05
jogoclarkb: it failed bashate00:06
clarkbwoo git problems00:06
clarkbI wonder if rax is generally having network weirdness00:06
jogoheh git00:06
*** openstack has joined #openstack-infra00:09
clarkbya thats a different error but possibly related if its network trouble00:09
jogolifeless: slight tangent. One of the things I am interested in seeing is better enabling smaller teams to own things they have an interest in. Such as a team of people who care about making OpenStack work with the latest dependencies00:09
clarkb(fungi has a support ticket open fwiw)00:10
jogolifeless: versus making that the responsibility of the larger group etc00:10
fungiright, i added some timing data to the support ticket, but so far no fanatical takers00:11
* jogo wonders if AWS has the issues as frequently00:11
jogoclarkb: support with rax or HP?00:11
clarkbjogo: rax00:12
jogooh fanatical00:12
fungiokay, apparently it takes 3 bandersnatch mirror runs from a full sync to get the generation high enough that the todo is empty and it generates a status file00:14
fungithis explains the oddness i ran into after the refresh i did last night00:14
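fungi's observation above (a full sync needs about three bandersnatch runs before the todo list empties and a status file appears) suggests repeating the sync until the status file exists. A minimal sketch in Python, assuming a caller-supplied run_sync callable and a hypothetical status_path; neither name comes from bandersnatch itself:

```python
import os
from typing import Callable


def sync_until_status(run_sync: Callable[[], None],
                      status_path: str,
                      max_runs: int = 5) -> int:
    """Call run_sync() until status_path exists.

    Returns the number of runs that were needed. run_sync and
    status_path are illustrative assumptions, not bandersnatch APIs.
    """
    for attempt in range(1, max_runs + 1):
        run_sync()
        # Stop as soon as the mirror reports a completed sync.
        if os.path.exists(status_path):
            return attempt
    raise RuntimeError(
        "no status file at %s after %d runs" % (status_path, max_runs)
    )
```

In practice run_sync might wrap something like subprocess.run(["bandersnatch", "mirror"], check=True). The count of three runs is only what was observed in this refresh, so max_runs is a guard rather than a known bound.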
*** sslypushenko has joined #openstack-infra00:15
*** samueldmq has quit IRC00:15
*** yamamoto has quit IRC00:18
asselin_dsvm jobs always run on ubuntu or are other operating systems used?00:22
*** markvoelker has quit IRC00:23
*** Somay has joined #openstack-infra00:23
*** baoli has joined #openstack-infra00:27
*** tjones1 has joined #openstack-infra00:28
*** annegentle has joined #openstack-infra00:36
*** zhiwei has joined #openstack-infra00:37
*** davideagnello has quit IRC00:43
*** davideagnello has joined #openstack-infra00:44
clarkbcentos and fedora as well00:45
*** baoli has quit IRC00:46
*** amotoki has joined #openstack-infra00:47
*** davideagnello has quit IRC00:49
*** kun_huang has joined #openstack-infra00:50
asselin_clarkb, thanks00:53
*** shashankhegde has quit IRC00:53
kun_huangdear guys, is this link the latest one: http://ci.openstack.org/running-your-own.html00:54
*** markvoelker has joined #openstack-infra00:54
kun_huanglifeless: thanks ;-)00:54
*** tjones1 has quit IRC00:55
*** stevemar has joined #openstack-infra01:01
*** baoli has joined #openstack-infra01:01
openstackgerritlifeless proposed openstack-dev/pbr: Use /opt/git directly  https://review.openstack.org/17762901:02
openstackgerritlifeless proposed openstack-dev/pbr: Stop testing setup.py easy_install behaviour  https://review.openstack.org/17750501:03
openstackgerritlifeless proposed openstack-dev/pbr: Test pip install -e of projects.  https://review.openstack.org/17750401:03
*** sputnik13 has quit IRC01:05
*** david-lyle_ has joined #openstack-infra01:09
*** jogo has joined #openstack-infra01:10
*** tiswanso has joined #openstack-infra01:16
*** krtaylor has joined #openstack-infra01:17
jheskethclarkb: hmm, so I wonder if we need some kind of timeout catcher in zuul-swift-uploader?01:19
jheskeththe question also becomes, that if we turn off scp logs, how do we notify of this kind of error01:19
clarkbjhesketh: I think we fail the job then go look at jenkins? thats a bad answer01:21
*** jtriley has quit IRC01:23
EmilienMnibalizer: https://storyboard.openstack.org/#!/story/2000247 - let me know if it's wrong01:23
*** signed8bit_ZZZzz is now known as signed8bit01:24
*** mriedem has quit IRC01:26
jheskethbut that could be tricky01:26
*** otter768 has joined #openstack-infra01:30
*** fifieldt has joined #openstack-infra01:32
*** otter768 has quit IRC01:35
*** asettle has quit IRC01:36
*** dboik has joined #openstack-infra01:36
*** annegentle has joined #openstack-infra01:37
*** dboik has quit IRC01:40
*** sarob has quit IRC01:43
*** patrickeast has quit IRC01:47
*** esker has joined #openstack-infra01:48
*** markvoelker has joined #openstack-infra01:55
*** tjones1 has joined #openstack-infra01:58
*** ayoung has joined #openstack-infra02:00
*** shashankhegde has joined #openstack-infra02:00
*** spzala has quit IRC02:03
*** jtriley has joined #openstack-infra02:05
*** hichihara has joined #openstack-infra02:05
*** yamahata has quit IRC02:06
*** unicell has quit IRC02:09
openstackgerritEmilien Macchi proposed openstack-infra/system-config: Create rubygems mirror from rubygems.org  https://review.openstack.org/17802602:09
EmilienMnibalizer: PoC ^02:09
*** david-lyle_ has quit IRC02:09
*** annegentle has quit IRC02:12
openstackgerritEmilien Macchi proposed openstack-infra/system-config: Create rubygems mirror from rubygems.org  https://review.openstack.org/17802602:23
*** mahito has quit IRC02:31
*** otter768 has joined #openstack-infra02:32
*** mahito has joined #openstack-infra02:35
*** yamada-h has joined #openstack-infra02:39
*** wenlock has joined #openstack-infra02:42
*** dboik has joined #openstack-infra02:43
*** baoli has quit IRC02:43
*** dboik_ has joined #openstack-infra02:44
*** yamada-h has quit IRC02:44
*** baoli has joined #openstack-infra02:46
*** dims has joined #openstack-infra02:47
*** dims has quit IRC02:52
*** jamesmcarthur has joined #openstack-infra02:53
*** markvoelker has joined #openstack-infra02:56
*** markvoelker has quit IRC03:00
*** aswadr has joined #openstack-infra03:01
*** jyuso1 has quit IRC03:06
*** asettle has joined #openstack-infra03:08
*** hichihara has joined #openstack-infra03:10
*** yamahata has joined #openstack-infra03:24
*** mahito has quit IRC03:29
*** sdake_ has quit IRC03:30
*** otter768 has joined #openstack-infra03:33
*** mahito has joined #openstack-infra03:37
openstackgerritLingxian Kong proposed openstack-infra/project-config: new-project: stackforge/terracotta  https://review.openstack.org/17774703:37
*** yamada-h has joined #openstack-infra03:40
*** reed_ has quit IRC03:41
*** tiswanso has quit IRC03:41
*** mahito has quit IRC03:41
*** xylan_kong has left #openstack-infra03:45
*** yamada-h has quit IRC03:45
*** yfried|afk is now known as yfried_03:45
*** fedexo has joined #openstack-infra03:47
*** shashankhegde has quit IRC03:47
openstackgerritMatthew Treinish proposed openstack-infra/subunit2sql: WIP: Add CLI tool to graph aggregate failure counts for tests  https://review.openstack.org/17803903:56
*** markvoelker has joined #openstack-infra03:56
*** camunoz has quit IRC04:00
*** yfried_ is now known as yfried|afk04:00
*** otter768 has quit IRC04:01
*** tjones1 has quit IRC04:01
*** markvoelker has quit IRC04:01
*** ddieterly has joined #openstack-infra04:05
*** yamada-h has joined #openstack-infra04:06
*** subscope_ has joined #openstack-infra04:09
*** ddieterly has quit IRC04:09
*** ildikov has joined #openstack-infra04:10
*** ayoung has quit IRC04:11
*** unicell has quit IRC04:15
*** yfried|afk is now known as yfried_04:15
*** camunoz has joined #openstack-infra04:16
*** unicell has joined #openstack-infra04:17
ianwclarkb: is there a problem with https://review.openstack.org/#/c/167837/ ?04:20
*** isviridov_away is now known as isviridov04:25
clarkbianw: no?04:28
ianwclarkb: ok, well it has been there for a month now04:29
*** yfried_ is now known as yfried|afk04:31
jheskethsdague: ping04:32
*** ildikov has quit IRC04:33
*** Sukhdev has joined #openstack-infra04:37
*** isviridov is now known as isviridov_away04:39
*** yfried|afk is now known as yfried_04:40
openstackgerritMatthew Treinish proposed openstack-infra/subunit2sql: WIP: Add CLI tool to graph aggregate failure counts for tests  https://review.openstack.org/17803904:41
*** mrmartin has joined #openstack-infra04:41
*** achanda has joined #openstack-infra04:44
*** sks has joined #openstack-infra04:46
*** markvoelker has joined #openstack-infra04:47
*** btully has quit IRC04:49
*** fedexo has quit IRC04:53
*** subscope_ has quit IRC04:57
openstackgerritYAMAMOTO Takashi proposed openstack-infra/devstack-gate: Handle the case of REMAINING_TIME <= 0  https://review.openstack.org/17804304:58
*** yfried_ has quit IRC05:01
*** shashankhegde has quit IRC05:05
*** ddieterly has joined #openstack-infra05:06
*** wenlock has quit IRC05:06
*** prad has joined #openstack-infra05:09
*** yamada-h has quit IRC05:10
*** ddieterly has quit IRC05:11
*** xylan_kong has joined #openstack-infra05:14
*** mwagner_lap has quit IRC05:15
xylan_konghey, guys, I submitted a new project proposal https://review.openstack.org/#/c/177747/, and I hope it can be approved before May Day, so my team can continue our work when we are back in the office. I don't know if the change is perfect, so I came here to ask for review and feedback, so I can improve it according to the community rules and make it happen05:19
*** BharatK has joined #openstack-infra05:23
*** ildikov has joined #openstack-infra05:25
*** rm_work|away is now known as rm_work05:32
*** ibiris_away is now known as ibiris05:41
jheskethxylan_kong: I'll take a look05:46
jheskethxylan_kong: so you don't want to seed any repo now?05:46
*** yamada-h has joined #openstack-infra05:47
xylan_kongjhesketh: hi, I thought if the project is approved, a new repo will be created, and I can submit code to the repo, right?05:47
jheskethyep, that's right05:47
jheskethbut it'll be empty05:47
xylan_kongjhesketh: ok, sure. I will submit code by myself, if the repo is there.05:48
jheskethxylan_kong: if you wanted what's from https://github.com/beloglazov/openstack-neat you should do it at import05:48
jheskethresubmitting the code will cause it to go through the review process which is not something you want to do05:48
jheskethor else you'll lose all your history (also not great)05:48
*** tnovacik has joined #openstack-infra05:49
xylan_kongjhesketh: yep, I am ok with that, because I have made some code improvements to make it more suitable for an OpenStack-related project.05:49
*** shashankhegde has joined #openstack-infra05:49
xylan_kongjhesketh: but thanks for your advice and the info!05:50
*** hodos|2 has quit IRC05:50
jheskethxylan_kong: right, but it's reasonably important to keep the authors05:51
jheskethand you never know when the history is going to be useful05:51
jheskethwouldn't it be easier to just import the repo and then propose your changes on top of that?05:51
*** hdd has joined #openstack-infra05:52
*** harlowja_ is now known as harlowja_away05:53
xylan_kongjhesketh: actually, it's what I intended to do, but as Andreas Jaeger said in my last patchset, the project contains a stale branch which will make things complicated.05:53
*** asrangne has joined #openstack-infra05:53
jheskethxylan_kong: you can't remove the stale branch?05:53
*** aswadr has quit IRC05:54
xylan_kongjhesketh: yes, I am not the author, I just got the author's permission to put the project on stackforge and make myself the maintainer. the author will not work on that.05:54
jheskethxylan_kong: the author is no longer involved?05:55
xylan_kongjhesketh: yep05:55
jheskethxylan_kong: you could create a fork on github of the branch(es) you need and then use that as the seed for the new repo05:55
jheskethI think it's important to keep the original commits and author details05:55
xylan_kongjhesketh: really?05:56
xylan_kongjhesketh: ok, I'll try05:56
xylan_kongjhesketh: give me a sec05:56
jheskeththanks :-)05:57
*** asrangne has quit IRC05:59
xylan_kongjhesketh: btw, other parts of the patch are ok for you, right?05:59
jheskethxylan_kong: I'll take a closer look, please hold05:59
*** Krinkle is now known as Krinkle|detached05:59
jheskethxylan_kong: yep, otherwise looks good06:00
*** yamada-h has quit IRC06:00
*** Darkwan has quit IRC06:01
openstackgerritOpenStack Proposal Bot proposed openstack-infra/project-config: Normalize projects.yaml  https://review.openstack.org/17805106:01
*** scheuran has joined #openstack-infra06:01
openstackgerritLingxian Kong proposed openstack-infra/project-config: new-project: stackforge/terracotta  https://review.openstack.org/17774706:02
*** otter768 has joined #openstack-infra06:02
*** yamada-h has joined #openstack-infra06:02
xylan_kongjhesketh: I have updated the patch, please have a look when it passes Jenkins, thanks very much!06:02
*** hemnafk has quit IRC06:03
jheskethxylan_kong: thanks, looks good to me06:03
*** hemnafk has joined #openstack-infra06:03
xylan_kongjhesketh: your help is much appreciated!06:04
*** stevemar has quit IRC06:04
jheskethno trouble :-)06:04
*** SlickNik has quit IRC06:05
*** krotscheck has quit IRC06:05
*** otter768 has quit IRC06:06
*** ddieterly has joined #openstack-infra06:06
*** yamada-h has quit IRC06:06
*** samuelBartel has quit IRC06:10
*** hyakuhei has joined #openstack-infra06:13
*** achanda has quit IRC06:13
*** sandywalsh has quit IRC06:14
*** yfried_ has joined #openstack-infra06:15
*** shardy_z is now known as shardy06:15
*** jamespage_ has joined #openstack-infra06:16
*** jamespage_ has quit IRC06:18
*** shashankhegde has quit IRC06:19
*** MaxV has joined #openstack-infra06:22
*** sandywalsh_ has quit IRC06:23
*** armax has quit IRC06:23
*** yfried_ is now known as yfried|afk06:24
*** mrunge has joined #openstack-infra06:26
*** yfried|afk is now known as yfried_06:28
*** asettle has quit IRC06:31
*** zul has joined #openstack-infra06:31
*** zul has quit IRC06:36
*** hyakuhei has quit IRC06:36
*** Sukhdev has quit IRC06:37
*** hyakuhei has joined #openstack-infra06:38
*** e0ne has joined #openstack-infra06:39
*** jcoufal has joined #openstack-infra06:39
*** yfried_ is now known as yfried|afk06:40
*** ociuhandu has joined #openstack-infra06:44
*** yamamoto has quit IRC06:48
*** samuelBartel has joined #openstack-infra06:50
*** sandywalsh_ has joined #openstack-infra06:51
*** sandywalsh has quit IRC06:51
openstackgerritYanis Guenane proposed openstack-infra/project-config: Add support for backport-potential commit flag  https://review.openstack.org/17584906:55
*** ociuhandu has quit IRC06:57
*** sandywalsh has joined #openstack-infra06:58
*** sandywalsh_ has quit IRC06:59
*** zul has joined #openstack-infra06:59
*** zz_dimtruck is now known as dimtruck06:59
*** dtantsur|afk is now known as dtantsur07:01
*** hyakuhei has quit IRC07:06
*** ddieterly has joined #openstack-infra07:08
*** dimtruck is now known as zz_dimtruck07:09
*** ddieterly has quit IRC07:12
*** MaxV has joined #openstack-infra07:13
*** zz_dimtruck is now known as dimtruck07:14
*** Somay has joined #openstack-infra07:17
*** sergsh has joined #openstack-infra07:19
*** dimtruck is now known as zz_dimtruck07:23
jklarehi, could some core give this one https://review.openstack.org/#/c/176674/ a quick push? it would be great if we could continue to work with these new gates for some time this week07:24
*** funzo has quit IRC07:24
jklareor fix any issues before the weekend07:25
rakhmerovhi, is there a way to delete 2015.1.X tags from stackforge/python-mistralclient repo? We'd like to start versioning our client in the same way as other projects (e.g. 0.2.x, 0.3.x)07:27
*** shashankhegde has quit IRC07:28
openstackgerritFabien Boucher proposed openstack-infra/system-config: Move hardcoded values into jenkins class params  https://review.openstack.org/16728807:28
openstackgerritMerged openstack-infra/project-config: new-project: stackforge/terracotta  https://review.openstack.org/17774707:31
*** viktors|afk is now known as viktors07:32
*** Hal has joined #openstack-infra07:32
*** hichihara has quit IRC07:32
*** Hal is now known as Guest460707:32
*** pcaruana has joined #openstack-infra07:33
*** hdd has joined #openstack-infra07:35
openstackgerritHuang Rui proposed openstack-infra/project-config: Create neutron-zvm-plugin project on StackForge  https://review.openstack.org/17103007:36
*** samuelBartel has joined #openstack-infra07:37
*** ildikov has joined #openstack-infra07:37
*** jlanoux has joined #openstack-infra07:39
*** arxcruz has joined #openstack-infra07:41
openstackgerritFabien Boucher proposed openstack-infra/system-config: Move server class call outside of jenkins*.pp class  https://review.openstack.org/17048707:41
*** yfried_ is now known as yfried|afk07:44
*** markvoelker has quit IRC07:44
*** yfried|afk is now known as yfried_07:46
*** isviridov_away is now known as isviridov07:53
*** yamahata has quit IRC07:56
*** funzo has joined #openstack-infra07:58
*** mwagner_lap has quit IRC07:58
*** dhritishikhar_ has joined #openstack-infra07:59
*** yfried_ is now known as yfried|afk08:00
*** Longgeek has joined #openstack-infra08:00
*** dhritishikhar_ has quit IRC08:01
*** dhritishikhar_ has joined #openstack-infra08:01
*** otter768 has joined #openstack-infra08:03
*** hichihara has joined #openstack-infra08:04
openstackgerritFlavio Percoco proposed openstack-infra/project-config: Use zaqar's devstack plugin  https://review.openstack.org/17807608:07
*** otter768 has quit IRC08:08
*** ddieterly has joined #openstack-infra08:08
*** notnownikki has joined #openstack-infra08:09
*** Somay has joined #openstack-infra08:09
*** Ala has quit IRC08:10
*** jogo has quit IRC08:11
openstackgerritFlavio Percoco proposed openstack-infra/project-config: Use zaqar's devstack plugin  https://review.openstack.org/17807608:12
openstackgerritFlavio Percoco proposed openstack-infra/project-config: Use zaqar's devstack plugin  https://review.openstack.org/17807608:12
*** ddieterly has quit IRC08:13
*** oomichi has quit IRC08:13
*** markvoelker has joined #openstack-infra08:15
*** Longgeek has quit IRC08:16
*** Longgeek has joined #openstack-infra08:16
*** derekh has joined #openstack-infra08:17
*** btully has quit IRC08:19
*** markvoelker has quit IRC08:20
*** isviridov is now known as isviridov_away08:21
*** hdd__ has joined #openstack-infra08:21
*** resker has joined #openstack-infra08:21
*** yfried__ has quit IRC08:21
*** yfried__ has joined #openstack-infra08:21
*** yfried_ has quit IRC08:22
*** zhiwei has quit IRC08:22
openstackgerritAndreas Jaeger proposed openstack-infra/project-config: Create networking-zvm project on StackForge  https://review.openstack.org/17103008:23
*** lucap has joined #openstack-infra08:24
*** mrodden has quit IRC08:24
*** _nadya_ has joined #openstack-infra08:25
*** mpaolino has joined #openstack-infra08:33
*** mpaolino has quit IRC08:33
openstackgerritFabien Boucher proposed openstack-infra/puppet-openstackci: Add generic zuul manifests  https://review.openstack.org/17597008:37
*** claudiub has joined #openstack-infra08:37
xylan_kongjhesketh: ping08:43
*** amotoki has quit IRC08:45
jheskethxylan_kong: pong08:46
xylan_kongjhesketh: https://review.openstack.org/177747 was approved. and then?08:47
xylan_kongjhesketh: according to http://docs.openstack.org/infra/manual/creators.html#update-the-gerrit-group-members, I need to contact someone08:47
jheskethxylan_kong: yep, but unfortunately I don't have privs to do that08:49
anteayajhesketh: I think you should08:49
anteayaif that counts08:49
anteayathen you can help folks like xylan_kong08:49
jheskethxylan_kong: you can try pinging a infra/root person here, or otherwise open a bug08:49
jheskethanteaya: well I like being helpful ;_)08:50
openstackgerritSerg Melikyan proposed openstack/requirements: Add python-muranoclient to requirements  https://review.openstack.org/17720508:50
xylan_kongjhesketh: do you know who has the permission? and anteaya thanks!08:50
anteayajhesketh: :) you excel at it08:50
anteayaxylan_kong: yes and they are asleep right now08:50
anteayaxylan_kong: did you submit the patch 177747?08:51
xylan_konganteaya: yes08:51
anteayathat is all they need08:51
anteayacheck tomorrow08:51
xylan_konganteaya: ok08:51
anteayayou should be in both -core and -release groups08:51
anteayathen you can add whoever else you need to, to the -core group08:52
jheskethxylan_kong: in the mean time you should be able to clone the repo from git.openstack.org and push up your changes for review08:52
xylan_konganteaya: well, see that from the doc..08:52
anteayawhat doc?08:53
xylan_kongjhesketh: yep, i find it, https://github.com/stackforge/terracotta08:53
jheskethxylan_kong: that'll work, but the canonical repository is http://git.openstack.org/cgit/stackforge/terracotta/08:53
xylan_konganteaya: what you said is mentioned in http://docs.openstack.org/infra/manual/creators.html#update-the-gerrit-group-members08:54
anteayaxylan_kong: great08:54
anteayaxylan_kong: thank you for reading the documentation08:54
xylan_konganteaya: it's really helpful08:55
anteayaxylan_kong: I'm glad to hear that08:55
anteayathanks for sharing that feedback08:55
xylan_konganteaya: you're one of the authors?08:56
anteayawell I helped to review parts of it08:56
anteayaI don't think I wrote any part of the creator guide08:56
anteayamost of that was dhellmann08:56
xylan_konganteaya: ok, good job!08:56
anteayaxylan_kong: thank you08:57
anteayaand I'm back to bed08:57
anteayagood night again08:58
xylan_konganteaya: :) good night!08:58
openstackgerritFabien Boucher proposed openstack-infra/puppet-openstackci: Add generic zuul manifests  https://review.openstack.org/17597009:03
*** yfried__ has quit IRC09:03
*** yfried__ has joined #openstack-infra09:04
*** fhubik has joined #openstack-infra09:04
*** e0ne has joined #openstack-infra09:05
openstackgerrityolanda.robla proposed openstack-infra/jenkins-job-builder: Added parallelization options  https://review.openstack.org/7551409:08
*** Ala has quit IRC09:08
*** ddieterly has joined #openstack-infra09:09
*** Ala has joined #openstack-infra09:09
*** yamamoto has joined #openstack-infra09:12
*** vlaza_brb is now known as vlaza09:14
*** rlandy has joined #openstack-infra09:15
*** yamamoto has quit IRC09:16
*** markvoelker has joined #openstack-infra09:16
*** zz_johnthetubagu is now known as johnthetubaguy09:17
*** Somay has quit IRC09:18
*** markvoelker has quit IRC09:21
*** Somay has joined #openstack-infra09:21
*** yfried__ is now known as yfried|afk09:25
*** Somay has quit IRC09:26
*** Longgeek has quit IRC09:28
*** yamamoto has joined #openstack-infra09:29
*** jamespage_ has quit IRC09:36
openstackgerritSamuel BARTEL proposed openstack-infra/project-config: add project fuel-plugin-glance-nfs  https://review.openstack.org/17810909:39
*** Longgeek has joined #openstack-infra09:40
*** jogo has joined #openstack-infra09:43
*** zhiwei has left #openstack-infra09:44
*** mugsie has quit IRC09:45
*** yfried|afk is now known as yfried__09:45
openstackgerritDavide Guerri proposed openstack-infra/shade: Fix exception re-raise during task execution for py34  https://review.openstack.org/17810709:49
openstackgerritDavide Guerri proposed openstack-infra/shade: Add Neutron/Nova Floating IP support  https://review.openstack.org/17703609:49
*** woodster_ has quit IRC09:50
*** mugsie has joined #openstack-infra09:52
*** Longgeek has quit IRC09:54
openstackgerrityolanda.robla proposed openstack-infra/jenkins-job-builder: Added parallelization options  https://review.openstack.org/7551409:55
*** yfried__ is now known as yfried|afk09:55
openstackgerritMarton Kiss proposed openstack-infra/puppet-redis: Add redis 2.8 support  https://review.openstack.org/17811309:57
*** fhubik is now known as fhubik_afk09:57
*** fifieldt has quit IRC09:58
*** Longgeek has joined #openstack-infra10:00
*** yamada-h has joined #openstack-infra10:02
*** otter768 has joined #openstack-infra10:03
*** mugsie has quit IRC10:04
*** xianghui has joined #openstack-infra10:04
*** yfried|afk is now known as yfried__10:08
*** otter768 has quit IRC10:08
*** jamespage_ has joined #openstack-infra10:09
*** ddieterly has joined #openstack-infra10:10
openstackgerrityolanda.robla proposed openstack-infra/jenkins-job-builder: Added parallelization options  https://review.openstack.org/7551410:10
*** kaisers has quit IRC10:12
*** dims has joined #openstack-infra10:12
*** ddieterly has quit IRC10:14
*** jamespage_ has quit IRC10:15
*** Longgeek has quit IRC10:16
*** markvoelker has joined #openstack-infra10:17
*** fhubik_afk is now known as fhubik10:20
*** lucap1 has joined #openstack-infra10:23
*** lucap1 has quit IRC10:24
*** zigo has quit IRC10:29
*** lucap has quit IRC10:33
*** cdent has joined #openstack-infra10:33
*** Longgeek has joined #openstack-infra10:33
*** gsagie has joined #openstack-infra10:33
gsagieHello, I have a patch that fails Jenkins but I see everything as green, so I am trying to understand why Jenkins added -1: https://review.openstack.org/#/c/178081/ , could anyone take a look please?10:33
*** hyakuhei has quit IRC10:33
*** adrian_otto has quit IRC10:33
*** BobH has quit IRC10:33
*** yamamoto has quit IRC10:34
openstackgerritJiri Stransky proposed openstack-infra/tripleo-ci: Puppet: don't manage /etc/hosts via cloud-init  https://review.openstack.org/17772210:35
*** SergK_ has quit IRC10:35
*** SergK has joined #openstack-infra10:35
samuelBartelhello, I seem to have a problem similar to gsagie's10:36
samuelBarteljenkins put -1 during check, but in the console logs the errors seem to be linked to unavailable resources not related to my change10:37
samuelBartelthe review is https://review.openstack.org/#/c/178109/ , if someone could take a look it would be great, thank you10:37
*** sushilkm has left #openstack-infra10:41
xylan_kong"After the review is approved and groups are created, ask the Infra team to add you to both groups in gerrit, and then you can add other members.", I'm waiting for help from someone who has the privilege.10:43
*** ociuhandu has quit IRC10:43
*** teran has joined #openstack-infra10:44
*** zul has quit IRC10:45
xylan_kongAJaeger: hi, how are you, see you again10:45
AJaegerxylan_kong, doing fine, thanks!10:46
*** alexpilotti has quit IRC10:46
xylan_kongAJaeger: my gerrit full name: Lingxian Kong, short name: kong, email address: anlin.kong@gmail.com10:46
*** spredzy_ is now known as spredzy_|afk10:46
*** teran has quit IRC10:47
*** jlanoux has joined #openstack-infra10:47
AJaegerxylan_kong, thanks - I expect others to backscroll during US morning and do this for you.10:47
*** teran has joined #openstack-infra10:47
xylan_kongAJaeger: i hope so :)10:47
*** marcusvrn1 has joined #openstack-infra10:49
*** jlanoux_ has quit IRC10:50
*** fhubik is now known as fhubik_afk10:52
AJaegergsagie, do you have an example?10:53
*** Longgeek has quit IRC10:56
*** hichihara has quit IRC10:56
*** nfedotov has joined #openstack-infra10:56
*** jamielennox is now known as jamielennox|away10:57
*** e0ne is now known as e0ne_10:58
*** Longgeek has joined #openstack-infra11:00
gsagieAJaeger: https://review.openstack.org/#/c/178081/11:01
AJaegergsagie, click on "toggle CI" to get the full information.11:02
AJaegerYou have the -1 for "gate-dragonflow-requirements http://logs.openstack.org/81/178081/8/check/gate-dragonflow-requirements/be70cd3/ : Incompatible requirement found; see https://wiki.openstack.org/wiki/Requirements in 21s"11:02
*** weshay has joined #openstack-infra11:07
*** e0ne_ has quit IRC11:09
*** weshay has joined #openstack-infra11:10
gsagiei saw this line used at another project where it works (i do imports from neutron)11:10
AJaegergsagie, those might not have a requirements check enabled ;)11:10
*** ddieterly has joined #openstack-infra11:10
gsagieAJaeger : you add it per project?11:11
gsagiebecause it works in other project11:11
AJaegergsagie, give me an example, please11:11
*** mugsie has quit IRC11:11
gsagieAJaeger: https://github.com/stackforge/networking-ovn/blob/master/requirements.txt11:12
gsagieset here and it works11:12
AJaegerand that project has no "check-requirements" job defined11:13
AJaegergsagie, you could do it like neutron-vpnaas does it:11:13
gsagiewill try, thanks11:14
AJaegersee also http://git.openstack.org/cgit/openstack/neutron-vpnaas/tree/requirements.txt#n2011:14
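The -1 discussed above came from a requirements job that rejects entries it does not recognise. The idea can be sketched in Python as a comparison of a project's requirements.txt lines against a set of globally approved names. This is an illustrative assumption about how such a check might work, not the actual OpenStack requirements job; the helper names and the sample entries are hypothetical:

```python
import re

# Hypothetical helper: extract the package name from one requirements line.
_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")


def requirement_name(line):
    """Return the lowercased package name, or None for blank/comment lines."""
    stripped = line.split("#", 1)[0].strip()  # drop trailing comments
    if not stripped:
        return None
    match = _NAME_RE.match(stripped)
    # Lines such as "-e <url>" have no leading name; return them unchanged
    # so they can never match an approved package name.
    return match.group(0).lower() if match else stripped


def incompatible_requirements(project_lines, approved_names):
    """List project lines whose package is not in approved_names."""
    approved = {name.lower() for name in approved_names}
    bad = []
    for line in project_lines:
        name = requirement_name(line)
        if name is not None and name not in approved:
            bad.append(line.strip())
    return bad
```

With approved names {"pbr", "six"}, a project listing "pbr>=0.6", "six>=1.9.0" and an editable "-e git://..." entry would have only the editable entry reported, which is the kind of line a project without the check enabled could carry unnoticed.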
*** ddieterly has quit IRC11:15
*** markvoelker has joined #openstack-infra11:18
sdaguethe top 3 recheck bugs are all infrastructure related11:20
AJaegersdague ;(11:21
sdaguethe git servers are dropping connections11:21
AJaegerWhat are the problems? I see git failing quite often today11:21
sdaguethe apt repos just went bonkers11:21
*** markvoelker has quit IRC11:23
ttxsdague: I'll admit this is jeopardizing late RC respins. We basically cancelled Nova RC3 until the gate is not crazy anymore11:27
*** rvasilets_ has joined #openstack-infra11:30
*** yamada-h has joined #openstack-infra11:30
rvasilets_Hi, I'm from the Rally team. Our Murano job isn't working, and we need to merge this patch https://review.openstack.org/#/c/177746/11:30
rvasilets_Could you help with this?11:31
sdaguettx: yeh, well, tell rax to have a better network?11:32
ttxsdague: I tell them all the time.11:33
sdagueI don't think anyone figured out why - Bug 1449136 - OpenStack pypi mirrors disconnecting connections stopped hitting, but that killed us yesterday11:33
openstackbug 1449136 in OpenStack-Gate "OpenStack pypi mirrors disconnecting connections" [Undecided,New] https://launchpad.net/bugs/144913611:33
ttxsdague: personally I blame our inability to detect release-critical bugs earlier in the process. We shouldn't need last-minute RCs11:33
AJaegerrvasilets_, I approved the patch, after merging it needs 30 mins until it's ready, so wait a bit before running "recheck"11:33
*** jistr is now known as jistr|class11:34
rvasilets_AJaeger, ok, thank you11:34
sdagueThis - Bug 1282876 - git clone fails with "fatal: Not a git repository", "git remote update failed." - is apparently how rax implements bw limiting11:34
openstackbug 1282876 in OpenStack-Gate "git clone fails with "fatal: Not a git repository", "git remote update failed."" [Critical,Fix released] https://launchpad.net/bugs/1282876 - Assigned to Jeremy Stanley (fungi)11:34
sdaguewhich, makes our git servers less reliable than github11:34
sdagueand - Bug 1286818 - Ubuntu package archive periodically inconsistent causing gate build failures - is a thing that hits us all the time, because no one ever thought about apt mirroring correctly11:35
openstackbug 1286818 in OpenStack-Gate "Ubuntu package archive periodically inconsistent causing gate build failures" [Low,In progress] https://launchpad.net/bugs/1286818 - Assigned to Jeremy Stanley (fungi)11:35
*** jamespage__ has joined #openstack-infra11:35
*** yamada-h has quit IRC11:35
*** zul has joined #openstack-infra11:35
*** yfried__ is now known as yfried|afk11:36
sdaguefungi: it would be nice to reopen this bug - https://bugs.launchpad.net/openstack-ci/+bug/1282876 - because it's not actually fixed11:37
openstackLaunchpad bug 1282876 in OpenStack-Gate "git clone fails with "fatal: Not a git repository", "git remote update failed."" [Critical,Fix released] - Assigned to Jeremy Stanley (fungi)11:37
jeblairsdague, ttx: i'm not up to speed on what you're talking about.  is there anything i can do to help?11:40
*** jamespage_ has quit IRC11:40
sdaguejeblair: honestly, I think it's all structural11:40
jeblairsdague: on centos or otherwise?11:40
sdague72 fails in 24 hrs11:41
sdagueall over the place11:41
sdaguewe just had a giant apt-mirror fail spike on top of it11:41
openstackgerritMerged openstack-infra/project-config: Using Neutron network in gate-rally-dsvm-murano-rally  https://review.openstack.org/17774611:41
sdagueso that's basically managed to kill most things11:42
sdagueand the pypi mirror fail yesterday made anything with more than a couple of devstack jobs on it very difficult to get through11:43
sdagueso things were pent up11:43
jeblairsdague: we may need to run our own apt mirrors then.  we've talked about that, but these failures are usually rare and transient.  maybe we should bump the priority of that.11:43
jeblairwhat do we know about the pypi issue?  did running bandersnatch full resync fix it?11:44
*** dims has quit IRC11:44
sdaguejeblair: I don't know, it went away. I don't know if fungi updated the bug with why11:44
sdagueyeh, the bug seems to only contain my initial description - https://bugs.launchpad.net/openstack-gate/+bug/144913611:45
openstackLaunchpad bug 1449136 in OpenStack-Gate "OpenStack pypi mirrors disconnecting connections" [Undecided,New]11:45
jeblairi'm looking at the git servers11:46
*** yfried|afk is now known as yfried__11:46
jeblairsdague: were the git failures temporally localized?11:48
*** spredzy_|afk is now known as spredzy_11:48
*** yamamoto has joined #openstack-infra11:48
sdagueit's the #2 bug on elastic-recheck, so you can see the graph there11:49
jeblairmy connection is very bad.  it should load eventually tho11:50
*** rfolco has joined #openstack-infra11:50
jeblairno i'm at the hotel wondering if i will get to attend the conference11:52
*** sambetts has quit IRC11:52
*** sambetts has joined #openstack-infra11:53
*** skolekonov has quit IRC11:55
*** baoli has joined #openstack-infra11:56
sdagueit also looks like even when the git connection doesn't get killed, it gets really slow some times11:56
sdaguethe git remote updates there are taking about 2m per repo11:57
*** claudiub has quit IRC11:57
openstackgerritGal Sagie proposed openstack/requirements: Add Ryu to requierments  https://review.openstack.org/17814811:57
sdaguethe job was killed because it ran out of time as workspace setup took 48 minutes11:57
*** che-arne has joined #openstack-infra11:58
*** mpavone has quit IRC11:59
jeblairokay, for the apt mirror, we could override sources.list to use upstream12:00
jeblairit seems like all those errors are on rax and i believe on rax the nodes have sources.list using mirror.rackspace.com12:01
sdagueyeh, the apt mirror is currently kind of melting everything12:01
jeblair(that should be double checked)12:01
sdagueso... not all are on rax12:01
sdaguelet me look at the logstash12:01
sdagueyeh, the apt mirror seems to be nuking the world right now, we jumped to 174 failed jobs now12:02
*** ildikov has quit IRC12:02
*** mugsie has joined #openstack-infra12:03
*** fhubik_afk is now known as fhubik12:03
*** dtantsur|brb is now known as dtantsur12:03
sdagueI see some hp cloud fails as well, let me see if I can get logstash to scope that a bit better12:04
*** otter768 has joined #openstack-infra12:04
*** jamespage__ has quit IRC12:05
jheskethsdague: oh hiya. If you get a chance to talk os-loganalyze, let me know12:06
sdagueso, there are a few hp cloud fails as well, I suspect that rax mirrored when the tcpdump was in the broken state12:06
sdaguethe fails seem to be all over a corrupt tcpdump package12:06
sdagueor, a missing one, that is12:06
*** yfried__ has quit IRC12:06
*** mpaolino has joined #openstack-infra12:06
*** yfried__ has joined #openstack-infra12:07
sdaguejeblair: so what would we need to do to flush the mirror setting on rax?12:07
sdaguejhesketh: sure, what's up?12:07
jeblairsdague: mordred is writing a change to do that now.  it will require an image rebuild12:08
gsagiewhere do i change the project to not check requirements?12:08
jheskethsdague: just wondering if you have any hints to what went wrong when serving up the console (that caused you to revert https://review.openstack.org/#/c/177221/) before I go digging12:08
jheskethie logs etc12:08
gsagiei need to add a requirement which is not yet in global-requirements (Ryu)12:09
gsagiebut jenkins keeps failing12:09
*** otter768 has quit IRC12:09
sdaguejhesketh: I don't remember, eventually fungi found an error log entry and put it in pastebin12:10
sdaguejeblair: ok, great12:10
jheskethfungi: are you about per chance?12:10
*** nmagnezi has joined #openstack-infra12:10
sdaguejhesketh: more importantly though, because only infra root can debug that, I think we should actually work out a devstack job to actually functionally test this. Because the cost of fail ends up being pretty high.12:11
*** ildikov has joined #openstack-infra12:11
*** ddieterly has joined #openstack-infra12:11
*** johnthetubaguy is now known as zz_johnthetubagu12:11
jeblairfungi, clarkb, jhesketh, pleia2, mordred: mordred is working on a change to switch to using upstream apt mirrors.  hopefully that will ease the pain of the apt mirror bug, but it will take an image rebuild to take effect.12:12
*** dprince has quit IRC12:12
fungisounds good12:12
jheskethsdague: hmm, that's fair. I think maybe we need to improve the test suite first though..12:12
jeblairfungi, clarkb, jhesketh, pleia2, mordred: afaict, the pypi mirror error is not happening now.  i don't think it was related to bandwidth caps -- the cacti graphs look well under the specified allocation there12:13
*** markvoelker has joined #openstack-infra12:13
*** mwagner_lap has joined #openstack-infra12:13
*** shardy_ has joined #openstack-infra12:13
jeblairfungi, clarkb, jhesketh, pleia2, mordred: if it starts happening again, we may want to perform more intense debugging12:13
fungiyeah, i suspect a misbehaving neighbor on the same compute host as pypi.dfw12:13
jheskethjeblair: okay. Let me know if there is anything I can do to help :-)12:14
fungieventually rackspace may respond on the ticket i opened12:14
*** woodster_ has joined #openstack-infra12:14
jeblairi can't do that kind of work from my current network location :(12:14
*** shardy has quit IRC12:15
*** Adri2000 has quit IRC12:15
sdaguejhesketh: sure, but I think part of the issue is that unit tests are only going to get us so far. Making a devstack plugin for os-loganalyze where it actually runs in apache (and we can even use a swift) will make it much easier to know it's going to work on deploy12:15
jeblairfungi, clarkb, pleia2, mordred: so hopefully someone else can look into that12:15
*** ddieterly has quit IRC12:16
*** dprince has joined #openstack-infra12:16
openstackgerritMonty Taylor proposed openstack-infra/project-config: Avoid vendor supplied apt mirrors  https://review.openstack.org/17816012:16
*** nmagnezi has quit IRC12:16
mordredok - I think ^^ that's what we need there12:16
*** nmagnezi has joined #openstack-infra12:16
jeblairi think we should drop the rest of what we are doing and focus on these bugs now; i do not want to delay the release12:16
sdagueso... the apt mirror failure is something we can very strongly fingerprint. Would it be possible to have zuul auto restart those jobs?12:16
mordredalthough I wonder if we should also put that into a ready script, so that we don't have to wait for image rebuilds12:16
jheskethsdague: I'm not sure devstack is the best place for that, but sure, integration testing is a good idea12:17
sdaguejhesketh: I don't understand12:17
jeblairmordred: good idea12:17
mordredworking on that now12:17
jeblairsdague: not easily and i'd rather avoid rube goldberging it12:17
*** [HeOS] is now known as HeOS12:17
sdaguejeblair: ok, that's fair12:18
jheskethsdague: we could have a test set up the instance it is on with os-loganalyze and serve up some content via swift and disk without needing to spin up a whole devstack cloud12:18
jheskethsdague: the catch is we'd have to use swift credentials somewhere for it12:18
jeblairi'm going to transit to the conference now.  i'll check in later.12:18
jhesketh(which I guess we'd get with devstack swift)12:18
sdaguejhesketh: that seems like a lot of replicating devstack for no particularly good reason12:19
*** shardy_ has quit IRC12:19
openstackgerritMonty Taylor proposed openstack-infra/project-config: Avoid vendor supplied apt mirrors  https://review.openstack.org/17816012:19
sdaguealso, if it was a devstack plugin, people could use os-loganalyze on their devstacks, which has been requested12:19
*** shardy has joined #openstack-infra12:19
*** Hal has joined #openstack-infra12:19
*** Hal is now known as Guest6960712:19
fungiokay, so the one avenue i tried before with apt sources which seemed to work was to catch failures in apt-get update or apt-get install and then sed the sources.list with a different fallback url and do another apt-get update and retry the install12:21
openstackgerritMonty Taylor proposed openstack-infra/project-config: Avoid vendor supplied apt mirrors  https://review.openstack.org/17816012:21
fungithere is a significant downside however12:21
*** resker has quit IRC12:21
mordredthis is an area where I think yum has a better architecture, since it knows about mirror lists and how to sanely do fallbacks between them12:22
fungiwhich is that in performing a fallback we hide failures of the mirrors we're using, so that if they're decommissioned entirely we won't know about it until we start hitting failures on the fallback12:22
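The fallback fungi describes (rewrite sources.list to point at a different mirror after a failed apt-get, then retry) and the downside he raises (a silent fallback hides the primary mirror's failures) can be sketched roughly as follows. This is an illustration only, not the script that was tried: that attempt used sed in shell, and the function names, the injected `run` callable, and any mirror URLs passed in are assumptions made for the example.

```python
import logging

log = logging.getLogger("apt-fallback")


def rewrite_sources(text, primary, fallback):
    """Return sources.list text with the primary mirror URL swapped for the fallback.

    Only "deb" and "deb-src" entries are rewritten; comments are left alone.
    """
    lines = [
        line.replace(primary, fallback)
        if line.lstrip().startswith(("deb ", "deb-src "))
        else line
        for line in text.splitlines()
    ]
    return "\n".join(lines) + ("\n" if text.endswith("\n") else "")


def install_with_fallback(run, sources, primary, fallback):
    """Try an install; on failure switch mirrors once and retry.

    run(sources_text) is a hypothetical callable that performs
    "apt-get update" plus the install against the given sources
    and returns True on success.
    """
    if run(sources):
        return sources
    # Log loudly: a silent fallback is what would hide a dead primary mirror.
    log.warning("primary mirror %s failed, falling back to %s", primary, fallback)
    switched = rewrite_sources(sources, primary, fallback)
    if not run(switched):
        raise RuntimeError("install failed on both primary and fallback mirrors")
    return switched
```

Emitting a warning on every fallback is one way to keep the primary mirror's failure rate visible in logs instead of masking it.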
jheskethmordred: looking12:22
mordredfungi, jhesketh: in any case- please check me on that patch - obvs testing this is mildly hard12:22
fungijhesketh: alternatively, when developing changes to os-loganalyze run a local apache instance with it deployed there so you can simulate interactions. in this case all requests for console logs were causing wsgi tracebacks in the access logs12:24
sdaguefungi: except swift12:24
notnownikkimordred, we've seen it a *lot* recently, I've written a script that gets run hourly that detects ips that have gone unassigned and frees them up12:24
mordredfungi, jhesketh: I suppose we could grab the ready script and put it in place manually on nodepool and see if failures stops12:24
mordrednotnownikki: oh lovely12:25
fungisdague: true. was this a problem only for console logs served via swift?12:25
notnownikkiI can submit it to -infra if you're interested?12:25
jheskethfungi: okay, good to know, thanks12:25
mordrednotnownikki: yes please12:25
notnownikkicool :)12:25
jeblairi'm not sure how to communicate this effectively, but i think those three bugs are the only things we should be working on now.  i would like for us to defer unrelated conversation to another time so that the people who can fix them are best able to do so12:25
jheskethmordred: we'd still need to rebuild the images though right?12:25
sdaguejeblair: ok, no problem12:25
mordredjhesketh: nope - the ready script gets run by nodepool on boot12:25
*** kgiusti has joined #openstack-infra12:26
openstackgerritYAMAMOTO Takashi proposed openstack/requirements: Add ryu  https://review.openstack.org/15435412:26
jheskethmordred: well I can't see it making things worse12:26
mordredanybody have a problem with me applying that ready-script by hand?12:26
openstackgerritFabien Boucher proposed openstack-infra/infra-specs: Specification proposal about system-config testing using containers  https://review.openstack.org/17283312:28
mordredhave we sent out a status?12:28
* jeblair transits12:29
openstackgerritMonty Taylor proposed openstack-infra/project-config: Avoid vendor supplied apt mirrors  https://review.openstack.org/17816012:29
mordredthat is just updating the commit message to associate with the bug12:29
AJaegermordred, did you run bashate on your script? I think it will fail12:30
mordredreally? piddle12:30
mordredAJaeger: what did I get wrong?12:30
AJaegermordred, indent the sudo.12:30
AJaegerJust from visual inspection...12:30
mordredoh - bashate will fail - but the script will work12:31
mordredcool - I can handle that :)12:31
*** bswartz has quit IRC12:31
AJaegermordred, just run it myself - bashate was happy. So, leave it as is ;)12:32
fungimordred: out of curiosity why do we need to do it in the configure mirror script and the node prep script?12:32
*** devvesa has joined #openstack-infra12:32
jklarecould i beg for a quick push for this one https://review.openstack.org/#/c/176674/ so we (the openstack chef people) can start playing with our new gates ;) ?12:32
mordredfungi: we don't - readyscript is purely to mitigate current problem without needing to respin nodes12:33
fungijklare: we're suspending normal helpfulness while we focus on infra bugs12:33
*** dboik_ has quit IRC12:33
fungimordred: oh, perfect. that makes great sense12:33
*** davideagnello has joined #openstack-infra12:33
AJaegerjklare, the team is working on fixing some release critical blocker, see topic12:33
*** hyakuhei has joined #openstack-infra12:33
mordred#status Gate is experiencing epic failures due to issues with mirrors, work is underway to mitigate and return to normal levels of sanity12:34
BobBallI just noticed the epic failures in my CI too12:35
jklareAJaeger: fungi: oh sry didnt realize, nvm then and good luck12:35
mordred#status Gate is experiencing epic failures due to issues with mirrors, work is underway to mitigate and return to normal levels of sanity12:36
openstackstatusmordred: unknown command12:36
mordredyou'd think I'd know how to use that12:36
mordredbut you'd apparently be wrong12:36
sdagueso, the apt fix should make things much better, the git clone errors still exist though (at a lower failure rate)12:36
sdaguehttp://logs.openstack.org/57/173357/5/check/gate-nova-docs/8155851/console.html is the most recent one12:36
mordredyeah - I think we should figure those out - we looked at the network and it doesn't seem to be bandwidth-cap related12:37
mordredso it may take looking at logs on the git servers and seeing what's up12:37
sdaguethat's from an hpcloud node12:37
sdaguebut it looks pretty spread around12:37
*** davideagnello has quit IRC12:38
sdaguelast 48hrs, I also excluded the tripleo jobs that were failing because I don't know if they might be another issue or not12:38
jheskethmordred: unfortunately I don't have access to the git mirrors to check logs, but will see if there are any more clues in the jobs (although sdague seems to have done a good job of investigating)12:38
sdagueand they aren't preventing RC things from landing12:38
fungijhesketh: i don't believe the apt mirrors and git failures to be in any way related (other than clouds don't do a good job of providing robust networks?)12:39
mordred#status alert Gate is experiencing epic failures due to issues with mirrors, work is underway to mitigate and return to normal levels of sanity12:39
sdagueoh... hey... so all the current git failures are *non* dsvm jobs12:40
sdaguenow.... the apt fails could be masking that12:40
sdaguehowever, in devstack we've got built in git retry logic12:40
*** ildikov has quit IRC12:40
sdagueI wonder if that could be uplifted to help12:40
fungiwe could in theory do something similar in the gerrit-git-prep macro12:40
-openstackstatus- NOTICE: Gate is experiencing epic failures due to issues with mirrors, work is underway to mitigate and return to normal levels of sanity12:41
*** jtriley has quit IRC12:41
sdaguethat's the inner function we pass everything through12:42
mordredfungi: I can work on uplifting that into ggp if you want12:42
mordredalthough that _will_ take a node respin12:43
fungimordred: alternatively we could retry the ggp script in the builder macro?12:43
mordredfungi: ah12:43
fungithat's about the only way i can think of to avoid rebuilding images12:43
sdaguefungi: so it is the lesser fail12:44
*** samueldmq has joined #openstack-infra12:44
fungisdague: i can't parse your last sentence12:44
sdaguesorry, this git clone disconnect is a less frequent failure12:45
mordredso might as well work on git fails12:46
mordredah - yeah. gotcha12:46
fungivery good point12:46
mordredfungi, sdague, AJaeger: why can I not find the gerrit-git-prep macro with git grep?12:47
fungistill waking up and getting ready to coffee, so brain fuzzy12:47
*** ddieterly has joined #openstack-infra12:47
AJaegermordred, jenkins/scripts/gerrit-git-prep.sh12:47
mordredoh. it's because I suck12:47
sdagueis the job break down, I guess there still is one dsvm fail in there, but the majority is things like doc jobs12:47
fungimordred: jenkins/jobs/macros.yaml12:48
AJaeger"git grep gerrit-git-prep" works for me12:48
sdagueand based on how many more dsvm jobs we have, I think that points to the fact that this inner retry is quite effective12:48
mordredAJaeger: yeah - I just quit out of the search before and then forgot that I did12:48
*** zz_dimtruck is now known as dimtruck12:50
*** ibiris is now known as ibiris_away12:50
fungii wonder if that would be useful to implement in zuul-cloner as well12:50
*** sdake has joined #openstack-infra12:51
openstackgerritAndrey Pavlov proposed openstack/requirements: Add botocore to requirements  https://review.openstack.org/17233512:51
*** alexpilotti has joined #openstack-infra12:52
sdaguedamn, apparently we stick the retry message here into a weird log12:53
*** sdake_ has joined #openstack-infra12:53
sdagueso we can't see how often we recover on normal devstack runs12:53
*** hyakuhei has quit IRC12:53
openstackgerritMonty Taylor proposed openstack-infra/project-config: Put retry loop around gerrit-git-prep  https://review.openstack.org/17817312:53
*** sushilkm has joined #openstack-infra12:54
*** sushilkm has left #openstack-infra12:54
mordredfungi, sdague, AJaeger: ^^ there is one putting it into the builder macro, which should get us til a rebuild - I'll do one to ggp itself next12:54
*** dboik has joined #openstack-infra12:54
*** marcusvrn1 has quit IRC12:54
sdaguemordred: you want to put a message in there when we fall into the retry loop as well so we can keep an eye on how often it happens?12:55
*** julim has joined #openstack-infra12:56
openstackgerritIlia Meerovich proposed openstack-infra/jenkins-job-builder: Adding support for SSH bulider plugin  https://review.openstack.org/17817612:56
*** sdake has quit IRC12:56
*** Somay has joined #openstack-infra12:57
*** ibiris_away is now known as ibiris12:57
AJaegermordred, no need to make the two changes dependend on each other12:57
fungisdague: probably best to do that in the script modification12:58
openstackgerritIlia Meerovich proposed openstack-infra/jenkins-job-builder: Adding bulider for SSH plugin  https://review.openstack.org/17817612:58
*** ildikov has joined #openstack-infra12:59
openstackgerritMonty Taylor proposed openstack-infra/project-config: Use git_timed function from devstack in ggp  https://review.openstack.org/17817712:59
openstackgerritIlia Meerovich proposed openstack-infra/jenkins-job-builder: Adding builder for SSH plugin  https://review.openstack.org/17817612:59
mordredsdague: sure!12:59
mordredAJaeger: good point - I'll decouple them next pass12:59
*** gsagie has quit IRC13:00
sdaguefungi: ok, that's probably fine as well13:00
sdagueAJaeger: well I approved the first patch13:00
sdagueso dependencies won't really be an issue13:00
AJaegersdague, ok ;)13:01
*** jistr|class is now known as jistr13:01
*** yamamoto has quit IRC13:01
openstackgerritMonty Taylor proposed openstack-infra/project-config: Add retry message to gerrit-git-prep macro  https://review.openstack.org/17818013:02
sdaguemordred: so... there is no GIT_TIMEOUT set in https://review.openstack.org/#/c/178177/1/jenkins/scripts/gerrit-git-prep.sh,cm right?13:02
mordredsdague: there ya go ^^13:02
*** zul has quit IRC13:02
mordredsdague: nope13:02
sdagueso... shouldn't we set one?13:02
*** swat30 has quit IRC13:02
mordredoh - heh13:02
sdagueotherwise what does timeout do if timeout is set to 013:02
mordredgood point13:02
mordredoh - 0 defaults to waiting for forever13:03
sdagueI think we set it in d-g13:03
sdagueor.... not13:03
sdagueyeh, I don't know13:03
sdagueapparently we don't set it other places13:03
sdagueso I guess it's fine13:03
mordredso - hangs aren't our problem - at least the mechanism is there so that we can set GIT_TIMEOUT in the g-g-p macro in the future13:04
mordredif it becomes an issue13:04
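The wrapper being discussed combines a per-attempt timeout, where 0 means wait forever, with a bounded retry that prints a message on each failure so the recovery rate can be tracked in job logs. A minimal Python sketch of that shape is below. It is not devstack's git_timed, which is shell; the function name, the default retry count, and the sleep interval are assumptions for illustration.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, timeout=0, attempts=3, delay=5):
    """Run cmd, retrying on failure or timeout; timeout=0 disables the limit.

    Returns the number of the attempt that succeeded.
    """
    limit = timeout if timeout > 0 else None  # None: subprocess waits indefinitely
    for attempt in range(1, attempts + 1):
        try:
            if subprocess.run(cmd, timeout=limit).returncode == 0:
                return attempt
            reason = "non-zero exit"
        except subprocess.TimeoutExpired:
            reason = "timed out after %ss" % timeout
        # Emit a greppable message so failed attempts show up in the job log.
        print("attempt failed: %r (%s), attempt %d of %d"
              % (cmd, reason, attempt, attempts), file=sys.stderr)
        if attempt < attempts:
            time.sleep(delay)
    raise RuntimeError("%r failed after %d attempts" % (cmd, attempts))
```

A call such as `run_with_retry(["git", "remote", "update"])` keeps the timeout disabled, which matches the state discussed here; passing a positive `timeout` later would bound each attempt without changing callers.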
*** _nadya_ has quit IRC13:04
sdaguemaybe that's one of the things that ianw needed for the red hat network13:04
*** yamamoto has joined #openstack-infra13:04
sdaguehttps://review.openstack.org/#/c/74910/ - yeh, that's the original change13:05
mordredany chance it's been long enough for us to see if anything is working?13:05
fungii've finally gotten bandersnatch caught back up on all the mirrors and cron/puppet reenabled on them now13:06
fungistatus files look fine this time so shouldn't have a repeat of yesterday13:06
mordredI should be able to see apt-get commands in a jenkins log on a dsvm job, yeah?13:06
*** spzala has joined #openstack-infra13:06
fungiwell, in a devstack-gate setup log13:07
*** swat30 has joined #openstack-infra13:07
fungiyou could ssh into a worker and look at its filesystem too13:07
sdaguemordred: so... as of 2 minutes ago rax systems were still using their mirrors - https://jenkins02.openstack.org/job/check-tempest-dsvm-ironic-pxe_ssh/3604/console13:07
jeblairfungi: was the re-running of bandersnatch related to the disconnects, or were there two pypi problems?13:07
mordredoh wow13:07
*** hichihara has joined #openstack-infra13:07
sdaguehow soon should the hotfix have applied13:08
*** hichihara has quit IRC13:08
*** jeblair changes topic to "Bugs 1449136, 1282876, and 1286818 are critical and are affecting the release process"13:08
*** mriedem has joined #openstack-infra13:08
*** ZZelle has quit IRC13:08
*** raginbajin has quit IRC13:08
sdaguemordred: oh... that's a new one13:08
mordredjeblair: ready scripts run when the node becomes ready, not when a job starts, right?13:08
sdaguefail on swift log upload13:08
mordredI'm thinking there are some issues at rackspace13:09
openstackgerritMerged openstack-infra/project-config: Avoid vendor supplied apt mirrors  https://review.openstack.org/17816013:09
*** ddieterly has quit IRC13:09
fungijeblair: unrelated, though it was suspected they might be related early on yesterday13:09
sdaguewell, the apt mirror is the normal apt mirror issue13:09
sdaguewe actually saw the same package fail on an hpcloud node13:09
mordredwait - what?13:09
sdaguerax just happened to lock their mirror to that mirror state13:10
sdagueso in the giant pile of rax failures13:10
jeblairmordred: yes, right before nodepool marks them ready13:10
mordredsdague: so - we need to wait for a node to move from building to ready state13:10
sdagueI found 1 hp cloud node that tried to apt-get upgrade and hit the same missing tcpdump package13:10
fungijeblair: it was ultimately caused by a problem on pypi itself (probably glusterfs-related) which caused bandersnatch to delete some packages from all our mirrors (but left them in the indexes), at least one of which we were using13:10
sdaguemordred: ah, so we have to cycle out some ready nodes first then?13:10
mordredsdague: yes13:11
*** tiswanso has joined #openstack-infra13:11
sdagueso... that swift fail, is a thing13:11
jeblairfungi: ack.  that's exciting.13:11
fungijeblair: but apparently when you do a full refresh of bandersnatch mirror now you need to keep running it over and over until it catches up, because the second pass is likely to still take longer than our timeouts13:11
mordredjeblair: we may have a _fourth_ problem13:11
*** changbl has quit IRC13:11
sdague124 hits in the last 24 hours13:11
sdaguelet me ER that13:11
jeblairfungi: wow13:11
jheskethmordred: simple non-blocking comment on https://review.openstack.org/#/c/178177/13:12
*** zz_johnthetubagu is now known as johnthetubaguy13:12
*** annegentle has joined #openstack-infra13:13
jeblairso "rax network problems" explains pypi, git, and swift errors :/13:13
fungioh yeah13:13
*** zul has joined #openstack-infra13:13
sdaguemordred: though it looks like it's only a failure 25% of the time?13:14
fungithey did have notices up that they were doing network upgrade maintenance in dfw all week13:14
mordredjhesketh: yes. let me fix that13:14
jeblairfungi: they picked a good week for it13:14
fungilinked from https://status.rackspace.com/13:14
*** lucap has quit IRC13:15
jeblairjhesketh: i wonder if we should put some retries in the swift upload?13:15
fungioh, good point. we have it retry jenkins console log retrieval but we don't yet have it retry uploading to swift13:16
*** dimtruck is now known as zz_dimtruck13:16
sdaguejeblair: so... it must be retrying13:16
openstackgerritMonty Taylor proposed openstack-infra/project-config: Use git_timed function from devstack in ggp  https://review.openstack.org/17817713:16
openstackgerritMonty Taylor proposed openstack-infra/project-config: Add retry message to gerrit-git-prep macro  https://review.openstack.org/17818013:16
sdaguebecause that error is only 25% fatal13:16
*** wenlock has joined #openstack-infra13:16
sdaguethis is a fail - http://logs.openstack.org/72/177072/4/gate/gate-senlin-python27/fce63d8/console.html13:17
fungii didn't see it in the script. looking again13:17
sdaguebut this is a success13:17
*** sushilkm has joined #openstack-infra13:18
*** peristeri has joined #openstack-infra13:18
jeblairsdague: i think it's never a failure actually13:18
jeblairsdague: the failure you posted looks like a git failure13:19
jheskethjeblair: can do13:19
sdaguejeblair: ok, so it's a red herring?13:19
sdaguedo we just need the tool to stop stack tracing13:19
jeblairsdague: i believe it really failed to upload the file, so i think we want to fix it, but i think it's not causing carnage13:20
jeblairjhesketh: ^13:20
openstackgerritMerged openstack-infra/project-config: Put retry loop around gerrit-git-prep  https://review.openstack.org/17817313:20
jheskethjeblair: there are a lot of errors in the query sdague posted.. it seems like zuul-swift-upload is having as much trouble with the network as anything else13:21
jheskethdifference is that the job doesn't fail on bad upload13:22
jeblairjhesketh: agreed13:22
mordredsdague: 2015-04-28 05:01:15.887 | fatal: unable to access 'http://zm01.openstack.org/p/openstack-infra/devstack-gate/': Failed to connect to zm01.openstack.org port 80: Connection timed out13:23
mordredsdague: that's one of the things we're tracking, yeah?13:23
jheskethjeblair: well at the moment everything on the critical path appears to be network hardening/retrying anyway?13:24
jeblairsdague, mordred: that's a zuul merger, as was the git failure sdague posted in relation to the swift log thing.  i think that is showing that the network errors are _very_ widespread13:24
fungii saw a fetch failure from a zuul merger tank a job yesterday. retried it manually (same ref) and it worked. wasn't a timeout, but a short read or something13:24
fungiseems likely to also be "rackspace network broken"13:24
jeblairso we're seeing network problems on zuul mergers, git servers, pypi mirrors, and individual nodes uploading to swift13:24
fungioh! we also saw a multinode master lose ssh connectivity to a subnode in the middle of a job yesterday13:25
mordredwell, hopefully the retry in g-g-p will help work us past network issues on git things13:25
funginot sure what provider that was in though13:25
jeblairand by bad luck, there's the ubuntu mirror problem which is (probably) not network related13:26
mordredjeblair: so - fwiw - in the past we've not had a good reason to run zuul mergers or git mirrors per-cloud13:26
fungithe distro mirror networks are usually volunteer operated, and i've seen them broken at random plenty. it's not just rackspace that has a hard time maintaining package mirrors13:26
*** sks has quit IRC13:26
mordredjeblair: this might be a reason to consider that in the future13:26
mordredfungi: totally13:27
jheskethmost failures do appear to be in the hpcloud (although I'm guessing y'all already knew that)13:27
mordredjhesketh: the swift errors?13:28
jeblairso i'm optimistic and glad we can mitigate the ubuntu mirror problem, and hopefully reduce the git errors.  i'm also glad that it looks like we actually have an explanation for why everything broke at once.13:28
jheskethmordred: yes13:28
fungiit's the whole "crossing the internet" problem13:28
fungifor swift anyway13:28
mordredjeblair: ++13:28
*** e0ne is now known as e0ne_13:28
fungirax dfw seems to be broken enough internally right now that machines can't even communicate reliably within it though13:28
mordredit's too bad that we can't use a swift in each cloud and have it appear as one contiguous thing13:28
jheskethfungi: ah, the swift ones probably aren't failing for rax as they are closer to the network13:28
fungijhesketh: that would be my entirely unscientific conjecture, anyway13:29
*** dkranz has quit IRC13:29
*** adrian_otto has joined #openstack-infra13:30
sdaguefungi: yeh, it's hard to tell if it's hpcloud network being bonkers, or rax network being bonkers, or a 3rd one13:30
sdaguebut it looks like predominantly hpcloud fails in the swift log upload13:31
sdagueI guess when/if we get a 3rd cloud, the finger pointing will be easier13:31
fungisdague: breaking news! clouds fail at networking13:32
fungistory at 1113:32
sdagueand hence why people like to run their own, where they can control that :)13:32
openstackgerritDoug Hellmann proposed openstack-infra/release-tools: Add a --stable-series argument to release_notes.py  https://review.openstack.org/17819413:32
openstackgerritDoug Hellmann proposed openstack-infra/release-tools: Add option to format release notes for email  https://review.openstack.org/17819513:32
sdaguejhesketh: I filed - https://bugs.launchpad.net/openstack-gate/+bug/144957013:32
openstackLaunchpad bug 1449570 in OpenStack-Gate "raxspace swift sometimes fails to accept log uploads with file posting error" [Undecided,New]13:32
jeblairsdague: triangulation ftw13:32
openstackgerritJoshua Hesketh proposed openstack-infra/project-config: Retry log upload to swift  https://review.openstack.org/17819913:34
jheskethsdague, jeblair: ^13:34
*** esker has joined #openstack-infra13:35
sdagueok, I'm pushing a tempest patch series which will hopefully drain all the rest of the old nodes13:35
*** jtriley has joined #openstack-infra13:35
sdaguehttp://dl.dropbox.com/u/6514884/screenshot_226.png (though that was an hour ago)13:36
*** e0ne_ is now known as e0ne13:37
*** ddieterly has joined #openstack-infra13:38
fungiit's marvellous how the rackspace maintenance notice says "We expect this maintenance to be non-impact to customers."13:38
mordredyeah - I spoke to johnthetubaguy about it13:38
fungimakes me wonder if they're wrong, or if we're wrong13:38
mordredand it was control-plane side maint13:38
*** ociuhandu has joined #openstack-infra13:38
mordredbut he's helpfully talking to some people further13:39
*** Ala has quit IRC13:39
*** _nadya_ has joined #openstack-infra13:40
sdaguejhesketh: that's an infinite loop, right?13:40
sdaguethere is no break out of the while loop13:40
fungii'm less wondering about disruptiveness of the maintenance activity itself, and more whether what was changed by the maintenance yesterday is not quite operating as expected13:40
jheskethsdague: the raise should break it yes?13:40
*** sigmavirus24_awa is now known as sigmavirus2413:40
sdaguejhesketh: only if you get an exception13:40
jheskethoh right, yes, my bad13:40
*** Somay has quit IRC13:40
openstackgerritJoshua Hesketh proposed openstack-infra/project-config: Retry log upload to swift  https://review.openstack.org/17819913:41
jheskethsdague: okay, trying again ^13:41
sdagueso, honestly for stuff like this for x in xrange(3) is often what I use so that even if you screw up success you aren't in an infinite loop13:42
johnthetubaguyfungi: I am looking into what that change actually is13:42
sdaguebut this should be fine13:42
fungijohnthetubaguy: awesome--thanks and sorry to bother you!13:43
openstackgerritJoshua Hesketh proposed openstack-infra/project-config: Retry log upload to swift  https://review.openstack.org/17819913:44
jheskethsdague: good idea, ^13:44
*** stevemar has joined #openstack-infra13:44
sdaguejhesketh: s/while/for/ ?13:44
openstackgerritJoshua Hesketh proposed openstack-infra/project-config: Retry log upload to swift  https://review.openstack.org/17819913:45
jheskethsigh, sorry, thanks13:45
*** vhoward has left #openstack-infra13:45
jheskethbut multiple attempts like this is pretty trivial, it's just late and I'm rushing things (clearly a bad idea)13:47
*** zz_jgrimm is now known as jgrimm13:47
mordredyah - I certainly don't think we should block on that13:47
jeblairit is 8 lines13:47
*** scheuran has quit IRC13:47
sdaguemordred: sure, though honestly retry logic is something that gets written all over the place, I think it's fine to just do it in place and not librarize it13:48
*** esker has joined #openstack-infra13:48
*** mpavone has joined #openstack-infra13:48
mordredsdague: indeed13:49
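The bounded retry pattern discussed above (a `for` loop over a fixed range instead of `while`, so that a mistake in the success path cannot spin forever) can be sketched roughly as follows. This is an illustration only, not the code from review 178199: `retry`, `attempts`, and `delay` are hypothetical names, and Python 3's `range` stands in for the `xrange` mentioned in the channel.

```python
import time


def retry(operation, attempts=3, delay=0.0):
    """Call operation() up to `attempts` times and return its first result.

    Hypothetical helper for illustration: the loop is bounded by range(),
    so it ends after `attempts` tries even if the success path is wrong.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                # Out of tries: let the last error propagate to the caller.
                raise
            # Wait before the next attempt.
            time.sleep(delay)
```

Returning from inside the loop is what ends it on success. If that `return` were accidentally dropped, the loop would still stop after `attempts` iterations rather than hang the job.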
sdaguejhesketh: I do not know, I reordered a tempest patch series I had to try to sweep up any remaining bad nodes13:49
openstackgerritMerged openstack-infra/project-config: Use git_timed function from devstack in ggp  https://review.openstack.org/17817713:50
sdagueas that should have just consumed 100+ dsvm nodes13:50
jheskethsdague: ah okay13:52
jeblair2015-04-28 13:43:29.724 | + echo 'HTTP check of http://mirror.rackspace.com/ubuntu/dists/trusty/Release.gpg - attempt #1'13:52
jeblairi wonder if we should remove that from devstack-gate?13:53
jeblairsdague: does that read sources.list?13:53
sdaguejeblair: I don't think it reads sources.list13:53
fungiit does not. it's just an http ping basically13:54
sdaguewe should probably remove it, it was put in to try to debug this previously13:54
fungigrabs the release file i think13:54
*** BharatK has quit IRC13:54
jeblairoh i think it's hardcoded for rax13:55
jeblairso it even does that on hpcloud13:55
mordredsdague: basically, you should not see mirror.rackspace.com in any of the apt interactions13:55
mordredsdague: if you do - my fix did not work13:56
ttxmight have been not that smart to release after Ubuntu after all13:56
sdaguemordred: ok, well still seeing it, but I want to know if your fix is not applied, or if it's not working13:56
sdagueso I was wondering if there was a way to figure that out13:56
ttxmaybe their apt mirrors are taking a 15.04 upgrade hit ?13:56
jeblairsdague: yeah, point me at a failed job13:56
*** armax has joined #openstack-infra13:56
sdaguejeblair: https://jenkins02.openstack.org/job/gate-tempest-dsvm-neutron-large-ops/46910/console13:57
jeblairttx: we're seeing the problem mostly on rax mirrors, and we're in progress moving to upstream ubuntu mirrors to mitigate13:57
ttxjeblair: ack13:57
sdaguettx: yeh, so it's the same issue that happens from time to time, rax runs their mirror at a moment when canonical is updating theirs, and so gets a broken version13:58
sdaguewhich remains broken until next mirror update13:58
jeblair2015-04-28 13:40:06,707 DEBUG nodepool.NodeLauncher: Node id: 2345326 is running, ip:, testing ssh13:58
sdaguebut unlike hitting ubuntu servers directly, which recover in a couple minutes, this ends up being broken for a long window of time13:58
ttxsdague: I know there is a strict process to follow to avoid that -- basically there is an "update in progress" lock file you can use to sync your own13:58
*** Krinkle|detached is now known as Krinkle13:58
derekhCan anybody take a look at this please, I'm trying to move the tripleo F20 job to F21 https://review.openstack.org/#/c/169778/13:58
jeblairsdague, mordred: ^ that seems very recent, recent enough that i think it's not working13:59
sdaguederekh: right now everyone is working on release impacting bugs in infra13:59
sdaguejeblair: right, that's why I asked13:59
derekhsdague: ack13:59
*** sdake_ has quit IRC13:59
sdaguethat was one of the jobs in my 6 tempest series13:59
jeblairsdague: yeah, i mean that's the timestamp for when the ready script should have run.  so confirming that node _should_ have gotten the fix.13:59
mordredjeblair: bleh14:00
*** dprince has quit IRC14:00
jeblairmordred: i checked a random ready node and see mirror.rax in sources.list14:01
mordredif [ "$LSBDISTID" == "Ubuntu" ] ; then14:01
jeblair(_recently_ ready)14:01
mordredthat should be a single = shouldn't it14:01
mordredhrm. nope. == works too14:02
sdagueso... at least on my 15.0414:02
sdagueos1:~> cat /etc/lsb-release14:02
sdagueDISTRIB_DESCRIPTION="Ubuntu 14.04 LTS"14:02
mordredyeah - I've checked that on precise, trusty and vivid14:02
*** ildikov has quit IRC14:03
jeblairsdague: LSBDISTID=$(lsb_release -is)14:03
mordredsdague: the line before it14:03
jeblairin the script14:03
jeblairmordred, sdague: i've run the contents of the if block and they seem to work14:03
mordredjeblair: I don't need to apt-get update after do I?14:03
*** asselin has joined #openstack-infra14:03
jeblairmordred: no, i believe the problem is that isn't being run for some reason14:04
jeblairmordred: because the file did not appear to have been updated14:04
funginodepool.o.o _does_ seem to have the updated ready script at least14:04
jrolljeblair: so uh, you all want the rax mirror team poked to bump the mirrors? which region is this?14:05
jeblairsdague: ^ is it all regions?14:05
*** otter768 has joined #openstack-infra14:05
*** ibiris is now known as ibiris_away14:06
jeblairmordred: the pydistutils file _is_ being updated.  i'm stumped.14:06
*** shardy_ has joined #openstack-infra14:06
sdaguejeblair: I see at least iad and dfw in the list14:07
sdaguemordred: so we're running under -x14:07
fungi"Exception: Unable to run ready script"14:07
fungiin the nodepool debug log14:07
sdagueis it not +x or something?14:07
fungioh, that's from much earlier14:07
mordredsdague: but we don't log the ready script output14:07
*** mrmartin has joined #openstack-infra14:07
sdaguemordred: oh.... poo14:07
fungiapparently we only log the ready script when it fails to run, so it must be running14:08
*** shardy has quit IRC14:08
jrolljeblair: sdague: thanks, going to reproduce and email; ubuntu only yes?14:08
jeblairjroll: afaik14:08
fungijroll: as far as we know, but we're not using your package mirrors for anything besides ubuntu14:08
jrollright, thanks14:09
clarkbI think d-g uses local git cache but ggp does not. devstack retries should not affect the gate due to error on clone14:09
marcusvrnkrtaylor: ping14:09
jeblairmordred, fungi, sdague: the ready script has a sudo for the dd (it does not need it, but it should be okay).  that means it gets logged in auth.log.  i do not see it being run in auth.log.14:09
*** yfried has joined #openstack-infra14:09
sdaguejroll: this particular issue is you have a snapshot of the mirror that misses the tcpdump package the meta files say you have14:09
sdaguejroll: however, this happens every few weeks14:10
fungijeblair: that suggests the conditional isn't matching14:10
*** otter768 has quit IRC14:10
sdaguebecause the rax mirror scripts don't do mirroring correctly14:10
jrollsdague: I was just going to ask which package, thanks. and yeah, aware of this issue and hate it.14:10
jeblairfungi: agreed14:10
jrollsdague: emailing folks14:10
jeblairmordred: not held, just unused.14:10
mordred(this is going to wind up being something really stupid ultimately)14:11
sdaguejroll: I see an ord node with the failure as well, so not region specific14:11
krtaylormarcusvrn, pong, high latency though in a meeting14:11
clarkbjroll supposedly reprepro will fix this issue14:11
jeblairmordred: the configure script is not correct14:11
*** yfried__ has quit IRC14:12
jeblairmordred: as in, it looks like the old version14:12
jeblairends at pypi14:12
*** shardy_ has quit IRC14:12
clarkbjroll it does its own consistency checking when mirroring14:12
*** dboik_ has joined #openstack-infra14:12
fungiare we not running the configure script from /etc on the nodepool server?14:12
*** shardy has joined #openstack-infra14:12
jeblairi wonder if ready scripts are expected to be image-baked14:13
fungiwell, new images are kicking off now anyway14:13
jrollsdague: jeblair: email dropped, these folks are usually pretty responsive14:13
mordredfungi: hang on14:13
mordredfungi: we need to re-enable and re-run puppet on nodepool - it's disabled to put the ready script in place by hand from earlier14:13
mordreddoing that now14:13
fungihang onto what? 14:14 utc is when our image updates start14:13
fungithe daily ones14:14
sdagueI think I'm going to have to start drinking early today14:14
jeblairmordred: it should be okay, yeah?  the updates will put your hand-made ready script into place14:14
mordredI'm running puppet real quick to try to catch all of the script updates14:14
*** dims has joined #openstack-infra14:15
* jeblair transits14:15
*** dboik has quit IRC14:16
*** eharney has joined #openstack-infra14:16
funginihilist arby's has it nailed today14:16
marcusvrnkrtaylor: I saw your email ("Announcing Third Party CI Tools Repo") then I discovered that there's a CI working group...hehe there's a channel for that group? or just the weekly meetings?14:17
fungiooh! rackspace update on pypi.dfw. my suspicions confirmed... "After investigation of the physical host server I found another customer outbound spamming. They have been notified and appropriate actions have been taken. Please let us know if you see any other issues."14:18
asselinmarcusvrn, just weekly meetings14:18
*** amitgandhinz has joined #openstack-infra14:18
sdaguefungi: so... it would be nice if we didn't need to lose a day before they react there14:20
sdagueI wonder if there is something we can do to get more early warning to them14:20
*** sdake has joined #openstack-infra14:20
*** rossella_s has quit IRC14:21
*** sdake_ has joined #openstack-infra14:22
jeblairmarcusvrn, asselin, krtaylor: 3rd party ci conversation normally happens here since we're all working with the same tools.  we're just really busy focusing on some release-critical bugs right now.14:22
*** rossella_s has joined #openstack-infra14:22
*** tonytan4ever has joined #openstack-infra14:22
*** sputnik13 has quit IRC14:22
*** fhubik has quit IRC14:23
*** bauzas has joined #openstack-infra14:23
johnthetubaguyfungi: are things settling down at all now?14:24
johnthetubaguyfungi: are you still seeing swift issues, now the maintenance should be completed, etc14:25
fungisdague: agreed. actually it was only a little over 11 hours for them to confirm, but i wonder if we should go back to pestering poor rackers in irc like johnthetubaguy14:25
*** sdake has quit IRC14:25
sdaguejohnthetubaguy: so, the swift thing is not currently fatal for us14:25
johnthetubaguyfungi: we need to get you some better contacts for this stuff14:25
sdaguejohnthetubaguy: ++14:25
fungijohnthetubaguy: the pypi mirror in dfw turned out to be a noisy (spammer) neighbor on the same compute node, according to fanatical support14:26
*** Somay has quit IRC14:26
johnthetubaguyfungi: OK, interesting, what hardware is that on?14:26
johnthetubaguyfungi: performance or standard?14:26
fungijohnthetubaguy: good question... checking14:27
clarkbshould be performance but do double check14:27
fungijohnthetubaguy: 4gb performance14:27
clarkbif it was standard we wouldnt need the cinder volume14:27
johnthetubaguyhmm, crazy stuff14:27
*** mattfarina has joined #openstack-infra14:28
*** bswartz has joined #openstack-infra14:28
sdagueclarkb: ... low priority item ... however in reading through a bunch of logs this morning it would be nice to tighten up the ansible output, it kind of spews throughout and makes following things a little hard14:28
fungiso someone on that node must have really, REALLY been generating a lot of traffic to have that sort of impact14:28
johnthetubaguyhmm, odd, I would have hoped QoS would have been enough for that…14:28
johnthetubaguyyeah… nuts14:29
clarkbsdague can you expand on that it writes a json blob with return codes iirc. is that a problem?14:29
fungijohnthetubaguy: the distro package mirror problem seems to be that they're either sometimes mirroring from other broken mirrors and then serving that broken state for a while. jroll is pestering the mirror operator there to fix it but we're in the process of switching to the normal ubuntu mirror network to work around it14:29
clarkbsdague it shouldnt really spew anything14:29
sdaguewell more things like: + /tmp/ansible/bin/ansible all -f 5 -i /home/jenkins/workspace/check-tempest-dsvm-full/inventory -m shell -a 'source '\''/home/jenkins/workspace/check-tempest-dsvm-full/test_env.sh'\'' && source '\''/home/jenkins/workspace/check-tempest-dsvm-full/devstack-gate/functions.sh'\'' && tsfilter setup_workspace '\''master'\'' '\''/opt/stack/new'\'' executable=/bin/bash'14:29
sdagueas single line14:29
clarkbsdague ya thats how we run things14:30
johnthetubaguyfungi: ah, thats good to know14:30
fungier, either sometimes mirroring from other broken mirrors, or mirroring incorrectly14:30
sdaguesure, but really aggressively move xtracing things out of the console log for a while14:30
sdaguehence all the sublogs14:30
jrollfungi: manual sync in progress14:30
clarkbsdague oh its the tracing nit ansible. I see14:30
fungijroll: thanks for getting in touch with them14:31
*** mriedem has quit IRC14:31
fungijohnthetubaguy: so the current remaining mystery is that we're also getting intermittent git remote failures reaching our git servers (which are also hosted in dfw)14:31
*** notmyname has quit IRC14:31
jeblairfungi: and possibly related -- similar errors contacting zuul mergers14:32
johnthetubaguyfungi: makes me wonder if they have a noisy neighbour too14:32
clarkbfungi git is returning an error code thing too right? maybe that is a clue?14:32
johnthetubaguyfungi: they don't back up to swift do they?14:33
*** anthonyper has quit IRC14:33
fungijohnthetubaguy: nope14:33
*** afazekas_ has quit IRC14:33
johnthetubaguyfungi: seems crazy it all happened at once14:33
fungijohnthetubaguy: right, and at a particularly inconvenient time14:33
*** bhunter71 has joined #openstack-infra14:34
johnthetubaguyfungi: quite14:34
*** tlbr has quit IRC14:34
*** notmyname has joined #openstack-infra14:34
*** tlbr has joined #openstack-infra14:34
fungiclarkb: it sometimes returns a specific-looking git error which turns out to have any manner of possible causes, many of which can be network issues14:34
sdaguethe others have been bouncing around at a rando fail rate that was not good, but not killer14:35
sdaguejohnthetubaguy: see graphs here - http://status.openstack.org//elastic-recheck/14:35
*** annegentle has quit IRC14:35
fungiwell, the pypi mirror problem in dfw yesterday was pretty heinous for gate performance too, but it's thankfully addressed at this point14:35
sdaguefungi: true14:36
jeblairyeah, i still want to track down what's causing the git and/or swift failures14:36
sdaguejeblair: agreed14:36
johnthetubaguysilly question, but what interface are you using to talk to git? public or snet?14:37
fungibecause it's accessed from other regions and other service providers entirely14:37
johnthetubaguyoh yeah, of course, my bad14:38
fungiwe don't (currently) have a per-region git mirror network14:38
*** vlaza has left #openstack-infra14:39
johnthetubaguyI mean public vs snet shouldn't be any different I guess, its not an isolated network anyways14:39
jheskethfungi, jeblair, mordred: seems like things are slowly getting under control. Anything else I can help with before I head off?14:40
*** Ala has joined #openstack-infra14:40
*** Ala has quit IRC14:41
fungiweird. when i try to click on console.html at http://logs.openstack.org/81/178181/2/check-tripleo/check-tripleo-ironic-overcloud-f20puppet-nonha/878edcb/ my browser wants to download and save it14:41
sdaguejeblair: yeh, I can exclude those14:41
ttxhmm, gate is empty now but I think more because all checks fail than because everything is under control ?14:41
jeblairfungi: yes, also that14:41
sdaguethere are non github ones in there14:41
sdaguejeblair: let me tweak the signature14:41
fungilooks like up until a few days ago we were just seeing this for bare-centos6-rax-dfw workers14:42
jeblairttx: yeah, ubuntu mirror fix is still pending (two parallel fixes are in progress)14:42
jeblairjhesketh: i don't think so, thanks :)14:43
ttxjeblair: ok -- should I still recheck RC3-critical jobs, or that doesn't really help ?14:43
jheskethjeblair: no trouble, sorry I can't help more!14:43
jheskethsee you guys in a few hours14:43
sdaguettx: not yet14:43
*** Ala has joined #openstack-infra14:44
*** mattfarina has quit IRC14:44
sdaguettx: yeh, well they'll just fail a lot14:44
jeblairfungi: i'm assuming the swath of tripleo failures over the past few hours are all github14:44
fungiso, ruling out the tripleo github failures, it looks like all the jobs matching the git error pattern are in hpcloud trying to reach our git mirrors in rackspace over the internet14:44
jeblairthat's a very different characterization14:45
fungiso could be hpcloud network problems, could be an issue somewhere in the route between them on the open 'net, or could be something else i guess14:45
mordredfungi: oh. yeah - that's ...14:45
openstackgerritSean Dague proposed openstack-infra/elastic-recheck: Remove tripleo from signature  https://review.openstack.org/17822014:46
sdaguejeblair: maybe14:46
sdaguejeblair: the following14:46
fungithis is just based on a quick visual scan of the logstash query results for that particular bug14:46
sdaguemessage:"fatal: The remote end hung up unexpectedly" AND filename:"console.html" AND NOT build_queue:check-tripleo14:46
*** annegentle has joined #openstack-infra14:46
sdaguewill give you the non tripleo ones14:46
sdagueit's still quite a few in the last 48 hours14:47
sdague57 in 48 hrs14:47
jeblairsdague: yeah, as long as a stackforge project doesn't stick a github clone in there :)14:47
fungimost recent one which impacted a rackspace worker was ~26 hours ago14:47
fungiso there _are_ impacts to rackspace workers, but nowhere nearly the frequency of hits to hpcloud workers14:47
sdaguegate-nova-docs is the most recent fail14:48
sdaguein that query14:48
jeblairfungi, sdague: do the zuul merger errors share the same characterization?14:48
sdagueI do not know14:48
jeblair(are they covered by this query and are they also hpcloud localized?)14:48
sdaguefungi: ?14:48
fungino clue yet14:49
sdagueso this is not hpcloud localized14:49
sdaguethis is broadly impacting14:49
*** sputnik13 has joined #openstack-infra14:49
jeblair2015-04-28 09:53:55.840 | error: RPC failed; result=7, HTTP code = 014:50
jeblairfrom http://logs.openstack.org/72/177072/4/gate/gate-senlin-python27/fce63d8/console.html  earlier14:50
*** masayukig_ has joined #openstack-infra14:50
sdagueyeh, we still don't have cloud as broken out metadata, so it's visual scan to sort that out14:50
fungisdague: as in logstash doesn't show a hit for its bug 1282876 query matching a rax worker for more than a day14:50
fungiall the hits in the past 24 hours have been tripleo and hpcloud14:51
openstackgerritMerged openstack-infra/elastic-recheck: Remove tripleo from signature  https://review.openstack.org/17822014:51
jeblairfungi, sdague: oh it emits the "remote end hung up" line, so zm failures should be included in the git failures query14:51
fungijeblair: yep, looks like roughly the same set of jobs/workers/hits when i query for that14:52
*** yamahata has joined #openstack-infra14:52
sdaguefungi / jeblair ok, so tripleo is purged from that query now, when ER updates again we'll see the updated list14:53
*** nelsnelson has joined #openstack-infra14:53
*** _nadya_ has quit IRC14:54
fungiwhat we've got here is a failure to communicate (over the internet using git)14:54
*** annegentle has quit IRC14:54
*** _nadya_ has joined #openstack-infra14:54
jeblairfungi, sdague: do we think we can stand down on the git failures then?14:54
clarkbfungi does it affect each distro/release? or is one being affected?14:55
fungiclarkb: looks like mostly bare trusty, but that may be because devstack is working around it and we don't run nearly as many jobs on other platforms14:55
*** ajmiller has joined #openstack-infra14:55
sdaguejeblair: so, I think the work will mitigate it14:56
sdagueit's not a huge failure14:56
jeblairis there a ggp workaround change proposed for it?14:56
jeblairclarkb: ?14:56
clarkbwe error on clone, I think the local git cache may make a difference instead14:56
sdaguejeblair: https://review.openstack.org/#/c/178173/14:56
sdagueit's merged14:56
sdagueand probably should be in new images14:56
clarkbjeblair devstack isnt going to retry any git clone ops for us. d-g does it all14:56
jeblairclarkb: this isn't specific to cloning, but also happens with any fetching14:57
fungithere's a short-term fix to do it in the ggp builder macro (retry the script if it exits nonzero) and a workaround in the script itself (which should end up in the new images)14:57
jeblairclarkb: we do fetch from git mirror to bring things up to date14:57
clarkbjeblair but thats d-g as well right?14:57
jeblairclarkb: yeah14:57
jeblairclarkb: i don't think anyone is actually talking about devstack14:57
fungithough if we think this is a useful pattern, then we may want to add something similar to zuul-cloner as well14:57
jeblairfungi: ++14:57
sdagueclarkb: correct, there are no dsvm fails here14:58
*** jaypipes has quit IRC14:58
jeblairclarkb: maybe we were just abbreviating :)14:58
sdaguethis was just about adding some retry logic to non dsvm jobs to see if that helps14:58
*** shardy_ has joined #openstack-infra14:58
anteayagithub uses rackspace for servers, or did at one point14:58
anteayanot sure if they still do14:58
*** _nadya_ has quit IRC14:59
fungithe in-script workaround for ggp sets a timeout on git operations too, which might be useful for the puppet apply jobs problem we have on centos6 (when coupled with the retry logic)14:59
*** nfedotov has quit IRC14:59
*** eharney has quit IRC14:59
fungithat is, might be useful in those jobs if we do something similar in zuul-cloner14:59
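Combining a per-command timeout with a bounded retry, as described above for git operations, might look like the sketch below. It is an illustration under stated assumptions, not the ggp script or zuul-cloner: `run_with_retry` and its parameters are hypothetical, and it relies only on Python's standard `subprocess.run` with its `timeout` argument.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, timeout=60, delay=0.0):
    """Run cmd (a list of arguments); retry on nonzero exit or timeout.

    Hypothetical helper: returns True as soon as one attempt exits 0,
    False if every attempt fails or times out.
    """
    for attempt in range(1, attempts + 1):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # if it does not finish within `timeout` seconds.
            result = subprocess.run(cmd, timeout=timeout)
            if result.returncode == 0:
                return True
        except subprocess.TimeoutExpired:
            pass
        if attempt < attempts:
            # Pause only when another attempt will follow.
            time.sleep(delay)
    return False
```

A call such as `run_with_retry(["git", "fetch", "origin"], attempts=3, timeout=300)` would then give up on a hung fetch after five minutes and try again, instead of blocking the job indefinitely.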
jeblairokay, to recap: a) ubuntu mirror fixes in progress, b) git retry fixes in place, c) pypi problem believed resolved (bad neighbor), d) swift problems ongoing, not critical, workaround in pipeline14:59
*** shardy has quit IRC15:00
sdaguejeblair: yes15:00
sdaguethat sounds correct15:00
mordredI agree with the recap15:00
sdagueresolution of a) should get the trains moving again15:00
sdaguethe rest just make them run better15:00
fungiand now to begin the day o' meetings15:00
jeblairw00t.  and once we confirm a) we can status ok15:00
*** claudiub has joined #openstack-infra15:01
mordredI also agree with that w00t15:02
*** sabeen has joined #openstack-infra15:02
*** BharatK has joined #openstack-infra15:03
*** sabeen2 has joined #openstack-infra15:03
*** mtanino has joined #openstack-infra15:03
*** shardy_ has quit IRC15:04
*** e0ne is now known as e0ne_15:04
jeblairi think most image builds are complete15:05
clarkbis the swift problem related to the cloud?15:05
clarkbor should we be looking at more than a workaround for ourselves?15:05
jeblairclarkb: i don't know, we have very little data on that15:06
fungiyeah, needs more analysis. i suspect the frequency is low enough that we lack a great representative sample from which to draw conclusions15:06
*** zul has quit IRC15:06
sdagueclarkb: https://bugs.launchpad.net/openstack-gate/+bug/1449570  has a query that will get you all the hits15:08
openstackLaunchpad bug 1449570 in OpenStack-Gate "raxspace swift sometimes fails to accept log uploads with file posting error" [Undecided,New]15:08
sdaguemordred tripped over it by accident while looking at other logs15:08
sdagueit's not in ER at the moment because it's apparently non-fatal atm15:08
*** sputnik13 has quit IRC15:09
* mordred trips over many things15:09
sdaguereading random logs in openstack often turns up interesting issues that no one noticed yet15:09
*** erikmwilson has quit IRC15:10
jeblairare all the swift errors from hpcloud?15:10
*** erikmwilson has joined #openstack-infra15:11
*** jamespage_ has quit IRC15:13
*** SergK has quit IRC15:13
clarkbdid a 48hour query, it started at 4/27 1900UTC ish15:13
fungiso... back in the beforetime, when we were actually running a lot of jobs in hpcloud, a disproportionate number of network-related job failures involved hpcloud workers. we didn't see it for a while because hpcloud was so broken that we effectively stopped running jobs there. but now...15:13
jeblairso that could be because of hpcloud network, internet1, or the rax-swift public network path (but _not_ the rax internal path)15:13
fungitime for internet2 already. srsly15:14
clarkbalso looks like no rax failures at all since it started15:14
*** peristeri has quit IRC15:14
*** signed8bit has joined #openstack-infra15:15
clarkbmessage:"ERROR:root:File posting error" AND filename:"console.html" is my query15:15
fungiright, it doesn't (necessarily) cause job failures15:15
clarkbfungi: sorry I meant failcount meaning matching of that query15:15
clarkbbasically one job can match multiple times15:15
fungioh, right yep15:15
clarkbexpanding to a 7 day query there is one additional hit on the 24th15:16
*** dangers_away is now known as dangers15:16
clarkbthat was also in hpcloud but thats it, started very recently and only affects hpcloud15:16
sdaguedid gerrit fall over?15:16
sdagueI just ran rechecks on a bunch of jobs15:17
*** changbl has joined #openstack-infra15:17
clarkbits up for me15:17
sdagueand .... not showing up in zuul15:17
jeblairsdague: works for me15:17
sdaguesorry, gerrit event stream15:17
jeblairsdague: oh, did stream events get stuck?15:17
jeblairyes, it is stuck15:17
jeblairfungi: do you have things staged for a restart?15:18
sdagueso... it's gotten stuck a lot recently, right?15:18
jeblairsdague: yes15:18
jeblairsdague: i have a change to help debug the problem which we have not been able to deploy yet15:18
jeblairfungi was working on staging that so that we might try deploying it again the next time it got stuck15:19
*** spredzy_ is now known as spredzy_|afk15:19
jeblairwe could just restart it, but if fungi or someone else up on the latest there is around, i'd like to see if we can slip it in15:20
*** BobH has joined #openstack-infra15:20
fungijeblair: yep, i can pull from my env now15:20
fungiit's all set up and _should_ work (i tested it out as best i could)15:21
*** sdake_ has quit IRC15:21
jeblairi'm on review.o.o and can help out (v6 internet2 seems reliable enough)15:21
fungiat least, it should allow us to quickly recreate the bouncy castle error, hopefully see why that's happening, and then solve it or quickly switch back15:21
*** johnthetubaguy is now known as zz_johnthetubagu15:22
*** emagana has joined #openstack-infra15:22
fungiokay, i'll pull from my env now... that should trigger a service stop but won't reindex lucene15:22
jeblairfungi: and don't forget to disable puppet to avoid flapping back15:22
clarkbI am around to help too15:22
clarkbfungi: and iirc your change was to do everything but reindex right?15:23
dansmithgerrit is down?15:23
clarkbdansmith: emergency restart (event stream hung again)15:24
sdakein case you didn't know, review.openstack.org is experiencing downtime clarkb15:24
sdakesounds like you do know :)15:24
fungipulled and puppet agent disabled for now15:24
dansmithclarkb: okay, it had been down long enough that it didn't seem like a restart, but then again, I have no idea how long it takes to restart a mega java app :)15:24
fungiError in custom provider, java.lang.SecurityException: class "org.bouncycastle.util.io.TeeOutputStream"'s signer information does not match signer information of other classes in the same package15:24
*** lucap has joined #openstack-infra15:25
jeblair  while locating com.google.gerrit.server.contact.ContactStoreProvider15:25
clarkbwe have two bcprovs in place15:25
fungithat looks like the cause15:25
jeblairyeah, system and gerrit-local, right?15:25
clarkbone from today and the other is the symlink to the system package15:26
clarkbjeblair: yup15:26
clarkbsame thing with the mysql driver connector thing15:26
fungilooks like we've started bundling and unpacking bcprov-jdk16-144.jar15:26
fungiso they must be included in the builds and weren't previously?15:27
fungii'll remove the symlinks15:27
jeblaircould we have changed the job definitions?15:27
jeblairfungi: is that the right direction?15:27
fungiit's possible we've changed the build job to start bundling them15:28
jeblairthe puppet module makes the system symlink15:28
clarkbI was going to suggest moving the non symlinks15:28
jeblairclarkb: i lean toward that; i think that may be the minimal change15:28
jeblairfungi: ^15:28
fungiyeah, i wanted to see if the bundled ones work. they do not15:28
fungiremoving them and restoring the symlinks now15:29
clarkbfungi: if you need target paths I have those in my ls scrollback15:29
jeblairfungi: you have the old state?  i have an ls if you need it.15:29
clarkbjeblair: :)15:29
fungii had the original state logged15:30
fungithat worked15:30
clarkbfungi: so it is starting with jeblair's patch in place?15:30
fungii've moved the unpacked versions to ~gerrit215:30
fungii believe so15:30
jeblairyep, git ops are working15:31
clarkbI should probably go read the puppet now to try and sort out why we aren't cleaning those libs up properly15:31
jeblairweb is up15:31
fungiPowered by Gerrit Code Review (2.8.4-19-g4548330)15:31
fungithat's the right patched war15:31
fungiclarkb: it's also possible the names of the libs changed? the unpacked ones are bcprov-jdk16-144.jar and mysql-connector-java-5.1.21.jar15:32
jeblairsdague: rerecheck15:32
clarkbfungi: i thought we were doing a puppet purge with a glob15:32
clarkbfungi: looking into it now15:32
fungiso if we were relying on puppet to delete those, then they might have been specified overly-specifically?15:32
*** tiswanso has joined #openstack-infra15:32
clarkbok its the tidy at the end of the puppet-gerrit/manifests/init.pp class15:33
clarkbI wonder if its a regex and not a glob?15:34
clarkbto the puppet docs15:34
*** sarob has joined #openstack-infra15:34
clarkbnope https://docs.puppetlabs.com/references/3.stable/type.html#tidy-attribute-matches should be shell type file globs15:34
jeblairsdague: around?15:34
*** TheJulia has quit IRC15:35
*** jamespage_ has joined #openstack-infra15:35
fungi#status notice gerrit has been restarted to clear an issue with its event stream. any change events between 14:43-15:30 utc should be rechecked or have their approval votes reapplied to trigger jobs15:36
openstackstatusfungi: sending notice15:36
jeblairttx: i think it might be worth rechecking rc changes now15:36
fungier, that was supposed to be 14:53-15:30 but close enough15:36
*** zz_dimtruck is now known as dimtruck15:36
-openstackstatus- NOTICE: gerrit has been restarted to clear an issue with its event stream. any change events between 14:43-15:30 utc should be rechecked or have their approval votes reapplied to trigger jobs15:36
*** Swami has joined #openstack-infra15:37
openstackstatusfungi: finished sending notice15:38
ttxjeblair: on my way15:38
*** asselin has quit IRC15:39
clarkbmore notes on the tidy. We don't install bcprov jar in review site, gerrit init seems to do that for us15:39
sdaguejeblair: back, sorry was working on my linuxcon cfp while things were getting poked15:40
jeblairsdague: np, just wanted to alert you you can recheck15:41
*** jamesmcarthur has joined #openstack-infra15:41
sdaguejeblair: thanks15:42
*** sushilkm has joined #openstack-infra15:43
*** sushilkm has left #openstack-infra15:43
jeblairsdague: the results at the top of zuul status make me think the apt mirror problem is fixed15:44
fungii'm going to take this opportunity to grab a quick shower, but will help come up with a puppet patch to bring review.o.o to sanity afterward so we can reenable puppet agent on it again15:44
jeblairfungi: cool, thanks15:44
jeblairsdague: do you agree with that?15:44
openstackgerritClark Boylan proposed openstack-infra/puppet-gerrit: Run lib tidy after plugin install, before start  https://review.openstack.org/17825115:45
clarkbfungi: jeblair ^ I am sort of working on a hunch that that caused the issue. Either way I think my change is an improvement15:45
*** yolanda has joined #openstack-infra15:45
jeblairclarkb: do you think we saw the problem because we ended up with a different and incompatible version bundled with .19 but not in .7?15:46
jeblairer .1715:46
*** annegentle has joined #openstack-infra15:46
clarkbjeblair: I think the plugin installs may pull in the wrong stuff15:47
sdaguejeblair: I'm not sure - https://jenkins06.openstack.org/job/gate-tempest-dsvm-neutron-src-python-glanceclient/111/console is in the gate now, and is a rax job, so if that passes stack.sh we're probably good15:47
jeblairoh, huh15:47
clarkbjeblair: so if we do gerrit-init, tidy libs, plugin installs, gerrit start we end up with the wrong stuff in the review site15:47
*** hemnafk is now known as hemna15:47
clarkbjeblair: but if we do gerrit-init, plugin installs, tidy, gerrit start we should avoid that15:47
sdaguejeblair: so... it looks like that image bypassed the rax mirrors15:48
clarkbjeblair: though as I read more I am not sure we would tidy before doing a gerrit start15:48
clarkbjeblair: but we don't seem to have later run the tidy so I don't think that was the case15:49
sdaguewhich means I think we're now isolated from rax mirror15:49
jeblairi'm going to 'status ok' then, sound good?15:49
clarkbno opposition here15:50
*** ajmiller has joined #openstack-infra15:50
sdaguejeblair: wfm15:50
jeblair#status ok15:51
openstackstatusjeblair: sending ok15:51
mordredsdague: woot!15:51
jeblairblank message since the last one was suggesting rechecks anyway15:52
clarkblooking at http://puppetboard.openstack.org/report/review.openstack.org/afdb20cf974c2e2bca2bc47d3bbed678f019687e that doesn't seem to support my theory. I would've expected to see the gerrit-init and plugin install execs there15:53
jeblairand it sounds like we can declare the emergency over (i hope)15:53
clarkbfungi: did you disable those execs entirely? if so we may just not have properly cleaned up the old .17 env at all15:53
openstackstatusjeblair: finished sending ok15:53
clarkbalso yay puppet and file permissions15:54
mordredjeblair: I love the smell of napalm in the morning etc etc15:55
*** jeblair changes topic to "Discussion of OpenStack Developer and Community Infrastructure | docs http://docs.openstack.org/infra/manual/ http://ci.openstack.org/ | bugs https://storyboard.openstack.org/ | source https://git.openstack.org/cgit/openstack-infra/"15:55
*** jeblair sets mode: -o jeblair15:55
* jeblair lunches15:55
mordredclarkb: while you're puppeting - https://review.openstack.org/#/c/178180/2 is one from this morning that we didn't happen to get in15:56
*** jamesmcarthur has quit IRC15:56
*** doug-fish has quit IRC15:58
openstackgerritMonty Taylor proposed openstack-infra/devstack-gate: Put input variables into ansible inventory  https://review.openstack.org/17794316:01
openstackgerritMonty Taylor proposed openstack-infra/devstack-gate: Move all the ansible calls into playbooks  https://review.openstack.org/17794416:01
openstackgerritJulia Kreger proposed openstack-infra/shade: Convert node_set_provision_state to task  https://review.openstack.org/17798716:01
*** tqtran has joined #openstack-infra16:01
*** rbradfor has joined #openstack-infra16:03
*** annegentle has quit IRC16:03
fungiclarkb: if you git diff in /etc/puppet/environments/fungi/modules/gerrit on the puppetmaster you'll see the very minor surgery i did to remove the lucene reindex command from the gerrit-init exec16:04
mordredclarkb: I'm reading some code elsewhere about neutron and floating ips ... and if I'm reading it right, all of our problems with how it works go away if we use neutronclient properly16:04
*** jamesmcarthur has joined #openstack-infra16:04
mordredclarkb: I'm going to verify16:04
clarkbfungi: ok, so I am not sure we ran the init at all then16:04
fungiclarkb: all i did was take out16:04
clarkbmordred: neutronclient does not solve the leak problem16:04
fungi; /usr/bin/java -jar ${gerrit_war} reindex -d ${gerrit_site}16:04
mordredclarkb: depends on which problem we're talking about16:04
clarkbmordred: nor does it solve NAT, or the extra time it takes for another API round trip16:05
clarkbmordred: so I don't know what problem you are talking about16:05
mordredclarkb: well, in this case doing the right thing with neutronclient would solve one of the extra API calls (since we actually have to do 2 additional round trips, one for create, and one for attach)16:05
clarkbfungi: do you have the output from puppet on that run handy somewhere?16:05
*** jlanoux has quit IRC16:05
fungiclarkb: what i find interesting though is that when i ran into this the first time i tried to upgrade it to the 2.8.4-19 war there were apparently those extra libs unpacked, but when i switched it back to 2.8.4-17 it cleaned them up properly16:06
mordredclarkb: in that you can apparently give neutron the fixed ip you want it associated with when you create it16:06
*** otter768 has joined #openstack-infra16:06
mordredclarkb: I'm going to hack up a quick test to see if the code I'm looking at a) works b) is useful to us - but it's hella cleaner than the thing we do now if it does16:06
dansmithjogo: sdague: can this go now? https://review.openstack.org/#/c/174567/16:06
dansmithjogo: sdague: the nova patch that depends-on it is passing16:07
sdaguedansmith: yes16:07
fungiclarkb: http://paste.openstack.org/show/210390 is from my terminal buffer16:07
sdaguemtreinish: you want to look quick on that one?16:07
*** jamesmcarthur_ has joined #openstack-infra16:07
dansmithsdague: cool, thanks16:07
clarkbfungi: ok that shows the execs running, so puppetboard is just incomplete/wrong/whoknows16:07
clarkbfungi: however the tidy does not run16:08
fungiclarkb: i wonder how/why it tidied again on downgrade last time though16:08
jogodansmith: well mtreinish beat me to it16:09
*** jamesmcarthur_ is now known as jamesmcarthur16:09
dansmithjogo: better luck next time :)16:09
dansmithmtreinish: thanks16:09
*** ddieterly has joined #openstack-infra16:10
clarkbfungi: rereading tidy docs I don't see any problems with it. We recurse => true which is necessary to use matches. And the matches array items are OR'd not AND'd16:10
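The OR semantics clarkb describes for the tidy `matches` list can be sketched in Python with `fnmatch`, which also does shell-style globbing. This is only an illustration, not Puppet's implementation: the two glob patterns are assumptions (the real ones live in puppet-gerrit's init.pp and are not quoted in this log), `other-plugin.jar` is a made-up name, and the two real jar names are the ones fungi reported above.

```python
from fnmatch import fnmatch


def files_to_tidy(filenames, patterns):
    """Return the names that match at least one glob pattern (OR, not AND)."""
    return [
        name
        for name in filenames
        if any(fnmatch(name, pattern) for pattern in patterns)
    ]


# Hypothetical patterns, chosen only to illustrate the matching rule.
patterns = ["bcprov-*.jar", "mysql-connector-java-*.jar"]

# Two jar names taken from the discussion plus one made-up bystander.
lib_dir = [
    "bcprov-jdk16-144.jar",
    "mysql-connector-java-5.1.21.jar",
    "other-plugin.jar",
]

tidied = files_to_tidy(lib_dir, patterns)
```

A file is selected as soon as any one pattern matches it, so adding more patterns can only widen the set of files removed.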
jogosoo http://status.openstack.org/elastic-recheck/gate.html#128681816:11
jogomassive spike ^16:11
*** baoli has quit IRC16:11
openstackgerritJeremy Stanley proposed openstack-infra/system-config: Revert "Revert "Update production gerrit to""  https://review.openstack.org/17826916:11
*** davideagnello has joined #openstack-infra16:11
*** otter768 has quit IRC16:11
clarkbfungi: since we require both inits but I thought puppet did the correct thing with that. However in the case of going back to .17 that worked and I can't figure how to resolve that with requires being the problem16:11
fungijogo: welcome to the fun!16:11
sdaguejogo: dude, read the scrollback16:11
fungiclarkb: i agree. the only other possibility is that something is happening out of sequence, and the -17 war didn't actually bundle these libs16:13
clarkbfungi: and the base path looks correct /home/gerrit2/review_site/lib16:13
openstackgerritMerged openstack-infra/project-config: Add retry message to gerrit-git-prep macro  https://review.openstack.org/17818016:13
clarkbfungi: I think next step is for me to run some puppet apply locally with a tidy manifest and see if I can reproduce16:14
*** pcaruana has quit IRC16:14
jogofungi: ahh I see https://review.openstack.org/#/c/178160/416:15
*** davideagnello has quit IRC16:15
*** ajmiller has quit IRC16:17
clarkbfungi: something like http://paste.openstack.org/show/210419/16:17
fungiclarkb: that looks like a rough approximation yes16:18
clarkbfungi: and that appears to work16:20
clarkbI see tidy say it is removing files then ls shows they are gone16:20
openstackgerritMatthew Treinish proposed openstack-infra/subunit2sql: Improve run_time graph formatting  https://review.openstack.org/17827616:21
fungithis is puzzling16:21
clarkbadded in the other matches to see if that broke puppet too and it does not16:23
zaroyolanda: is change 75514 working for you?16:23
zaroyolanda: i just tried running it and still getting the same error.16:23
fungiclarkb: so one theory is that requiring both gerrit-initial-init and gerrit-init even though we only exec the latter is the cause for not running the tidy after?16:23
zaroyolanda: ran with PS 3016:23
clarkbfungi: correct16:23
yolandazaro, i tested with jenkins-jobs test, it may be skipping some bits?16:24
clarkbfungi: I had thought that wouldn't matter because both execs are evaluated, its just that one does not actually fork16:24
clarkbfungi: and that had been sufficient to satisfy requires in the past16:24
fungioh, right, i see that now16:24
clarkbfungi: its possible newer puppet breaks that behavior?16:25
clarkbfungi: they may have optimized that node out of the graph because they know it won't exec16:25
yolandazaro, let me try again, i can try with a real update instead of test16:25
zaroyolanda: maybe my local cache is messed up. let me try to kill my jjb cache16:25
*** jamesmcarthur_ has joined #openstack-infra16:26
yolandazaro, i'm using jenkins-jobs -l WARN test --workers 4 ../project-config/jenkins/jobs16:26
*** ivar-lazzaro has quit IRC16:27
yolandaand no errors16:27
fungiclarkb: interesting data point, on review-dev (which is running 2.10 so ymmv) there is a duplicate mysql-connector-java besides the symlink, but no bcprov symlink at all16:27
*** jamesmcarthur has quit IRC16:27
*** jamesmcarthur_ is now known as jamesmcarthur16:27
*** ivar-lazzaro has joined #openstack-infra16:28
clarkbfungi: for 2.10 and bcprov we had to stop using the system package16:28
clarkbfungi: system package was not new enough16:28
zaroyolanda: looks like that was it. deleted my local cache and it works now.16:28
zaroyolanda: hmm, will need to test that.16:29
yolandazaro, darragh was writing some script for testing threads16:29
cineramapleia2: hi there16:30
*** mtanino_ has joined #openstack-infra16:30
*** isviridov is now known as isviridov_away16:31
*** mtanino has quit IRC16:33
*** harlowja_at_home has quit IRC16:33
zaroyolanda: it only happened on one of my test runs so i wanted to test some before apprv16:33
clarkbfungi: my change should address the requires issue if it is an issue, and it will make sure we tidy before starting the service so thats also good16:33
clarkbfungi: I am just not convinced it solves all the problems here16:33
zaroyolanda: mostly i think it's just more testing at this point to make sure it's solid.16:34
fungiclarkb: oh, it's possible the service start failed and the tidy didn't happen until after i suppose16:36
*** ssam2 has quit IRC16:36
*** ivar-lazzaro has quit IRC16:36
yolandazaro, what are you using for testing? apart from unit tests and doing a test run on projects.yaml?16:37
*** yamahata has joined #openstack-infra16:38
*** tiswanso has quit IRC16:38
fungihrm, only gets fired from a notify though, so nothing's depending on it in the puppet sense16:38
pleia2cinerama: hey16:39
*** emagana has quit IRC16:39
clarkbfungi: ya so even if that failed puppet should continue and run things that don't depend on it16:39
zaroyolanda: i just run a script that executes the update cmd with the following params: no worker specified, workers=0, workers=2, workers=4, workers=8 on a 4 cpu VM against a jenkins master.16:39
zaroyolanda: then i just validate the timestamp from the test run16:40
cineramapleia2: hi there. so i'm having a think about how i want to structure the zanata client stuff. basically we need to run the client's 'stats' command to get what translations we have available & their percentage completion16:40
yolandazaro, and in which case did you see that error? with some specific workers setting, or just happened on some random run?16:41
cineramapleia2: so for the proposal scripts we have a file of common functions in bash & we can use some of those, but i'm doing the additional template generation and stats command result processing in python because that's a bit easier16:41
pleia2cinerama: wfm16:41
*** lucap has joined #openstack-infra16:42
clarkband the tidy doesn't appear wrapped in any conditionals16:42
zaroyolanda: the error i pointed out in PS 27 is from a messed up cache.16:43
fungiyeah, the only theory i have is that it's because of the require block16:43
pleia2cinerama: I don't know how we currently track whether something has been downloaded and prepped for gerrit, so it doesn't keep proposing the same patches that are over 75% translated but does submit changes as they happen16:43
clarkbfungi: ya, so let me change my commit message since I basically know the theory is wrong there now, but the change itself is good for other reasons16:43
yolandazaro, and the too many values to unpack from ps19? are you still seeing that?16:43
fungialso i missed your proposed change. i wonder if openstackgerrit is struggling16:44
zaroyolanda: not happening anymore because i cleaned out my cache.  but will need to verify that cache from a run against master will work with this change.16:44
yolandaah, good point16:44
*** che-arne has quit IRC16:44
openstackgerritClark Boylan proposed openstack-infra/puppet-gerrit: Run lib tidy after plugin install, before start  https://review.openstack.org/17825116:44
clarkbfungi: ^ it was first pushed when you stepped out, the bot did report it here16:44
fungiahh good. also zuul's still receiving gerrit events, so we're not stuck again (yet anyway)16:44
zaroyolanda: now i'm seeing this error: AttributeError: 'Namespace' object has no attribute 'name'16:45
*** dboik_ has quit IRC16:45
jeblaircinerama, pleia2: we can't count on keeping data around between jobs on the proposal slave (not sure that's what you were suggesting).  partly because we want to eventually run those jobs on single-use slaves.16:46
pleia2cinerama: and I think it's fine to create new scripts entirely for this16:46
yolandazaro, that comes from a publisher.. is that happening only with that change, or also in the original jjb?16:46
*** lucap has quit IRC16:46
pleia2jeblair: right, good to know, will have to poke into how state is remembered now16:46
cineramajeblair: not between jobs, just within the context of that particular job16:46
jeblaircinerama: ok cool16:47
jeblairpleia2: i think state is remembered in gerrit16:47
*** unicell has joined #openstack-infra16:47
jeblairvia queries for open changes -- if there is an open change for something, proposal bot updates it with a new patchset16:47
zaroyolanda: it's coming from cmd.py16:47
fungiclarkb: interestingly, it's the gerrit-init exec which is downloading the files we want to tidy, but i guess install-core-plugins doesn't fire until notified by either gerrit-init or gerrit-initial-init so should work out16:48
yolandazaro, can you paste it?16:48
pleia2jeblair: ah, interesting16:48
zaroyolanda: take a look at the comment in gerrit.16:48
clarkbfungi: yup, it basically map reduces the two inits for us16:48
fungias long as we keep an eye out for refactors that might change that relationship16:48
zaroyolanda: yeah, since this is all about the update command i would suggest testing it because our CI won't16:49
clarkbsdague: any idea why https://jenkins02.openstack.org/job/gate-devstack-unit-tests/360/console failed? those unittests don't seem to say much16:50
yolandazaro, can you pass me that cache file in some way?16:50
zaropelix, yolanda : still wondering why --workers is a param for 'test' command?  its a noop for test right?16:50
zaroyolanda: hmm maybe dropbox? let me take a look at how to do that.16:51
yolandaor email16:51
*** yfried has joined #openstack-infra16:51
*** e0ne has quit IRC16:52
openstackgerritDavide Guerri proposed openstack-infra/shade: WiP: Add keystone services/endpoints methods  https://review.openstack.org/17762116:54
openstackgerritLouis Taylor proposed openstack-infra/project-config: Add functional test job for python-glanceclient  https://review.openstack.org/17828516:55
pabelangerIs anybody running a public (filtered) iCal feed for infra meetings?  The iCal feed for all projects is a little noisy for me16:56
clarkbpabelanger: I just put 1900UTC tuesdays on my calendar directly. It hasn't changed since I started attending that meeting16:56
fungipabelanger: hrm... we have one infra meeting at the same time every week and it hasn't changed in at least 3 years16:56
fungiso not sure what the benefit of an ical for that would be16:57
*** lucap has joined #openstack-infra16:57
jeblairpabelanger: i think that's a feature of the upcoming yaml2ical stuff16:58
jeblairpabelanger: so hopefully after the summit?16:58
*** baoli has joined #openstack-infra16:58
*** dboik has quit IRC16:58
clarkbsdague: going to guess that is a difference with hpcloud left undetected by not testing there for a while16:59
*** ZZelle_ has joined #openstack-infra17:00
*** dustins has joined #openstack-infra17:00
*** davideagnello has joined #openstack-infra17:01
clarkbsdague: it looks like devstack is interferring with hpcloud networking on
clarkbfungi: looking17:02
yolandathx zaro17:02
fungizaro: thoughts on https://review.openstack.org/178251 before it gets approved?17:02
funginibalizer: ^ if you're around17:02
pabelangerclarkb, fungi, jeblair: Cool, that is what I am doing too.  Wanted to see if something public was available first.17:03
fungiseems like it should be safe enough, though we're not going to see it in action before the next time we update a gerrit war17:03
sdagueclarkb: pointer?17:04
*** dboik has joined #openstack-infra17:04
sdagueI thought we forced fixed network to a different range17:04
*** yfried is now known as yfried|prtially_17:05
*** ivar-laz_ has quit IRC17:05
*** unicell has quit IRC17:06
*** unicell1 has joined #openstack-infra17:06
*** rvasilets_ has left #openstack-infra17:07
*** nmagnezi has joined #openstack-infra17:08
*** ivar-laz_ has joined #openstack-infra17:09
*** ir2ivps has quit IRC17:09
openstackgerritClark Boylan proposed openstack-infra/project-config: Set FIXED_RANGE in devstack unittests  https://review.openstack.org/17829417:10
clarkbsdague: ^ I don't think you do but that change should force it17:10
clarkbsdague: https://jenkins02.openstack.org/job/gate-devstack-unit-tests/360/console is the log from the failure17:10
sdagueclarkb: can we fix that in the unit tests instead?17:11
clarkbsdague: sure, its just that you likely won't have a range in unittests that will always work, but I can set a fixed range for our env that will17:12
zaroyolanda: email was sent to you17:12
yolandayep, received it17:12
sdagueclarkb: so we don't need to source openrc17:14
zarofungi: i think i will -1 that change17:14
openstackgerritMerged openstack-infra/puppet-logstash: Modernize kibana vhost template  https://review.openstack.org/15381917:14
*** tonytan4ever has quit IRC17:15
clarkbsdague: gotcha, I have never looked at devstack unittests until about 5 minutes ago, so if that is possible I don't know17:15
sdagueyeh, I removed a similar thing a few weeks ago17:16
sdaguethat should fix that issue in the fail17:16
openstackgerritJan Klare proposed openstack-infra/project-config: remove cookbook-pacemaker from infra  https://review.openstack.org/17829817:16
clarkbsdague: oh I rechecked thinking that was several weeks old, thats a new change17:17
sdagueok, I'm going to get out for a bike ride before meetings galore17:17
openstackgerritJan Klare proposed openstack-infra/project-config: move gate-.*-chef-rake job and run it branch specific  https://review.openstack.org/17667417:17
sdagueclarkb: yes, I just wrote that one :)17:17
sdaguebut the other one17:17
sdaguegit grep HOST17:17
sdaguetest_libs_from_pypi.sh:HOST_IP="don't care"17:17
sdaguewhich let me drop openrc in that file17:18
jklarestill firefighting?17:18
*** harlowja_away is now known as harlowja_17:21
openstackgerritDavid Shrewsbury proposed openstack-infra/shade: Allow complex filtering with embedded dicts  https://review.openstack.org/17829917:23
*** dannywilson has joined #openstack-infra17:24
*** yfried|prtially_ is now known as yfried17:25
*** jamesmcarthur has joined #openstack-infra17:26
*** wenlock has quit IRC17:27
openstackgerritMorgan Fainberg proposed openstack-infra/project-config: Add keystoneauth library and testing infrastructure  https://review.openstack.org/17559617:29
fungizaro: what problem did you spot on 178251?17:29
*** claudiub has quit IRC17:29
yolandazaro, same error as you17:29
clarkbjklare: the majority of fires are out now we are trying to sort out some remaining issues like why our gerrit upgrades don't work completely as expected17:30
clarkbwe should also start collecting numbers on git fetch retries with mordreds patch in17:30
morganfainbergfungi, anteaya, clarkb, pleia2: ^ 175596, talked with jeblair and was told that should/can merge before governance change17:30
morganfainbergand sooner will go a long way to getting us moving on that.17:31
morganfainbergwhen you're not fighting fires that is17:31
zarofungi: still reviewing.  i think the use of both require & before is confusing17:31
clarkbmorganfainberg: devils advocate, why wouldn't you just import keystoneclient.auth and have that be lightweight?17:32
morganfainbergclarkb: because keystoneclient has a lot of extra dependencies17:32
morganfainbergclarkb: keystoneauth is meant to be very light, trimmed dependencies that is relevant for servers, clients, or SDK w/o needing to import / depend on keystoneclient17:33
morganfainbergeliminates pysaml2 for example as a dep17:33
yolandazaro, it was a typo, testing again and i'll push it17:33
yolandanow testing on a live server17:33
morganfainbergand keystoneclient will continue to depend on extra things like oslo.serialization, where keystoneauth probably won't (if we can avoid it)17:33
zarofungi, nibalizer :commented on 17825117:35
*** ir2ivps has joined #openstack-infra17:35
zaroyolanda: here's the script i used for testing: http://paste.openstack.org/show/210518/17:36
yolandanice, let me test with that17:37
*** jamesmcarthur has quit IRC17:37
clarkbmorganfainberg: ok so you are concerned about dependencies17:37
*** melwitt has joined #openstack-infra17:38
morganfainbergclarkb: ksc becomes more "management of IAM stuff" and keystoneauth is "authn/authz" specific.17:38
morganfainbergbut the dependencies is the #1 reason17:38
adam_ghuh? when did gerrit start accepting changes without a Change-Id footer?17:39
morganfainbergadam_g: afaik17:39
fungiadam_g: it's a configurable option per project. have an example?17:39
adam_gmorganfainberg, yeah thats what i thought17:39
pleia2morganfainberg: had a comment/question inline17:39
morganfainbergpleia2: looking17:40
adam_gfungi, ive ended up with a bunch of dupes in  akanda-appliance-builder, where is that toggled per project?17:40
morganfainbergpleia2: sure.17:40
* mordred supports keystoneauth with low dependencies17:40
fungiadam_g: i think it's turned on for all projects, but will check that one17:40
morganfainbergpleia2: can remove those.17:40
morganfainbergpleia2: let me respin addressing that17:40
yolandawow, so zaro, reason of my typo17:41
openstackgerritMorgan Fainberg proposed openstack-infra/project-config: Add keystoneauth library and testing infrastructure  https://review.openstack.org/17559617:41
yolandasubparser: update has option "names"17:41
morganfainbergpleia2: ^17:41
yolandasubparser: test has option "name"17:42
pleia2morganfainberg: thanks17:42
morganfainbergpleia2: np!17:42
fungiadam_g: looks like the acl needs fixing17:42
clarkbok I am really stumped on the puppet thing and think we should give my change a go and if we still see this problem my change should make debugging easier17:42
openstackgerritMerged openstack-infra/system-config: Revert "Revert "Update production gerrit to""  https://review.openstack.org/17826917:42
*** luqas has quit IRC17:42
yolandazaro, that's a different problem, but is showing up on testing17:43
*** yfried|away is now known as yfried17:43
fungiadam_g: see unfortunate tyop at http://git.openstack.org/cgit/openstack-infra/project-config/tree/gerrit/acls/stackforge/akanda.config#n1117:43
clarkbthough if you use git-review it should always add one for you17:44
clarkbso protip use git-review?17:44
clarkboh except maybe not for a stack17:44
clarkbwe can fix that in git review though17:44
adam_gfungi, an unfortunate tyop indeed!17:44
adam_gclarkb, yeah17:44
fungiadam_g: i'll patch it. i find it cargo-culted in three other acls too17:45
adam_gfungi, thanks17:45
*** lucap has quit IRC17:46
clarkbfungi: I am checking all-projects for that acl17:46
clarkbfungi: if its not there we should probably add it17:46
fungiclarkb: probably17:46
fungiit's inheriting false from all-projects17:46
clarkbya its set to false in all-projects17:46
clarkbwe can probably avoid this trouble by setting that to true17:46
*** tonytan4ever has joined #openstack-infra17:47
clarkbneed to think about why that may be a bad idea17:47
zaroyolanda: i guess 'names' makes more sense.17:47
zaroyolanda: would be best to go with names and deprecate 'name'17:47
zaroyolanda: but should support both for now so it won't break when users update17:48
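zaro's suggestion (prefer `names`, keep accepting `name` for a while) could look roughly like the sketch below. It is not jenkins-job-builder's actual code: `build_parser`, `job_names`, and the `jobs-tool` program name are hypothetical, and the only detail taken from the discussion is that one subparser exposed `names` while the other exposed `name`, which is what produces an `AttributeError` on the `Namespace` when shared code reads the wrong one.

```python
import argparse
import warnings


def build_parser():
    """Build a toy CLI whose subcommands both expose the option as 'names'."""
    parser = argparse.ArgumentParser(prog="jobs-tool")  # hypothetical tool name
    subparsers = parser.add_subparsers(dest="command")
    for command in ("update", "test"):
        sub = subparsers.add_parser(command)
        sub.add_argument("path")
        sub.add_argument("names", nargs="*", help="job names to process")
    return parser


def job_names(options):
    """Read 'names', falling back to the legacy 'name' attribute if present."""
    if hasattr(options, "names"):
        return list(options.names)
    if hasattr(options, "name"):
        warnings.warn("'name' is deprecated; use 'names'", DeprecationWarning)
        return list(options.name)
    return []


options = build_parser().parse_args(["test", "jobs/", "job-a", "job-b"])
selected = job_names(options)
```

Routing every read through one accessor means callers never touch `options.name` directly, so the legacy attribute can be dropped later in a single place.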
openstackgerritJeremy Stanley proposed openstack-infra/project-config: Fix mistyped requrieChangeId in Gerrit ACLs  https://review.openstack.org/17830717:48
openstackgerritMerged openstack/requirements: create a separate section for pinned requirements  https://review.openstack.org/17719317:48
*** peristeri has joined #openstack-infra17:49
yolandazaro, ok, it will need another change to fix that17:50
*** jamesmcarthur has joined #openstack-infra17:50
*** tjones1 has joined #openstack-infra17:51
*** wenlock has joined #openstack-infra17:51
*** arxcruz has quit IRC17:53
*** markvoelker has quit IRC17:54
*** annegentle has joined #openstack-infra17:54
*** wenlock has quit IRC17:54
clarkbfungi: ya thats another option17:55
*** SergK has joined #openstack-infra17:56
fungiif i'm interpreting zaro's concerns correctly, using require instead of before/after makes it easier to follow what the intended order is17:57
*** sushilkm has joined #openstack-infra17:57
nibalizerzaro: 'before' means 'I go before this other thing' and require means "that other thing goes before me"17:57
nibalizerfungi: if you want to use only require thats super ok with me17:57
nibalizerkeeping it consistent has value i think17:57
funginibalizer: it's not bothering me, but i do take zaro's concerns seriously too17:58
zaroyes, valuable when you need to debug17:58
*** e0ne has quit IRC17:58
nibalizerya so switch to all requires17:58
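nibalizer's point that `before` and `require` are the same edge written from opposite ends can be shown with a small dependency graph. This is a sketch using Python's `graphlib`, not Puppet's resource graph: the resource names are borrowed loosely from the discussion and the metadata layout is an assumption made for illustration. The intended order is the one clarkb described earlier (gerrit-init, plugin installs, tidy, gerrit start).

```python
from graphlib import TopologicalSorter


def apply_order(resources):
    """Normalize 'require' and 'before' into one predecessor graph and sort it."""
    graph = {name: set() for name in resources}
    for name, meta in resources.items():
        # require: the listed resources must be applied before this one.
        for dependency in meta.get("require", []):
            graph[name].add(dependency)
        # before: this resource must be applied before the listed ones.
        for later in meta.get("before", []):
            graph.setdefault(later, set()).add(name)
    return list(TopologicalSorter(graph).static_order())


# Hypothetical resource declarations mixing both styles.
resources = {
    "gerrit-init": {},
    "install-core-plugins": {"require": ["gerrit-init"]},
    "tidy-libs": {
        "require": ["install-core-plugins"],
        "before": ["gerrit-service"],
    },
    "gerrit-service": {},
}

order = apply_order(resources)
```

Writing the last edge as `"gerrit-service": {"require": ["tidy-libs"]}` instead would produce the same graph, which is why using only `require` loses nothing and reads more consistently.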
*** ir2ivps has quit IRC17:58
*** ildikov has quit IRC17:59
clarkbI can update the change in a moment, finishing an email first17:59
openstackgerritEmilien Macchi proposed openstack-infra/system-config: Create rubygems mirror from rubygems.org  https://review.openstack.org/17802617:59
*** sdake has quit IRC17:59
*** lucap has joined #openstack-infra18:00
openstackgerritMichael Krotscheck proposed openstack-infra/storyboard-webclient: Removed angular-eslint  https://review.openstack.org/17831218:00
*** tiswanso_ has joined #openstack-infra18:01
*** weshay has quit IRC18:02
*** melwitt has joined #openstack-infra18:03
*** melwitt has quit IRC18:03
openstackgerritDavid Shrewsbury proposed openstack-infra/shade: Allow complex filtering with embedded dicts  https://review.openstack.org/17829918:04
*** jamielennox|away is now known as jamielennox18:04
*** MaxV has joined #openstack-infra18:04
openstackgerritClark Boylan proposed openstack-infra/puppet-gerrit: Run lib tidy after plugin install, before start  https://review.openstack.org/17825118:04
clarkbfungi: zaro ^ I moved the tidy too so that if you are doing a top to bottom read of the file it flows well18:05
*** jogo has quit IRC18:06
*** dtantsur is now known as dtantsur|afk18:06
*** weshay has joined #openstack-infra18:07
*** jogo has joined #openstack-infra18:07
*** TheJulia has joined #openstack-infra18:07
*** annegentle has quit IRC18:09
*** jamesmcarthur has quit IRC18:09
pabelangerSo, are our DIBs accessible from a public URL before being moved into the cloud?  Specifically, the devstack-centos7-dib?  I'd rather just consume that directly from -infra, if possible, then build my own.18:10
*** annegentle has joined #openstack-infra18:10
*** otter768 has quit IRC18:12
fungipabelanger: not yet. we're looking into good solutions for publishing them18:12
pabelangerfungi, roger18:12
*** ildikov has joined #openstack-infra18:12
fungipabelanger: we're also only using them for some images in some providers at the moment18:13
*** doug-fish has joined #openstack-infra18:13
clarkbit should be trivial to build your own if you have about 8GB of disk, some network bandwidth and an hour of time18:13
pabelangerclarkb, agreed.  Was more curious if anything was public before going down that path.18:13
clarkbin openstack-infra/project-config/tools/ there is a build-image.sh script which sets things up, you just have to override the default of ubuntu to get centos7 or fedora18:13
fungifor example, desk chair jousting in the office hallways18:14
pabelangertime to stock up on nerf supplies18:15
fungiokay, puppet on review.o.o is reenabled and up to date now18:16
fungiwe return you to your regularly scheduled (computer) programming18:16
*** cdent has quit IRC18:17
zaroclarkb: LGTM18:18
openstackgerritJan Klare proposed openstack-infra/project-config: move gate-.*-chef-rake job and run it branch specific  https://review.openstack.org/17667418:18
*** ivar-laz_ has quit IRC18:19
*** tqtran has quit IRC18:19
mordredpabelanger: it probably does not matter - but please be aware that our SSH keys are on all of those images18:20
mordredpabelanger: so if/when we do start publishing to something other than glance and you do decide to directly reuse them ... just know that we'll be able to ssh in to them :)18:21
*** ivar-lazzaro has joined #openstack-infra18:21
*** lucap has quit IRC18:21
*** emagana has joined #openstack-infra18:22
pabelangermordred, Ya, was thinking about that too.18:22
*** pabelanger has quit IRC18:24
*** ir2ivps has joined #openstack-infra18:24
*** jcoufal has quit IRC18:24
*** pabelanger has joined #openstack-infra18:27
clarkbif we want functional tests we can just spin up an apache with some files18:28
*** bswartz has quit IRC18:28
clarkbor are you wanting swift from devstack?18:28
clarkbmtreinish: its to test os-loganalyze18:28
clarkbtrying to understand what devstack gives us and a swift install is the only thing I can think of18:29
*** dustins has joined #openstack-infra18:29
mtreinishah, I see now. Misread your original comment. Yeah I think it would be just to provide swift (and probably keystone too)18:30
krotscheckclarkb: I have completely forgotten about the status of your vinz pitch. Any updates there?18:30
clarkbkrotscheck: gave it last night think it went ok18:31
clarkbkrotscheck: should know in a week or two if it will go beyond that18:31
krotscheckclarkb: What's the competition look like?18:31
clarkbkrotscheck: mozilla had a firefoxos thing and a research group had some high performance db stuff18:31
*** achanda has quit IRC18:34
*** achanda has joined #openstack-infra18:35
*** guest_____ has quit IRC18:35
mtreinishclarkb: http://logs.openstack.org/76/178276/1/check/gate-subunit2sql-python27/3f03cbc/console.html#_2015-04-28_16_38_00_670 that failed trying to run the first migration. Could we have failed a db cleanup on the test box before?18:36
mtreinishalthough I thought I made the tests drop existing dbs before it ran18:36
clarkbmtreinish: the test machine should be a pristine new VM just for you without any stale dbs18:36
mtreinishclarkb: hmm, then I have no idea18:37
*** Longgeek has quit IRC18:37
mtreinishoh, unless it's trying to upgrade twice. That would be something18:37
mtreinishoh, yep that's what it is, it's trying to use mysql on the postgres test18:38
mtreinishI bet it hasn't been running postgres this whole time and it was just racing, and if mysql was second it would delete the db from the postgres test...18:38
openstackgerritMichael Krotscheck proposed openstack-infra/storyboard-webclient: Update to search UI.  https://review.openstack.org/17800318:39
openstackgerritMichael Krotscheck proposed openstack-infra/storyboard-webclient: Renamed result-set-size directive  https://review.openstack.org/17800418:39
openstackgerritMichael Krotscheck proposed openstack-infra/storyboard-webclient: Result set paging update.  https://review.openstack.org/17800518:39
openstackgerritAndreas Jaeger proposed openstack-infra/project-config: Fix mistyped requrieChangeId in Gerrit ACLs  https://review.openstack.org/17830718:39
*** yfried is now known as yfried|afk18:40
*** tjones1 has quit IRC18:41
*** pabelanger has quit IRC18:42
*** xyang1 has joined #openstack-infra18:44
*** krtaylor has quit IRC18:46
*** Rockyg has joined #openstack-infra18:49
*** markvoelker has joined #openstack-infra18:49
mordredAJaeger: I LOVE that requrieChangeId is misspelled in that commit message18:50
*** med_ has quit IRC18:50
lifelessis there some performance issue within nodes atm? my pbr changes are timing out18:50
lifelessthey weren't while they were being developed18:50
fungiAJaeger: thanks for updating that18:51
*** med_ has joined #openstack-infra18:51
openstackgerrityolanda.robla proposed openstack-infra/jenkins-job-builder: Added parallelization options  https://review.openstack.org/7551418:51
*** med_ has joined #openstack-infra18:51
*** weshay has joined #openstack-infra18:52
*** pabelanger has joined #openstack-infra18:52
openstackgerritJoe Gordon proposed openstack-infra/devstack-gate: Explicitly say when job times out  https://review.openstack.org/17833018:52
*** pabelanger has quit IRC18:53
*** markvoelker has quit IRC18:54
*** hdd has joined #openstack-infra18:54
*** lucap has joined #openstack-infra18:56
*** lucap has quit IRC18:56
*** lucap has joined #openstack-infra18:57
*** lucap has quit IRC18:58
clarkbjhesketh: you were just here a few hours ago :)18:58
jheskethyep, back for the infra meeting18:58
clarkblifeless: looking at a job in #openstack-nova with jogo and if your jobs are timing out on hpcloud I think this is another symptom of hpcloud to rax connectivity not so great right now18:58
jheskethclarkb: how are the systems going?18:58
*** lucap has joined #openstack-infra18:58
*** dannywilson has quit IRC18:59
clarkbjhesketh: things are much better but we are still seeing some hpcloud failures18:59
*** e0ne has joined #openstack-infra18:59
clarkbjhesketh: seem mostly due to performance to git.o.o leading to timeouts19:00
*** tiswanso_ has quit IRC19:00
fungimeeting time19:00
*** yfried|afk is now known as yfried19:00
jheskethclarkb: still? so the noisy neighbour hasn't been dealt with, or are there more issues19:00
*** hyakuhei has joined #openstack-infra19:01
clarkbjhesketh: noisy neighbor was pypi mirror, I think we are seeing something different with this19:01
jheskethhmm, okay19:01
jheskethclarkb: are we still leaning towards DoS'ing ourselves?19:02
*** tonytan4ever has quit IRC19:02
yolandazaro, i was able to test that problem with parameter names, but i'm having some issues connecting with jjb to my master, it complains about json19:02
yolandait just needs an admin user/pass, and jenkins url, right?19:02
lifeless2015-04-28 09:59:54.642 | Building remotely on devstack-trusty-hpcloud-b5-2341338 (devstack-trusty) in workspace /home/jenkins/workspace/check-pbr-installation-dsvm19:02
lifelessclarkb: sounds like19:03
clarkbjhesketh: we might be, I need to go look at cacti graphs19:04
openstackgerritJoe Gordon proposed openstack-infra/elastic-recheck: Expand fingerprint for git fetch error  https://review.openstack.org/17833819:04
zaroyolanda: yes, but i don't use user/password. i disable security on my jenkins master19:05
jogoclarkb mriedem: expanded fingerprint for that bug ^19:05
jogoit has a *lot* of hits19:05
mriedemjogo: looking19:05
zaroyolanda: maybe you should try connecting and doing a simple get_job() with python-jenkins?19:06
*** krtaylor has joined #openstack-infra19:06
*** bswartz has joined #openstack-infra19:06
*** jaypipes has joined #openstack-infra19:06
yolandazaro, yes, i need to dig more19:06
yolandai had that working months ago19:06
*** hdd has quit IRC19:07
*** med_ has joined #openstack-infra19:08
*** med_ has quit IRC19:08
*** med_ has joined #openstack-infra19:08
*** lucap has quit IRC19:08
mriedemnvm, just buried19:09
mriedemgoes from 693 -> ~3 million hits19:09
jogothe second part may just always happen19:10
*** yfried is now known as yfried|afk19:10
jogobetter query coming soon19:11
openstackgerritJulia Kreger proposed openstack-infra/shade: Add Ironic machine power state pass-through  https://review.openstack.org/17228419:11
jklareAJaeger: ping, you got a moment?19:11
openstackgerritJoe Gordon proposed openstack-infra/elastic-recheck: Expand fingerprint for git fetch error  https://review.openstack.org/17833819:11
clarkbjhesketh: usually yes, that way it ends up in the meeting notes summary for easy retrieval19:11
openstackgerritJulia Kreger proposed openstack-infra/shade: Enhance error message in update_patch  https://review.openstack.org/17798519:11
jogoclarkb mriedem: that should be a bit better19:11
*** emagana has quit IRC19:11
jheskethclarkb: right, but it's okay for non-chair people to do?19:12
clarkbjhesketh: I think so, I have definitely done it when not chair19:12
openstackgerrityolanda.robla proposed openstack-infra/jenkins-job-builder: Added parallelization options  https://review.openstack.org/7551419:12
*** hyakuhei has quit IRC19:14
*** dprince has quit IRC19:15
*** dprince has joined #openstack-infra19:15
openstackgerritMerged openstack-infra/system-config: Adding openstack-horizon to statusbot channels  https://review.openstack.org/17452619:16
*** med_ has quit IRC19:17
*** emagana has joined #openstack-infra19:18
*** tiswanso has joined #openstack-infra19:18
*** tonytan4ever has joined #openstack-infra19:18
*** emagana has quit IRC19:19
*** emagana has joined #openstack-infra19:19
morganfainbergmordred: how much revolting will I have if I try and add non-python language bindings into a lib we maintain (e.g. keystoneauth) once we have the python implementation super awesome. thinking future looking19:21
*** dboik has joined #openstack-infra19:21
*** med_ has joined #openstack-infra19:21
*** med_ has quit IRC19:21
*** med_ has joined #openstack-infra19:21
mordredmorganfainberg: I'm not sure I understand your question19:21
mordredmorganfainberg: do you mean "what if you created a repo that had a rust library for doing keystone auth?"19:22
morganfainbergmordred: thinking python keystoneauth is solid, we now build a mirror functionality for <Language X>19:22
morganfainbergand want to keep it in the same tree, because $reasons$ for release management19:22
morganfainbergand forcing functionality to be ... smart or something about language things19:22
morganfainbergrust, go, c++, node19:22
mordredmorganfainberg: at a time that I'm not in the infra meeting19:23
mordredmorganfainberg: I'd like to dig further into $reasons$19:23
morganfainbergmordred: figured async ;)19:23
morganfainbergmordred: some other time.19:23
mordredbecause I'm not sure I agree that the urge for a co-located repo will outweigh the strangeness of such a beast19:23
mordredhowever, I could be wrong19:23
mordredmorganfainberg: but I think it's a great idea and we should do it19:23
mtreinishmorganfainberg: what wouldn't work about doing that?19:24
mtreinishI feel like it would just work19:24
morganfainbergmtreinish: was more of a "community revolt" thing not a "technology revolt"19:24
morganfainbergre the question19:24
mordredmorganfainberg: the community is going to need to learn to not revolt19:24
mordredmorganfainberg: it turns out there are other languages in the world19:24
morganfainbergi agree.19:24
mtreinishoh, heh I wouldn't revolt, as long as you added fortran bindings :)19:25
* morganfainberg will circle up later.19:25
mordredmorganfainberg: woot19:25
* morganfainberg also waits till post -infra meeting to chase down the "get keystoneauth into gerrit" stuff :)19:25
greghaynesyolanda: here is the DIB testing patch stack https://review.openstack.org/#/c/178040/19:27
yolandathx, looking19:27
*** eharney has quit IRC19:28
*** markvoelker has joined #openstack-infra19:28
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/shade: Do not cache unsteady state images  https://review.openstack.org/17749419:28
SpamapSclarkb: ^^ your concerns about container addressed19:29
*** markvoelker has quit IRC19:29
*** markvoelker has joined #openstack-infra19:29
yolandai was looking at something more integrated like test devstack on the nodes19:31
lifelessmorganfainberg: just as a data point19:31
yolandabut this approach can be used for another kind of testing19:31
morganfainberglifeless: yeah19:31
morganfainberglifeless: will explore all that when we have a stable interface19:32
morganfainberglifeless: alternative, rewrite in c and swig! (don't hurt me...)19:32
*** rfolco has quit IRC19:32
*** mestery has quit IRC19:34
*** mestery has joined #openstack-infra19:39
openstackgerritMatthew Treinish proposed openstack/requirements: Bump tempest-lib min version  https://review.openstack.org/17836219:43
*** dustins has joined #openstack-infra19:45
sdagueclarkb: I want a working swift in there as well19:45
sdaguegiven that the bulk of the current complexity involves the swift redirect19:45
sdagueI also want this available in devstack just for people to use19:45
sdagueI guess it's fine if I'm outvoted, but it seems like writing a devstack plugin for this is simpler than the rest of the setup for a bare node from scratch19:46
*** asahlin is now known as asahlin_afk19:46
clarkbsdague: ya it just seems odd since its not really an openstack service or driver or plugin19:49
sdagueclarkb: so instead we'll have a different environment that configures apache & swift from scratch?19:50
*** hyakuhei has quit IRC19:50
mordredclarkb, sdague: I missed the context - what's it testing?19:51
sdaguebecause we had an epic fail deploy the other day19:51
clarkbsdague: no I am not suggesting that, I was trying to understand why you thought it would be a good devstack plugin19:51
sdaguemordred: exactly19:51
clarkbsdague: I would actually suggest we have puppet deploy it19:51
clarkbsince we already have that in place19:51
*** changbl has joined #openstack-infra19:52
clarkbrather than write a new thing to have devstack do it19:52
sdagueclarkb: we can pull that bit in19:52
sdaguebut I still don't see how that gets us a working swift to develop against19:52
clarkbit doesn't19:53
*** openstackstatus has joined #openstack-infra19:53
*** ChanServ sets mode: +v openstackstatus19:53
clarkbsdague: but we could do: `stack.sh` with only swift enabled then run puppet19:53
sdaguebut if we do it this other way, we also get this enabled for actual devstacks19:54
sdaguewhich *has* been asked for a number of times19:54
notmynamewhat's the goal you're looking for with a "working swift to develop against"?19:54
EmilienMnibalizer: here we go: http://logs.openstack.org/26/178026/3/check/gate-infra-puppet-apply-precise/d085597/console.html#_2015-04-28_18_01_34_65319:54
*** gyee has quit IRC20:00
yolandazaro, looks like i need to fight a bit more with my jenkins master20:00
clarkbsdague: ok I wasn't aware of that20:00
yolandawould you mind testing on your environment? unit tests passing now, and changes addressed20:00
clarkbsdague: I thought people wanted a utility to operate on logs in a similar way but not a hosted service20:00
clarkbsdague: basically smart grep20:01
zaroyolanda: so how did you consolidate the name(s) param?20:01
openstackgerritJeremy Stanley proposed openstack-infra/bindep: Add positive/negative tests exercising the parser  https://review.openstack.org/17837820:01
openstackgerritJeremy Stanley proposed openstack-infra/bindep: Allow hyphens in profile strings  https://review.openstack.org/17837920:01
sdagueso, there are those people as well20:01
sdaguebut the reason the actual wsgi toy server got added was for the devstack case, and it would be better to just be there20:01
yolandazaro, left it for another change, i found it's a bit confusing20:01
yolandadon't want to mess in that parallelization change20:01
clarkbnibalizer: so I think template.pp is already naturally split into sections20:02
yolandaas some commands have "name", not only the test one, but the delete, for example20:02
clarkbnibalizer: it will be significantly easier to review if we move a section at a time20:02
yolandazaro, i fixed my error on name/names for update and pushed again20:02
zaroyolanda: ok, good idea.20:02
morganfainbergpleia2: mind helping me understand what I am doing wrong in http://logs.openstack.org/96/175596/6/check/gate-project-config-layout/23853ce/console.html#_2015-04-28_17_46_18_43620:02
morganfainbergpleia2: the error is... uhm20:02
morganfainbergnot particularly specific/verbose20:03
zaroyolanda: will run my test again.20:03
clarkbnibalizer: and ssh keys (which were the trouble last time)20:03
clarkbnibalizer: and so on20:03
fungimorganfainberg: Job keystoneauth-docs not defined20:03
morganfainbergfungi: huh20:03
pleia2morganfainberg: 2015-04-28 17:46:18.436 | Job keystoneauth-docs not defined20:04
morganfainbergoh oh20:04
pleia2oh, fungi beat me to it20:04
morganfainbergzuul layout needs to be changed not just project20:04
pleia2yeah, forgot to note that earlier, sorry about that20:04
yolandaasselin, so we should meet to collaborate on that for sure. I saw you are working on some changes and i don't want to overlap with you, is it ok if i continue moving the functionality i can see, to modules?20:04
fungimorganfainberg: AJaeger's comment on that change tells you what's missing20:04
asselin_yolanda, yes, we should coordinate that20:05
morganfainbergfungi: https://review.openstack.org/#/c/175596/ didnt see a comment from AJaeger20:05
asselin_yolanda, https://storyboard.openstack.org/#!/story/2000101 are the stories20:06
morganfainbergahhhh must be browser cache20:06
morganfainbergrefreshed like 5 times and now it appeared20:06
*** zz_dimtruck is now known as dimtruck20:06
asselin_yolanda, I'd like to either move to modules first, then move to openstackci, or move to openstackci, and move from there to the modules20:06
morganfainberguntil we populate the doc dir with the data20:07
pleia2yeah, there are no docs in the doc/ directory20:07
pleia2I don't actually know what the build job will do with nothing to build, but it seems silly to run it20:07
*** otter768 has joined #openstack-infra20:08
asselin_yolanda, the other way to coordinate is for you to e.g. focus on the non-openstackci portions first.20:08
yolandaasselin, ok, i've been working more on the base items, but now i'm starting to move efforts to components, i'll work according to the stories20:08
morganfainbergpleia2: let me check to see what it does20:08
morganfainbergpleia2: i do expect to have docs populated as one of the first commits fwiw20:08
pleia2morganfainberg: ah, good to know20:09
morganfainbergpleia2: it just was something i wanted through gerrit not by hand20:09
*** fawadkhaliq has quit IRC20:09
* pleia2 nods20:09
morganfainbergpleia2: since i need jamielennox 's brain for it20:09
morganfainbergpleia2: if we need to put stubby docs in to make it merge other code it is at least incentive to do so20:09
yolandaanyway, EOD for today, bye20:10
* morganfainberg runs tox -edocs locally to see what happens20:10
*** thinrichs has joined #openstack-infra20:10
morganfainbergyeah this would fail20:10
pleia2thanks for checking20:11
*** otter768 has quit IRC20:12
fungimorganfainberg: pleia2: great point. if the project lacks docs, you don't need doc jobs20:12
pleia2AJaeger: ^^20:13
fungii assumed you were trying to add them instead20:13
*** lucap has quit IRC20:13
morganfainbergand we will add the docs back in [previous version of that review]20:13
morganfainbergpleia2, fungi, AJaeger: ^ that should be back to normal, and docs should now be populated in my github repo20:14
morganfainbergand now... i need to go back to the hotel.20:15
openstackgerritJoe Gordon proposed openstack/requirements: Require flake8 2.4.0  https://review.openstack.org/15798520:15
jamielennoxmorganfainberg: i'd prefer you just did a stub for docs, i want to start scratch for keystoneauth20:16
morganfainbergjamielennox: we can just trash those ones once it is in gerrit.20:16
*** lucap has joined #openstack-infra20:17
*** e0ne has quit IRC20:17
*** spredzy_ is now known as spredzy20:19
*** HeOS has joined #openstack-infra20:19
*** markvoelker has quit IRC20:20
clarkbnow to review swift upload retries20:21
openstackgerritGary W. Smith proposed openstack-infra/project-config: Add manila-ui to OpenStack  https://review.openstack.org/17506320:21
*** tjones1 has joined #openstack-infra20:21
*** jamesmcarthur has joined #openstack-infra20:22
clarkbjhesketh: any reason you used xrange over range? (thinking about potential python3 compat, but we can cross that bridge if/when we get there)20:22
fungiclarkb: thanks20:22
fungixrange() is also a premature optimization for basically all but very large ranges20:23
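A minimal Python sketch of the point being made about the retry change, assuming a small fixed number of attempts. The `retry` helper and its parameters are hypothetical and are not taken from the merged upload script. It uses `range()`, which exists on both Python 2 and Python 3, whereas `xrange()` exists only on Python 2.

```python
import time


def retry(func, attempts=3, delay=0.0):
    """Call func() up to `attempts` times and return its first successful result."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    # range() works on Python 2 and 3. For a handful of attempts the list
    # that Python 2 builds here costs nothing worth optimizing away.
    for attempt in range(attempts):
        try:
            return func()
        except Exception as error:
            last_error = error
            # Only wait if another attempt is still to come.
            if attempt < attempts - 1:
                time.sleep(delay)
    # Every attempt failed, so surface the most recent error.
    raise last_error
```

With only a few iterations the memory difference between the two built-ins is negligible, which is why the portable spelling costs nothing here.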
clarkbfungi: do you want to review the swift upload retries before I approve?20:25
clarkb(or anyone else pleia2 mordred jeblair SergeyLukjanov )20:25
fungiwhich one was it again?20:25
fungii should retry to review it ;)20:25
*** rlandy has quit IRC20:25
*** markvoelker has quit IRC20:25
mordredclarkb: go for it20:26
*** jamesmcarthur has quit IRC20:27
*** jamesmcarthur has joined #openstack-infra20:28
*** teran_ has joined #openstack-infra20:29
*** samueldmq has quit IRC20:29
*** mrunge has quit IRC20:30
*** _nadya_ has quit IRC20:30
*** mrmartin has quit IRC20:32
*** teran has quit IRC20:32
*** rmcall has joined #openstack-infra20:33
*** dustins has quit IRC20:34
fungii need to remember to pep8 my changes after _all_ my edits, not just in the middle of making them20:34
openstackgerritMerged openstack-infra/project-config: Retry log upload to swift  https://review.openstack.org/17819920:35
*** e0ne has joined #openstack-infra20:36
*** peristeri has quit IRC20:37
*** changbl has quit IRC20:37
*** jamesmcarthur has quit IRC20:38
*** frobware_ has joined #openstack-infra20:38
*** dprince has quit IRC20:38
*** sarob has quit IRC20:44
*** lucap has quit IRC20:45
*** sarob has joined #openstack-infra20:45
openstackgerritClark Boylan proposed openstack-infra/devstack-gate: Remove tracing since ansible seems to be working  https://review.openstack.org/17839820:50
gary-smith_I received an error on http://logs.openstack.org/63/175063/4/check/project-config-gerrit/7916450/console.html saying my project was "not normalized". Does that mean that I cannot have multiple groups with the same access?20:50
*** markvoelker has joined #openstack-infra20:51
*** sarob has quit IRC20:52
jheskethclarkb: no reason for xrange. I didn't realise it would be incompatible. Can redo it with range if you'd like20:53
zarofungi: thanks for the info on how to use gnupg but i'm still not sure how to apply that to the gerrit contact store. i tried setting 'appsec' in gerrit config to your gpg key but gerrit still will not start for me.20:53
*** tjones1 has quit IRC20:53
clarkbjhesketh: nah we can worry about python3 all at once20:53
clarkbjhesketh: otherwise it's a losing battle20:53
zarojhesketh: does zuul test work in your env?20:53
pleia2fungi: there seems to be consensus on this patch from folks who know about such things, but is there a way I can see it worked (aside from merging and - hey, docs showed up!) or check paths myself? re: https://review.openstack.org/#/c/17783920:54
pleia2suppose it can't really hurt right now since they are broken, but will help for future reference :)20:54
*** harlowja_ has quit IRC20:55
clarkbpleia2: you can build the infra docs and check the source path easily20:55
fungizaro: appsec is just a random string used to obscure the api call. i'll have to revisit the documentation for the contactstore feature to remind myself where pgp keys fit into this20:55
zaropleia2: we don't have a fork of zanata do we?20:55
clarkbpleia2: tox -einfra-docs I think20:55
pleia2clarkb: ah good, will do20:56
pleia2zaro: no, we're running directly from upstream20:56
jklareclarkb: do you have a minute to take another look at this one https://review.openstack.org/#/c/176674/ so we (the openstack chef people) can move forward with the gates?20:56
*** markvoelker has quit IRC20:56
*** emagana has quit IRC20:56
clarkbjklare: done20:56
jklareclarkb: amazing, ty20:57
*** emagana has joined #openstack-infra20:58
*** sarob has joined #openstack-infra20:58
*** bswartz has quit IRC20:58
zarofungi: ahh, re-reading jeblair's comment about gpg key for contact store.  he asked if i provided *puppet* a valid gpg key, i thought he meant *gerrit*.20:58
zarofungi: so i didn't use puppet to setup my test instance of gerrit at all.20:59
*** tjones1 has joined #openstack-infra20:59
*** melwitt has quit IRC20:59
*** melwitt has joined #openstack-infra20:59
lifelessgrah 404 on tcpdump in the rax ubuntu mirror20:59
lifelessw t f20:59
mordredlifeless: yeah21:00
mordredlifeless: I mean, yeah21:00
*** marun has joined #openstack-infra21:00
zarofungi: i'm guessing puppet sets up the contact store for gerrit somehow.  so i think i'm missing that critical step and that's why contact store is a no-go for me :(21:00
*** melwitt has quit IRC21:01
*** melwitt has joined #openstack-infra21:01
fungizaro: probably. hopefully i'll have time to look over it again in a bit21:02
*** gyee has joined #openstack-infra21:02
*** tiswanso has quit IRC21:03
gary-smith_clarkb: we talked last week about having multiple groups in an acl vs. creating a new group (see https://review.openstack.org/#/c/175063/). Are multiple groups prohibited?21:03
clarkbgary-smith_: apparently that check is overzealous; pretty sure you can have multiples21:05
*** hdd has joined #openstack-infra21:05
openstackgerritMerged openstack-infra/project-config: move gate-.*-chef-rake job and run it branch specific  https://review.openstack.org/17667421:05
clarkbfungi: AJaeger we have had a lot of problems with this script recently, should we maybe reevaluate what we are having it do?21:05
gary-smith_clarkb: can this change proceed with a -1 from jenkins?21:06
*** mrmartin has quit IRC21:06
*** thinrichs has joined #openstack-infra21:06
*** dizquierdo has joined #openstack-infra21:06
fungiclarkb: my opinion is that we shouldn't be testing for group name patterns in urls at all21:07
*** baoli has quit IRC21:07
clarkbgary-smith_: no, we will either need to fix the script or accommodate it in that change21:07
zarohashar: hi21:08
clarkbgary-smith_: well I think you can fix it in that change21:08
*** thinrichs has left #openstack-infra21:08
gary-smith_clarkb: ok, i'll look into that21:08
clarkbgary-smith_: I am trying to get a link to the file and where likely needs to be changed21:08
*** thinrichs has joined #openstack-infra21:08
gary-smith_clarkb: looks like it is tools/check_valid_gerrit_config.sh.21:10
*** gyee has quit IRC21:11
clarkband we may be running into gerrit and python not agreeing what a valid ini file looks like there21:12
clarkbI think python is collapsing so that you have unique keys but gerrit allows duplicates iirc21:12
gary-smith_right, gerrit allows it21:12
hasharzaro: hello!21:12
hasharzaro: I haven't been quite active on the jenkins / jjb reviews and coding recently :/21:13
*** dustins_ has quit IRC21:13
clarkbgary-smith_: so the fix there may be to not use python's config parser for the output and instead construct a file to write out?21:13
zarohashar: no, i'm not going to bother about that :)21:13
clarkbfungi: ^ any thoughts on that?21:13
zarohashar: wondering if you've been hacking on zuul lately?21:13
*** gyee has joined #openstack-infra21:14
*** tonytan4ever has quit IRC21:15
*** teran has joined #openstack-infra21:18
*** teran has quit IRC21:19
fungiclarkb: hrm... when i originally implemented that script it had its own parser because of that problem21:20
fungidoes it not still? i'm in the middle of some release-related and vmt-related discussions at the moment and don't have time to check21:21
clarkbfungi: oh maybe thats just a dict collapsing it then21:21
clarkbfungi: which would still lead to the same problem21:21
clarkbI may have completely misread this21:22
gary-smith_clarkb: figured out a workaround: re-ordered the entries in the file21:22
*** jtriley has quit IRC21:22
clarkbgary-smith_: oh! ya its just wanting things in order21:22
clarkbI guess that is what the diff is showing?21:22
clarkbfungi: ^21:23
fungiyes, it does also alpha-order (and number-sequence) the lines in each section21:23
*** ldnunes has quit IRC21:23
gary-smith_yup, I just sorted each section and now it's happy.21:23
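A rough Python sketch of the ordering behaviour described above, assuming an ini-style ACL where the same key may legitimately repeat within a section. `sort_sections` is a hypothetical helper and not the project-config tool itself, which applies further rules. It sorts the option lines inside each section as plain strings, so duplicate keys survive instead of being collapsed the way a dict would collapse them.

```python
def sort_sections(text):
    """Sort option lines alphabetically inside each [section], keeping duplicates."""
    out = []
    current = []  # option lines collected for the section being read

    def flush():
        # Emit the collected lines in sorted order, then start a new section.
        out.extend(sorted(current))
        del current[:]

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            flush()
            out.append(stripped)
        elif stripped:
            # Lines are kept as whole strings, so repeated keys are preserved.
            current.append(stripped)
    flush()
    return "\n".join(out) + "\n"
```

Comparing the output of such a function with the file as committed gives the kind of diff the check job reports when entries are merely out of order.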
clarkbI see21:23
fungiyou should also be able to just run the script against the file and have it reformat it for you21:24
*** julim has quit IRC21:24
openstackgerritGary W. Smith proposed openstack-infra/project-config: Add manila-ui to OpenStack  https://review.openstack.org/17506321:24
fungiif you pass it the path to the acl followed by all the normalization rule numbers besides 0 it will edit the file in-place for you21:24
fungi0 is a pseudo-rule which toggles modifying the file vs spitting the changed version to stdout21:25
*** spzala has quit IRC21:25
gary-smith_fungi: that's good to know21:25
gary-smith_while we're on the topic, anyone willing to review https://review.openstack.org/175063 ?21:25
fungii originally wrote it as a tool for me to normalize all the acls in that repo so that we could see which ones were able to get collapsed into fewer acls, and also to perform needed cleanup of options we no longer needed in them in a way which could easily keep up with the inevitable rebase hell reviewing those changes were bound to encounter21:26
fungiit wasn't really designed to be a linter21:26
gary-smith_The TC will be +2'ing the manila-ui project tomorrow morning, barring any unforeseen complaints: http://eavesdrop.openstack.org/meetings/tc/2015/tc.2015-04-28-20.02.log.txt @ 20:1921:28
*** HeOS has quit IRC21:29
gary-smith_gary-smith_: and we'd really appreciate having this move along. Thanks in advance!21:29
*** HeOS has joined #openstack-infra21:29
hasharzaro: since december the only zuul work I did was to package it for Debian :]21:29
hasharzaro: Wikimedia now uses .deb packages to deploy zuul!21:29
*** emagana has joined #openstack-infra21:30
hasharzaro: next steps for me are:  integrate some patches pending review,  catch up with all the changes that happened over the last few months, build a package for Debian Jessie21:30
mordredI mean21:30
hasharor was it w00t ?21:31
hasharit is a bit challenging to package for distributions that potentially have old python modules :/21:32
openstackgerritMerged openstack/requirements: Updated oslo.config to 1.11.0  https://review.openstack.org/17344921:32
*** lucap has joined #openstack-infra21:33
*** viglesias has quit IRC21:33
*** ajmiller_ has joined #openstack-infra21:34
*** viglesias has joined #openstack-infra21:34
*** jklare has quit IRC21:34
*** frobware_ has quit IRC21:34
hasharmordred: also kudos on whoever maintains diskimage-builder :]  I really like the elements concept21:35
*** dkranz has quit IRC21:36
*** erikmwilson has joined #openstack-infra21:37
*** ajmiller has quit IRC21:37
zarohashar: hmm, trying to get someone to help me figure out why zuul tests are failing for me.  i'm at a loss.21:38
hasharzaro: zuul integration tests always looked like a magic tour to me. I am trying hard to figure out the trick being used but end up resorting on James to solve it :D21:39
hasharzaro: I would push the change to Gerrit (I guess it is done) and ask on openstack-infra list for clues21:40
*** jklare has joined #openstack-infra21:40
hasharzaro: also if you are on Mac, the python version does not have poll()  since the system poll() does not work properly (I think it does not work on sockets or something like that)21:41
zarohashar: no change, it's failing from master branch. but it only fails when i run with tox.21:41
funginothing wrong with pushing a zuul patch you're working on into review and seeing if the tests do the same thing they're doing on your workstation21:41
zarohashar: i'm running on trusty21:41
hasharzaro: oh have you tried rebuilding the tox env?21:41
*** erikmwilson has quit IRC21:41
zarohashar: yes.21:42
clarkbmordred: ansible question in http://logs.openstack.org/98/178398/1/experimental/check-tempest-dsvm-neutron-multinode-full/c93ab05/logs/devstack-gate-setup-workspace-new.txt.gz#_2015-04-28_21_03_11_699 ansible reports that it ran the setup_workspace and that returned 0, if you scroll to the bottom of that log it actually says it exit 1'd21:42
clarkbmordred: any idea why that would happen?21:42
hasharzaro: what happens on the infra Jenkins slaves?21:42
clarkboh I bet tsfilter does the wrong thing21:42
fungii usually git clean -dfx before running tox locally just because there's all sorts of odd interactions some tests can have with stray files you've forgotten are there21:42
clarkbmordred: so probably not an ansible problem21:42
clarkbyup it doesn't set pipefail21:43
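A small Python sketch of the pipefail behaviour in question, assuming `bash` is on the PATH. `pipeline_status` is a hypothetical helper, and `false | cat` only stands in for a failing command piped through a filter such as tsfilter. Without `pipefail`, a pipeline reports the exit status of its last command, so a failure earlier in the pipe is hidden.

```python
import subprocess


def pipeline_status(script):
    """Run a bash snippet and return its exit status."""
    return subprocess.call(["bash", "-c", script])


# Without pipefail the pipeline reports the status of the last command (cat),
# so the failure of `false` is masked.
without_pipefail = pipeline_status("false | cat")

# With pipefail a failure anywhere in the pipeline becomes the pipeline's status.
with_pipefail = pipeline_status("set -o pipefail; false | cat")
```

This would explain a task appearing to return 0 even though the log records a non-zero exit from the underlying script.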
zarohashar: always get the failure, "failure: process-returncode [ multipart","returncode 1420"21:43
zarohashar: looks like it runs ok on jenkins slaves.21:44
zarohashar: setup with same tox version on my machine but still same failure21:44
*** derekh has quit IRC21:45
clarkbzaro: I just reran zuul tests locally and they passed21:46
clarkbalso I thought return codes were 8bits21:47
*** jklare has joined #openstack-infra21:47
*** annegentle has joined #openstack-infra21:48
zaroclarkb: when i run with tox ver 1.6.1 return code is 1420 when i run with tox 1.9.2 return code is 14221:48
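A short Python sketch of the 8-bit remark, assuming a POSIX system. `observed_status` is a hypothetical helper. A parent process only sees the low 8 bits of a child's exit status, so a value such as 1420 cannot be a real exit code and is presumably a number produced by the reporting layer.

```python
import subprocess
import sys


def observed_status(requested):
    """Start a child that exits with `requested` and return the status the parent sees."""
    child = "import sys; sys.exit(%d)" % requested
    return subprocess.call([sys.executable, "-c", child])


# On POSIX only the low 8 bits survive: 1420 & 0xFF == 140.
truncated = observed_status(1420)
```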
clarkbzaro: can you paste the output of the failing test(s)?21:49
*** tiswanso has joined #openstack-infra21:49
*** tiswanso has quit IRC21:51
zaroclarkb: http://paste.openstack.org/show/210864/21:51
*** tiswanso has joined #openstack-infra21:52
*** markvoelker has joined #openstack-infra21:52
zaroclarkb: killing my .tox dir and retrying once again but i'm very sure i've tried this already.21:52
clarkbzaro: I think your tests may have hit a timeout, have you made changes to zuul?21:52
zaroclarkb: nope, none21:52
clarkbzaro: try bumping OS_TEST_TIMEOUT=30 to a bigger number in tox.ini21:53
*** erikmwilson has joined #openstack-infra21:53
zaroclarkb: but i am running it in an xsmall flavor21:53
clarkbzaro: maybe add a zero to go to 5 minutes21:54
lifelessso tomorrow is release day right?21:54
lifelessRight after that I want to cut a pbr release21:54
clarkblifeless: ish, its thursday ttx time21:54
clarkbwhich is more like day after tomorrow21:54
zaroclarkb: alright, i'll give that a try21:54
lifelessfriday bad day for releases21:54
lifelessnext monday then21:54
*** emagana has quit IRC21:55
*** spzala has joined #openstack-infra21:56
*** spzala has quit IRC21:56
*** markvoelker has quit IRC21:56
*** dboik has quit IRC21:56
*** spzala has joined #openstack-infra21:56
hasharzaro: sorry for not being of any help :(21:57
*** jklare has quit IRC21:58
*** lucap has quit IRC21:59
*** Somay has quit IRC21:59
*** Swami has quit IRC21:59
fungiremember lifeless lives in the future22:01
fungiso for him, late tomorrow22:02
*** spzala has quit IRC22:02
fungipretty sure he's already well into his wednesday at this point22:02
lifelessmaybe you guys could cut a pbr release for me after ttx releases the servers?22:03
*** jklare has quit IRC22:03
lifelessyes, 1000 wed local time22:03
*** jklare has joined #openstack-infra22:03
fungidhellmann: ^ since it's an oslo lib would you be willing/around to take care of that?22:03
fungidims: ^ ?22:03
*** signed8bit_ZZZzz is now known as signed8bit22:03
*** spzala has joined #openstack-infra22:04
*** sarob has quit IRC22:04
*** harlowja has joined #openstack-infra22:04
dimsfungi: yes, i can. when should we cut it? (trying to parse "after ttx releases the servers")22:06
*** esker has quit IRC22:07
*** sarob has joined #openstack-infra22:07
openstackgerritClark Boylan proposed openstack-infra/devstack-gate: Set pipefail when running tsfilter  https://review.openstack.org/17843722:08
clarkbgreghaynes: ^22:08
clarkbsdague: does devstack need the change in 178437 too?22:08
clarkbI want to say tsfilter comes from devstack?22:08
dimslifeless: cool22:08
*** notnownikki has quit IRC22:08
lifelessdims: doing it before could cause havoc if something goes wrong and there's a last minute OHFUCK for the release process22:08
lifelessdims: 0.11 will be the next pbr tag - from master22:09
dimsfungi: guess we'll need this? https://review.openstack.org/#/c/175369/22:09
greghaynesclarkb: hah22:09
greghaynesclarkb: This is why I like it to just be on ;)22:09
*** melwitt has quit IRC22:09
lifelessdims: no22:09
lifelessdims: thats not to be landed until 0.11 is released22:09
greghaynesclarkb: easier though is usually to just make a subshell with it on22:09
*** hashar has quit IRC22:09
lifelessdims: (see my review comment on it)22:09
dimslifeless: workflow -1 got removed22:10
lifelessdims: thanks22:10
clarkbgreghaynes: except I would have to preserve all the things a subshell doesn't right?22:11
dimsnever mind, it's in the middle of a series22:11
lifelessdims: it can but doesn't need to - because its all testing22:11
lifelessdims: not changing the behaviour22:11
clarkbgreghaynes: like -e22:11
dimslifeless: so looks like i just need to push a button when you or fungi ping me22:12
greghaynesclarkb: export SHELLOPTS22:12
lifelessdims: yes, but I'll be asleep22:12
lifelessdims: If I was going to be awake I could just hit the button myself :)22:12
greghaynesclarkb: that code looks fine to me though22:12
clarkbgah, shellopts is what I needed22:12
clarkbpretty sure that is not mentioned in the set help22:12
lifelessdims: I'm thinking things like stable branches with setup.cfg set wrongly22:13
*** zz_jgrimm has quit IRC22:13
lifelessdims: specifying releases in the past, that sort of thing22:13
*** otter768 has quit IRC22:13
*** zz_ja has quit IRC22:13
greghaynesclarkb: yea, I missed that var too ;)22:14
greghaynesclarkb: hooray for awesome docs22:14
dimslifeless: ok. i'll be ready for it :)22:14
lifelessdims: if ttx releases very first thing his day, I'll still be up, but if its his late morning or afternoon I'll be out22:14
lifelessdims: I'll check in first thing my friday of course, to help with fixing up any projects with issues22:15
dimslifeless: understood. we'll tag team22:15
*** zz_ja has joined #openstack-infra22:16
*** zz_jgrimm has joined #openstack-infra22:16
EmilienMfungi: because it's testing puppet-vswitch with beaker and add eth0 to an OVS bridge22:17
EmilienMI'm not sure the job will timeout22:17
*** jtriley has joined #openstack-infra22:18
openstackgerritClark Boylan proposed openstack-infra/devstack-gate: Set pipefail when running tsfilter  https://review.openstack.org/17843722:19
*** bknudson has quit IRC22:19
clarkbEmilienM: all of our jobs should timeout22:20
*** peristeri has joined #openstack-infra22:20
clarkbEmilienM: do you think we lost connectivity to the node because eth0 was drafted into service for something else?22:21
EmilienMclarkb: you can stop it22:21
clarkbjenkins should notice that22:21
EmilienMI'm trying to test puppet-vswitch22:21
lifelessmorganfainberg: oh another angle (dunno if you touched on it) on multi-language one-repo is the impact on test matrices22:21
EmilienMI think i'll need to create a dummy interface22:21
*** thinrichs has left #openstack-infra22:22
*** melwitt has quit IRC22:22
*** melwitt has joined #openstack-infra22:22
EmilienMclarkb: well, I think you can stop it, so we release resources22:22
*** jtriley has quit IRC22:22
*** gordc has quit IRC22:23
*** sarob has quit IRC22:23
clarkbhrm everything is failing again, looks like git trouble22:24
*** thinrichs has joined #openstack-infra22:24
greghaynesgit clone fail (22:25
SpamapSI didn't do it22:25
SpamapSit wasn't me22:25
* SpamapS just git pulled everything and had a moment of o_O22:25
* greghaynes moves SpamapS to top of suspect list22:26
fungiEmilienM: jenkins just spotted your job doing badnez22:26
SpamapSI'm off the top?!22:26
SpamapSOr rather, I was?22:26
clarkbhttp://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=877&rra_id=all seems to be at fault22:26
* SpamapS hasn't been trying hard enough22:26
EmilienMfungi: cool22:26
clarkbEmilienM: I would prefer that jenkins kill the job on its own so that we have more time to debug issues like ^22:26
clarkbEmilienM: and if jenkins does not notice then we should debug and fix that22:27
*** Rockyg has quit IRC22:27
fungigit04 looks pretty slammed22:27
EmilienMclarkb: makes sense22:27
*** nelsnelson has quit IRC22:28
zaroclarkb: wow! that was it, timeout caused that zuul error.  thanks.22:28
clarkbfungi: we haven't in the past due to unsynchronized git replication, but I have a hunch errors related to that will be less than errors related to this22:28
fungiclarkb: yeah, after the gerrit upgrade i want to finish afs-backed git22:29
clarkbfungi: since a single git command should use the same backend, its only when we start running multiple commands22:29
*** Somay has quit IRC22:29
clarkbbut I think what happens to a node like 04 is it gets stuck servicing an expensive request then a bunch of requests pile up behind it essentially dosing it22:29
fungiit's seriously bogged down22:30
clarkbya its pegged the cpus, swapping, and generally unhappy22:30
clarkbwe should probably also double check that our routine git cleanups are running gc and repacking and all that22:31
SpamapSclarkb: whats the balancing mode now?22:31
greghaynesSpamapS: consistent22:31
SpamapSoh consistent hash of source IP?22:31
greghaynesthink so22:31
fungiSpamapS: source hash22:31
fungii'll have dinner coming off the stove momentarily and can dig deeper. guessing we have a bad actor22:31
SpamapSyeah so like, everybody at a single IBM campus gets one server?22:31
fungilike that, yeah22:32
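A rough sketch of why source-hash balancing sends everyone behind one NAT address to the same backend. The hash function, the example addresses, and every backend name other than git04 are illustrative assumptions; this is not haproxy's actual algorithm or the real server list.

```python
import hashlib

# Hypothetical backend names; only git04 is mentioned in the discussion.
BACKENDS = ["git01", "git02", "git03", "git04", "git05"]


def pick_backend(source_ip: str, backends=BACKENDS) -> str:
    """Map a client address to a backend using a stable hash of the address."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Four clients behind the same NAT present the same source address,
# so they all land on a single backend.
nat_clients = ["198.51.100.7"] * 4
assignments = {pick_backend(ip) for ip in nat_clients}
print(assignments)
```

The mapping stays stable per client, which keeps a client on one git mirror, but it also means one busy source address concentrates all of its load on one server.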
SpamapSI mean, should be fine given the scale we are talking.. but we have so much automation going on...22:32
mtreinishSpamapS: heh, it's more likely everyone at ibm gets one server...22:32
clarkbmtreinish: I would hope IBM has multiple output proxies22:32
SpamapSmtreinish: was hoping that wasn't the case. ;)22:32
fungiunfortunately we're going to need to analyze haproxy logs in multiple places to find out who's being balanced to git04 and making more requests22:33
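The kind of log analysis fungi mentions could be sketched as a per-source request count. The log line layout below is a simplified, hypothetical one (real haproxy logs carry more fields), and the addresses are documentation examples rather than anything from the incident.

```python
import re
from collections import Counter

# Matches the first "a.b.c.d:port" token on a line and captures the address.
CLIENT_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):\d+\b")


def count_sources(lines):
    """Count log lines per client address; lines without an address are skipped."""
    counts = Counter()
    for line in lines:
        match = CLIENT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines in a simplified layout.
sample = [
    "Apr 28 22:30:01 fe01 haproxy[1234]: 203.0.113.9:40112 git-daemon/git04",
    "Apr 28 22:30:02 fe01 haproxy[1234]: 203.0.113.9:40113 git-daemon/git04",
    "Apr 28 22:30:02 fe01 haproxy[1234]: 198.51.100.7:51000 git-daemon/git02",
    "Apr 28 22:30:03 fe01 haproxy[1234]: 203.0.113.9:40114 git-daemon/git04",
]
top_sources = count_sources(sample).most_common(1)
print(top_sources)
```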
SpamapSclarkb: yeah I figured as much. It's a good strategy but means we need to scale each node up and have spare space for expensive ops.22:33
clarkbSpamapS: not necessarily22:33
clarkbSpamapS: I am almost positive it isn't an IBM killing us22:33
*** Swami has joined #openstack-infra22:33
mtreinishclarkb: I'm not sure I saw some weird behavior with the outbound when at different sites22:33
SpamapSor single campus..22:33
SpamapSI was thinking it's a single operation .. which I think you're saying too yes?22:34
mtreinishI think they should have it too22:34
clarkbmtreinish: when I ran a proxy setup we had something like 32 IPs per region we balanced through22:34
SpamapSand soon everybody on that server is dos'ing everyone else.22:34
clarkbSpamapS: yes which is true with other connection methods too22:34
SpamapSno scale-back-on-unhealthy-service22:34
clarkbSpamapS: the specific problem here is we then throw more workload at it22:34
clarkbSpamapS: importantly we may have to scale up regardless of the balancing method if we decide that expensive op is important22:35
SpamapSI wonder if there is a method that does consistent hashing-esque behavior, but rebalances anything > say, 5s idle.22:35
*** stevemar has quit IRC22:35
*** ZZelle_ has quit IRC22:35
*** spzala_ has joined #openstack-infra22:36
SpamapSclarkb: also makes me wonder if there would be a way to queue up those expensive ops and handle them in a more queue-like manner.22:36
SpamapSlike oh you're doing a full clone, you'll need to sit over there while that window is busy.22:36
fungilooks like they're probably coming through fe0122:37
fungii need to timeslice this a bit first to be sure22:37
clarkbya, except I think the expensive thing here is harder to calculate, clones are cheap iirc. Expensive is when you have to construct an almost complete pack on the fly for a fetch22:37
*** whoops has quit IRC22:38
clarkbbut basically we would have to implement our own git smart protocol proxy22:38
*** spzala has quit IRC22:38
*** spzala_ is now known as spzala22:38
greghaynesand if you did that youre kind of back where you started re: using balance with a health check22:38
greghaynesbecause you wont be consistent any longer22:39
greghayneser, consistent based on client22:39
clarkbfungi: ?22:40
clarkbalso looks like puppet is running which won't help anything but shouldn't be the cause either22:40
*** davideagnello has quit IRC22:42
clarkbSpamapS: I do think that possibly dropping requests that we find to be bad is a not terrible idea22:42
clarkbSpamapS: should be able to do that without a special application proxy22:42
SpamapSwhat about dynamically adjusting server weight based on load?22:42
clarkbSpamapS: that doesn't solve the inconsistent data source problem22:43
*** davideagnello has joined #openstack-infra22:43
clarkbfungi: I can take a look in just a sec22:43
SpamapSclarkb: well.. it's solved when this isn't happening.22:43
SpamapSclarkb: and when this is happening, it becomes more likely, but still stays consistent on the servers that aren't screwed.22:43
*** zz_ja has quit IRC22:43
clarkbSpamapS: ya thats true22:43
*** mriedem is now known as mriedem_away22:44
*** emagana has joined #openstack-infra22:44
*** heyongli has quit IRC22:45
*** annegentle has quit IRC22:46
fungiadam_g: any idea what em1.rapid.canonical.com is?22:46
clarkbof course the file is in /etc/sysconfig22:46
clarkbfungi: I am going to disable puppet there so I can do this correctly and not have puppet reapply things for me in half an hour22:46
fungiclarkb: thanks22:46
fungigood thinking22:46
*** maurosr has joined #openstack-infra22:47
adam_gfungi, i feel like i did at one point but not anymore :\22:48
*** jtriley has joined #openstack-infra22:48
*** davideagnello has quit IRC22:48
*** heyongli has joined #openstack-infra22:49
greghaynesfungi: any idea if that src ip is doing a large number in parallel?22:49
*** zz_jgrimm has joined #openstack-infra22:49
greghaynesSeems like an easy thing to make stuff a little better is to just rate limit based on src ip also22:49
clarkbgreghaynes: I don't think that helps22:49
fungiclarkb: lgtm22:49
*** doug-fish has left #openstack-infra22:49
greghaynesclarkb: oh?22:49
*** Swami_ has joined #openstack-infra22:50
fungiadam_g: yeah, didn't know if you brought any tribal knowledge when you moved22:50
clarkbgreghaynes: because connection request comes in costs almost no bytes, then you ask git daemon to do X and that causes CPU to go crazy22:50
adam_gfungi, what are you seeing coming from there?22:50
clarkbgreghaynes: its not the number of connections or data transferred, its the specific request22:50
SpamapSYou could limit to something super low, like 2-3 active conns.22:50
clarkbgreghaynes: because git makes a custom pack file for you22:50
*** peristeri has quit IRC22:50
fungiadam_g: a denial of service attack against our git server farm22:50
adam_gfungi, it might be the gateway that the canonical auto package build system sits behind?22:50
fungiDaviey: jamespage: if either of you are around, what is em1.rapid.canonical.com? we're blocking it from accessing our git servers22:50
greghaynesclarkb: sure, im just looking at the paste, and theres a huge number for that ip in haproxy logs22:51
SpamapSadam_g: might also be outgoing SNAT for cloud instances. ?22:51
adam_gSpamapS, could be. em1 definitely sounds familiar22:51
SpamapSadam_g: as does rapid.22:51
clarkbfungi: ok applying that rule on fe01 now22:52
lifelessadam_g: em1 is a specific ethernet port22:52
*** markvoelker has joined #openstack-infra22:52
lifelessadam_g: rapid.c.c is probably the thing to identify22:52
lifelessadam_g: its
*** dboik has joined #openstack-infra22:53
*** Swami has quit IRC22:53
* mordred would help, but just boarded a plane22:53
fungiclarkb: be aware this seems to be hitting both fe01 and fe02 roughly evenly (that tells me it's a nat address because those are round-robin dns entries so even distribution is otherwise unlikely)22:53
lifelessoh, brad might know22:53
*** gary-smith_ has quit IRC22:53
lifelesshe should be up nowish22:54
clarkbfungi: ya doing fe02 now22:54
lifelessclarkb: fungi: also - #canonical-sysadmin is the public channel to reach the canonical sysadmins22:54
lifelessits a bit late to be ringing elmo, but they have follow-the-sun coverage these days22:55
clarkband fe02 is done now22:55
fungithough they're probably going to be popping in here any moment now that they're getting tcp resets22:55
lifelessI've pinged bradm in #canonical-sysadmin, and a couple of likely names in launchpad-dev on the offchance22:55
clarkbwhoops I lied forgot to turn off puppet on 02, fixing22:56
*** davideagnello has joined #openstack-infra22:56
*** blahdeblah has joined #openstack-infra22:56
blahdeblah\o lifeless22:57
lifelessclarkb: fungi: meet blahdeblah the Canonical sysop on vanguard atm22:57
*** ddieterly has quit IRC22:57
fungiload average: 101.87, 100.56, 104.2322:57
*** sabeen2 has quit IRC22:57
fungiwelcome blahdeblah!22:57
lifelessblahdeblah: clarkb and fungi are the ops on the openstack side22:57
*** markvoelker has quit IRC22:57
blahdeblah\h clarkb, fungi22:57
blahdeblahOr \o, even22:57
clarkbfungi: ya I don't know that this was sufficient to kill the established connections since they would already have been accepted and in the state table right?22:57
*** dboik has quit IRC22:57
blahdeblahLet me see what's going on on our end22:58
clarkbfungi: but I think haproxy can do that for us, looking now22:58
SpamapSclarkb: you didn't use state in the REJECT rule you pasted22:58
SpamapSclarkb: so they'd get an icmp reject no matter what state they were in22:58
fungiblahdeblah: bradm: in summary, we're seeing a ton of git requests bound for git.openstack.org from
greghaynesclarkb: conntrack -D can too22:58
greghaynesoh, SpamapS is right I think22:58
lifelessclarkb: fungi: the git processes will eventually block on full tcp buffers and then stop causing IO22:58
clarkbSpamapS: oh good I am not as bad at iptables as I thought22:58
lifelessuntil tcp times out they won't actually go away though22:58
blahdeblahfungi: I'm pretty sure that's the firewall behind which our OS lab lives22:59
lifelessyou need an outbound reject rule to cause them to die early22:59
blahdeblahJust checking that out now22:59
lifelessblahdeblah: cjwatson said a VPN endpoint for misc stuff amongst other things22:59
fungiblahdeblah: thanks. we're in the process of (or already have) blocked it at our load balancer to lessen the impact we're seeing for now22:59
bradmrapid.canonical.com is a firewall with lots of labby type stuff behind it22:59
SpamapSas usual.. no matter the safeguards.. evil escapes the lab to rampage in the village23:00
bradmI think the OIL stuff is behind it, that seems like a possibility23:00
blahdeblahWhen did you see the excessive traffic start?23:00
bradmbut I'll just make random unconfirmed observations and let blahdeblah do the actual work. ;)23:01
blahdeblahbradm: ssshhhh23:01
lifelessok so my job is done, folk hooked up, I'm -> pip internals23:01
*** wenlock_ has quit IRC23:01
clarkbnetstat says my iptables rule was plenty23:01
fungibradm: blahdeblah: rough bisection of our log analysis suggests it started up at 21:50 utc today (a little over 2 hours ago)23:01
blahdeblahI'm just going to reset counters on our end so I can see what's causing it23:02
lifelessclarkb: sure, just means that load, which is 'blocked processes' will be held somewhat higher for up to 30m, even though actual /io/ and /cpu/ use will drop almost immediately23:02
lifelessclarkb: I wasn't suggesting your rule was wrong, more explaining why load could still be high23:03
fungifor reference, eating a salad and typing are more or less mutually exclusive activities. i did not know this23:03
lifelessfungi: you need the salad mounted au natural near your head23:03
fungilifeless: i need to invent a typing trough23:04
lifelessfungi: ewwwww23:04
mordred mmm. trough23:05
blahdeblahclarkb, fungi:23:05
blahdeblahwhat's the IP address we're hitting?23:05
*** peristeri has joined #openstack-infra23:06
clarkbblahdeblah: and
fungiblahdeblah: rr dns between23:06
fungiyeah those23:06
clarkbblahdeblah: its a DNS round robin (git.openstack.org)23:06
fungiseems to be pretty evenly distributed, so i'm assuming multiple sources behind that nat23:06
*** bswartz has joined #openstack-infra23:06
*** signed8bit has quit IRC23:08
*** weshay has joined #openstack-infra23:09
fungialso i mathed badly. 21:50 utc was a little over an hour ago23:09
clarkb2050 according to cacti23:09
*** Somay has joined #openstack-infra23:10
fungiyes, indeed23:10
fungiit looks like it was mostly coming through fe01 originally and then split23:11
*** dboik_ has quit IRC23:11
fungiso a little over two hours is right23:11
fungii should finish this salad. i hear it's brain food23:11
*** dboik has joined #openstack-infra23:11
*** spzala has quit IRC23:12
lifelessfungi: EWUT23:13
lifelessfungi: if its a fish and egg salad, then yes..23:14
fungii think it's just people who want me to eat salad claiming that23:14
lifelessfungi: oils and proteins are brain food :)23:14
fungiclarkb: the cacti graph says we either got really lucky or that block rule has fixed us23:14
clarkbfungi: pretty sure the block rule is working23:16
mordredmmm. oil salad23:16
fungilifeless: i'll tell my wife that's why i prefer to eat tasty dead animals23:16
fungiload average: 0.75, 4.19, 35.2423:17
fungithat's pretty quick23:17
*** tjones1 has left #openstack-infra23:18
fungiand the 5-min avg hasn't even bottomed out yet23:18
clarkbit looks like we could use haproxy's httpchk, run a http server on a separate port (or mod rewrite it I suppose), then 404 whenever load exceeds some threshold23:19
clarkbthat should take the server out of the rotation based on health checks23:19
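A minimal sketch of the health-check idea clarkb outlines: a tiny HTTP endpoint on a separate port that answers 200 while the load average is acceptable and 404 once it crosses a threshold, so an HTTP health check would mark the server down. The threshold, port, and all names here are hypothetical, and os.getloadavg() assumes a Unix host.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

LOAD_THRESHOLD = 20.0  # hypothetical cut-off, to be tuned per server


def status_for_load(one_minute_load: float, threshold: float = LOAD_THRESHOLD) -> int:
    """Return the HTTP status the health endpoint should answer with."""
    return 200 if one_minute_load <= threshold else 404


class LoadCheckHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer based on the current 1-minute load average.
        self.send_response(status_for_load(os.getloadavg()[0]))
        self.end_headers()

    # Health checkers may probe with HEAD or OPTIONS instead of GET.
    do_HEAD = do_GET
    do_OPTIONS = do_GET

    def log_message(self, format, *args):
        pass  # keep the health endpoint quiet


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) the health-check server."""
    return HTTPServer(("", port), LoadCheckHandler)
```

Calling `make_server().serve_forever()` would start it. The sketch only takes an overloaded server out of rotation; it does nothing about the requests that caused the overload.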
fungidoesn't really help for something like this though23:19
fungiit just ends up strafing the load across the farm server by server knocking them offline as it goes23:19
*** zz_ja has joined #openstack-infra23:19
openstackgerritClint 'SpamapS' Byrum proposed openstack-infra/shade: Add functional tests for create_image  https://review.openstack.org/17845223:20
SpamapSmordred: ^ works against devstack23:20
*** ayoung-mtg has quit IRC23:21
SpamapSmordred: have not tried against rax because I just realized I don't have an account to play with on rax23:21
mordredv1 and v2?23:21
SpamapSdo they even give you image upload there?23:21
mordredget one23:21
*** zz_jgrimm has joined #openstack-infra23:21
mordredthey're v223:21
mordredHP is v123:21
SpamapSmordred: I am just now looking at how to make devstack do v223:21
greghaynesSpamapS: youll need a swift23:21
greghaynesfor added fun23:22
mordredrax v2 specifically is 'upload to swift and do task-create import'23:22
lifelessmailman 3 released23:22
clarkbfungi: but we can also rate limit with haproxy, not sure if our haproxy is new enough though23:22
fungimordred has apparently figured out the magic words needed to explain to hpcloud why you need to expense an account with their competitor23:22
SpamapSlifeless: prepare for the apocalypse?23:22
fungilifeless: no!23:22
lifelessfungi: yes!23:22
* fungi looks around for horsemen23:22
greghaynesfungi: well, I havent submitted the expense report yet23:23
mordredupgrade all the things now!!!23:23
greghaynesfungi: so the jury is still out on whether he has ;)23:23
fungiwe get a debian release, an openbsd release, an openstack release _and_ the fabled mailman 3 release all in one week23:23
blahdeblahclarkb, fungi: found the culprit; blocking it momentarily on our end, and will work out whom to contact thereafter23:23
fungiblahdeblah: much appreciated23:23
*** armax has quit IRC23:23
mtreinishfungi: wait I can do that? mordred me want23:23
pleia2lifeless: their mailman3.org website has been an amusing journey23:24
fungiblahdeblah: feel free to send them here and we can quite possibly help them work out a solution to whatever it is they're trying to implement which will be less impactful23:24
blahdeblahfungi: will do23:24
pleia2but looks like they stopped maintaining it, so sad23:24
*** hemna is now known as hemnafk23:25
lifelessblahdeblah: or perhaps nat them across a wider set of IPs ?23:25
lifelessblahdeblah: AIUI it wasn't total load, it was load-from-one-IP that caused us grief23:25
blahdeblahlifeless: I'm sure we can convince them just to be better behaved23:25
clarkbI think it may have even been load from a single request23:26
clarkbbecause you'll see git processes servicing the request use all the memory then things dig into swap and we all have a sad time23:26
fungilifeless: well, in this case it was also a lot of requests in total even if we did spread them across the whole farm. it may not have been as destructive if that were to happen, but we'd be staring down danger there regardless23:26
*** bknudson has joined #openstack-infra23:27
fungiin this case they were performance-bottlenecked on the one server they were being served from. if they had all 5 to answer requests we might have seen their workload expand to fill the entire cluster23:28
lifelessthe shadow knows23:28
fungithe weed of crime bears bitter fruit--crime does not pay!23:28
*** dims_ has joined #openstack-infra23:29
blahdeblahclarkb, fungi: OK - that IP is blocked on our end now; when you are comfortable doing so, please open up access again - feel free to rate-limit, or I can enforce a rate-limit on our end if you prefer23:30
SpamapSgreghaynes: ^23:31
fungimtreinish: well, they didn't have television when those serials originally ran23:31
mtreinishfungi: hehe23:31
*** erlon has quit IRC23:31
*** dims has quit IRC23:31
fungiblahdeblah: thanks! we'll do so soon probably23:31
*** jogo has quit IRC23:33
*** jogo has joined #openstack-infra23:34
*** wenlock has quit IRC23:36
*** pblaho_ has joined #openstack-infra23:37
fungimordred: nifty--which airline did you manage that on?23:38
*** dtantsur|afk has quit IRC23:39
*** pblaho has quit IRC23:41
mordredfungi: delta23:42
mordredfungi: they're rolling it out fleet-wide23:42
*** ashleighfarnham has quit IRC23:42
clarkbso we already do set maxconn on the haproxy listen directive for git protocol23:43
clarkbits set to a very conservative 3223:43
clarkbthis isn't a per server setting though so doesn't help a ton when we direct load to a specific server23:43
clarkbso while the canonical IP may have made a lot of requests only a total of 32 should've been serviced at one time aiui23:44
*** pblaho_ has quit IRC23:45
*** dtantsur has joined #openstack-infra23:46
clarkbin fact http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=880&rra_id=all confirms that23:46
blahdeblahclarkb, fungi: Logging a ticket with the internal owners of this system - just to clarify, was it excessive traffic causing the issue, or the nature of the git activity?23:46
clarkbblahdeblah: I am beginning to think it was the nature of the git activity23:46
clarkbblahdeblah: the above graph shows that we only had just over 30 connections at one time23:47
blahdeblahDid I see someone mention it exhausted swap?23:47
clarkbblahdeblah: not exhausted swap, but swapping23:47
fungiwell, the sort of git requests being made, but also the proportion of requests from that ip address was nearly an order of magnitude higher than any other single source23:47
clarkbblahdeblah: I think what happens is that in some circumstances git requests are far more expensive than normal, when that happens we chew up the available memory on our mirrors and swap23:47
clarkbfungi: yup, but even then we never exceeded 40 concurrent connections23:48
clarkb(its an average though)23:48
clarkbso I think it was sustained expensive operations23:49
clarkbmaybe because they were failing so retries happened?23:49
clarkbwould be curious to know what the other end was attempting to do (yay git and its terrible logging)23:49
fungiright, this was via git:// protocol in this case, so we don't really even have apache access logs to go on23:50
blahdeblahOh, nice; it definitely seems like it was playing nasty: http://cacti.openstack.org/cacti/graph.php?local_graph_id=878&rra_id=all23:50
nibalizerdoes this look right? http://puppetboard.openstack.org/report/lists.openstack.org/ddd0f8e63111e5e0b8588cbdb3f527fd6b39dcd2 pleia2 ?23:50
fungiyep. did a doozy on the git04 server23:50
clarkbnibalizer: I want to say exim doesn't use aliases like that?23:51
fungiblahdeblah: the 5-minute load average graph is even more impressive23:51
*** ddieterly has joined #openstack-infra23:51
blahdeblahquite distinctive, no? :-)23:52
clarkbnibalizer: was there a recent change that went in related to that?23:52
pleia2nibalizer: I think clarkb is right, so it shouldn't matter, but how did that happen?23:52
clarkbpleia2: ya curious to know why it changed, I wonder if the puppet provider for mailing lists was updated?23:52
mordreddid we land the puppet provider patch?23:53
clarkbmordred: I do not know what patch that is23:53
mordredoh - this is actually what that patch is intended to fix IIRC23:54
clarkbmordred: have any more hints so I can go looking for it?23:54
mordredone sec - link coming23:54
mordredthe problem is - the puppet maillist provider thinks it should add an alias to /etc/aliases23:55
*** esker has joined #openstack-infra23:55
mordredand manages that file itself23:55
mordredso we wrote a provider to make it stop fighting23:55
clarkboh I thought exim ignored aliases completely23:55
clarkband had its own mapping23:55
fungiclarkb: mordred: we've got an exim transport that routes based on what mailman db files it finds, yeah?23:55
mordredfungi: that's right23:55
mordredclarkb: just for mailman23:55
clarkbmordred: gotcha23:56
mordredclarkb: it uses them for non-mailman23:56
mordredthus the confuse23:56
mordrednibalizer: we had talked about fixing upstream at some point too, iirc23:56
*** imcsk8 is now known as imcsk8|afk23:56
clarkbmordred: so basically exim updates to what it wants then puppet updates and they go back and forth23:56
fungithat's, like, the recommended way to do mailman+exim. whereas /etc/aliases is the usual deployment for something like mailman+sendmail23:56
clarkbdoesn't affect mailing lists but is annoying in puppet23:56
mordredI haven't landed it because I haven't wanted to watch it to make sure it doesn't break23:57
mordredand it's not THAT important23:57
fungithough you could probably emulate the mailman exim transport configuration with a sendmail milter23:57
fungibut i wouldn't want to23:57
mordredgood golly no23:58
clarkbmordred: also you have feedback to address on that change23:58
mordredI do?23:58
* fungi hung up his sendmail hat a long, LONG time ago23:58
clarkbmordred: you do23:58
blahdeblahWell, that was a fun start to the day - thanks clarkb, fungi, lifeless; I'll let you know if I hear anything further from the team running this system.23:58
clarkbblahdeblah: thank you for the quick response23:59
mordrednibalizer: any chance you want to just fix it? your comments make sense to me, but we're already WELL past my ruby comfort zone23:59
clarkbfungi: should I go ahead and remove our rule?23:59
mordrednibalizer: so I'll be just blindly doing whatever you and crinkle tell me23:59
clarkbI can just turn puppet back on23:59
lifelessblahdeblah: thanks23:59
mordredthanks blahdeblah !23:59
clarkbmordred: I think you can just remove that function23:59
