Thursday, 2014-06-12

*** casanch1_ has quit IRC00:01
*** julim has quit IRC00:05
*** dkehn_ has joined #tripleo00:18
*** dkehn_ is now known as dkehnx00:19
BadCubgreghaynes: Hi Greg00:21
BadCublifeless: thanks for the intro :-)00:22
* BadCub needs to ponder dinner and excedrin00:24
*** matsuhashi has joined #tripleo00:27
*** nosnos has joined #tripleo00:32
*** yamahata has joined #tripleo00:33
*** noslzzp has joined #tripleo00:41
lifelessStevenK: coverage - did you see? try test --coverage --coverage-package-name ...00:43
lifelessStevenK: we'll get the testr alias added in a new release but that ^ should work right now00:43
lifelessSpamapS: ^00:44
lifeless-> C's gym class00:44
SpamapSlifeless: fix all the things. :)00:49
*** saurabhs has quit IRC00:57
*** CaptTofu_ has quit IRC01:01
*** CaptTofu_ has joined #tripleo01:02
*** CaptTofu_ has quit IRC01:06
tchaypoSo I contacted support for my keyboard a few days ago to ask if there was a firmware upgrade I could apply to try to make it not reset itself a couple of times a day01:21
tchaypoafter checking to see if it might just be low on charge they've very swiftly moved on to "it seems that it's not charging correctly, I've put you in touch with our RMA department so we can organise a replacement for you"01:22
tchaypothat's more than I expected01:22
*** nati_ueno has quit IRC01:30
mordredlifeless: your patch still confuses me01:32
tchaypoI've just realised that the review time stats refer to the median, the 3rd quartile... and the 1rd quartile.01:38
openstackgerritSteve Kowalik proposed a change to openstack/os-cloud-config: Wrap register-nodes CLI in try/except
*** rwsu has quit IRC01:42
tchaypoStevenK: are you just making our review backlog worse?01:43
StevenKShall I stop coding? :-P01:44
tchaypodon't stop coding01:48
tchaypojust start reviewing01:48
vinshgreghaynes, soooo I posted a comment in:
vinshmy changes to listen to glance-registry on localIP work.01:59
vinshWill have to see what Jan thinks.01:59
lifelessmordred: ok, which one?02:14
*** noslzzp has quit IRC02:18
*** CaptTofu_ has joined #tripleo02:23
*** CaptTofu_ has quit IRC02:28
*** eghobo has quit IRC02:36
*** CaptTofu_ has joined #tripleo02:38
openstackgerritClint 'SpamapS' Byrum proposed a change to openstack/os-collect-config: Make heat the default collection method
openstackgerritClint 'SpamapS' Byrum proposed a change to openstack/os-collect-config: Cache auth_ref from keystoneclient
openstackgerritClint 'SpamapS' Byrum proposed a change to openstack/os-collect-config: Add dogpile cache to keystone abstraction layer
openstackgerritClint 'SpamapS' Byrum proposed a change to openstack/os-collect-config: Split keystone away from heat collector
*** ramishra has joined #tripleo02:56
*** noslzzp has joined #tripleo02:58
*** untriaged-bot has joined #tripleo03:00
untriaged-botNo untriaged bugs so far! \o/03:00
*** untriaged-bot has quit IRC03:00
*** noslzzp has quit IRC03:03
*** CaptTofu_ has quit IRC03:26
*** CaptTofu_ has joined #tripleo03:27
*** akuznetsov has joined #tripleo03:30
*** CaptTofu_ has quit IRC03:31
lifelesstchaypo: StevenK: don't bank on it until we get an ack from cody, but I think we'll be set on tickets.03:33
*** ramishra has quit IRC03:33
lifelessmarios: btw is +2 ready I believe.03:34
lifelessmarios: note that because its a no-op rebase your -1 was sticky :(03:34
*** pcrews has quit IRC03:40
lifelessgreghaynes: ok so whats the next blocker for getting a CI job with HA control plane (over or under)03:43
*** nosnos has quit IRC03:44
greghaynesI dont know of any other than patch reviews ATM. Have you been able to re-check the os-is-bootstrap-host review stack and the heat templating for galera cluster?03:46
lifelessgreghaynes: no, gimme numbers?03:46
*** eghobo has joined #tripleo03:47
greghaynes and + its dependencies03:47
lifelessI want to put the newest patch at the top in gertty03:48
*** ramishra has joined #tripleo03:49
*** jml has quit IRC03:52
*** jml has joined #tripleo03:53
greghayneshrm, I should really update the commit msg on 8388303:56
*** elynn_ has quit IRC03:57
lifelessgreghaynes: the one I just +2d. Really ? :)03:57
greghaynesheh, well that should carry over03:57
lifelesswe need SpamapS or some other +2 to land it ...03:58
greghaynesthe bit about it not solving scaling issues is no longer a problem03:58
lifelessgreghaynes: ah cool03:58
lifelessthe cluster address is cool with a trailing , ?03:59
greghaynesI got a successful build with 3 control nodes, so I believe so04:00
lifelessah, another gertty bug04:01
lifelessplease wait while I fix my tools.04:01
lifelessonce SpamapS is back from dinner we may get the other two patches landed04:01
openstackgerritSteve Kowalik proposed a change to openstack/os-cloud-config: Check for relevant environment variables
*** akuznetsov has quit IRC04:09
*** matsuhashi has quit IRC04:11
*** lazy_prince has joined #tripleo04:13
*** tzumainn has quit IRC04:18
*** akuznetsov has joined #tripleo04:20
lifelessgreghaynes: ok so lets talk 8643504:20
lifelessgreghaynes: I must have managed to lose review comments somewhere04:21
lifelessgreghaynes: the test with || true is heinous.04:21
greghaynesYep. Got ideas for a less heinous way?04:22
greghaynesI think it was mentioned maybe exit 0 in that case but echo 25504:23
StevenKtchaypo: Remember the docs question about os-cloud-config?04:23
lifelessgreghaynes: test $?04:23
StevenK/home/steven/openstack/openstack/os-cloud-config/doc/source/index.rst:11: WARNING: toctree contains reference to document 'contributing' that doesn't have a title: no link will be generated04:23
greghayneslifeless: Would still have to || true due to set -e04:24
lifelessoh I know where my comments went. I ratholed into man bash04:24
lifelessgreghaynes: set +e04:24
lifelessset -e04:24
greghayneshah, I guess the set -e doesnt really do much if youre || true'ing to get around it :p04:24
lifelessits longer but obviously correct vs echoing which could come from anything04:24
lifelesswhereas the || true is obviously incorrect, until you read the source.04:25
greghaynesyep, fair04:25
StevenKYou can't if ! os-is-bootstrap-host ; then ... ?04:25
StevenKWhich avoids the whole set +e rubbish04:26
greghaynesStevenK: Theres 3 vals to check04:26
greghayneser, 3 possible values04:26
greghaynesso need to either catch output or test exit val04:26
StevenKgreghaynes: if ! os-is-bootstrap-host will test the exit value04:27
lifelessStevenK: its a three-value thing, one ok, and two different failure modes04:27
greghaynesyep, for a boolean, but then youve lost the exit val04:27
lifelessStevenK: is $? accessible after that expression?04:27
lifelessgreghaynes: possibly not04:28
lifelessgreghaynes: this is what I was digging into when I ran out of time the other day04:28
lifelessgreghaynes: if false; then true; else echo $?; fi04:28
lifelessgreghaynes: you'll want to test it obviously04:29
lifelessgreghaynes: while you're tweaking things, care to fixup the []'s and foo;bar's in that file? separate patch if you like but it was a bit hard to read04:30
lifelessgreghaynes: (specifically space between expression and ], and ] and ;, and ; and then.)04:30
greghaynesyep, np04:31
StevenKYou can only test once04:31
lifelessyeah, you need to capture it, my example was an example04:31
StevenK$? gets overridden by the first test04:31
StevenKsteven@undermined:~% ./foo.sh04:31
StevenK+ ./testscript04:31
StevenK+ '[' 255 -eq 1 ']'04:31
StevenK+ '[' 1 -eq 255 ']'04:31
*** CaptTofu_ has joined #tripleo04:31
lifelessStevenK: show the source luke04:31
StevenKlifeless, greghaynes:
greghaynesJust made a test that pretty much proved you can do it with test04:33
greghaynesso will fix that04:33
greghaynesyep, did just that :)04:33
StevenK+1 for lack of set +e04:33
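[Editor's sketch] The point settled above is that a three-valued exit status has to be captured once, before any test overwrites $?, and that this can be done under set -e without "|| true" or a set +e / set -e pair. A minimal sketch of that idea follows. The function fake_check is a hypothetical stand-in for os-is-bootstrap-host, and the meaning attached to each exit code (0, 1, anything else) is an assumption made for illustration, not taken from the real script.

```shell
#!/bin/bash
set -eu

# Hypothetical stand-in for os-is-bootstrap-host: exits with the code it is given.
fake_check() {
    return "$1"
}

# Capture the exit status exactly once. A command on the left of || is
# exempt from errexit, so the script survives a non-zero status and the
# original value is preserved in rc for later comparisons.
classify() {
    local rc=0
    fake_check "$1" || rc=$?
    case "$rc" in
        0) echo "bootstrap" ;;       # assumed meaning: this host bootstraps
        1) echo "not-bootstrap" ;;   # assumed meaning: another host bootstraps
        *) echo "error:$rc" ;;       # assumed meaning: the check itself failed
    esac
}

r0=$(classify 0)
r1=$(classify 1)
r255=$(classify 255)
echo "$r0 $r1 $r255"
```

Because rc is a plain variable, it can be compared as many times as needed, which avoids the problem in StevenK's trace where the second comparison saw the result of the first test rather than the script's exit status.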
tchaypoyes StevenK ?04:35
StevenKtchaypo: I pasted the warning a few lines after my prod04:35
*** CaptTofu_ has quit IRC04:35
tchaypoStevenK: yes, i remember that warning04:37
StevenKtchaypo: You also couldn't get the docs to build locally04:37
tchaypoyes. did you get that fixed?04:37
StevenKIt works for me04:38
StevenKtchaypo: So, tox -epy27 ; . .tox/py27/bin/activate ; python build_sphinx04:38
tchayponice and simple04:38
StevenKI'm adding a docs venv in the patch to fix that warning04:39
* tchaypo looks forward to the review to add "tox-edocs"04:39
cody-somervillelifeless: StevenK: tchaypo: What are you waiting on ACK from me on?04:40
tchaypofwiw i fail on "tox -epy27"04:40
tchaypo│tox.ConfigError: ConfigError: substitution key 'posargs' not found04:40
StevenKtchaypo: Pastebin the full output?04:40
StevenKcody-somerville: From lifeless' mail about pycon au04:40
tchaypocody-somerville: confirmation that we have tickets to pycon-au04:40
tchaypoi have vague memories that this is caused by a newer version of tox and can be fixed by downgrading, i think04:42
StevenK% tox --version04:42
StevenK1.6.0 imported from /usr/lib/python2.7/dist-packages/tox/__init__.pyc04:43
StevenKYeah, are you running 1.7.0 ?04:43
tchayposilly me. I've just realised that the review stats don't just come from tripleo-incubator project, they're from all of our projects.04:47
*** matsuhashi has joined #tripleo04:48
*** nosnos has joined #tripleo04:48
tchaypoyeah, 1.6.0 is working better04:49
lifelesscody-somerville: (see corp email a couple days ago)04:50
lifelesscody-somerville: I'm going to mail you shortly about the other stuff you asked me04:50
StevenKHeh, python-tox is still 1.6.0 in utopic04:50
cody-somervilleI have like a two thousand unread e-mails :}04:50
openstackgerritA change was merged to openstack/tripleo-image-elements: Properly enabling and restarting snmpd
openstackgerritA change was merged to openstack/diskimage-builder: Parameterise PXE kernel and initrd selection
openstackgerritA change was merged to openstack/diskimage-builder: Tidy up SuSE kernel selection
lifelesscody-somerville: ok so look for ones from me:)04:52
lifelesstheres less of those, I hope.04:53
openstackgerritOpenStack Proposal Bot proposed a change to openstack/diskimage-builder: Updated from global requirements
*** michchap_ has quit IRC04:55
*** michchap has joined #tripleo04:55
tchaypoOkay, I've drafted as an email to send to the team as a reminder about needing to do reviews04:56
tchaypolifeless: StevenK: and anyone else who's around - feedback welcome before I hit send..04:57
lifelessgreghaynes: -1 on 9303204:57
lifelesstchaypo: sec04:57
lifelesstchaypo: looks great04:58
StevenK"handy link"04:58
lifelessyou'll be review czar before you know it04:58 might be useful for the url04:58
StevenKI agree it's a handy link, but it's still horrible04:58
lifelessand or link to the wiki page if you didn't04:58
tchaypo has way too much "tchaypo"04:59
openstackgerritOpenStack Proposal Bot proposed a change to openstack/os-cloud-config: Updated from global requirements
tchaypoespecially considering that most of it was just me parroting things other people said with #info prepended04:59
tchayponext time I'm going to encourage other people to use #info more liberally04:59
lifelesstchaypo: I don't think they can04:59
lifelesstchaypo: since they aren't running the meeting04:59
lifelesstchaypo: IMBW05:00
lifelesstchaypo: did the url you asked me to link get in there ?05:00
tchaypo says that it's a "command for everyone"05:00
lifelessgreghaynes: reviewed the full set05:01
tchaypoyep, your #link made it05:01
lifelessgreghaynes: where is the patch to change the default to 3-node control plane?05:01
lifelesstchaypo: TIL05:01
tchaypobut jcoufal's #agreed did not, as that's a chair-only command05:01
greghaynesooo I should make that patch05:02
* StevenK tries to undistract tchaypo 05:02
tchaypoundistract me?05:03
StevenKtchaypo: os-cloud-config docs warning05:03
tchaypoi'm getting to it. as soon as i send this email.05:03
tchaypojust to check - email goes to openstack-dev with [tripleo] in the subject, right?05:05
tchaypoIf there's a more specific list I'm either not on it or forgot about it05:06
lifelesstchaypo: correct05:07
StevenKtchaypo: Right, I've fixed the contributing thing, what was the change to turn warnings into errors?05:07
*** eghobo has quit IRC05:08
*** eghobo has joined #tripleo05:09
*** noslzzp has joined #tripleo05:09
tchaypolifeless: the R1 work involves migrating from saucy to trusty, right?05:15
tchayposo updating the reame to say trusty would be correct?05:15
lifelesswhich reame?05:16
tchaypoin tripleo-incubator05:16
lifelessI think the trusty patch landed05:16
lifelessand yes, we're running trusty in hp1 now05:16
openstackgerritJames Polley proposed a change to openstack/tripleo-incubator: Add details of which OS releases are tested in CI
tchayposo I'm looking for old reviews, to follow the advice in my email05:21
tchaypoand I run across
*** rakesh_hs has joined #tripleo05:22
*** noslzzp has quit IRC05:22
lifelesstchaypo: 97129 failed CI05:22
tchaypoit's been a month since that got 2 +2s, but it's not going to get a +a until some indefinite future event. Is there anything we can do to hide that? Would marking it as WIP help?05:23
lifelesstchaypo: indeed, done.05:24
tchaypolifeless: yes but it depends on 92749 and it needs a rebase; I'm waiting for 92749 to land before i rebase05:25
lifelesstchaypo: why05:25
tchaypoubuntu has suddenly decided that alt-tab should follow the order of the icons down the panel rather than most-recently-used app05:25
lifelesstchaypo: 'unity' has ...05:26
tchaypomy keyboard was in mac mode, so i was actually using super-tab not alt-tab05:28
tchaypolifeless: half of my reasons are invalid. the remaining reason is that I've already rebased it a few times05:29
tchaypoand I'm lazy and don't want to do it again05:30
tchaypobut actually it shouldn't be as much work as i think05:30
lifelessso folk are going to spend neurons figuring out whats wrong for you05:30
lifelesseither WIP it as a signal05:30
lifelessor fix it as a courtesy :)05:30
tchaypoyou win05:31
openstackgerritA change was merged to openstack/diskimage-builder: Debian: Support additional debootstrap arguments
openstackgerritJames Polley proposed a change to openstack/tripleo-incubator: Clean up all outstanding ReST errors and warnings
tchaypoi mean, you are correct. the project wins :)05:31
* tchaypo runs test locally05:32
tchaypooh good, i can reproduce that locally now05:33
tchaypoprobably should have checked before the commit :(05:34
*** rbrady has quit IRC05:34
*** akuznetsov has quit IRC05:40
openstackgerritJames Polley proposed a change to openstack/tripleo-incubator: Clean up all outstanding ReST errors and warnings
*** rdopieralski has joined #tripleo05:42
openstackgerritSteve Kowalik proposed a change to openstack/os-cloud-config: More documentation fixes
*** lsmola has joined #tripleo05:54
* tchaypo gives up05:57
tchaypomy wifi just isn't reliable this afternoon05:57
tchaypotune tio prepare for german class05:57
tchaypo*time to05:57
*** akuznetsov has joined #tripleo06:11
*** dshulyak_ has joined #tripleo06:12
*** CaptTofu_ has joined #tripleo06:19
*** CaptTofu_ has quit IRC06:24
*** jprovazn has joined #tripleo06:24
*** akuznetsov has quit IRC06:25
openstackgerritNikhil Manchanda proposed a change to openstack/diskimage-builder: Do not use DatasourceNone for precise cloud-init
openstackgerritNikhil Manchanda proposed a change to openstack/diskimage-builder: Do not use DatasourceNone for precise cloud-init
*** akuznetsov has joined #tripleo06:48
*** jtomasek has joined #tripleo06:51
*** cody-somerville has quit IRC06:53
* SpamapS returns from dinner and opens reviews06:58
*** xuhaiwei has quit IRC07:01
SpamapSlifeless: I read this, and it strikes me that we don't ever take into account disaster recovery07:02
*** akuznetsov has quit IRC07:02
SpamapSrandomstrings are fine for things we can re-assert, but the cluster name is something we'll have to inherit if we're restoring the database.07:03
lifelessSpamapS: wouldn't we be restoring the heat stack too then ?07:04
lifelessSpamapS: and wouldn't restoring it restore the string ?07:05
*** cody-somerville has joined #tripleo07:05
SpamapSlifeless: that's an interesting question07:06
SpamapSlifeless: I think no, because we may be recovering from disaster in a separate data center.07:07
lifelessSpamapS: let me rephrase07:07
SpamapSlifeless: my thinking is simply that we should pass this in.07:07
SpamapSinstead of making it random07:07
lifelessSpamapS: there doesn't seem like there is any reason that it shouldn't be owned by heat07:07
lifelessSpamapS: restoring a heat stack with data heat generated seems like a heat problem, different data centre doesn't seem like an interesting distinction07:08
SpamapSIt is, because you'd be restoring addresses..07:08
SpamapSall kinds of things07:09
SpamapSphysical UUID's of servers07:09
SpamapSthats assuming that you've somehow synced all those servers to the new DC07:09
lifelessright, 'restore' is a complex thing to do to a cluster07:09
SpamapSseems completely unlikely07:09
SpamapSRight, so I'm saying, to keep it simple, we should parameterize anything that is immutable state rather than make it random.07:10
lifelessI think you're saying that you expect to implement 'restore a stack' as 'deploy a new stack and do restores to servers within it'07:10
SpamapSI also don't know why making it random would actually be a good idea.07:10
SpamapSSince we may want two stacks to share masters in a warm-standby DR scenario.. they'd need to share wsrep_cluster_name07:10
lifelessI don't know where to pull on this one07:11
lifelessobviously it can either be passed in07:11
lifelessor random07:11
SpamapSLet me back up from the doom and gloom.07:11
*** eguz has joined #tripleo07:11
lifelessbut heat broke defaulting parameters to expressions07:11
SpamapSThis particular random string doesn't actually need to be random. It just needs to be identical among this cluster. That is all.07:11
lifelessSpamapS: and unique, no ?07:11
*** eguz has quit IRC07:11
*** pblaho has joined #tripleo07:12
SpamapSlifeless: uniqueness _does_ protect against accidentally joining two clusters to each other that are not meant to be joined.07:12
lifelessSpamapS: so, you know more about this than I07:12
lifelessSpamapS: but there are many many DR things we don't support yet07:13
SpamapSI think I want it to be stackname+resourcename07:13
lifelessSpamapS: I think having HA >> not having HA07:13
SpamapSbut even stackname might be too rigid.07:13
*** jcoufal has joined #tripleo07:13
lifelessSpamapS: do we need to get this right, now ?07:14
SpamapSlifeless: it even has a default value07:14
SpamapSDefault Value:my_wsrep_cluster07:14
*** eghobo has quit IRC07:15
*** akuznetsov has joined #tripleo07:15
lifelessSpamapS: if we get the design wrong for this, AIUI, the consequence is that folk doing a restore may have to tweak the template they restore with07:15
SpamapSlifeless: I'm trying to find out if it is something that ends up in the state stored on disk. If not, then I'm less concerned about long term ramifications as we can change it with just some downtime.07:15
SpamapSIf it is stored on disk, I think we _must_ think this through.07:16
lifelessSpamapS: we can always bring up a new cluster, downtime copy, delete old cluster, no ?07:16
SpamapSlifeless: that is what I'm looking at07:16
lifelessSpamapS: or can one install of galera only participate in one cluster at a time ?07:16
SpamapSlifeless: I don't think there should be pressure to push something this important through.07:16
SpamapSlifeless: I'm not going to block it for a week. Just, let's think this through.07:17
lifelessSpamapS: I'm just skeptical that we'll actually consider everything07:17
lifelessSpamapS: my experience has been that we'll have to evolve it no matter what we decide today07:18
lifelessSpamapS: so I'm not trying to say 'shove'07:18
lifelessSpamapS: I'm trying to say let's understand the consequences of getting it wrong07:19
lifelessSpamapS: which I think is what you're looking at too, no?07:19
*** dshulyak_ has quit IRC07:21
SpamapSOk, well I say -1 to this. It has no business being random. A good default value is {"Fn::Join": ["_", [ {Ref: "AWS::StackName"}, "controller" ] ] }07:22
greghaynesSpamapS: Not sure what you mean by re-assert cluster name if we are restoring the database. Is the cluster name somehow tied with the data?07:23
lifelessSpamapS: I don't follow your reasoning07:23
SpamapSgreghaynes: it appears it is not.07:23
greghaynesSeems like if the whole cluster dies, it's fine if we start it back up with a new name07:23
SpamapSgreghaynes: but I cannot find definitive assertions to say it does not stay with the on disk data07:23
lifelessSpamapS: I understand you don't like it, but I don't understand the negative consequences you are concerned about07:23
SpamapSgreghaynes: What I can find are a lot of "this is just a safety precaution so you don't join two clusters that shouldn't be joined"07:24
*** e0ne has joined #tripleo07:24
lifelessSpamapS: we *expect* to be deploying (in CI) multiple stacks with the same name serially07:24
lifelessSpamapS: if we have a rogue node, having a predictable name is entirely likely to end up conflicting07:24
lifelessSpamapS: AIUI your only concern is about DR?07:25
SpamapSthe same name, but a different heat, with the same network?07:25
lifelessSpamapS: same heat, 20m later07:25
lifelessSpamapS: do a stack-delete, ironic fails to power off the node07:25
SpamapSYeah I'm seeing that as a real possibility.07:25
SpamapSwe don't have a kvm process that we can definitely say "its dead"07:26
lifelessSpamapS: I can't tell if thats sarcasm or not; but we are systematically locking up BMCs at the moment07:26
SpamapSNo sarcasm this time.07:26
lifelessSpamapS: Are there any other negative consequences other than having to have a way to re-inject the previous value in a new stack that is going to have data restored onto it ?07:27
SpamapSOk so I now have two opposing viewpoints, one of which is supported by the current implementation, and the other supported by conjecture and FUD from my brain. I release my hold. Let me read the third patch...07:27
lifelessSpamapS: does not mention the cluster name at all as a factor07:28
lifelessSpamapS: for xtrabackup (vs dumps which are obviously not impacted)07:29
jprovazngreghaynes: hi, -  galera can deal with comma at the end of "gcomm://x.x.x.x,y.y.y.y," expression?07:29
lifelessjprovazn: I asked that too; greg says yes :)07:29
SpamapSlifeless: Right, and there's also garbd07:29
lifelessSpamapS: but it might be oversight07:30
lifelessmordred: around? happen to know?07:30
openstackgerritA change was merged to openstack-infra/tripleo-ci: Name logs tarball after the instance name
SpamapSlifeless: which can be used to make a snapshot ... and then that is restorable without the cluster name set.07:30
lifelessSpamapS: so its sounding more and more to me that the disk files don't embed the cluster id07:30
jprovaznlifeless: interesting, IIRC in some *older* version of galera the behavior of a comma at the end was "start in standalone mode if joining to other nodes failed", but this might have changed already07:30
SpamapSI'm quite certain if the backup was done w/ xtrabackup the restored data would not have wsrep_cluster_name in it07:30
greghaynesjprovazn: Pretty sure, yes. More than one set of testing would be great though07:31
SpamapSand garbd does indeed just result in a populated innodb database07:31
* jprovazn liked previous expression more :)07:31
SpamapSI'm convinced now, that it is an ephemeral value07:31
SpamapSjust used for coordination "in the moment"07:31
*** jcoufal has quit IRC07:33
SpamapSgreghaynes: <-- tiny commit message fix. I will +A without re-pass on tests. :)07:35
SpamapSI left this one un-approved since lifeless had a -1
howleytlifeless: if you have a minute, have a question on
*** mugsie has quit IRC07:36
*** cody-somerville has quit IRC07:36
openstackgerritGregory Haynes proposed a change to openstack/tripleo-image-elements: Add os-is-bootstrap-host element and script
openstackgerritGregory Haynes proposed a change to openstack/tripleo-heat-templates: Add galera clustering properties
openstackgerritNikhil Manchanda proposed a change to openstack/diskimage-builder: Do not use DatasourceNone for precise cloud-init
*** jcoufal has joined #tripleo07:43
*** e0ne has quit IRC07:44
*** e0ne has joined #tripleo07:45
openstackgerritGonéri Le Bouder proposed a change to openstack/tripleo-image-elements: Drop some unnecessary lsb_release calls
*** e0ne has quit IRC07:49
gonerilifeless: do you have 5 minutes to speak about ?07:50
*** StevenK has quit IRC07:50
*** cody-somerville has joined #tripleo07:50
lifelessgoneri: I will about 20 past07:50
lifelesshowleyt: I can in a little bit, sure07:51
*** jcoufal has quit IRC07:51
*** StevenK has joined #tripleo07:51
*** mrunge has joined #tripleo07:59
*** jcoufal has joined #tripleo08:00
*** jistr has joined #tripleo08:02
openstackgerritMarios Andreou proposed a change to openstack/tripleo-image-elements: Refresh heat-cfntools element
openstackgerritMarios Andreou proposed a change to openstack/tripleo-image-elements: Prepare os-*-config for CI
mariosstevebaker: rebased ^^^08:03
howleytlifeless: thanks08:04
*** derekh_ has joined #tripleo08:11
*** lucasagomes has joined #tripleo08:12
SpamapSok, sleepy time08:12
*** jcoufal has quit IRC08:14
*** markmc has joined #tripleo08:19
derekh_Today I would mostly like to merge08:20
derekh_Extract F20 log file from the journal                   -
derekh_Increase our ci PIP timeout (again)                     -
derekh_Small fix to decrease logs from wget                    -
derekh_Delete keystone tokens on all distros                   -
derekh_Log a warning if there is a delay getting a ci test env -
openstackgerritA change was merged to openstack/tripleo-image-elements: Enable rsync daemon on swift-storage
*** pelix has joined #tripleo08:26
*** giulivo has joined #tripleo08:28
derekh_"Brocade OSS CI" commenting on tripleo-incubator patch? are we getting a 3rd party CI system ?
derekh_lifeless: ^08:34
lifelessderekh_: they're probably testing the world08:35
lifelessand they're reporting the wrong url08:35
lifelessderekh_: ok hear you on those patches08:36
lifelessderekh_: do they all pass CI ?08:36
derekh_lifeless: yup,08:37
openstackgerritA change was merged to openstack-infra/tripleo-ci: Write log file for each systemd unit
lifeless98232 needs an infra +A, ask there08:38
*** cody-somerville has quit IRC08:39
derekh_lifeless: ? its in toci08:40
derekh_lifeless: added a comment to this morning, we are already extracting logs for ubuntu and stripping /var/log/upstart out of the pathname08:41
uvirtbotLaunchpad bug 1328645 in openstack-ci "ubuntu tripleo-ci jobs are not logstash indexable" [Undecided,In progress]08:41
derekh_lifeless: so we just need to handle /mnt...08:41
lifelessderekh_: oh, it i too oops.08:41
openstackgerritA change was merged to openstack-infra/tripleo-ci: Increase the pip default timeout to 60
lifelessderekh_: we need it to not be tarred up08:42
derekh_lifeless: its copied over then the logs are being untarred08:43
lifelessderekh_: oh, ?!08:43
*** rlandy has joined #tripleo08:43
*** jcoufal has joined #tripleo08:43
derekh_let me find the patch08:43
lifelessoh sweet08:43
derekh_lifeless: my f20 patch was a follow on to provide systemd logs in the same way08:44
*** martyntaylor has joined #tripleo08:45
openstackgerritA change was merged to openstack-infra/tripleo-ci: Switch to Mega progress reports
gonerilifeless: for, is it fine if I restore this change ?08:48
lifelesshowleyt: goneri: ok my phone meeting is over08:49
lifelesslet me review derekh's stuff08:49
lifelessthen i'm yours08:49
lifelessderekh_: See the -infra logs from this morning my time, sdague and I had a heart to heart08:49
openstackgerritA change was merged to openstack/diskimage-builder: Add tar as an output type
derekh_lifeless: saw them already, so pcrews is going to look into the dashboard changes ?08:50
*** cody-somerville has joined #tripleo08:52
openstackgerritA change was merged to openstack-infra/tripleo-ci: Log a warning if wait on a te worker is excessive
lifelessderekh_: yes08:53
*** e0ne has joined #tripleo08:53
derekh_lifeless: so we pretty much will have to wait for that to happen before getting comments from E-R08:56
*** untriaged-bot has joined #tripleo09:00
untriaged-botNo untriaged bugs so far! \o/09:00
*** untriaged-bot has quit IRC09:00
lifelesshowleyt: goneri: ok, hit me09:05
* goneri hits lifeless09:05
goneriMy understanding is that you don't think we can replace this check
goneriby the check on the meta-data done here
howleytlifeless: sorry, I meant I had replied with question in that review - here it is: Ok, so you want to remove it from the heat template. If it is still left as configurable in the element, how do I make it default to True? Mustache will not differentiate between a variable being set to False or not being set at all. Is there a pattern for this in another element that you can point me to.09:07
howleyt(you might need to check for context)09:08
goneriso I see two options: either I keep the bash check and we do a similar test two times in a row. Or I give up :)09:08
lifelessgoneri: thats right, they are totally different things09:08
lifelessgoneri: well, what bug are you fixing?09:08
gonerinone, the goal was to simplify the code09:09
lifelessgoneri: ok, so - I don't really hold a strong opinion about the metadata check, though it seems like it makes writing elements harder, so I'd really like to see a better reason for it09:11
lifelessthe 'did things work' check has to stay IMO09:11
*** ramishra has quit IRC09:11
goneriinitially, I realized it was possible to include two incompatible root elements at the same time. The error was unclear and started to happen after the debootstrap, so a long time after the beginning.09:12
openstackgerritMichael Kerrin proposed a change to openstack/tripleo-image-elements: Try and persist mac address for the bridge across reboots
goneriso I started some patches to simplify this part of the code, that's the long story.09:13
lifelessgoneri: ok, so that might provide reason to do the metadata check; the issue with the functional check is that this makes sure the preconditions for the rest of the code are in good shape09:13
lifelessgoneri: specifically, what if debootstrap (for instance) fails silently09:14
gonerilifeless: ok, we can keep the bash check as a sanity check09:14
goneriI can adjust the error message this way: "Please include at least one distribution root element." → "Failed to deploy the root element."09:15
lifelessthat makes sense to me09:16
goneriok then, let's go :)09:17
lifelesshowleyt: whats the review # for your patch adding keepalive_enabled to the elements?09:19
openstackgerritGonéri Le Bouder proposed a change to openstack/diskimage-builder: fail at startup with no operating-system element
gonerilifeless: ↑09:21
lifelessooh unicode up arrow; fancy :)09:22
openstackgerritCian O'Driscoll proposed a change to openstack/diskimage-builder: Explicitly name element enable-serial-console
openstackgerritCian O'Driscoll proposed a change to openstack/diskimage-builder: Explicitly name element enable-serial-console
lifelesshowleyt: ? I'm heading to bed soon - wanted to look at the relevant in-instance code...09:28
gonerilifeless: before you leave, if the change is fine for you, can you please put a +1 or +2?09:30
lifelessgoneri: I have09:30
gonerilifeless: thanks, good night =)09:31
howleytand the corresponding element change:
lifelesshowleyt: so, one thing you could do is have it be disable_keepalive09:34
lifelesshowleyt: if missing, leave it enabled, if set disable it09:35
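[Editor's sketch] The suggestion above is to invert the flag so that "missing" and "false" both mean the same thing: keepalive stays on unless something explicitly turns it off. A rough shell analogy of that pattern is below. The function name keepalive_state and the set of values treated as false are assumptions for illustration only; the real element renders a Mustache template rather than calling a function like this.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: $1 is the value of a disable_keepalive setting,
# which may be absent. Absent, empty, and false-like values all leave
# keepalive enabled; any other value disables it.
keepalive_state() {
    case "${1:-}" in
        ""|false|False|0) echo "enabled" ;;
        *) echo "disabled" ;;
    esac
}

missing=$(keepalive_state)
explicit_false=$(keepalive_state false)
explicit_true=$(keepalive_state true)
echo "$missing $explicit_false $explicit_true"
```

With a positive flag such as keepalive_enabled, a template that cannot tell "false" from "not set" has no way to default to on; the negative flag sidesteps that because both cases fall through to the same branch.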
derekh_lifeless: pblaho jprovazn Joshua thanks for the reviews09:36
jprovaznderekh_: np09:36
lifelessderekh_: joshua==jhesketh on IRC09:36
*** jprovazn has quit IRC09:38
derekh_lifeless: cool, was looking for that, thanks09:39
howleytlifeless: ok, fair enough. I wouldn't mind if mustache was a bit more expressive.09:39
lifelesshowleyt: I wouldn't mind either : - we could look at going to handlebars (e.g. via pybars) at some point, or adding an extension to pystache09:42
derekh_lifeless: so any changes to R1 since I was last here ?09:44
lifelessderekh_: nope, I suck09:44
lifelessderekh_: I didn't even get to login to it09:44
*** e0ne_ has joined #tripleo09:44
lifelessderekh_: the regex patch was failing undercloud09:45
lifelessderekh_: I started to poke at that09:45
derekh_lifeless: no prob, I'll jump in this morning and see what I can do, we got status==0 yesterday and could ssh to user image09:45
lifelessand then its passed, yay09:45
pblahoderekh_: np09:45
howleytlifeless: ya, was looking at handlebars yesterday, haven't tried it out, though.09:45
lifelesshowleyt: FWIW I wrote pybars, so if it gives you headaches, you can blame me09:46
derekh_lifeless: yup, so then I added some of the prepare stuff, to setup what CI expects flavors/images/networks etc... so will continue on that today09:46
lifelessderekh_: so, if I may suggest09:46
derekh_lifeless: yup, fire ahead09:46
lifelessderekh_: lets get that vlan fixup change, and the associated config changes pushed up into gerrit for review09:46
lifelessderekh_: so we have minimal state to lose09:46
lifelessderekh_: secondly, I'd like to get a CI job that tests with the public IP on separate network via vlan up09:47
lifelessderekh_: so that we can lock this in as a supported thing - it will fail initially09:47
*** e0ne has quit IRC09:48
lifelessderekh_: but - long as you document where you leave things, I will be poking at this next week09:48
derekh_lifeless: ok will get what I can up to gerrit, some of the vlan changes you should probably submit (you've a better understanding of what the vlan changes are)09:49
lifelessderekh_: ack09:50
derekh_lifeless: either way I'll try and leave it so you know where I leave everything before I go09:50
lifelessrocking, thanks09:51
lifelessetherpad ftw :)09:51
openstackgerritMichael Kerrin proposed a change to openstack/tripleo-image-elements: Configure neutron-ovs-cleanup to run after openvswitch
openstackgerritMichael Kerrin proposed a change to openstack/tripleo-incubator: The default in heat-templates is to preserve ephemeral disks
*** akrivoka has joined #tripleo10:07
*** jprovazn has joined #tripleo10:08
openstackgerritMatthew Macdonald-Wallace proposed a change to openstack/tripleo-image-elements: Install the "classic" icinga interface
*** gcha has quit IRC10:13
*** gcha has joined #tripleo10:19
openstackgerritDerek Higgins proposed a change to openstack-infra/tripleo-ci: Log get_state_from_host to a file
*** matsuhashi has quit IRC10:20
*** jml has quit IRC10:20
*** matsuhashi has joined #tripleo10:20
*** akuznetsov has quit IRC10:21
*** yamahata has quit IRC10:21
*** jml has joined #tripleo10:23
lxslilifeless howleyt: if we move to handlebars have you considered how EG an oac file might add a helper?10:24
*** matsuhashi has quit IRC10:25
lxsliThis could make passthrough much more useful10:28
*** rakesh_hs has quit IRC10:32
*** e0ne_ has quit IRC10:37
*** e0ne has joined #tripleo10:37
andrearosaanyone is available to look at this easy fix for a bug in the nova.conf mustache template?
*** e0ne has quit IRC10:41
*** martyntaylor has quit IRC10:43
*** martyntaylor has joined #tripleo10:47
*** gcha has quit IRC10:54
*** gcha has joined #tripleo10:55
*** lathiat has quit IRC11:04
*** lathiat has joined #tripleo11:05
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Wait for os-collect-config to complate on the seed
*** nosnos has quit IRC11:13
*** e0ne has joined #tripleo11:15
*** e0ne has quit IRC11:17
*** rbrady has joined #tripleo11:17
*** e0ne has joined #tripleo11:17
derekh_lxsli: so your long line patch has a merge conflict now so needs to be rebased, I have rebased it to put another patch on top of it (to avoid another conflict), mind if I push a new rebased version of your patch up ?11:19
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Wait for os-collect-config to complate on the seed
*** e0ne has quit IRC11:22
*** shakayumi has joined #tripleo11:23
*** shakayumi has quit IRC11:23
*** shakayumi has joined #tripleo11:24
*** CaptTofu_ has joined #tripleo11:44
*** e0ne has joined #tripleo11:45
*** akrivoka has quit IRC11:45
*** CaptTofu_ has quit IRC11:45
*** CaptTofu_ has joined #tripleo11:45
*** shakayumi has quit IRC11:50
*** mrunge has quit IRC11:55
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Wait for os-collect-config to complete on the seed
*** dprince has joined #tripleo12:03
*** yamahata has joined #tripleo12:10
*** morazi has joined #tripleo12:11
*** jdob has joined #tripleo12:17
*** akrivoka has joined #tripleo12:21
*** e0ne_ has joined #tripleo12:24
*** weshay has joined #tripleo12:25
openstackgerritDerek Higgins proposed a change to openstack-infra/tripleo-ci: Split very long line in
openstackgerritDerek Higgins proposed a change to openstack-infra/tripleo-ci: Add sbin to PATH when loging hostinfo
*** e0ne has quit IRC12:28
*** tzumainn has joined #tripleo12:36
*** akuznetsov has joined #tripleo12:38
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Clean-up os-apply-config lines.
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Clean-up os-apply-config lines in
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Clean-up os-apply-config lines in
openstackgerritAndrea Frittoli  proposed a change to openstack/tripleo-image-elements: Configurable tests2skip file
*** jml has quit IRC12:46
*** openstackgerrit has quit IRC12:46
*** rakesh_hs has joined #tripleo12:46
*** jml has joined #tripleo12:46
*** openstackgerrit has joined #tripleo12:48
*** noslzzp has joined #tripleo12:50
openstackgerritGonéri Le Bouder proposed a change to openstack/tripleo-image-elements: Drop some unnecessary lsb_release calls
*** julim has joined #tripleo13:01
giulivoanyone who could merge ?13:02
*** yamahata has quit IRC13:03
*** yamahata has joined #tripleo13:03
*** matty_dubs|gone is now known as matty_dubs13:12
*** ohadlevy_ is now known as ohadlevy13:13
*** ohadlevy has quit IRC13:13
*** ohadlevy has joined #tripleo13:13
*** CaptTofu_ has quit IRC13:15
*** CaptTofu_ has joined #tripleo13:16
gonerican a core reviewer have a look on It introduces a cache system for apt that speeds up the build of an image a lot.13:19
*** CaptTofu_ has quit IRC13:20
d0ugaltzumainn: I'd be up for going ahead with that plan now if you like13:21
tzumainnd0ugal, sorry, which plan was it?13:21
d0ugaltzumainn: What you said in here last night. /me looks for the wording13:22
tzumainnI'm not sure I was speaking rationally yesterday afternoon after spending a few hours with a bunch of screaming seven-year old girls trying to sew13:22
d0ugaltzumainn: I think you wanted to lock down the models13:22
tzumainnd0ugal, oh, yeah13:22
tzumainnso, assuming the spec is split into a) models and b) storage backends13:22
tzumainnI think b) has a lot of open questions, but if we can get consensus on a), I think we could start coding on that13:22
d0ugalSounds good13:23
d0ugalThen we would be in a position to try different b)'s13:23
*** rha has quit IRC13:23
d0ugaltzumainn: I'm not sure we are in a position yet to fully define the Role13:24
openstackgerritA change was merged to openstack/diskimage-builder: Correct the wrong rename in rhel element
d0ugaltzumainn: or, at least, I wasn't sure yesterday and jdob didn't seem to be either when we spoke13:25
tzumainnthat's fair13:26
tzumainnit seems like everything we model has random associated metadata that we're either not exactly sure how to store, or which we attempt to store by stuffing into a template or something13:27
tzumainnwell, I guess the glance artifact store allows arbitrary metadata, but we're no longer sure that'll be ready for juno?13:28
d0ugalI think it will be ready for juno, but that may not give us enough time to actually use it.13:28
*** rha has joined #tripleo13:29
*** rha has quit IRC13:29
*** rha has joined #tripleo13:29
d0ugalThey seem confident that it will be ready, but yeah, I don't know how long we will then need to implement it.13:29
tzumainnso, okay, we assume that each storage backend provides some way of letting us associate arbitrary metadata13:29
tzumainnand for the ones that don't, we say that arbitrary metadata will be stored as, I dunno, a dictionary stored in a file?13:29
d0ugalYup, I could pop that into the requirements.13:29
d0ugalYeah, otherwise we need to store it as another file/object13:30
d0ugaltzumainn: I think every backend we are considering has the ability to store metadata13:30
d0ugal(including swift)13:31
tzumainnoh, okay13:31
tzumainnactually, this is all irrelevant to the role model, isn't it13:31
tzumainnmy mind is nowhere today13:31
d0ugalThat's fine, it was a good thing to check anyway13:31
tzumainnso the question about the role model is what template(s) need to be associated?  whether it's just the direct heat template, or also the ones indicated in the directed acyclic graph, as shadower says?13:32
d0ugaltzumainn: yeah, I think so13:34
*** CaptTofu_ has joined #tripleo13:34
d0ugaltzumainn: but I'm a bit confused by it at the moment13:34
tzumainnI say, just propose both options in the spec, and see what input we get13:36
d0ugaltzumainn: good idea.13:37
d0ugaltzumainn: I'll start that now, I'm just doing a proof read of my other updates first13:37
d0ugaljdob: Is "master template" a Heat term?13:44
d0ugalTuskar specific?13:44
jdobya, heat will just call it a template. for our purposes, that word feels too generic and I wanted to call it out explicitly13:45
jdobwe could use the term "plan template" instead13:45
jdobi just kinda started using master template one day and it stuck, it wasn't a real conscious decision13:45
d0ugalMakes sense, just wondering if I could clarify what it is13:45
d0ugalheh, I think it makes sense but maybe it should be defined somewhere.13:46
jdobi think i tried to define it in my spec, but at this point, its a friggin blur13:46
jdobmaybe i intended to and forgot13:46
d0ugalI'll add a comment and ask before I forget13:46
openstackgerritAndrea Frittoli  proposed a change to openstack/tripleo-image-elements: Configurable tests2skip file in tempest element
*** beekneemech is now known as bnemec13:54
*** mrunge has joined #tripleo13:55
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Wait for os-collect-config to complete on the seed
tzumainnd0ugal, ah, so the other complication in B. is (iirc) the notion that some templates might be shared13:59
tzumainnoh, wait13:59
tzumainnthat's an implementation detail13:59
tzumainnyeah, that makes sense14:00
d0ugaltzumainn: I don't think the structure is important to us, rather we just want to be able to find them all14:00
d0ugalbut I'm not sure about that.14:00
*** jistr has quit IRC14:00
tzumainnd0ugal, no, I think you're right14:01
tzumainnat least, not important in terms of modeling : )14:01
d0ugaltzumainn: ha, sounds good. I'll propose 1 and 2a then :)14:01
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Clean-up os-apply-config lines in
d0ugal(Well, 1 *OR* 2a, but I'll include both)14:02
*** jistr has joined #tripleo14:02
*** pcrews has joined #tripleo14:10
*** mrunge has quit IRC14:22
*** jcoufal has quit IRC14:27
*** ci-overcloud has joined #tripleo14:29
ci-overcloud************** ci-overcloud complete status=1 ************14:29
*** ci-overcloud has quit IRC14:29
openstackgerritDougal Matthews proposed a change to openstack/tripleo-specs: TripleO Template and Deployment Plan Storage
janghey guys - does anyone know what gerrit's rules are about wrapping long lines in its diff output?14:37
*** lazy_prince has quit IRC14:37
*** rdopieralski has quit IRC14:38
*** yamahata has quit IRC14:41
*** mkerrin has quit IRC14:44
jprovaznpacemaker experts, any idea why pacemaker doesn't see ceilometer-agent-central.service (but haproxy.service does)?
jprovazndo I have to do some extra magic?14:46
*** martyntaylor has quit IRC14:47
*** martyntaylor has joined #tripleo14:49
*** jistr has quit IRC14:56
*** jistr has joined #tripleo14:56
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Improve readability of long JQ expression
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Improve readability of long JQ expression
*** marun has joined #tripleo14:58
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Improve readability of long JQ expression
*** untriaged-bot has joined #tripleo15:00
untriaged-botUntriaged bugs so far:15:00
uvirtbotLaunchpad bug 1329238 in tripleo "OVS isn't persisting mac addresses on OVS bridges" [Undecided,In progress]15:00
*** untriaged-bot has quit IRC15:00
*** noslzzp has quit IRC15:01
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Make a separation between --heat-env
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Separate Heat BM and VM configs for Nova-BM.
*** rakesh_hs has quit IRC15:03
*** jprovazn is now known as jprovazn_afk15:04
*** dprince has quit IRC15:06
*** rollerj has joined #tripleo15:07
rollerjhello all!  i attempted a stack update, increasing the computescale by 1.  the new compute node deployed but os-collect-config on that node failed with a "Client Error: AccessDenied"15:09
rollerj[2014-06-12 03:12:21,489] (os-refresh-config) [INFO] Completed phase migration15:09
rollerjINFO:os-refresh-config:Completed phase migration15:09
rollerj2014-06-12 03:12:21.927 1015 WARNING os_collect_config.cfn [-] 403 Client Error: AccessDenied15:09
rollerji ran stack-update with the same stackrc credentials that the original stack was created with15:09
openstackgerritCian O'Driscoll proposed a change to openstack/tripleo-image-elements: Store ssh host keys on ephemeral partition
*** yolanda has joined #tripleo15:11
*** akuznetsov has quit IRC15:11
yolandahi, having an issue with dib, when scheduling an image i created, i get this error: Stderr: "qemu-img: 'image' uses a qcow2 feature which is not supported by this qemu version: QCOW version 3\nqemu-img: Could not open '/var/lib/nova/instances/_base/94d0200a5bb90968e0e40f682f9e187025d84276.part': Operation not supported\n"15:11
yolandalooks as some issue with qcow versions, has anyone seen that before?15:12
yolandai've been told to force --compat=1.0 when using qemu-img, but dib doesn't provide this option15:12
* mordred is curious about the answer ^^15:13
*** andreaf has quit IRC15:16
yolandain the meantime i'll convert it manually and upload15:19
*** noslzzp has joined #tripleo15:19
*** akuznetsov has joined #tripleo15:19
*** dprince has joined #tripleo15:25
*** CaptTofu_ has quit IRC15:25
openstackgerritNicholas Randon proposed a change to openstack/tripleo-incubator: Add Quotes to $NEW_JSON to preserve Json format.
openstackgerritA change was merged to openstack-infra/tripleo-ci: Log a message when we skip the temprevert
*** CaptTofu_ has joined #tripleo15:41
*** eghobo has joined #tripleo15:45
*** robsparker has joined #tripleo15:52
*** matty_dubs is now known as matty_dubs|gone15:53
*** bogdando has quit IRC15:55
*** bogdando has joined #tripleo15:56
*** rakesh_hs has joined #tripleo15:58
*** e0ne_ has quit IRC15:59
*** e0ne has joined #tripleo16:00
*** akuznetsov has quit IRC16:03
*** mestery has quit IRC16:03
*** pblaho has quit IRC16:03
*** mestery has joined #tripleo16:04
*** e0ne has quit IRC16:04
*** pblaho has joined #tripleo16:05
yolandamordred, so i have something working forcing qemu-img to 0.10 version16:07
yolandabut we don't have that option on dib, may i file a patch for it?16:07
yolandasomething like passing a string of options to be passed to qemu-img, and be able to send that from disk-image-create16:08
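A pass-through of this kind could look roughly like the sketch below. It only builds the `qemu-img convert` command line, forwarding an optional option string via qemu-img's `-o` flag (for example `compat=0.10`, which selects the older qcow2 format). The function name, the `extra_options` parameter, and the assumption that the source image is raw are illustrative, not how diskimage-builder is actually wired:

```python
import shlex


def build_qemu_img_convert(src, dst, extra_options=""):
    """Build a qemu-img command that converts a raw image to qcow2.

    'extra_options' is a hypothetical pass-through string handed to
    qemu-img's -o flag, e.g. "compat=0.10".
    """
    cmd = ["qemu-img", "convert", "-f", "raw", "-O", "qcow2"]
    if extra_options:
        cmd += ["-o", extra_options]
    cmd += [src, dst]
    return cmd


print(shlex.join(build_qemu_img_convert("image.raw", "image.qcow2", "compat=0.10")))
```

Keeping the options as an opaque string means the caller decides which qcow2 features to request, at the cost of no validation before qemu-img runs.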
*** pblaho has quit IRC16:10
*** chuckC has quit IRC16:11
openstackgerritDougal Matthews proposed a change to openstack/tripleo-specs: TripleO Template and Deployment Plan Storage
*** chuckC has joined #tripleo16:13
*** jistr has quit IRC16:19
openstackgerritDerek Higgins proposed a change to openstack-infra/tripleo-ci: Add some docs descibing tripleo CI
*** akuznetsov has joined #tripleo16:34
bnemecyolanda: I would suggest opening a bug describing the problem you're hitting, including what version of operating system/OpenStack/qemu/anything-else-related you are trying to use.16:37
*** openstackgerrit has quit IRC16:38
*** mkerrin has joined #tripleo16:53
*** jcoufal has joined #tripleo16:58
*** rwsu has joined #tripleo17:00
*** bogdando has quit IRC17:00
derekh_lifeless: only made a little progress on R1 today, was busy chasing down patches, then got sidetracked looking at gertty :-) , the newly deployed ci-overcloud has gotten a little further in the process but more to do, will pick it back up in the morning17:01
*** eghobo has quit IRC17:04
*** eghobo has joined #tripleo17:04
*** dprince has quit IRC17:05
*** nati_ueno has joined #tripleo17:06
*** lucasagomes has quit IRC17:09
*** dshulyak_ has joined #tripleo17:12
*** markmc has quit IRC17:14
*** derekh_ has quit IRC17:14
*** akrivoka has quit IRC17:15
*** dprince has joined #tripleo17:25
*** rwsu has quit IRC17:30
*** rwsu has joined #tripleo17:32
*** martyntaylor has left #tripleo17:32
greghayneshrm, are we still using trello?17:36
greghaynesjprovazn_afk: wondering if we should add a card for ceilometer (if thats what youre working on?) to trello17:36
SpamapSgreghaynes: doh, probably not17:37
greghaynesits actually (surprisingly) not *that* out of date for what weve been doing on HAey stuff17:38
greghaynesnot sure thats a good or bad thing17:38
greghaynesbut yea, its been a while17:39
*** jprovazn_afk is now known as jprovazn17:41
jprovaznhm, I'm not sure what is current "trello status"17:42
greghaynessounds like no17:43
SpamapSgreghaynes: btw did you need more reviews to get the Galera stuff landed?17:46
* greghaynes checks17:46
greghaynesjprovazn: lifeless:
jprovazngreghaynes, ah, I thought this was already merged, looking now17:47
SpamapSgreghaynes: dunno why trivial rebase didn't carry the other +2 forward17:50
greghaynesSpamapS: yea.. :/17:50
greghayneslucky number 30!17:50
SpamapSgreghaynes: gah.. I really think we need to make os-refresh-config dependency based. I dislike that we just have random things deciding to halt the whole process.17:53
greghaynesSpamapS: Which script are you referring to?17:54
greghaynesI tried to prevent the mysql one from doing that...17:54
SpamapSgreghaynes: So Kiall is right in that we probably have other scripts further in the linear chain that expect mysql to be working, and it is easier to debug if we just fail immediately as soon as we don't have a configuration for something that we expect to eventually have.17:55
SpamapSgreghaynes: but I'd prefer that we take a page out of the convergence book, and try to do as much as we can that we know we can try to do, each time.17:56
KiallSpamapS: ++ And, it'll retry in 60 seconds17:56
SpamapSSo if we change it from ##-xxxx-xxx to a graph expression at the top (a-la systemd or lsb-init) .. then we can say "if we have a config for mysql, set "mysql is ready" and then configure mysql" and then anything that needs mysql expresses the same sort of dependency.17:57
greghaynesI could see an argument for mysql (and maybe rmq) being the two special cases for that...17:57
SpamapSso IMO mysql isn't a special case, and things should fail gracefully if it isn't available, because it might not even be on the local server.17:57
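The dependency-based ordering SpamapS describes can be sketched as follows. This is an illustration only, not os-refresh-config code: each script is assumed to declare which metadata sections it `needs` and which scripts it must run `after`, and a script is skipped (along with everything downstream of it) when either condition is unmet. The declaration format and the `plan_run` helper are hypothetical; ordering uses Python's standard `graphlib` (3.9+).

```python
from graphlib import TopologicalSorter


def plan_run(scripts, available_metadata):
    """Split scripts into (ran, skipped), honouring declared dependencies.

    'scripts' maps a script name to {"after": [...], "needs": [...]}.
    Every name listed under "after" must itself be a key in 'scripts'.
    """
    graph = {name: set(spec.get("after", [])) for name, spec in scripts.items()}
    ran, skipped = [], []
    for name in TopologicalSorter(graph).static_order():
        spec = scripts[name]
        deps_ok = all(dep in ran for dep in spec.get("after", []))
        config_ok = all(key in available_metadata for key in spec.get("needs", []))
        (ran if deps_ok and config_ok else skipped).append(name)
    return ran, skipped


scripts = {
    "mysql": {"needs": ["mysql"]},
    "mysql-migration": {"after": ["mysql"]},
    "ntp": {"needs": ["ntp"]},
}
print(plan_run(scripts, {"ntp"}))
```

With only `ntp` metadata present, the mysql branch is skipped while the unrelated script still runs, which is the "do as much as we can" behaviour discussed above.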
*** dshulyak_ has quit IRC17:57
Kiallgreghaynes: Therese has been working on getting 3x node percona cluster up inside TripleO - without exiting when there's no data available, she's ended up getting multiple single nodes17:58
greghaynesKiall: It does require some in-flight patches but ive been doing that just fine the past few days17:58
KiallOnce she added a elif == 255, the cluster came up reliably every time (apparently - I've not seen it myself)17:58
greghaynesSpamapS: Your "change it from", can you elaborate?17:59
greghaynesit being the o-r-c script running?17:59
KiallIt's the same thing we saw until I added an "exit 1" when there's no metadata, but we use the elements outside of OOO itself etc17:59
*** edmund has joined #tripleo18:00
*** morazi has quit IRC18:00
greghaynesKiall: I wonder if theres already a master running on boot with her setup?18:00
*** morazi has joined #tripleo18:01
KiallBased on the code snippets I saw today, she has your don't start mysql on initial boot thing included18:01
Kiall(2nd hand info here BTW - Probably easier to ask the source :))18:01
greghaynesheh, yep18:02
*** pelix has quit IRC18:03
jprovazngreghaynes, SpamapS: do you remember if there was a discussion how to deal with monitoring/restarting failed services? E.g. if a mysql on one node goes down, cluster will still work, but the failed node should be fixed (e.g. try to restart service or restart the whole node...)18:06
greghaynesso part of that is the heat convergence issue18:07
greghaynesanother part of that is our dopey leader election18:07
greghaynesIMO restart the node18:08
greghaynesand in this case hope its not the 'bootstrap host'18:08
*** shausy has joined #tripleo18:10
*** jdob has quit IRC18:12
jprovaznthanks, /me dives into the heat convergence spec18:12
*** jdob has joined #tripleo18:13
*** openstackgerrit has joined #tripleo18:13
greghayneshrm, I thought for Go apps you dont actually need go.. they are all self contained binaries18:16
lifelessSpamapS: I agree with you about random things; just want to note that convergence doesn't constrain the choice of linear or dag18:16
lifelessgreghaynes: I am18:17
lifelessgreghaynes: we've been on ha for a lllllong time18:17
*** jcoufal has quit IRC18:17
lifelessyolanda: never seen that before18:17
lifelessgreghaynes: I mean, I still care about trello18:18
greghaynesBadCub: ^ sounds like trello is at least somewhat still used18:18
*** shausy has quit IRC18:19
*** e0ne has joined #tripleo18:21
Kiallgreghaynes: looking at your latest os-is-bootstrap-host patchset, you added an `else echo "Refusing to bootstrap mysql cluster"`, but that won't prevent any "future" scripts running, like the mysql-migration elements migration.d script, which will startup mysql as a cluster of 118:21
greghaynesyes, I am still surprised we want to stop the show in that case18:22
*** jprovazn is now known as jprovazn_afk18:22
SpamapSgreghaynes: The issue is that you have failed to configure the system.. so things should stop18:23
SpamapSI actually missed that there was no exit 118:23
greghaynesah ok, this is why I was confused18:23
greghaynesI think failed to configure the system is harsh, we just havent configured that part yet...18:24
greghaynesthere is a case for failure where we do exit 118:24
Kiallgreghaynes: think of it this way, if all remaining scripts complete with exit 0's after mysql refuses to bootstrap, then os-collect-config is going to go ahead and ping Heat to say "This host is fully configured"18:25
*** lsmola has quit IRC18:25
KiallIt's also not going to re-run until it detects a metadata change - which may never happen18:25
*** jdob has quit IRC18:25
*** jdob has joined #tripleo18:25
lifelessgreghaynes: its basically turning off set -e18:26
lifelessgreghaynes: right ?18:26
lifelessgreghaynes: so, why do we want to do that ?18:27
greghaynesoh ok, now I understand what SpamapS was saying18:27
greghayneswhich is why I didnt like the exit 118:27
SpamapSlifeless: convergence works a heck of a lot better if we can converge branches of a dag rather than one linear process.18:28
lifelessSpamapS: warning - this may be a 'beer' conversation18:28
SpamapSKiall: Note that what I'm suggesting is we should not kill the whole configuration process if something is _missing_ from metadata, because its reappearance would trigger another o-c-c run. But We can only continue partial bits reliably if we have expressed the required config metadata sections in each script18:29
KiallSpamapS: you mentioned you think having a DAG would be better ++, but, a short term fix might be to allow the scripts to exit with a known code - say 255 - that let's os-collect-config continue with future scripts, but prevents Heat from being pinged, and ensures os-collect-config will re-run without a metadata change. Simple stopgap that may just work..18:29
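Kiall's stopgap can be sketched as a small runner loop. This is a hypothetical illustration, not the real os-refresh-config or os-collect-config logic: an exit code of 255 is treated as "not ready yet", later scripts still run, Heat is not signalled, and a re-run is requested. Any other non-zero code stops the phase. The `run_phase` name, the returned dict, and re-running after a hard failure are all assumptions.

```python
import subprocess

DEFER = 255  # hypothetical "not ready yet, carry on" exit code


def run_phase(commands, run=subprocess.call):
    """Run commands in order and report what the caller should do next."""
    deferred = False
    for cmd in commands:
        code = run(cmd)
        if code == DEFER:
            deferred = True   # keep going, but remember we are incomplete
            continue
        if code != 0:
            # Hard failure: stop here, never signal success.
            return {"signal_heat": False, "rerun": True, "failed": cmd}
    return {"signal_heat": not deferred, "rerun": deferred, "failed": None}


# Usage with a fake runner so that no real scripts are executed.
codes = {"10-ntp": 0, "50-mysql": 255, "60-migrate": 0}
print(run_phase(list(codes), run=codes.get))
```

The `run` parameter exists only so the loop can be exercised without spawning processes.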
lifelessSpamapS: but I truly think its orthogonal. I'm not arguing that we stay linear per se, but I really don't see other than parallelisation that dag affects anything18:29
SpamapSKiall: I think that code is 0, and the signal is fine, because the signal is reset when deployments are exposed to the server.18:30
KiallSpamapS: you're right actually that a re-run will happen.. But Heat will still be pinged, allowing the next phase of servers that depend on mysql (in this example) to proceed18:30
openstackgerritA change was merged to openstack/tripleo-image-elements: Name 03-mariadb files uniquely
openstackgerritA change was merged to openstack/diskimage-builder: Name 99-setup-first-boot uniquely
*** vinsh has quit IRC18:31
greghaynesI was under the impression that we basically relied on the depending service detecting if its requirements werent met, which (hopefully) gets us some of the convergence behavior18:32
greghaynesso for the fear of sending a signal to heat,  nova or something else should fail if it needs mysql18:32
greghaynesor maybe we can signal a soft fail18:36
* greghaynes stops overthinking and changes to exit 118:36
Kiallgreghaynes: doesn't that just punt the problem out to 1+ other elements? While they clearly need some of that logic (e.g. can I reach MySQL), they don't necessarily need other bits, like "Is the MySQL node I've just connected to in the middle of a bootstrap"18:36
*** jang1 has joined #tripleo18:37
Kiall(which does actually happen - percona accepts connections during the bootstrap ;))18:37
greghaynesugh, I was terrified of that second statement18:37
greghayneswe should fix that18:37
greghaynesfor the first thing - it does only need to check for can I reach mysql18:37
greghaynessimplifying the problem a lot - the next script is just going to fail because its all setup db commands18:38
greghaynesso its actually not a big deal here18:38
KiallYea, but if it can reach a mid bootstrap mysql .. :) Anyway - I've gotta run, I'll leave you with - My original patchsets build multi node percona clusters reliably every time ;)18:38
greghayneswe should never leave a mid bootstrap mysql running the way thats coded... I hope18:39
*** rpodolyaka1 has joined #tripleo18:39
KiallIn my case - We're using the elements outside TripleO - which means the next script may be something else. And, you can't necessarily count on the next script remaining the same even in TripleO18:40
greghaynesyes, I mean for adding an exit 1 - for us itll end up almost identical if we add it so may as well err on the safe side18:40
lifelessSpamapS: the signal won't be reset if we race18:40
KiallAnyway - Gotta run18:41
openstackgerritGregory Haynes proposed a change to openstack/tripleo-image-elements: Add os-is-bootstrap-host element and script
greghaynesrejoice! the exit 1 has returned!18:41
lifelessSpamapS: (I think - check my logic here :18:41
lifeless - occ starts with missing mysql deploy18:42
lifeless - 2m window while occ runs through stuff to reach the pingback, and mysql exits 018:42
lifeless - heat gets what it needs together to prep the mysql deployment and offers that to the node18:42
lifeless - the node pings in with 'success'18:42
lifelessSpamapS: how does heat tell that the success was not for the mysql thing (in the current structure where we have one success marker) ?18:43
lifelessSpamapS: I think we need to be more sophisticated with our ping-backs to do what you're suggesting18:43
lifelessSpamapS: and we'll need to not run hooks for deployments that are not ready18:44
lifelessSpamapS: oo - I have stuff in my head I haven't expressed. Quick brain dump.18:44
lifelessSpamapS: a) we need service level dependencies - an example of this is the 'don't try to use mysql until the cluster is fully initialised' issue Kiall points out above18:44
lifelessSpamapS: b) that needs to be expressed in heat because its a cluster issue not just in-instance18:45
lifelessSpamapS: c) a thought I had about implementation was to create subdirs in e.g. pre-configure.d called by the deployment name18:45
SpamapSKiall: to that point, we are misusing the signals18:45
SpamapSKiall: we should have a deployment specifically for mysql18:45
SpamapSKiall: and the downstream things should reference the deployment if they can't be started until mysql is up18:46
SpamapSKiall: though IMO, they should just try over and over until mysql is up because mysql isn't going to be up forever18:46
lifelessSpamapS: this would give you linear scripts for responding to a deployment, and no attempt to run scripts for which dependencies are not ready18:46
SpamapSlifeless: I'm in the mysql uds session and it is sort of interesting ATM .. hang on18:49
*** dshulyak_ has joined #tripleo18:50
lifelessSpamapS: the mysql alternatives ?18:50
lifelessSpamapS: actually, shoot me a link ?18:50
SpamapSlifeless: it's done now18:53
SpamapSwell except they're talking about MongoDB18:53
SpamapSgreghaynes: webscale!18:53
greghaynesoh great, I need a special ubuntu one acct18:53
clarkbdidn't ubuntu one die recently?18:55
greghaynesapparently not18:55
SpamapSnot in our hearts and minds18:56
*** rakesh_hs has quit IRC18:57
SpamapSI haven't actually uninstalled it from my 14.04 box yet18:58
lifelessSpamapS: oh, nvm then :)18:59
SpamapSlifeless: ok, now I'll read what you said.18:59
SpamapSlifeless: OK brain dump is right on and I agree on the need. Have not thought too much about in-instance implementation..19:00
SpamapSlifeless: we already have service level deps in software deployments19:00
SpamapSlifeless: one can send data back in the signal, and that data can be referenced by a downstream deployment..19:01
lifelessSpamapS: we're not expressing nearly enough, and we're not consuming it structurally19:01
lifelessSpamapS: I'm sure software deployments has the capability19:01
SlickNikCan I get some eyeballs on when folks get a chance?19:01
SpamapSlifeless: so we can do  {{result.of.mysql.configuration}}19:01
SlickNikIt fixes the DatasourceNone cloud-init issue for precise.19:01
SlickNik Thanks!19:01
SpamapSlifeless: right we only grew this ability recently, it is in our best interests to split everything up into its own config+deployment19:02
SpamapSSlickNik: +2'd19:02
*** phschwartz has quit IRC19:02
*** funzo has quit IRC19:02
*** phschwartz_ has joined #tripleo19:02
*** funzo has joined #tripleo19:03
SpamapSlifeless: on the o-r-c side, I like the idea of giving orc some visibility into configs so that it just naturally does not run whole branches until a variable is available. Thats sort of what I'm getting at with the dag.19:03
SlickNikSpamapS: Thanks much!19:03
SpamapSlifeless: that said, I think we should draw a hard line between "do something only after mysql is initialized" and "do something only if mysql is available" .. as I'm wary of trying to orchestrate around all possible states of all possible services.. I'd rather dependents be resilient in the face of an unavailable dependency19:05
*** eguz has joined #tripleo19:05
SpamapSlifeless: I saw this with upstart too.. people would try to use upstart to order boot because they thought that was the right way.. but then you move the mysql server to a dedicated host and you find that your service is incapable of coping with 5 minutes of mysql downtime and explodes violently.19:06
*** eguz has quit IRC19:06
greghaynesfor the 'cluster is initialized' case, cant we just monitor the service and not have it report deployment success until there are enough members in the cluster (or $cluster_initialized_metric)19:08
*** eghobo has quit IRC19:08
greghaynesre: lifeless's point about heat having to know about the cluster19:09
*** phschwartz_ is now known as phschwartz19:11
greghaynesIm trying to reconcile what this deployment stuff means if we, say, want to prevent services from using mysql until we have a cluster of a certain size19:11
*** e0ne has quit IRC19:13
lifelessSpamapS: agreed that the goal has to be things are resilient19:13
*** e0ne has joined #tripleo19:14
lifelessSpamapS: I think the point about mysql connections during init is that the mysql server in question isn't allowed to be in the set of machines we point requests at until its checked in19:14
lifelessSpamapS: which is different19:14
lifelessSpamapS: consider a rebuild of a node, takes (say) 3 hours. Don't want all the rest of the world using it until its finished.19:14
*** sseago has quit IRC19:14
lifelessSpamapS: thats also not what I wrote about above, it just took me a bit to click.19:14
lifelessso we need the expression 'nodes for mysql's haproxy rule' to be 'nodes that have checked in', not 'all mysql nodes'19:15
lifelessgreghaynes: ^19:15
greghaynesis heat really where we want that information to live?19:16
lifelessI might still not be synced with what Kiall is saying, but I think I'm correct in saying we want this anyway, because even if galera does transparently handle it, its going to be terribly slow while the IST is going on19:16
*** e0ne has quit IRC19:16
lifelessgreghaynes: yes19:16
lifelessgreghaynes: heat is the cluster layer thing, no ?19:16
greghaynesyep, just having to think through it a bit19:18
*** mugsie has joined #tripleo19:19
lifelessgreghaynes: so, I think this is ok in 'future work' (but the future is now :)) - but lets capture a bug about the fact there is this problem19:19
lifelessgreghaynes: and we're going to want a spec to tease out the set of broad work we need to do to pull this all together19:19
*** rpodolyaka1 has quit IRC19:19
greghayneswhat about service reports success, gets added to list, then crashes19:19
lifelessI think its ok in future work because the current setup blocks local scripts until the local mysql is initialised19:19
lifelessgreghaynes: if its offline, haproxy won't route traffic to it19:19
greghaynessure, but it also needs to get removed from the list in a timely fashion19:20
lifelessgreghaynes: if its online and nonfunctional we need to check it back out again19:20
*** noslzzp has quit IRC19:20
lifelessgreghaynes: I presume you can check cluster sync state remotely somehow? we could have an haproxy check script that checks that19:20
Kiallgreghaynes: percona clustercheck script + haproxy works great for removing dud nodes.19:21
lifelessthere is probably one out there we can use19:21
lifeless^ tada19:21
greghaynesyep, so were going to need to do that per service for heat19:21
lifelessKiall: will it remove a node that is bootstrapping ? can we use just that ?19:21
lifelessgreghaynes: yes19:21
Kiall(it's an xinetd script that serves "HTTP 200" or "HTTP 500" for HAProxy to check)19:21
Kialllifeless: yes19:21
Kiallit will also remove nodes doing SST's etc19:21
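
The health-check idea Kiall describes can be sketched roughly as follows. This is a minimal illustration, not the Percona clustercheck script itself: `health_status` is a hypothetical helper, and the 503 status is an assumption (the chat says 500; HAProxy's HTTP check treats any status outside 2xx/3xx as a failure, so either would mark the node down). The value 4 is the `wsrep_local_state` Galera reports for a synced node.

```python
SYNCED = 4  # wsrep_local_state value for a node that is fully synced


def health_status(wsrep_local_state):
    """Return the HTTP status a check endpoint would serve to HAProxy.

    Hypothetical helper: 200 only when the node is synced, otherwise a
    non-2xx status so the load balancer stops routing to the node
    (covers joining, donor/SST and similar states).
    """
    return 200 if wsrep_local_state == SYNCED else 503


# A bootstrapping (joining) node and a donor node are both kept out of rotation.
for state in (1, 2, 4):
    print(state, health_status(state))
```
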
lifelessKiall: care to throw up a patch for that ?19:22
KiallSure, I'll have a quick look tomorrow.. It should be pretty trivial to add (famous last words and all that..)19:23
lifelesshmm, patches, mysql status service on local ipv4; haproxy needs config glue for health check configuration, heat template needs to specify the script port for the mysql haproxy address19:23
*** noslzzp has joined #tripleo19:25
*** sseago has joined #tripleo19:30
greghaynesSpamapS: is going to not like me saying this, but that also essentially solves our leader election needs19:31
greghaynesif heat is keeping a list of nodes that have checked in for a service19:31
*** ohadlevy has quit IRC19:32
Kiallgreghaynes: does it? Two nodes can, at the same time, notice there is no leader and bootstrap themselves19:32
greghaynesso pick the lowest row id in the database19:33
greghaynesfor that checkin event19:33
KiallIf two percona node bootstrap, you would have to "wipe" one and re-join it to the cluster if it wasn't the lowest row id?19:34
*** ohadlevy has joined #tripleo19:34
*** ohadlevy is now known as Guest624319:34
greghaynesim having to hand wave around heat internals, but assuming it can maintain a list of nodes that have checked in for service X, and each checkin gets a row id19:34
greghayneseveryone could repeatedly checkin for starting_X, query for nodes that have checked in starting_X and the rowids, if im lowest rowid I should bootstrap19:36
greghaynesobviously dont expose internals that badly...19:36
greghaynesbut in theory19:36
*** rpodolyaka1 has joined #tripleo19:37
greghaynesbasically the same system we have now19:37
greghaynesjust not statically defined19:37
*** rpodolyaka1 has quit IRC19:42
*** morazi has quit IRC19:45
* SpamapS had to break for lunch sorry19:46
*** nati_ueno has quit IRC19:47
*** e0ne has joined #tripleo19:48
SpamapSgreghaynes: So Heat would have a list of check-ins and row numbers.. that is fine.. but the question is.. how do you know you have the most recent list?19:49
SpamapSgreghaynes: w/ an etcd you can lock the list while you act on it.19:49
SpamapSgreghaynes: which can be used for other nodes to go "try: lock_or_fail(list); except FailLock: i_am_a_slave"19:51
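
SpamapS's one-liner can be expanded into a small runnable sketch. The names `lock_or_fail` and `FailLock` come from the message above; everything else is an assumption, and a local `threading.Lock` stands in for a lock held in etcd (this is not the etcd API).

```python
import threading


class FailLock(Exception):
    """Raised when the lock is already held by someone else."""


def lock_or_fail(lock):
    # Non-blocking acquire: either we get the lock now or we give up.
    if not lock.acquire(blocking=False):
        raise FailLock()


def elect(lock):
    """Return the role a node takes after trying to grab the lock."""
    try:
        lock_or_fail(lock)
    except FailLock:
        return "slave"
    return "master"


list_lock = threading.Lock()  # stand-in for a distributed lock on the list
roles = [elect(list_lock) for _ in ("node0", "node1", "node2")]
print(roles)
```

Only the first caller gets the lock, so exactly one node takes the master role; the hard parts discussed below (lock expiry, a master that dies while holding the lock) are not modelled here.
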
hewbroccaI dunno guys19:52
hewbroccaI feel like the pattern of an external thing doing master election is fraught with peril19:52
*** morazi has joined #tripleo19:52
* hewbrocca channels Fabio19:52
*** nati_ueno has joined #tripleo19:56
*** akuznetsov has quit IRC19:56
*** e0ne has quit IRC19:59
SpamapShewbrocca: You are right, the closer leader election is to the state being managed, the less failure modes there will be.20:00
*** e0ne has joined #tripleo20:00
SpamapShewbrocca: Ideally, Galera would use its own membership protocol and quorum tracking to elect leaders or refuse to continue.20:00
hewbroccaI mean, that's what we should strive for20:00
hewbroccathen you have to figure out a way to deal with the various failure cases20:00
SpamapShewbrocca: but absent that capability in Galera, we are forced to tip-toe around and/or fence.20:01
lifelesswe should file a bug on galera saying20:01
lifeless'we'd like to be able to do zero-knowledge Just Go cluster init, where we list the nodes on every node, and it gets up and running'20:01
lifelesss/nodes/intended nodes/20:02
hewbroccaRyan O'Hara has a POC patch from a buddy of his at Percona20:02
hewbroccawhich is a Pacemaker resource for Galera20:02
greghaynesSpamapS: You dont need to know you have the most recent list, you just need to have a lock on a choice once its made, ORDER BY id LIMIT 1 should do that since id's are increasing only20:02
hewbroccaIntended to handle the above20:02
hewbroccaBut... not ready for primetime yet IIUC20:02
*** e0ne has quit IRC20:02
lifelessgreghaynes: so they aren't increasing only.20:04
lifelessgreghaynes: say you have a three node galera cluster20:04
lifelessgreghaynes: an INSERT from node A will get 1, then the next 4, then the next 7 - there's a stride == cluster size.20:04
lifelessgreghaynes: node B gets 2,5,820:04
lifelessgreghaynes: node C gets 3,6,920:04
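
The stride lifeless describes can be sketched as below. This is an illustration under the assumption that each node uses MySQL's `auto_increment_offset` / `auto_increment_increment` scheme with the increment equal to the cluster size; `node_ids` is a hypothetical helper, not a Galera API.

```python
def node_ids(offset, increment, count):
    """IDs one node hands out: offset, offset + increment, ..."""
    return [offset + i * increment for i in range(count)]


cluster_size = 3
for name, offset in (("A", 1), ("B", 2), ("C", 3)):
    print(name, node_ids(offset, cluster_size, 3))

# Node C checks in first, node A second: arrival order and id order disagree,
# so "ORDER BY id LIMIT 1" does not pick the earliest check-in.
checkins = [("C", node_ids(3, cluster_size, 1)[0]),
            ("A", node_ids(1, cluster_size, 1)[0])]
lowest = min(checkins, key=lambda c: c[1])
print("first to arrive:", checkins[0], "lowest id:", lowest)
```
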
*** dprince has quit IRC20:05
SpamapShewbrocca: pacemaker is just as fraught with danger as etcd or zookeeper.20:05
SpamapShewbrocca: same problem, different tools20:05
hewbroccacan't argue with you there... just happens to be the devil we know20:06
SpamapSlifeless: I don't believe thats true for _row numbers_20:06
SpamapSlifeless: that is for PK's20:06
SpamapSlifeless: row number is the agreed upon next ID for the entire data set.20:06
lifelessSpamapS: whats a row number ?20:06
lifelessgot a url ref? Its exceeded my plumbing-of-mysql knowledge20:07
lifelessSpamapS: but also clearly its not straight sql so we can't use it from e.g. postgresql20:08
SpamapSseqno Is the actual term IIRC20:08
SpamapS slide 2020:08
SpamapSlifeless: this is entirely galera plumbing20:08
greghaynesit does sound like this blasts past the cost benefit line of not using $other_service20:09
SpamapSso if you know for a fact that the seqno's are not advancing20:10
SpamapSthen you can do it with just Heat, because you can build a quorum from the dead nodes' seqno's20:10
SpamapSI guess loss of quorum always means seqno's won't advance..20:11
SpamapSand thats the only scenario not covered by just starting up all the galera's20:11
SpamapSgreghaynes: that sound right?20:11
*** rlandy has quit IRC20:12
SpamapSIt seems like recovering is really hard to orchestrate20:12
greghayneswhat do you mean 'just starting up all the galeras'?20:13
lifelessso I am not sure that we need to solve this in the short term20:13
lifelessat all20:13
lifelesswe definitely do medium/long20:13
lifelessbut just treating first-node-as-initial-master seems like a pretty reasonable stopgap: I mean, until heat has convergence, if that node fails to come all the way up, we'll still have STACK_FAILED20:14
SpamapS is the source of truth here20:14
SpamapSAnd I agree, we can just live with loss of quorum leads to reading that manual page, and getting your dbs up and running20:15
hewbroccalifeless: agree20:15
SpamapSlifeless: true, why am I here? I have a spec to get merged. ;)20:15
hewbroccait is the safest option anyway20:15
greghaynesIt occurred to me that a stack-update could result in a different node thinking it should be master given our current setup20:16
greghaynessince we just alphasort and pick one, and that list could change...20:16
hewbroccaOnce the cluster is up you want to make sure *nothing* thinks it's master, ever20:16
hewbroccaunless you explicitly go in and tell it so20:16
greghaynes"going to have a bad time"20:17
*** boris-42 has quit IRC20:18
*** morganfainberg has quit IRC20:18
*** boris-42 has joined #tripleo20:18
*** morganfainberg has joined #tripleo20:18
lifelessgreghaynes: so stack-update is wholly different to init20:21
lifelesslets not overengineer non-bottlenecks.20:21
lifelessThis isn't a bottleneck.20:21
greghaynesspeaking of:
greghayneslifeless: ^ please to review20:23
openstackgerritA change was merged to openstack/tripleo-image-elements: Update mysql cluster.cnf to match heat templates
openstackgerritA change was merged to openstack/tripleo-image-elements: Fall back to keystone-manage if pt-archiver isn't available
openstackgerritA change was merged to openstack/tripleo-image-elements: Fix sed regex from deleting old configs
openstackgerritA change was merged to openstack/tripleo-image-elements: Ceilometer Config element for custom pipeline
lifelessgreghaynes: passed CI yet ?20:24
openstackgerritA change was merged to openstack/tripleo-image-elements: Enable neutron.conf passthrough configuration
greghayneslifeless: no20:24
openstackgerritA change was merged to openstack/tripleo-image-elements: Configure passthrough in swift config files
greghayneslifeless: ill ping when it does20:24
*** matty_dubs|gone is now known as matty_dubs20:27
*** nati_ueno has quit IRC20:28
*** noslzzp has quit IRC20:29
*** TravT has joined #tripleo20:29
lifelessBadCub: and it begins ;)20:34
*** julim has quit IRC20:35
*** openstackgerrit has quit IRC20:35
*** openstackgerrit has joined #tripleo20:36
*** jprovazn_afk has quit IRC20:37
lifelessgreghaynes: got to it; -1 :)20:37
greghayneswell, thats one way to beat the CI20:38
SpamapSlifeless: are you driving pbr's fixes for coverage/testr, or can we let this one through until somebody does that?
SpamapSI keep having to cherry pick that fix in to run coverage :-P20:44
*** nati_ueno has joined #tripleo20:44
lifelessSpamapS: it works in pbr20:44
lifelessSpamapS: we found that the command is test not testr,20:45
lifelessat least, thats what mordred said20:45
lifelessSpamapS: there is also a patch from stevenk to register testr properly20:45
lifelessSpamapS: 715c59738e3643f579b913921e90cf3b6bfc66e3 which is in trunk20:46
lifelesslet me see about getting a release20:46
*** noslzzp has joined #tripleo20:48
*** noslzzp has joined #tripleo20:48
SpamapSlifeless: ok, so I found that the command was test too.. :-P20:49
*** noslzzp has quit IRC20:49
lifelessSpamapS: does it accept the --coverage --coverage-package-name option ?20:49
SpamapSlifeless: yes.. the patch.. that I just linked.. go look.. reconsider. :)20:50
lifelessSpamapS: oh, I totally misread the patch initially.20:51
lifelesswe should add a setup.cfg thing to define the package name though20:52
SpamapSlifeless: In a separate patch?20:52
SpamapSlifeless: and to what end?20:52
lifelessSpamapS: as a pbr feature20:54
lifelessSpamapS: so that anything which needs to know it - such as coverage and test discovery - can infer it rather than be hand maintained.20:54
SpamapSisn't module-name that?20:54
* SpamapS has never understood "package" vs. "module"20:55
lifelessI don't want to think right now :)20:55
SpamapSdon't think.. I know just what you're saying.. and I don't need your reasons.. don't tell me cause it hurts20:55
*** jtomasek has quit IRC20:55
* SpamapS un-channels Gwen Stefani20:56
openstackgerritA change was merged to openstack/diskimage-builder: Correct source-repository comments
*** noslzzp has joined #tripleo20:56
lifelessSpamapS: heh. so a dir like os_apply_config with a is a package. a file in os_apply_config is the module os_apply_config.foo20:57
*** e0ne has joined #tripleo20:57
*** jdob has quit IRC20:59
*** untriaged-bot has joined #tripleo21:00
untriaged-botUntriaged bugs so far:21:00
uvirtbotLaunchpad bug 1329238 in tripleo "OVS isn't persisting mac addresses on OVS bridges" [Undecided,In progress]21:00
*** untriaged-bot has quit IRC21:00
openstackgerritGregory Haynes proposed a change to openstack/tripleo-image-elements: Add os-is-bootstrap-host element and script
openstackgerritOpenStack Proposal Bot proposed a change to openstack/diskimage-builder: Updated from global requirements
*** noslzzp has joined #tripleo21:02
*** vinsh has joined #tripleo21:05
*** nati_ueno has quit IRC21:07
lifelessSpamapS: 86435 looks good to me now21:12
*** eghobo has joined #tripleo21:14
SpamapSlifeless: got it, thanks for the explanation, and the reviews. :)21:14
SpamapSlifeless: speaking of reviews.. where are we at in resurrecting R1?21:15
lifelessSpamapS: sec21:20
*** dshulyak_ has quit IRC21:23
*** morazi has quit IRC21:25
*** e0ne has quit IRC21:26
*** e0ne has joined #tripleo21:26
*** jang1 has quit IRC21:27
*** petertoft has joined #tripleo21:28
lifelessSpamapS: ok21:28
*** TravT has quit IRC21:31
*** e0ne has quit IRC21:31
lifelessSpamapS: should be realistically debuggable now21:34
lifeless(relevant to deploying hp1 again, since its a nontrivial cluster)21:35
lifeless2014-06-12 09:44:53.393 | Service ec2 created21:35
lifeless2014-06-12 09:44:55.624 | Authorization Failed: Unable to establish connection to
lifeless2014-06-12 09:44:55.895 | Authorization Failed: Unable to establish connection to
lifeless2014-06-12 09:44:56.123 | usage: keystone user-role-add --user <user> --role <role> [--tenant <tenant>]21:35
lifeless2014-06-12 09:44:56.123 | keystone user-role-add: error: argument --user/--user-id/--user_id: expected one argument21:35
lifelessI think thats the race nicholas has been seeing21:35
lifelesswhere we don't have signalling from the seed on ready21:35
morganfainberglifeless, whoa21:36
lifelessmorganfainberg: not a keystone issue21:37
morganfainbergi know :)21:37
morganfainbergdoes look racy though :)21:37
* morganfainberg continues lurking.21:37
SpamapSlifeless: oh lovely21:37
lifelessSpamapS: I'll report a bug21:38
*** morazi has joined #tripleo21:38
SpamapSlifeless: the whole signal handling needs an overhaul21:39
lifelesshmmm, no thats not it, but I need to file this bug anyway.21:39
SpamapSthey should probably all be individually handled by each respective deployment<->element relationship21:39
lifelessSpamapS: I've filed about the seed case21:41
uvirtbotLaunchpad bug 1329528 in tripleo "seed cloud cannot signal 'ready' - we guess at readiness and then race with os-collect-config" [High,Triaged]21:41
lifelessdo we have ntp between the hypervisors and testenvs? bet we have skew21:46
lifeless and
lifelessdon't really line up21:46
*** Lexis has quit IRC21:46
lifelessJun 12 09:39:16 overcloud-controller0-ejdwnvryudch os-collect-config[780]: + os-svc-enable -n keystone21:48
lifelesskeystone shouldn't be accessible before that21:48
lifelessand - this is the overcloud21:48
lifelessso we deployed with heat and were in STACK_READY21:48
*** Lexis has joined #tripleo21:49
lifelessJun 12 09:42:19 overcloud-controller0-ejdwnvryudch systemd[1]: Stopping keystone Service...21:49
lifelessJun 12 09:42:19 overcloud-controller0-ejdwnvryudch systemd[1]: Starting keystone Service...21:49
lifelessthe error occurs at21:50
lifeless2014-06-12 09:44:55.624 | Authorization Failed: Unable to establish connection to
lifelessJun 12 09:42:13 overcloud-controller0-ejdwnvryudch keystone-all[4199]: 2014-06-12 09:42:13.580 4199 WARNING keystone.common.wsgi [-] Could not find user, ec2.21:51
SpamapSlifeless: at one point we were setting NTP for tripleo-cd .. but maybe we never made it default for everything everywhere.21:51
*** ccrouch has quit IRC21:51
lifelessmorganfainberg: ^ that happens when we do keystone user-get21:52
lifelessmorganfainberg: having anyone in the world be able to trigger a WARNING: seems bad :)21:52
lifelessso I think ssh initialisation happened here:21:52
lifelessJun 12 09:42:12 overcloud-controller0-ejdwnvryudch keystone-all[4199]: 2014-06-12 09:42:12.827 4199 WARNING keystone.common.wsgi [-] Authorization failed. The request you have made requires authentication. from
lifelessJun 12 09:42:13 overcloud-controller0-ejdwnvryudch keystone-all[4199]: 2014-06-12 09:42:13.580 4199 WARNING keystone.common.wsgi [-] Could not find user, ec2.21:52
lifelessthats when keystone became able to authenticate requests21:53
lifelessactually no. I'm thoroughly confused here21:53
lifelessSpamapS: ^ halp :)21:53
lifelesswe successfully use keystone21:53
lifelessthen we get authentication-required errors, from nowhere21:53
lifelesswe know other requests are working because user-get is being done against heat etc21:54
lifelessmorganfainberg: ^ any thoughts on what would cause a working keystone to stop transiently ?21:54
lifelessoh, I know21:54
lifelessnova neutron heat etc on the same machine polling it21:54
lifelesstheir credentials aren't valid yet. Maybe?21:55
lifeless0.3 is the machine itself I expect, since its not the jenkins slave21:55
lifelessok, I'm going to shelve this and go back to really answering SpamapS question21:55
SpamapSlifeless: what can I halp with?22:00
*** nati_ueno has joined #tripleo22:03
BadCublifeless: you jumping into meeting?22:03
*** ccrouch has joined #tripleo22:04
lifelessoh right22:04
lifelessBadCub: coming22:04
lifelessBadCub: I need 2m with spamaps first22:04
*** ccrouch has quit IRC22:04
BadCublifeless: LOL okay22:04
lifelessSpamapS: is the etherpad as you know22:04
lifelessSpamapS: my chores for today are:22:05
lifeless - retest the bad nodes and update the tickets (see machine-information-tab etc)22:05
lifeless - turn the remaining local edits in the code on the bastion into patches in gerrit22:05
lifeless - add a jenkins job to test vlan configurations22:05
lifeless - add incubator support for vlan configurations22:05
lifeless - tie that all together22:05
lifelessSpamapS: I think priority wise we need the new CI job first, since thats going to block on infra folk reviewing it, get it up and nag  ;)22:06
lifelessSpamapS: then probably the next thing I would love it if you did would be to refresh the tickets on the 6 bad nodes22:06
SpamapSlifeless: Ok. I saw that at least one just got summarily closed because we didn't respond, even though it was not us but DC that needed to respond. :-P22:07
lifelessSpamapS: yes, so retest and reopen if its still an issue etc.22:08
SpamapSlifeless: FTR, I'd like to do less of this, and more Heat core work. But.. I get that we still don't have anybody else up to speed on ops.22:08
lifelessSpamapS: I'm not asking you to do this specifically, and tchaypo is now an admin too22:08
lifelessSpamapS: most of the work is doable locally though22:09
*** andreaf has joined #tripleo22:09
* tchaypo celebrates22:10
*** ccrouch has joined #tripleo22:12
tchaypoI have to have breakfast and a driving lesson but in about 2.5 hours I'll be asking to be pointed in the direction of useful things to do with my exciting new cd-admin rights22:13
SpamapSlifeless: cool, just need to keep the convergence train's engine full of spec-coal22:13
lifelessSpamapS: I would like to help there too22:14
lifelessSpamapS: but I can't track every iteration; could you perhaps poke me at relevant times?22:14
SpamapSlifeless: yeah. Currently getting a lot of feedback and have just recently identified that there may be need to adopt garbage-collection semantics for deleted nodes that we hadn't considered before.22:17
*** andreaf has quit IRC22:18
*** andreaf has joined #tripleo22:18
openstackgerritA change was merged to openstack/diskimage-builder: Rename old image file instead of rewrite it
*** andreaf has quit IRC22:19
*** andreaf has joined #tripleo22:20
*** giulivo has quit IRC22:20
*** morazi has quit IRC22:21
*** matty_dubs is now known as matty_dubs|gone22:21
*** morazi has joined #tripleo22:21
*** morazi has quit IRC22:21
openstackgerritGregory Haynes proposed a change to openstack/tripleo-image-elements: Extract mysql reset files from reset-db
openstackgerritpatrick-crews proposed a change to openstack-infra/tripleo-ci: Alter how we grab and store machine files to use logstash indexing.
*** noslzzp has quit IRC22:26
*** petertoft has quit IRC22:26
*** weshay has quit IRC22:28
*** noslzzp has joined #tripleo22:31
vinshgreghaynes, can you share any tripleo-ci script changes you used to fire up multiple control nodes?22:31 doesn't seem to be working its magic as I expected22:31
vinshlikely a large dose of operator error here22:32
greghaynesI think the makefile change in t-h-t for that was merged22:32
greghaynesso likely need to rebase off master22:32
*** rollerj has quit IRC22:32
vinshO i see. Thankya.22:33
vinshdangerous living on our dev master branch we have over here sometimes.22:34
vinshit should be in sync next monday I hear.22:34
greghaynesyea, it always gets a bit tricky with all these cross repo deps too22:34
greghayneshave to constantly rebase off master22:35
*** rollerj has joined #tripleo22:35
vinshbecause clouds.22:35
*** andreaf has quit IRC22:36
vinshhmm. I had made this same change locally already.. not any different22:38
openstackgerritRichard Su proposed a change to openstack/tripleo-heat-templates: Update ip_local_port_range through sysctl
vinshmust be something in my template22:38
SpamapSvinsh: two spaces is all it takes for yaml to ruin your day ;)22:40
vinshWise words :)22:40
* SpamapS decides that 25 is enough reviews for one day and goes to dip brain in ice bath22:40
vinshmargarita flavored ice we hope.22:41
vinshoh neat. it worked.22:42
greghaynesthe margarita ice?22:42
vinshthe scale ice.22:42
openstackgerritRichard Su proposed a change to openstack/tripleo-heat-templates: Update ip_local_port_range through sysctl
vinshnow to pore over all 800 lines and sort out "ERROR: Arguments to "Fn::GetAtt" must be of the form [resource_name, attribute]"22:43
*** noslzzp has quit IRC22:44
vinshits nice how heat doesn't even give you a range of lines its unhappy with.22:44
vinsh"error in cloud"22:44
vinshk thx.22:44
SpamapSvinsh: you get "in cloud" .. quit whining ;)22:45
* vinsh leaves the house plant alone then22:45
SpamapSIt may be a symptom of review-brain-melt.. but I am dying laughing here22:47
*** noslzzp has joined #tripleo22:49
SpamapSvinsh: just for you
greghaynesvinsh: not sure if there isnt an easier way, but ive definitely stuck a pdb in the /opt/stack/... heat parser before22:53
greghaynesactually, not pdb, just lots of prints22:53
vinshah, I hadn't yet thought about digging into those guts, still been thinking of that as a blackbox.22:54
vinshthar be dragons.22:54
SpamapSThere's been talk of adding the dict path to exceptions22:58
SpamapSso have the recursive resolve function keep track of the breadcrumbs and any raised error would spit out " $ERROR"22:59
greghaynesmaybe thats a good breaking into heat patch I should do...23:00
SpamapSwhich is really all you need23:00
vinshthat would be enough of a ball park to get going at least23:00
SpamapSline numbers involve deep yaml foo23:00
SpamapShave to supplant all the objects with line-number-aware objects23:00
SpamapSor parse twice, once with line number aware objects, and still do the dict path thing23:01
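
The "breadcrumbs" idea SpamapS outlines could look roughly like this. It is a sketch under stated assumptions, not Heat's actual parser: `resolve` and `ResolveError` are hypothetical names, and only the `Fn::GetAtt` shape check from vinsh's error message is modelled.

```python
class ResolveError(Exception):
    """Template error that carries the dict path where it occurred."""


def resolve(node, path=()):
    """Walk a parsed template, tracking the path of keys taken so far."""
    if isinstance(node, dict):
        if "Fn::GetAtt" in node:
            args = node["Fn::GetAtt"]
            if not (isinstance(args, list) and len(args) == 2):
                where = ".".join(path + ("Fn::GetAtt",))
                raise ResolveError(
                    '%s: Arguments to "Fn::GetAtt" must be of the form '
                    "[resource_name, attribute]" % where)
            return node
        return {k: resolve(v, path + (str(k),)) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v, path + (str(i),)) for i, v in enumerate(node)]
    return node


# A GetAtt given a bare string instead of a two-item list.
template = {"Resources": {"db": {"Properties": {
    "host": {"Fn::GetAtt": "controller0"}}}}}
try:
    resolve(template)
except ResolveError as exc:
    print(exc)
```

The error now names the offending key path instead of just the failure, which avoids the line-number-aware YAML objects mentioned above.
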
greghaynesogod, compilers course flashbacks23:01
vinshwelp, hope someone figures that out :)23:01
greghaynesvinsh: with your glance registry-host bind issue, was that an error where you just couldnt hit the glance API or was it actually causing o-r-c to fail?23:05
vinshIt was an error where would try to register an image in overcloud.. and just hit a 500.  this was because the glance-api could not reach the glance-registry23:06
vinshas the glance-api.conf was trying to reach registry at when registry was only on node-ipv423:07
vinshadded a comment to the glance-api.conf on:
* vinsh hp -> gym -> home.23:10
*** vinsh has quit IRC23:15
openstackgerritA change was merged to openstack/tripleo-incubator: Add tchaypo to tripleo-cd-admins
openstackgerritRichard Su proposed a change to openstack/tripleo-image-elements: Move rabbitmq-server cluster port
lifelessSpamapS: cool, so shout at me when you want eyeballs23:24
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Add a hp1 region configs.
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Syntax fix the keepalived docs.
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: HP1 region deploy config fixups.
openstackgerritA change was merged to openstack/diskimage-builder: Yum: support pkg-map in bin/install-packages
*** edmund has quit IRC23:42
*** ccrouch has quit IRC23:44
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Unbreak Ironic default logging.
openstackgerritlifeless proposed a change to openstack/tripleo-image-elements: Add debug and verbose log support for Ironic.
lifelessSpamapS: did you poke at some ilo ?23:49
