19:03:18 #startmeeting tripleo
19:03:18 Meeting started Tue Apr 22 19:03:18 2014 UTC and is due to finish in 60 minutes. The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:21 The meeting name has been set to 'tripleo'
19:03:41 #topic agenda
19:03:48 bugs
19:03:48 reviews
19:03:48 Projects needing releases
19:03:48 CD Cloud status
19:03:49 CI
19:03:54 Atlanta stuff
19:04:04 Open discussion
19:04:46 #topic bugs
19:05:09 #link https://bugs.launchpad.net/tripleo/
19:05:09 #link https://bugs.launchpad.net/diskimage-builder/
19:05:09 #link https://bugs.launchpad.net/os-refresh-config
19:05:10 #link https://bugs.launchpad.net/os-apply-config
19:05:10 #link https://bugs.launchpad.net/os-collect-config
19:05:12 #link https://bugs.launchpad.net/tuskar
19:05:14 #link https://bugs.launchpad.net/python-tuskarclient
19:05:21 hmm, I think we need to add os-cloud-config to that now/soon.
19:05:51 o/
19:05:53 9 criticals in tripleo :(
19:06:18 but untriaged is in better shape \o/
19:06:31 o/
19:06:45 Hooray for untriaged-bot :-)
19:07:24 so the criticals
19:07:25 i tried to set up a test-env for https://bugs.launchpad.net/neutron/+bug/1290486 today, but wasn't able to immediately repro. (i did nova boot as suggested by the reporter). will continue to poke tomorrow
19:08:07 tchaypo: did you get up at awful o'clock today? ^
19:08:55 derekh: do you need help on https://review.openstack.org/#/c/88223/ ?
19:09:06 https://bugs.launchpad.net/tripleo/+bug/1308407 is killing us daily
19:09:31 lifeless: could do with somebody who knows nodepool better than I do to confirm the problem and my approach
19:09:47 derekh: ok, I'll look
19:09:58 lifeless: but I'm pretty sure what I have reproduced locally is what is happening
19:10:10 derekh: I think your analysis is right, but did you consider just switching the order of the nodes in nodepool.yaml ?
19:11:27 so all criticals have assignees
19:11:35 anyone need help with their bug ?
19:11:40 are the assignees stale ?
19:11:52 Yes, awful o'clock. And because I'm not used to the time yet, and because I'm not at home, that actually meant waking up every 10 minutes starting at awful-1 o'clock in a panic
19:12:02 lifeless: yes, that may work, it didn't in my test but I think it may have if I made the ratio 2:1 instead of 4:1 (precise:f20)
19:13:59 ok, let's move on, since no one else is asking for help :)
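
[For reference, the nodepool.yaml change derekh and lifeless discuss above (swapping the node order and moving the precise:f20 ratio from 4:1 to 2:1) would look roughly like the sketch below. The label, image, and provider names are made up, the schema varied across nodepool versions, and allocation following file order is an assumption here, not a confirmed nodepool behaviour:]

    labels:
      - name: tripleo-precise        # listed first, so requested first if the
        image: tripleo-precise       # allocator walks labels in file order
        min-ready: 2                 # 2:1 precise:f20 rather than 4:1
        providers:
          - name: tripleo-test-cloud
      - name: tripleo-f20
        image: tripleo-f20
        min-ready: 1
        providers:
          - name: tripleo-test-cloud
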
19:14:05 #topic reviews
19:14:22 http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:14:34 Stats since the last revision without -1 or -2 :
19:14:34 Average wait time: 8 days, 16 hours, 28 minutes
19:14:34 1st quartile wait time: 4 days, 8 hours, 45 minutes
19:14:34 Median wait time: 6 days, 15 hours, 18 minutes
19:14:36 3rd quartile wait time: 13 days, 4 hours, 34 minutes
19:14:42 what was it last week ?
19:15:17 lifeless: what's the status on the config pass-through work... we are getting flooded by the various 'enable foo' config reviews
19:15:21 worse than last week :/
19:15:37 marios: ironically, waiting on reviews I believe
19:15:39 should we continue to pass these with a light touch
19:15:59 marios: Maybe those should be -2'd and revisited after the passthrough is in, to see if that addresses their need?
19:16:12 https://review.openstack.org/#/c/87843/
19:16:19 https://review.openstack.org/#/c/87844/
19:16:36 bnemec: yeah i think that may be a good idea, otherwise we'll end up with a huge number of enabled options
19:17:08 87844 failed tests
19:17:15 why can't 87843 be pushed?
19:17:17 (approved)
19:17:17 I just noticed that
19:17:21 * marios looks more closely
19:17:39 87843 needs a recent CI pass
19:17:43 looks like it can to me
19:19:01 87844 will fail until 87843 is in
19:19:07 cross-project dependency
19:19:08 * marios about to approve unless the +1s with nits have objections?
19:19:13 87843
19:19:43 there's only been one patch on trunk since
19:19:47 which was undercloud
19:19:54 so the test results should still be valid
19:20:13 done
19:21:30 sounds like we should do a pass over all the reviews, going 'plumbing - -1' for any we want to use passthrough
19:21:38 light touch, clear it out
19:21:43 or perhaps even -2
19:22:01 note that some may need more t-i-e patches to passthrough-enable additional files
19:22:43 thoughts?
19:23:05 I like -2 - makes the list of reviews to have a look at a lot smaller
19:23:18 i like the idea of -2 to make it seem more like a concerted effort to purge the queue
19:23:24 i agree with this approach too
19:23:38 +1 to -2 :-)
19:23:55 bnemec: i couldn't help but add that up in my head
19:23:57 then they can revisit once software config is done. for anything that is critical we can get it in on a per-review basis (critical/urgent)
19:24:12 weirdly, I thought we'd already done -2s to all the little config option reviews
19:24:19 but +1 to the idea
19:24:31 I've been holding off until I have something to actually point people at...
19:24:34 Ng: i kinda thought the same thing, I thought they were all in a holding pattern
19:24:50 we -2'd all the heat things that were interfering with software-config
19:24:51 which seems like it should merge momentarily
19:24:53 that's landed
19:25:00 ok
19:25:19 #note core reviewers to do a one-pass identify-plumbing-and--2-the-world
19:25:38 #topic Projects needing releases
19:26:23 any wolunteers?
19:26:37 or as we say in nz 'volunteears'?
19:27:07 tap tap tap ?
19:27:26 i'll throw slagle under the bus again
19:27:39 given he's not here to defend himself ;-)
19:27:45 lol
19:27:56 ccrouch: good guy charles crouch
19:27:58 ccrouch: hah!
19:28:13 #action slagle to debusify himself and do releases of the world.
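
[For reference, the "-2 the plumbing reviews" sweep agreed above could be applied per change via Gerrit's ssh CLI, which core reviewers already have access to. The username, change number, and message below are placeholders, a sketch rather than an agreed procedure:]

    # leave a -2 with an explanatory message on one change,patchset
    ssh -p 29418 USERNAME@review.openstack.org gerrit review \
        --code-review -2 \
        -m '"Plumbing config option; blocked pending config passthrough, please revisit once it lands."' \
        12345,1
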
19:28:24 #topic CD cloud status
19:28:36 dprince: / derekh: how's the RH region? I haven't poked at it recently
19:28:42 it's not in CI yet I presume?
19:28:52 lifeless: it should be soon...
19:28:52 lifeless: no, the patch is waiting to be merged
19:28:53 it isn't. on my list of things to do this afternoon
19:29:10 wwwwwicked
19:29:11 https://review.openstack.org/#/c/83057/
19:29:18 healthy otherwise?
19:29:32 (I know, hard to say w/out load on it)
19:29:38 lifeless: I think so. We'll see
19:29:46 The HP region failed yesterday, SpamapS caught that one.
19:29:54 lifeless: I believe so, although haven't tried it in over a week
19:29:55 going to grab lunch then push that through. hopefully it will start going in about 2 hours
19:29:59 I believe it to be fully up and happy again now
19:30:16 lifeless: The F20 jobs in particular are running slowish in general though (on the HP rack)
19:30:35 be interesting to see what the timings are on the RH rack for comparison
19:30:36 clarkb: I should be back on later if any problems pop up when you flick the switch
19:30:38 dprince: at a guess that will be mirror access performance
19:30:48 derekh: thanks
19:30:50 but we can look at the log to see
19:30:56 lifeless: yep, we could try that
19:31:22 lifeless: the devstack-gate setup stuff consistently takes 5 minutes longer on f20, it's on my list to figure out
19:31:23 derekh: what are your thoughts on the F20 slowness, will simply mirroring the RPMs closer help?
19:31:38 derekh: oh, interesting.
19:31:53 lifeless: dprince: rpm mirror/cache might help also
19:31:56 so we have a list of things we want to do local mirrors for
19:32:13 I believe that's in the list, right ?
19:32:22 lifeless: a lot of nodes seem to be going to the error state today, I took a quick look between appointments today and 3 compute nodes are having problems taking traffic (I couldn't ping them)
19:32:24 also
19:32:26 #topic CI
19:32:44 derekh: ruh roh, we might have an uptime-related bug then
19:32:47 but still reporting to nova as running, sounds like the issue we had on the controller a couple of times.
19:33:09 lifeless: I haven't dug into it much more than that
19:33:10 derekh: since SpamapS saw a couple fall over - or perhaps we don't have the mellanox driver on the non-compute nodes
19:33:45 let's get the RH region going, then perhaps take the time to finish the automated bringup work, then redeploy the RH region with trusty
19:33:57 which we know makes the hardware in that rack much happier
19:34:20 lifeless: did you mean the HP region with trusty?
19:34:26 derekh: yes
19:34:33 lifeless: sounds like a plan
19:34:41 derekh: with RH live, we won't have CI downtime in the same way
19:34:54 we'll backlog but we won't halt
19:34:59 yup
19:35:07 lifeless: why are we switching the RH region to trusty again :)
19:35:26 doh just noticed the time sorry guys
19:35:26 dprince: we're not - I'm keen to have every region's cloud be a different OS
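
[For reference on the RPM mirror/cache idea for the slow F20 jobs: to the test nodes, a rack-local mirror would just be a standard yum repo definition along these lines. The repo id and mirror URL are hypothetical:]

    # /etc/yum.repos.d/fedora-local.repo
    [fedora-local]
    name=Fedora 20 local mirror
    baseurl=http://mirror.example.com/fedora/releases/20/Everything/x86_64/os/
    enabled=1
    gpgcheck=1
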
19:35:35 #topic Atlanta stuff
19:35:41 dprince: it was a typo
19:35:49 derekh: got it :)
19:35:53 lifeless: mellanox was not part of the failure yesterday.
19:36:06 lifeless: there was a failure to load the module on the controller on boot.
19:36:07 SpamapS: ack; the panic was different?
19:36:15 SpamapS: on the hypervisors?
19:36:19 but it was actually powered off, inexplicably
19:36:35 two hypervisors were down, 1 was frozen entirely. The other had a kernel panic
19:36:42 SpamapS: derekh is saying he's seeing more hypervisors falling over
19:36:48 lifeless: how many session spots do we have for Atlanta?
19:36:54 SpamapS: with symptoms that look like the mellanox fail
19:36:55 dprince: 6
19:37:04 lifeless: well that is crap
19:37:15 well
19:37:21 my suggestion is that we update to trusty
19:37:21 whose bad side did we get on?
19:37:34 dprince: http://lists.openstack.org/pipermail/openstack-dev/2014-April/033317.html
19:37:39 dprince: that's more than Ironic
19:37:44 since it has the good version of the mellanox driver, and we need to get there anyway
19:37:49 SpamapS: indeed, mine too - see above ;)
19:37:56 oh right :)
19:38:44 https://etherpad.openstack.org/p/tripleo-icehouse-summit <-
19:39:00 I'm looking to folk to help assess the sessions
19:39:12 lifeless: I would really like to hash out the network stuff, in particular since there was so much pushback w/ the ensure-bridge refactoring.
19:39:16 PTL is meant to be an enabler and tie-breaker - sadly only the PTL can approve the sessions
19:39:35 lifeless: should we take 6 votes each?
19:39:44 This will be my first summit, so I have to admit I don't really know what makes for a good session.
19:39:45 derekh: yeah, that might be a good way
19:39:55 bnemec: ok -
19:39:56 It seems to me we should focus on things where either:
19:39:56 - we need to build basic consensus
19:39:56 - crowdsourcing is at play
19:40:00 bnemec: ^ they are key IMO
19:40:09 lifeless: Okay, thanks
19:40:15 bnemec: on the side of the person putting the session forward, they need to do prep work
19:40:24 turning up and saying 'let's chat' == poor outcome usually
19:40:33 Sure, makes sense
19:40:36 In my mind, summit sessions are places to build consensus on issues that are somewhat complex and could go in multiple directions.
19:40:55 SpamapS: exactly ++
19:40:55 for consensus stuff, having a good, well-thought-out overview and then drilling in to figure out where we're disagreeing - good
19:41:23 for crowdsourcing aspects, it's similar - have good explanations about it all, and then bits
19:41:26 They're not places to do the bulk of design work, as design by committee is not awesome.
19:41:58 Good crowdsourced things are "what are some concrete use cases for this."
19:42:08 as a for instance, SpamapS and I are going to be proposing some fairly deep and extensive changes to heat's internals, and for that I expect we'll do a couple of hours of prep beforehand, at least.
19:42:32 so that the discussion can be effective
19:43:00 a related thing I'd like for atlanta is the new specs repo to be online
19:43:10 I don't think anyone has volunteered to get that set up yet ?
19:43:59 lifeless: is it just a matter of creating a blank repo? I can get that together
19:45:03 derekh: yeah - copy the nova one, which has doc building and a template spec
19:45:15 lifeless: ok, will do
19:45:16 derekh: get it into openstack/ in gerrit
19:45:23 yup
19:45:29 #action derekh to set up the tripleo-specs repo
19:45:30 thanks!
19:45:34 np
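
[For reference, seeding tripleo-specs from nova-specs as suggested might look like the sketch below; the commands and paths are illustrative, and the repository itself would be created through an openstack-infra change rather than a plain push:]

    # start from nova-specs to inherit its doc build and template spec
    git clone https://git.openstack.org/openstack/nova-specs tripleo-specs
    cd tripleo-specs
    rm -rf specs/juno/*     # drop nova's specs, keep the sphinx/tox layout and template
    git add -A && git commit -m "Seed tripleo-specs from the nova-specs layout"
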
19:47:06 with only 6 sessions
19:47:12 I expect a fair number of double-duty ones
19:47:19 like CI might touch on several aspects.
19:47:28 the very last session overlaps the wrap-up
19:47:45 so we might use that for either super contentious stuff, or (relatively) niche... I dunno.
19:47:50 any other atlanta stuff ?
19:48:04 lifeless: note that you'll have a TripleO "project pod"
19:48:08 for extra discussions
19:48:19 ttx: yeap
19:48:22 lifeless: we don't have any conference sessions.
19:48:33 SpamapS: whew, we can all get work done :)
19:48:38 lifeless: talks rather. We might want to collaborate on a single lightning talk submission.
19:48:41 lifeless: I placed it close to the ironic pod for cross-pollination
19:49:07 ttx: hah... so near heat or nova might be better - Ironic and TripleO are very well connected
19:49:27 but we only have one pseudopod into heat, and only 3 or so into nova
19:49:39 * ttx checks the map
19:50:39 same floor but separate rooms
19:50:44 all good
19:50:49 anyway, Nova has no pod
19:50:55 since they have sessions running all the time
19:51:00 #topic open discussion
19:51:27 one q regarding pass-through patches
19:51:51 it can be used for adding an extra set of config options, right?
19:52:02 something like this: https://review.openstack.org/#/c/88105/1/elements/haproxy/os-config-applier/etc/haproxy/haproxy.cfg
19:52:09 couldn't be done with it
19:52:18 jprovazn: it's got three parts
19:52:40 jprovazn: we passthrough-enable a config file by dropping an entirely data-driven section into the moustache template
19:53:03 jprovazn: then we generate data into that section via heat, picked up from a user parameter
19:53:16 jprovazn: finally the user passes in a json struct matching this
19:53:55 so for haproxy.cfg, we'd need to passthrough-enable haproxy.cfg, and the user would have to provide a json file?
19:54:27 jprovazn: Until we figure out how to tell heat to do the merging of keys/values etc (and whatever semantics we want for that), we can't really union the user input and heat-calculated inputs
19:54:40 jprovazn: that haproxy section looks more like heat-calculated stuff to me
19:54:42 AIUI you need to make sure the app will deal with duplicated config options sanely, which is something i'm not sure about with haproxy
19:55:21 in particular for this example, haproxy doesn't have a dict model for its config file
19:55:55 so you might do something similar, but different, to make the haproxy stuff much more configurable
19:56:45 so this is kind of in the bucket of 'v2 passthrough', where we take the time to figure out all the possible use cases and design a long-term answer. IMO.
19:56:47 lifeless, well, I noticed this patch today, I was thinking that setting a default value in the heat template + accepting an optional value from metadata would be most reasonable
19:56:59 jprovazn: I think that's entirely reasonable.
19:57:15 but was not sure how far it's in conflict with pass-through, thanks for the clarification
19:57:19 jprovazn: and/or - feel free to push back a little on the patch and ask why it needs to be configurable.
19:57:47 lifeless, yes, that's my plan ;)
19:57:49 the HP folk putting forward these patches have /lots/ of production experience - they may well be able to say 'X works better', and we can just change to X.
19:58:04 jprovazn: \o/
19:58:09 good to know
19:59:36 Yeah I'd like to see us default to things that production-hardened people want.
20:00:09 thanks for coming everyone!
20:00:11 #endmeeting
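
[For reference, the three-part passthrough mechanism lifeless describes in open discussion, sketched with made-up names - the template section, heat parameter, and option keys below are illustrative, not the actual tripleo-image-elements or tripleo-heat-templates contents:]

    # 1. passthrough-enable a config file: a data-driven section in the
    #    element's moustache template, iterating whatever key/value pairs
    #    arrive in the metadata
    {{#nova.config}}
    {{key}} = {{value}}
    {{/nova.config}}

    # 2. heat generates data into that section from a user parameter:
    #      parameters:
    #        NovaPassthroughConfig: {type: json, default: []}
    #      ...
    #      metadata -> nova: {config: {get_param: NovaPassthroughConfig}}

    # 3. the user passes in a json struct matching it:
    #      {"NovaPassthroughConfig": [{"key": "ram_allocation_ratio",
    #                                  "value": "1.0"}]}

[By contrast, haproxy.cfg is nested sections with repeated directives rather than flat key = value pairs - e.g. several "server" lines inside one "listen" block - which is why, as noted at 19:55:21, there is no obvious dict model to pass through:]

    listen nova-api 0.0.0.0:8774
        balance roundrobin
        server control0 192.0.2.10:8774 check
        server control1 192.0.2.11:8774 check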