19:00:38 #startmeeting tripleo
19:00:39 Meeting started Tue Aug 12 19:00:38 2014 UTC and is due to finish in 60 minutes. The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:41 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:43 The meeting name has been set to 'tripleo'
19:01:03 hi :)
19:01:11 #topic agenda
19:01:14 * bugs
19:01:14 * reviews
19:01:14 * Projects needing releases
19:01:14 * CD Cloud status
19:01:16 * CI
19:01:19 * Tuskar
19:01:21 * Specs
19:01:24 * Insert one-off agenda items here
19:01:26 * open discussion
19:01:29 Hi everyone :)
19:01:38 O/
19:02:07 hey TripleO's
19:02:10 o/
19:02:48 o/
19:02:54 #topic bugs
19:03:04 o/
19:03:05 #link https://bugs.launchpad.net/tripleo/
19:03:05 #link https://bugs.launchpad.net/diskimage-builder/
19:03:05 #link https://bugs.launchpad.net/os-refresh-config
19:03:05 #link https://bugs.launchpad.net/os-apply-config
19:03:07 #link https://bugs.launchpad.net/os-collect-config
19:03:09 #link https://bugs.launchpad.net/os-cloud-config
19:03:12 #link https://bugs.launchpad.net/tuskar
19:03:14 #link https://bugs.launchpad.net/python-tuskarclient
19:03:26 looking at criticals only atm - since we're swamped
19:04:08 bug 1263294
19:04:10 Launchpad bug 1263294 in tripleo "ephemeral0 of /dev/sda1 triggers 'did not find entry for sda1 in /sys/block'" [Critical,In progress] https://launchpad.net/bugs/1263294
19:04:15 bug 1316985
19:04:16 Launchpad bug 1316985 in tripleo "set -eu may spuriously break dkms module" [Critical,In progress] https://launchpad.net/bugs/1316985
19:04:19 GheRivero: ^
19:04:39 no sign of rpod :/ for bug 1317056 - I think we need to unassign it
19:04:40 Launchpad bug 1317056 in tripleo "Guest VM FS corruption after compute host reboot" [Critical,Triaged] https://launchpad.net/bugs/1317056
19:04:51 bug 1344326
19:04:53 Launchpad bug 1344326 in tripleo "Updating nodes with persistent state data is not performed gracefully" [Critical,Confirmed] https://launchpad.net/bugs/1344326
19:05:07 bug 1346424 0 dprince
19:05:08 Launchpad bug 1346424 in tripleo "Baremetal node id not supplied to driver" [Critical,In progress] https://launchpad.net/bugs/1346424
19:05:16 bug 1352336 bnemec
19:05:18 Launchpad bug 1352336 in tripleo "init-keystone fails with ConnectionRefused" [Critical,Fix committed] https://launchpad.net/bugs/1352336
19:05:25 bug 1353953 bnemec
19:05:26 Launchpad bug 1353953 in tripleo "Race between neutron-server and l3-agent" [Critical,In progress] https://launchpad.net/bugs/1353953
19:05:45 and bug 1354305 stevebaker (who will be asleep right now but I can speak to :))
19:05:47 Launchpad bug 1354305 in tripleo "tripleo-heat-templates broken on even one-month old underclouds" [Critical,In progress] https://launchpad.net/bugs/1354305
19:06:32 I want to downgrade 1344326
19:06:35 its true that its abrupt
19:06:36 1352336 is waiting on https://review.openstack.org/#/c/112367/
19:06:42 but its no more abrupt than a power failure
19:06:54 and we're in the first iteration of do-upgrade-at-all
19:07:33 1353953 should probably be downgraded to high since we have a workaround in place (although it's still happening from time to time in CI).
19:08:02 +2'd https://review.openstack.org/#/c/112367/
19:08:09 \o/
19:08:19 lifeless: Agreed on 1344326
19:08:21 Downgrading 112367 seems reasonable to me, on the basis that its the first iteration of upgrades
19:08:42 Certainly important, but it's basically new functionality, not something broken that was working.
19:08:50 s/112367/1344326/
19:09:28 downgraded
19:09:28 at a cursory glance, it looks like the ipmi driver also "reboots" nodes with a hard power off, perhaps that's the cause of 1317056
19:10:15 so rpod says he can't reproduce
19:10:26 I'll ping cian about what sort of reboot was done
19:10:32 ok
19:11:42 C/win 10
19:11:46 doh
19:11:47 * bnemec just figured out who rpod is
19:12:19 rpoliaka sorry
19:12:21 lazy fingers
19:12:52 dprince: are you around ?
19:12:58 Thank you TheJulia
19:13:05 lifeless: yes
19:13:10 bug https://launchpad.net/bugs/1346424
19:13:11 Launchpad bug 1346424 in tripleo "Baremetal node id not supplied to driver" [Critical,In progress]
19:13:31 * dprince uncloaks for a bit
19:14:04 dprince: ah, you've toggled it to fix-committed
19:14:24 dprince: there's no further action needed in tripleo right? we can close our task?
19:14:31 lifeless: yeah, we shouldn't see that anymore
19:15:01 ok - if you could close that off, thanks!
19:15:54 we need to ping GheRivero about https://bugs.launchpad.net/tripleo/+bug/1316985 - its not clear to me what remains to fix it
19:15:55 Launchpad bug 1316985 in tripleo "set -eu may spuriously break dkms module" [Critical,In progress]
19:16:42 #action lifeless to ping GheRivero about https://bugs.launchpad.net/tripleo/+bug/1316985 - its not clear to me what remains to fix it
19:17:24 ok, thats all criticals discussed
19:17:31 any other bug subject matter?
19:18:32 #topic reviews
19:18:36 I shudder to think
19:18:45 #info There's a new dashboard linked from https://wiki.openstack.org/wiki/TripleO#Review_team - look for "TripleO Inbox Dashboard"
19:18:48 #link http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:18:51 #link http://russellbryant.net/openstack-stats/tripleo-reviewers-30.txt
19:18:53 3rd quartile was 15.5 last week
19:18:54 #link http://russellbryant.net/openstack-stats/tripleo-reviewers-90.txt
19:19:03 Stats since the last revision without -1 or -2 :
19:19:03 Average wait time: 11 days, 16 hours, 20 minutes
19:19:03 1st quartile wait time: 1 days, 10 hours, 27 minutes
19:19:03 Median wait time: 7 days, 2 hours, 58 minutes
19:19:03 3rd quartile wait time: 18 days, 0 hours, 43 minutes
19:19:13 I know I haven't been pulling my weight
19:19:36 I was horribly sick though, and really am only solidly better yesterday and today. So I'm going to be going on a splurge.
19:19:43 CI issues weren't helping things either.
19:20:07 Average hasn't changed since last week, neither has the median. I still think we're being dragged out by a long tail of really old things
19:20:14 ok
19:20:24 there's also a possibility the stats don't show what we want them to show
19:20:34 I propose that we have a discussion about this particular metric on the list
19:20:49 and decide what we want to actually measure, and then we can fix the code.
19:21:01 I looked at the 30/90 day stats last time; in each there were about 14 people who were managing 3/day, so it doesn't look to me like its a problem with cores slacking off
19:21:58 e.g. should we include time the review sat there with a -1 and no reply from the author, or should we discount that in terms of aging
19:22:13 New changes in the last 30 days: 372 (12.4/day)
19:22:21 That's tough to keep up with.
19:22:35 14*3=42
19:22:46 25 core reviews per day to merge those, assuming no changes are needed.
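[The throughput arithmetic discussed above - 372 new changes in 30 days, each needing two +2 votes to land, with reviewstats averaging over calendar days while reviewers mostly work weekdays - can be sketched as a back-of-envelope calculation. This is purely illustrative and not part of reviewstats; the function names are hypothetical.]

```python
# Illustrative sketch of the review-throughput arithmetic from the meeting.
# Figures (372 changes / 30 days, two +2s per change, 2/7 weekend discount)
# come from the log; function names are hypothetical, not reviewstats code.

def required_core_reviews_per_day(new_changes_per_day, approvals_needed=2):
    """Core reviews needed per calendar day, ignoring respins and -1 churn."""
    return new_changes_per_day * approvals_needed

def weekday_adjusted_rate(calendar_rate, workdays_per_week=5):
    """Spread a per-calendar-day rate over workdays only (the 2/7 discount)."""
    return calendar_rate * 7 / workdays_per_week

incoming = 372 / 30                                 # 12.4 new changes per calendar day
needed = required_core_reviews_per_day(incoming)    # ~24.8 core reviews per calendar day
per_workday = weekday_adjusted_rate(needed)         # ~34.7 if reviewing only on weekdays
```

[The 14 cores managing 3 reviews/day mentioned above gives ~42 reviews/day of capacity, which is why the log notes this covers only about 50% of demand once respins are counted.]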
19:22:57 (which isn't true of course)
19:22:58 its about 50% of the review count being delivered
19:23:13 * greghaynes was a bit afk last week but is back now
19:23:23 bnemec, changes is for actual changes in gerrit or for patchsets?
19:23:30 The long tail has lots of reviews that have no negative reviews but aren't landing
19:23:33 It's worse than that though because reviewstats doesn't count weekends like we do.
19:23:39 yeah
19:23:43 so there are lots of changes
19:23:47 weekends? whats that?
19:23:48 a high count new per day
19:24:01 so - we want weekends counted
19:24:11 or at least a 2/7 discount on the stats
19:24:23 I noticed reviewstats has bugs with matches that return from abandonment or WIP too
19:24:31 which really makes the tail end stats not accurate
19:24:39 s/matches/patches
19:24:52 ok so I'm proposing that we a) discuss that set of things on the list, b) fix it
19:24:58 +1
19:24:59 +1
19:25:00 +1
19:25:10 +1
19:25:11 do we have a volunteer to start the discussion ?
19:25:26 I can do that
19:25:43 #action tchaypo to start reviewstats metric-bug-discussion on -dev list.
19:25:54 ok, the other reviews thing we need to discuss
19:25:58 is spec approvals
19:26:03 I think all we need to do is repeat the things we mentioned here and ask if people have ideas about what we should measure
19:26:50 lifeless: i didn't see any feedback on my mail or wiki page
19:27:02 i guess everyone was happy with it ;)
19:27:02 slagle: canhas #info on wiki page
19:27:13 or #link
19:27:16 #link https://wiki.openstack.org/wiki/TripleO/SpecReviews
19:27:17 or however we do that ;)
19:27:26 slagle: ah crud, I meant to go back and reply to that
19:27:35 slackers
19:27:43 so the thing I've been doing
19:27:47 that isn't in that wiki page
19:28:03 and which caused some concern when I did it to a couple of open specs
19:28:08 is to look at our concurrent WIP
19:28:33 there is a great discussion happening at the moment in the nova/wider openstack context about that
19:28:39 e.g. the slots proposal
19:29:00 I've been working under that basic mindset this whole time :)
19:29:23 I like the slots proposal for its clarity about what is currently being worked on vs what is pie-in-the-sky
19:29:30 lifeless: we have a fraction of the number of specs they do though
19:29:44 dprince: and a fraction of the reviewers, developers etc
19:29:48 And yet we're still way behind. :-)
19:29:52 But to me part of the rationale seems to be being limited by what they can fit into a release cycle
19:29:56 lifeless: also, we have some specs which are pretty much implemented (and the specs aren't approved)
19:29:59 Afaik we don't follow a release cycle
19:30:14 tchaypo: that is the main thing for me
19:30:20 tchaypo: we're not part of the integrated release. Thats slightly different.
19:30:31 i'm hesitant to tie ourselves to the integrated release cycle, since we aren't part of it
19:30:43 using a deadline for specs, etc
19:30:58 slagle: we're tied to it already though via the symmetric gating around things like dib
19:31:17 obviously there are dependencies, so we must be cognizant of those types of changes
19:31:18 slagle: and our backwards compat story is phrased in terms of releases and support
19:31:46 anyhow, I think the release cycle aspect is a distraction: the thing I've been doing isn't about release cycles, its about WIP
19:31:51 what we're focusing on
19:32:01 thinking out loud
19:32:22 regardless of being tied to a cycle i think it's still good to be able to distinguish between things being actively developed and things we're floating for future work, which i think is what lifeless means
19:33:34 so the heart of it is probably what those things /mean/ in an open allocation environment
19:33:42 i suspect most folks are actively developing on their specs...whether they are approved or not
19:33:47 what does 'future work' mean when someone feels that that thing is the most important thing to them
19:33:50 lifeless: trying to gauge things on some sort of WIP metric can be faulty though. I know myself for instance that many things come out of momentum (working in a specific area). If I go write specs for all these things and check back with the group to see how our WIP is doing it'll just be a lost cause
19:34:10 lifeless: it is too controlling to view things like that IMO
19:34:46 dprince: yeah, so the idea of kanban was to not control, but channel - make it really clear what our dependencies are to deliver big ticket things like 'ha'
19:35:02 lifeless: the way I see it... if you don't want to approve a spec, then fine. Don't. But don't hold one up just because you think your angle on WIP is key or something
19:35:04 dprince: and separate out that meta-planning stuff from design and code etc
19:36:37 I'm not trying to hold onto the reins or anything here
19:36:56 The goal of opening up spec +A to everyone was to get consensus on what that +A means
19:37:22 this is something I have been doing, so the question is do we keep doing it as a group
19:37:30 or is it something we want to stop doing?
19:37:45 i don't know how you're measuring WIP
19:37:51 Do we see value in trying to keep the number of concurrent things we're pushing on at once contained
19:38:01 My memory is that what we discussed at the mid-cycle was around cores who +2 a spec committing to at least reviewing related patches; it sounds to me as though if we have enough cores giving +2 to a spec (and a majority giving at least +1) it
19:38:02 lifeless: to me it doesn't always mean the same thing. It depends on the subject matter... which is why I see all of this as a guide
19:38:02 i have a hard time seeing how we could measure that accurately
19:38:05 's ready for a +A
19:38:49 slagle: indeed - so the openstack way for doing that in other projects is to say 'if its not a bugfix and its not shallow, you need a spec' - and then limit the number of specs
19:39:04 slagle: I don't particularly like that
19:39:14 lifeless: well, i think that's under quite a bit of discussion still :)
19:39:27 i don't think there's an "openstack way" yet
19:39:31 slagle: indeed! but a variation of it has been in place for years
19:39:47 slagle: back with blueprints, nova was refusing to approve blueprints without sponsors, for instance
19:40:07 slagle: and would still over-commit by a huge number - approved but not landing
19:40:30 sure
19:40:32 personally, what I want is for the things I put in my platform
19:40:35 HA and CI
19:40:45 to get concierge service
19:40:52 and everything else can come in as it wants :)
19:40:57 ^ full disclosure!
19:41:10 oh and update
19:41:18 * lifeless has a terrible memory sometimes
19:41:22 shall we codify on the wiki page that +2 to a spec means you're willing to review the patches?
19:41:42 i already kind of added that at the end
19:41:45 The fact that things are important should drive this. Sure. If things are important people will show interest and +2
19:41:56 but didn't draw a hard distinction between +1 and +2
19:42:09 dprince: one aspect there is surfacing the availability of patches that are important
19:42:40 are we proposing to do something similar wrt not landing patches unless they're bugfix/shallow/tied-to-a-spec?
19:42:45 dprince: so that folk can see them. There's a project manager for tripleo @ HP now and helping them see what we're up to is a full time job :/
19:43:05 tchaypo: I'm not proposing that
19:43:09 lifeless: the way I see it any given patch may be important, or critical
19:43:19 lifeless: I don't need a manager to tell me that
19:43:30 dprince: of course not :)
19:43:47 lifeless: and if it is associated with a SPEC that helps me understand context that is great
19:43:50 dprince: what I mean is if there are 200 open patches, how do I find the HA ones, or the CI ones, or the update ones
19:44:11 lifeless: but it likely won't change my opinion about the importance much, maybe a little, but not much
19:44:11 dprince: after I've done my 3 oldest, how do I pick up the ones that I want to accelerate
19:44:33 lifeless: so FWIW I wrote reviewday for this (to rank things)
19:44:35 dprince: (and replace the themes there with whatever you find compelling)
19:44:42 lifeless, I thought that would happen after a spec was approved
19:44:51 lifeless: so I do care about things, but there are always trump cards
19:45:21 lifeless, which is why, overall, I think we urge reviews of the specs
19:45:33 if you link to the bp in the commit message won't you see all the patches on the bp itself?
19:45:48 slagle: +1
19:45:53 maybe we need a top level bp for HA, etc, if we don't already have one
19:45:56 slagle, yeah that's what I had in mind
19:46:04 ok
19:46:05 slagle: yes, you will
19:46:09 but isn't that subordinated to the spec being approved?
19:46:32 what would be the point of a bp linking submissions related to a spec which isn't approved?
19:46:34 so I propose that +2 on a spec is a commitment to review it over-and-above the core review responsibilities
19:47:05 if its not important enough for a reviewer to do that thats a pretty strong signal
19:47:06 lifeless: +1, I thought we already agreed to that at the meetup
19:47:17 yea, sounds fine to me
19:47:20 +1
19:47:30 dprince: it wasn't clear whether it was part-of-responsibility, or additive, I'm proposing we make it clearly additive
19:47:52 and separately I think we need to make surfacing reviews-for-themes a lot better
19:47:57 but that doesn't need to hold up this
19:48:21 ok - this needs a broad vote I think, I'll table it on the -dev list
19:48:32 If linking to the BP is enough to make it show up there, I think we just need to start landing specs
19:48:44 #action lifeless to table spec approval on the dev list
19:48:46 right now, if I'm understanding, we're not getting +As and hence don't get the BPs created
19:49:07 alternatively, my proposal to create the BP as soon as the spec is created might make them easier to find even while the spec is WIP
19:49:10 give it a couple days and if there is still a majority in favor its live
19:49:10 tchaypo, +1 I'm stuck there, despite surfacing reviews being around
19:49:29 i thought the two could co-exist; my BPs and spec reviews were both out at the same time, I didnt delay the BP
19:49:38 * bnemec notes that most people here were in favor of creating the blueprints up front
19:49:46 jdob: I have a personal fear of blueprint hell
19:49:57 but most folk are in the team in LP to let them garden blueprints
19:49:59 so please do
19:50:03 is that a different ring from spec limbo hell? :)
19:50:08 if you can't, I'll HAPPILY fix that
19:50:17 jdob: yeah, specs can be fixed with git and gerrit :)
19:50:26 fair enough
19:50:33 I think that most people don't plan to fill the BP with content until the spec lands
19:50:38 ok
19:50:47 but just having it up and saying "WIP, see review" will be enough to let us track related patches
19:50:49 tchaypo: the bp shouldn't have lots of content - the spec has it
19:50:58 #topic projects needing releases
19:51:05 do we have a volunteer ?
19:51:13 and then it doesn't need to be touched until the spec lands - and if it doesn't need any real content bar a pointer to the spec, maybe not even then
19:51:37 #info I'd like to thank marios for volunteering to do this for the first time last week
19:51:47 tchaypo: to do what?
19:51:53 the releases
19:51:57 tchaypo: cool
19:52:01 it was his first time, and nothing seems to have blown up..
19:52:05 marios: up for doing it again ?
19:52:14 i believe he's sleeping right now
19:52:17 ah
19:52:23 so, that's a yes, clearly
19:52:35 i can do it
19:52:38 please, just to make it more obvious to me
19:52:43 awesome, thanks
19:52:47 #action slagle to release the world
19:52:50 i've been forcing others to do it the last few times :)
19:52:51 we're suggesting to create the BP even if the spec isn't reviewed
19:52:55 thanks slagle, I have a super busy week and vacation next week, it was gonna be rough to fit it in this week
19:53:04 we're okay with submissions tied to the BP
19:53:17 yet we don't have real expectations (time constraints) on the spec being approved?
19:53:24 BPs without an approved spec are rejected though
19:53:24 seems contradictory to me
19:53:42 okay so we're back to the original issue
19:53:46 but as long as there's a bunch of folk doing that, I'm happy ;)
19:54:04 basically this aspect is a mess - and we're going to be out of time
19:54:17 so I think we have to shelve the fine tuning of bp<->spec to next week, ok?
19:54:25 #topic CD cloud status
19:54:28 or take it to the list?
19:54:53 #info work is progressing (slowly) on hp2 - hopefully faster today as I'm avoiding codeine in favour of being able to think
19:55:00 hp1 is being stabilised at the moment, we're finding lots of little glitches e.g. in ironic, ha etc.
19:55:05 #info hp1 is being stabilised at the moment, we're finding lots of little glitches e.g. in ironic, ha etc.
19:55:28 #info hp2 is being brought up by tchaypo and Ng (and once hp1 is up, lifeless)
19:55:35 dprince: rh1 is fine, right?
19:55:42 dprince: I haven't seen any firedrills for it
19:56:11 lifeless: rh1 is a go
19:56:24 * tchaypo cuddles dependable trusty rh1
19:56:30 #info rh1 is good
19:56:31 #topic CI
19:56:50 CI has been a bit fragile but is ok atm
19:56:55 lifeless: backtracking a bit: Nova core and TripleO core both have 22 members.
19:57:20 * dprince may not always count correctly
19:57:24 anyways, close enough
19:57:31 anything else on CI? capacity etc will be future when hp1 is back
19:57:44 I don't think we're hurting on capacity at this point.
19:57:54 It's mostly just heisenbugs.
19:57:57 maybe we should add some more of the jobs?
19:58:13 (really rushing)
19:58:16 #topic tuskar
19:58:19 anything for this^ ?
19:58:25 not from my end
19:58:26 #topic specs
19:58:27 lifeless: I'd prefer to hold the line
19:58:30 Any specs to discuss?
19:58:39 lifeless: some level of capacity is appreciated I think
19:58:45 a buffer
19:58:49 dprince: ack, we need better stats on the buffer size
19:58:54 dprince: rather than 'oops, we passed it'
19:58:56 Also, more jobs = more bugs
19:59:02 lifeless, yeah well in relation to HA there's the cinder stuff
19:59:10 gfidente: you have 30s
19:59:11 I have surfacing submissions too, yet not tied to any BP
19:59:31 so feedback over both surfacing submissions AND the spec are appreciated
19:59:39 it's definitely an important problem to solve in an HA scenario
19:59:54 unless we expect volumes to disappear :P
20:00:03 surfacing submissions?
20:00:17 and there's the bell... thanks everyone
20:00:22 #link https://review.openstack.org/101237 Introduce support for Cinder HA via shared storage
20:00:26 https://review.openstack.org/#/c/113300/ and https://review.openstack.org/#/c/112051/
20:00:38 #endmeeting