19:02:19 #startmeeting tripleo
19:02:19 Meeting started Tue Feb 25 19:02:19 2014 UTC and is due to finish in 60 minutes. The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:21 o/
19:02:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:23 The meeting name has been set to 'tripleo'
19:02:36 good morning everyone
19:02:44 hi
19:02:48 o/
19:02:50 hello
19:02:53 Morning
19:02:56 ahoy
19:03:02 o/
19:03:14 good morning
19:03:16 morning
19:03:35 #topic agenda
19:03:36 hello
19:03:41 bugs
19:03:41 reviews
19:03:41 Projects needing releases
19:03:41 CD Cloud status
19:03:41 CI virtualized testing progress
19:03:43 Insert one-off agenda items here
19:03:46 open discussion
19:03:48 #topic bugs
19:03:52 #link https://bugs.launchpad.net/tripleo/
19:03:52 #link https://bugs.launchpad.net/diskimage-builder/
19:03:52 #link https://bugs.launchpad.net/os-refresh-config
19:03:54 #link https://bugs.launchpad.net/os-apply-config
19:03:57 #link https://bugs.launchpad.net/os-collect-config
19:03:59 #link https://bugs.launchpad.net/tuskar
19:04:02 #link https://bugs.launchpad.net/tuskar-ui
19:04:04 #link https://bugs.launchpad.net/python-tuskarclient
19:04:07 hmm, should remove -ui from there, its now a horizon problem ;)
19:04:22 :-)
19:05:06 ok, so lets see
19:05:16 we're drowning in criticals
19:05:58 https://bugs.launchpad.net/tripleo/+bug/1270646
19:06:02 https://bugs.launchpad.net/tripleo/+bug/1271344
19:06:09 https://bugs.launchpad.net/tripleo/+bug/1272803
19:06:14 https://bugs.launchpad.net/tripleo/+bug/1272969
19:06:20 https://bugs.launchpad.net/tripleo/+bug/1278861
19:06:25 https://bugs.launchpad.net/tripleo/+bug/1280941
19:06:30 https://bugs.launchpad.net/tripleo/+bug/1283921
19:06:35 https://bugs.launchpad.net/tripleo/+bug/1284054
19:06:45 * SpamapS opens new window and starts ctrl-clicking
19:06:55 and we've got untriaged!
19:07:03 https://bugs.launchpad.net/tripleo/+bug/1277168
19:07:08 https://bugs.launchpad.net/tripleo/+bug/1279537
19:07:13 https://bugs.launchpad.net/tripleo/+bug/1281174
19:07:17 https://bugs.launchpad.net/tripleo/+bug/1281702
19:07:22 https://bugs.launchpad.net/tripleo/+bug/1281705
19:07:26 https://bugs.launchpad.net/tripleo/+bug/1281719
19:07:31 https://bugs.launchpad.net/tripleo/+bug/1281977
19:07:35 https://bugs.launchpad.net/tripleo/+bug/1284242
19:07:52 https://bugs.launchpad.net/tripleo/+bug/1272605
19:07:55 much wow
19:07:56 o/ hey guys
19:09:30 jcoufal: ola!
19:10:02 ok so untriaged: I think we need to all jointly commit to doing 1hr of bug triage a week
19:10:13 and this will be all sorted next week with no stress
19:10:28 can anyone *not* make such a commitment ?
19:10:30 well I just +A'd the fix for 1284054
19:10:34 bug 1284054 I should say
19:12:15 lifeless: +1 for the bug triage. I've always found it quite useful to have a specific day in the week that is "my bug triage day" so that I just know when that 1 hour will happen.
19:12:59 fine
19:12:59 I can do that, but I'm going to need some hand-holding
19:13:16 i have e-mail notifications on tuskar and tuskarclient bugs and usually i just triage as they come
19:13:28 My bug triage hour is usually right now. ;-)
19:13:52 tchaypo: As Stevie Wonder said.. "Do allll that you caaaan"
19:14:17 trying to do that, at least on tuskar :-)
19:14:25 jistr: so that good, but like review please share the load with the rest of the team
19:14:46 jistr: so put aside time after a coffee break one day and look at all the tripleo projects for untriaged bugs
19:15:03 untriaged == no priority and status != triaged
19:15:21 [ignoring in-progress and fix committed of course]
19:15:25 thats a good idea for the e-mail notification, I didn't realize it supported that
19:16:18 lifeless: is there no way to generate a single query?
19:16:32 ccrouch: we have canned queries on the wiki page I believe
19:16:37 lifeless: i don't have 100% confidence in my ability to correctly triage bugs on non-tuskar projects, so that's why i don't do it much, but i'll try
19:16:39 ccrouch: but no, not across multiple projects
19:16:46 ah ok
19:17:04 Can only core do full triage? I noticed I can't set some of the values
19:17:04 jistr: can you tell 'omg this shouldn't happen' from 'it might be nice if?' :)
19:17:08 jistr: thats really all it takes
19:17:16 yeah i hope so ;)
19:17:20 d0ugal: there is a team on launchpad - ~tripleo - request membership of that
19:17:23 tchaypo: ^ you too
19:17:32 lifeless: aha, will do. Thanks.
19:17:34 do we try to reproduce or confirm the problem during triage?
19:17:51 tchaypo: use your own judgement
19:17:55 lifeless: I think we still have some people with membership pending, fyi
19:18:03 I will check post meeting
19:18:06 so these criticals
19:18:31 Team request made
19:18:51 the PMTU one I think we probably need to put a ethtool -K gro off in automatically, drop the bug to high and pursue out of band
19:19:10 the leases one neutron are pursuing, but the undefined priority there makes me cry
19:19:30 https://bugs.launchpad.net/tripleo/+bug/1272803 is low hanging fruit for someone interested
19:19:45 just make a new internal interface rather than reusing the mac of the bridge
19:20:11 https://bugs.launchpad.net/tripleo/+bug/1272969 dprince_'s work will solve
19:20:41 jistr: https://bugs.launchpad.net/tripleo/+bug/1278861 - any additional news on that ?
19:21:24 lifeless: not beyond what Dmitri posted...
i'm reprovisioning my lab machine now so i'll try to run as non-root and we'll see
19:22:00 jistr: I think we should downgrade this - CI is working [there's a different failure right now but we had three-green-bars yesterday]
19:22:05 jistr: so its not systemic
19:22:09 lifeless: +1
19:22:29 https://bugs.launchpad.net/tripleo/+bug/1280941 we need to back out our workaround and then its done
19:22:51 https://bugs.launchpad.net/tripleo/+bug/1283921 I think we landed my proposed patch
19:23:03 but we haven't followed up to see if the life cycle is managed properly
19:23:11 I'm inclined to close it and wait and see
19:23:21 we know we need to do more work on migrations as we finish the HA arc
19:23:25 thoughts?
19:24:39 ok, silence == assent ;P
19:24:55 and SpamapS says he +A'd https://bugs.launchpad.net/tripleo/+bug/1284054
19:25:30 ok, any other bug stuff?
19:25:31 lifeless: it had a Partial-Bug tag.. I think we should drop it to High since this workaround mitigates the impact
19:25:44 SpamapS: # ?
19:25:52 bug #1284054
19:25:59 We still go too slow.
19:26:01 #action all tripleo devs to do 1hr bug triage a week, random offsets.
19:26:02 We just wait longer now
19:26:31 sure
19:26:46 though I suspect folk will perf optimise independently
19:26:51 I'd be inclined to just close it
19:26:58 up to you!
19:27:09 any other bug stuff to discuss?
19:28:16 ok
19:28:25 #topic reviews
19:28:59 I'm going to resume my monthly reviewer summaries now that we're all well and truly back from holidays
19:29:08 current status
19:29:10 http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:29:21 Stats since the last revision without -1 or -2 :
19:29:21 Average wait time: 4 days, 1 hours, 33 minutes
19:29:30 Median wait time: 3 days, 19 hours, 16 minutes
19:29:49 we're letting ourselves down - thats a long time to be waiting for feedback on a branch
19:30:28 lifeless: I've seen a lot more operational focus lately, I don't think we've been focused on reviews.
19:30:28 longest review is up at 6 days
19:30:47 there are 9 people in cd-admins, and ~20 in the review team.
19:31:04 SpamapS: so while cd-admins can be affected by that, it doesn't cover the whole team.
19:31:14 SpamapS: not by a long shot.
19:31:32 so reviewers are just not doing enough reviews
19:31:46 I'd say so
19:31:54 reviews are how we scale development bandwidth
19:32:00 actually yeah.. I'm at the top for 30 days.. and I KNOW I haven't been doing enough
19:32:10 they are more critical than bug fixing
19:32:18 because they actually *deliver the code to users*
19:32:38 I'm not so much seeing bug fixing as testing and using.
19:32:50 the non admins.. I don't know whats up with that.
19:32:51 I'm not interested in singling individuals out as not reviewing enough - this is a team wide challenge.
19:32:55 the CI jobs are slowing reviews down
19:32:59 but admins are also the most active reviewers
19:33:10 not that we shouldn't be doing CI, but it's a data point to consider
19:33:12 slagle: you can still -1 or +2 based on the code.
19:33:20 slagle: the only thing CI affects is +A.
19:33:45 ok, i've been -1'ing still
19:33:58 but, i've been holding off on the +2 as well
19:34:07 it doesn't "look good to me" if jenkins hasn't run yet
19:34:28 of the 4 oldest reviews 3 have on one +2, the last has +A but depends on a patch with two -1s from non-core
19:34:36 but, if the pattern is to still +2, i can start doing so
19:34:47 both -1's which the author (me) disagreed with.
19:35:08 I think cores need to review things with -1's on them, at least far enough to detect this sort of thing
19:35:32 With the pipeline length we have... CI would have to be DAYS behind before it blocks a reviewer.
19:35:48 slagle: Here's my thought on the +2 thing - say you upload something, and two people +2 it then CI passes, I think the author could +A reasonably.
19:35:51 I review everything, even if it has -2's
19:35:54 slagle: what do you think ?
19:36:25 wfm
19:36:29 slagle: [without invoking any of our special clauses about jointly edited patches, CD features etc]
19:36:32 ok
19:36:35 so
19:36:48 #action pick up the game on reviews everyone. EVERYONE.
19:37:13 #info +2 is ok even when CI hasn't checked in yet.
19:37:19 * SpamapS winces as the whip cracks
19:37:37 SpamapS: sadface, I was trying to avoid that framing of the problem.
19:37:43 haha I'm kidding. :)
19:37:48 seriously, its a jointly affected, jointly solved issue
19:37:52 we have to want it.
19:38:04 Not reviewing means your own patches sit longer.
19:38:06 I get it. :)
19:38:10 #topic Projects needing releases
19:38:24 I can take. not a problem
19:38:36 awesome
19:38:43 #topic CD Cloud status
19:38:55 dprince_: RH region status?
19:39:33 lifeless: still fleshing out NW access. More good progress today. I expect full access for everyone this week I think
19:39:40 awesome
19:39:45 HP region status:
19:40:01 we've got a bad node running the ci overcloud, which is a problem
19:40:17 may be as simple as a BIOS firmware upgrade though - firedrill card open in trello
19:41:05 we have about 25% capacity bad in some way, an HP tripleo-cd-admin needs to take ownership of going through all the machines listed bad, opening JIRA tickets and shepherding them through
19:41:32 JIRA my old friend, we meet again
19:41:34 ng and spamaps have done a chunk of work here but its not finished
19:42:08 I'd like to say any HP person, but reality is that without access to the machines to test etc it would be very hard.
19:42:31 the cd-undercloud network card seems to be glitching again; our old friend mellanox
19:42:36 thats about it
19:42:42 #topic CI
19:43:22 so we've had an exciting week
19:43:26 we're back with check jobs
19:43:29 and yesterday they were all passing
19:43:30 OMG
19:43:43 \o/
19:43:53 one thing that became clear last week, I want to reinforce here
19:44:23 tripleo-cd-admins - this is a production quality admin team: we're on the hook for treating CD-undercloud, CD-overcloud and CI-overcloud failures as production failures
19:44:29 we've got enough folk to do follow the sun
19:44:40 its a volunteer team
19:44:42 but
19:44:44 IMO
19:45:26 if you're in the team, it requires commitment - specifically if it breaks (e.g. if infra go 'wtf this region is down') we need to drop anything else and fix it
19:45:46 if that happens a lot, its in our power to correct it (e.g. more HA, take servers out of rotation, fix bugs in the code)
19:46:13 So - I'm going to call a vote of the cd-admins folk here: does this all make sense
19:46:37 remember - tripleo-cloud/tripleo-cd-admins in incubator is the list of admins
19:46:44 lifeless: yes
19:47:02 #vote does cd-admins membership imply production quality respones to everyone?
19:47:16 erm, I hope that calls a vote :P
19:47:25 +1
19:47:26 #startvote does cd-admins membership imply production quality respones to everyone?
19:47:26 ?
19:47:27 Begin voting on: does cd-admins membership imply production quality respones to everyone? Valid vote options are Yes, No.
19:47:29 Vote using '#vote OPTION'. Only your last vote counts.
19:47:33 Yes
19:47:34 #vote Yes
19:47:36 ah
19:47:37 #vote yes
19:47:38 #vote Yes
19:47:39 #vote Yes
19:47:46 greghaynes: you're not an admin, but thanks :P
19:47:46 lifeless: +1 (when I'm online I'll help)
19:47:48 :p
19:47:54 lifeless: I'm happy to volunteer for the team, but I don't think I can be very useful just yet
19:48:23 i'm going to ask an embarrassing question though
19:48:24 tchaypo: read tripleo-cloud/README.md
19:48:35 where are the ci clouds documented?
19:48:47 we have TripleOCloud on the wiki
19:48:53 that pertains only to the CD cloud does it not?
19:49:11 slagle: tripleo-cloud/README.md + https://wiki.openstack.org/wiki/TripleO/TripleOCloud (linked from the README) + https://wiki.openstack.org/wiki/TripleO/TripleOCloud/Regions (linked from the first wiki page)
19:49:27 slagle: we probably need more CI-overcloud docs!
19:49:37 slagle: plus the admin spreadsheet with network ranges, passwords etc.
19:49:45 #endvote
19:49:46 Voted on "does cd-admins membership imply production quality respones to everyone?" Results are
19:49:52 ah, ok the spreadsheet
19:50:01 b/c i dont see any ci hostnames on that wiki page
19:50:04 slagle: the spreadsheet is linked from the wiki pages
19:50:27 got it, will check it out
19:51:17 dprince_: ok, so on contacting us. I'd like to explore us sharing phone numbers to permit follow-the-sun handoffs even if folk are offline (e.g. I'm not going to keep poking at servers at 11pm when e.g.
derekh or ng are awake and much more compos mentis
19:51:39 dprince_: I'll send mail to the list for that though, I think it needs everyone to be involved - consensus discussion
19:51:57 #action lifeless to mail list about tripleo-cd-admins vote + contact-options topic
19:52:18 #topic open discussion
19:53:12 I'm hoping to finagle a vpn token and cert from the office at rhodes today, at which point I'll be able to read my work email, which will be exciting
19:54:57 lifeless: i didn't catch the discussion but i heard a rumor that you picked up extraction of overcloud init from devtest into a separate library that tuskar api could use. Is that right?
19:55:03 tchaypo: woohoo :)
19:55:23 jistr: I've committed to bootstrapping that this week yes
19:55:49 jistr: long as the clouds we're running stay up
19:56:01 does anyone know how multi region keystone is meant to work
19:56:02 like
19:56:17 do we run two keystones each with the other's services registered and round-robin DNS ?
19:56:31 or do we run one globally distributed keystone ?
19:57:06 I mean - we'll have two underclouds soon, which should be separate but the overcloud should present a single multi region cloud to user, no ?
19:57:19 lifeless: I had always thought it was achieved using shared users/catalogs, but not tokens basically.
19:57:20 lifeless: re the general stuff: sounds good, thanks. I just wanted to check on a rough ETA. (re keystone i don't know :) )
19:58:11 lifeless: is there a kickoff of sorts on monday morning for the meetup?
19:58:16 jistr: I'm going to get a separate tree together, move stuff across into it untangling deps as needed, make it pip installable, and then say 'here, add what you need on top into this thing' :)
19:58:32 jdob: yes, on monday we run around going wtf are we going to be.
19:58:37 lifeless: cool :)
19:58:45 awesome. what time do the festivities begin?
19:59:11 jdob: 0900 :P
19:59:25 hopefully y!
will have some news, or cody-somerville may pull a wabbit out of a hat
19:59:27 Oh, and do we just show up at the HP office and say "We're here for the TripleO meeting?"
19:59:34 it appears we might not have a room monday
19:59:43 sounds good. and once again let me thank you for saving me from the miserable NJ winter and giving me cause to go to Sunnyvale
19:59:48 lol
19:59:51 LOL, jdob++
20:00:18 matty_dubs: you could do that, but you'd be in the wrong place
20:00:22 it's at the Yahoo office
20:00:26 :)
20:00:47 slagle: I guess I will just follow you Monday morning. ;)
20:00:56 matty_dubs: If you're at the Wild Palms, you can just follow the nerd herd onto the shuttles ;)
20:01:17 Ooh, I am. Sounds like a plan!
20:01:23 #endmeeting
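
Note on the triage rule from the bugs topic: the definition given at 19:15:03 ("untriaged == no priority and status != triaged", ignoring in-progress and fix committed) can be written down as a small filter. This is only a sketch under assumptions. The Bug record, the "Undecided" value standing in for "no priority", the exact status strings, and the sample bug numbers are all hypothetical and are not taken from Launchpad or from the meeting.

```python
from dataclasses import dataclass

# Statuses the meeting said to ignore when hunting for untriaged bugs.
# The exact strings are an assumption for illustration.
IGNORED_STATUSES = {"In Progress", "Fix Committed"}

# Assumed stand-in for "no priority set".
NO_PRIORITY = "Undecided"


@dataclass
class Bug:
    """Hypothetical, minimal bug record (not a real Launchpad object)."""
    number: int
    priority: str
    status: str


def is_untriaged(bug: Bug) -> bool:
    """untriaged == no priority and status != triaged, skipping ignored statuses."""
    if bug.status in IGNORED_STATUSES:
        return False
    return bug.priority == NO_PRIORITY and bug.status != "Triaged"


def untriaged(bugs):
    """Return the bug numbers that still need triage, in input order."""
    return [b.number for b in bugs if is_untriaged(b)]


# Made-up sample data, one list per project, to mimic a cross-project sweep.
SAMPLE = [
    Bug(1, "Undecided", "New"),            # needs triage
    Bug(2, "Critical", "Triaged"),         # already triaged
    Bug(3, "Undecided", "In Progress"),    # ignored: someone is working on it
    Bug(4, "Undecided", "Fix Committed"),  # ignored: fix has landed
    Bug(5, "High", "New"),                 # has a priority, so not untriaged
    Bug(6, "Undecided", "Confirmed"),      # needs triage
]

if __name__ == "__main__":
    print(untriaged(SAMPLE))
```

Since there is no single query across multiple projects, one would feed this the concatenated bug lists of each project; how those lists are fetched is left out here on purpose.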