19:18:18 <derekh> #startmeeting tripleo
19:18:19 <openstack> Meeting started Tue Apr  1 19:18:18 2014 UTC and is due to finish in 60 minutes.  The chair is derekh. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:18:20 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:18:23 <openstack> The meeting name has been set to 'tripleo'
19:18:31 <bnemec> \o/
19:18:56 <derekh> basing this on last week's meeting and haven't done this before, so give me some slack
19:18:58 <jrist> o/
19:19:05 <derekh> #topic agenda
19:19:10 <derekh> bugs
19:19:13 <derekh> reviews
19:19:19 <derekh> Projects needing releases
19:19:25 <derekh> CD Cloud status
19:19:27 <derekh> CI
19:19:37 <derekh> Insert one-off agenda items here
19:19:43 <derekh> #topic bugs
19:20:24 <derekh> #link https://bugs.launchpad.net/tripleo/
19:20:25 <derekh> #link https://bugs.launchpad.net/diskimage-builder/
19:20:25 <derekh> #link https://bugs.launchpad.net/os-refresh-config
19:20:25 <derekh> #link https://bugs.launchpad.net/os-apply-config
19:20:25 <derekh> #link https://bugs.launchpad.net/os-collect-config
19:20:25 <derekh> #link https://bugs.launchpad.net/tuskar
19:20:28 <derekh> #link https://bugs.launchpad.net/python-tuskarclient
19:20:42 <derekh> let's go down through the criticals
19:20:42 <lifeless> SpamapS: thanks
19:20:53 <derekh> https://bugs.launchpad.net/tripleo/+bug/1270646
19:21:00 <derekh> anybody working on this?
19:21:10 <derekh> lifeless: you're here, wanna take over?
19:21:21 <slagle> yea, i'm "working" on that
19:21:34 <slagle> i proposed some fixes, i feel they address the bug
19:21:37 <lifeless> derekh: nope
19:21:54 <lifeless> derekh: slept through my alarm, got to make breakfast for C etc
19:22:03 <slagle> the neutron bug is unassigned though
19:22:04 <lifeless> is CI back up ?
19:22:16 <jomara> make && make install cheerios ?
19:22:26 <lifeless> jomara: :)
19:22:38 <derekh> slagle: ok, cool review time so
19:22:38 <bnemec> This is Python: pip install cheerios :-P
19:22:43 <derekh> next https://bugs.launchpad.net/tripleo/+bug/1272803
19:22:44 <jomara> lifeless: i prefer the "precompiled" variety
19:22:55 <derekh> lifeless: nearly, see https://etherpad.openstack.org/p/cloud-outage
19:23:06 <marios> so i guess no tripleo meet today
19:23:26 <marios> oops. hi, should have scrolled down
19:23:34 <derekh> I'm working on this one, no progress yet, kind of waiting on dprince's fix to ensure bridge to land first
19:23:46 <derekh> https://bugs.launchpad.net/tripleo/+bug/1272969
19:23:47 <tchaypo> marios: welcome ;)
19:23:59 <marios> tchaypo: tx :)
19:24:06 <slagle> https://bugs.launchpad.net/tripleo/+bug/1300663
19:24:13 <slagle> https://bugs.launchpad.net/tripleo/+bug/1290490
19:24:20 <slagle> unassigned criticals ^^
19:24:48 <derekh> slagle: thanks, any takers?
19:25:32 <derekh> dprince and Ng were both looking at https://bugs.launchpad.net/tripleo/+bug/1300663, I believe there is a patch pushed up
19:25:55 <Ng> https://review.openstack.org/#/c/84466/
19:26:10 <slagle> i believe https://bugs.launchpad.net/tripleo/+bug/1290490 could be bumped down to High
19:26:16 <Ng> is the simple fix with no refactoring to make dhcp-all-interfaces less bonkers
19:26:22 <derekh> Ng: will take a proper look at it tomorrow
19:26:34 <bnemec> slagle: Agreed
19:26:40 <derekh> Ng: but looked sane when I looked earlier
19:26:49 <Ng> derekh: np. it's literally just adding $INTERFACE to stop dhcp-all-interfaces going crazy :)
19:27:11 <derekh> slagle: agreed, *changed to high*
19:27:51 <slagle> yea, i left a comment too and switched to incomplete
19:27:57 <slagle> we're awaiting submitter feedback :)
19:28:36 <derekh> slagle: ok
19:28:47 <derekh> we got another critical in o-c-c https://bugs.launchpad.net/os-collect-config/+bug/1299110
19:29:06 <derekh> fix propose needs review
19:29:12 <derekh> *proposed
19:29:27 <derekh> moving on
19:29:31 <slagle> i see no other untriaged or unassigned crit bugs
19:29:41 <derekh> slagle: cool
19:29:56 <greghaynes> Seems like untriaged bot is working :)
19:29:59 <derekh> #topic reviews
19:30:09 <derekh> greghaynes: yup, it does
19:30:28 <derekh> how are we on reviews this week?
19:31:16 <derekh> my quick glance shows that the review queue is longer than last week, so commitments to do more reviews don't seem to have helped much
19:31:51 <derekh> lifeless: was also going to send out a mail to maybe increase the number of reviewers
19:31:55 <lifeless> possibly still more inputs ?
19:31:58 <lifeless> yes! I am.
19:32:00 <jistr> derekh++
19:32:01 <lifeless> cores anyhow
19:32:12 <derekh> lifeless: yup, cores sorry
19:32:17 <slagle> we have improved some from last week
19:32:23 <slagle> Stats since the last revision without -1 or -2 :
19:32:24 <slagle> Average wait time: 3 days, 19 hours, 59 minutes
19:32:34 <tchaypo> Have we fixed the untriaged nagbot to only nag about untriaged and not incomplete?
19:32:36 <slagle> that was 5 days last week
19:32:50 <lifeless> slagle: +1
19:32:54 <derekh> so let's keep the commitments up and wait for mail from lifeless
19:33:08 <lifeless> 3rd quartile wait time: 4 days, 20 hours, 50 minutes
19:33:09 <derekh> slagle: sounds good
19:33:24 <rpodolyaka1> tchaypo: it should not nag incomplete, unless the bug submitter has responded
19:33:48 <derekh> any other suggestions on reviews? things seem to be improving so let's keep it up :-)
19:34:41 <tchaypo> rpodolyaka1: ah, maybe that's what it was doing and we all just tuned them out because we saw them as not untriaged
19:34:53 <derekh> ok, moving on
19:34:55 <derekh> #topic Projects needing releases
19:35:05 <rpodolyaka1> I'm still up for this, no problems
19:35:07 <slagle> i'd like to volunteer for releases this week
19:35:14 <tchaypo> I added a link to https://wiki.openstack.org/wiki/TripleO asserting that we follow the standard tripleo reviewchecklist
19:35:19 <rpodolyaka1> slagle: go ahead :)
19:35:23 <slagle> and create the stable branches for tripleo-* as well during the release process
19:35:37 <slagle> rpodolyaka1: your wiki page has been helpful to get up to speed on the process :)
19:35:39 <tchaypo> I don't know if that's actually the case but it looked like a useful link so I thought I'd just do it and wait for people to get upset if it's wrong
19:35:39 <derekh> #action slagle to release projects
19:35:50 <slagle> i will need lifeless to add me to the tripleo-ptl group in gerrit though
19:36:10 <rpodolyaka1> slagle: cool!
19:36:27 <derekh> lifeless: is that something you can do?
19:36:43 <lifeless> hi yes!
19:37:04 <tchaypo> speaking of links on wiki pages though - maybe rpodolyaka1's page could be linked from the TripleO page as well?
19:37:09 <lifeless> slagle: have you got fully up to speed (-infra /really/ don't like mistakes in the releasing of things )
19:37:15 <rpodolyaka1> tchaypo: good point!
19:37:39 <derekh> tchaypo: sounds like a good suggestion
19:38:00 <slagle> lifeless: i've read the wiki page and understand the steps
19:38:08 <tchaypo> I'd add the link myself, if i knew where it should point ;)
19:38:17 <lifeless> adding you
19:38:20 <slagle> lifeless: beyond that, i guess i don't know what i don't know
19:38:35 <derekh> rpodolyaka1: can you point tchaypo at the link please
19:38:56 <lifeless> slagle: done
19:39:06 <rpodolyaka1> tchaypo: https://wiki.openstack.org/wiki/TripleO/ReleaseManagement
19:39:50 <tchaypo> thanks
19:39:52 <derekh> lifeless: slagle ok sounds like ye can progress on that outside of meeting
19:40:01 <derekh> moving on
19:40:06 <slagle> yea, i'll ask if anything is not clear
19:40:15 <slagle> i do not want any -infra wrath :)
19:40:20 <derekh> #topic CI cloud status
19:40:35 <derekh> so hardware failure over the weekend
19:41:03 <derekh> ci-overcloud was redeployed and we've been dealing with issues ever since
19:41:43 <SpamapS> sorry .. meatspace issues
19:41:44 <SpamapS> o/
19:41:55 <derekh> ci was up last night for a bit but the init process quickly hit its limit of file descriptors
19:42:08 <derekh> the fix for that is in dhcp-all-interfaces
19:42:19 <derekh> current status is
19:42:59 <derekh> it's still down but we made progress in the last few hours, zuul is now running jobs but they will fail because we need a new geard broker
19:43:10 <derekh> what we tried and current status is here https://etherpad.openstack.org/p/cloud-outage
19:43:18 <tchaypo> slagle: i updated the wording about stable branches as well, I'm hoping you agree with the wording (although it probably needs to change very soon, once we actually have the stable branches)
19:43:18 <derekh> anything else? questions ?
19:43:40 <slagle> tchaypo: ok, will check that out
19:44:19 <derekh> What I have put in the few lines below "Apparent Solution : neutron floatingip-delete" is what I think needs to happen next
19:44:32 <lifeless> derekh: nova thinks only 2 hypervisors are up
19:44:37 <tchaypo> derekh: no question, just a note that once things calm down I want to start getting access to and familiar with the CI infra so i can take some of the load next time this happens
19:44:42 <lifeless> derekh: I'm trying to reconcile that with your status update
19:44:51 <derekh> lifeless: the others still need the dhcp-all-interfaces update
19:44:58 <lifeless> derekh: ah! where is it
19:45:14 <derekh> we were manually poking at compute 4 5 and 6
19:46:01 <derekh> lifeless: manually change the dhcp-all-interfaces.conf upstart config to say
19:46:04 <derekh> exec /usr/local/sbin/dhcp-all-interfaces.sh $INTERFACE
19:46:18 <lifeless> tchaypo: cool, we're always looking for more admins - basically when you feel you know enough of the setup (by asking, following along, reviewing) submit yourself to the team
19:46:26 <derekh> lifeless: it's not the long-term fix but seems to be good enough to get us going again
19:48:16 <derekh> lifeless: sound ok to you?
19:48:45 <lifeless> oh wow
19:48:49 <lifeless> that's a big thinko, isn't it
19:49:07 <lifeless> novacompute4 has no dhcp-all-interfaces job ?
19:49:17 <lifeless> derekh: ok so recovery is: roll that out to all the nodes
19:49:23 <lifeless> derekh: rebuild the broker ?
19:49:26 <lifeless> derekh: profit ?
19:49:31 <derekh> lifeless: yup, that's the plan
19:49:52 <derekh> lifeless: I gotta run after this meeting so can you take over?
19:50:28 <derekh> at least that's my plan
19:50:45 <derekh> ok moving on again
19:50:47 <derekh> #topic CI
19:51:14 <derekh> ok, so when the cloud is up the jobs themselves seem to be mostly stable
19:51:22 <derekh> I've been keeping an eye on them
19:51:55 <derekh> sometimes we get failures, we need to chase those down
19:52:06 <derekh> and we also get broken by other projects
19:52:16 <slagle> agreed, we were humming along nicely there for a while
19:52:53 <derekh> we were broken last week by changes in swift and then neutron (although the neutron thing could arguably be our fault)
19:53:15 <SpamapS> Sure there are definitely things that are our fault, but that we find out after they've landed.
19:53:19 <notmyname> derekh: ?
19:53:27 <lifeless> derekh: I can and will
19:53:36 <SpamapS> Let's just stay focused, and keep treating our jobs as a gate.
19:53:45 <derekh> SpamapS: yup, good suggestion
19:53:52 <notmyname> derekh: please let me know (maybe after the meeting) how changes in swift have broken things for you. first I've heard of it
19:53:54 <derekh> notmyname: lifeless sent a mail to the list
19:54:01 <SpamapS> We'll have HA soon, and thus ci-overcloud will be less firedrill prone.
19:54:19 <derekh> notmyname: will dig details up for you after
19:54:25 <notmyname> derekh: thanks
19:54:27 <derekh> SpamapS: sounds good
19:54:46 <derekh> any other CI observations ?
19:55:59 <derekh> notmyname: a swift change made changes to permissions on the ring files, here was our fix https://review.openstack.org/#/c/83645/
19:56:10 <slagle> tuskar folks: i'm assuming you want stable icehouse branches as well?
19:56:37 <derekh> in general if people could keep an eye on http://goodsquishy.com/downloads/tripleo-jobs.html if you see 4 or 5 jobs fail in a row we have a problem
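[editor's note] The "4 or 5 jobs fail in a row" rule of thumb above can be sketched as a small check. This is only an illustration, not how the status page or any TripleO tooling works: the function names, the threshold default, and the "SUCCESS"/"FAILURE" result strings are all assumptions, and the results list is expected to be supplied by hand or by some other tool, oldest first.

```python
from typing import Iterable


def longest_failure_streak(results: Iterable[str]) -> int:
    """Return the length of the longest run of consecutive failures.

    `results` is an ordered sequence of job outcomes, oldest first.
    The "FAILURE" label is an assumed convention for this sketch.
    """
    longest = 0
    current = 0
    for outcome in results:
        if outcome == "FAILURE":
            current += 1
            longest = max(longest, current)
        else:
            # Any non-failure outcome breaks the streak.
            current = 0
    return longest


def needs_attention(results: Iterable[str], threshold: int = 4) -> bool:
    """True if at least `threshold` jobs failed in a row (hypothetical default)."""
    return longest_failure_streak(results) >= threshold


if __name__ == "__main__":
    recent = ["SUCCESS", "FAILURE", "FAILURE", "FAILURE", "FAILURE", "SUCCESS"]
    print(needs_attention(recent))
```

The threshold is a parameter because the meeting only gives a rough figure ("4 or 5"); isolated failures reset the count, so a flaky job on its own does not trip the check.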
19:57:00 <derekh> time is short so
19:57:01 <derekh> #topic open discussion
19:57:05 <slagle> tuskar folks: i'm assuming you want stable icehouse branches as well?
19:57:14 <derekh> anything people want to talk about for 3 minutes ?
19:58:08 <tchaypo> favorite cat pictures of the week?
19:58:13 <jcoufal> slagle: I'd say so
19:58:18 <jdob> slagle: lsmola and jistr would know best, but I think so
19:58:23 <notmyname> derekh: thanks. we should talk after
19:58:31 <lsmola> slagle, not sure, we kind of rely on some new heat features e.g.
19:58:47 <ccrouch> i know jprovazn: had questions about next steps for HA
19:58:49 <matty_dubs> tchaypo: What about favorite doge pics? http://www.mst.edu/
19:58:51 <slagle> lsmola: right, but those features will be in heat icehouse right?
19:58:59 * ccrouch nudges jprovazn
19:59:03 <jdob> i thought heat was done with features for icehouse
19:59:06 <SpamapS> next steps for HA is to have Heat inform nodes when they're about to be rebooted or deleted.
19:59:12 <jprovazn> yes, but probably on #tripleo channel after the meeting
19:59:14 <derekh> notmyname: I gotta run quick after this but can pick up on it tomorrow no problem
19:59:16 <SpamapS> https://review.openstack.org/#/c/81666/ will need to land
19:59:19 <lsmola> slagle, most of them in Juno I hope
19:59:27 <SpamapS> which will move us to using software config/deployment from Heat
19:59:28 <derekh> notmyname: or lifeless can fill you in
19:59:38 <slagle> lsmola: i see, let's continue in #tripleo
19:59:46 <SpamapS> then we have to write a resource plugin which will create a deployment just to tell the server that it is being rebuilt or deleted
20:00:05 * jistr switches to #tripleo
20:00:25 <derekh> SpamapS: I've been meaning to review that, will do tomorrow assuming overcloud is running
20:00:37 <ccrouch> SpamapS: any more element stuff urgently required? now we have mysql/rabbitmq/keepalived
20:00:49 <SpamapS> It is blocked by https://review.openstack.org/#/c/83614/
20:00:53 <derekh> ok time's up, let's get HA discussion moving on #tripleo
20:00:58 <greghaynes> the mysql isn't quite done but it's in the review pipeline
20:01:07 <SpamapS> -> #tripleo
20:01:11 <greghaynes> yep
20:01:20 <derekh> #endmeeting