20:01:08 #startmeeting tripleo
20:01:09 Meeting started Mon Jul 22 20:01:08 2013 UTC. The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:01:10 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:01:13 The meeting name has been set to 'tripleo'
20:01:13 #topic agenda
20:01:20 bugs
20:01:21 Grizzly test rack status
20:01:21 CI virtualized testing progress
20:01:21 open discussion
20:01:56 #topic bugs
20:02:06 https://bugs.launchpad.net/tripleo/
20:02:06 https://bugs.launchpad.net/diskimage-builder/
20:02:06 https://bugs.launchpad.net/os-refresh-config
20:02:06 https://bugs.launchpad.net/os-apply-config
20:02:08 https://bugs.launchpad.net/os-collect-config
20:03:41 o/
20:04:00 o/
20:04:03 https://bugs.launchpad.net/tripleo/+bug/1182249
20:04:35 lifeless: _almost_ ready to tackle that
20:04:42 https://bugs.launchpad.net/tripleo/+bug/1183223
20:04:47 https://bugs.launchpad.net/tripleo/+bug/1184484
20:04:49 lifeless: once we swap in os-collect-config, should be able to realistically address it.
20:04:54 https://bugs.launchpad.net/tripleo/+bug/1189385
20:04:59 https://bugs.launchpad.net/tripleo/+bug/1200201
20:05:03 https://bugs.launchpad.net/tripleo/+bug/1201580
20:05:09 https://bugs.launchpad.net/tripleo/+bug/1201581
20:05:14 https://bugs.launchpad.net/tripleo/+bug/1201584
20:05:19 https://bugs.launchpad.net/tripleo/+bug/1202322
20:05:29 https://bugs.launchpad.net/diskimage-builder/+bug/1202612
20:05:32 will be working on 1184484 this week
20:05:40 wheee we have a bunch of crits ;)
20:06:25 did we lose the bug bot?
20:06:35 bug 1202322
20:06:39 been caught up in rootwrap nova-network entrypoints land
20:06:41 appears so
20:07:54 ok so
20:08:04 the dib one is kinda worrying
20:08:13 since it's -really- harsh when it happens
20:08:37 lifeless: https://bugs.launchpad.net/tripleo/+bug/1202322 just need to land the 2 reviews on os-collect-config and then try devtest with os-collect-config instead of heat-cfntools
20:08:40 lifeless: this happened me quite a few times last week, but not once today ....
20:09:01 SpamapS: Well, let's get that done :>
20:09:13 lifeless: If it happens again will try and track it down
20:09:25 lifeless: yeah, it's what I was working on, when the meeting started :)
20:09:34 SpamapS: 1201581 - do we need new tenant instances ?
20:09:38 SpamapS: or just server side ?
20:09:52 lifeless: also for the di-b bug.. we can fix it by using rm --one-file-system
20:10:22 SpamapS: you think the bind mounted dev is still in place when we rm ?
20:10:27 SpamapS: say so in the bug :)
20:10:36 lifeless: I think we can just deploy a newer keystoneclient in keystone's venv.
20:11:36 bug 1201580 is going to be dependent on the precious fs movement stuff
20:11:45 plus in-instance upgrade hacks
20:12:19 jog0: we will need your nova expertise at some point, we have this crazy idea about updating the boot ramdisk + kernel for ari+aki using flavors
20:12:30 jog0: (and making that work and push out to existing instances)
20:12:41 s/flavors/images/
20:12:43 lifeless: is this the rsync based thing?
20:12:54 jog0: not directly, but tied into it.
20:13:04 lifeless: I was just thinking that we could push out a git-tree-puller and a 'pip install -U''er and an 'apt-get upgrade''er
20:13:16 SpamapS: ah, so a hack :)
20:13:19 lifeless: as ghetto and stinky as possible
20:13:24 right
20:13:26 lifeless: have a link to this crazy idea?
20:14:03 jog0: no, I put a bug and etherpad up about the issue
20:14:06 *will*
20:14:31 lifeless: cool
20:15:20 #action lifeless to ensure we have bugs surrounding the in-instance upgrade path and new ramdisks/kernels
20:15:51 jog0: actually what we'd like to do is to rebase an instance onto a new kernel/ramdisk and disk image - but not reboot it - trust it will redo the image contents itself.
20:16:00 but yeah, will write that up
20:16:38 interesting, thanks. don't fully grok how that works, so a writeup would be great
20:16:44 SpamapS: - https://bugs.launchpad.net/tripleo/+bug/1200201 - still exists ?
20:17:29 lifeless: I haven't verified it is closed yet.
20:17:35 lifeless: forgot to tag it in ORC-REFACTOR
20:17:38 kk
20:18:10 https://bugs.launchpad.net/tripleo/+bug/1189385 is still pending something; we haven't seen reproduction in a while now.
20:18:13 * SpamapS assigns self
20:18:24 and I still owe https://bugs.launchpad.net/tripleo/+bug/1184484 some config extraction
20:18:33 ok
20:18:45 any pet bugs folk want to chat about ?
20:19:35 ok
20:19:46 #topic grizzly test rack status
20:19:53 so this is ticking along
20:20:06 I found the network node services had no upstart jobs yesterday
20:20:09 this had everything down
20:20:13 and I have NFI how/why..
20:20:35 I added them using os-svc-install
20:20:37 but sheesh.
20:20:45 I think the thing is a little rickety and concerning. We really do need at least a tiny subset rack to be able to CD to so we don't have a dead duck.
20:22:16 A huge portion of what was done in the POC has been rewritten and refactored a lot since then.. no idea if it would apply to that rack now. :-P
20:22:55 so there are spare machines
20:23:24 someone needs to grab the hw list and examine the machines that are faulty and try recovery
20:23:31 I can offer some offline hints about that
20:23:43 also I have a list of 8 or so other machines that were earmarked for monty and are idle
20:23:51 again, someone with time needs to JFDI
20:24:19 time + access ;)
20:24:29 so, access - good point.
20:25:17 This is HP hardware in a production datacentre; I don't have the authority to give control plane access to the cloud to non-HP staff, *but* any HP staff involved in tripleo should be totally fine.
20:25:51 # action HP tripleoers If you don't have access to the POC rack control plane. ping me/ng/spamaps - all of us should be able to add you.
20:25:55 #action HP tripleoers If you don't have access to the POC rack control plane. ping me/ng/spamaps - all of us should be able to add you.
20:26:17 huh, failbot ?
20:26:25 #action HP-tripleoers If you don't have access to the POC rack control plane. ping me/ng/spamaps - all of us should be able to add you.
20:26:30 NFI....
20:26:57 hey the bug bot is back :)
20:27:05 heh
20:27:14 so - we need to action the criticals around the rack
20:27:19 but we talked about that
20:27:25 so - next topic time ?
20:27:35 are all of the criticals assigned?
20:27:45 or at least, the blocking criticals?
20:27:51 (may be ordering issues..)
20:28:41 no
20:28:43 they are not
20:29:27 Ok well I think we can address them as criticals and just attack them one by one.
20:29:39 yup
20:29:41 #topic CI virtualized testing progress
20:29:48 ok, that works. da fuq
20:29:55 pleia2: oh hai.
20:30:03 pleia2: I suspect you're going to say 'nochange' :>
20:30:26 yeah, at oscon this week
20:31:08 #topic open discussion
20:32:30 once again all reviews have been addressed on the neutron and neutronclient, going to push for merge in next meeting
20:32:37 coool!
20:32:43 It's worth stating here, I am overhauling os-refresh-config and replacing cfn-hup with os-collect-config .. so please do report any weirdness you see there.
20:32:53 wooo
20:33:09 I'm going to be AWL from thursday through wednesday
20:33:44 Oh and
20:33:47 we're like, official and stuff
20:33:55 I have some leave thursday/fri then tuesday doing tech @ work day in Sydney. Mon and wednesday are a combination of being not-at-home connectivity spottiness and travel.
20:34:04 so I need someone to run this meeting next week.
20:34:32 * SpamapS checks schedule to be sure
20:35:07 lifeless: I will run it
20:35:18 thanks!
20:35:26 #action SpamapS run da meeting next week.
20:35:33 #help
20:36:29 ok
20:36:35 so something I think we should try and sync on
20:36:37 is the roadmapish
20:36:46 we're now at the opencloud - woo!
20:37:08 in my head, it's now time to take our narrow feature set and start expanding sideways
20:37:12 - updates
20:37:14 - HA
20:37:34 I need to spend some time on bringing Heat up to our expectations.
20:37:40 we need more failures to be retryable
20:37:56 and rolling/canary updates will not make h3 if I don't start on it by next week.
20:37:57 - bare metal improvements (like including vendor firmware flashes in the deploy ramdisk)
20:38:06 - heat ^
20:38:44 SpamapS: so H3 doesn't worry me too much, as long as we're not entirely blocked for 2 months - can you land it with an option to enable it or something
20:40:21 lifeless: it can be marked experimental for sure.
20:40:31 lifeless: it won't interfere with anything until you say "please update using canaries/rolling"
20:40:43 SpamapS: great
20:40:45 so yeah, the manual can say "This is experimental use at your own risk"
20:40:59 SpamapS: well more I mean do you *need* to stress about H3
20:41:03 we deploy trunk today
20:41:11 so as long as we can get the support into trunk...
20:41:38 I have a collaborator from outside tripleo who will be helping, who might care about H3 :)
20:41:42 kk
20:42:03 what else can we bifurcate onto
20:42:05 - performance
20:42:16 - monitoring [as NobodyCam is already!]
20:42:20 CI/CD for di-b
20:42:22 - reporting
20:42:28 - CICD yes yes yes!
20:42:31 I know a lot has been done already
20:42:40 basically - we've automated the stuff the POC taught us.
20:42:45 seems like we're close with the offline features to being able to test it in isolation
20:42:51 Perhaps we need a new stretch goal to consolidate around ?
20:42:59 well there is this sprint..
20:43:40 it's ages off, we should be finished by then :P
20:43:47 true
20:44:42 baseline - if anyone is aimless, we have tonnes to do, ping me [or anyone else on the team] and we'll help you find a useful thing that is within reach
20:45:05 last call on discussion ...
20:47:15 ok, thanks for playing!
20:47:18 #endmeeting