14:00:04 #startmeeting tripleo
14:00:05 Meeting started Tue Jul 12 14:00:04 2016 UTC and is due to finish in 60 minutes. The chair is shardy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:06 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:08 The meeting name has been set to 'tripleo'
14:00:09 #topic rollcall
14:00:16 o/
14:00:17 Hi all, who is around?
14:00:20 o/
14:00:23 o/
14:00:33 o/
14:00:34 Hello!! o/
14:00:55 o/
14:01:07 o/ (will have to leave a little early though)
14:01:17 o/
14:01:18 #link https://wiki.openstack.org/wiki/Meetings/TripleO
14:01:26 #link https://etherpad.openstack.org/p/tripleo-meeting-items
14:01:27 o/
14:01:34 o/
14:01:41 new format for one-off items, please add to the etherpad as some folks can't write to the wiki
14:02:01 #topic agenda
14:02:18 o/
14:02:23 o/
14:02:28 #topic agenda
14:02:28 * one off agenda items
14:02:28 * bugs
14:02:28 * Projects releases or stable backports
14:02:28 * CI
14:02:31 * Specs
14:02:33 o/
14:02:33 * open discussion
14:02:53 o/
14:03:28 Ok then, I see some additions to the agenda in the etherpad :)
14:03:41 Let's get started with the one-off items in https://etherpad.openstack.org/p/tripleo-meeting-items
14:03:48 #topic one off agenda items
14:03:58 1) (sshnaidm) What could be done to speed up reviews and *merges* in the tripleo-ci repository in particular?
14:04:05 sshnaidm: So, what did you have in mind here?
14:04:27 I think we basically need more folks willing to spend time reviewing patches and gaining familiarity with those scripts
14:04:38 which the mentoring/rotation is supposed to help with
14:04:40 shardy, I had a few patches waiting days and weeks for merge, for example this poor tempest fixing patch, so maybe there is a way to speed it up..?
14:04:51 any other suggestions?
14:04:52 imho, things have moved quite a bit faster over the last weeks. We also have to take into consideration that we're running on best-effort CI right now, with fewer jobs, less testing, etc. So we need to be more careful in reviews
14:05:32 shardy, maybe some list of urgent patches in our weekly-mentors document, and cores will look at it in their time at least daily?
14:05:49 sshnaidm: Yeah, sorry about that, we've had a bad time with CI lately but velocity has been much improved over the last couple of weeks
14:06:01 the weekly mentors thing should not be dedicated to reviewing urgent patches imo
14:06:09 sshnaidm: I suggest asking for reviews on IRC if patches have been waiting more than a week and are passing CI
14:06:19 sshnaidm: I know you've already started doing that
14:06:23 shardy, yeah, I do it every day :)
14:06:24 if you need reviews, that's something you can raise anytime on IRC
14:06:30 or using the ML also
14:06:32 shardy, but it's not so efficient
14:06:43 EmilienM: No, it's about getting more folks able to help with reviewing the patches
14:07:01 right ^
14:07:04 e.g. if people learn how CI is wired together when debugging, they will hopefully help with tripleo-ci reviews :)
14:07:10 I thought asking for reviews on the ML was a definite no-no
14:07:12 that is my hope, anyway
14:07:22 jokke_: I thought that too, just adds noise.
14:07:28 jokke_: Not on the ML, on IRC is OK tho IMO
14:07:34 ++
14:07:39 but only if the patches have been there more than a few days and are passing CI
14:07:47 I do it on IRC but it's still an issue
14:07:49 jokke_: using the ML to give status on a task/blueprint/bug with some reviews does not hurt imho
14:07:58 It's a matter of spreading the load, the more people we have working on CI related stuff the more people there will be able to review patches for it
14:08:49 I think most people have a hard time getting reviews
14:08:55 I remind that we need at least 2 +2s for merge, usually I easily get 1 and then...
14:09:32 so nobody is interested in having a list of urgent reviews for tripleo-ci?
14:10:09 sshnaidm: another thing you can do is review other people's patches - folks often review your patches in return if you commit time to helping them land their stuff
14:10:15 sshnaidm: a list would be good, I worry a bit that we'll still fall back to IRC but it's worth a try
14:10:23 sshnaidm: sure, we can do that
14:10:24 things that are urgent are in launchpad with the "alert" tag
14:10:24 we did a list once
14:10:32 Didn't we try a general TripleO review etherpad recently?
14:10:33 everyone just added their own patches
14:10:40 Yeah
14:10:43 EmilienM, yeah, but it's about the bugs
14:10:43 the problem is the lists end up unmaintained after a while, so everyone ignores them
14:10:55 etherpad does not work for that, we already tried
14:11:03 shardy: exactly
14:11:09 I guess we can have a maintained list?
14:11:14 the advantage of using launchpad and actively triaging the bugs is we can set a priority
14:11:22 saneax: Who is going to maintain it? ;)
14:11:22 etherpad doesn't provide a good way to do that
14:11:28 ++ on using launchpad or storyboard
14:11:38 EmilienM, shardy I suppose you're right about etherpads, maybe any other ideas?
14:11:56 sshnaidm: report bugs for critical CI impacting issues, and tag them "alert"
14:12:12 then we get an annoying reminder in CI periodically which motivates reviews for critical issues
14:12:35 shardy, it works for the real critical ones like broken CI
14:13:03 when using "alert" it should be critical, otherwise we'll end up ignoring it, we can't dilute it
14:13:09 sshnaidm: for less critical issues, still report a bug and target it at the next milestone
14:13:29 then we have it visible on the roadmap, and folks will hopefully review it to help burn down the open issues before a release
14:13:35 right, tripleo folks should get emails on every new bug
14:13:39 (that's the way I prioritize my reviews)
14:14:16 Ok, can we timebox this, and perhaps follow up on the ML or #tripleo?
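The triage flow discussed above — tag critical CI-blocking bugs with "alert" so they surface first, and target everything else at the next milestone so it burns down before release — amounts to a simple ordering rule. A minimal sketch, purely illustrative: the bug dicts and field names here are invented stand-ins for real Launchpad bug data, not an actual Launchpad API.

```python
# Illustrative sketch of the triage rule discussed above: "alert"-tagged
# (CI-blocking) bugs come first, the rest are grouped by target milestone.
# The dicts and field names are invented stand-ins for Launchpad data.

def review_queue(bugs):
    """Order bugs so alert-tagged ones surface before milestone work."""
    alerts = [b for b in bugs if "alert" in b["tags"]]
    rest = sorted(
        (b for b in bugs if "alert" not in b["tags"]),
        key=lambda b: b["milestone"],
    )
    return alerts + rest

bugs = [
    {"id": 1, "tags": ["ci"], "milestone": "newton-3"},
    {"id": 2, "tags": ["alert", "ci"], "milestone": "newton-3"},
    {"id": 3, "tags": ["docs"], "milestone": "newton-2"},
]

print([b["id"] for b in review_queue(bugs)])  # → [2, 3, 1]
```

The point of the single "alert" bucket, as noted in the meeting, is that it only works if it stays small — diluting it makes everyone ignore it.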
14:14:24 shardy, sure
14:14:38 I know velocity is a problem, but it's been a lot better recently, hopefully we can continue that trend
14:14:41 using launchpad or storyboard is the only way to not interrupt people during their work and let them review when possible
14:15:10 #info use launchpad bugs to help focus review attention on next-milestone reviews
14:15:12 EmilienM, does anybody use storyboard? I saw it empty all the time
14:15:24 2) (derekh) rh1 ovb or not?
14:15:29 So we've been using OVB based jobs exclusively for a week now, rh1 is ready to be brought back up
14:15:45 sshnaidm: not yet but I know slagle did some investigation
14:15:53 do we take the time to make it OVB based as well or bring the legacy system back up
14:15:57 ?
14:16:06 sshnaidm: I am not using storyboard, all the release tracking is in launchpad, the "tripleo" projects
14:16:28 if we go with OVB, there is no turning back and it will probably take longer to get up and running
14:16:30 I do not consider any of the other tripleo-* projects, or storyboard, trello or anything else when preparing for the release
14:16:33 derekh: +1 for OVB and get rid of legacy, even if it takes a bit more time to stabilize it
14:17:01 derekh: I think the ovb job has been pretty stable over the last week or so?
14:17:08 but it will get us away from a legacy system that's becoming unmaintainable
14:17:20 do we have many outstanding issues to fix before we have similar rates of false-negatives to before?
14:17:29 shardy: yes, it has, we did have a few teething problems but I think those are ironed out now
14:17:41 derekh: ack, well I'm +1 on moving to OVB
14:18:02 I think the flexibility it gives us wrt periodic scale testing etc will be a huge win in the long term
14:18:18 +1 ovb, it's very easy for devs to use that
14:18:20 Ditto. I think most of the false negatives now are not necessarily related to OVB.
14:18:38 shardy: we had a lot of "no host available" errors, those were fixed (hidden) by https://review.openstack.org/#/c/340359/
14:19:02 and we had some problems getting testenvs (and creating them), but I think we've solved those now
14:19:23 so it sounds like everybody is plus one to doing it
14:19:27 derekh: excellent work on ovb :)
14:19:32 derekh: do we have a handle on the no host available thing?
14:19:38 other than the n+1 workaround?
14:19:41 EmilienM: well, ovb itself is bnemec's baby ;-)
14:19:51 bnemec++
14:20:00 * bnemec coos at rh2 :-)
14:20:23 shardy: We're working around it by giving an extra host to the testenv, so if one fails the other gets deployed in a retry
14:20:46 shardy: but I think this would also fix it https://review.openstack.org/#/c/338886/
14:21:11 the downside to switching rh1 is this, I have no idea how long it will take to get up and running
14:21:11 shardy: When I've looked into it, that looks like the Ironic NodeAssociated race to me.
14:21:46 rh2 took me 6 weeks, but there was a lot of tweaking and getting stuff tested that we won't have to do
14:21:48 derekh: how long did it take to deploy rh2 w/ovb?
14:21:56 best case I would say we can aim for is 2 weeks
14:22:23 derekh: I think 2 weeks is a worthwhile investment if you have time to commit to it now
14:22:24 how can we help?
14:22:40 ++ & ++
14:22:40 We also need to get the ovb jobs running stuff like net-iso, but I think we're pretty close on that.
14:22:46 and it's better doing it now before the mad rush to land stuff at the end of newton
14:23:22 So I'm thinking it makes sense for me not to do it, would anybody else have the time to push it (with me here as a second)
14:23:46 weshay_mtg: Do you have any folks willing to help out?
14:24:17 derekh: what kind of tasks would it be? What skills are required to move fast?
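The n+1 workaround described above — give each testenv one extra host so a single transient "no host available" failure doesn't fail the whole job — is effectively a retry over a spare. A minimal sketch under stated assumptions: every name here (NoValidHost, deploy_with_spare, node-0) is invented for illustration and does not reflect the actual tripleo-ci scripts.

```python
# Minimal sketch of the n+1 testenv workaround discussed above: provision
# one spare host and retry the deploy on it when the first attempt hits a
# transient "no valid host" style scheduling race. All names are invented.

class NoValidHost(Exception):
    """Stand-in for a transient scheduler 'no valid host' failure."""

def deploy_with_spare(hosts, deploy):
    """Try each host in turn; only raise if every host fails."""
    last_error = None
    for host in hosts:
        try:
            return deploy(host)
        except NoValidHost as err:
            last_error = err  # transient race; fall through to the spare
    raise last_error

# Usage: the first host hits the race, the spare succeeds.
attempts = []

def fake_deploy(host):
    attempts.append(host)
    if host == "node-0":
        raise NoValidHost("no valid host found")
    return "deployed on " + host

print(deploy_with_spare(["node-0", "node-1"], fake_deploy))  # → deployed on node-1
```

As noted in the meeting, this hides rather than fixes the underlying race (suspected to be the Ironic NodeAssociated race), which is why the separate review was still being pursued.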
14:24:34 shardy, sshnaidm and panda would be avail to help
14:24:55 shardy: I'm really afraid about the newton cycle, feature freeze is in less than a month and a half
14:25:06 EmilienM: it's essentially a normal mitaka rdo deployment, we have scripts for the custom stuff
14:25:15 EmilienM: Yeah, we'll get to that in the releases topic ;)
14:25:28 EmilienM: but, god knows what problems may be hit when the bits hit the hardware
14:25:33 shardy: we don't have the upgrade job for now, which worries me for making progress in our cycle
14:25:46 couldn't we take care of CI at the end of Newton, between the 2 cycles?
14:25:56 EmilienM: Yeah, that's why I'm saying we need to get full coverage back in place well before newton is final
14:26:03 ideally in the next 2 weeks ;)
14:26:07 ok
14:26:21 which leaves us less than 1 month to finish Newton
14:26:28 EmilienM: if we don't do it now, we have a much higher risk of shipping newton in a broken state
14:26:35 right
14:26:38 we need full coverage of net-iso, SSL and upgrades
14:26:44 except if we roll back to legacy
14:26:55 EmilienM: Not everyone will stop working while this CI work happens
14:27:13 is there any appetite for leaving upgrade testing outside of tripleo-ci?
14:27:28 so we have 2 options: 1) roll back to legacy rh1, use full CI until newton, and then switch to rh1 ovb in the middle of the 2 cycles, or 2) switch now and rush for Newton
14:27:41 weshay_mtg: if there's a third-party job which can be reused, I'd be fine with that (at least as an interim measure)
14:27:41 we have success w/ major upgrades liberty -> mitaka, it's been running well for a bit
14:27:54 shardy, ya.. let's go for that
14:27:58 matbu, apetrich_ ^
14:28:04 1) is not good for the long term but less risky for producing Newton on time
14:28:11 Ohh, the other thing is that I'm not sure how much longer we can have rh2 for, it was a loan
14:28:13 how much work is it to bring up the legacy one vs. the new one?
14:28:17 2) is good for the long term but much more risky for the Newton date
14:28:24 derekh, well.. that one is up to me :)
14:28:33 Does it make sense to bring it up for a couple of months and take it back out again?
14:28:37 derekh, you have it as long as you need it.. within reason of course :)
14:28:38 weshay_mtg: ;-)
14:28:45 weshay_mtg: thanks
14:29:17 shardy, we could consider ssl and ipv6 third-party candidates as well
14:29:40 weshay_mtg: what do you mean by third party?
14:29:47 adarazs, care to comment?
14:30:11 EmilienM: jobs running on CentOS CI.
14:30:14 weshay_mtg: well we had pretty good coverage of those via tripleo-ci before, so I'm happy to just retain that as part of the upstream test if possible
14:30:21 k
14:30:27 right +1 with shardy
14:30:28 I have a PoC job tested here: https://ci.centos.org/job/tripleo-quickstart-thirdparty-gate-master/
14:30:31 weshay_mtg: the problem is our previous "upgrades" job never actually tested a proper version-to-version upgrade
14:30:40 no need of 3rd party CI for SSL & ipv6, we already cover it
14:30:46 shardy, ya.. we have that nailed down now
14:30:48 so if we can save time by reusing some third-party tests for that, I think that's fine
14:30:59 matbu, can elaborate
14:31:10 weshay_mtg: that's good news, let's go with that as a first step then
14:31:16 roger that
14:31:43 so what's the decision? rh1 ovb during the next 2 weeks or defer it?
14:31:45 so a third party upgrade test hooked into tripleo git repos?
14:31:58 Ok, we're going to have to timebox this, I think the consensus is OVB, but let's confirm via a vote on the ML
14:32:07 shardy: ack
14:32:11 my preference is to go ahead and OVB it
14:32:14 shardy: we really need to take the newton cycle into consideration
14:32:24 as a reminder, tripleo has been following the official schedule since Newton
14:32:33 #link releases.openstack.org/newton/schedule.html
14:32:35 EmilienM: we landed a ton of stuff for newton while rh1 was completely down
14:33:05 I'm not sure I get why having a couple of folks, who aren't working on major features for newton, work on bringing it back up, will delay newton features
14:33:07 + for ovb
14:33:12 +1*
14:33:15 it does increase the risk we break something in the meantime
14:33:28 which is why we need it now, so we've got time to fix the regressions
14:33:31 I also +1 ovb, just warning people about the newton schedule
14:33:49 EmilienM: ack, thanks for the reminder, we've certainly still got a lot to do
14:34:08 right, very few blueprints are completed
14:34:14 3) Status of the switch to mistral API in tripleo-common and python-heatclient
14:34:21 Ok, so I'll keep this one quick
14:34:36 we landed the first two steps towards adopting the mistral based API this week
14:34:43 \o/
14:34:49 tripleoclient now drives node registration and introspection via mistral workflows :)
14:35:04 so, kudos to rbrady, dprince, d0ugal and everyone else who worked on that
14:35:23 I wanted to ask everyone to help with reviews and testing to speed up landing the remaining patches
14:35:46 nice move folks!
14:35:47 not all of them are ready yet, but we need folks to help improve the velocity during n-3
14:35:59 nice one!
14:36:04 that is all :)
14:36:09 woohoo!
14:36:11 4) (slagle) TripleO Deep Dive reminder. Thursdays at 1400 UTC https://etherpad.openstack.org/p/tripleo-deep-dive-topics
14:36:15 list of patches here: https://etherpad.openstack.org/p/tripleo-mistral-api
14:36:34 So on Thursday I'm going to do a t-h-t overview in the deep dive
14:36:35 +1 for deep dive sessions, as a noob it helps a lot
14:36:48 shardy: i just wanted to remind folks. i guess it's been done :)
14:36:57 please let anyone who may be interested in attending know, and ping me with ideas of specific topics you'd like to see covered
14:37:06 slagle: ack, thanks! :)
14:37:47 Ok, since EmilienM already mentioned it, I'm going to skip to releases if that's OK with folks
14:37:54 #topic Projects releases or stable backports
14:38:03 the deep dives would be gr8 if that very same timebox didn't have triple booking already :(
14:38:17 jokke_: what TripleO booking?
14:38:33 shardy: not TripleO, triple
14:38:43 jokke_: ah, OK well sorry about that
14:38:48 as in I have 3 other things to multitask on in that same box
14:38:50 at least it's recorded
14:38:54 ++
14:39:02 So, re releases - newton-2 has been tagged
14:39:04 https://bugs.launchpad.net/tripleo/+milestone/newton-2
14:39:09 Great work everyone!
14:39:17 \\o \o/ o// o/7
14:39:18 thanks to EmilienM for preparing the release patch
14:39:29 I sent an email about composable roles
14:39:45 the blueprint is considered implemented, remaining work is tracked here: https://bugs.launchpad.net/tripleo/+bugs?field.tag=composable-roles
14:39:50 Regarding velocity tho, we landed zero blueprints in n1, and three in n2 (admittedly the composable services one was *huge*)
14:40:00 https://bugs.launchpad.net/tripleo/+milestone/newton-3
14:40:05 EmilienM: so you want to use bugs because shardy closed our blueprint?
14:40:17 we have 14 blueprints targeted at n3, and I suspect some of them will not make it
14:40:21 dprince: yeah, use launchpad to track what needs to be done in n3
14:40:34 so please help by accurately updating status for delivery of these
14:40:35 EmilienM: confusing :), but fine with me
14:40:46 so we can see before the last week of the milestone what's going to slip
14:41:03 trown: are you still going to do a patch to update tripleo-docs to replace instack-virt-setup?
14:41:08 Also, please prioritize reviewing stuff so we can burn down that list
14:41:18 dprince: the implementation is considered done, so shardy suggested using bugs in launchpad for the rest of the work
14:41:28 slagle: ya, it is on my list... just keeps getting pushed down
14:41:28 I think it's a good idea
14:42:08 trown: k, np. just wanted to double check
14:42:11 dprince: the architecture change landed, and tracking all the service refactoring via one blueprint had become unmanageable
14:42:19 it's like more than 80 patches on 1 bp
14:42:22 also just a reminder as time is a problem here. Regressions and bugfixes are ok to land after the FF, so perhaps focus on getting the features in by the end of Aug deadline
14:42:26 ?
14:42:31 so I thought we could track the remaining stuff in a more granular way
14:42:47 it'll also help show real velocity vs one mega-bp which keeps slipping
14:43:37 shardy: yep, for similar reasons this is why we agreed to abandon the SPEC for composable roles in favor of just iterating on docs...
14:44:10 jokke_: Yep, good point, some of the bugs may slip into the RC/FF phase
14:44:26 so focus on landing features (and the highest priority bugs) would be good
14:44:42 * bnemec is not a fan of slamming in broken things just to make FF
14:44:48 Though it is the OpenStack Way(tm)
14:45:08 shardy: can we get the jinja patch in, or done enough to land an initial iteration of composable roles (not services... but the next step) in Newton?
14:45:22 shardy: I can help here if you need it
14:45:28 dprince: I hope so, I'm waiting on the deployment via mistral to be ready
14:45:29 shardy: w/ Mistral stuff
14:45:35 bnemec: didn't mean merging broken things just to make the deadline, but the last week before FF is not the time to drop the 100 comment and doc-typo-fixing changes on the CI
14:45:36 shardy: ack
14:45:47 FYI, we are working on DPDK and will push patches soon
14:45:47 then I'll refactor the jinja2 patch into tripleo-common and revive the t-h-t series
14:45:50 s/the/those/
14:46:14 shardy, dprince: big +1 on it
14:46:22 bnemec: agree, but we do need to improve velocity on the remaining features or they won't make newton at all
14:46:43 karthiks: ack
14:47:12 shardy: regarding the release schedule, and given our composability, I don't think we need to block newly composable service patches that are off by default
14:47:21 shardy: I'd like to get agreement on this
14:47:47 shardy: essentially, if a service is off by default... we can accept it safely I think if a group (vendor, etc.) confirms it works for them
14:47:48 dprince: during feature freeze maybe, but we can't backport features to stable branches anymore
14:47:55 it hurts me to review code that is not tested but well, it's like it's an agreement
14:48:12 EmilienM: this is stuff we can't test upstream in most cases I think
14:48:15 dprince: I think folks can ask for a FF exception for new service stuff landing near the end of the cycle
14:48:27 dprince: not sure if I have a vote here, but I'd like to disagree. A new feature is a new feature regardless of whether it's default on or off. And the whole point of Feature Freeze is to get folks focusing on stabilizing before the release
14:48:40 i'm not sure this is a decision we should make as a project. does it violate the intent of the OpenStack FF?
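The jinja2 patch referred to above is about generating per-role tripleo-heat-templates content from one template instead of hand-maintaining a file per role. A toy illustration of that idea, kept dependency-free: this uses the stdlib string.Template as a stand-in for jinja2, and the template text is invented and far simpler than anything in real t-h-t.

```python
# Toy stand-in for the jinja2 role templating being discussed: render a
# per-role parameter block from a single template rather than keeping a
# hand-written copy per role. Uses stdlib string.Template instead of
# jinja2 so the sketch is dependency-free; the template text is invented.
from string import Template

role_params = Template(
    "${role}Count:\n"
    "  type: number\n"
    "  default: ${count}\n"
)

def render_roles(roles):
    """Render one parameter block per (role, default_count) pair."""
    return "".join(role_params.substitute(role=r, count=c) for r, c in roles)

print(render_roles([("Controller", 1), ("Compute", 0)]))
```

The payoff this sketch hints at is the one shardy and dprince discuss: adding a new role becomes a data change rather than a new hand-written template, which is what makes composable roles the "next step" after composable services.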
14:48:48 we're cycle-lagging so we have a little time beyond the main FF http://releases.openstack.org/newton/schedule.html
14:49:03 e.g. we can release up to two weeks after newton final, although I'd prefer not to
14:49:30 ya, I think we should be releasing as close to puppet as possible
14:49:37 slagle: the process is (AIUI) folks can request a FFE, which is either approved or denied by the project
14:49:47 I think we can just do that
14:50:10 ok, yea, that seems more in line
14:50:13 if someone wants to integrate an off-by-default service, we'll say yes, if it's something else super-risky we'll say no
14:50:18 shardy: the process is fine, but I'm suggesting we be lenient
14:50:24 shardy: projects can also set their own freeze date before RC1
14:50:39 like Nova for example has effectively frozen already
14:51:02 trown: we release the day of the official release or sometimes between > 0 and < 5 days after (trailing model)
14:51:12 Yeah, we could freeze later, say when the other projects are cutting their RCs
14:51:38 I'll start a ML thread on that topic
14:51:44 EmilienM: right, just saying that tripleo should be as close to that as possible... not a week later because we merged something risky last minute
14:51:51 shardy: well if TripleO is release-with-milestones the RC1 date is locked again
14:52:04 #topic CI
14:52:14 so postponing the FF date just shortens the time between
14:52:18 trown: ++
14:52:28 derekh: want to give an update on the CI impacting issues, e.g. nonha and mitaka?
14:52:33 I've nothing pressing to discuss in CI that we haven't already
14:52:46 shardy: mitaka should now be working again I think
14:52:55 (Sorry I'm going to skip bugs and specs this week as we're short on time)
14:52:56 shardy: and we're testing a fix for nonha
14:53:30 derekh: Yeah, I noticed we don't run nonha on tripleo-ci changes
14:53:41 so I pushed a dib change with a Depends-On
14:53:46 we should probably add that I guess
14:54:02 Other than the OVB thing, anyone have anything else re CI?
14:54:07 shardy: I added a comment to your review comment, it's in the experimental queue, I don't know why it didn't trigger
14:54:09 slagle: what's the status of the multinode job?
14:54:25 derekh: Ah, OK - I saw the experimental result and it wasn't ther
14:54:27 there
14:54:32 thanks
14:54:55 shardy: passed yesterday
14:54:55 kudos to slagle for making the multi-node job work :-)
14:55:02 for the 1st time
14:55:15 ++ it's great to see another way to add coverage
14:55:17 i am cleaning up a couple of things
14:55:25 shardy: there are two sets of jobs that run in check experimental, the tripleo ones and the non-tripleo ones (multinode), they report back separately
14:55:35 we need the tht patch to land as well
14:55:42 derekh: aha, thanks, that explains it
14:55:51 https://review.openstack.org/#/c/222772/
14:55:58 and multinode is quick, which is nice
14:56:00 actually i guess it's ready to go, it passed CI
14:56:21 shardy: i will need to sync with you later, i could not get the POLL_HEAT setting to work
14:56:27 slagle: I commented on that, I think the credentials part can be improved but we can do that later
14:56:30 or whatever it's called
14:56:34 slagle: OK, I can help with that
14:56:41 not a blocker to landing the first revision IMO
14:56:49 #topic Open Discussion
14:56:54 So, there was one thing
14:57:18 on the ML there's a thread about OpenStack logos, and I was approached directly about the "official" project logo
14:57:37 I assume everyone is keen to keep our proven TripleOwl theme as a mascot?
14:57:50 shardy: ++
14:58:03 yes i am
14:58:08 I think this would be "wise"
14:58:13 ++
14:58:14 although our tshirts will be vintage if we get a new logo
14:58:14 * derekh likes the owl
14:58:17 +1
14:58:17 #link http://lists.openstack.org/pipermail/openstack-dev/2016-July/099046.html
14:58:18 lol at wise
14:58:18 owl++
14:58:19 +1
14:58:33 dprince: I see what you did there, you're such a hoot
14:58:37 lol :)
14:58:37 can't switch it after EmilienM gets it tattoo'd
14:58:56 dude I'll get my tattoo soon, I know you're waiting for it
14:59:01 derekh: hoot!
14:59:03 Ok, cool, well I replied to that effect already but wanted to confirm and draw attention to the discussion
14:59:26 shardy: because this is the most important thing
14:59:36 They're talking about redesigning logos to be more consistent, but I assume we'll get a vote on the outcome
14:59:39 TripleO also gets priority for it as it's an established logo IIUC
15:00:00 Ok, out of time, thanks everyone!
15:00:02 shardy: oh, that scares me
15:00:10 thnx
15:00:12 #endmeeting