14:04:37 #startmeeting tripleo
14:04:39 derekh++
14:04:39 Meeting started Tue Mar 1 14:04:37 2016 UTC and is due to finish in 60 minutes. The chair is dprince. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:04:40 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:43 The meeting name has been set to 'tripleo'
14:04:44 \o/
14:04:54 hello :)
14:04:55 hiya \o
14:05:02 hi!
14:05:02 :)
14:05:02 derekh++
14:05:06 I am speechless
14:05:20 l33t hack!
14:05:23 ;-) say nothing
14:05:33 o/
14:05:43 o/
14:05:56 o/
14:06:15 o/
14:06:32 derekh++
14:07:04 #topic agenda
14:07:04 * bugs
14:07:04 * Projects releases or stable backports
14:07:04 * CI
14:07:04 * Specs
14:07:06 * one off agenda items
14:07:09 * open discussion
14:07:23 there are no one-off agenda items this week I think
14:07:41 anything else to add to the agenda before we start?
14:07:51 dprince: I'd like to discuss when we branch for mitaka RC, but that can be in the releases/stable section
14:08:56 shardy: cool
14:09:00 lets go
14:09:04 #topic bugs
14:09:40 Hello \o
14:09:41 There were a couple IPv4 network isolation bugs fixed this week
14:10:24 #link https://bugs.launchpad.net/tripleo/+bug/1551048
14:10:24 Launchpad bug 1551048 in tripleo "network validation tests fail: No module named ipaddr" [Critical,Fix released] - Assigned to Dan Prince (dan-prince)
14:10:29 #link https://bugs.launchpad.net/tripleo/+bug/1550380
14:10:29 Launchpad bug 1550380 in tripleo "network isolation broken when SoftwareConfigTransport == POLL_TEMP_URL" [Critical,Fix released] - Assigned to Dan Prince (dan-prince)
14:10:58 both of those should be fixed. I was surprised to see that IPv4 network isolation has probably been broken for some time
14:11:25 o/ joining late cos other call
14:11:27 locally IPv4 netiso is working for me again
14:11:41 * bandini waves
14:11:48 dprince: the second one looks worrying from an upgrade perspective, e.g. 20-os-net-config still isn't owned by any package is it?
14:12:06 so if (as I think stevebaker wanted) we switch to tempurl on upgrade, things may break?
14:12:19 shardy: I don't think it affects upgrades unless you also switch to Heat TEMP URLs
14:12:33 shardy: and then you could push a new 20-os-net-config with the artifacts patch
14:13:18 dprince: Ok, just something to bear in mind, looks like the patch switching transports for liberty stalled:
14:13:21 https://review.openstack.org/#/c/257657/
14:13:22 shardy: basically you'd create a tarball and use t-h-t to extract that tarball
14:14:06 dprince: ack, I guess provided we document it in the upgrade/release notes for mitaka all should be OK then
14:14:22 derekh, ugh sorry guys my bad
14:14:37 on a related note I'd really like to see us get any sort of IPv4 network isolation CI job in place ASAP https://review.openstack.org/#/c/273424/
14:14:48 ndipanov: no prob, we got there in the end
14:14:55 switching away from 20-os-net-config ref https://review.openstack.org/#/c/271450/, or at least getting it packaged seems like a better plan long term to me tho
14:15:33 shardy: yep
14:15:50 okay, any other bugs to talk about this week?
14:16:16 the puppet-keystone fix blocking master has landed (keystone-manage bootstrap, etc.)
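
The transport switch discussed earlier in the bugs topic (bug 1550380 / review 257657) boils down to a Heat parameter. A minimal sketch of how it could be set: the SoftwareConfigTransport parameter name comes from the bug title above, while the environment file name and deploy invocation are illustrative assumptions, not the contents of review 257657.

    # Hypothetical environment file selecting the Heat temp-URL transport;
    # SoftwareConfigTransport is the parameter named in bug 1550380.
    cat > temp-url-transport.yaml <<'EOF'
    parameter_defaults:
      SoftwareConfigTransport: POLL_TEMP_URL
    EOF

    # Pass it to the deployment alongside any other environment files.
    openstack overcloud deploy --templates -e temp-url-transport.yaml
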
14:16:52 derekh mentioned to me earlier that we actually had 2 of the master periodic jobs pass for the first time I think last night
14:17:04 \o/
14:17:15 dprince: yup, http://tripleo.org/cistatus-periodic.html
14:17:21 dprince: it did, but we had to also patch tht to not use it
14:17:27 or, EmilienM did rather
14:17:45 dprince: https://bugs.launchpad.net/tripleo/+bug/1551501
14:17:45 Launchpad bug 1551501 in tripleo "CI: HA jobs failing with Error: /Stage[main]/Keystone/Exec[keystone-manage bootstrap]: Failed to call refresh: keystone-manage bootstrap --bootstrap-password returned 1 instead of one of [0]" [Critical,Fix released] - Assigned to Emilien Macchi (emilienm)
14:17:47 right, and now that https://bugs.launchpad.net/tripleo/+bug/1551501 is fixed the HA job should pass again.
14:18:03 but you're going to have bootstrap issues when you update keystone
14:18:40 EmilienM: I see. So the master job will still be failing
14:18:55 EmilienM: for the HA job we just need to run the bootstrap manually then?
14:19:05 EmilienM: or later in the steps?
14:19:31 dprince: I don't know, we need testing
14:19:58 okay, so it sounds like the HA job for master is still broken to me. I guess we'll see what happens
14:20:34 #topic CI
14:21:04 It is my understanding we are hitting memory issues with a couple new patches
14:21:40 dprince: yup, the patch for aodh and gnocchi
14:21:43 slagle, derekh: and we think that using a bit of swap may help get things passing? until we can get more memory added to our nodes
14:21:48 We also need to decide if we're going to disable the containers job ref https://review.openstack.org/#/c/285325/
14:22:16 dprince: it's a theory anyway
14:22:30 dprince: we can bump it a little, but would like to find out what exactly needs the RAM first, also if swap works then great ;-)
14:22:36 dprince: we'd need to get the swap patch fixed up, base the aodh/gnocchi patches on that, and then see how performance is affected
14:23:02 yup, what slagle said
14:23:09 shardy: +A for disabling containers (for now)
14:23:27 slagle: agree, lets go with this plan
14:23:51 I've been mostly concentrating on CIv3.0 stuff, but at a quick glance, things seem to be ticking along at the moment, although the HA job seems to be failing too much...
14:23:55 ok, will push up another revision to the swap patch
14:24:24 i'm thinking i will leave it not used by default, since it's not really a production configuration, but then we can create a custom env in CI to use it
14:24:39 dprince, oh by the way
14:24:44 I almost have the gate fixed
14:25:09 rhallisey: nice, a simple revert gets containers back in the mix...
14:25:19 yup just letting you know
14:25:47 next I want to get the periodic job to push artifacts to a mirror-server, would be good if somebody could look over the patch series to enable the mirror server https://review.openstack.org/#/q/status:open+project:openstack-infra/tripleo-ci+branch:master+topic:mirror-server
14:26:00 hit a merge conflict earlier today /me will fix
14:26:24 derekh: nice
14:26:53 I will try and review it
14:27:09 dprince: thanks
14:27:14 #topic Projects releases or stable backports
14:27:30 shardy: you wanted to mention Mitaka branches?
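
Going back to the swap idea raised under the CI topic above, a rough sketch of what enabling swap on a memory-constrained test node could look like; the commands are generic illustration only, and the 2G size, /swapfile path, and how the actual tripleo-ci swap patch wires this in are assumptions.

    # Create and enable a swap file on a test node that is short on RAM.
    # Size and path are placeholders, not values taken from the patch.
    sudo fallocate -l 2G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    # Persist across reboots.
    echo '/swapfile none swap defaults 0 0' | sudo tee -a /etc/fstab
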
14:28:24 #info slagle plans to do stable liberty/releases later today
14:28:51 slagle: ack, thanks
14:30:01 dprince: yup, it's nearly RC time for other projects:
14:30:01 #link http://releases.openstack.org/mitaka/schedule.html
14:30:01 and although we've not done milestone releases, I'd like to propose we do an RC release, which will also enable us to branch for N in ~2 weeks
14:30:01 which may also unblock some features expected to progress in early N
14:30:01 #link https://github.com/openstack/releases
14:30:01 what do folks think?
14:30:02 e.g. can we land all the remaining features in the next 7-10 days then declare an RC1?
14:31:06 what are the remaining features?
14:31:37 the main ones are probably IPv6, SSL, and upgrades?
14:31:54 and satellite 5 support, although that is only one patch
14:31:58 jistr: any outstanding integrations?
14:32:00 SSL is already in master.
14:32:02 I'd like to get opendaylight merged if possible
14:32:11 In order to be able to test it, we'll need to coordinate with #rdo so we have a repository to use
14:32:25 jistr: (sorry wasn't really a question, i meant 'these too')
14:32:30 shardy: given that we don't actually have CI to cover any of those (yet) I'm questioning if we can do this in 2 weeks
14:32:39 marios: yeah. there are some patches from bigswitch to os-net-config pending, and stable/liberty backport of opencontrail i think
14:32:47 The other problem with this timing is that we're going to be simultaneously pushing backports for the downstream release that is due at about the same time.
14:32:47 marios: might be more than that
14:32:59 shardy: how stable the ci is over those days is crucial
14:33:48 Hmm, lost shardy.
14:34:28 should I change my nick again?
14:34:44 derekh: please do so we can move on
14:34:54 ;-)
14:35:01 no, maybe he'll re-join in a minute
14:35:16 sounds like there is concern 7-10 days is too soon for a Mitaka branch though
14:35:30 what is the alternative though?
14:35:47 gah, poorly timed network outage :(
14:36:13 trown: alternative would be to actually create CI jobs (upstream) which test these features before we claim them as features on a stable Mitaka branch
14:36:15 if we do not put out an RC with the rest of openstack, then it makes it pretty impossible to release tripleo in RDO with the rest of openstack
14:36:19 shardy: http://paste.openstack.org/show/488759/
14:36:42 http://paste.openstack.org/show/488760/ is what I wrote fwiw
14:36:46 trown: or we cut a release and don't claim any of the new features we aren't testing
14:36:50 derekh: thanks :)
14:36:56 dprince: I would be more into that
14:37:09 the release schedule is not arbitrary
14:37:58 trown: do you know when the rdo mitaka repos will be up?
14:38:05 will that happen at rc1 as well?
14:38:31 shardy: I like the motivation for this in that it opens up master again for architectural work
14:38:42 slagle: that is when we could start working on it, but RDO aims to release within 1 week of upstream
14:38:45 shardy: i.e.
composable service roles
14:38:49 and split stack
14:39:00 dprince: yup, and it also makes it much better for those consuming tripleo IMO, as trown has indicated
14:39:35 shardy: my concern is that I don't want to see everyone slam down a bunch of code that probably isn't working (upstream anyways)
14:40:02 shardy: so if the plan is to cut the branch without the code-slamming then I'm probably fine
14:40:18 it probably implies a big push on reviews over the next week tho I guess
14:40:26 re the CI thing, we can announce features as experimental if they don't have CI coverage?
14:41:04 shardy: yes, but my concern w/ things like IPv6 network isolation is that by landing those patches we'll just break IPv4 (again)
14:41:20 shardy: because we still don't have the CI job on that...
14:42:08 I guess I'm fine with cutting the Mitaka branches whenever, but I'll note again that over the next couple of weeks my focus is likely to be on downstream OSP 8.
14:42:10 I think gfidente is working on ipv4 netiso, that should happen in the near future I think.
14:42:26 dprince: yup, I guess the question is just is it feasible to get to a reasonable branch point in ~2 weeks
14:42:26 I know we discussed having a lagging release on the ML, but I personally am not in favor of that
14:42:26 I think it'll be much clearer if we just start adhering to the exact same process/schedule as the rest of OpenStack
14:42:30 dprince: but how much time will it take to get that stuff... is an extra week likely to make any difference?
14:42:52 I am -1 on lagging release
14:43:00 yeah short update on that, we're in a sort of debugging session and trying to figure out what is wrong
14:43:26 i'd like to see the needed features at least landed in master before cutting mitaka
14:43:27 gfidente: let me know if I can help.
14:43:34 gfidente: You saw my comment on the latest patch set?
14:43:37 okay, so if not lagging the release is the priority then fine. I just don't like slamming in features that haven't been tested upstream
14:43:37 slagle: +1
14:43:39 if they're not, we'd have to violate our new policy of not backporting features
14:43:57 Can we come up with a list of must-haves for Mitaka?
14:43:58 bnemec: Yeah, I guess that is the subtext to my question, e.g. if we need a big push on reviews, we need folks with bandwidth to do them, and probably a list of the backlog we expect to land for stable/mitaka
14:44:10 it appears to me that many of the patches are being tested against stable/liberty (OSP8) and then thrown upstream... where they may or may not work
14:44:11 if we're not going to allow feature backports, getting that right is a lot more important
14:44:22 this isn't the way it was supposed to work
14:44:25 bnemec, I dropped those yes, the overcloud external network is a flat network bridged into br-ex with 192.0.2 subnetting
14:44:48 bnemec, this is why pingtest works without netiso
14:45:20 adarazs, ack, pvt
14:45:24 dprince: agreed, but getting this release cut at the right time is IMHO the first step to fixing our process
14:45:36 then the deadlines for things landing upstream become very clear from now onwards
14:45:40 gfidente: Umm, no it's not. We should talk after the meeting.
14:45:53 what I'm suggesting is it would be fine to land a feature upstream. Say IPv6, so long as whoever is reviewing it is manually verifying that it actually works and doesn't break previous features for each patch.
Because we simply don't have enough CI coverage on much of this to verify we aren't breaking things
14:46:55 shardy: I'm fine with that release angle, just saying that there seems to be disagreement about landing or not landing all these features in 2 weeks time
14:46:58 dprince: requiring some manual testing is probably fine, provided folks actually have time to do that as well as the reviews (bnemec is giving me the feeling some folks don't have bandwidth for either, which is clearly problematic)
14:47:07 i think we do have the IPv6 + SSL + upgrades support very close though. If we can land those while making sure we don't break anything else, i'd be even ok with landing the majority of those features even without knowing that the new features work 100% great (experimental features, as shardy said), and then we can fix the new features later. Main concern is breaking what we already had but we don't have CI coverage on (e.g. IPv4 net iso).
14:47:37 it is problematic to work on the previous release for a whole cycle and then try to land all the features in the last 2 weeks
14:47:43 yea :(
14:47:57 lol @ "fix the new features later"
14:48:07 This is why I'd be in favor of lagging a bit this cycle and then cutting N in sync with everyone else.
14:48:12 experimental feature == will fix later
14:48:13 N'Sync even. ;-)
14:48:14 :))
14:48:38 if we lag the release, tripleo will not be able to participate in the RDO Mitaka release
14:48:53 jistr: sure, I agree on the experimental features
14:48:59 which means even more packstack users
14:49:15 and if we lag, we'll keep all N work blocked, which will eat into the time for N feature development
14:50:10 for me, not being able to participate in the RDO mitaka release is a deal breaker, and it means we have to branch at the right time
14:50:17 jistr: but IPv4 network isolation has been broken for weeks upstream so we clearly need to do a better job of having reviewers (manually) test upstream until some of these CI jobs are in place.
14:50:18 I mean, that's fine, but if the features aren't landed then we're screwed.
14:50:19 same
14:50:33 We either violate the new backport policy, or we carry downstream patches for stuff.
14:50:42 getting some of these automated in CI is really the only scalable solution
14:50:46 Neither of which are good either.
14:51:08 I would prefer downstream patches over twisting the upstream process for downstream needs
14:51:11 dprince: +1 on prioritizing automated CI
14:51:28 that is a really non-starter argument for any other openstack project
14:51:45 Can we take this discussion offline and come up with the list of features that must go into Mitaka, and make a decision from there?
14:51:53 trown: It's also reality in this project right now.
14:52:04 Most of the developers have downstream responsibility, whether we like it or not.
14:52:19 trown: I agree, IMHO we can't block the entire upstream release due to downstream requirements, worst case some patches will have to be carried, but it'd be far better if we could have a coordinated push and just test/land what remains
14:52:29 ya, but it is still possible to separate downstream arguments from upstream process and needs
14:52:29 downstream patches are also a non-starter
14:52:42 it's not even a worst case scenario
14:52:45 it's no scenario
14:52:55 Downstream patches are what got us into this situation in the first place, FWIW.
14:53:01 ok, so if tripleo is now a downstream project, I do not have much to add
14:53:12 It's why we have features in Kilo that don't exist in Mitaka yet.
14:53:18 working on the previous release when we should have branched is also part of our current problem tho
14:53:53 Totally agree, but until we get everything to the same level feature-wise, we're going to be working on the previous release.
14:53:55 imagine trying to make this argument in nova
14:53:59 bnemec: +1
14:54:03 Ok, looks like we don't have consensus - how about this:
14:54:06 trown: We are not Nova.
14:54:18 1. We create an etherpad with blocker patches for mitaka, like today
14:54:21 my opinion is that if we want to branch on time, we have to allow for some *possible* feature backports to stable/mitaka *if* they have not all landed
14:54:29 2. revisit next week, and make a call re the branch for RC1 date
14:54:42 shardy: +1
14:55:01 +1
14:55:04 wfm
14:55:05 sounds like a fine plan to me shardy
14:55:07 slagle: Ok, I guess we can discuss that if it becomes clear certain patches won't land
14:55:37 #link https://etherpad.openstack.org/p/tripleo-mitaka-rc-blockers
14:55:43 yea, i just dont want to shoot ourselves in the foot
14:55:57 i think we can pull this off, branch on time, avoid downstream patches
14:56:04 Please add patches there, and we can review/discuss in #tripleo to ensure they are actually blockers
14:56:05 we just might have to allow ourselves some initial leeway
14:56:32 okay, very little time left in the meeting
14:56:40 #topic open discussion
14:56:55 anything else someone needs to bring up?
14:57:14 I just came along to mention I put up a spec for moving the puppet manifests into a module and reducing the code duplication
14:57:59 I wanted to check I'm not missing some really obvious reason not to do that
14:59:09 michchap: we already planned to do this as part of composable services
14:59:22 michchap: so ++ for the idea
14:59:28 dprince: is there an implementation or should I put one up?
14:59:46 michchap: lets talk on #tripleo afterwards
15:00:09 dprince: sure
15:00:13 thanks everyone
15:00:17 #endmeeting