14:00:58 <dprince> #startmeeting tripleo
14:01:03 <openstack> Meeting started Tue Feb 23 14:00:58 2016 UTC and is due to finish in 60 minutes.  The chair is dprince. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:05 <d0ugal> o/
14:01:07 <openstack> The meeting name has been set to 'tripleo'
14:01:16 <jaosorior> o/
14:01:28 <slagle> hi
14:01:29 <dprince> anyone around for a tripleo meeting
14:01:43 <derekh> o/
14:01:46 <dtantsur> o/
14:01:50 <jtomasek> o/
14:01:56 <gfidente> o/
14:02:30 <jdob> o/
14:03:14 <leanderthal> o/
14:03:19 <dprince> #topic agenda
14:03:19 <dprince> * bugs
14:03:19 <dprince> * Projects releases or stable backports
14:03:19 <dprince> * CI
14:03:21 <dprince> * Specs
14:03:27 <dprince> * Create stable branch of DIB to help phase out old ramdisk http://lists.openstack.org/pipermail/openstack-dev/2016-February/086738.html
14:03:33 <dprince> * open discussion
14:03:57 <trown> o/
14:04:08 <dprince> There is one extra item above to discuss creating a DIB branch. Any other topics to add for this week that I missed?
14:05:21 <dprince> okay, lets go
14:05:28 <dprince> #topic bugs
14:06:41 <dprince> I'm not actually sure there are bugs on all of these patches but it would be good to get eyes on the depends-on here: https://review.openstack.org/#/c/278553/
14:07:01 <dprince> as these are blocking us from testing current
14:07:16 <trown> dprince: it is down to just 2 patches, 1 for undercloud and 1 for THT
14:07:30 <trown> and both derekh and I got a successful pingtest with those
14:08:15 <dprince> trown: nice. So is the plan that once the tripleo-ci patch passes we land them all?
14:08:37 <trown> dprince: that would be my hope ya
14:09:13 <derekh> trown: we'll have to edit out part of the patch to tripleo.sh but ideally yes I think so
14:10:20 <dprince> okay, I've got one quick undercloud bug to (finally) get zaqar working: https://review.openstack.org/#/c/283221/
14:10:21 <trown> derekh: right, we will need to use current-tripleo, but we can take the hash for it from what (hopefully) passes ci
14:10:32 * derekh points to the change to DELOREAN_REPO_URL
14:11:05 <dprince> derekh: thanks
14:11:16 <derekh> trown: yup, I also have a sleep 600 in there, I put that in yesterday when I noticed nodes weren't ready but I may have just been crazy
14:11:53 <trown> derekh: ah cool, I have a slightly better scriptlet we could put there that checks nova hypervisor-stats
14:12:01 <derekh> trown: dprince we can just remove the sleep and merge, if it causes problems put the sleep back in while we investigate
14:12:18 <slagle> derekh: was it that oc services weren't up after create_complete?
14:12:52 <slagle> i've seen that in the ha job. i put in that 1 hack for crm_resource --wait in tripleo-ci
14:12:54 <trown> slagle: no, before deploy, ironic nodes not available
14:13:01 <slagle> oh ok
14:13:09 <derekh> slagle: not sure, iirc on friday when I tried a deploy I had loads of 500s from services
14:13:51 <derekh> slagle: when I came back on monday and tried overcloud deploy on the same undercloud it went ok
14:14:12 <derekh> slagle: so I figured the weekend between deploy/register and deploy helped somehow
14:14:37 <derekh> anyways, if nobody else has seen it lets drop it and merge
14:14:47 <derekh> if it causes a problem add a quick sleep in
14:14:54 <derekh> and figure it out
14:15:02 <trown> +1
14:15:03 <dprince> okay, that sounds like a plan
14:15:04 <slagle> k, wfm
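The plan above (drop the fixed `sleep 600`, or replace it with a check such as trown's `nova hypervisor-stats` scriptlet) amounts to polling for readiness with a timeout instead of waiting a fixed time. A minimal Python sketch of that pattern follows. `wait_until` and its parameters are hypothetical names, not code from tripleo.sh, and the `check` callable stands in for whatever readiness test is actually used.

```python
import time


def wait_until(check, timeout=600.0, interval=10.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll check() until it returns True or `timeout` seconds elapse.

    `check` is a hypothetical callable, e.g. one that inspects the
    hypervisor stats and returns True once nodes are available.
    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Compared with a fixed sleep, this returns as soon as the condition holds and still gives up after the same upper bound, so a healthy run is not slowed down.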
14:15:23 <marios> o/ sorry also on a call
14:15:41 <dprince> #topic Projects releases or stable backports
14:16:06 <dprince> I'd actually like to mention the desire to create a stable branch for DIB here I think
14:16:13 <dtantsur> ++
14:16:20 <dprince> dtantsur: want to drive this?
14:16:28 <dprince> #link http://lists.openstack.org/pipermail/openstack-dev/2016-February/086738.html
14:16:30 <trown> I thought DIB was supposed to be backwards compatible?
14:16:42 <dtantsur> please check the link, I'll give you tl;dr
14:17:00 <dtantsur> we (ironic) want to drop support for the old bash ramdisk from our code
14:17:13 <dtantsur> if we do that, we won't be able to gate on DIB any more
14:17:24 <derekh> it's supposed to be, but has anybody tried to build an F19 image lately? /me would be surprised if it worked
14:17:25 <dtantsur> then we're losing DIB support for our stable branches as well
14:17:41 <dtantsur> (it can get broken at any moment)
14:18:09 <dtantsur> the easiest way out for everyone is for DIB to get a stable/liberty branch (ideally, stable/mitaka would work too)
14:18:27 <dtantsur> then we can just drop things from both our master and DIB itself, and live happily
14:18:28 <trown> hmm... so a stable branch for a single element?
14:18:41 <dtantsur> not sure what you mean by "for a single element"...
14:18:55 <dprince> trown: we already have stable branches for other projects. t-h-t
14:18:56 <trown> the bash deploy ramdisk is a single element
14:19:06 <dtantsur> yes, but branch is for project, not for element
14:19:09 <dprince> trown: not a single element. All of DIB
14:19:25 <dtantsur> also while I agree that DIB is supposed to be backward compatible, things do happen from time to time
14:19:27 <trown> ya, but the rest of DIB does not need a stable branch, so we create it for that single element
14:19:34 <slagle> dtantsur: why the jump from "if ironic drops bash ramdisk support" -> "can't gate on dib"
14:19:48 <dtantsur> slagle, DIB gate is running ironic master and will fail if we remove the code
14:20:21 <slagle> what part of the DIB gate? where it uses the bash ramdisk?
14:20:28 <derekh> why can't we just stop maintaining it, like any other element not used in CI
14:20:34 <dprince> dtantsur: if we update the DIB gate to use IPA would that solve it?
14:20:34 <slagle> why dont we just update that to not use the bash ramdisk?
14:20:43 <slagle> ^^ my question too
14:20:58 <dtantsur> slagle, gate-tempest-dsvm-ironic-pxe_ssh-dib
14:21:17 <dtantsur> dprince, yes, but we'll lose coverage for this element, and our stable branches may get hurt
14:21:34 <dtantsur> the old ramdisk was perfectly supported in Kilo and Liberty
14:21:35 <trown> well, really just liberty...
14:21:56 <dprince> dtantsur: so we create a stable/liberty as a buffer. No harm done
14:22:07 <dprince> dtantsur: and then we update the CI job to use IPA anyways?
14:22:18 <dtantsur> dprince, yep, then you feel free to do anything on master
14:22:31 <dtantsur> cause you won't affect our stable gates
14:22:41 <dtantsur> so yes, we're removing the bash-based gate
14:22:58 <slagle> i disagree with changing the dib backwards compatibility expectation on account of this
14:23:02 <dtantsur> then it's a good point that we should get an IPA-based gate
14:23:16 <slagle> if we need to do something as a buffer to get us by, that's fine
14:23:21 <trown> I am so confused... we remove the bash-based gate, why do we need the bash deploy element?
14:23:28 <dprince> slagle: I don't think we are changing backwards compat. Support for this element is just being removed
14:23:41 <dtantsur> trown, we can remove it
14:23:51 <dtantsur> if we have stable branches for DIB, I mean
14:23:52 <slagle> dprince: as long as this is clear. we're not free to do anything on master once/if the stable branch is there
14:23:56 <dprince> dtantsur: another option if people seem to push back on this (for whatever reason). Just leave the (broken) code in DIB and add a comment to the readme
14:24:12 <slagle> infra relies heavily on dib, etc
14:24:27 <dtantsur> dprince, well... then someone can land a change breaking our stable gates, still relying on DIB and the old ramdisk..
14:24:30 <trown> dprince: I prefer that option
14:24:53 <trown> if we create a stable/liberty for DIB, then that means we get that branch for liberty delorean packages
14:25:07 <slagle> i really dont think we want that
14:25:07 <dtantsur> probably
14:25:17 <trown> which seems a bit not ideal, if it is just meant to save us from one unsupported element
14:25:29 <dprince> dtantsur: I view creating a branch as a nice safe place for people to hang out while they update to IPA. Sounds like some people would prefer we not do that. So just add a comment to the readme and move on.
14:25:41 <dtantsur> what comment?
14:26:00 <dprince> dtantsur: a comment to the DIB element for the old bash ramdisk
14:26:03 <dtantsur> I'm sorry guys, I don't get how a comment will prevent people from breaking stable gates.....
14:26:13 <dtantsur> especially somewhere in a base element
14:26:17 <dprince> dtantsur: We'll just drop the gates
14:26:38 <dtantsur> dprince, how do you ensure you don't break ironic stable/liberty then?
14:26:53 <jroll> dropping the stable gates is equivalent to dropping old ramdisk support on liberty IMO
14:27:00 <dprince> dtantsur: leave the gates on for those branches. Using DIB master since there is no branch
14:27:17 <dtantsur> dprince, how do we prevent DIB from breaking us?
14:27:32 <dtantsur> since DIB master will no longer be cogated with ironic at all
14:27:37 <derekh> dtantsur: can't the stable job get run on DIB master ?
14:27:47 <dprince> dtantsur: To be clear creating a stable branch is the cleanest idea here. We created stable branches for everything else so what harm does it cause to do it for DIB too?
14:28:05 <dtantsur> derekh, maybe? I'm not sure how much infra would hate us for mixing it.. also what about requirements? how do we merge them?
14:28:26 <dtantsur> dprince, are you asking me? :) I'm not against it, you probably want to ask slagle
14:28:28 <trown> dprince: it means we freeze liberty DIB... unless we backport every DIB change
14:28:46 <dprince> trown: probably a good idea anyways
14:28:53 <bnemec> DIB makes backward compatibility promises.  If we break those, that's a bug that we should fix.
14:29:03 <slagle> i'm against changing the expectation of the project being backwards compatible on account of this reason
14:29:14 <slagle> a stable branch is also a lot of maintenance for $someone
14:29:20 <bnemec> I don't think we should branch an entire project because we _might_ screw up our backwards compatibility promise.
14:29:20 <dprince> This isn't a DIB backwards compat promise. It is a feature that is going away in the element
14:29:35 <dtantsur> bnemec, that's what the whole openstack does
14:29:47 <slagle> dtantsur: that's not true
14:29:55 <slagle> some projects use stable branches, some do not
14:30:20 * derekh steps out for a minute
14:30:21 <slagle> earlier on, it was decided, with wide consensus, that dib would not use a stable branch and be backwards compatible
14:30:24 <dtantsur> a couple of telemetry projects do not iirc
14:30:39 <dtantsur> ok, the vast majority of openstack projects, including libraries and clients
14:30:51 <dprince> The problem here is DIB has code it can't control. I think the issue here is that some of these elements don't belong in DIB because they break the promise
14:31:08 <dprince> This is just an element that is going away.
14:31:10 <bnemec> If Ironic is that concerned about dib breaking their gate, put a cap on dib in the stable branches.
14:31:30 <jroll> bnemec: that will affect all of openstack, unfortunately
14:31:37 <dprince> bnemec: Ironic wants to dump a feature that will break our gate
14:31:42 <dtantsur> bnemec, that's an option, but it will 1. put a cap on DIB for all stable/liberty branches, including tripleo itself, 2. prevent anyone from landing DIB fixes in liberty
14:33:42 <dprince> Anytime someone tries to optimize something by not creating a branch it gets complicated I think. To me the simplest thing is just to create a stable/liberty branch and move on
14:33:51 <slagle> having to land dib fixes in liberty is exactly what i want to avoid
14:34:42 <dtantsur> slagle, well, then if it gets broken (e.g. external mirror change), it's broken forever
14:34:47 <dprince> slagle: I understand your desire to avoid this work. But I think it is the cleanest solution here
14:34:50 <dtantsur> cause it will be capped for liberty release of all projects
14:35:12 <bnemec> I mean, if we make a breaking change in dib's base elements, that's going to break Ironic's gate anyway because they're still running against master on master.
14:35:30 <bnemec> And that's a bug we need to fix anyway.
14:35:38 <dtantsur> bnemec, we won't be gating DIB any more
14:35:43 <dtantsur> so no, you won't
14:36:28 <trown> wait... if you wont be gating DIB... whats the big deal?
14:36:41 <slagle> ya, my head just 'sploded too
14:36:44 <jroll> we still need to gate DIB on stable/liberty
14:36:45 <dtantsur> trown, not on master
14:36:50 <bnemec> Fine, but my point stands.  _If_ we break the job, then it's a bug that should be fixed.
14:36:51 <slagle> is that the end game here?
14:36:53 <dtantsur> trown, slagle, bnemec was talking about master
14:37:03 <bnemec> That is _not_ reason to branch dib.
14:37:20 <dtantsur> bnemec, sigh... but you won't have any jobs on master... so you will break us, and we'll come back reverting things in hope it will help, etc..
14:37:46 <trown> seems like the reverse is true for tripleo almost everywhere
14:37:58 <dtantsur> what we do right now is reinventing the whole path that led openstack to stable branches, to be honest
14:39:14 <dprince> dtantsur: okay. Lack of consensus. But I don't think anything is blocked by not doing anything
14:39:15 <lucasagomes> we still can gate on DIB on master, but using the ironic-agent element (instead of the deploy-ironic one)
14:39:26 <slagle> dprince: that may be. i would like to understand if infra has a take on this as well though, given they are a heavy consumer as well
14:39:28 <dtantsur> lucasagomes, that's not relevant to the discussion
14:39:52 <bnemec> It is actually.
14:40:12 <bnemec> It gives us test coverage of everything except the bash ramdisk element.
14:40:16 <dtantsur> bnemec, no
14:40:19 <dprince> dtantsur: yep, just use DIB master for the stable ironic/liberty branches.
14:40:46 <dtantsur> dprince, that's what we do, how does it solve the problem?
14:41:04 <dtantsur> bnemec, IPA is built in a completely different way. even using a different command
14:41:15 <dtantsur> from DIB point of view, IPA is not a ramdisk, we build it as a disk image
14:41:44 <bnemec> dtantsur: ramdisk-image-create is literally a symlink to disk-image-create.  There's far less difference than you might think.
14:41:56 <sambetts> if DIB master is meant to be backward compatible can't we leave the bash ramdisk in there and just remove the gate jobs from Ironic master/mitaka and the Ironic code that supports the old ramdisk, then just leave a comment in the old ramdisk README that says this is only supported up to Ironic liberty
14:41:58 <dtantsur> bnemec, yeah, but base elements are different, at least used to be
14:42:18 <trown> sambetts: ya, that would be my preference
14:42:19 <dtantsur> sambetts, "it is meant to be" does not mean it always is. that's what the gate guarantees
14:43:00 <sambetts> ?
14:43:13 <dtantsur> sambetts, if it's not tested, it's broken :)
14:43:27 <dprince> dtantsur: sounds like we aren't even close to resolving this. Sorry. I thought this would be a simple thing :/
14:43:32 <sambetts> it's tested in stable/liberty, just not in master or stable/mitaka
14:43:34 <dtantsur> yeah...
14:43:41 <dprince> dtantsur: can you start another thread for TripleO regarding this topic
14:43:57 <dtantsur> well... I can try, but looks like we need workarounds
14:43:58 <dprince> dtantsur: explain the sides, creating the branch, extra work involved by some in maintaining that, etc.
14:44:13 <dtantsur> anyway, thanks dprince for bringing it up
14:44:21 <dprince> dtantsur: np
14:44:23 <dprince> #topic CI
14:44:33 <jroll> well, there's two clear paths: 1) make a branch, drop the support in ironic now, 2) don't make branch, drop ironic support in three cycles
14:44:41 <jroll> lots of tradeoffs :)
14:44:41 <dprince> derekh: CI is working right? :)
14:44:45 * jroll shuts up now
14:44:59 <sambetts> jroll: just because its in DIB doesn't mean that mitaka has to support it right?
14:45:01 <derekh> dprince: yup, at the moment
14:45:17 <jroll> sambetts: let's take this elsewhere, we're off topic now
14:45:21 <dprince> cool.
14:45:29 <derekh> dprince: problems with a new mariadb package yesterday
14:45:33 <dprince> jroll: yeah, lets just start another thread on this for now
14:45:42 <dprince> derekh: which are solved now right?
14:45:44 <gfidente> a little OT but about CI, I had a submission to rename the ceph into upgrades but it isn't landed yet, not sure if you guys want to vote on it https://review.openstack.org/#/c/281997/1
14:46:07 <trown> derekh: dprince, ya RDO will still want to move to mariadb10, so we need to figure out what all went wrong there and try to get fixes in place
14:46:17 <dprince> gfidente: thanks, I will look
14:46:39 <dprince> gfidente: was that the right patch?
14:46:46 <trown> there was at least one issue (missing clustercheck binary) that was the packaging fault, but even after adjusting for that I could not get our galera to bootstrap with the new package
14:46:54 <gfidente> there is this one as well https://review.openstack.org/#/c/260466/ which potentially runs upgrades in the upgrade job
14:47:14 <dprince> #link https://bugs.launchpad.net/tripleo/+bug/1547660
14:47:14 <openstack> Launchpad bug 1547660 in tripleo "Could not find command '/usr/bin/clustercheck'" [Critical,Triaged]
14:47:16 <dprince> trown: ^^?
14:47:21 <gfidente> dprince, https://review.openstack.org/277419
14:47:23 <trown> ya, that one is packaging fault
14:47:41 <dprince> gfidente: thanks, that is it
14:47:42 <trown> dprince: so we reverted the package from the deps repo in RDO
14:48:18 <gfidente> #link tripleo ceph rename into upgrade https://review.openstack.org/277419
14:48:44 <gfidente> #link trigger upgrades in ci https://review.openstack.org/#/c/260466/
14:49:29 <dprince> okay lets move on
14:49:33 <dprince> #topic specs
14:49:35 <derekh> gfidente: thanks for the reminder, will look again
14:50:23 <dprince> getting mostly positive feedback on the Mistral spec https://review.openstack.org/#/c/280407/
14:50:49 <dprince> rbrady: anything you'd like to add here?
14:51:20 <rbrady> dprince: nope.  I was giving it a day or so for feedback and then going to update
14:51:35 <dprince> rbrady: cool, sounds good
14:52:22 <dprince> Any other issues on specs this week?
14:53:46 <dprince> #topic open discussion
14:54:15 <bnemec> Missed this in the CI topic, but I think https://review.openstack.org/#/c/282462/ would help our HA job stability a bunch.
14:54:40 <gfidente> thanks bnemec !
14:54:41 <bnemec> Probably half or more of the failures I'm seeing are the nohostfound due to swift getting OOM'd.
14:54:48 <trown> I will be demoing tripleo-quickstart 2 weeks from tomorrow: https://www.youtube.com/watch?v=4O8KvC66eeU
14:55:09 <marios> i'd like to highlight this bug https://bugs.launchpad.net/heat/+bug/1539541 which is a problem for upgrades at the moment
14:55:09 <openstack> Launchpad bug 1539541 in heat "Can't ignore updates to OS::Nova::Server" [High,In progress] - Assigned to Steve Baker (steve-stevebaker)
14:55:18 <bnemec> Need to talk to derekh about whether we can increase the size of the undercloud again.
14:55:55 <gfidente> hey guys I wanted to raise a question as well, I was thinking to use something like https://review.openstack.org/#/c/270189/ to switch the networking configuration to using hostnames instead of ips
14:56:03 <gfidente> on the basis that this should help with cleaner ipv6 support
14:56:04 <dprince> bnemec: +2
14:56:11 <gfidente> do you think that is doable and worth it?
14:56:40 <marios> gfidente: thanks will take a look. time is the main concern before i even look at the change
14:56:41 <dprince> trown: did you see my comment about distro support for tripleo-quickstart?
14:57:01 <marios> gfidente: but ultimately would help solve all of the '[]' issues with ipv6
14:57:06 <trown> dprince: yep, it supports everything that instack-virt-setup does
14:57:12 <bnemec> gfidente: No idea whether it's doable, but it seems like a good idea.
14:57:13 <gfidente> marios, exactly, my main point is that by using names we don't need the conditionals to cope with [] and : in the ip addresses
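To illustrate the conditionals gfidente refers to: an IPv6 literal has to be wrapped in `[]` when it is combined with a port in a URL, while IPv4 literals and hostnames do not. The Python sketch below is only an illustration of that rule, not code from the review under discussion; `host_for_url`, `endpoint`, and the sample hostname are hypothetical.

```python
import ipaddress


def host_for_url(host):
    """Return `host` in a form that is safe to place before ':port'."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a hostname and use it as-is.
        return host
    # Only IPv6 literals need brackets; IPv4 literals pass through.
    return f"[{host}]" if addr.version == 6 else host


def endpoint(host, port):
    """Build an http URL for `host` and `port` (scheme fixed for brevity)."""
    return f"http://{host_for_url(host)}:{port}"
```

When every service is addressed by name, the `except` branch is the only one exercised, which is why switching to hostnames removes the need for this kind of special-casing.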
14:57:27 <gfidente> bnemec, marios but it'll need some review time :)
14:57:38 <marios> bnemec: its a trap!
14:57:44 <gfidente> I know bnemec -1s my submissions all the time so that's good
14:57:56 <gfidente> marios volunteers too? ;)
14:58:06 <trown> dprince: or python-tripleoclient for that matter
14:58:19 <derekh> bnemec: ack, we can probably bump it again another 1G if needed
14:58:45 <dprince> trown: that may be. but I think it gets us farther from multi-distro support because some of the scripts in the incubator (used by instack-virt-setup) did in fact support multiple distributions
14:59:00 <bnemec> derekh: I think we do, although the extra swap usage doesn't seem to have slowed the job down any.  It still finished in the same time as the ceph job.
14:59:18 <trown> dprince: I think ansible is better at supporting multi-distro than random bash scripts though
14:59:35 <dprince> trown: that was my real question. Does it get us in a better place...
14:59:45 <trown> dprince: for example, tripleo-quickstart uses the generic 'package' module instead of 'yum'
15:00:08 <derekh> bnemec: ok, lets merge that and see how it goes, if speed isn't affected then maybe its enough
15:00:13 <trown> that said, I have not tried it at all on anything except fedora and centos
15:00:23 <dprince> trown: cool, I might like to see package name abstractions. I know when I came to TripleO I was really discouraged to see everything hard coded to Debian package names
15:00:36 <dprince> trown: now the opposite is true in some cases. Hard coded to RH
15:00:38 <bnemec> derekh: Just pulled the trigger.
15:00:49 <dprince> oops. out of time
15:00:57 <dprince> Thanks everyone. Sorry about the quick cutoff
15:01:01 <derekh> bnemec: ack
15:01:06 <dprince> #endmeeting