14:00:27 <shardy> #startmeeting tripleo
14:00:28 <openstack> Meeting started Tue May 10 14:00:27 2016 UTC and is due to finish in 60 minutes.  The chair is shardy. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:31 <openstack> The meeting name has been set to 'tripleo'
14:00:39 <shardy> #topic rollcall
14:00:41 <jdob> o/
14:00:47 <shardy> Hey all, who's around?
14:00:57 <tzumainn> hi!
14:01:05 <dprince> hi
14:01:14 <EmilienM> o/
14:01:56 <beagles> o/
14:02:09 <derekh> o/
14:02:22 <shardy> Ok then, let's get started :)
14:02:32 <shardy> #topic agenda
14:02:32 <shardy> * one off agenda items
14:02:32 <shardy> * bugs
14:02:32 <shardy> * Projects releases or stable backports
14:02:32 <shardy> * CI
14:02:35 <shardy> * Specs
14:02:35 <rhallisey> hi
14:02:39 <shardy> * open discussion
14:02:47 <shardy> Anyone have anything to add to the one-off items?
14:02:57 <shardy> there's one from me and one from beagles atm:
14:03:05 <shardy> #link https://wiki.openstack.org/wiki/Meetings/TripleO#One-off_agenda_items
14:03:21 <shadower> hey (sorry for being late)
14:03:34 <shardy> hey shadower, np
14:03:50 <shardy> #topic one off agenda items
14:04:04 <shardy> beagles: Hey, do you want to cover your item first here?
14:04:10 <beagles> shardy: sure
14:04:25 <shardy> #info bug tag for partial bug fixes
14:04:40 <beagles> I've been going through some of the older bugs especially the ones that are "in progress"
14:05:06 <d0ugal> o/
14:05:10 <beagles> there have been patches landed as workarounds, but it's not clear that they are needed any longer, so I would like to tag bugs like this for now
14:05:22 <beagles> so that we can go through and clean them as time and resources permit
14:05:37 <EmilienM> I like the idea, especially for things related to our CI
14:06:13 <shardy> there is already a CI tag, but +1 on having a possibly-obsolete-hack-workaround tag (naming to be determined)
14:06:38 <EmilienM> ++
14:06:47 <shardy> anything that allows us to divide and conquer so we start purging old and no longer valid bugs gets my +1 :)
14:06:54 <shardy> beagles: thanks for helping out here!
14:07:04 <beagles> cool... I'm open to suggestions on the name. Naming is always the hardest part
14:07:21 <beagles> shardy: np .. it's good stuff. Nice way to get some historical context on things
14:08:06 <beagles> temp-workaround might be reasonably descriptive and not be as scary as "hack"
14:08:31 <beagles> anyways, that's that in a nutshell; we can bikeshed on names later
14:08:33 <shardy> beagles: +1. or just "workaround" would be OK I guess
14:08:46 <EmilienM> both wfm
14:08:49 <beagles> k
14:08:51 <shardy> we can decide the name in #tripleo later, thanks!
14:09:05 <shardy> Ok next topic was midcycle plans
14:09:28 <shardy> this was mentioned a couple of times recently, and I'd not considered organizing any f2f meetup this time
14:09:36 <EmilienM> +1
14:09:49 <shardy> what do folks think, should we aim for some sort of virtual hackfest/meetup around the middle of the cycle?
14:10:00 <beagles> +1 on virtual meetup
14:10:05 <shardy> I could arrange a series of focussed topic video calls or something
14:10:13 <marios> shardy: sounds good
14:10:14 <EmilienM> +1 on using #openstack-sprint + videoconf if needed
14:10:21 <shadower> yea better than traveling
14:10:23 <ccamacho> +1 virtual meetup
14:10:38 <shardy> obviously we can do that at any time, but it'd maybe be good to encourage some high-bandwidth discussion via some google hangouts or whatever
14:10:46 <EmilienM> or you can come in Quebec and I can cook french cooking
14:11:03 <beagles> :)
14:11:16 <slagle> +1 to virtual
14:12:14 <shardy> Ok, sounds like vague consensus, I'll start a ML thread where we can decide the date and agenda
14:12:29 <EmilienM> shardy: I can help you to start agenda
14:12:37 <shardy> #info agreed to aim for virtual mid-cycle meetup, ML thread pending
14:12:45 <derekh> I'm happy to go virtual if bnemec turns on his cam so we can see what kind of gestures he is making at the screen ;-)
14:13:01 <shardy> Any other one-off items before we continue?
14:13:06 <bnemec> derekh: :-)
14:13:36 <shardy> #topic bugs
14:13:56 <shardy> #link https://bugs.launchpad.net/tripleo/
14:14:23 <derekh> Current bug on trunk https://bugs.launchpad.net/tripleo/+bug/1580076
14:14:24 <openstack> Launchpad bug 1580076 in tripleo "Upgrades job failing pingtest with "Message: No valid host was found."" [Critical,Triaged]
14:14:37 <derekh> causing all upgrades jobs to fail
14:15:25 <slagle> hmm, is that happening before or after the stack-update?
14:15:26 <EmilienM> weird, I swear I saw pingtest working last night
14:15:27 <shardy> There's also https://bugs.launchpad.net/tripleo/+bug/1580170 which looks like a puppet module version mismatch on liberty->mitaka upgrade, possibly need to get more info on that
14:15:28 <openstack> Launchpad bug 1580170 in tripleo "overcloud upgrade liberty to mitaka failed" [Undecided,New]
14:15:41 <slagle> i would assume after since other jobs are passing the pingtest
14:15:42 <EmilienM> (I saw pingtest working *after* stack update last night)
14:15:48 <slagle> i wonder if we have a real upgrades bug?
14:15:56 <ccamacho> derekh, but it not only affects upgrades, it also (sometimes) affects creating the overcloud
14:16:14 <derekh> slagle: after, I have a hunch this patch started the problem https://review.openstack.org/#/c/312300/1 but its just currently a hunch
14:16:27 <ccamacho> I'm trying to reproduce it creating the overcloud in my CI
14:16:44 <derekh> ccamacho: ok
14:17:24 <derekh> slagle: yup, we may possibly have found a real upgrades bug
14:17:29 <EmilienM> derekh: why?
14:17:38 <EmilienM> why 312300 ? do you have logs?
14:18:17 <dprince> derekh: have you tested a puppet pin that uses keystone prior to that yet?
14:18:52 <derekh> EmilienM: dunno, don't worry about it being that patch until I run some tests; it's just a hunch based on the auth problem we're seeing, when the problem started and the timing of that patch
14:18:57 <EmilienM> dprince: it's nova
14:19:12 <EmilienM> derekh: 5th may iirc
14:19:40 <ccamacho> derekh, agreed on the starting time for the issues
14:19:55 <EmilienM> derekh: it can't be this one, upgrade job passed at 2016-05-05 23:52:20
14:20:06 <EmilienM> and the puppet-nova patch merged at May 5 6:23 PM
14:20:18 <EmilienM> source: http://tripleo.org/cistatus.html and https://review.openstack.org/#/c/312300/
14:20:31 <dprince> EmilienM: okay, either way I'd say lets test them both to see
14:20:41 * EmilienM checks in tripleo logs to check we had the commit
14:21:17 <shardy> Ok, so it sounds like we've got enough eyes on this issue, are there any other bugs folks want to highlight?
14:21:30 <derekh> EmilienM: ok, btw I'm not suggesting we jump into reverting it or anything, just letting people know what my current train of thought was
14:21:46 <EmilienM> derekh: wait, I checked in the logs, and this is the commit in puppet-nova that worked: https://github.com/openstack/puppet-nova/commits/b108a7c36bbc733b3aa90786540e978f5c0ec059
14:21:55 <EmilienM> and we don't have the one you mentioned, so it's still a possibility
14:22:17 <derekh> dprince: will try a temp revert; kind of tried something similar here but it didn't even get to the ping test https://review.openstack.org/#/c/314510/
14:22:22 <derekh> EmilienM: ack
14:22:32 <shardy> Just a reminder to please target bugs when you triage them, e.g if it's an actual bug in TripleO pieces vs a CI fix
14:22:35 <shardy> https://launchpad.net/tripleo/+milestone/newton-1
14:22:56 <shardy> Then we can burn down the open list for the milestone and know when it's a good time to release
14:22:56 <dprince> derekh: I see. I should know to always check the "nothing to see" patches first ;)
14:23:15 <derekh> dprince: you should have learned by now ;-)
14:23:22 <EmilienM> derekh: well, 314510 has same effect as temprevert... and it fails :(
14:23:47 <derekh> EmilienM: ya but it didn't get as far as the ping test,
14:23:51 <derekh> EmilienM: going to recheck now
14:23:58 <dprince> derekh: let's recheck that one I think
14:24:09 <shardy> and also ref the cleanup beagles has been helping with - please review the list of bugs raised by you, and close any ye-olde ones which are no longer valid
14:24:25 <derekh> dprince: done
14:25:20 <shardy> Ok shall we continue and defer further discussion re bug #1580076 to after the meeting?
14:25:21 <openstack> bug 1580076 in tripleo "Upgrades job failing pingtest with "Message: No valid host was found."" [Critical,Triaged] https://launchpad.net/bugs/1580076
14:25:29 <EmilienM> ++
14:25:40 <derekh> +1
14:25:42 <shardy> #topic Projects releases or stable backports
14:26:12 <shardy> #link http://releases.openstack.org/newton/schedule.html
14:26:26 <shardy> So, I wanted to run a plan past you all for the n-1 milestone
14:27:04 <shardy> I was thinking it'd be good to do a coordinated release of all the tripleo pieces, based on a passing periodic CI job, around the time of the n-1 milestone (e.g in about 3 weeks time)
14:27:32 <shardy> I'll probably write a script that can scrape the latest periodic CI pass and propose all-the-things to openstack/releases
14:27:46 <shardy> then at any time, we can tag a release for a combination of things we know to work
14:28:02 <EmilienM> just an FYI about puppet modules, we might produce a first newton release by end of this month
14:28:17 <dprince> shardy: would be cool to display that on tripleo.org perhaps too. Maybe in the CI status page or something
14:28:41 <slagle> shardy: once the releases were done, how would people consume them?
14:29:19 <derekh> shardy: your script can look at this file http://trunk.rdoproject.org/centos7/current-tripleo/versions.csv to see what version of each project is included (just FYI in case you didn't know)
14:29:33 <slagle> just wondering how we'd be able to definitively install a n-1
14:29:34 <shardy> slagle: that's a good question - I've not yet figured that out - I was hoping we could wire up tripleo-quickstart to enable easily deploying a given release
14:29:49 <shardy> obviously we can also publish the delorean hash for the passing CI run
14:30:06 <shardy> but I was hoping we could get the tagged repos more directly consumable
14:30:26 * shardy looks around for trown
14:31:21 <derekh> shardy: haven't seen trown around, I guess his wife had the baby; it could be a while before we see him
14:31:36 <shardy> slagle: I guess I was focussing on the first step, which is to define a point-in-time release which we expect to work, and has a known bunch of features/bug-fixes in it
14:31:50 <shardy> slagle: you're right, we need to then define and document how folks consume it
14:31:55 <shardy> derekh: ah, cool
14:32:24 <slagle> shardy: k, i think we could come up with something
14:32:45 <shardy> If folks are OK with the idea of milestone related releases, we can do that and figure out the consuming of it
14:33:18 <dprince> shardy: I always use trunk, but whatever :)
14:34:18 <shardy> Ok, let's table the release discussion and work it out over the next couple of weeks
14:34:24 <shardy> anything else release related?
14:34:52 <shardy> https://review.openstack.org/#/c/308236/ has some discussion re our stable branches FYI
14:35:11 <shardy> there's some resistance to our application for the follows-stable tag
14:36:33 <shardy> feel free to pitch in there if you have an opinion - I may start a ML thread on the same topic
14:36:56 <shardy> #topic CI
14:37:09 <shardy> So, other than the upgrades job, any CI news to discuss?
14:37:25 <EmilienM> yes
14:37:31 <shardy> the newly discovered step removal aka "turbo" option? :)
14:37:34 <dprince> shardy: Emilien is chopping some of the steps from deployment
14:37:37 <EmilienM> yesterday I released puppet-ceph
14:37:43 <slagle> all jobs now 7 minutes faster!
14:37:44 <dprince> that should speed things up I think
14:37:48 <slagle> upgrades job now 14 minutes faster!
14:37:55 <EmilienM> and we need to bump puppet-ceph to stable/hammer
14:37:58 <EmilienM> please review https://review.openstack.org/#/c/314311/
14:38:05 <EmilienM> I'm not sure this patch does what I want
14:38:32 <shardy> Cool, well nice work on the optimization! :)
14:38:45 <dprince> EmilienM: we aren't using stable for the other modules though?
14:38:47 <EmilienM> and yeah, I'm also working on step6 removal https://review.openstack.org/314253
14:38:51 <derekh> It's a tripleo improvement though, not just CI; I wouldn't like people thinking we just fixed something in CI
14:38:52 <dprince> EmilienM: why puppet-ceph?
14:38:53 <EmilienM> dprince: yeah but ceph is special
14:39:04 <shardy> I've been in discussion with zaneb and other heat folks, and it sounds like the heat memory usage issues are likely to be improved soon
14:39:16 <EmilienM> dprince: not a lot of people are working on this module, and we found out tripleo can't deploy Jewel *right now*, so better to pin
14:39:36 <sshnaidm> I'd like you consider tempest for periodic nonha jobs, please: https://review.openstack.org/#/c/297038/
14:39:46 <shardy> https://review.openstack.org/#/c/311837/ is the first step if you'd like to follow it
14:40:20 <dprince> EmilienM: https://review.openstack.org/#/c/184844/ was an idea too
14:40:47 <slagle> shardy: was wondering if we should set up a periodic job to test with convergence?
14:41:08 <slagle> shardy: what do you think our plans should be around convergence? do you think we could switch in newton?
14:41:09 <EmilienM> dprince: should I update my patch or does mine work too?
14:41:19 <shardy> slagle: Probably not yet - I tested it locally quite recently, it makes the memory usage and DB utilization/timeout issues much, much worse atm
14:41:28 <dprince> slagle: yeah, hearing about the memory usage of convergence at Austin was a bit concerning
14:41:29 <slagle> i see, ok
14:42:02 <shardy> slagle: I think we should probably wait for things to get optimized and configure for non-convergence at least until later in Newton
14:42:37 <shardy> Heat may switch the default relatively soon, but IMHO the benefits don't yet outweigh the performance issues for the TripleO use-case
14:42:43 <dprince> EmilienM: I'd just like to clarify what we are doing where. If we do it in your patch at the very least a comment explaining why. Initially I thought we'd have a separate element for this though. That was my point.
14:42:55 <EmilienM> dprince: commit message is not enough?
14:43:14 <dprince> EmilienM: #comment
14:43:21 <EmilienM> kk
14:43:40 <shardy> slagle: we could set up a periodic job tho I suppose, or maybe an experimental job that we can try
14:44:18 <dprince> shardy: the non-convergence path would eventually get removed though right/
14:44:35 <dprince> shardy: so we'd be on borrowed time to fully migrate over once heat switches?
14:44:40 <shardy> dprince: eventually I guess, but there's no discussion of that happening anytime soon
14:45:12 <shardy> it's not even deprecated yet; they'll switch the default, then we'd need *at least* two full cycles before anything got removed, and I anticipate it being longer than that
14:45:26 <EmilienM> shardy: what is the progress on dropping OPM rpm in stable jobs and use git?
14:46:03 <shardy> EmilienM: I proposed an approach, but dprince had a preference for a different implementation - I've not yet had time to revisit, so it's still TODO
14:46:20 <shardy> if anyone wants to pick that up, feel free
14:46:29 <EmilienM> ack
14:46:55 <shardy> we actually need to switch master to using the per-module delorean packages too ref trown's item last week
14:47:09 <shardy> Ok, anything else re CI before we continue?
14:47:19 <derekh> Main thing I got to report is that it looks like we're going ahead with the HW upgrade tomorrow
14:47:25 <derekh> I'll likely take the cloud down around 1pm UTC, and expect it down for about 12 hours
14:47:36 <derekh> sshnaidm: once that's done, assuming it speeds things up even more, I think we can start thinking about adding tempest to the periodic job
14:47:48 <dprince> derekh: hope it goes well
14:47:48 <sshnaidm> derekh, cool, thanks
14:48:00 <shardy> derekh: nice - do you need any help or do you have the bringup post-upgrade covered?
14:48:05 <dprince> I've got one thing that has to happen before the 14th as well. Very important
14:48:14 <derekh> dprince: certs ?
14:48:14 <dprince> We need a new SSL cert :/
14:48:23 <dprince> derekh: yep :)
14:48:29 * derekh meant to remind dprince last week, oops
14:48:38 <shardy> heh, good reminder :)
14:48:41 <dprince> derekh: I got my own reminder. So its on the list
14:48:47 <bnemec> Who needs certs when your cloud is down for upgrades? :-)
14:48:57 <shardy> OK #topic Specs
14:48:59 <sshnaidm> another topic - please comment on my elastic-recheck related mail on the mailing list if you're interested
14:49:02 <sshnaidm> sorry :)
14:49:03 <dprince> bnemec: well, if it comes back up we'll need em
14:49:06 <shardy> #topic Specs
14:49:17 <derekh> shardy: I think I've got it handled, will ping people if I need extra hands
14:49:26 <shardy> So, there's a ML message about two specs related to opnfv
14:49:32 <shardy> would be good to get some eyes on those
14:49:59 <shardy> #link https://review.openstack.org/#/c/313871/
14:50:12 <shardy> #link https://review.openstack.org/#/c/313872/
14:50:49 <shardy> I also added https://blueprints.launchpad.net/tripleo/+spec/custom-roles to track the fully-composable model we discussed at summit, which those two will probably require
14:51:18 <shardy> #link http://lists.openstack.org/pipermail/openstack-dev/2016-May/094287.html
14:52:21 <beagles> fwiw, I don't think they actually know how the dpdk one is going to work out just yet. The SR-IOV one is "closer" in terms of reality at the moment
14:52:44 <beagles> there are some non-tripleo related things to work out with respect to dpdk
14:53:03 <shardy> beagles: cool, please comment on the specs
14:53:10 <beagles> yup
14:53:18 <shardy> there's a lot of detail in both, but not that much clarity on the actual implementation AFAICT
14:53:39 <shardy> Anything else spec related?
14:54:05 <shardy> https://launchpad.net/tripleo/+milestone/newton-1
14:54:30 <shardy> we only have two features on the n-1 list, so we should add anything we expect to land in the next 2-3 weeks
14:54:46 <shardy> #topic open discussion
14:54:58 <shardy> Sorry, only 5 mins left - anything to raise?
14:55:34 <dprince> shardy: I added a new spec too
14:55:37 * derekh runs out the door a little early
14:55:38 <dprince> #link https://blueprints.launchpad.net/tripleo/+spec/remote-execution
14:56:02 <dprince> The power there is really in the Mistral workflow bits... but the CLI work shows it nicely I think
14:56:42 <EmilienM> I like it, but I wondered about security here, and the capacity of running malicious software remotely
14:57:01 <slagle> can we try to link etherpads tracking patches/reviews into the blueprints?
14:57:09 <dprince> EmilienM: I'm using the same mechanism we use for Heat. So nothing new really I think
14:57:13 <shardy> dprince: Nice, I saw that, looks good - it'd be interesting to see how that aligns with operator requests re e.g running ansible against a dynamic inventory generated by TripleO
14:57:14 <EmilienM> if someone gets admin credentials, it's easy to run malicious software remotely
14:57:15 <slagle> i noticed there were ones for composable services and mistral
14:57:19 <slagle> etherpads, that is
14:57:22 <slagle> but no one can find them
14:57:31 <shardy> EmilienM: it's already easy to do that if you have credentials
14:57:42 <marios> EmilienM: if you already have access to the pvt keys then you can do whatever you want anyway (e.g. from undercloud)
14:57:45 <EmilienM> ok, I'm just highlighting, just in case.
14:58:09 <EmilienM> slagle: https://etherpad.openstack.org/p/tripleo-composable-roles-work ?
14:58:23 <shardy> slagle: good point, lets link them from the whiteboards on the blueprints
14:58:28 <slagle> EmilienM: yes, pads like that
14:58:41 <slagle> the link isn't discoverable unless you already know it
14:58:41 <dprince> composable services is here: https://etherpad.openstack.org/p/tripleo-composable-services
14:59:00 <slagle> lol, or we end up with 2 etherpads :)
14:59:13 <shardy> <facepalm>
14:59:16 <slagle> qed
14:59:20 <shardy> Ok, we're out of time - thanks all!
14:59:24 <shardy> #endmeeting