14:01:05 <EmilienM> #startmeeting tripleo
14:01:06 <openstack> Meeting started Tue May 23 14:01:05 2017 UTC and is due to finish in 60 minutes.  The chair is EmilienM. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:10 <openstack> The meeting name has been set to 'tripleo'
14:01:12 <EmilienM> #topic agenda
14:01:18 <EmilienM> * review past action items
14:01:19 <EmilienM> * one off agenda items
14:01:21 <EmilienM> * bugs
14:01:24 <EmilienM> * Projects releases or stable backports
14:01:25 <EmilienM> * CI
14:01:27 <EmilienM> * Specs
14:01:29 <EmilienM> * open discussion
14:01:31 <EmilienM> Anyone can use the #link, #action and #info commands, not just the moderator!
14:01:33 <EmilienM> Hi everyone! who is around today?
14:01:36 <jtomasek> o/
14:01:38 <beagles> o/
14:01:39 <panda> o/
14:01:40 <rasca> o/
14:01:41 <ccamacho> o/ o/
14:01:42 <marios> \o
14:01:43 <sshnaidm> \o/
14:01:51 <atoth> o/
14:02:00 <EmilienM> sshnaidm: /o\
14:02:01 <matbu> o/
14:02:08 <saneax> o/
14:02:10 <jrist> o/
14:02:13 <shardy_> o/
14:02:50 <ccamacho> the owl say hi!  (~˘▾˘)~
14:02:56 <EmilienM> ccamacho: perfect
14:02:59 <EmilienM> #topic review past action items
14:03:01 <adarazs> o/
14:03:06 <EmilienM> * rasca to send ML about tripleo-quickstart-utils so we keep open discussion going: done
14:03:49 <EmilienM> rasca: feel free to post updates, sounds like we're waiting for your feedback on the replies now
14:04:03 <EmilienM> * EmilienM to share tripleo project updates slides to ML: done
14:04:14 <rasca> EmilienM, sure
14:04:24 <EmilienM> #link tripleo project updates - boston https://docs.google.com/presentation/d/1knOesCs3HTqKvIl9iUZciUtE006ff9I3zhxCtbLZz4c
14:04:26 <mwhahaha> o/
14:04:36 <d0ugal> o/
14:04:36 <EmilienM> #topic one off agenda items
14:04:43 <cdearborn> \o
14:04:56 <EmilienM> panda: please go ahead!
14:04:58 <panda> EmilienM: thanks, I'll make it quick
14:06:02 <panda> so, there has been an explosion in featureset files and the current assignment function i_ll_just_pick_one() is not really working, so until we find a better solution I set up this etherpad https://etherpad.openstack.org/p/quickstart-featuresets for coordination
14:06:08 <pradk> o/
14:06:42 <panda> so we can acquire the lock on an index number without stepping on each other's feet
14:07:11 <EmilienM> will i_ll_just_pick_one() query the etherpad content?
14:07:20 <panda> the other thing is that currently there are a lot of transitions in oooq, oooq-extras and tripleo-ci, and also we are still working to dissolve some confusion around featureset files
14:07:38 <EmilienM> panda: I think etherpad is a good start for now
14:08:06 <EmilienM> I also believe the featuresets should be documented in-tree of tripleo-quickstart
14:08:10 <panda> So I'd like to ask that people who want to +2 reviews on those projects attend some of the CI meetings, to be sure they are updated on the latest developments
14:08:28 <panda> EmilienM: yes, we have to expand documentation around featuresets a lot
14:08:44 <panda> I'm done.
14:09:03 <EmilienM> panda: actually, every time we add a featureset in oooq, it should be *required* to document it in the patch
14:09:19 <EmilienM> panda: maybe you can send a reminder to openstack-dev [tripleo] so everyone can read this info
14:09:28 <panda> EmilienM: yes, we currently update the matrix, but we have to add other pieces to the documentation
14:09:46 <panda> like what variable should go in featureset files
14:09:50 <EmilienM> good, maybe make it clear again, and take the opportunity to share the etherpad url
14:09:54 <panda> and how to treat them
14:10:13 <panda> yes, I'll follow with an email to openstack-dev
14:10:26 <EmilienM> thank you sir
14:11:12 <EmilienM> ccamacho: go ahead
14:11:20 <ccamacho> hey guys, just a quick update on 2 features I want to have, basically to have a command to backup and restore an undercloud..
14:11:26 <ccamacho> https://blueprints.launchpad.net/tripleo/+spec/undercloud-backup-restore
14:11:42 <ccamacho> I have created this blueprint to log the progress.. I started it last thursday
14:11:55 <ccamacho> now testing the 3rd iteration locally
14:12:25 <ccamacho> Please, feedback is welcomed there :)
14:12:26 <ccamacho> thanks
14:12:40 <marios> ccamacho: ++ nice addition esp for pre-upgrade backup. some changes esp around network config have proven very problematic and best recovery is undercloud restore for the networks
14:12:57 <marios> ccamacho: will you write a spec?
14:13:12 <marios> is it necessary i mean if it will be large change?
14:13:18 <ccamacho> sure if it's needed
14:13:31 <ccamacho> I don't think it will be a large change
14:13:49 <EmilienM> I'm always asking myself if the undercloud is considered as ephemeral or if there are some cases where we want to back it up
14:13:54 <panda> will this have to work also in baremetal undercloud ?
14:13:57 <ccamacho> it's more like automating 2 steps we already have documented
14:14:20 <EmilienM> I think I've had this discussion with dprince a while ago
14:14:43 <ccamacho> for the backup I want to keep the data needed to run the undercloud install again and make it work again..
14:15:15 <EmilienM> when you say the data, what else do we have beside the mysql database?
14:15:19 <ccamacho> I think it is critical for users to have a "simple execute this command" to have a backup
14:15:50 <matbu> ccamacho: +1
14:16:01 <dprince> if we had a backup-restore mechanism we could just always use that as a means to "update" (reinstall a fresh undercloud)
14:16:09 <marios> EmilienM: some of this is from the recent conversations about improving the upgrade but i am sure it has been discussed in the past esp the issue with undercloud restore for the changed networks
14:16:11 <dprince> and could even upgrade to a containers undercloud this way
14:16:35 <ccamacho> undercloud configuration like networks
14:16:36 <ccamacho> yeah
14:16:59 <marios> EmilienM: sorry, 'recent' conversations, e.g. thread at http://lists.openstack.org/pipermail/openstack-dev/2017-May/116876.html
14:17:06 <marios> as an example (for anyone who missed it)
14:17:07 <EmilienM> do we want to store the bits on the filesystem or in something like Swift?
14:17:29 * ccamacho wasn't thinking about containers yet dprince but yeah probably can be a good thing to have
14:17:30 <EmilienM> marios: thx for the link
14:17:40 <marios> ccamacho: we should discuss offline i guess, but why not just db dump and restore?
14:18:13 <EmilienM> at this stage of the cycle, is it safe to target it for queens-1?
14:18:22 <ccamacho> marios ack we can check it later, what about keeping your OC images also in the backup..
14:18:24 <EmilienM> or do we want this one done by pike?
14:19:20 <ccamacho> EmilienM we can have it for Q but is it possible to backport it to P ?
14:19:38 <EmilienM> backporting a feature is against OpenStack stable policy
14:19:53 <EmilienM> ccamacho: https://docs.openstack.org/project-team-guide/stable-branches.html#support-phases
14:20:14 <ccamacho> mm EmilienM how much time do we have for landing it on P?
14:20:21 <ccamacho> at least the backup
14:20:29 <EmilienM> ccamacho: https://releases.openstack.org/pike/schedule.html
14:20:34 <EmilienM> I'll let you do some reading :)
14:20:44 <ccamacho> EmilienM ack thanks we can check it later then
14:20:53 <EmilienM> I don't see much pushback on this feature right now. I would suggest to also discuss it on the ML to make sure
14:21:11 <EmilienM> ccamacho: yes, you can check the schedule later and let me know what you think.
14:21:31 <EmilienM> #action ccamacho to propose undercloud backup/restore blueprint on ML
14:21:43 <ccamacho> thanks!
14:21:53 <EmilienM> #action panda to remind oooq featureset policy / etherpad on ML
14:22:22 <EmilienM> #action rasca follow up tripleo-quickstart-utils discussion on ML
14:22:29 <EmilienM> sshnaidm: go ahead please
14:22:52 <sshnaidm> just to make it visible to everyone:  http://lists.openstack.org/pipermail/openstack-dev/2017-May/117263.html
14:23:05 <sshnaidm> a proposal to do a sprint to reduce tripleo deployment time
14:23:20 <EmilienM> sshnaidm: good idea. I already replied
14:23:31 <EmilienM> sshnaidm: anything you want to discuss here?
14:23:47 <sshnaidm> EmilienM, no, that's it I think, the rest is in ML
14:23:54 <EmilienM> ok
14:24:12 <EmilienM> marios: go ahead please!
14:24:18 <marios> EmilienM: did you miss matbu?
14:24:24 <EmilienM> yes
14:24:27 <matbu> marios: lol yes :)
14:24:28 <EmilienM> I'm blind sorry
14:24:41 <EmilienM> matbu: go ahead
14:24:46 <matbu> but no worries :)
14:25:05 <matbu> so i wrote a BP for the major upgrade workflow:  https://blueprints.launchpad.net/tripleo/+spec/major-upgrade-workflow && https://etherpad.openstack.org/p/upgrade-workflow-and-validation
14:25:15 <EmilienM> sounds cool
14:25:22 <matbu> and an etherpad where I summarize all the ongoing discussions, reviews
14:25:23 <EmilienM> this one is for pike :D
14:25:25 <matbu> and BP
14:25:33 <matbu> well, why not ? :)
14:25:38 <EmilienM> the blueprint sounds too general to me
14:25:48 <matbu> its in progress anyway
14:25:49 <EmilienM> and I'm pretty sure it could be broken down into smaller bits
14:26:05 <EmilienM> but it's a first iteration I guess
14:26:32 <matbu> EmilienM: idk, smaller like one BP per upgrade step ?
14:26:36 <EmilienM> matbu: could you set up blueprint dependencies in Launchpad?
14:27:05 <EmilienM> matbu: not really but a BP per problem we solve
14:27:21 <EmilienM> in the etherpad, L21, it seems we have a list of BPs
14:27:27 <EmilienM> which sounds good
14:27:46 <matbu> EmilienM: yep, there is also the one that marios wants to discuss today
14:27:49 <EmilienM> I'm wondering if https://blueprints.launchpad.net/tripleo/+spec/major-upgrade-workflow is really needed
14:27:52 <marios> EmilienM: o/ am coordinating with florian on https://etherpad.openstack.org/p/tripleo-pre-upgrade-validations for some pre-upgrade validations. just ansible tasks in the existing tripleo-validations. additions/thoughts welcome. I think we can easily get some things here for P. thanks.
14:28:10 <marios> EmilienM: the blueprint for that is https://blueprints.launchpad.net/tripleo/+spec/pre-upgrade-validations
14:28:12 <EmilienM> you see, marios should have talked first :P
14:28:14 <marios> linked from the etherpad
14:28:19 <matbu> EmilienM: this one is for having the workflow in mistral and the cli implementation
14:28:28 <marios> EmilienM: yeah it is different
14:28:36 <EmilienM> all of this sounds good to me
14:28:52 <matbu> cool, i think we can target it to pike
14:29:01 <matbu> if it sounds reasonable to you
14:29:02 <EmilienM> we just need to communicate better about the blueprints we created during the cycle
14:29:14 <EmilienM> summarize the work in progress and make some prioritization and scheduling
14:29:24 <marios> EmilienM: also end of my item. https://blueprints.launchpad.net/tripleo/+spec/pre-upgrade-validations maybe target this one to P as well
14:30:20 <EmilienM> marios, matbu: I would let you guys talk to each other and summarize the blueprints you want in Pike and the ones for Queens and maybe share it to the ML so we can discuss there. Also, from the list we can do triage
14:30:37 <EmilienM> does it make sense?
14:30:39 <marios> EmilienM: but fwiw/info all of these things are discussed in that mail thread i pointed to earlier at http://lists.openstack.org/pipermail/openstack-dev/2017-May/116759.html
14:31:16 <marios> EmilienM: i mean, upgrade workflow in client/common, backup/restore (not sure that one is there actually), better validations/checks during the upgrade undercloud/overcloud
14:31:21 <EmilienM> marios: yes, I know, just re-use the thread then. I'm just thinking of sharing an overview of all blueprints related to $topic
14:31:24 <marios> EmilienM: ack thanks
14:31:30 <matbu> ack
14:31:52 <EmilienM> matbu, marios: it will help us to make the correct release triaging
14:32:23 <EmilienM> #action matbu + marios to share all upgrade-related blueprints on the ML thread
14:33:04 <EmilienM> matbu, marios: thanks! great work here
14:33:22 <EmilienM> please keep https://etherpad.openstack.org/p/upgrade-workflow-and-validation updated if you can
14:33:37 <matbu> yep thx
14:33:43 <EmilienM> do we have any other items this week before we go to the regular agenda?
14:33:46 <marios> thanks
14:34:00 <EmilienM> #topic bugs
14:34:04 <EmilienM> #link https://launchpad.net/tripleo/+milestone/pike-2
14:34:22 <EmilienM> do we have any outstanding bug to discuss this week?
14:35:07 <marios> EmilienM: the newton ovb job is still blocked right
14:35:13 <EmilienM> yes
14:35:17 <sshnaidm> EmilienM, maybe to define what could be done for pingtest bug
14:35:20 <marios> EmilienM: getting link sorry, sec
14:35:24 <EmilienM> I'm wondering if someone was looking at it
14:35:29 <EmilienM> because I haven't seen much progress
14:35:39 <EmilienM> I did a quick investigation yesterday and commented on the bug report
14:36:06 <EmilienM> let's talk about it during the CI topic
14:36:26 <marios> EmilienM: ack
14:36:30 <EmilienM> besides CI issues, is there any outstanding bug in tripleo to discuss?
14:36:47 <EmilienM> alright, let's talk about ci
14:36:54 <EmilienM> #topic CI
14:37:03 <EmilienM> so we currently have 4 alerts
14:37:09 <sshnaidm> EmilienM, I don't think pingtest failure is CI issue, it's most likely tripleo issue
14:37:17 <EmilienM> 2 for newton jobs, 1 for master (pingtest) and 1 for containers
14:37:25 <EmilienM> sshnaidm: deployment time?
14:37:36 <sshnaidm> EmilienM, no, the failure of pingtest in HA
14:37:43 <marios> https://bugs.launchpad.net/tripleo/+bug/1690373 this one. so it was blocked on the os-refresh-config fixup and the new package build dummy reviews. but those are blocked now on master?
14:37:45 <openstack> Launchpad bug 1690373 in tripleo "stable/newton gate-tripleo-ci-centos-7-nonha-multinode-oooq broken" [Critical,Triaged] - Assigned to Marios Andreou (marios-b)
14:37:45 <EmilienM> ah
14:38:08 <EmilienM> marios: I'm not sure it's something in packages
14:38:19 <EmilienM> marios: I looked at it and the job times out
14:38:35 <EmilienM> sshnaidm: do we have some HA experts looking at it?
14:38:39 <marios> EmilienM: this i mean https://review.rdoproject.org/r/#/q/Ie205c93a3cdcc3c68668327fde6327cd373a8739,n,z
14:38:45 <marios> EmilienM: it was part of the fix right?
14:38:46 <sshnaidm> EmilienM, afaik no
14:38:46 <marios> i think
14:39:02 <EmilienM> marios: I'm not sure it's really helpful, tbh
14:39:07 <trown> I am looking at newton jobs
14:39:08 <sshnaidm> EmilienM, who are HA experts that I can ask them to look?
14:39:23 <EmilienM> marios: have you looked at the logs? the job *times out*
14:39:33 <adarazs> EmilienM: failure for pingtest is definitely not a timeout issue. see https://bugs.launchpad.net/tripleo/+bug/1680195/comments/5
14:39:34 <openstack> Launchpad bug 1680195 in tripleo "Random ovb-ha ping test failures" [Critical,Triaged]
14:39:38 <marios> EmilienM: so bandini has already been helping on this bug wrt 'haexperts'
14:39:39 <trown> it seems like maybe they are fixed on latest newton current-passed-ci repo... at least my local env deployed fine
14:39:48 <EmilienM> sshnaidm: you can ask bandini and his team
14:39:59 <sshnaidm> EmilienM, ok, will do
14:40:10 <marios> EmilienM: but i didn't check for a couple of days as i was waiting for https://review.openstack.org/#/c/465934/
14:40:40 <sshnaidm> #action sshnaidm to ask bandini and his team to look at pingtest HA failures:  https://bugs.launchpad.net/tripleo/+bug/1680195
14:40:41 <openstack> Launchpad bug 1680195 in tripleo "Random ovb-ha ping test failures" [Critical,Triaged]
14:40:44 <EmilienM> trown: in how much time?
14:41:15 <EmilienM> marios: can you explain why a dummy patch in t-i-e would help?
14:41:24 <trown> EmilienM: multinode jobs can't be a time thing can they?
14:41:37 <trown> EmilienM: I think maybe streams are crossed since there are 4 alert bugs
14:41:50 <trown> EmilienM: I was specifically looking at the 2 newton ones
14:42:03 <marios> EmilienM: sshnaidm sorry i was referring to the newton oooq nonha bug/1690373
14:42:06 <EmilienM> trown: well, gate-tripleo-ci-centos-7-nonha-multinode-oooq on newton is timing out
14:42:18 <marios> EmilienM: i had it ready from the earlier discussion and hit return too quickly :/ sorry for the confusion sshnaidm
14:43:10 <trown> EmilienM: ya... but that was actually a hang right? as in it would not finish given infinite time
14:43:47 <EmilienM> probably
14:44:16 <EmilienM> anyway, let's move forward, we'll follow-up on #tripleo
14:44:18 <trown> I just would be very surprised if we slowed things down to the point multinode jobs were timing out
14:44:26 <EmilienM> yeah me too
14:44:33 <EmilienM> AFAIK it happened during the oooq transition
14:44:43 <EmilienM> but it was the ovb transition
14:45:03 <sshnaidm> EmilienM, yep, it was ovb only
14:45:07 <EmilienM> the package diff didn't help much, quite a lot of changes in a few weeks
14:45:36 <EmilienM> is there anything ci-related we should talk about now?
14:45:45 <marios> EmilienM: https://bugs.launchpad.net/tripleo/+bug/1690373 this one
14:45:46 <openstack> Launchpad bug 1690373 in tripleo "stable/newton gate-tripleo-ci-centos-7-nonha-multinode-oooq broken" [Critical,Triaged] - Assigned to Marios Andreou (marios-b)
14:46:08 <marios> EmilienM: is the one i was referring to. it was waiting for https://review.rdoproject.org/r/#/q/Ie205c93a3cdcc3c68668327fde6327cd373a8739,n,z and then https://review.openstack.org/#/c/465934/
14:46:34 <EmilienM> marios: and I'm asking *again*: what makes you think this thing will help?
14:46:51 <marios> EmilienM: the discussion on the bug
14:46:56 <EmilienM> why would gate-tripleo-ci-centos-7-nonha-multinode-oooq timeout because of this?
14:47:27 <EmilienM> marios: https://review.openstack.org/#/c/465935/ didn't pass CI
14:47:36 <marios> EmilienM: there is an issue in o-r-c which will cause the stack update to hang
14:47:40 <EmilienM> that's why I abandoned it
14:47:55 <EmilienM> we run stack update on gate-tripleo-ci-centos-7-nonha-multinode-oooq ?
14:48:44 <sshnaidm> EmilienM, I don't think so
14:49:01 <EmilienM> marios: ^?
14:49:09 <marios> EmilienM: so if you see for example https://bugs.launchpad.net/tripleo/+bug/1690373/comments/7
14:49:10 <openstack> Launchpad bug 1690373 in tripleo "stable/newton gate-tripleo-ci-centos-7-nonha-multinode-oooq broken" [Critical,Triaged] - Assigned to Marios Andreou (marios-b)
14:49:52 <marios> EmilienM: there is a yum update being executed there  e.g. http://logs.openstack.org/29/463529/1/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/362437e/logs/subnode-2/var/log/yum.log.txt.gz
14:49:55 <EmilienM> marios: how is it related to "stack-update"? And again, we don't run stack-update in this job I think
14:49:58 <marios> shows some things being updated
14:50:20 <EmilienM> is it updated by a stack update?
14:50:38 <marios> EmilienM: ok perhaps i am not being clear enough then :) I'd be happy to discuss offline some more if you like EmilienM
14:50:59 <EmilienM> we can take it offline for sure
14:51:09 <EmilienM> moving on now
14:51:09 <marios> EmilienM: i've updated the bug already and bandini was also involved as mentioned
14:51:12 <marios> EmilienM: thanks :D
14:51:16 <EmilienM> #topic specs
14:51:20 <EmilienM> #link https://review.openstack.org/#/q/project:openstack/tripleo-specs+status:open
14:51:44 <EmilienM> do we have anything to discuss about specs this week?
14:52:26 <EmilienM> I guess no
14:52:29 <EmilienM> #topic open discussion
14:52:38 <EmilienM> if there is any question or feedback, it's the right time
14:52:46 <sshnaidm> EmilienM, who are cores that can approve it? https://review.openstack.org/#/c/420878/
14:53:10 <EmilienM> sshnaidm: tripleo cores
14:53:28 <sshnaidm> EmilienM, ok
14:53:33 <EmilienM> sshnaidm: you can +2
14:53:44 <EmilienM> I'll approve it once we have enough votes
14:53:47 <sshnaidm> ok
14:53:55 <EmilienM> anything else this week?
14:54:49 <EmilienM> alright. Have a nice week and have fun
14:54:51 <EmilienM> #endmeeting