22:01:05 #startmeeting containers
22:01:06 Meeting started Tue Mar 10 22:01:05 2015 UTC and is due to finish in 60 minutes. The chair is adrian_otto. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:01:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:01:09 The meeting name has been set to 'containers'
22:01:12 #link https://wiki.openstack.org/wiki/Meetings/Containers#Agenda_for_2015-03-10_2200_UTC Our Agenda
22:01:17 #topic Roll Call
22:01:21 o/
22:01:21 Adrian Otto
22:01:32 OTSUKA, Yuanying
22:01:52 madhuri
22:02:01 Perry
22:02:06 Thomas Maddox
22:02:29 Hongbin Lu
22:02:48 hi apmelton yuanying-alt madhuri juggler thomasem hongbin
22:02:56 hi
22:02:59 hello adrian
22:03:03 hi sdake_
22:03:15 howdy
22:03:24 ok, we can begin.
22:03:30 #topic Announcements
22:03:38 1) Proposal to add Magnum to the OpenStack git namespace
22:03:45 #link https://review.openstack.org/161080 Magnum Governance Review
22:04:07 this was discussed in today's TC meeting, and voting was deferred by two weeks.
22:04:34 no actionable feedback was provided for Magnum, as the issues among the TC members have little to do with us
22:04:45 there was one question raised, which I will address in a moment.
22:05:09 2) Magnum Release #2
22:05:49 #link http://lists.openstack.org/pipermail/openstack-dev/2015-March/058714.html Magnum Release Announcement
22:06:14 thanks everyone for all your work on this release
22:06:30 this was a big step forward for us with a lot of super valuable results
22:06:52 any other announcements from team members?
22:07:06 =]
22:07:17 #topic Review Action Items (From Midcycle)
22:07:28 1) sdake: To think of a name for new stackforge heat-kubernetes repo.
22:07:40 sdake_: thoughts on this?
22:08:24 we will revisit it later
22:08:34 (makes sense once you see the next…)
22:08:38 2) sdake: To schedule a meeting with Lars to discuss new repo, and what it means to keep-it-simple.
22:08:44 Status: COMPLETE. Scheduled for 2015-03-11.
22:09:04 sdake and I will meet tomorrow and work through the specifics
22:09:28 the intent is to remove this code from Magnum, and reference it as a library listed in requirements.txt
22:09:51 so we can talk about #1 tomorrow
22:10:01 any thoughts from team members on either of these?
22:10:13 do you want to see notes published about this, or just get an update next week?
22:10:32 either is fine here
22:10:46 Is it in IRC?
22:10:53 notes published would be nice, i think
22:11:14 +1 for not published
22:11:15 I think it's a call. I can take notes and share them.
22:11:20 s/not/note/
22:11:30 as long as the participants have no objections
22:11:30 ah hongbin :)
22:11:38 I default to making all decisions openly
22:11:38 sounds good to me
22:11:50 but I also want to respect the wishes of those who are not yet part of our team
22:12:22 #action adrian_otto to update the team about the outcome of plans to move heat-kubernetes to Stackforge
22:12:30 +1 for note published
22:12:36 so that will be on next week's agenda at a minimum
22:12:50 #topic Blueprint/Task Review
22:13:06 most of our reviews are in good shape
22:13:30 I wanted to say thanks to all of our contributors for being disciplined about adding bug numbers and blueprints to all commit messages
22:13:54 we have been very good about that, and our release notices are comprehensive now as a result. I appreciate it!
22:14:00 1) Discuss how to implement Bay status
22:14:11 When creating a bay, we use Heat to deploy Nova instances. There is not currently an event feed from Heat that we can subscribe to. Polling Heat for status changes is not elegant. How would we like to address this? Current discussion to reference:
22:14:23 #link http://lists.openstack.org/pipermail/openstack-dev/2015-March/058619.html Expression of Bay Status
22:14:44 #link https://review.openstack.org/159546 When polling heat set bay status
22:14:55 ok, so we have a decision point to discuss
22:15:15 about how best to express our bay status, and how to sync that up with the heat stack status
22:15:25 we got some input from Zane and Angus from the Heat team
22:16:01 yuanying-alt: also commented
22:16:26 as well as input from hongbin
22:16:45 so please take a moment to review the ML thread, and let's decide how to proceed
22:17:43 I am persuaded by hongbin's remarks about a tight coupling causing a fragile system that is harder to debug.
22:18:32 in that case approach #3 (Don't store any state in the Bay object, and simply query the heat stack for status as needed.) would be more resilient.
22:18:52 option #3 looks quite expensive to me..
22:19:22 I expect the bay status will be polled quite often
22:19:38 so, it does look like heat exposes notifications already
22:19:54 hongbin: yes, but we would only be polling stacks that have not reached a completion status.
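For context, a minimal sketch of what option #3 (query the Heat stack for status on demand, caching nothing in the Bay object) could look like, assuming python-heatclient; the function name, endpoint/token plumbing, and stack_id handling are illustrative only, not actual Magnum code:

```python
# Minimal sketch of option #3: no cached state, ask Heat each time.
# Assumes python-heatclient; auth plumbing is simplified for illustration.
from heatclient import client as heat_client


def bay_status(heat_endpoint, auth_token, stack_id):
    """Report the backing Heat stack's status as the bay status."""
    heat = heat_client.Client('1', endpoint=heat_endpoint, token=auth_token)
    stack = heat.stacks.get(stack_id)
    # e.g. CREATE_IN_PROGRESS, CREATE_COMPLETE, CREATE_FAILED
    return stack.stack_status
```

hongbin's cost concern applies to exactly this call: every status read becomes a Heat API round trip unless the result is cached or polling is limited to stacks that have not yet completed.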
22:19:54 https://wiki.openstack.org/wiki/SystemUsageData#orchestration.stack..7Bcreate.2Cupdate.2Cdelete.2Csuspend.2Cresume.7D..7Bstart.2Cerror.2Cend.7D:
22:21:11 apmelton: wow, that sounds like exactly what we would want for #1
22:21:22 adrian_otto: I think we need to block pod/service/rc creation until bay is completed
22:21:32 but we would need access to the RPC events from Heat
22:22:02 hongbin: you mentioned a concern about troubleshooting with option 1
22:22:09 the remarks from Angus and Zane assume that status changes could be made to users of heat as well
22:22:23 apmelton: yes
22:22:46 hey
22:22:51 sorry had emergency phone call
22:22:59 sdake_: wb
22:23:01 hongbin: is your concern that we may end up missing a notification
22:23:11 and in that case we may get out of sync with heat?
22:23:25 apmelton: yes, that is part of the concern
22:23:29 everything is ok - mom freaked out over snake in house :) thanks for the positive thoughts folks :)
22:23:34 apmelton: upon initial setup if the network prohibits the status change notifications, that could be very easy to miss
22:23:56 sdake_: got to love AZ!
22:24:02 lol
22:24:18 adrian_otto: which network are we talking about here?
22:24:33 between the Magnum server and the Heat server
22:24:52 if Heat were to post back status to us in the form of webhooks for example
22:25:26 snakes on a plane ;-)
22:25:41 during heat dev long ago we talked about web hooks
22:25:43 another option would be that we open a socket to heat and keep it open (long poll style)
22:25:49 could we take a multi-part approach to this? one where we're waiting on a notification or webhook, and eventually timeout and ask heat for the status directly?
22:25:52 but the way we ended up with was to use the notifications api
22:25:56 and have it emit status over that until a complete state is reached
22:25:58 that runs over RPC
22:26:12 notifications are totally event driven and provide enough info for us
22:26:19 we could just use that
22:26:37 I think that is the fastest path to meeting the requirements of the thread
22:26:39 sdake: you mean the usage notifications?
22:26:52 each time a resource is created/deleted a notification is created
22:27:12 any state change in any resource creates an event in the notification system
22:27:21 #link https://wiki.openstack.org/wiki/SystemUsageData#orchestration.stack..7Bcreate.2Cupdate.2Cdelete.2Csuspend.2Cresume.7D..7Bstart.2Cerror.2Cend.7D: Heat Usage RPC Messages
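To make the notifications suggestion concrete, here is a rough sketch of consuming those orchestration.stack events with oslo.messaging; the 'notifications' topic assumes Heat is configured to emit usage notifications, and the payload keys and update_bay_status() helper are assumptions rather than Magnum code:

```python
# Rough sketch of listening for Heat's orchestration.stack.* usage
# notifications over RPC with oslo.messaging. Payload keys and the
# update_bay_status() helper are illustrative assumptions.
from oslo_config import cfg
import oslo_messaging as messaging


def update_bay_status(stack_id, state):
    """Hypothetical hook: map the stack state onto the matching bay record."""
    print('stack %s is now %s' % (stack_id, state))


class HeatStackEndpoint(object):
    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        # orchestration.stack.{create,update,delete,suspend,resume}.{start,error,end}
        if not event_type.startswith('orchestration.stack.'):
            return
        update_bay_status(payload.get('stack_identity'),
                          payload.get('state'))


def start_listener():
    transport = messaging.get_transport(cfg.CONF)
    # Heat publishes to the 'notifications' topic when its notification
    # driver is enabled in heat.conf.
    targets = [messaging.Target(topic='notifications')]
    listener = messaging.get_notification_listener(
        transport, targets, [HeatStackEndpoint()])
    listener.start()
    return listener
```

A listener like this gives near-realtime status, with the caveat raised above: if a notification is missed, Magnum's view can drift from Heat's, so a fallback poll or timeout is still worth keeping.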
22:27:25 when a heat state is stuck - e.g. waiting, it might not be emitted - but it should be reflected on bay status
22:28:00 suro_ I think the model is that we trust heat will do the right thing
22:28:03 and won't get "stuck" :)
22:28:13 if it does, we are probably in bad shape anyway
22:28:17 suro_: agreed that is ideal, but couldn't we also assume that state?
22:28:19 and consider the bay failed
22:28:45 I was debugging yesterday when it was stuck in create_in_progress for cloud formation wait
22:28:46 we could have a bay creation timeout
22:29:02 and the default timeout is 6000 there
22:29:02 suro_: that should eventually timeout and put it in error
22:29:03 if we do not reach a completion or error state by then, we fail the bay and delete it
22:29:24 fail the bay and delete the heat stack I mean
22:29:26 * juggler agrees with suro_
22:30:04 6000 seconds (almost 2 hours)?
22:30:08 The difference between option #2 and option #3 is whether to cache the stack status in our DB. Is that correct?
22:30:11 yeah, is 6000 reasonable?
22:30:16 seems really high
22:30:44 what's reasonable...10 minutes? 30 minutes?
22:30:49 more?
22:30:56 Depends on the stack, unfortunately.
22:31:00 is my thinking
22:31:04 I suppose it does.
22:31:16 magnum/templates/heat-kubernetes/kubenode.yaml
22:31:17 maybe make it conditional based on the stack's performance?
22:31:19 ?
22:32:01 ok, so is anyone opposed to sdake_'s suggestion of using the usage RPC messages?
22:32:06 e.g. if stack A, A' timeout is reasonable, etc..
22:32:38 Agree with juggler's comment. It's more dependent on stack performance
22:32:41 juggler: are you suggesting we let the user supply a timeout?
22:32:44 that sounds like a sensible approach that gives us near-realtime status consistency
22:32:56 optionally
22:33:05 if they understand how long a bay may take to spawn
22:33:17 some VMs take 30+ minutes to launch
22:33:23 yes :(
22:33:25 Yes they do
22:33:36 thomasem not necessarily. a default could be set. a user-supplied option is a later possibility if we feel it feasible and worth the resources perhaps
22:33:39 so having an arbitrary timeout is probably a bad idea
22:33:45 +1 on using RPC messages
22:33:54 adrian_otto: why's that?
22:34:03 adrian_otto: could really mess with stats
22:34:07 thomasem: ^^
22:34:27 which stats?
22:34:30 our stats?
22:34:35 hypothetical stats*
22:34:44 you totally lost me, sorry.
22:35:25 so when a stack times out, it errors it
22:35:45 rather, when that wait condition times out, it errors the stack
22:36:14 heat stack creation timeout is already a parameter in Heat
22:36:15 if you're monitoring heat failures, and someone is constantly setting arbitrarily low timeouts, that could set off alarms
22:36:26 that's true
22:36:34 adrian_otto: yeah, I suppose that's true
22:36:35 so we could have a feature to set that on stack create
22:37:23 ok, so I think Tom C has enough guidance to proceed now
22:37:55 Yeah, sorry, +1 for RPC
22:38:04 any other work items, such as Blueprints or Bug/Task tickets to discuss as a team?
22:38:06 got side-tracked there
22:38:38 ok, should I open a feature request for stack timeout as a heat parameter?
22:39:06 we can allow for a default value, and a per-create override of that value
22:39:24 I suppose I'm fine with that
22:39:30 and operators/users can tweak that to suit any preferences
22:39:41 +1
22:40:37 #action adrian_otto to open a bug ticket for a new feature to set heat stack creation timeout on bay create, with Magnum's default value in our config and a per-request override for Magnum users to use upon Bay creation.
22:40:47 I think to apmelton's point, it would be good to have some flexibility in that for operators so as to avoid alarming for crazy low timeouts supplied by users? That does sound like a real problem.
22:40:58 s/?/./
22:41:08 thomasem: I think the operator could just add a constraint to their template
22:41:26 offhand, did anyone see my early peer suggestions / questions for quickstart.rst posted to #..-container earlier? or is there a better forum / suggestion box for that..
22:41:49 I suppose the default could also have a lower and upper limit for acceptable override values?
22:42:16 apmelton: yeah, sure, that seems like a good solution. Like if less than our default, don't alert?
22:42:27 juggler: you could use the ML for that
22:42:44 thomasem: heat just wouldn't accept the stack-create
22:42:46 or you are welcome to raise that in Open Discussion which I am about to begin here in a sec.
22:42:49 let's discuss the default on the ML
22:42:57 +1 to have range for timeout value
22:42:58 maybe operators will have input
22:43:01 great, thanks.
22:43:05 kk
22:43:16 sdake__: agreed
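A rough sketch of what that action item could translate to on the Magnum side, assuming python-heatclient's timeout_mins argument and oslo.config; heat is an already-constructed heatclient Client, and the option names and the 10-120 minute range are placeholders pending the ML discussion about defaults:

```python
# Sketch of a config-driven default creation timeout with a per-bay
# override, clamped to an operator-approved range. Option names and the
# 10-120 minute bounds are placeholders, not decided values.
from oslo_config import cfg

timeout_opts = [
    cfg.IntOpt('bay_create_timeout', default=60,
               help='Default Heat stack creation timeout in minutes.'),
    cfg.IntOpt('bay_create_timeout_min', default=10,
               help='Lowest override a user may request.'),
    cfg.IntOpt('bay_create_timeout_max', default=120,
               help='Highest override a user may request.'),
]
cfg.CONF.register_opts(timeout_opts)


def create_bay_stack(heat, name, template, parameters, timeout_mins=None):
    """Create the bay's Heat stack with the default or a bounded override."""
    conf = cfg.CONF
    if timeout_mins is None:
        timeout_mins = conf.bay_create_timeout
    # Clamp user-supplied overrides so pathologically low (or high) values
    # cannot trip operator alarms or hold failed stacks for hours.
    timeout_mins = max(conf.bay_create_timeout_min,
                       min(timeout_mins, conf.bay_create_timeout_max))
    return heat.stacks.create(stack_name=name,
                              template=template,
                              parameters=parameters,
                              timeout_mins=timeout_mins)
```

Clamping the override on the Magnum side keeps a Heat template constraint optional while still giving operators the lower/upper bounds discussed above.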
22:43:46 I will reply to the current ML thread with a summary of our intent
22:43:58 and a link to the bug ticket mentioned above in the AI.
22:44:09 #topic Open Discussion
22:44:34 juggler: please ask your question about the dev-quickstart.rst
22:44:49 did you have suggested improvements for that?
22:44:59 a few
22:45:03 I can help you to submit a patch against it
22:45:04 Line 13 is incorrect. Link is now:
22:45:05 http://docs.openstack.org/infra/manual/developers.html
22:45:23 and we can discuss line-by-line in gerrit?
22:45:47 sure
22:46:19 that way as soon as agreement is reached we will merge that as code, and get the document updated
22:46:43 offhand, is there any desire to test on CentOS? or should I just abandon that and test on Fedora 21
22:47:22 good question. We have not discussed that yet.
22:47:47 I would think that we'd want to merge our devstack module code
22:47:58 and anything that devstack works on, we could also work on
22:48:17 ah ok
22:48:27 are you tripping on os-specific issues?
22:49:08 not necessarily yet. but if there were others with heartache in CentOS, I don't necessarily want to go that route either ;)
22:49:49 most openstack setups that I have seen are run on Ubuntu
22:50:15 our default gate tests use ubuntu, but there are also gates for fedora that we could use
22:50:17 the earlier part of the quickstart suggests various platforms, but lines 102 & 103 recommend 2 test platforms. so it's ambiguous what works and what's needed for test
22:50:24 at least my perspective
22:50:24 I have been bringing this up on a DevStack VM (RHEL 7) - I was following the dev-quickstart.rst => there were some OS-independent issues I faced, which were helped out by hongbin - some of them under review - template specific definition missing - I can capture them if it helps
22:51:08 ok, let's get that recorded
22:51:12 suro_ gold plate the quickstart to your heart's content ;-)
22:51:26 we want that document perfect plus one
22:51:29 just some of my observations. I'll probably lean towards Fedora 21, but it's something to consider perhaps?
22:52:01 I think these are great areas for first contributions
22:52:21 possibly other OSes too, as feasible with our resources
22:52:22 so let's find those items that can be improved, and start submitting patches against them
22:52:27 sdake__: sure! I feel the quickstart is a "Welcome" banner, it should be as smooth as possible
22:52:32 I'm happy to help get you started
22:52:38 suro_ totally agree
22:53:15 agree
22:53:33 Hoping to make changes to it myself as I install magnum.
22:54:37 thanks suro_ and juggler for your help on this. We really value getting a fresh perspective on the docs. There are a lot of people starting to look at Magnum, so having that stuff crisp and tight is really important to us.
22:54:56 and I want to be sure you guys are getting credit for your contributions
22:55:35 adrian_otto: thanks for the welcome, looking forward to contributing ...
22:56:24 I will be heading out in a moment for today, but will be around tomorrow to lend a hand with getting started stuff
22:56:43 thanks adrian.. always looking to help where I can :)
22:57:15 ok, I'm going to wrap us up with some housekeeping. Any parting thoughts before we adjourn?
22:57:28 one more thing
22:57:38 good here
22:58:13 I will announce a Doodle poll to the ML for a scheduler implementation IRC meeting to occur some time this coming week
22:58:47 we will be discussing the use of our swarm backend for initial implementation in accordance with our design summit talk
22:59:07 so keep an eye out for that, and vote for times you can attend if you are interested in the topic
22:59:22 most likely a 1600-1900 UTC time
22:59:40 Our next meeting is 2015-03-17 at 1600 UTC. I look forward to seeing you all then!
22:59:43 sounds good
22:59:48 #endmeeting