20:00:50 #startmeeting heat
20:00:51 Meeting started Wed Oct 23 20:00:50 2013 UTC and is due to finish in 60 minutes. The chair is stevebaker. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:52 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:54 The meeting name has been set to 'heat'
20:01:07 #topic rollcall
20:01:11 o/
20:01:17 o/
20:01:21 yo
20:01:25 o/
20:01:27 hello all
20:01:28 hi
20:01:35 hello
20:01:36 hello
20:01:39 o/
20:01:43 hello folks
20:01:48 o/
20:02:13 I've just updated https://wiki.openstack.org/wiki/Meetings/HeatAgenda#Agenda
20:02:27 not long till summit!
20:03:00 #topic review last week's actions
20:03:01 yip
20:03:07 shardy to sync Havana release notes etherpad with wiki
20:03:11 That's done
20:03:16 sweet
20:03:22 stevebaker to review all summit proposals and reduce to 9 sessions
20:03:33 done
20:03:33 I also added a note about HOT being under heavy development, hence possibly being subject to change
20:03:43 good
20:03:46 after some IRC discussions re the evolving HOT spec
20:04:36 #topic Summit session proposals
20:04:46 #link http://summit.openstack.org/
20:05:14 We were oversubscribed by 2x, so I had to be brutal
20:06:08 stevebaker: is there a way for us plebs to see the _schedule_ (as opposed to the session list)?
20:06:16 which didn't make the cut?
20:06:16 so I suggest we just go through each of the 9 sessions quickly now, and feedback would be welcome if any topics are over or under represented.
20:06:30 shardy: sort by topic and scroll down to heat
20:07:03 even better, sort by status first, then by topic
20:07:24 I've also come up with a draft schedule, but I don't yet have visibility on other design summit schedules for clashes so that may change
20:07:44 let me paste the provisional schedule
20:08:09 #link http://paste.openstack.org/show/49281/
20:08:20 Thursday is a big day
20:09:20 I put the topics that seem to need the most discussion first
20:09:22 Heat Software Orchestration
20:09:26 Heat workflow: consider taskflow, Mistral or other
20:09:50 then the afternoon sessions are fairly feature based, but ordered arbitrarily
20:10:30 looks good steve
20:10:32 stevebaker, It's missing the autoscaling one
20:10:38 Err
20:10:42 stevebaker, sorry misreading :/
20:10:45 stevebaker: lgtm
20:10:47 hehe
20:11:01 I counted 8 for some reason
20:11:02 The 2 friday morning sessions are a bit different to all the others (one is about technical debt, the other is about usability features)
20:11:38 so the controversial sessions are first, so we can all be hating on each other for the rest of the day ;)
20:11:50 Heh
20:11:59 :-)
20:12:20 so one scheduling problem I see so far is 5:20 Thursday clashes with http://openstacksummitnovember2013.sched.org/event/d021c726f6fbe4d1fc7ade0a72a6ae2a
20:12:50 oh, that's mine
20:12:59 it would be good to go to that, even if we have to jog to all get there
20:13:10 that's late in the day
20:13:13 How does it work? The person who proposed the session is responsible for preparing it?
20:13:33 bgorski, not much prep
20:13:39 so I'll ask if we can swap our 5:20 with another project, we're allowed to horse-trade slots
20:13:49 I was about to suggest that
20:13:49 it's a discussion
20:14:09 bgorski: it depends on the session, generally it will be less structured than that
20:14:18 I don't know about this jogging business, but count me in for horse-trading
20:14:28 so lets go through each session
20:14:31 can we leverage the horses?
20:14:35 in time order
20:14:43 #link http://summit.openstack.org/cfp/details/82
20:14:50 asalkeld, yea I know they're design sessions but some 5 or 10 min introduction would be nice
20:14:50 Heat Software Orchestration
20:15:14 bgorski, yeah that's about right
20:15:21 * SpamapS imagines randallburt with a very large see-saw being accosted by some angry horse farmers..
20:15:47 stevebaker, maybe we can have people with different solutions for that?
20:15:56 this will be a bit of a free-for-all, but hopefully it can be kept on track with a concrete proposal
20:15:57 quick 5 min
20:16:15 I am trying to make a PoC now
20:16:16 bgorski: Right, typically the "leaders" of a session will set the table. What you _don't_ want is a lot of slides and "here is how we're doing it".. even if you want that to happen.. as that squelches the discussion. You
20:16:21 using marconi
20:16:22 asalkeld: yep
20:16:48 SpamapS, +1
20:17:05 lovely typing there.. thank you latency
20:17:08 want some possible solutions, and discuss
20:17:16 #link http://summit.openstack.org/cfp/details/121
20:17:23 Heat workflow: consider taskflow, Mistral or other
20:17:56 zaneb: I wonder if you and harlowja could come up with a plan for running this session?
20:17:59 stevebaker, so that is for the engine to use
20:18:23 (not to do with the software config)
20:18:28 I would like to recommend that the taskflow people and mistral people try to have a breakout session some time before this session to allow interested parties to get an idea where the projects are now.
20:18:45 SpamapS: +1
20:18:52 sure, I can help out but I imagine harlowja knows basically what he wants to discuss
20:19:09 And/or direct us to where we can ascertain that offline before summit.
20:19:10 zaneb, you know what we need tho'
20:19:14 yes, this is to consider transforming our declarative orchestration to workflow for execution
20:19:21 btw, I don't know where the idea of us using Mistral came from
20:19:23 but... no.
20:19:29 lol
20:19:34 SpamapS, I was not planning to give you a presentation about our multi region vision with a lot of slides so do not worry :)
20:19:38 zaneb: distributed workflow would mean no need for a single-engine stack lock.
20:19:57 weeeeeeel maybe not.
20:20:08 zaneb: you've got the best knowledge of heat scheduling currently, so you can best represent heat's concerns
20:20:25 I mean, I think we should *probably* use taskflow. But not Mistral.
20:20:27 zaneb: but if Mistral isn't quite aimed at distributed celery-like workflow then the point is moot. :)
20:21:22 #link http://summit.openstack.org/cfp/details/79
20:21:30 we discussed Convection already at the last summit and decided that we would share a library not consume an API
20:21:36 Autoscaling Design
20:21:46 yip
20:21:53 we have a couple of issues to figure out there
20:22:01 A nice easy topic for straight after lunch ;)
20:22:04 stevebaker: getting the easy ones out of the way first? ;)
20:22:05 heh heh
20:22:18 I have been working on a proposed API spec
20:22:21 bring your coffee
20:22:35 stevebaker and I talked about splitting up the discussion into API first and resources second
20:22:46 I think the load balancer thing is probably the most interesting point to figure out during the discussion
20:22:47 I'd suggest people write up as much as possible in advance
20:22:59 so we have the maximum context
20:23:02 It seems like this will be a session which is more about presenting a solution, with discussion just about details and problems?
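
For context on the workflow session above: TaskFlow models units of work as task objects composed into flows, which an engine runs and can resume after a failure, which is what makes it interesting for replacing Heat's in-engine scheduling. A minimal sketch of that shape, assuming TaskFlow's documented linear_flow/engines API; the resource task here is a hypothetical stand-in, not Heat code:

    from taskflow import engines, task
    from taskflow.patterns import linear_flow

    class CreateServer(task.Task):
        # Hypothetical stand-in for a Heat resource create action.
        def execute(self, server_name):
            print("creating %s" % server_name)
            return "server-uuid"

        def revert(self, server_name, **kwargs):
            # Invoked by the engine if a later task in the flow fails.
            print("rolling back %s" % server_name)

    flow = linear_flow.Flow("stack-create").add(
        CreateServer("create_server", provides="server_id"))
    engines.run(flow, store={"server_name": "web-1"})
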
20:23:18 * SpamapS wonders if we can get some of that extra-strength Thai red bull... :)
20:23:42 stevebaker: I think a lot of us have come to a baseline agreement about the basic solution, but there's some gnarly things to figure out
20:24:00 the wiki page mentions all the problems we have to solve afaik
20:24:08 oh, wait, there was another one therve thought of this morning, we should add that
20:24:21 monster xtreme ftw
20:24:35 load balancing is definitely a tricky one
20:24:56 and I suggested the first half be only about the API, and the second half be about the resources. Our autoscaling talks tend to descend into a gordian knot of api<>resources arguments ;)
20:25:06 brb in 5min
20:25:53 btw, all 9 slots are full, if we can free up one slot that leaves us a spare for the most important topic that arises while we're there
20:26:25 the main issue about LB is that I'd like to avoid knowledge of it in the autoscale API/service, and instead make it flexible enough to support the general class of relationships like LB
20:26:38 however that's done, I don't care :)
20:26:40 IMO we should talk about the API as if there were no resources
20:26:55 then we can add the resources after, like we do for every other OpenStack service
20:27:01 stevebaker, retry/convergence and healing could be covered in the same slot, IMHO
20:27:02 agreed
20:27:25 zaneb: +1 ... make a good micro service, then make resources to support it.
20:27:32 zaneb: agree, I thought the point of the AS API is that folks didn't want to consume AS via heat resources
20:27:59 therve: I don't see how they are related, and I think there is enough in healing to discuss on its own
20:28:17 Right, so agree on what a good AS micro service will (and will not) expose in an API, and then we can look at what resources are needed to support that.
20:28:18 yeah, it's talking about both at the same time that causes the Gordian knot, with API parts that rely on the resources, which is backwards
20:28:20 #link http://summit.openstack.org/cfp/details/205
20:28:22 Support API retry function with Idempotency
20:28:30 the resources should be a pretty obvious mapping to the API anyway
20:28:35 ok, /me shuts up :)
20:29:09 I noted down details here: https://etherpad.openstack.org/p/kgpc00uuQr
20:29:21 the retry stuff is pretty important, and there seem to be differing approaches suggested, so an entire session may be needed just to articulate the approaches and choose one.
20:29:25 stevebaker: that session could actually be combined with the taskflow one IMO
20:29:35 http://summit.openstack.org/cfp/details/205 that is
20:29:45 IMHO we just don't want retry loops in heat
20:29:46 and make sure everyone knows that this _isn't_ the same as healing :)
20:29:51 so this might not deserve a full session, but there could be some worthy discussion on how the retry policy is specified, and a guide on implementation if it is a non-heat-core dev doing it
20:29:56 If they should exist, they should be in the *clients
20:29:59 radix: exactly
20:30:03 stevebaker, It seems to me that retry is a poor man's convergence
20:30:33 therve: they aren't the same thing at all IMO
20:30:44 one is about aligning the template with the real state of resources
20:30:52 details of how exactly it will work are not always a good summit session goal. Details of _what_ to do are paramount.
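
On the "retries should be in the *clients" point above: the idea is that transient service flakiness gets retried close to the REST call, while permanent errors (like being over quota) propagate to the user, keeping while-loops out of Heat's engine. A minimal sketch of such a client-side wrapper; this helper is hypothetical and not part of any actual python-*client API:

    import time

    # Hypothetical: retry transient failures near the API call itself.
    def call_with_retry(fn, retries=3, delay=2, transient=(IOError,)):
        for attempt in range(retries):
            try:
                return fn()
            except transient:
                if attempt == retries - 1:
                    raise  # permanent/final errors propagate to the caller
                time.sleep(delay * (attempt + 1))  # simple linear backoff

    # usage sketch: call_with_retry(lambda: nova.servers.get(server_id))
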
20:30:53 therve: no, retry happens within create or update on a single resource, convergence is a new action which compares desired state with actual state
20:31:04 the other is about retrying due to service flakiness
20:31:20 As in, we don't want to quibble over syntax there, but we do want to make sure the end goal is agreed upon.
20:31:29 stevebaker, Hum, okay. The retry thing sounded like a new action.
20:32:17 Is the intention for all openstack services to retry everything?
20:32:20 shardy: retry loops in Heat are, indeed, a workaround for bugs in the clients. We're more talking about "what to do when Heat is stuck"
20:32:31 I don't see why Heat should be special and retry all the things
20:32:31 stevebaker: the more I look at it the more I think this doesn't deserve a session. It hasn't been discussed on the mailing list even
20:32:52 but maybe the topic is small enough to share with another session, maybe http://summit.openstack.org/cfp/details/121 (workflow) or http://summit.openstack.org/cfp/details/95 (convergence)
20:32:57 shardy: special.... man, every openstack user has to retry stuff all the time in practice
20:32:59 or we just drop it
20:33:09 but yeah, that's because there are bugs everywhere
20:33:14 Like if you were just plain out of quota space when you did an update that added a resource.. novaclient isn't going to retry that forever... we need a "try again" driven by the user.
20:33:20 radix: hence my suggestion that it should be abstracted into the clients..
20:33:33 aren't we getting reservations?
20:33:34 session time should be reserved for proposals that have been discussed and need face-to-face interaction to resolve
20:33:39 shardy: yeah, I'm not disagreeing
20:33:41 shouldn't that solve these issues
20:34:03 SpamapS: agreed, I think it's simply allowing some of our actions to be re-entrant without re-creating things that already worked.
20:34:09 zaneb: this topic has been floating around for the last cycle
20:34:16 randallburt: well said.
20:34:59 If we can't spare the session time, I'm happy to block out time to do a breakout outside our normal session time, but I definitely want to get some consensus on this and do some of the actual work.
20:35:01 randallburt: that's not how the session proposal is worded tho
20:35:28 +1 on allowing users to manually retry things, huge -1 on automagically while(1)ing stuff in heat code
20:35:46 I think if it's a user action, it's more in line with the discussion about convergence
20:35:47 shardy: agreed, perhaps we can combine this with convergence.
20:35:48 shardy: understood on the 1st and totally agree on the second
20:35:56 SpamapS: +1
20:36:04 Ok so lets combine them.
20:36:09 ok, lets tack it onto the convergence session, maybe the last 1/3rd
20:36:10 +1 on combining them
20:36:21 Huh, I think therve just said that didn't he ;)
20:36:37 indeed
20:36:50 #link http://summit.openstack.org/cfp/details/95
20:36:50 my evil plan succeeded
20:36:59 :)
20:37:00 Healing and convergence
20:37:20 therve: my point was that what is being proposed is not the same, but I accept that convergence may solve both use-cases
20:37:20 and retrying
20:37:39 can we call this healing? convergence is a mouthful. the action is a 4 letter verb!
20:37:39 shardy, Agreed
20:37:57 or, convergence may make use of the same mechanism that a user-driven retry mechanism uses.
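
To make the distinction drawn above concrete: convergence as described is an action that compares desired state (the template) with actual state (the real resources), and a user-driven retry could reuse the same mechanism for FAILED resources. A purely illustrative sketch; every method name here is hypothetical:

    # Hypothetical: walk the stack and re-align reality with the template.
    def converge(stack):
        for resource in stack.resources:
            actual = resource.check_actual_state()  # poll the real service
            if actual is None:                       # resource went missing
                resource.create()                    # healing path
            elif resource.state == 'FAILED':
                resource.retry_last_action()         # user-driven retry path
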
20:38:06 I see two interesting things in this session, 1) discussion of an explicit "retry" / "converge" operation from the user, and 2) automatically watching for divergence and trying to converge
20:38:20 radix: heal!
20:38:23 ok ok
20:38:31 radix: if only you could explain these things to us in person...
20:38:41 but it's not like converge is unusual in this space, puppet and chef use that phrase too :P
20:38:50 radix: you know, the timezones would allow you to attend both events using a telepresence bot at one of them..
20:38:59 SpamapS: sorry =( well, honestly I basically just explained everything I have thought on the subject, hehe
20:39:12 actually, I was wondering if there would be any facility for remote involvement
20:39:27 radix: I've heard maybe not
20:39:29 I wonder if a G+ hangout would work. but I may be sleeping :)
20:39:39 depends on the network
20:39:42 basically we need two things: (1) allow retry from failed states, and (2) allow retry from wedged in-progress states without making a huge mess and breaking everything
20:39:45 not always the best
20:39:45 hangouts have been used, but in the past the summit network, while good, was not that good.
20:40:20 ok
20:40:42 #link http://summit.openstack.org/cfp/details/58
20:40:50 Heat support for multi-region and multi-cloud
20:41:36 So I think we should talk about how heat will orchestrate a stack with resources in different regions
20:42:11 the big things I want answered in this session are how to represent the credentials for different clouds, and how we represent what resource goes on what cloud/region
20:42:31 Also, how auth will work at all
20:42:52 and az's/regions/flavors
20:43:00 (validating them)
20:43:02 A topic we were discussing is how deferred auth and signals can work for multi-cloud
20:43:20 stevebaker, I really like your context concept to express what resource goes to which cloud/region
20:43:22 My position: figure out just the next step, and do that. This is pretty big, and each step is really tricky... I don't think one cycle will get us to multi-region even.. and multi-cloud is a whole other ball of wax.
20:43:39 ie if you can't use trusts, and you don't have admin on all-the-clouds to create stack users
20:44:03 bgorski: yep, whether we can apply that context to a single resource, or only to an entire nested stack is a good question
20:44:13 SpamapS: +1
20:44:16 (and I don't consider multi-region done until it is HA so that one region failing != your whole stack being unmanageable)
20:44:37 personally I think just getting multi region working is a huge undertaking
20:44:45 SpamapS: you mean like making heat an actual distributed system?
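
One way to picture the "context concept" discussed above is an environment that names per-cloud/region credentials and associates resources (or nested stacks) with them. This structure is entirely hypothetical; no such environment section existed at the time, and the keys below are invented for illustration:

    # Hypothetical environment layout for multi-region orchestration.
    environment = {
        "contexts": {
            "region_east": {"region_name": "east",
                            "auth_url": "http://east.example.com:5000/v2.0"},
            "region_west": {"region_name": "west",
                            "auth_url": "http://west.example.com:5000/v2.0"},
        },
        # Each resource (here, nested Stack resources) asks the
        # environment for its context at runtime.
        "resource_contexts": {
            "web_stack": "region_east",
            "db_stack": "region_west",
        },
    }
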
20:44:49 I checked the heat code base and right now each nested stack has its own set of clients
20:44:53 radix: aye
20:45:05 so right now it is really easy to do that for a nested stack
20:45:45 right so this sounds like it should be a good healthy session to make sure we understand that next step and the long term vision too
20:45:49 but specifying a context for each resource will be much harder
20:45:57 we could put contexts in the environment
20:46:08 and associate them with resources there
20:46:23 #link http://summit.openstack.org/cfp/details/200
20:46:24 then the resource asks the env for its context
20:46:25 Stack abandon and adopt
20:46:35 bgorski: excellent point, and IMO we should implement multi-region only for Stack resources
20:47:04 apparently multiple sessions can be scheduled into each slot, so I might resurrect http://summit.openstack.org/cfp/details/98 and add it to this slot
20:47:18 this is scary stuff
20:47:40 zaneb: I think so too
20:47:47 Yeah, this looks like it could get pretty hairy..
20:47:54 That would be an interesting mechanism to make use of for HA multi-region.
20:48:02 stevebaker: I think that one would be covered by adopt/abandon
20:48:21 region-master goes down, region-slave promotes itself and reads in a static map of what resources it can own and adopts them.
20:48:33 randallburt: it would, that is why I refused it
20:49:12 SpamapS, needs to be heat version independent too
20:49:18 I don't think this would be super hairy either btw. I love the concept.
20:49:32 asalkeld: I think that's a pretty tall order.
20:49:55 SpamapS, if you can guarantee that heat created the resources
20:50:00 For instance, I could see using it to prepare stem-cell type servers to then be moved into stacks as needed.
20:50:03 the hairiest bit is how to handle endpoints changing on running servers, but there are many use cases which wouldn't be too hard to achieve
20:50:05 Is the idea to have a new API? like "here's the resource properties and resource_id and any resource_data associated with it, go"?
20:50:21 or maybe extra parameters with create-stack
20:50:29 I still think you can do this in the user/base case with environments
20:50:43 radix, some kind of state file
20:50:46 add the ability to specify a physical resource id for a resource
20:50:57 The interesting part would be to do it without specific resource support
20:50:59 radix: I was thinking new actions, abandon and adopt. adopt is like create, but with some extra state information
20:50:59 randallburt: an interesting approach to bring up before the summit. Can you write that up and comment on the proposal or maybe send to the ML?
20:51:06 SpamapS: sure
20:51:16 I think adopt and abandon functionality will be great if a created stack will be able to generate a template which describes its resources and the dependencies between them
20:51:24 therve: I guess that shouldn't be too hard, since resources only have two well-defined ways to store data
20:51:30 in resource_id, and in resource_data
20:51:31 randallburt: OS::Nova::Server already has a name property
20:51:41 randallburt: as do many other native resource types
20:51:51 this _might_ be combinable into convergence as well
20:51:56 because we can use a GUI as an editor and template debugger but I know it is a long term vision and dream :)
20:51:57 bgorski, Hum, what would you pass to Heat? The tenant?
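
A sketch of the adopt/abandon "state file" idea raised above, built from the two well-defined storage places the discussion names (resource_id and resource_data); the exact format shown is hypothetical:

    # Hypothetical abandon output / adopt input: enough state to take
    # ownership of existing physical resources without re-creating them.
    adopt_data = {
        "stack_name": "mystack",
        "template": "...",  # the template these resources map onto
        "resources": {
            "my_server": {
                "type": "OS::Nova::Server",
                "resource_id": "9a278b9e-...",  # physical resource id
                "resource_data": {"admin_pass": "..."},
            },
        },
    }
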
20:52:13 radix, That's still 2 too many :)
20:52:24 stevebaker: I mean specify a resource's native ID in the environment so heat won't create a new one
20:52:35 bgorski: resource relationships are already exposed in the API
20:52:52 lets not get into HOW... if we want to discuss it, lets move on and keep it scheduled.
20:53:08 Friday!
20:53:24 http://summit.openstack.org/cfp/details/229
20:53:25 One issue with this (the adopt) is it's essential that we have 100% parity between the resource properties and the underlying API/physical-resource
20:53:26 Heat exorcism
20:53:31 hangover session
20:53:38 bgorski: so, I don't believe it's possible in general to infer the relationship between resources
20:53:43 so it can never be compatible with multiple versions of things
20:54:33 Do we need a session on a list of bugs?
20:54:42 These all look like "yes fix that please"
20:54:52 Or rather we have to have parity with every project at release time
20:54:56 anyway..
20:55:09 zaneb has a bunch of technical debt issues to discuss, the fixes for some might be disruptive so discussion may be worthy
20:55:09 SpamapS, maybe a user session?
20:55:13 SpamapS: I guess I proposed it because there are a couple of major changes we need to consider
20:55:30 zaneb: right I see that we do need a "native heat resource access method"
20:55:30 SpamapS: I don't think so, we can prioritize bugs in LP and the weekly meetings
20:55:33 e.g. do we base resources directly on APIs, even when they're broken like Neutron :(
20:55:36 and it is a good chance to lay down some "don't do this, and this is why" guidelines
20:55:51 stevebaker: ++
20:56:15 stevebaker: heh, maybe we should schedule this before the autoscaling session ;)
20:56:19 * zaneb ducks
20:56:31 a lot of those autoscaling-related ones should go away with the new proposed stuff
20:56:53 so if anyone has other implementation cleanup things, let me know and I'll add it to the session
20:57:06 zaneb: IMO resources should be tied to the logical problem they solve, not just the API. API adherence is just nice for closing the understanding loop.
20:57:24 SpamapS, +1
20:57:31 SpamapS: +2
20:57:46 But yeah perhaps we do need to discuss those two things.
20:58:14 do we need a user session?
20:58:27 #link http://summit.openstack.org/cfp/details/83
20:58:29 Stack troubleshooting support
20:58:32 +1 SpamapS as well :)
20:58:53 where our users can talk about things that are bugging them?
20:59:16 SpamapS, This hasn't been what we've done up to now, at least
20:59:41 this session is proposing a template debugger (some kind of breakpoint mechanism) and other tools to diagnose why a stack has broken
21:00:01 stevebaker, I'll link the user-logging bp too
21:00:22 #link https://blueprints.launchpad.net/heat/+spec/user-visible-logs
21:00:24 tspatzier: I'm a little concerned that this is a wishlist, rather than something that has developers ready to work on
21:00:31 therve: well sometimes we accidentally did it, when the API matched the problem space well ;)
21:00:56 stevebaker, partly agree, but we have a team internally who are thinking about implementation.
21:00:59 stevebaker: +1, it doesn't look all that achievable in the near-term to me
21:01:28 it's totally achievable, but it just needs bodies to do it
21:01:34 a line based live debugger wouldn't actually be that hard.
21:01:38 I asked them to prepare some more details. A session could help them to get the right direction to get this started.
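
On the template debugger / breakpoint mechanism proposed above: one plausible shape is for the engine to pause before acting on user-named resources until a "continue" signal arrives. A purely hypothetical sketch of that idea, not any actual Heat interface:

    import threading

    # Hypothetical: pause stack operations at user-set resource breakpoints.
    class Breakpoints(object):
        def __init__(self, resource_names):
            self.names = set(resource_names)
            self.resume = threading.Event()

        def maybe_pause(self, resource):
            # Called by the engine just before it acts on each resource.
            if resource.name in self.names:
                print("hit breakpoint at %s; waiting" % resource.name)
                self.resume.wait()   # a 'continue' API call would set this
                self.resume.clear()
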
21:01:51 SpamapS: It's the in-instance debugging I think will be hard
21:02:08 aren't we out of time?
21:02:15 tspatzier: as long as they intend to implement something as well - a design wouldn't be that useful
21:02:15 yip we are
21:02:18 ladies & gents ... cough ... have ye any homes to go to?
21:02:20 in-instance debugging will become a more important issue with the software orchestration thing
21:02:21 (as Irish barmen shout out at pub closing time ...;)
21:02:24 wow, some really good session proposals didn't make the cut :( tough schedule.
21:02:27 whoah, that went long :) sorry
21:02:31 eglynn: keep your hair on :)
21:02:34 shardy: we implemented that in Juju by ssh'ing in from the client into a screen started specifically to run the hooks (in this case, we'd start screen in the cloud-init)
21:02:35 #endmeeting