15:08:20 <garyk> #startmeeting scheduling
15:08:21 <openstack> Meeting started Tue Oct 1 15:08:20 2013 UTC and is due to finish in 60 minutes. The chair is garyk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:08:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:08:24 <openstack> The meeting name has been set to 'scheduling'
15:08:43 <debo_os> agenda?
15:08:57 <garyk> Not sure if you guys saw the mail I sent. I suggested we talk about maybe discussing an API to propose for summit
15:09:14 <garyk> And in addition to this the Heat scheduling discussion on the list.
15:09:17 <MikeSpreitzer> I saw the mail.
15:09:25 <MikeSpreitzer> If I understand, it's really the same discussion.
15:09:25 <garyk> So should we start with the API?
15:09:40 <garyk> #topic Discuss scheduling API for summit
15:09:42 <MikeSpreitzer> I think the API should look a lot like the Heat API - deal with this template
15:09:45 <debo_os> garyk, mike: +1 for API
15:09:58 <debo_os> but I am not sure if the scheduler API should look like HEAT ....
15:10:16 <MikeSpreitzer> If we want to make a unified decision, we need unified input
15:10:20 <debo_os> It needs to specify the VRT or eq and policy handles
15:10:27 <debo_os> and it should be very very simple
15:10:29 <PhilD> alaski has been pushing a set of changes for the query scheduler - how does that relate to this ?
15:10:51 <MikeSpreitzer> Oh, something else I need to learn. Can you give a pointer?
15:10:57 <debo_os> Mike: agree about the unified input ... hence maybe VRT ... ..
15:11:31 <garyk> debo_os: can you please elaborate on VRT
15:11:32 <debo_os> but the API could be very simple which is easy to incrementally build and allow for extensions to have complex variants
15:11:50 <debo_os> VRT = virtual resource topology from Mike's jargon
15:12:12 <garyk> Just to recap, last week we spoke about trying to understand the following (3 things):
15:12:21 <garyk> 1. a user facing API
15:12:41 <garyk> 2. understanding which resources need to be tracked
15:12:47 <debo_os> for example, one needs to pass in groups of resources that need to be scheduled as a single entity for starters - network compute storage ... and pass a list of policy objects
15:12:48 <garyk> 3. backend implementation
15:12:55 <debo_os> ok ... so we are on 1
15:13:04 <garyk> debo_os: yes :)
15:13:32 <debo_os> I think 2, 3 are imp but we should have a simple 1. with room for complex extensions since this will evolve
15:13:45 <debo_os> maybe over 1-2 releases ...
15:13:47 <garyk> debo_os: agreed.
15:13:55 <MikeSpreitzer> I think we should consider two basic approaches to 1: (a) introduce a new service with its own API or (b) introduce a side-car to the existing Heat engine
15:14:12 <garyk> do you want to explain what you are thinking and let's see if we can translate those ideas into APIs
15:14:50 <MikeSpreitzer> (a) would be to put up a service that has an API that is similar to Heat's - you can give it a template to instantiate / update, and ask about what happened to it.
15:14:52 <debo_os> garyk: was that for Mike
15:14:56 <garyk> personally i think that heat is too high in the application stack to be able to make the scheduling decisions
15:15:02 <MikeSpreitzer> right
15:15:03 <garyk> debo_os: it was for you
15:15:11 <MikeSpreitzer> or not
15:15:41 <debo_os> ok ... so here is my simplification of the threads - I agree with Mike wrt specify all resources upfront for scheduler
15:16:01 <MikeSpreitzer> I think of holistic infrastructure scheduling as a lower level thing than software orchestration preparation,
15:16:12 <debo_os> so an API should have the following objects: list of VRTs, list of policies and list of metadata
15:16:25 <MikeSpreitzer> but infrastructure orchestration is downstream from holistic scheduling.
15:16:46 <debo_os> so in the simple variation, VRTs could be instances alone and implemented inside nova
15:17:04 <debo_os> in a complex variation, this thing could be built on top of nova, neutron, cinder
15:17:12 <MikeSpreitzer> Debo: You mean a VRT would mention only VMs and be processed only inside nova?
15:17:50 <debo_os> in the simplest implementation to show the API layer works with the policies
15:18:10 <debo_os> in the complex variant, we can do what you proposed ... specify the topology
15:18:15 <MikeSpreitzer> Debo: I take that as agreement and elaboration
15:18:21 <debo_os> instances are the simplest incarnation of your topology - single node
15:18:27 <MikeSpreitzer> oh
15:18:43 <MikeSpreitzer> now I'm not so sure I understand you
15:19:12 <debo_os> ok let's look at your topology - nodes are compute, or storage say ....
15:19:14 <garyk> debo_os: can you please give example so it can maybe help to explain
15:19:20 <MikeSpreitzer> By "single node" you mean something with VRT syntax that just happens to have only one resource in it?
15:19:22 <debo_os> ok consider a simple web app
15:19:40 <debo_os> web layer (rails) = 1 VM connected to mysql (1VM)
15:20:21 <debo_os> in the simplest incarnation, you can say give me 2 VMs ... in the full VRT variation, it's VM ---> VM
15:20:32 <debo_os> or rather ext_network--> VM --> VM
15:20:43 <debo_os> so you are specifying network and compute
15:20:54 <garyk> (not to mention storage)
15:21:02 <debo_os> of course :)
15:21:18 <MikeSpreitzer> of course not, or of course including
15:21:19 <MikeSpreitzer> ?
15:21:28 <debo_os> but we can have the same API and different variations ...
15:21:45 <MikeSpreitzer> I'm a little lost. Is this a new API for nova, a new syntax to stuff into some existing nova API, or what?
15:21:46 <debo_os> VRT implemented in nova boils down to asking for only compute nodes
15:22:06 <debo_os> Mike this is a simple API that can both be done in nova for starters and then done as a service
15:22:09 <debo_os> without changing the API
15:22:19 <MikeSpreitzer> Ah, thanks
15:22:21 <debo_os> and that allows you to plug in your secret sauce
15:22:37 <debo_os> since everyone will have their smart ways of implementing it
15:22:45 <MikeSpreitzer> So it's a new API that takes a VRT. Start by implementing it for nova, later implement as a new service or expansion of Heat engine. Right?
15:22:50 <debo_os> someone will do LP based solving, someone will do nonlinear
15:22:58 <debo_os> yes :)
15:23:01 <garyk> MikeSpreitzer: yes,
15:24:00 <debo_os> wow ... any disagreements?
15:24:02 <garyk> if we could come to an agreement on what the API would look like then it could be useful to propose that
15:24:07 <MikeSpreitzer> Nova already has an API for creating a set of VM instances, right? Can we expand the syntax accepted there?
15:24:10 <garyk> PhilD: what do you think?
15:24:46 <debo_os> a few of us were trying to get instance_groups as an extension ... I think we could improve that API to have VRTs
15:24:49 <Yathi> simple variation will still have - list of instances (simple VRT), list of policies, and list of metadata right
15:24:51 <PhilD> Sorry, production issue came in
15:24:54 <PhilD> :-(
15:25:00 <garyk> ok, np
15:25:41 <garyk> MikeSpreitzer: the instance groups just have policies at the moment. It should be consumed by the API that we would like to propose (i think)
15:25:44 <MikeSpreitzer> OK, I'm such a newbie I mostly read documentation. But the doc for the Nova API includes today an extension for placing a set of VMs.
15:25:46 <debo_os> so the API should have CRUDs for VRTs, policies and metadata?
15:25:59 <Yathi> the VRTs should specify the request_spec right. .
15:26:07 <debo_os> yathi: yes
15:26:10 <garyk> yes
15:26:23 <garyk> I think that our goal here is to define the VRT
15:26:32 <MikeSpreitzer> policies are parts of VRTs, you do not create policies independently
15:26:50 <garyk> if we could define the API, flows and usecases then we could have a good starting point
15:26:53 <debo_os> ok ... so then a list of VRTs with embedded policies and metadata?
15:27:04 <debo_os> would that work
15:27:08 <MikeSpreitzer> debo: yes, that's what I was thinking
15:27:10 <PhilD> Yeah, I think I'd need to see some examples of the VRT to really get a sense of what's being proposed
15:27:12 <Yathi> embedded structure sounds good - makes it clean
15:27:31 <debo_os> I have one in my mind but Mike might have examples
15:27:34 <MikeSpreitzer> The wiki page I wrote about policy extension gives much of that
15:27:44 <debo_os> let's consider ext_network --> VM --> VM to go back to the web app
15:27:47 <garyk> MikeSpreitzer: please paste the link
15:27:48 <MikeSpreitzer> I did not go all the way to concrete syntax, but am happy to discuss that here
15:27:52 <debo_os> where you need 1 VM for apache and 1 VM for mysql
15:27:55 <PhilD> Worked through Use cases are always a good way exploring this kind of thing IMO
15:28:09 <Yathi> we need to evolve the instance group api extension to consider this new thing
15:28:18 <garyk> Yathi: agreed
15:28:43 <garyk> debo_os: please continue with the example (we all seemed to interrupt you)
15:28:52 <debo_os> so VRT = { nodes, connections, policy, metadata}
15:29:05 <debo_os> where nodes = list of VM request specs
15:29:18 <debo_os> connections = list of <node,node> pairs
15:29:29 <MikeSpreitzer> https://wiki.openstack.org/wiki/Heat/PolicyExtension
15:30:19 <debo_os> that's my simple use case
15:30:23 <MikeSpreitzer> The way I see it, we already have syntax (in heat) for set of resources. Need to add only: (1) grouping, (2) policies, (3) way to put policies on relationships
15:30:47 <garyk> debo_os: would the policies not be coupled with the connections
15:30:59 <garyk> that is, some we may want affinity, others anti-affinity etc
15:31:07 <debo_os> garyk: VRT level policies are here
15:31:18 <debo_os> connection level policies need to be in <node,node, policy>
15:31:29 <debo_os> sorry should have added policy for all the types
15:31:32 <MikeSpreitzer> exactly. You can attach a policy to a relationship between two group/resource
15:31:36 <garyk> ok, understood. that sounds logical
15:31:40 <Yathi> can a VRT be a hierarchy of VRTs ?
15:31:47 <debo_os> why not ...
15:31:50 <MikeSpreitzer> No, one VRT has a hierarchy of groups
15:31:53 <Yathi> then each VRT can have a VRT level policy
15:32:00 <MikeSpreitzer> VRT is the whole you want processed at once
15:32:23 <garyk> yup, kind of what we once described as ensembles
15:32:25 <debo_os> http://docwiki.cisco.com/wiki/Donabe_for_OpenStack .... we have an implementation for recursive containers on openstack ... hence recursive VRTs
15:32:56 <MikeSpreitzer> So we are agreed on the idea of recursive containment
15:32:57 <debo_os> with full GUI ... http://www.openstack.org/summit/portland-2013/session-videos/presentation/interactive-visual-orchestration-with-curvature-and-donabe
15:33:26 <MikeSpreitzer> We could use a syntax that is oriented around the group AKA container, primarily a tree of those.
15:33:29 <debo_os> garyk: +1 lots of things are similar which is good since it means we all need something like this
15:33:41 <debo_os> mike: why tree and why not graphs
15:33:46 <garyk> agreed.
15:34:13 <MikeSpreitzer> "contain" pretty much implies a tree-like shape to me.
15:34:13 <debo_os> we can already do graphs with neutron and openstack ...
15:34:25 <debo_os> sorry nova
15:34:50 <MikeSpreitzer> It may not seem like it to some of you, but I am actually trying to not go farther than necessary here
15:35:00 <MikeSpreitzer> I think a tree is sufficient
15:35:04 <debo_os> mike: while i agree it looks like tree ... i can also think it looks like a graph
15:35:22 <MikeSpreitzer> Debo: acyclic, right?
15:35:22 <debo_os> esp for describing virtual clusters for intense workloads
15:35:56 <debo_os> if you spec bw constraints you might want to spec a clique with max bw of the edges
15:36:24 <debo_os> see this is why we need an abstract API
15:36:30 <MikeSpreitzer> Debo: Is that an answer to my question about whether the graph can have cycles?
15:36:33 <debo_os> in one implementation you could restrict VRTs to trees
15:36:35 <debo_os> yes it can
15:36:54 <debo_os> I mean Neutron would support it ... why not then
15:37:12 <Yathi> let's start with simple examples :)
15:37:13 <MikeSpreitzer> Can you give us some use cases that require something more general than a tree?
15:37:31 <garyk> i really think that we need to start with something simple.
15:37:34 <debo_os> sure ... in a hadoop env, you might want to define a clique
15:37:35 <debo_os> yes
15:37:48 <debo_os> that's why keeping the API to the VRT level is what I would love to see
15:37:58 <garyk> if we go for complex there is no chance we are going to get it through (it should be extensible to be built on in the future)
15:38:00 <debo_os> since there is no end to making this API look better
15:38:14 <MikeSpreitzer> The clique is not a problem for a tree. One vertex for the parent, one for each member.
15:38:18 <debo_os> I am happy if we agree to VRTs with embedded metadata and policy
15:38:19 <Yathi> okay going back to the API.. what is a POLICY ?
15:38:23 <MikeSpreitzer> members all children of the same parent
15:38:31 <Yathi> I have seen flavors of affinity, antiaffinity etc
15:38:40 <MikeSpreitzer> Yes...
15:38:48 <debo_os> policy could be simple named policy handles implemented by whoever is providing u the scheduling
15:38:50 <Yathi> but do we have a generic idea of what could a policy be like
15:38:52 <MikeSpreitzer> collocation, anti-collocation
15:39:01 <debo_os> yeah so these are named objects
15:39:04 <MikeSpreitzer> Yes, we need to define semantics
15:39:05 <garyk> proximity and compute resources
15:39:12 <MikeSpreitzer> I think some take parameters
15:39:20 <debo_os> mike: do we need to define semantics in the api right now
15:39:27 <MikeSpreitzer> for example, anti-collocation to what level of granularity? Rack, machine, … ?
15:39:36 <debo_os> why don't we agree on the basic high level objects that the API needs
15:39:39 <garyk> i think that the onus is on us to try and define the API. then provide examples, use cases and flows
15:39:39 <Yathi> I like the idea of named policy handles.. leaving the implementation details outside
15:40:15 <Yathi> so each kind of implementation of the "SMART resource placement engine" can use policies differently
15:40:22 <MikeSpreitzer> A policy "instance" as it appears in a VRT needs only to name the policy, the thing or two to which it applies, and give the values of the relevant parameters.
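A minimal sketch of the policy "instance" described in the 15:40:22 message: a policy name, the one or two things it applies to, and parameter values. The class name, field names, and the example policy names and parameters (`level`, `min_gbps`) are assumptions made for illustration; the meeting did not settle on any concrete syntax.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class PolicyInstance:
    """One use of a named policy inside a VRT (hypothetical structure)."""
    name: str                     # named policy handle, e.g. "anti-collocation"
    targets: Tuple[str, ...]      # one or two group/resource names
    params: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        # The discussion limits a policy to "the thing or two" it applies to.
        if len(self.targets) not in (1, 2):
            raise ValueError("a policy applies to one or two targets")


# Anti-collocation on a single group, with an assumed granularity parameter.
anti = PolicyInstance("anti-collocation", ("web_tier",), {"level": "rack"})

# A policy on a relationship between two targets, with an assumed parameter.
bw = PolicyInstance("bandwidth", ("A", "B"), {"min_gbps": 1})
```

The semantics of each named handle would stay with whichever scheduler implementation interprets it, which matches the "named policy handles" idea in the log.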
15:40:30 <garyk> the more robust the api the better (i know it sounds like lip service, but we really need a good base here)
15:40:30 <debo_os> agreed
15:41:03 <debo_os> garyk: trying to see if we need anything more than a list of VRTs with policy names (maybe params)
15:41:15 <debo_os> else the API looks simple from a 30K ft alt
15:41:37 <MikeSpreitzer> I outlined a proposal in https://wiki.openstack.org/wiki/Heat/PolicyExtension
15:41:48 <garyk> it should compile on paper (or in our case interpret on paper)
15:41:53 <MikeSpreitzer> we need groups, a way to apply a policy to a group, and a way to apply policies to a pair of groups
15:42:20 <MikeSpreitzer> you could allow resources in places of groups, or not, depending on evolution tactics
15:42:33 <debo_os> mike: could we keep the API simple and just have VRTs with policies
15:42:39 <debo_os> would that break your use cases
15:43:01 <debo_os> then it would be really simple and the impl could be as elaborate as you want!
15:43:08 <MikeSpreitzer> We need a way to apply a policy to a relationship
15:43:20 <debo_os> apply = implementation, right?
15:43:27 <MikeSpreitzer> no,...
15:43:41 <MikeSpreitzer> e.g., "I need 1 Gbps between A and B"
15:43:58 <debo_os> yes that's a policy for the connection between A, B
15:44:08 <MikeSpreitzer> or, for firewall, "A should be able to open a TCP connection to port 8080 on B"
15:44:12 <debo_os> so when you put edges in your VRT, you should have edge policy
15:44:31 <debo_os> hence edges = <node,node,policy>
15:44:33 <MikeSpreitzer> right, call them edges or relationships, we need them
15:44:40 <debo_os> nodes = <node, policy>
15:44:47 <debo_os> VRT = <nodes, edges, policies>
15:45:06 <Yathi> +1 for VRT = <nodes, edges, policies>
15:45:07 <debo_os> I am just using std graph terminology
15:45:16 <debo_os> G=(V,E) :)
15:45:39 <MikeSpreitzer> and we need recursive grouping
15:45:48 <debo_os> in the simplest case edges =[], and we can do this in Nova
15:45:59 <debo_os> nodes = VMs only
15:46:03 <MikeSpreitzer> The three key ideas are: recursive grouping, relationships, and policy applied to group/element or relationship
15:46:18 <doron> Can an edge include more than 2 nodes?
15:46:22 <MikeSpreitzer> no
15:46:38 <debo_os> ok this definition will apply if you consider node = abstract node that represents another VRT
15:46:43 <MikeSpreitzer> And we need edges to be directed, in some cases (e.g., firewall rule)
15:46:49 <debo_os> except that you need metadata
15:46:52 <Yathi> I was going to ask the same thing can the node be a VRT
15:46:55 <Yathi> for abstract
15:47:12 <debo_os> for a node to be treated as a VRT you need ingress border nodes for a given VRT
15:47:13 <doron> so no way to secure 1 GB between A, B and C?
15:47:32 <debo_os> Doron: you need to do A-B, B-C C-A
15:47:44 <MikeSpreitzer> yeah...
15:47:46 <doron> I know, but one of them may fail.
15:47:54 <doron> which invalidates everything
15:48:01 <debo_os> if you want shared A,B,C you need a special VRT with policies that implement that
15:48:07 <garyk> doron: that is why the scheduling should be done at one shot
15:48:10 <debo_os> and then stick this VRT as a node in the general VRT
15:48:14 <MikeSpreitzer> Most policies are essentially about a pairwise relationship, so applying such a policy reduces to a bunch of atomic relationships
15:49:10 <doron> I'm aware of the atomic need, which sometimes
15:49:22 <doron> ends up with a need for more than a pair of nodes.
15:49:43 <doron> if you take affinity,
15:49:46 <MikeSpreitzer> applying a dyadic policy to a group means to apply it to every pair within the group
15:49:46 <debo_os> so I guess we are all saying the same thing with slight changes in jargon
15:49:57 <debo_os> so a dict of jargon mappings would suffice :)
15:50:41 <MikeSpreitzer> Yes, I imagine you could apply a collocation policy to a group of 7 VMs, that means all pairs are collocated
15:50:46 <debo_os> so any disagrees with the simple API = VRTs = <nodes, edges, policies, metadata>?
15:50:57 <debo_os> with implementation in plugins
15:51:08 <MikeSpreitzer> What do you mean by metadata? Is that the parameters of the policies that take parameters?
15:51:14 <doron> MikeSpreitzer: I may need some more info on your suggestion, but I can look into it later.
15:51:15 <debo_os> that's undefined
15:51:17 <debo_os> for now
15:51:22 <debo_os> it's defined by the implementation
15:51:41 <Yathi> it's a placeholder for any random attributes I guess
15:51:43 <MikeSpreitzer> What does metadata look like? To what is it attached? Why do you want it?
15:51:49 <Yathi> you can think of it that way
15:51:54 <garyk> it would be nice if we could get the api on paper so people could see it and think of issues and problems
15:51:55 <Yathi> in python a simple dictionary ?
15:52:18 <MikeSpreitzer> So we can attach a general dictionary to any vertex in the graph?
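A small sketch of the rule stated at 15:49:46, that applying a dyadic policy to a group means applying it to every pair within the group. The function name and the tuple layout `(a, b, policy)` are assumptions, loosely following the `<node,node,policy>` edge notation used earlier in the log.

```python
from itertools import combinations


def expand_pairwise(policy_name, members):
    """Return one (a, b, policy_name) relationship per unordered pair."""
    return [(a, b, policy_name) for a, b in combinations(members, 2)]


# The 7-VM collocation example from the discussion: 7 choose 2 = 21 pairs.
vms = [f"vm{i}" for i in range(7)]
pairs = expand_pairwise("collocation", vms)
print(len(pairs))  # 21
```

This only covers symmetric policies. A directed policy such as the firewall example would presumably need ordered pairs instead, which the log leaves open.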
15:52:20 <garyk> the metadata will just be key/value pairs
15:53:43 <MikeSpreitzer> Is there convergence on this: a VRT is a graph, with policies applied to vertices and edges (which are directed), metadata applied to vertices. A vertex can be a resource or another VRT.
15:54:41 <doron> +1
15:54:58 <MikeSpreitzer> And the API will get all the relevant VRTs at once.
15:55:07 <MikeSpreitzer> So it can work on all of them at once.
15:55:09 <garyk> we are running out of time. debo_os would it be possible you write up the api and share it with everyone and we can discuss in more detail next week
15:55:36 <debo_os> sure ...
15:55:42 <debo_os> would love to rope in mike and yathi too
15:55:48 <debo_os> but I take the AI
15:55:53 <garyk> great. anyone else want to help debo_os write this up
15:55:56 <MikeSpreitzer> sure. Let's agree on when/how to talk
15:56:04 <Yathi> would love to work with Debo on this
15:56:17 <garyk> cool. i'll jump in too
15:56:23 <debo_os> awesome
15:56:30 <Yathi> this is one set of API
15:56:44 <MikeSpreitzer> We didn't get to 2 or 3
15:56:45 <debo_os> garyk: you would have been there even if you hadn't volunteered :) we would have dragged u
15:56:50 <Yathi> what about the API to the other parts of the big vision
15:56:56 <garyk> :)
15:57:01 <MikeSpreitzer> Any quick feedback on using host aggregates to convey location structure?
15:57:16 <garyk> Yathi: once we have the foundations we can try and map it to all of the use cases we can think of
15:58:22 <Yathi> ok
15:58:26 <debo_os> ok so we have agreement wrt API?
15:58:32 <garyk> MikeSpreitzer: are you talking about a user configuration of having the aggregate report the 'proximity'
15:58:56 <debo_os> only the high level VRTs etc
15:58:57 <debo_os> ?
15:59:04 <MikeSpreitzer> I'm getting to the questions of the other APIs. The scheduler will need location info, so how is that represented/discovered/conveyed?
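The convergence statement at 15:53:43 (a VRT is a graph, policies on vertices and directed edges, key/value metadata on vertices, and a vertex is either a resource or another VRT) could be sketched as plain data classes. Every class and field name here is an assumption for illustration, as are the `kind` labels; the write-up of the actual API was left as an action item.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Resource:
    name: str
    kind: str  # illustrative labels such as "vm", "volume", "network"
    policies: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)  # key/value pairs


@dataclass
class Edge:
    source: str  # edges are directed: source -> target
    target: str
    policies: List[str] = field(default_factory=list)


@dataclass
class VRT:
    name: str
    vertices: List[Union[Resource, "VRT"]] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)
    policies: List[str] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def resources(self) -> List[Resource]:
        """Flatten nested VRTs into a list of leaf resources."""
        found = []
        for vertex in self.vertices:
            if isinstance(vertex, VRT):
                found.extend(vertex.resources())
            else:
                found.append(vertex)
        return found


# The web app example from the log: ext_network --> VM --> VM.
app = VRT(
    "web_app",
    vertices=[Resource("ext_network", "network"),
              Resource("web", "vm"),
              Resource("db", "vm")],
    edges=[Edge("ext_network", "web"), Edge("web", "db", ["bandwidth"])],
)

# A VRT used as a vertex of another VRT (recursive containment).
site = VRT("site", vertices=[app, Resource("backup", "volume")])
print([r.name for r in site.resources()])
```

The "simplest case" from the discussion, VMs only with `edges = []`, is the same structure with an empty edge list, so the shape of the API would not need to change between the Nova-only and the cross-service variants.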
15:59:34 <garyk> MikeSpreitzer: that is certainly something that we need to discuss
15:59:47 <Yathi> we will not address the how it is discovered yet..
15:59:49 <garyk> i think that we are out of time. can we continue offline or next week?
15:59:55 <MikeSpreitzer> One direction would be to define some key:value pairs to use in host aggregates, use host aggregates to represent the structure of the datacenter
15:59:57 <Yathi> but represented and conveyed is something to tackle first
16:00:14 <MikeSpreitzer> I'll be watching the ML
16:00:20 <garyk> ok. great.
16:00:22 <garyk> thanks guys
16:00:25 <garyk> #endmeeting
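One way to picture the 15:59:55 suggestion of using key:value pairs on host aggregates to represent datacenter structure. This is a sketch over plain dictionaries, not the Nova host aggregate API; the metadata keys `level` and `parent`, the aggregate names, and both helper functions are hypothetical.

```python
# Each aggregate is modeled as a plain dict; the keys below are assumptions.
aggregates = [
    {"name": "rack-1", "metadata": {"level": "rack", "parent": "room-a"},
     "hosts": ["h1", "h2"]},
    {"name": "rack-2", "metadata": {"level": "rack", "parent": "room-a"},
     "hosts": ["h3"]},
]


def location_of(host, level):
    """Return the name of the aggregate at `level` that contains `host`."""
    for agg in aggregates:
        if agg["metadata"].get("level") == level and host in agg["hosts"]:
            return agg["name"]
    return None


def same_location(a, b, level):
    """True if both hosts sit in the same aggregate at the given level."""
    loc = location_of(a, level)
    return loc is not None and loc == location_of(b, level)


print(same_location("h1", "h2", "rack"))  # True
print(same_location("h1", "h3", "rack"))  # False
```

A scheduler could use a lookup of this kind to evaluate an anti-collocation policy at rack granularity. How the location data is discovered was explicitly deferred in the meeting.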