15:08:20 <garyk> #startmeeting scheduling
15:08:21 <openstack> Meeting started Tue Oct  1 15:08:20 2013 UTC and is due to finish in 60 minutes.  The chair is garyk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:08:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:08:24 <openstack> The meeting name has been set to 'scheduling'
15:08:43 <debo_os> agenda?
15:08:57 <garyk> Not sure if you guys saw the mail I sent. I suggested we discuss an API to propose for the summit
15:09:14 <garyk> And in addition to this the Heat scheduling discussion on the list.
15:09:17 <MikeSpreitzer> I saw the mail.
15:09:25 <MikeSpreitzer> If I understand, it's really the same discussion.
15:09:25 <garyk> So should we start with the API?
15:09:40 <garyk> #topic Discuss scheduling API for summit
15:09:42 <MikeSpreitzer> I think the API should look a lot like the Heat API — deal with this template
15:09:45 <debo_os> garyk, mike: +1 for API
15:09:58 <debo_os> but I am not sure if the scheduler API should look like HEAT ....
15:10:16 <MikeSpreitzer> If we want to make a unified decision, we need unified input
15:10:20 <debo_os> It needs to specify the VRT or eq and policy handles
15:10:27 <debo_os> and it should be very very simple
15:10:29 <PhilD> alaski has been pushing a set of changes for the query scheduler - how does that relate to this ?
15:10:51 <MikeSpreitzer> Oh, something else I need to learn.  Can you give a pointer?
15:10:57 <debo_os> Mike: agree about the unified input ... hence maybe VRT ...  ..
15:11:31 <garyk> debo_os: can you please elaborate on VRT
15:11:32 <debo_os> but the API could be very simple which is easy to incrementally build and allow for extensions to have complex variants
15:11:50 <debo_os> VRT  = virtual resource topology from Mike's jargon
15:12:12 <garyk> Just to recap, last week we spoke about trying to understand the following (3 things):
15:12:21 <garyk> 1. a user facing API
15:12:41 <garyk> 2. understanding which resources need to be tracked
15:12:47 <debo_os> for example, one needs to pass in groups of resources that need to be scheduled as a single entity for starters - network compute storage ... and pass a list of policy object
15:12:48 <garyk> 3. backend implementation
15:12:55 <debo_os> ok ... so we are on 1
15:13:04 <garyk> debo_os: yes :)
15:13:32 <debo_os> I think 2, 3 are imp but we should have a simple 1. with room for complex extensions since this will evolve
15:13:45 <debo_os> maybe over 1-2 releases ...
15:13:47 <garyk> debo_os: agreed.
15:13:55 <MikeSpreitzer> I think we should consider two basic approaches to 1: (a) introduce a new service with its own API or (b) introduce a side-car to the existing Heat engine
15:14:12 <garyk> do you want to explain what you are thinking and lets see if we can translate those ideas into api's
15:14:50 <MikeSpreitzer> ( a ) would be to put up a service that has an API that is similar to Heat's — you can give it a template to instantiate / update, and ask about what happened to it.
15:14:52 <debo_os> garyk: was that for Mike
15:14:56 <garyk> personally i think that heat is too high in the application stack to be able to make the scheduling decisions
15:15:02 <MikeSpreitzer> right
15:15:03 <garyk> debo_os: it was for you
15:15:11 <MikeSpreitzer> or not
15:15:41 <debo_os> ok ... so here is my simplification of the threads - I agree with Mike wrt specify all resources upfront for scheduler
15:16:01 <MikeSpreitzer> I think of holistic infrastructure scheduling as a  lower level thing than software orchestration preparation,
15:16:12 <debo_os> so an API should have the following objects: list of VRTs, list of policies and list of metadata
15:16:25 <MikeSpreitzer> but infrastructure orchestration is downstream from holistic scheduling.
15:16:46 <debo_os> so in the simple variation, VRTs could be instances alone and implemented inside nova
15:17:04 <debo_os> in a complex variation, this thing could be built on top of nova, neutron, cinder
15:17:12 <MikeSpreitzer> Debo: You mean a VRT would mention only VMs and be processed only inside nova?
15:17:50 <debo_os> in the simplest implementation to show the API layer works with the policies
15:18:10 <debo_os> in the complex variant, we can do what you proposed ... specify the topology
15:18:15 <MikeSpreitzer> Debo: I take that as agreement and elaboration
15:18:21 <debo_os> instances are the simplest incarnation of your topology - single node
15:18:27 <MikeSpreitzer> oh
15:18:43 <MikeSpreitzer> now I'm not so sure I understand you
15:19:12 <debo_os> ok lets look at your topology - nodes are compute, or storage  say ....
15:19:14 <garyk> debo_os: can you please give example so it can maybe help to explain
15:19:20 <MikeSpreitzer> By "single node" you mean something with VRT syntax that just happens to have only one resource in it?
15:19:22 <debo_os> ok consider a simple web app
15:19:40 <debo_os> web layer (rails) = 1 VM connected to mysql (1VM)
15:20:21 <debo_os> in the simplest incarnation,  you can say give me 2 VMs ... in the full VRT variation, its VM ---> VM
15:20:32 <debo_os> or rather ext_network--> VM --> VM
15:20:43 <debo_os> so you are specifying network and compute
15:20:54 <garyk> (not to mention storage)
15:21:02 <debo_os> of course :)
15:21:18 <MikeSpreitzer> of course not, or of course including
15:21:19 <MikeSpreitzer> ?
15:21:28 <debo_os> but we can have the same API and different variations ...
15:21:45 <MikeSpreitzer> I'm a little lost.  Is this a new API for nova, a new syntax to stuff into some existing nova API, or what?
15:21:46 <debo_os> VRT implemented in nova boils down to asking for only compute nodes
15:22:06 <debo_os> Mike this is a simple API that can both be done in nova for starters and then done as a service
15:22:09 <debo_os> without changing the API
15:22:19 <MikeSpreitzer> Ah, thanks
15:22:21 <debo_os> and that allows you to plug in your secret sauce
15:22:37 <debo_os> since everyone will have their smart ways of implementing it
15:22:45 <MikeSpreitzer> So it's a new API that takes a VRT.  Start by implementing it for nova, later implement as a new service or expansion of Heat engine.  Right?
15:22:50 <debo_os> someone will do LP based solving, someone will do nonlinear
15:22:58 <debo_os> yes :)
15:23:01 <garyk> MikeSpreitzer: yes,
15:24:00 <debo_os> wow ... any disagreements?
15:24:02 <garyk> if we could come to an agreement on what the API would look like then it could be useful to propose that
15:24:07 <MikeSpreitzer> Nova already has an API for creating a set of VM instances, right?  Can we expand the syntax accepted there?
15:24:10 <garyk> PhilD: what do you think?
15:24:46 <debo_os> a few of us were trying to get instance_groups as an extension ... I think we could improve that API to have VRTs
15:24:49 <Yathi> simple variation will still have - list of instances (simple VRT),  list of policies, and list of metadata  right
15:24:51 <PhilD> Sorry, production issue came in
15:24:54 <PhilD> :-(
15:25:00 <garyk> ok, np
15:25:41 <garyk> MikeSpreitzer: the instance groups just have policies at the moment. It should be consumed by the API that we would like to propose (i think)
15:25:44 <MikeSpreitzer> OK, I'm such a newbie I mostly read documentation.  But the doc for the Nova API includes today an extension for placing a set of VMs.
15:25:46 <debo_os> so the API should have CRUDs for VRTs, policies and metadata?
15:25:59 <Yathi> the VRTs should specify the request_spec right. .
15:26:07 <debo_os> yathi: yes
15:26:10 <garyk> yes
15:26:23 <garyk> I think that our goal here is to define the VRT
15:26:32 <MikeSpreitzer> policies are parts of VRTs, you do not create policies independently
15:26:50 <garyk> if we could define the API, flows and usecases then we could have a good starting point
15:26:53 <debo_os> ok ... so then a list of VRTs with embedded policies and metadata?
15:27:04 <debo_os> would that work
15:27:08 <MikeSpreitzer> debo: yes, that's what I was thinking
15:27:10 <PhilD> Yeah, I think I'd need to see some examples of the VRT to really get a sense of what's being proposed
15:27:12 <Yathi> embedded structure sounds good - makes it clean
15:27:31 <debo_os> I have one in my mind but Mike might have examples
15:27:34 <MikeSpreitzer> The wiki page I wrote about policy extension gives much of that
15:27:44 <debo_os> lets consider ext_network --> VM --> VM to go back to the web app
15:27:47 <garyk> MikeSpreitzer: please paste the link
15:27:48 <MikeSpreitzer> I did not go all the way to concrete syntax, but am happy to discuss that here
15:27:52 <debo_os> where you need 1 VM for apache and 1 VM for mysql
15:27:55 <PhilD> Worked-through use cases are always a good way of exploring this kind of thing IMO
15:28:09 <Yathi> we need to evolve the instance group api extension to consider this new thing
15:28:18 <garyk> Yathi: agreed
15:28:43 <garyk> debo_os: please continue with the example (we all seemed to interrupt you)
15:28:52 <debo_os> so VRT = { nodes, connections, policy, metadata}
15:29:05 <debo_os> where nodes = list of VM request specs
15:29:18 <debo_os> connections = list of <node,node> pairs
15:29:29 <MikeSpreitzer> https://wiki.openstack.org/wiki/Heat/PolicyExtension
15:30:19 <debo_os> that's my simple use case
15:30:23 <MikeSpreitzer> The way I see it, we already have syntax (in heat) for set of resources.  Need to add only: (1) grouping, (2) policies, (3) way to put policies on relationships
15:30:47 <garyk> debo_os: would the policies not be coupled with the connections
15:30:59 <garyk> that is, some we may want affinity, others anti-affinity etc
15:31:07 <debo_os> garyk: VRT level policies are here
15:31:18 <debo_os> connection level policies need to be in <node,node, policy>
15:31:29 <debo_os> sorry should have added policy for all the types
15:31:32 <MikeSpreitzer> exactly.  You can attach a policy to a relationship between two group/resource
15:31:36 <garyk> ok, understood. that sounds logical
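A minimal sketch of the web-app example discussed above (ext_network --> VM --> VM), written as a plain Python dictionary. The meeting had not fixed any concrete syntax, so the key names (`nodes`, `connections`, `policies`, `metadata`), the node names, and the policy name are all illustrative assumptions, as is the helper `nova_only_view`.

```python
# Hypothetical VRT for the web-app example: ext_network --> web VM --> db VM.
# All key names and policy names are assumptions made for illustration only.
web_app_vrt = {
    "nodes": [
        {"name": "ext_network", "type": "network"},
        {"name": "web", "type": "vm"},  # 1 VM for the web layer
        {"name": "db", "type": "vm"},   # 1 VM for mysql
    ],
    # Each connection is a <node, node, policy> triple; None means no policy.
    "connections": [
        ("ext_network", "web", None),
        ("web", "db", "affinity"),
    ],
    "policies": [],  # VRT-level policies
    "metadata": {},  # left undefined in the discussion; a dict is assumed here
}


def nova_only_view(vrt):
    """Return the 'simplest variation': VM nodes only, no connections."""
    vms = [node for node in vrt["nodes"] if node["type"] == "vm"]
    return {
        "nodes": vms,
        "connections": [],
        "policies": vrt["policies"],
        "metadata": vrt["metadata"],
    }


simple = nova_only_view(web_app_vrt)
print([node["name"] for node in simple["nodes"]])
```

The same structure serves both variations: the full VRT carries network and compute, while the reduced view is what a Nova-only implementation would see.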
15:31:40 <Yathi> can a VRT be a hierarchy of VRTs ?
15:31:47 <debo_os> why not ...
15:31:50 <MikeSpreitzer> No, one VRT has a hierarchy of groups
15:31:53 <Yathi> then each VRT can have a VRT level policy
15:32:00 <MikeSpreitzer> VRT is the whole you want processed at once
15:32:23 <garyk> yup, kind of what we once described as ensembles
15:32:25 <debo_os> http://docwiki.cisco.com/wiki/Donabe_for_OpenStack .... we have an implementation of recursive containers on openstack ... hence recursive VRTs
15:32:56 <MikeSpreitzer> So we are agreed on the idea of recursive containment
15:32:57 <debo_os> with full GUI ... http://www.openstack.org/summit/portland-2013/session-videos/presentation/interactive-visual-orchestration-with-curvature-and-donabe
15:33:26 <MikeSpreitzer> We could use a syntax that is oriented around the group AKA container, primarily a tree of those.
15:33:29 <debo_os> garyk: +1 lots of things are similar which is good since it means we all need something like this
15:33:41 <debo_os> mike: why tree and why not graphs
15:33:46 <garyk> agreed.
15:34:13 <MikeSpreitzer> "contain" pretty much implies a tree-like shape to me.
15:34:13 <debo_os> we can already do graphs with neutron and openstack ...
15:34:25 <debo_os> sorry nova
15:34:50 <MikeSpreitzer> It may not seem like it to some of you, but I am actually trying to not go farther than necessary here
15:35:00 <MikeSpreitzer> I think a tree is sufficient
15:35:04 <debo_os> mike: while i agree it looks like a tree ... i can also think it looks like a graph
15:35:22 <MikeSpreitzer> Debo: acyclic, right?
15:35:22 <debo_os> esp for describing virtual clusters for intense workloads
15:35:56 <debo_os> if you spec bw constraints you might want to spec a clique with max bw of the edges
15:36:24 <debo_os> see this is why we need an abstract API
15:36:30 <MikeSpreitzer> Debo: Is that an answer to my question about whether the graph can have cycles?
15:36:33 <debo_os> in one implementation you could restrict VRTs to trees
15:36:35 <debo_os> yes it can
15:36:54 <debo_os> I mean Neutron would support it ... why not then
15:37:12 <Yathi> let's start with simple examples :)
15:37:13 <MikeSpreitzer> Can you give us some use cases that require something more general than a tree?
15:37:31 <garyk> i really think that we need to start with something simple.
15:37:34 <debo_os> sure ... in a hadoop env, you might want to define a clque
15:37:35 <debo_os> yes
15:37:48 <debo_os> thats why keeping the API to the VRT level is what I would love to see
15:37:58 <garyk> if we go for complex there is no chance we are going to get it through (it should be extensible to be built on in the future)
15:38:00 <debo_os> since there is no end to making this API look better
15:38:14 <MikeSpreitzer> The clique is not a problem for a tree.  One vertex for the parent, one for each member.
15:38:18 <debo_os> I am happy if we agree to VRTs with embedded metadata and policy
15:38:19 <Yathi> okay going back to the API.. what is a POLICY ?
15:38:23 <MikeSpreitzer> members all children of the same parent
15:38:31 <Yathi> I have seen flavors of affinity, antiaffinity etc
15:38:40 <MikeSpreitzer> Yes...
15:38:48 <debo_os> policy could be simple named policy handles implemented by whoever is providing you the scheduling
15:38:50 <Yathi> but do we have a generic idea of what could a policy be like
15:38:52 <MikeSpreitzer> collocation, anti-collocation
15:39:01 <debo_os> yeah so these are named objects
15:39:04 <MikeSpreitzer> Yes, we need to define semantics
15:39:05 <garyk> proximity and compute resources
15:39:12 <MikeSpreitzer> I think some take parameters
15:39:20 <debo_os> mike: do we need to define semantics in the api right now
15:39:27 <MikeSpreitzer> for example, anti-collocation to what level of granularity?  Rack, machine, … ?
15:39:36 <debo_os> why dont we agree on the basic high level objects that the API needs
15:39:39 <garyk> i think that the onus is on us to try and define the API. then provide examples, use cases and flows
15:39:39 <Yathi> I like the idea of named policy handles.. leaving the implementation details outside
15:40:15 <Yathi> so each kind of implementation of the "SMART resource placement engine" can use policies differently
15:40:22 <MikeSpreitzer> A policy "instance" as it appears in a VRT needs only to name the policy, the thing or two to which it applies, and give the values of the relevant parameters.
15:40:30 <garyk> the more robust the api the better (i know it sounds like lip service, but we really need a good base here)
15:40:30 <debo_os> agreed
15:41:03 <debo_os> garyk: trying to see if we need anything more than a list of VRTs with policy names (maybe params)
15:41:15 <debo_os> else the API looks simple from a 30K ft alt
15:41:37 <MikeSpreitzer> I outlined a proposal in https://wiki.openstack.org/wiki/Heat/PolicyExtension
15:41:48 <garyk> it should compile on paper (or in our case interpret on paper)
15:41:53 <MikeSpreitzer> we need groups, a way to apply a policy to a group, and a way to apply policies to a pair of groups
15:42:20 <MikeSpreitzer> you could allow resources in places of groups, or not, depending on evolution tactics
15:42:33 <debo_os> mike: could we keep the API simple and just have VRTs with policies
15:42:39 <debo_os> would that break your use cases
15:43:01 <debo_os> then it would be really simple and the impl could be as elaborate as you want!
15:43:08 <MikeSpreitzer> We need a way to apply a policy to a relationship
15:43:20 <debo_os> apply = implementation, right?
15:43:27 <MikeSpreitzer> no,...
15:43:41 <MikeSpreitzer> e.g., "I need 1 Gbps between A and B"
15:43:58 <debo_os> yes thats a policy for the connection between A, B
15:44:08 <MikeSpreitzer> or, for firewall, "A should be able to open a TCP connection to port 8080 on B"
15:44:12 <debo_os> so when you put edges in your VRT, you should have edge policy
15:44:31 <debo_os> hence edges = <node,node,policy>
15:44:33 <MikeSpreitzer> right, call them edges or relationships, we need them
15:44:40 <debo_os> nodes = <node, policy>
15:44:47 <debo_os> VRT = <nodes, edges, policies>
15:45:06 <Yathi> +1 for VRT = <nodes, edges, policies>
15:45:07 <debo_os> I am just using std graph terminology
15:45:16 <debo_os> G=(V,E) :)
15:45:39 <MikeSpreitzer> and we need recursive grouping
15:45:48 <debo_os> in the simplest case edges =[], and we can do this in Nova
15:45:59 <debo_os> nodes = VMs only
15:46:03 <MikeSpreitzer> The three key ideas are: recursive grouping, relationships, and policy applied to group/element or relationship
15:46:18 <doron> Can an edge include more than 2 nodes?
15:46:22 <MikeSpreitzer> no
15:46:38 <debo_os> ok this definition will apply if you consider node = abstract node that represents another VRT
15:46:43 <MikeSpreitzer> And we need edges to be directed, in some cases (e.g., firewall rule)
15:46:49 <debo_os> except that you need metadata
15:46:52 <Yathi> I was going to ask the same thing can the node be a VRT
15:46:55 <Yathi> for abstract
15:47:12 <debo_os> for a node to be treated as a VRT you need ingress border nodes for a given VRT
15:47:13 <doron> so no way to secure 1 GB between A, B and C?
15:47:32 <debo_os> Doron: you need to do A-B, B-C C-A
15:47:44 <MikeSpreitzer> yeah...
15:47:46 <doron> I know, but one of them may fail.
15:47:54 <doron> which invalidates everything
15:48:01 <debo_os> if you want shared A,B,C you need a special VRT with policies that implement that
15:48:07 <garyk> doron: that is why the scheduling should be done at one shot
15:48:10 <debo_os> and then stick this VRT as a node in the general VRT
15:48:14 <MikeSpreitzer> Most policies are essentially about a pairwise relationship, so applying such a policy reduces to a bunch of atomic relationships
15:49:10 <doron> I'm aware of the atomic need, which sometimes
15:49:22 <doron> ends up with a need for more than a pair of nodes.
15:49:43 <doron> if you take affinity,
15:49:46 <MikeSpreitzer> applying a dyadic policy to a group means to apply it to every pair within the group
15:49:46 <debo_os> so I guess we are all saying the same thing with slight changes in jargon
15:49:57 <debo_os> so a dict of jargon mappings would suffice :)
15:50:41 <MikeSpreitzer> Yes, I imagine you could apply a collocation policy to a group of 7 VMs, that means all pairs are collocated
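A small sketch of the point made above, that applying a dyadic policy to a group means applying it to every pair within the group. The function name `expand_group_policy`, the VM names, and the `<node, node, policy>` tuple layout are assumptions for illustration, not an agreed format.

```python
from itertools import combinations


def expand_group_policy(policy_name, members):
    """Expand a group-level policy into one <node, node, policy> edge per pair."""
    return [(a, b, policy_name) for a, b in combinations(members, 2)]


# A group of 7 VMs with a collocation policy, as in the example above.
vms = ["vm%d" % i for i in range(1, 8)]
edges = expand_group_policy("collocation", vms)
print(len(edges))

# The A, B, C case reduces to the three pairs A-B, A-C, B-C.
print(expand_group_policy("bandwidth", ["A", "B", "C"]))
```

A group of n members yields n*(n-1)/2 pairwise relationships, which is one reason the whole set has to be scheduled in one shot: a single failing pair invalidates the placement.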
15:50:46 <debo_os> so does anyone disagree with the simple API = VRTs = <nodes, edges, policies, metadata>?
15:50:57 <debo_os> with implementation in plugins
15:51:08 <MikeSpreitzer> What do you mean by metadata?  Is that the parameters of the policies that take parameters?
15:51:14 <doron> MikeSpreitzer: I may need some more info on your suggestion, but I can look into it later.
15:51:15 <debo_os> thats undefined
15:51:17 <debo_os> for now
15:51:22 <debo_os> its defined by the implementation
15:51:41 <Yathi> its a placeholder for any random attributes I guess
15:51:43 <MikeSpreitzer> What does metadata look like?  To what is it attached?  Why do you want it?
15:51:49 <Yathi> you can think of it that way
15:51:54 <garyk> it would be nice if we could get the api on paper so people could see it and think of issues and problems
15:51:55 <Yathi> in python a simple dictionary ?
15:52:18 <MikeSpreitzer> So we can attach a general dictionary to any vertex in the graph?
15:52:20 <garyk> the metadata will just be key/value pairs
15:53:43 <MikeSpreitzer> Is there convergence on this: a VRT is a graph, with policies applied to vertices and edges (which are directed), metadata applied to vertices.  A vertex can be a resource or another VRT.
15:54:41 <doron> +1
15:54:58 <MikeSpreitzer> And the API will get all the relevant VRTs at once.
15:55:07 <MikeSpreitzer> So it can work on all of them at once.
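One possible way to write down the convergence statement above (a VRT is a graph; policies attach to vertices and directed edges; metadata attaches to vertices; a vertex is a resource or another VRT) is with Python dataclasses. Every class, field, and policy name here is hypothetical; the write-up promised for the following week may look quite different.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Policy:
    name: str                                   # named policy handle
    params: Dict[str, str] = field(default_factory=dict)


@dataclass
class Resource:
    name: str
    kind: str                                   # e.g. "vm", "network", "volume"
    policies: List[Policy] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Edge:
    source: str                                 # edges are directed: source -> target
    target: str
    policy: Optional[Policy] = None


@dataclass
class VRT:
    name: str
    vertices: List[Union[Resource, "VRT"]] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)
    policies: List[Policy] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def resources(self):
        """Yield every Resource, descending into nested VRTs."""
        for vertex in self.vertices:
            if isinstance(vertex, VRT):
                yield from vertex.resources()
            else:
                yield vertex


db_tier = VRT(
    "db_tier",
    vertices=[Resource("db1", "vm"), Resource("db2", "vm")],
    policies=[Policy("anti-collocation", {"level": "rack"})],
)
app = VRT(
    "web_app",
    vertices=[Resource("web", "vm"), db_tier],
    edges=[Edge("web", "db_tier", Policy("bandwidth", {"min": "1Gbps"}))],
)
names = [r.name for r in app.resources()]
print(names)
```

Flattening a nested VRT into its resources, as `resources()` does, is the kind of step a scheduler would need in order to work on everything at once.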
15:55:09 <garyk> we are running out of time. debo_os would it be possible you write up the api and share it with everyone and we can discuss in more detail next week
15:55:36 <debo_os> sure ...
15:55:42 <debo_os> would love to rope in mike and yathi too
15:55:48 <debo_os> but I take the AI
15:55:53 <garyk> great. anyone else want to help debo_os write this up
15:55:56 <MikeSpreitzer> sure.  Let's agree on when/how to talk
15:56:04 <Yathi> would love to work with Debo on this
15:56:17 <garyk> cool. i'll jump in too
15:56:23 <debo_os> awesome
15:56:30 <Yathi> this is one set of API
15:56:44 <MikeSpreitzer> We didn't get to 2 or 3
15:56:45 <debo_os> garyk: you would have been there even if you hadn't volunteered :) we would have dragged u
15:56:50 <Yathi> what about the API to the other parts of the big vision
15:56:56 <garyk> :)
15:57:01 <MikeSpreitzer> Any quick feedback on using host aggregates to convey location structure?
15:57:16 <garyk> Yathi: once we have the foundations we can try and map it to all of the use cases we can think of
15:58:22 <Yathi> ok
15:58:26 <debo_os> ok so we have agreement wrt API?
15:58:32 <garyk> MikeSpreitzer: are you talking about a user configuration of having the aggregate report the 'proximity'
15:58:56 <debo_os> only the high level VRTs etc
15:58:57 <debo_os> ?
15:59:04 <MikeSpreitzer> I'm getting to the questions of the other APIs.  The scheduler will need location info, so how is that represented/discovered/conveyed?
15:59:34 <garyk> MikeSpreitzer: that is certainly something that we need to discuss
15:59:47 <Yathi> we will not address the how it is discovered yet..
15:59:49 <garyk> i think that we are out of time. can we continue offline or next week?
15:59:55 <MikeSpreitzer> One direction would be to define some key:value pairs to use in host aggregates, use host aggregates to represent the structure of the datacenter
15:59:57 <Yathi> but represented and conveyed is something to tackle first
16:00:14 <MikeSpreitzer> I'll be watching the ML
16:00:20 <garyk> ok. great.
16:00:22 <garyk> thanks guys
16:00:25 <garyk> #endmeeting