15:00:17 #startmeeting scheduler
15:00:18 Meeting started Tue May 7 15:00:17 2013 UTC. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:21 The meeting name has been set to 'scheduler'
15:00:26 hi
15:00:32 hi
15:00:37 Show of hands, who's here for the scheduler meeting?
15:00:39 o/
15:00:44 ack
15:00:44 \o
15:00:46 +1
15:00:46 Phil Day
15:01:06 * glikson here
15:01:16 o/
15:01:20 here
15:01:41 I'm at a conference so may have to bail out early
15:01:48 Hello!
15:01:57 hi all :)
15:02:13 PhilDay, NP, I have a dentist appointment right after myself
15:02:22 Ouch
15:02:27 :-(
15:02:38 OK, let's begin
15:03:03 As I said in the email agenda, I want to go through all the items at least once before we circle back for more detail on the earlier ones
15:03:15 #topic whole host allocation capability
15:03:33 I don't claim to understand this one, is there anyone here to drive this?
15:03:42 I wrote up and filed the BP on that yesterday (based on the feedback from the summit)
15:03:52 https://blueprints.launchpad.net/nova/+spec/whole-host-allocation
15:04:38 PhilDay, cool, is the BP sufficient or are there issues you want to bring up now
15:04:43 This isn't really a scheduler change in the same sense that the others are - it's more akin to exposing host aggregates to users
15:04:44 I'm also interested in this one, though I don't have a lot to add to the BP right now
15:04:47 we can always study the BP and come back later.
15:05:49 That would be fine - I'm happy to field questions once you've had a chance to read it - probably best done on the mailing list
15:06:05 I've asked russell to target it for H3
15:06:11 +1
15:06:14 not hearing much discussion, so let's study the BP and respond on the mailing list or in another IRC meeting
15:06:24 works for me
15:06:26 +1
15:06:29 sounds good. The only thing that comes to mind right now is could we also achieve it with specialized flavors
15:06:48 but offline is fine
15:06:58 I think it's orthogonal to flavors
15:07:20 #topic coexistence of different schedulers
15:07:45 so, curious about this one, how is this different from the pluggable filters/weighers that we have now?
15:08:36 I think the idea is that you could have, for example, different stacks of filters for different users
15:09:07 or for different flavors
15:09:29 So if you build this on top of the whole host allocation (i.e. have a filter config specific to an aggregate) then you get another step towards private clouds within a cloud
15:09:37 implication being that schedulers are a runtime selection rather than a startup configuration issue
15:09:45 I'm posting the etherpad here for reference (https://etherpad.openstack.org/HavanaMultipleSchedulers)
15:10:11 multiple schedulers is a confusing name for this feature, in my opinion
15:10:31 yep, the idea is to have multiple configurations of FilterScheduler, or even different drivers
15:10:34 n0ano: you are absolutely right
15:10:38 configured for different host aggregates
15:10:58 glikson: maybe we should call it dynamically loaded scheduler/filtering
15:11:08 I'm concerned that this would be a `major` code change, has anyone looked at what it would take to implement this?
15:11:13 is it possible to give an admin user the ability to specify the filter list as a scheduler hint?
15:11:39 So do you see this as multiple schedulers running on different queues (i.e. a fairly static set) or something much more dynamic?
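A minimal sketch of the per-aggregate filter-stack idea described above (different filter configurations for different users, flavors, or host aggregates). Nothing here is an existing Nova interface; the aggregate names, the override table, and the helper are all hypothetical.

    # Hypothetical illustration only: pick a filter stack per host aggregate,
    # falling back to the globally configured default filters.
    DEFAULT_FILTERS = ['RetryFilter', 'AvailabilityZoneFilter',
                       'RamFilter', 'ComputeFilter']

    # Per-aggregate overrides, e.g. as they might be stored in aggregate
    # metadata (made-up keys and values).
    AGGREGATE_FILTER_STACKS = {
        'gpu-pool': ['RetryFilter', 'ComputeFilter', 'GpuFilter'],
        'acme-private': ['RetryFilter', 'AggregateInstanceExtraSpecsFilter'],
    }

    def filters_for_aggregate(aggregate_name):
        """Return the filter class names to run for hosts in this aggregate."""
        return AGGREGATE_FILTER_STACKS.get(aggregate_name, DEFAULT_FILTERS)

Whether that lookup happens once at startup or per request is essentially the static-versus-dynamic question the discussion picks up next.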
15:11:43 also, is there really a call for this feature or are we doing something just because we `can` do something
15:11:50 well, there are multiple options to implement it..
15:12:23 I guess I will defer it to next week -- will be more ready to elaborate on options and dilemmas
15:13:17 n0ano: sure, there are several use-cases involving pools with different hardware and/or workloads
15:13:18 I can see a number of use cases - would like to think about it being perhaps an aggregate-specific scheduler - since that seems like a good abstraction
15:13:45 Phil: yep
15:14:10 Should we also consider this as allowing combined bare metal & hypervisor systems - or is bare metal moving out of Nova now?
15:15:04 PhilDay, I would think that bare metal would be orthogonal to this, not sure there's an impact
15:15:23 PhilDay: could you elaborate a bit more on what you mean by "combined bare metal & hypervisor"?
15:15:34 n0ano: I suggest we postpone the discussion to next week -- hope to have the next level of design details by then
15:15:44 +1
15:16:26 OK, I think we've touched an area everyone's interested in, let's postpone discussion till we've all had a chance to study the etherpad
15:16:45 clearly a good area to explore, let's just do it later
15:16:55 #topic rack aware scheduling
15:17:46 i am not sure who proposed this. in quantum we would like to propose a network proximity api
15:17:48 this is another one of those - can this be handled by the current flavor / extra_specs mechanism
15:17:51 That was one of mine as well - I haven't done any more on this as I wasn't sure how it fitted in with some of the bigger schemes for defining sets of instances for scheduling
15:18:32 PhilDay: it could be one use case of group-scheduling
15:18:33 Not sure you can do it by flavors - you need to add information to each host about its physical / network locality
15:18:54 Right - group-scheduling would cover it.
15:19:07 PhilDay: so is it more about how to get the "rack" information from hosts?
15:19:31 PhilDay, note topic 1 (extending data in host state), add the locality to that and then use current filtering techniques on that new data
15:19:39 senhuang: PhilDay: does a host have metadata? if so the rack "id" could be stored and used...
15:19:40 What I was proposing going into the summit was something pretty simple - add a "rack" property to each host, and then write a filter to exploit that
15:20:19 garyk: okay. then we can add a policy for group-api that says "same-rack"
15:20:25 is it possible to use availability zones for that? one availability zone = 1 rack (1 datacenter = multiple availability zones)
15:20:27 hosts don't really have meta today - they have capabilities but that's more of a binary (like do I have a GPU)
15:20:40 senhuang: yes, sounds good
15:20:43 I wouldn't want to overlay AZ
15:20:52 that has a very specific meaning
15:21:03 PhilDay: ok, understood
15:21:11 PhilDay, ok, thanks
15:21:25 https://etherpad.openstack.org/HavanaNovaRackAwareScheduling
15:22:07 still sounds like something that can be handled by some of the proposals to make the scheduler more extensible
15:22:10 So we could still do something very simple that just covers my basic use case, but if group-scheduling is going to land in H then that would be a superset of my idea
15:22:36 we are working on another proposal to extend the notion of zones to also cover things like racks
15:23:02 PhilDay: glikson i think that the group scheduling can cover this.
15:23:06 glikson, do you have some links on that work?
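A rough sketch of the simple approach just described: a "rack" property on each host plus a filter that uses it. The base class and host_passes() signature follow the way Nova scheduler filters are written today, but the 'rack' attribute and the 'same_rack' hint are made up for illustration and would depend on the extended host-state work from topic 1.

    from nova.scheduler import filters

    class RackFilter(filters.BaseHostFilter):
        """Pass only hosts in the rack named by the 'same_rack' hint (sketch)."""

        def host_passes(self, host_state, filter_properties):
            hints = filter_properties.get('scheduler_hints') or {}
            wanted_rack = hints.get('same_rack')
            if not wanted_rack:
                # No hint given, nothing to enforce.
                return True
            # 'rack' is not a standard HostState attribute today; it would
            # have to come from the extended host-state data discussed earlier.
            return getattr(host_state, 'rack', None) == wanted_rack

If group-scheduling lands, the same behaviour could instead be expressed as a "same-rack" group policy, as noted above.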
15:23:07 n0ano: agreed - I suggest we shelve this for now and see how those other ideas pan out.
15:23:16 garyk: PhilDay: +1
15:23:17 i.e., allow a hierarchy of zones, for availability or other purposes, and surface it to the user
15:23:44 https://etherpad.openstack.org/HavanaTopologyAwarePlacement
15:23:52 PhilDay, then we'll let you monitor the other proposals and raise a flag if they don't provide the support you're looking for
15:23:59 glikson, thanks!
15:24:02 We can always revisit later if some of the bigger ideas don't come through. I could do with having basic rack aware scheduling by the end of H
15:24:43 OK, moving on
15:24:52 #topic list scheduler hints via API
15:25:26 anyone care to expand?
15:25:34 Wrote up the BP for this yesterday as well: https://blueprints.launchpad.net/nova/+spec/scheduler-hints-api
15:25:48 PhilDay, busy bee yesterday :-)
15:26:22 Basically it's about exposing the scheduler config to users so that they can tell which hints will be supported and which ignored (based on the config). At the moment it's just a black box
15:26:43 and hints for filters not configured would be silently ignored.
15:26:59 what exactly do you mean by `hint`
15:27:04 does this one somewhat overlap with the topology placement glikson just mentioned? or is it kind of a superset of it?
15:27:05 I'm in favor of it. Slightly related to this, I just thought it may be a good idea to be able to set policies on those hints
15:27:18 the scheduler_hints options passed into server create
15:27:44 Policies would be good - but that's a new topic I think
15:27:49 agreed
15:27:51 a new topic ;-(
15:28:02 (My typing is crap today)
15:28:08 if that's being passed into the scheduler by the user's create call wouldn't the user know this info already?
15:28:27 +1 for exposing supported hints.. n0ano it's not available right now
15:28:31 The user can't tell if the system they are talking to has that
15:28:41 filter configured or not.
15:29:13 aah, the user asks for it but the system may not provide it, in that case I'm in favor
15:29:24 Exactly
15:29:53 like you request vegetarian food at a party
15:29:58 although it is a hint and there's never a guarantee that a hint will be honored
15:30:04 the organizer may or may not provide it
15:30:28 rerngvit, but, again, you don't know that it's not available until you ask for it and don't get it.
15:30:56 yep, there also should be a special response for this.
15:31:00 by providing this API are we implicitly guaranteeing that `these hints will be honored'?
15:31:04 plus: you don't know whether you get it or not
15:31:18 If it's an affinity hint it can be hard to tell if it's being ignored or not - you might still get what you ask for but by chance
15:31:47 +1 to support an API to query scheduler hints
15:31:52 Most of the 'hints' are implemented as filters today, so you either get them or not
15:31:52 n0ano: I don't think so. But that raises the question of whether there should be another concept to encapsulate "fail if not honored"
15:32:05 PhilDay: have you thought about how to implement it?
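A minimal sketch of one possible shape for this, which the implementation discussion that follows also converges on: each configured filter and weigher declares the hints it understands, and the API extension just collects them. The describe_hints() method is hypothetical; no such method exists on Nova filters or weighers today.

    def list_supported_hints(filter_classes, weigher_classes):
        """Return {hint_name: description} for the configured filters/weighers."""
        supported = {}
        for cls in list(filter_classes) + list(weigher_classes):
            describe = getattr(cls, 'describe_hints', None)
            if describe is not None:
                # e.g. {'same_rack': 'Place the instance in the named rack.'}
                supported.update(describe())
        return supported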
15:32:11 alaski, indeed
15:32:27 if you want it to really be a hint then it needs to be in the weighting function,
15:33:08 PhilDay, good point, I see rampant confusion between what a filter is as opposed to a weight
15:33:10 PhilDay: then you need to expose what attributes are available for weighing
15:33:21 I was thinking of having the API rpc to the scheduler, and for each filter to have a method that returns details of its supported hints - so a simple iteration through the configured filters and weighers
15:33:25 PhilDay: e.g., would a driver just declare which hints it supports?
15:33:55 Since filters and weighers are subclasses it should be fairly easy to add it
15:34:31 agree. Seems a possible implementation to me.
15:35:00 yep, in FilterScheduler you could potentially delegate it to individual filters/weights..
15:35:05 Just becomes an API extension then
15:35:37 I think only the filter scheduler supports hints - I hadn't really thought about any of the others
15:36:06 PhilDay: would it "live" under "servers", as an action?
15:36:17 haven't looked at the issue but sounds like hints need to be extended to the weights also
15:36:57 Yep - I think both filters and weights would need to support the "describe_hints()" method, and the scheduler could iterate over both lists
15:37:16 I think so. It should be in both filters and weights.
15:37:24 +1
15:37:29 assuming this is some sort of new, inherited class that should be fairly simple
15:38:25 OK, my take is everyone seems to agree this is a good idea, mainly it's implementation details
15:38:34 glikson: not sure about a server action - isn't that for actions on a specific instance? This is more of a system capability thing - a bit like listing supported flavours
15:38:57 I asked Russell to mark the BP for H2
15:38:59 #agree
15:39:00 I wonder whether this has anything to do with the "help" command in the CLI.. essentially this is related to invocation syntax?
15:39:23 Should be exposed by the CLI for sure
15:40:19 moving on ?
15:40:33 PhilDay, took the words out of my keyboard :-)
15:40:47 #topic host directory service
15:41:21 I have no clue what this one is
15:41:48 PhilDay: i'll look over these blueprints and get them updated today
15:42:03 i am sorry but i need to leave. hopefully the groups/ensembles will be discussed. senhuang and glikson have the details. thanks
15:42:34 garyk, tnx, unlikely we'll get to those today, should cover them next week tho
15:43:09 no one wants to talk about host directory service?
15:43:51 OK, moving on
15:44:03 #topic future of the scheduler
15:44:14 This is mine
15:44:17 alaski, I think this is your issue, yes?
15:44:22 The start of it is https://blueprints.launchpad.net/nova/+spec/query-scheduler
15:44:56 Essentially modifying the scheduler to return hosts that can be scheduled to, rather than pass requests on itself
15:45:09 alaski: i like this idea
15:45:35 Is this related to the work that Josh from Y! was talking about - sort of orchestration within conductor?
15:45:35 it clarifies the workflow and separates the complexity of workflow management from resource placement
15:45:51 PhilDay: it is a part of that
15:46:30 given that I see the scheduler's job as finding appropriate hosts I don't care who actually sends the requests to those hosts, I have no problems with this idea
15:46:50 alaski: so, essentially you will have only "select_hosts" in the scheduler rpc API, and all the schedule_* will go to conductor, or something like that?
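A rough sketch of the query-scheduler flow being discussed: the scheduler only answers "which hosts fit", and the caller (conductor, as proposed) owns the cast to compute and any retries. The method and parameter names here are illustrative, not a settled RPC API; the rpc client objects are simply assumed to be passed in.

    def build_instance(context, scheduler_rpcapi, compute_rpcapi,
                       request_spec, filter_properties):
        # The scheduler returns candidate hosts instead of casting to compute
        # itself (the select_hosts call is the shape under discussion).
        hosts = scheduler_rpcapi.select_hosts(context, request_spec,
                                              filter_properties)
        # The caller (e.g. conductor) owns the workflow: pick a host, cast to
        # it, and fall back to the next candidate on failure.
        for host in hosts:
            try:
                compute_rpcapi.run_instance(context, host=host,
                                            request_spec=request_spec)
                return host
            except Exception:
                continue
        raise RuntimeError('No valid host found')

This keeps retry and orchestration logic out of the scheduler, matching the "information service on suitable hosts" framing in the discussion that follows.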
15:47:01 The reason I called this "future of the scheduler" is that long term I would like to discuss how we can handle cross-service scheduling, but I think there's plenty of work before that
15:47:17 glikson: yes, most likely conductor
15:47:22 alaski: we have a proposal covering part of that
15:47:22 glikson, hopefully, whoever `called` the scheduler is the one who `does` the scheduling
15:47:41 alaski: part of the cross-service scheduling
15:47:42 it will help the orchestration work
15:47:45 too
15:47:47 +1
15:48:15 So the scheduler basically becomes an information service on suitable hosts,
15:49:04 PhilDay: that's how I see it, and I think that eventually leads into cross-service stuff
15:49:05 some orchestrator/workflow manager asks/calls the scheduler for the suitable hosts for creation/resizing
15:49:43 alaski: +1. this should be helpful for the unified resource placement blueprint
15:50:14 Cool. It should really simplify retries, orchestration, and unify some code paths I think
15:50:24 I have one question though. If the actual scheduling moves somewhere other than the scheduler, how can the scheduler return an answer then?
15:50:29 I don't have a problem with the concept - I guess it needs that new central state management / orchestration bit to land first or at the same time
15:51:03 it would require the scheduler to query other services for state, is this correct? or did I misunderstand something?
15:51:11 PhilDay: the initial idea is to put this into conductor, and start building that up for state management
15:51:13 we already have a similar code path to this in the live-migrate retry logic inside the scheduler, I was thinking of moving it to conductor, it's kinda similar
15:51:23 alaski: +1
15:52:03 rerngvit: it doesn't really change how the scheduler works now, except that instead of then making the rpc call to the compute it returns that to conductor to make the call
15:52:23 rerngvit, not sure why the scheduler would have to query something, the hosts already send info to the scheduler, that wouldn't change.
15:53:00 Is there a general BP to cover creating the workflow piece in conductor (or heat or wherever it's going)
15:53:11 yep. something queries the scheduler for the selected host.
15:53:13 ok
15:53:30 PhilDay: Joshua from Y! has a blueprint on structured state management
15:53:34 PhilDay: yes, Josh posted on the mailing list
15:53:55 he restarted the orchestration meeting
15:54:07 seems like this needs to be part of that work then - I can't see it working on its own
15:54:32 PhilDay: I'm working with Josh a bit, he'll definitely be a part of it
15:54:32 it's related, for sure, I think the pull to conductor can be separate
15:54:49 I think the two are complementary really.. you can either move things to conductor as is and then reorganize, or the other way around..
15:55:03 it's the api -> conductor bit, then call out to scheduler and compute as required
15:55:16 I have to dash - but on #9 I wrote that BP up as well https://blueprints.launchpad.net/nova/+spec/network-bandwidth-entitlement
15:55:48 what might be a bit more tricky (for both) is the option when there is no real nova-conductor service..
15:56:07 well, I have to dash also, let's close here and pick up next week (moving to a new house, hopefully I'll have internet access then)
15:56:07 We have some code for this, but it needs a bit of work to re-base it etc. It's pretty much the peer of the cpu_entitlement BP
15:56:21 Ok, Bye
15:56:27 next time, i suggest we pick up from #9
15:56:30 glikson: indeed, it means the api gets a bit heavy from the local calls, but maybe that's acceptable
15:56:31 good meeting guys
15:56:33 ok see you then.
15:56:35 tnx everyone
15:56:39 #endmeeting