15:00:17 #startmeeting scheduler
15:00:18 Meeting started Tue May 7 15:00:17 2013 UTC. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:21 The meeting name has been set to 'scheduler'
15:00:26 hi
15:00:32 hi
15:00:37 Show of hands, who's here for the scheduler meeting?
15:00:39 o/
15:00:44 ack
15:00:44 \o
15:00:46 +1
15:00:46 Phil Day
15:01:06 * glikson here
15:01:16 o/
15:01:20 here
15:01:41 I'm at a conference so may have to bail out early
15:01:48 Hello!
15:01:57 hi all :)
15:02:13 PhilDay, NP, I have a dentist appointment right after myself
15:02:22 Ouch
15:02:27 :-(
15:02:38 OK, let's begin
15:03:03 As I said in the email agenda, I want to go through all the items at least once before we circle back for more detail on the earlier ones
15:03:15 #topic whole host allocation capability
15:03:33 I don't claim to understand this one, is there anyone here to drive this?
15:03:42 I wrote up and filed the BP on that yesterday (based on the feedback from the summit)
15:03:52 https://blueprints.launchpad.net/nova/+spec/whole-host-allocation
15:04:38 PhilDay, cool, is the BP sufficient or are there issues you want to bring up now
15:04:43 This isn't really a scheduler change in the same sense that the others are - it's more akin to exposing host aggregates to users
15:04:44 I'm also interested in this one, though I don't have a lot to add to the BP right now
15:04:47 we can always study the BP and come back later.
15:05:49 That would be fine - I'm happy to field questions once you've had a chance to read it - probably best done on the mailing list
15:06:05 I've asked russell to target it for H3
15:06:11 +1
15:06:14 not hearing much discussion, so let's study the BP and respond on the mailing list or in another IRC meeting
15:06:24 works for me
15:06:26 +1
15:06:29 sounds good. The only thing that comes to mind right now is could we also achieve it with specialized flavors
15:06:48 but offline is fine
15:06:58 I think it's orthogonal to flavors
15:07:20 #topic coexistence of different schedulers
15:07:45 so, curious about this one, how is this different from the pluggable filters/weighers that we have now?
15:08:36 I think the idea is that you could have, for example, different stacks of filters for different users
15:09:07 or for different flavors
15:09:29 So if you build this on top of the whole host allocation (i.e. have a filter config specific to an aggregate) then you get another step towards private clouds within a cloud
15:09:37 implication being that schedulers are a runtime selection rather than a startup configuration issue
15:09:45 I'm posting the etherpad here for reference (https://etherpad.openstack.org/HavanaMultipleSchedulers)
15:10:11 multiple schedulers is a confusing name for this feature, in my opinion
15:10:31 yep, the idea is to have multiple configurations of FilterScheduler, or even different drivers
15:10:34 n0ano: you are absolutely right
15:10:38 configured for different host aggregates
15:10:58 glikson: maybe we should call it dynamically loaded scheduler/filtering
15:11:08 I'm concerned that this would be a `major` code change, has anyone looked at what it would take to implement this?
15:11:13 is it possible to give an admin user the ability to specify the filter list as a scheduler hint?
15:11:39 So do you see this as multiple schedulers running on different queues (i.e. a fairly static set) or something much more dynamic?
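A minimal sketch of the per-aggregate filter-stack idea described above (different filter configurations for different users, flavors, or host aggregates). Nothing here is an existing Nova interface; the aggregate names, the override table, and the helper are all hypothetical.

    # Hypothetical illustration only: pick a filter stack per host aggregate,
    # falling back to the globally configured default filters.
    DEFAULT_FILTERS = ['RetryFilter', 'AvailabilityZoneFilter',
                       'RamFilter', 'ComputeFilter']

    # Per-aggregate overrides, e.g. as they might be stored in aggregate
    # metadata (made-up keys and values).
    AGGREGATE_FILTER_STACKS = {
        'gpu-pool': ['RetryFilter', 'ComputeFilter', 'GpuFilter'],
        'acme-private': ['RetryFilter', 'AggregateInstanceExtraSpecsFilter'],
    }

    def filters_for_aggregate(aggregate_name):
        """Return the filter class names to run for hosts in this aggregate."""
        return AGGREGATE_FILTER_STACKS.get(aggregate_name, DEFAULT_FILTERS)

Whether that lookup happens once at startup or per request is essentially the static-versus-dynamic question the discussion picks up next.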
15:11:43 also, is there really a call for this feature or are we doing something just because we `can` do something
15:11:50 well, there are multiple options to implement it..
15:12:23 I guess I will defer it to next week -- will be more ready to elaborate on options and dilemmas
15:13:17 n0ano: sure, there are several use-cases involving pools with different hardware and/or workloads
15:13:18 I can see a number of use cases - would like to think about it being perhaps an aggregate-specific scheduler - since that seems like a good abstraction
15:13:45 Phil: yep
15:14:10 Should we also consider this as allowing combined bare metal & hypervisor systems - or is bare metal moving out of Nova now?
15:15:04 PhilDay, I would think that bare metal would be orthogonal to this, not sure there's an impact
15:15:23 PhilDay: could you elaborate a bit more on what you mean by "combined bare metal & hypervisor"?
15:15:34 n0ano: I suggest we postpone the discussion to next week -- hope to have the next level of design details by then
15:15:44 +1
15:16:26 OK, I think we've touched an area everyone's interested in, let's postpone discussion till we've all had a chance to study the etherpad
15:16:45 clearly a good area to explore, let's just do it later
15:16:55 #topic rack aware scheduling
15:17:46 i am not sure who proposed this. in quantum we would like to propose a network proximity api
15:17:48 this is another one of those - can this be handled by the current flavor / extra_specs mechanism
15:17:51 That was one of mine as well - I haven't done any more on this as I wasn't sure how it fitted in with some of the bigger schemes for defining sets of instances for scheduling
15:18:32 PhilDay: it could be one use case of group-scheduling
15:18:33 Not sure you can do it by flavors - you need to add information to each host about its physical / network locality
15:18:54 Right - group-scheduling would cover it.
15:19:07 PhilDay: so is it more about how to get the "rack" information from hosts?
15:19:31 PhilDay, note topic 1 (extending data in host state), add the locality to that and then use current filtering techniques on that new data
15:19:39 senhuang: PhilDay: does a host have metadata? if so the rack "id" could be stored and used...
15:19:40 What I was proposing going into the summit was something pretty simple - add a "rack" property to each host, and then write a filter to exploit that
15:20:19 garyk: okay. then we can add a policy for group-api that says "same-rack"
15:20:25 is it possible to use availability zones for that? one availability zone = 1 rack (1 datacenter = multiple availability zones)
15:20:27 hosts don't really have meta today - they have capabilities but that's more of a binary (like do I have a GPU)
15:20:40 senhuang: yes, sounds good
15:20:43 I wouldn't want to overlay AZ
15:20:52 that has a very specific meaning
15:21:03 PhilDay: ok, understood
15:21:11 PhilDay, ok, thanks
15:21:25 https://etherpad.openstack.org/HavanaNovaRackAwareScheduling
15:22:07 still sounds like something that can be handled by some of the proposals to make the scheduler more extensible
15:22:10 So we could still do something very simple that just covers my basic use case, but if group-scheduling is going to land in H then that would be a superset of my idea
15:22:36 we are working on another proposal to extend the notion of zones to also cover things like racks
15:23:02 PhilDay: glikson i think that the group scheduling can cover this.
15:23:06 glikson, do you have some links on that work?
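A rough sketch of the simple approach just described: a "rack" property on each host plus a filter that uses it. The base class and host_passes() signature follow the way Nova scheduler filters are written today, but the 'rack' attribute and the 'same_rack' hint are made up for illustration and would depend on the extended host-state work from topic 1.

    from nova.scheduler import filters

    class RackFilter(filters.BaseHostFilter):
        """Pass only hosts in the rack named by the 'same_rack' hint (sketch)."""

        def host_passes(self, host_state, filter_properties):
            hints = filter_properties.get('scheduler_hints') or {}
            wanted_rack = hints.get('same_rack')
            if not wanted_rack:
                # No hint given, nothing to enforce.
                return True
            # 'rack' is not a standard HostState attribute today; it would
            # have to come from the extended host-state data discussed earlier.
            return getattr(host_state, 'rack', None) == wanted_rack

If group-scheduling lands, the same behaviour could instead be expressed as a "same-rack" group policy, as noted above.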
15:23:07 n0ano: agreed - I suggest we shelve this for now and see how those other ideas pan out.
15:23:16 garyk: PhilDay: +1
15:23:17 i.e., allow a hierarchy of zones, for availability or other purposes, and surface it to the user
15:23:44 https://etherpad.openstack.org/HavanaTopologyAwarePlacement
15:23:52 PhilDay, then we'll let you monitor the other proposals and raise a flag if they don't provide the support you're looking for
15:23:59 glikson, thanks!
15:24:02 We can always revisit later if some of the bigger ideas don't come through. I could do with having basic rack aware scheduling by the end of H
15:24:43 OK, moving on
15:24:52 #topic list scheduler hints via API
15:25:26 anyone care to expand?
15:25:34 Wrote up the BP for this yesterday as well: https://blueprints.launchpad.net/nova/+spec/scheduler-hints-api
15:25:48 PhilDay, busy bee yesterday :-)
15:26:22 Basically it's about exposing the scheduler config to users so that they can tell which hints will be supported and which ignored (based on the config). At the moment it's just a black box
15:26:43 and hints for filters not configured would be silently ignored.
15:26:59 what exactly do you mean by `hint`
15:27:04 does this one somewhat overlap with the topology placement glikson just mentioned? or is it kind of a superset of it?
15:27:05 I'm in favor of it. Slightly related to this, I just thought it may be a good idea to be able to set policies on those hints
15:27:18 the scheduler_hints options passed into server create
15:27:44 Policies would be good - but that's a new topic I think
15:27:49 agreed
15:27:51 a new topic ;-(
15:28:02 (My typing is crap today)
15:28:08 if that's being passed into the scheduler by the user's create call wouldn't the user know this info already?
15:28:27 +1 for exposing supported hints.. n0ano it's not available right now
15:28:31 The user can't tell if the system they are talking to has that
15:28:41 filter configured or not.
15:29:13 aah, the user asks for it but the system may not provide it, in that case I'm in favor
15:29:24 Exactly
15:29:53 like you request vegetarian food at a party
15:29:58 although it is a hint and there's never a guarantee that a hint will be honored
15:30:04 the organizer may or may not provide it
15:30:28 rerngvit, but, again, you don't know that it's not available until you ask for it and don't get it.
15:30:56 yep, there also should be a special response for this.
15:31:00 by providing this API are we implicitly guaranteeing that `these hints will be honored'?
15:31:04 plus: you don't know whether you get it or not
15:31:18 If it's an affinity hint it can be hard to tell if it's being ignored or not - you might still get what you ask for but by chance
15:31:47 +1 to support an API to query scheduler hints
15:31:52 Most of the 'hints' are implemented as filters today, so you either get them or not
15:31:52 n0ano: I don't think so. But that raises the question of whether there should be another concept to encapsulate "fail if not honored"
15:32:05 PhilDay: have you thought about how to implement it?
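A minimal sketch of one possible shape for this, which the implementation discussion that follows also converges on: each configured filter and weigher declares the hints it understands, and the API extension just collects them. The describe_hints() method is hypothetical; no such method exists on Nova filters or weighers today.

    def list_supported_hints(filter_classes, weigher_classes):
        """Return {hint_name: description} for the configured filters/weighers."""
        supported = {}
        for cls in list(filter_classes) + list(weigher_classes):
            describe = getattr(cls, 'describe_hints', None)
            if describe is not None:
                # e.g. {'same_rack': 'Place the instance in the named rack.'}
                supported.update(describe())
        return supported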
15:32:11 alaski, indeed
15:32:27 if you want it to really be a hint then it needs to be in the weighting function,
15:33:08 PhilDay, good point, I see rampant confusion between what a filter is as opposed to a weight
15:33:10 PhilDay: then you need to expose what attributes are available for weighing
15:33:21 I was thinking of having the API rpc to the scheduler, and for each filter to have a method that returns details of its supported hints - so a simple iteration through the configured filters and weighers
15:33:25 PhilDay: e.g., would a driver just declare which hints it supports?
15:33:55 Since filters and weighers are subclasses it should be fairly easy to add it
15:34:31 agree. Seems a possible implementation to me.
15:35:00 yep, in FilterScheduler you could potentially delegate it to individual filters/weights..
15:35:05 Just becomes an API extension then
15:35:37 I think only the filter scheduler supports hints - I hadn't really thought about any of the others
15:36:06 PhilDay: would it "live" under "servers", as an action?
15:36:17 haven't looked at the issue but sounds like hints need to be extended to the weights also
15:36:57 Yep - I think both filters and weights would need to support the "describe_hints()" method, and the scheduler could iterate over both lists
15:37:16 I think so. It should be in both filters and weights.
15:37:24 +1
15:37:29 assuming this is some sort of new, inherited class that should be fairly simple
15:38:25 OK, my take is everyone seems to agree this is a good idea, mainly it's implementation details
15:38:34 glikson: not sure about a server action - isn't that for actions on a specific instance? This is more of a system capability thing - a bit like listing supported flavours
15:38:57 I asked Russell to mark the BP for H2
15:38:59 #agree
15:39:00 I wonder whether this has anything to do with the "help" command in the CLI.. essentially this is related to invocation syntax?
15:39:23 Should be exposed by the CLI for sure
15:40:19 moving on ?
15:40:33 PhilDay, took the words out of my keyboard :-)
15:40:47 #topic host directory service
15:41:21 I have no clue what this one is
15:41:48 PhilDay: i'll look over these blueprints and get them updated today
15:42:03 i am sorry but i need to leave. hopefully the groups/ensembles will be discussed. senhuang and glikson have the details. thanks
15:42:34 garyk, tnx, unlikely we'll get to those today, should cover them next week tho
15:43:09 no one wants to talk about host directory service?
15:43:51 OK, moving on
15:44:03 #topic future of the scheduler
15:44:14 This is mine
15:44:17 alaski, I think this is your issue, yes?
15:44:22 The start of it is https://blueprints.launchpad.net/nova/+spec/query-scheduler
15:44:56 Essentially modifying the scheduler to return hosts that can be scheduled to, rather than pass requests on itself
15:45:09 alaski: i like this idea
15:45:35 Is this related to the work that Josh from Y! was talking about - sort of orchestration within conductor?
15:45:35 it clarifies the workflow and separates the complexity of workflow management from resource placement
15:45:51 PhilDay: it is a part of that
15:46:30 given that I see the scheduler's job as finding appropriate hosts I don't care who actually sends the requests to those hosts, I have no problems with this idea
15:46:50 alaski: so, essentially you will have only "select_hosts" in the scheduler rpc API, and all the schedule_* will go to conductor, or something like that?
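A rough sketch of the query-scheduler flow being discussed: the scheduler only answers "which hosts fit", and the caller (conductor, as proposed) owns the cast to compute and any retries. The method and parameter names here are illustrative, not a settled RPC API; the rpc client objects are simply assumed to be passed in.

    def build_instance(context, scheduler_rpcapi, compute_rpcapi,
                       request_spec, filter_properties):
        # The scheduler returns candidate hosts instead of casting to compute
        # itself (the select_hosts call is the shape under discussion).
        hosts = scheduler_rpcapi.select_hosts(context, request_spec,
                                              filter_properties)
        # The caller (e.g. conductor) owns the workflow: pick a host, cast to
        # it, and fall back to the next candidate on failure.
        for host in hosts:
            try:
                compute_rpcapi.run_instance(context, host=host,
                                            request_spec=request_spec)
                return host
            except Exception:
                continue
        raise RuntimeError('No valid host found')

This keeps retry and orchestration logic out of the scheduler, matching the "information service on suitable hosts" framing in the discussion that follows.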
15:47:01 The reason I called this "future of the scheduler" is that long term I would like to discuss how we can handle cross-service scheduling, but I think there's plenty of work before that
15:47:17 glikson: yes, most likely conductor
15:47:22 alaski: we have a proposal covering part of that
15:47:22 glikson, hopefully, whoever `called` the scheduler is the one who `does` the scheduling
15:47:41 alaski: part of the cross-service scheduling
15:47:42 it will help the orchestration work
15:47:45 too
15:47:47 +1
15:48:15 So the scheduler basically becomes an information service on suitable hosts,
15:49:04 PhilDay: that's how I see it, and I think that eventually leads into cross-service stuff
15:49:05 some orchestrator/workflow manager asks/calls the scheduler for the suitable hosts for creation/resizing
15:49:43 alaski: +1. this should be helpful for the unified resource placement blueprint
15:50:14 Cool. It should really simplify retries, orchestration, and unify some code paths I think
15:50:24 I have one question though. If the actual scheduling moves somewhere other than the scheduler, how can the scheduler return an answer then?
15:50:29 I don't have a problem with the concept - I guess it needs that new central state management / orchestration bit to land first or at the same time
15:51:03 it would require the scheduler to query other services for state, is this correct? or did I misunderstand something?
15:51:11 PhilDay: the initial idea is to put this into conductor, and start building that up for state management
15:51:13 we already have a similar code path to this in the live-migrate retry logic inside the scheduler, I was thinking of moving it to conductor, it's kinda similar
15:51:23 alaski: +1
15:52:03 rerngvit: it doesn't really change how the scheduler works now, except that instead of then making the rpc call to the compute it returns that to conductor to make the call
15:52:23 rerngvit, not sure why the scheduler would have to query something, the hosts already send info to the scheduler, that wouldn't change.
15:53:00 Is there a general BP to cover creating the workflow piece in conductor (or heat or wherever it's going)
15:53:11 yep. something queries the scheduler for the selected host.
15:53:13 ok
15:53:30 PhilDay: Joshua from Y! has a blueprint on structured state management
15:53:34 PhilDay: yes, Josh posted on the mailing list
15:53:55 he restarted the orchestration meeting
15:54:07 seems like this needs to be part of that work then - I can't see it working on its own
15:54:32 PhilDay: I'm working with Josh a bit, he'll definitely be a part of it
15:54:32 it's related, for sure, I think the pull to conductor can be separate
15:54:49 I think the two are complementary really.. you can either move things to conductor as is and then reorganize, or the other way around..
15:55:03 it's the api -> conductor bit, then call out to scheduler and compute as required
15:55:16 I have to dash - but on #9 I wrote that BP up as well https://blueprints.launchpad.net/nova/+spec/network-bandwidth-entitlement
15:55:48 what might be a bit more tricky (for both) is the option when there is no real nova-conductor service..
15:56:07 well, I have to dash also, let's close here and pick up next week (moving to a new house, hopefully I'll have internet access then)
15:56:07 We have some code for this, but it needs a bit of work to re-base it etc. It's pretty much the peer of the cpu_entitlement BP
15:56:21 Ok, Bye
15:56:27 next time, i suggest we pick up from #9
15:56:30 glikson: indeed, it means the api gets a bit heavy from the local calls, but maybe that's acceptable
15:56:31 good meeting guys
15:56:33 ok see you then.
15:56:35 tnx everyone
15:56:39 #endmeeting