15:04:57 #startmeeting Scheduler
15:04:58 Meeting started Tue Oct 8 15:04:57 2013 UTC and is due to finish in 60 minutes. The chair is garyk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:04:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:05:01 The meeting name has been set to 'scheduler'
15:05:08 BTW, same mistake was made several weeks in the past; can we add a pointer at https://wiki.openstack.org/wiki/Meetings/Scheduler ?
15:05:13 hopefully that will be better (sorry for missing it)
15:05:29 Otherwise people will not find typescripts from those meetings.
15:05:59 i am not really sure about how to do that. on the mail list i have sent links to the summary of the meetings. maybe that will help
15:06:29 so over the last week Yathi, Mike and Debu have been working on https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA/edit?usp=sharing
15:06:29 hi - sorry I'm late
15:06:31 It's better than nothing, but people should not have to search the mail archive to find pointers that belong in the wiki
15:06:44 MikeSpreitzer: agreed. my bad
15:07:02 OK, 'nuff said here
15:07:24 On with the interesting discussion!
15:07:24 do people want to discuss the above and the ideas proposed
15:07:50 As you see in the ML, I do. Is Sylvia here?
15:07:51 in addition to this i think that we need to make sure that russellb is aware of the etherpad with the suggested summit talks
15:07:53 I shared a document yesterday on the instance group model
15:07:58 #link https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA/edit?usp=sharing
15:08:10 I am not sure if you had a chance to look at it yet
15:08:15 I'm sorry, I meant Sylvain
15:08:45 Yathi: its the same one - no idea why the link is different
15:08:46 I did e-mail Russell to let him know about the etherpad, and he liked the idea of us trying to pre-consolidate the sessions
15:08:59 PhilD: thanks!
15:09:21 it would just be great if we can follow up and see if it is on his radar
15:09:25 We currently have 7 sessions on the pad which I suspect may be too many - so maybe a good topic for next week would be to see if we can consolidate some
15:09:43 Where is that pad?
15:09:46 garyk: this is the new document I wrote yesterday with some updates to the model so I called it Version 2.0 of the proposal
15:09:59 Yathi: thanks
15:10:06 I'll ping him again to see if we can get a quota of scheduler sessions ;-)
15:10:12 ok, should we discuss the doc and then proceed to discussing consolidating sessions
15:10:15 Updates are based on what we discussed last week about the API
15:10:21 PhilD: thanks
15:10:38 #topic Instance groups model and extension
15:10:49 So I started some discussion in the ML; last is a response from Sylvain.
15:10:59 Yathi, would you or MikeSpreitzer like to explain for those who did not get a chance to read
15:11:13 (deferring to author)
15:11:23 ok thanks
15:11:51 here is the link again
15:11:54 #link https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA/edit?usp=sharing
15:12:15 I will provide a brief summary now
15:12:49 InstanceGroup =
15:13:01 sorry folks, joined late!
15:13:13 IGMember can be an InstanceGroup itself, or an instance
15:13:14 debo_os: np. Yathi is explaining the doc in brief
15:13:24 Here in the Nova context, an instance refers to a VM
15:13:37 but in the generic sense it refers to any virtual resource
15:13:55 an IGPolicy applies to either an InstanceGroup as a whole or just an IGMemberConnection
15:13:59 so that could be a disk/network/..
15:14:20 garyk: yeah that is the idea
15:14:37 and an IGMemberConnection is an edge between two nodes i.e. two IGMembers
15:15:17 also, IGPolicy refers to a named policy, which is implemented separately in such a way that is understandable by the engine that does the scheduling, etc
15:15:25 outside the scope of this API doc
15:15:48 IGMetadata - is a key,value pair dictionary to capture any additional metadata
15:16:04 I guess this captures the idea we discussed last week
15:16:27 Any thoughts ?
15:16:47 I have some questions about the class diagram, and the interaction pattern with the client
15:17:39 Why one black diamond, not all black?
15:18:43 MikeSpreitzer: Okay good point.. something yet to think about.. this was based on the idea of the lifecycle of an InstanceGroup
15:19:05 of course....
15:19:16 if you delete an InstanceGroup, an IGPolicy can still stay - as it could apply to another InstanceGroup
15:19:22 I guess that was my thinking
15:19:47 An InstanceGroupPolicy is just a reference to a policy, and it is a reference in service of one particular InstanceGroup
15:20:24 So if the InstanceGroup goes away, there is no need to keep that reference
15:20:24 At the moment when the instance group is deleted we mark the policy for the group as deleted. it is currently not reusable
15:20:38 Well I considered policy as a first class citizen
15:20:51 so one policy could apply to multiple InstanceGroups
15:20:56 I think your proposal separates policy definitions and policy references...
15:20:58 it is a reference to a named policy
15:21:11 Policy definitions are out of scope, policy references are in scope and each serves just one group
15:21:41 Or maybe I misunderstand the diagram..
15:21:54 do you propose multiple groups will contain the exact same reference object?
15:22:39 No, I think not
15:22:48 Each reference is part of just one group
15:22:56 Mike: for now if we provide references aren't they good enough for an abstraction
15:23:23 debo_os: not sure I understand your point.
15:23:41 I'm just saying each reference is part of just one group, so should be deleted with the group
15:24:10 since a policy could be referenced just by a name/reference ....
15:24:13 (wow, I thought this would be a quick and easy point, put off the bigger stone to later)
15:24:25 I agree this is a reference in the InstanceGroup context
15:24:30 deleting the policy could be part of the implementation for now
15:24:38 but why can't this reference be re-used..
15:24:48 I agree with debo_os this is implementation specific
15:24:51 No, I am not proposing to delete the policy with a referencing group, just the reference
15:24:56 at the moment the reference is the id of the instance group
15:25:02 we will need to redesign that
15:25:29 garyk: you nailed it
15:25:33 my thoughts are clear now
15:25:42 yes they can all be black diamonds :)
15:26:02 OK, great, let's move on.
15:26:19 Why are there integer IDs everywhere? Do not the UUIDs suffice?
15:26:43 MikeSpreitzer: the keys in the db were id's - that is the pattern in nova
15:26:49 integer ID is from the DB id
15:27:04 So is this ID unique within the whole DB?
15:27:25 yes, that is, for the instance groups
15:27:36 Why are there also UUIDs then?
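To make the model summary above easier to follow, here is a minimal, non-authoritative Python sketch assembled from the discussion. Class and attribute names are paraphrased from the conversation (IGMember, IGPolicy, IGMemberConnection, IGMetadata written out in full); the google doc, not this sketch, is the authority on the actual model.

class InstanceGroupPolicy(object):
    """Reference to a named policy, e.g. 'anti-affinity'.

    The policy definition itself is out of scope; the engine that does
    the scheduling is expected to understand the name."""
    def __init__(self, name):
        self.name = name


class InstanceGroupMember(object):
    """A member is either a nested InstanceGroup or any virtual resource
    (VM, disk, network, ...) identified by its uuid."""
    def __init__(self, uuid=None, group=None):
        assert (uuid is None) != (group is None), "exactly one of uuid/group"
        self.uuid = uuid
        self.group = group


class InstanceGroupMemberConnection(object):
    """An edge between two members; policies may also apply to an edge."""
    def __init__(self, member_a, member_b, policies=None):
        self.member_a = member_a
        self.member_b = member_b
        self.policies = policies or []


class InstanceGroup(object):
    """A group of members plus group-wide policy references and metadata."""
    def __init__(self, name):
        self.name = name
        self.members = []      # InstanceGroupMember objects
        self.connections = []  # InstanceGroupMemberConnection objects
        self.policies = []     # group-wide InstanceGroupPolicy references
        self.metadata = {}     # IGMetadata: free-form key/value pairs

The black-diamond question above then amounts to: when an InstanceGroup is deleted, its InstanceGroupPolicy and InstanceGroupMemberConnection references go with it, while the named policy definitions they point to live on.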
15:27:48 all of the access is actually done via uuids
15:28:06 give me a sec and i'll paste where this is done
15:28:20 I'm confused. Integers are the keys but access is via UUID?
15:29:33 MikeSpreitzer: https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L5686 (this is an implementation detail). i suggest that we take it offline
15:29:43 OK
15:29:45 unless you think that it is something blocking the discussion at the moment
15:29:52 no, let's go on
15:30:18 PhilD: alaski: did you guys manage to read the doc? what do you think?
15:30:20 ok.. so do you agree with the updated model ? any votes ?
15:30:50 Yathi: i personally like it.
15:30:56 So I think Yathi's proposal has a client go through three phases: (1) create resources without scheduling, (2) present groups and policies to service for joint scheduling, then (3) activate resources. Right?
15:31:45 garyk: yathi: I think it's a gr8 start ...and abstract yet enough to fold into nova for now
15:31:55 yeah that is the model here with the instance group proposal..
15:32:01 at the moment (well, the code that did not get through) it was as follows:
15:32:13 1. the user would create an instance group
15:32:21 2. the user assigns a policy
15:32:23 garyk: I've read the doc but haven't had time to digest it yet, too many other things going on. But it seems that y'all are hashing it out pretty well so I'll defer to you for now
15:32:34 mike: your 3 point summary looks accurate to me
15:32:50 3. when deploying the instance a hint is passed with the group id
15:33:17 4. the scheduler updates the instance group for future use
15:33:54 it is very rudimentary and only works with anti-affinity. we have yet to address other complex configs
15:33:57 So I think we can make a simpler API, with just two phases.
15:34:11 Is there a phase 0 to determine if resources exist a priori?
15:34:38 In the first phase the client presents groups and policies and resources, the service creates resources and does joint scheduling; second phase, the client activates the resources.
15:34:52 phase 1 of registering an instance group could possibly include existing resources isn't it
15:35:44 We want UPDATE as well as CREATE, and we need the ability to relate to pre-existing external (to the group) things.
15:36:05 But no phase 0 needed, those functionalities are part of main line.
15:36:45 mike: update is definitely needed, but the semantics of the update need to be worked out for the general case - topology
15:36:50 Mike: while creating a group, you provide the member uuids - this member could potentially be already present.. I guess we understand this ?
15:37:39 For CREATE, you do not put existing things in the new group; for UPDATE, you can refer to group members that were created previously.
15:37:56 One thing I'm curious about is how to start an instance group. So far I've seen policies like anti-affinity or affinity but they all rely on something already in place. Do instance groups not cover placement decisions except in relation to other resources?
15:38:22 That's all that I have proposed.
15:38:37 In the private cloud you could go further, and allow references to physical things
15:38:37 alaski: that is something that we would like to address.
15:39:08 MikeSpreitzer: private cloud, or admin facing API
15:39:11 garyk: cool
15:39:14 BTW, affinity does not require sequencing. We can make a joint decision about how to place two things with affinity.
15:39:29 MikeSpreitzer: fair point
15:39:32 at the moment we are trying to get the basic primitives in. once we have them then we can start to build the models that Mike mentions (or that is my take at the moment)
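For context, garyk's four-step flow above would look roughly like the hypothetical client sketch below. The /os-instance-groups paths, payload fields and helper step 2 are invented placeholders for the proposed extension (nothing here is a committed API); only the 'os:scheduler_hints' group hint on server create reflects the mechanism being prototyped, and the endpoint, token and IDs are placeholders as well.

import requests

NOVA = 'http://nova.example.com:8774/v2/TENANT_ID'   # placeholder endpoint
HEADERS = {'X-Auth-Token': 'TOKEN', 'Content-Type': 'application/json'}

# 1. the user creates an instance group (hypothetical extension call)
resp = requests.post(NOVA + '/os-instance-groups',
                     json={'instance_group': {'name': 'web-tier'}},
                     headers=HEADERS)
group_uuid = resp.json()['instance_group']['uuid']

# 2. the user assigns a (named) policy to the group (hypothetical call)
requests.post(NOVA + '/os-instance-groups/%s/policies' % group_uuid,
              json={'policy': 'anti-affinity'}, headers=HEADERS)

# 3. when booting, a scheduler hint carries the group id so the
#    (anti-affinity) filter can look at previously placed group members
requests.post(NOVA + '/servers',
              json={'server': {'name': 'web-1',
                               'imageRef': 'IMAGE_ID',
                               'flavorRef': 'FLAVOR_ID'},
                    'os:scheduler_hints': {'group': group_uuid}},
              headers=HEADERS)

# 4. (server side) the scheduler records its placement decision back onto
#    the group so that later boots with the same hint honour the policy

Mike's two-phase alternative would instead hand the groups, policies and resource definitions to the service in one call for joint scheduling, with a second call to activate the resources.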
15:39:49 InstanceGroup is the starting point.. the decisions are made as a whole in the next step
15:39:49 +1
15:40:23 Would love to deal with sequencing later ... also not sure if we should leave that to the implementation (plugin/extension)
15:40:28 policies give guidelines for making these decisions
15:40:45 OK, I thought we wanted to start with making joint decisions
15:41:07 mike: we do once we have absorbed the basic primitives (within nova say) ...
15:41:13 I just want to understand if the intention is to cover something like references to physical things, or if that will always be out of scope.
15:41:21 OK, I'm adjusting my thinking...
15:41:35 alaski: virtual or physical should both be covered with the abstraction
15:41:42 say you have baremetal ...
15:41:49 alaski: I think that in the public cloud, the user does not know physical things and the admin does not know the user's input
15:41:49 it's still a property of a node
15:42:34 alaski: so neither can state affinity between virtual resource and physical. This is not about baremetal...
15:42:45 mike: one can provide the hint that the tenant needs a physical but then it's for later :)
15:42:48 MikeSpreitzer: right, our use case is more for admin/qe testing of new cells and how they can place builds there. Just curious if instance groups will help with that
15:42:50 For baremetal, just think of it as a very thin hypervisor
15:42:55 drilling more into alaski's question .. presumably, one needs to give ids to preexisting physical resources
15:43:00 How are these discovered?
15:44:22 alaski: you can group them up and assign a policy that it needs a physical
15:44:25 alaski: can you elaborate on "admin/qe testing of new cells" — I am not sure I understand.
15:45:05 MikeSpreitzer: add a new cell to the infrastructure and have admins or qe send builds there before public users can
15:45:22 Ah, thanks
15:45:44 alaski: cannot you do that with availability zones?
15:45:45 New to me, but thinking on my feet, I think that could be handled by affinity to physical stuff.
15:46:29 we are doing such a thing with availability zones, aggregates and flavor metadata
15:46:35 aloga: potentially, though cells don't work with AZs right now
15:46:49 i think that exposing anything about the underlying resources to the end users is a bad idea
15:46:52 that needs to get sorted out anyway
15:47:15 MikeSpreitzer: agreed. I'm still not convinced it's the best solution to the issue though.
15:47:19 right, normal end users would not see physical stuff, but I think admins are already different in many APIs
15:47:37 But it sounds like it's not part of the instance groups scope, at least not for the moment.
15:47:41 MikeSpreitzer: yeah, I meant plain users
15:47:45 not admin-api
15:47:48 alaski: maybe we should have an admin API which should decide whether to allow for hints like physical
15:47:49 ?
15:48:06 debo_os: that's what I'm wondering. I think it would be helpful
15:48:08 More like an authorization issue, I think, no need for a different API
15:48:25 that works too
15:48:35 I just didn't know if it fit the model that's being discussed
15:48:35 I mean, syntax is the same, just different references allowed
15:48:53 alaski: I think we should eventually add an admin API too to address some of these ....
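PhilD's remark at 15:46:29 refers to the existing aggregate/flavor-metadata pattern (independent of the instance-group proposal) for steering builds onto a specific set of hosts. A rough sketch of that pattern with python-novaclient follows; credentials, host names and the extra-spec key are placeholders, call names follow the v1.1 client of the time (exact signatures may differ by release), and it assumes the AggregateInstanceExtraSpecsFilter is enabled in the scheduler.

from novaclient.v1_1 import client

# placeholder admin credentials
nova = client.Client('admin', 'ADMIN_PASSWORD', 'admin',
                     'http://keystone.example.com:5000/v2.0')

# put the new hosts into an aggregate and tag it
agg = nova.aggregates.create('new-cell-canary', None)
nova.aggregates.add_host(agg, 'compute-new-01')
nova.aggregates.set_metadata(agg, {'qe_only': 'true'})

# a flavor whose extra specs match the aggregate metadata; with the
# AggregateInstanceExtraSpecsFilter enabled, instances booted with this
# flavor land only on hosts in the tagged aggregate
flavor = nova.flavors.create('qe.small', 2048, 1, 20)
flavor.set_keys({'qe_only': 'true'})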
15:48:55 debo_os: in our scope - hint == policy right ?
15:49:00 yeah
15:49:07 references :)
15:49:39 i think that we also need time to discuss consolidation of the scheduling talks
15:49:48 is it ok that we do a context switch?
15:50:03 garyk: can you elaborate on the issue? Switch OK with me.
15:50:21 ok with me. Hope I didn't derail too much, jumped in a bit late and uninformed
15:50:21 OK.. I guess this InstanceGroup discussion will continue next week
15:50:28 and in ML?
15:50:43 and please feel free to comment offline and the core collaborators feel free to edit the doc
15:50:45 I see an id field in the InstanceGroupMember ... how does one generate such IDs for non-virtual resources? we can discuss next week and maybe i'll understand better then
15:50:49 MikeSpreitzer: not really sure i understood your question. i guess we could continue on the ML
15:51:08 garyk: right
15:51:20 iqster: the id is just to link it to the instance_group
15:51:39 iqster: we need to persist the InstanceGroupMember reference too. hence the id
15:51:42 iqster: their integer ID is like UUIDs in our code, it is the record key in the DB
15:51:51 that reference could be to a non-virtual resource
15:51:59 thx
15:52:15 #topic summit talks - https://etherpad.openstack.org/IceHouse-Nova-Scheduler-Sessions
15:53:01 PhilD: any idea on which talks we can merge?
15:53:26 maybe we should have an API session?
15:53:29 Scheduling across Services and Smart Resource Placement is an easy merge
15:53:41 an api session would be great.
15:53:46 i had put in a placeholder for smart resource placement since it would be important
15:53:47 MikeSpreitzer: +1
15:54:03 mike: +1
15:54:10 Can you guys please update the etherpad with the new suggestions
15:54:16 sure
15:54:22 Sorry, forgot to do that.
15:54:29 I have a summit proposal on the main site for those
15:54:35 I am sorry but I need to leave now (i have a lift that is waiting for me). sorry for the curveball.
15:54:37 sure.. they are all applicable under the smart resource placement proposal
15:54:40 But I think I'm just asking for that merged session
15:54:45 Anyone know if the endmeeting can be done by someone else?
15:55:00 BTW, Scheduler Performance is also relevant
15:55:12 garyk: if you #chair someone I think they can end it
15:55:13 yes, the perf is very relevant
15:55:15 I think we need a shared cache of state
15:55:23 +1
15:55:36 so ideally we need the 3 components: API, smart placement and cache
15:55:37 @Garry - Sorry, got caught up in something else. Re consolidation sessions, can we pick that up next week ?
15:55:39 #char alaski
15:55:44 and performance if we have another slot
15:55:48 #chair alaski
15:55:49 Current chairs: alaski garyk
15:55:51 yes, pick up next week
15:56:02 #link https://docs.google.com/document/d/1IiPI0sfaWb1bdYiMWzAAx0HYR6UqzOan_Utgml5W1HI/edit?usp=sharing
15:56:04 alaski: is that ok with you
15:56:13 garyk: yep, thanks
15:56:16 this doc points to related blueprints for smart resource placement
15:56:17 thanks
15:56:31 ok do we have 4 slots?
15:56:41 then we could fit all of API, smartplace, cache, perf
15:56:43 the in-memory state work by Boris and co is also required for the global state repo
15:57:07 if we have 3 slots, we probably should merge the perf into the cache and smartplace
15:57:15 +1
15:57:21 I'd like to see that merge anyway
15:57:23 alaski: how many slots could we get
15:57:34 we need 3 ideally
15:58:01 mike: then let's shoot for 3 and have perf as a component in each of cache and smartplace
15:58:02 debo_os: I'm not sure. It's going to depend a bit on how many other sessions are proposed, but I'll talk to Russell about getting 3
15:58:12 I said earlier that I'd ask Russell to see if we can get a quota for scheduling. I think 3 is a real minimum, we really need more like 5 or 6 to cover all the topics
15:58:22 3 seems reasonable
15:58:29 5 or 6 will depend on what else is proposed
15:58:30 russellb: awesome
15:58:41 really don't know yet
15:59:03 russellb: 3 should get us started
15:59:10 roughly the same number of time slots as the last summit
15:59:15 have one placeholder already
15:59:15 1 extra i think
15:59:34 3 is probably the minimum
15:59:44 If you look at what's already on the etherpad I think 3 would be very tight. I really want to avoid last time's issue of trying to do 10 subjects in one session
15:59:48 We'll probably think of at least one more once we start thinking carefully
16:00:16 russellb: then 3 for starters and 1-2 more if possible ... performance, maybe a deep dive on policies
16:00:44 Why don't we try to narrow down as much as possible with a rough ordering, and see what we can get
16:00:46 PhilD: yes, me too
16:00:51 I think a deep dive on policies might be very important to let people know what could be done as we move along
16:00:55 And some of the scheduler work in H really seemed to stall because we didn't have a clear way ahead coming out of the summit
16:00:59 I'd rather defer stuff to mailing list discussions than pack too much into 1 time slot
16:01:00 let's continue making the list next week, just let Russell know that we'd like more than 3 if possible
16:01:16 ok
16:01:23 Cool. We'll pick this up next week.
16:01:30 Thanks everyone
16:01:31 gr8 meeting!
16:01:34 thanks
16:01:36 #endmeeting