14:03:11 #startmeeting nova-scheduler
14:03:12 Log: http://eavesdrop.openstack.org/meetings/nova_meeting/2015/nova_meeting.2015-10-05-14.01.log.html
14:03:14 Meeting started Mon Oct 5 14:03:11 2015 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:18 The meeting name has been set to 'nova_scheduler'
14:03:26 third time's a charm
14:03:39 OK, what else can I screw up this morning
14:03:40 * bauzas waves again
14:03:49 n0ano, waves back
14:04:29 #topic Mitaka planning
14:04:52 So, I read the two specs pointed out last week and they're a good start
14:05:30 the one, https://review.openstack.org/#/c/192260/8 , scheduler plans, mainly talks about stuff we know and are doing
14:05:58 that's technically a backlog spec and a devref change, but anyway :)
14:06:09 the other, https://review.openstack.org/#/c/191914/6 , parallel scheduler for V2, I'
14:06:23 s/I'/I'm concerned might be overkill
14:07:04 to me, we've all said all along that DB access is the problem but I don't know that we've measured exactly how bad it is, especially with the caching scheduler that we have
14:07:25 I'd like to see some real performance numbers before we try and do major changes
14:07:41 n0ano: the parallel scheduler is actually needed for the cells effort
14:08:09 n0ano: just from a design tenets PoV, it just makes sense without figures
14:08:25 bauzas, so it's a functional need with maybe a performance benefit
14:08:56 n0ano: it's more a scalability feature than a performance feature if you prefer
14:09:35 I'm a little bit concerned by the word 'parallel', I would have preferred 'distributed' but that's fine
14:10:04 the thing is, we need to document what our christmas list is
14:10:18 a closely related thing in this environment, if the scheduler was fast enough you wouldn't need to distribute it, but that's kind of nit picky
14:10:24 not zactly discussing how we'll build the super train that daddy bought us
14:10:53 christmas shopping in october? bleh
14:10:59 n0ano: that's debatable
14:11:11 I agree, let's get the current stuff completed before we get distracted by the new shiny
14:11:22 edleafe: speaking of that, our malls are now full of advent calendars
14:11:40 bauzas, I agree and I'm willing to not debate it right now
14:12:11 bauzas, they mail you those, we have to buy them at the grocery store :-)
14:12:15 n0ano: sure, it's just about snapshotting a necessary move and the ideas behind it
14:12:40 not yet discussing which one to pick
14:13:23 n0ano: heh, our calendars are lego ones, so I guess that's why - the chicken stopped producing chocolate ones like years ago
14:13:42 anyway, diverting
14:14:08 bauzas, heresy, chocolate is a requirement :-) but ignoring that
14:15:03 I'd like to kind of keep us focused, to me the progression is:
14:15:14 1) finish the API clean up...
14:15:21 2) split out the scheduler...
14:15:38 3) consider performance/scalability
14:15:53 if we try and do too many of those at the same time nothing will happen
14:16:13 the #2 is still debatable :)
14:16:20 the problem is claims I think, did we work out how that impacts the scheduler API yet?
14:16:29 while the #3 is a benefit anyway :)
14:16:48 johnthetubaguy: that's a good point
14:17:07 johnthetubaguy: a distributed scheduler needs to address how to claim properly
14:17:09 do we have a concrete plan there yet? claims wise I mean
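
[Editor's note: a minimal, hypothetical Python sketch of the host-state caching idea behind the "caching scheduler" mentioned above (14:07), contrasting a periodically refreshed in-memory cache with hitting the DB on every scheduling request. The names (HostStateCache, db_fetch) are made up for illustration; this is not nova code.]

    import time

    class HostStateCache(object):
        """Serve host states from memory, refreshing from the DB periodically."""

        def __init__(self, db_fetch, refresh_interval=60):
            self._db_fetch = db_fetch          # callable returning {host: state}, assumed to hit the DB
            self._interval = refresh_interval  # seconds between DB round-trips
            self._cache = {}
            self._last_refresh = 0.0

        def get_all_host_states(self):
            # Only touch the DB when the cache is stale; scheduling requests in
            # between are answered from memory, possibly with slightly stale data.
            now = time.time()
            if not self._cache or now - self._last_refresh > self._interval:
                self._cache = self._db_fetch()
                self._last_refresh = now
            return self._cache

[The trade-off - fewer DB round-trips per request versus slightly stale host data and therefore more retries - is exactly why real performance numbers would help before committing to bigger changes.]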
14:17:20 I think 3) will be much more doable after 2) (ignoring the cross-project benefits of 2) so it's still a priority to me
14:17:22 johnthetubaguy: nothing we agreed on
14:17:33 there was some talk of moving that into the scheduler, mostly for the parallel bit
14:17:48 johnthetubaguy: right, that's why I'd like to address #3 before #2
14:17:54 n0ano: my worry is getting (1) completed, it's hard to evolve that API once it's split out
14:18:24 johnthetubaguy: +1
14:18:30 the parallel bit is more about availability than speed, if we get it working well enough, FWIW
14:18:36 johnthetubaguy, APIs can change, that's not impossible and it's probably better that it requires thought to make a change
14:18:45 I would like an API that isn't nova-specific
14:18:53 otherwise, what's the point of a split?
14:19:18 * bauzas feels we discussed that a couple of times before :)
14:19:32 so there is a chicken and egg thing here, honestly, both *could* be made to work, it's a case of working out the trade-offs
14:19:37 I'm with bauzas, I thought it was pretty generic
14:19:53 don't get me wrong, here is my take
14:20:16 #1 we know that we should address the distributed thing, just because we're blocking cells v2 at least
14:20:43 #2 we know we should consider heterogeneous resources provided to the scheduler
14:20:57 #3 we never yet agreed on a split and how
14:21:11 that's what I considered the consensus
14:21:46 in re #3 - my understanding is we did agree, clean up the APIs through the current effort and then do the mechanics of a split
14:22:49 in re #2 - are heterogeneous resources a problem with the current design?
14:22:50 so the deal was to fix the API and discuss whether we split and how
14:23:19 n0ano: http://lists.openstack.org/pipermail/openstack-dev/2015-September/075403.html
14:23:24 to answer your question
14:23:57 so I thought we agreed, get the APIs sorted, then look again at the split, but I don't think it's worth fixating on the difference, or lack thereof, between those two positions
14:24:25 johnthetubaguy, +1
14:25:38 volume capacity, IP capacity and how it relates to compute capacity is an age old issue here really, it would be good to get that fixed up
14:26:07 availability zones and relating different pools of resources is certainly a common requirement
14:27:01 johnthetubaguy: so I guess you'd better explain my position, because I +1
14:27:04 johnthetubaguy, to me those capacities are just metrics (e.g. numbers), from a scheduler perspective that's pretty simple - how you measure them is not so simple
14:27:31 what I'm trying to explain is that we agreed on refactoring the APIs and reconsidering, once that's done, whether it is necessary to split
14:28:02 but in the meantime, there are many other topics coming in, and I'm really not convinced that splitting could just solve all our problems
14:28:34 bauzas: splitting by itself doesn't get us any improvement
14:28:39 it won't solve our problems but I do believe it will make working on a lot of them easier
14:28:40 edleafe: ezactly
14:28:47 cleaning up so that a split *could* happen does
14:28:51 n0ano: so this is more about error handling
14:29:18 johnthetubaguy, not following you
14:29:18 say you pick where the volume goes, or where the compute goes, and that means you can only get some of your resources, you need to pick something else
14:29:54 it's logically separate pools of resources you need to claim, that have a dependency relationship described in their metrics
14:30:18 so the request spec would be for both compute, volume and networking resources, in an extreme case
14:30:40 the way we currently work you wouldn't pick a host unless it satisfied all of the resource requirements, the scheduler just has to know about all of those resources
14:30:51 that's why the plan was to clean up our APIs first, then identify what could be needed for cross-project scheduling, then identify how to provide those, and only then decide whether we split or just add another endpoint
14:31:21 it's about picking a compute host, and a volume AZ, and a neutron network segment that are all able to give you a claim, and retrying if not, right?
14:31:37 it's multiple related items, it's not just picking a compute node at this point
14:31:44 johnthetubaguy: and doing it in a non-racy way
14:31:56 edleafe: that's where I'm cautious
14:32:04 edleafe: I mean, we need retries
14:32:19 bauzas: yes, we will always need them with the current approach
14:32:28 edleafe: well, optionally, yes, claims would help make the retries inside the scheduler, at the expense of a quick choice
14:32:36 but we should improve things so that they are kept to a minimum
14:33:14 we need to be more prepared to offer choice here, being less racy will be crazy important for some users, and a big slowdown for other users, depends on your needs and resource usage patterns really
14:33:15 are we looping back?
14:33:17 I think we're in violent agreement, retries are necessary but if we do too many of them we have a problem.
14:33:40 I am just trying to define the problem for the multi resource pool scheduling here
14:34:10 it's been a long-standing requirement, that seems to be getting more important, rather than less important
14:34:39 johnthetubaguy: that's why your backlog spec is worthwhile
14:34:52 and that's why I'd like to consider it before splitting
14:34:54 johnthetubaguy, do you know if anyone has written up anything about this (multi resource pools)?
14:34:55 I probably should create a different one for this issue
14:35:11 n0ano: there have been a few ML posts and things, not seen anything written up
14:35:28 there is a spec
14:35:31 from jay
14:35:32 sec
14:35:34 n0ano: basically volume must be local to compute AZ, IP capacity must be local to compute AZ
14:35:46 bauzas: ah, cool, do you have a link for that one?
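
[Editor's note: a hedged sketch of the multi resource pool scheduling problem johnthetubaguy describes above (14:29-14:31): pick a compute host, a volume AZ and a network segment together, try to claim each, and if any pool refuses, release what was already taken and retry with another candidate. The ClaimFailed exception, the pools mapping and the claim()/release() calls are hypothetical; they are not existing nova, cinder or neutron APIs.]

    class ClaimFailed(Exception):
        """Raised when a resource pool cannot grant the requested claim (hypothetical)."""

    def schedule_with_claims(request_spec, candidates, pools, max_retries=3):
        # candidates: iterable of dicts like
        #   {'compute': host, 'volume_az': az, 'network': segment}
        # pools: mapping of resource type -> object with claim()/release()
        for attempt, choice in enumerate(candidates):
            if attempt >= max_retries:
                break
            held = []
            try:
                # All related claims must succeed together.
                for rtype, target in choice.items():
                    held.append(pools[rtype].claim(target, request_spec))
                return choice  # every pool accepted its claim
            except ClaimFailed:
                # One pool said no (e.g. volume capacity gone): give back what
                # was already taken and try the next candidate.
                for claim in held:
                    claim.release()
        raise ClaimFailed('no candidate could satisfy all resource claims')

[Claiming inside the scheduler like this is the "retries inside the scheduler" idea edleafe and johnthetubaguy mention: races surface at decision time rather than later on the compute node, at the cost of a slower individual choice.]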
14:35:49 n0ano: I'm almost finished with my radical rewrite proposal, if that helps :)
14:35:57 https://review.openstack.org/#/c/225546/
14:36:02 johnthetubaguy: n0ano^
14:36:28 edleafe: please present that as an alternative scheduler that could be in tree, no throwing away what we have, for now
14:36:51 bauzas: ah, sweet
14:36:54 johnthetubaguy: understood, but no, there's no way it could be in tree
14:37:10 johnthetubaguy: I realize that it won't ever happen
14:37:22 I just want people to think about the root causes of our issues
14:37:41 jaypipes: around?
14:38:28 bauzas, probably not, he would have jumped in by now
14:38:53 let's call him 5 times in front of a mirror
14:39:03 jaypipes jaypipes jaypipes jaypipes jaypipes
14:39:08 edleafe: so I have done a lot of experiments with a decent rack of servers, with belliott, that led to the caching scheduler and tuning the greenlet workers down, that's mostly what I am basing that parallel work on, but that idea still needs testing out, anyways
14:39:11 looks like we have a lot to discuss in Tokyo
14:39:20 so the key bit here is we have a spec up, so let's get it reviewed
14:39:26 #link https://review.openstack.org/#/c/225546/
14:39:38 johnthetubaguy: +1
14:39:43 that's an iterative process
14:39:59 we have johnthetubaguy's spec and devref, we also have jaypipes' spec
14:42:02 so the parallel one is not very actionable right now, I think resource pools are more important right now
14:42:22 the claims piece still needs some specific solutions to be discussed, I feel
14:42:33 johnthetubaguy: that's a good question
14:42:45 honestly, once we have those two pieces, I will feel better about our API
14:42:58 johnthetubaguy: we haven't yet finished implementing the resource-objects BP
14:43:01 by API I mean the scheduler interface
14:43:13 bauzas: very true, that needs review now, it's up for review right?
14:43:22 johnthetubaguy: and that spec is actually an extension to the resource-objects BP
14:43:38 bauzas: totally agreed
14:43:52 johnthetubaguy: so yes, we can somehow identify that what we agreed (i.e. refactor our APIs) is still valid for Mitaka
14:44:34 johnthetubaguy: re: the resource-objects, I saw some patches from jaypipes but I guess he didn't have all the work ready
14:44:52 johnthetubaguy: at least, I can find the objects creation, not their usage
14:45:07 well, I think I need to distill all the different issues from this thread and propose a scheduler session in Tokyo to discuss them
14:45:25 well
14:45:32 so the deadline for session proposals is tomorrow I think
14:45:38 if I remember correctly
14:45:44 * bauzas looking
14:45:44 NP, I'll get it done today
14:45:56 we need something concrete to discuss ideally
14:46:49 we have some specific specs to discuss plus some more speculative stuff
14:46:54 I think we have some agreement on the list of issues (solid API, inc resource pools, inc claims in scheduler)
14:47:07 For me the list is: (solid API, inc resource pools, inc claims in scheduler)
14:47:24 quoting johnthetubaguy "The deadline for proposals will likely be Tuesday 6th October, 23.59 UTC,"
14:47:36 today EOB :)
14:48:07 johnthetubaguy: scheduler claims are part of the parallel (i.e. distributed) scheduler discussion I feel
14:48:17 bauzas, today's the 5th, that should be EOB tomorrow
14:48:28 right, we're on Monday
14:48:32 * bauzas facepalm
14:48:37 I thought we were on Tuesday
14:48:40 anyway
14:48:44 so I was saying
14:49:01 bauzas: that's true, I am thinking we call out claims as they impact parallel and resource pools really
14:49:10 solid APIs is surely one thing to address (at least the missing bits, considering that reqspec-obj is on its way)
14:49:41 distributed scheduling (incl. sched claims) is IMHO a second part to address
14:49:53 johnthetubaguy: but I see your point
14:50:41 what I'm a bit worried about is that deferring the necessary talk about a scaling-out scheduler would mean that we'd defer cells v2
14:51:17 because we can hardly assume that one single scheduler could boil the ocean, er, the whole cloud
14:51:46 8 mins to the end of the meeting also
14:52:03 so it only affects multi-cell v2
14:52:14 and that feels like it's release + 1 still
14:52:28 but ideally we would have a prototype ready during mitaka
14:52:33 bauzas, yeah, those concerns should be addressed at a session and no, we're running out of time
14:52:39 johnthetubaguy: erm, the idea is that the n-api would have one scheduler to address all cells
14:53:26 johnthetubaguy: but that's certainly debatable
14:53:41 bauzas: well it just works for the single cell case, it's just the same API as today
14:53:58 getting late guys, let's move on
14:54:04 #topic opens
14:54:34 so, only two weeks until Tokyo, do we want to meet next week & after or should we just re-convene at the summit?
14:54:57 johnthetubaguy: http://specs.openstack.org/openstack/nova-specs/specs/liberty/approved/cells-scheduling-interaction.html was what I was thinking about
14:55:03 but sure, we can move on
14:55:11 yeah, let's move on
14:55:33 n0ano: I can attend the next one, not the one before the Summit
14:55:38 bauzas: sharding doesn't affect the API really
14:55:56 (enjoying Tokyo with family, eh)
14:56:26 johnthetubaguy: sure
14:56:28 I'm willing to talk IRC next week and then defer to the summit, that's doable
14:57:03 I'd be interested in gathering feedback from cinder and neutron folks about what they'd like to send to us
14:57:26 given we're discussing https://review.openstack.org/#/c/225546/1/specs/mitaka/approved/resource-providers.rst,cm
14:57:38 but that's a bit premature
14:57:38 bauzas, me too, we asked about 2 summits ago and haven't gotten much back
14:58:31 well, I have to run (next meeting), tnx everyone, talk next week
14:58:35 #endmeeting