15:04:00 #startmeeting scheduling
15:04:01 Meeting started Tue Sep 10 15:04:00 2013 UTC and is due to finish in 60 minutes. The chair is garyk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:04:02 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:04:04 The meeting name has been set to 'scheduling'
15:04:35 I don't have much for this week - other than to say that I set up the etherpad for the summit sessions:
15:04:36 #topic summit sessions
15:04:48 yeah, i'll post that link in a sec
15:04:51 https://etherpad.openstack.org/IceHouse-Nova-Scheduler-Sessions
15:05:12 Hi all, sorry I'm late
15:05:13 PhilD: thanks for putting this together
15:05:31 NP
15:06:15 how about we go over the sections and then see if people want to take on open issues
15:06:45 Sure
15:07:07 #topic scheduler metrics and ceilometer
15:07:42 i know there was a lot of discussion about this and sadly we did not make much progress with the patches in H
15:07:53 Could do with someone to put themselves forward as the lead for this session
15:08:02 I would be happy to do that
15:08:19 Sold :-)
15:08:51 Cool. Maybe you can also be in touch with the guys who worked on the patches.
15:09:10 Yes, I was also involved
15:09:15 in the discussions.
15:09:16 #action PaulMurray to lead the metrics scheduling
15:09:22 I put down what i thought were the possible models - but I think it would be good if we could go into the summit with a strawman proposal.
15:09:24 great
15:09:37 (to avoid it just becoming another round of discussion)
15:10:01 PhilD: that is a great idea. The more we can crystallize ideas prior to the summit the better.
15:10:38 anything else on the metrics?
15:10:56 The references are things I know of - if there are others that should be there please let me know
15:11:07 I am pmurray@hp.com
15:11:57 Ok, so moving to the next topic
15:12:11 #topic image properties and host capabilities
15:12:27 I think this one needs fleshing out by you Gary
15:12:46 Correct. I'll fill in the gaps for next week's meeting.
15:12:59 Don also expressed interest in this one.
15:14:08 #topic scheduler performance
15:14:57 boris-42: you around? i saw an initial WIP patch - https://review.openstack.org/#/c/45867/
15:15:20 garyk yes that is our work
15:15:32 garyk around Scheduler as a Service
15:15:47 garyk without fanout, scalable and flexible solution
15:16:03 boris-42, can you check the doc link I posted - it points to a version updated 13.08.13, and I think I saw you say there was a recent update ?
15:16:09 understood. so you are fine with leading this session proposal at summit
15:16:15 https://etherpad.openstack.org/IceHouse-Nova-Scheduler-Sessions
15:16:35 garyk PhilD if nobody is against I will lead this session
15:16:50 boris-42: i am in favor
15:16:57 +1
15:17:08 garyk PhilD by the summit we will have all patches + real numbers from Rally
15:17:14 (someone has to catch the rocks right ;-)
15:17:22 =)
15:17:24 yeah=)
15:17:39 boris-42: i think that it is important to address issues raised on the db performance patches - for example the scenarios used etc.
15:18:10 garyk actually we are going to get results from real openstack deployments
15:18:27 garyk I got 1k servers so I will test it in different configurations and different scenarios
15:18:52 garyk we should get results from real deployments not only DB load and so on
15:18:55 understood, but i think that in order to convince the community we need to be able to explain the test bed.
15:19:23 garyk you will be able to repeat these experiments
15:19:26 garyk with Rally https://wiki.openstack.org/wiki/Rally
15:19:36 it would be nice if we could have some consensus regarding the performance tests that we would like done for the profiling
15:19:52 boris-42: ok, i'll take a look
15:20:17 boris-42: do we have a list of bottlenecks?
15:20:17 garyk performance tests will be like run 1000 instances, 10 simultaneously, by 100 requests to Nova
15:20:25 garyk not yet
15:20:30 garyk rally is not finished
15:20:45 garyk I mean I know about some bottlenecks
15:21:09 garyk but it will be better to get it with Rally (when it is finished)
15:21:10 boris-42: :). i guess that it is a process.
15:21:41 does anybody have anything else regarding the performance session?
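
To make the proposed test concrete: a minimal sketch of the load pattern boris-42 describes ("run 1000 instances, 10 simultaneously"), assuming era-typical python-novaclient and eventlet. This is not the Rally harness itself (Rally was unfinished at the time), and the credentials, image and flavor ids are placeholders.

    import eventlet
    eventlet.monkey_patch()  # make novaclient's HTTP calls cooperative

    from novaclient.v1_1 import client

    nova = client.Client('user', 'password', 'tenant',
                         'http://keystone.example.com:5000/v2.0')

    def boot_one(i):
        # Fire a single boot request; a real harness would also time it
        # and record success/failure.
        return nova.servers.create(name='perf-test-%d' % i,
                                   image='IMAGE_UUID', flavor='FLAVOR_ID')

    pool = eventlet.GreenPool(10)                     # 10 requests in flight
    servers = list(pool.imap(boot_one, range(1000)))  # 1000 instances total
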
15:22:25 #topic scheduling across services
15:22:39 boris-42: you are also listed on this one
15:23:06 PhilD: is this a generic scheduler?
15:23:37 This was a tricky topic last year - as it has potential overlap into Heat and other services
15:23:39 garyk PhilD our approach contains points around getting one scheduler with all data to make scheduling for all services
15:24:14 garyk PhilD so as it is part of our new approach I will be glad to present how to solve it without pain=)
15:24:25 I tried to capture the use cases I remembered from last year - there may be others.
15:24:42 PhilD: ok, thanks for the clarifications
15:25:09 boris-42: this is going to be challenging as it involves other projects
15:25:18 To me it doesn't seem like we want to talk about a scheduler outside of these services yet, more about what each service should expose so that scheduling decisions could be made
15:25:18 A proposal to fix it would be good. If we think it affects other projects we should call that out to Russell so he can schedule (sic) accordingly
15:25:36 garyk in case of cinder and nova
15:25:43 garyk it is really pretty simple
15:26:11 alaski: good point
15:26:18 alaski - that may be a good approach - it at least bounds the problem to one that those in the session might be able to agree on
15:26:34 yeah, i completely agree
15:26:58 Aside from just info, we need to discuss what kinds of reservations need to be exposed
15:27:04 alaski: would you like to work with boris-42 on this one?
15:27:20 garyk: sure
15:27:21 garyk working on what?
15:27:26 PhilD: right
15:27:30 garyk sorry I missed something=)
15:27:50 boris-42: scheduling across services
15:28:07 alaski we already have an approach for how to make it in cinder and nova
15:28:08 or at least a first step in that direction
15:28:19 alaski I mean one scheduler
15:28:25 alaski it is really easy
15:28:48 alaski we will prepare docs
15:28:55 boris-42: i am not sure that it is that easy.
15:28:55 alaski and patches
15:29:09 garyk it is about 500 simple lines of code
15:29:18 boris-42: I'm interested to see what you have
15:29:20 garyk and the hardest parts are already in review=)
15:29:32 alaski we will publish the other patches soon=)
15:29:38 alaski I will ping you=)
15:29:39 boris-42: what alaski proposes is for us to first look at the data that we want to use and then decide on how to move forward.
15:29:44 @boris-42: So does your solution live just within Nova ?
15:30:04 PhilD yes
15:30:15 PhilD I mean actually we are changing only a few places
15:30:21 boris, can you please share more details on what you have on the cinder + nova single scheduler.. I am interested in this single scheduler
15:30:21 boris-42: your solution may be great but we need a community consensus. once we get that it will be a lot easier to get it through the review process
15:30:54 garyk PhilD Yathiraj I think that an IRC meeting is not a good place for this
15:31:07 garyk PhilD Yathiraj we should update and improve our docs
15:31:11 so is it ok to say that alaski and boris-42 will take the leads on this?
15:31:16 garyk PhilD Yathiraj publish our patches
15:31:16 boris, Is there a blueprint, or some already committed code for review.. OK.. a link should be fine - yudupi@cisco.com is my email
15:31:21 and then discuss=)
15:31:21 I think the key here is showing that any proposed solution can be extended to cover any use case - so if it works great for Nova and Cinder but can't work for Neutron
15:31:27 garyk: works for me
15:31:36 then it will be a struggle to get consensus.
15:31:39 great!
15:31:49 PhilD agree
15:32:05 PhilD let us write our thoughts down on paper and in UML diagrams and patches
15:32:15 PhilD it will be easier to discuss=)
15:32:22 PhilD especially around Neutron=)
15:32:30 Agreed we can't resolve the design here - what we want to do is make sure the DS session is set up to give us the best chance of a decision on the way forward
15:32:48 Cool - add the links to the etherpad
15:32:56 boris-42: can we check that it compiles (or in our case interprets) on paper first, then go to the patches
15:33:24 garyk papers are not ready yet..
15:33:27 can we move to the next topic?
15:33:30 garyk but they will be ready this week
15:33:49 boris-42: cool. no rush we are still discussing things
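
As a strawman for alaski's framing (each service exposes scheduling data and reservations, rather than sharing one scheduler), the contract might look like the sketch below. Every name here is hypothetical; nothing like it existed in Nova or Cinder at the time.

    import abc

    class SchedulableService(object):
        """Hypothetical contract a service could expose to scheduling code."""
        __metaclass__ = abc.ABCMeta

        @abc.abstractmethod
        def get_resources(self):
            """Return per-host capacity/usage data for placement decisions."""

        @abc.abstractmethod
        def reserve(self, host, request):
            """Claim capacity on a host ahead of use; return a reservation id."""

        @abc.abstractmethod
        def release(self, reservation_id):
            """Release a claim that will not be consumed."""

A cross-service scheduler could then combine get_resources() output from Nova and Cinder without either service owning the other's state, which also bounds the problem the way alaski and PhilD suggest.
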
15:34:00 #topic private clouds
15:34:17 PhilD: alaski: you guys are taking the helm here?
15:34:25 Yep
15:34:27 yes
15:34:50 great. anything else you want to mention about this or should we move to the next topic
15:35:07 I'm happy to move on
15:35:14 +1
15:35:19 sorry guys I have to go=)
15:35:31 boris-42: ok, thanks
15:35:42 #topic multiple scheduler policies
15:35:53 glikson: you around?
15:36:05 more or less..
15:36:33 glikson: you ok for leading this one?
15:36:47 sure
15:37:14 is there anything else you would like to mention?
15:37:32 a few more folks from IBM are likely to join, and anyone else is also welcome
15:37:49 not at the moment, I think
15:37:58 ok
15:38:25 #topic additional sessions
15:38:50 Are there additional sessions that people would like to propose or did we miss something?
15:39:42 I know that debo wanted to address scheduling of resources as a follow-up to the instance groups. we need to add this to the etherpad
15:40:18 I'm going to be looking at using the Taskflow library for instance creation, but I'm in POC stage right now
15:40:32 and it's only incidentally related to scheduling
15:40:47 garyk - do you know enough about what debo wanted to outline the session on the etherpad ?
15:40:50 alaski: can you elaborate a little to save us a few google searches
15:41:08 PhilD: not off hand. i'll ask him to add it and mail you.
15:42:04 Sounds like group scheduling *might* be something that could be rolled into "scheduling across services" ? The titles aren't cast in stone if we find there are topics that are related
15:42:07 garyk: Taskflow is a library for handling "orchestration" of workflows. Essentially it should allow for steps of instance creation to be stopped, persisted, resumed, retried on failures, etc...
15:42:31 Isn't that where Heat fits in ?
15:42:58 garyk: but the first step is querying the scheduler, not having it proxy. My work on that didn't quite make it into H so I'm picking it up in I
15:43:16 alaski: ok, understood
15:43:24 PhilD: this is at the compute host level
15:43:50 PhilD: so lower level than Heat sees things
15:44:05 Ah, OK. Are you working with Josh from Y! on that ? (I think he had some ideas / opinions on that)
15:44:37 PhilD: not directly yet, but should be later
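
A minimal Taskflow sketch of what alaski describes: each step of instance creation becomes a task that can be persisted, resumed, or reverted on failure. The two tasks and their bodies are hypothetical, not Nova's actual create path.

    from taskflow import task
    from taskflow.patterns import linear_flow
    import taskflow.engines

    class ScheduleInstance(task.Task):
        default_provides = 'destination'

        def execute(self, request_spec):
            # Ask the scheduler for a destination host (placeholder result).
            return {'host': 'compute-1'}

    class BuildInstance(task.Task):
        def execute(self, destination):
            print('building on %(host)s' % destination)

        def revert(self, destination, **kwargs):
            # Runs if a later task fails: undo the build on that host.
            print('cleaning up on %(host)s' % destination)

    flow = linear_flow.Flow('create-instance')
    flow.add(ScheduleInstance(), BuildInstance())
    taskflow.engines.run(flow, store={'request_spec': {'flavor': 'm1.small'}})
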
15:44:57 (at least with all this stuff carrying over from H we have a flying start on changes for I ;-)
15:45:28 I certainly hope so. Some of us are licking our wounds over the scheduling features that did not make it :)
15:46:00 do we have any other items to bring up regarding the summit session proposals?
15:46:18 I have another not fully scheduler-related topic, the resource tracker
15:46:50 I don't know how far outside of the scheduler itself we want to get here
15:46:50 alaski: I think that it is a scheduler related topic
15:47:17 right now the resource tracker is in memory on compute nodes
15:47:26 Is that distinct from the "Scheduler metrics & Ceilometer" topic - I kind of see resource tracking as part of that
15:47:36 I would like to query it from the conductor
15:47:56 PhilD: hmm, not sure
15:48:00 alaski: would that not just be an implementation detail?
15:48:49 garyk: mostly, but in doing so there's an idea to maybe change the interface a bit. Or at least consolidate it with claims somehow
15:49:14 alaski: this came up around metrics - or rather vs metrics
15:49:21 and it will involve persisting it outside of the compute, and synchronizing access
15:49:30 alaski: what is it you have in mind?
15:49:37 alaski: understood.
15:49:42 Kind of feels like it should be part of the "how do we keep track of stuff we need to count / measure" session (maybe that would be a better title for the first session)
15:50:06 PhilD: that's what I was thinging
15:50:12 thinking
15:50:17 PaulMurray: my main concern is actually about creating a claim without round tripping to the compute
15:50:23 good - then we thing alike ;-)
15:50:28 alaski: i think that that could also be related to debo's topic (but he is not around to discuss it). could we discuss this in more detail next week?
15:50:36 garyk: sure
15:50:47 garyk: agreed
15:50:51 Maybe capture some points on the EP in the meantime ?
15:51:09 #action discuss resource tracker next week
15:51:17 I'd kind of like to have that as a working scratchpad
15:51:19 PhilD: good idea
15:52:14 do we want to address BPs that did not make H?
15:52:47 or will we go with PhilD's idea of starting I with a bang
15:53:22 I would like to mention that some work I've been doing is a little at odds with other work I've seen
15:53:37 Not incompatible, but we should all be aware of what's going on
15:53:54 alaski: good point.
15:54:02 Might be useful to at least capture the BPs that we are carrying over that aren't covered by the planned sessions
15:54:28 Sadly the instance groups BP is being carried over - we had issues with the API
15:54:48 That is, the user facing API
15:55:09 I'm working to remove schedule_run_instance in favor of select_destinations in the scheduler. So some of the instance_group work is going to get moved elsewhere for that
15:56:07 ok, np. i don't think that where it is run is an issue. the idea and how the policies are implemented is what is important
15:56:50 garyk: that's what I figured. I don't want to break anything or make it any harder. I just want things handled in the right place, which may end up being the conductor
15:57:17 alaski: sounds reasonable.
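
A hedged sketch (not verbatim Nova code) of the flow alaski describes: the conductor asks the scheduler for destinations via select_destinations and then drives the builds itself, rather than the scheduler proxying the build through schedule_run_instance. The stub stands in for nova.scheduler.rpcapi, and the return shape is an assumption.

    class StubSchedulerAPI(object):
        def select_destinations(self, context, request_spec, filter_properties):
            # One destination dict per requested instance.
            return [{'host': 'compute-1', 'nodename': 'compute-1', 'limits': {}}]

    def build_instances(context, instances, request_spec, filter_properties,
                        scheduler=StubSchedulerAPI()):
        dests = scheduler.select_destinations(context, request_spec,
                                              filter_properties)
        for instance, dest in zip(instances, dests):
            # The conductor (not the scheduler) now casts to the chosen compute.
            print('would build %s on %s' % (instance, dest['host']))

    build_instances(None, ['inst-1'], {'num_instances': 1}, {})
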
15:57:25 One topic I would like an opinion on - the change to add a get-scheduler-hints API failed at the last hurdle to make H because there was an objection to introducing a scheduler/api.py (it was seen as too trivial a pass-through to be worth adding)
15:58:01 PhilD: i would have liked to have seen that one get through. It was ready very early in the H cycle
15:58:10 My thinking was that we should be moving away from having things other than the scheduler directly calling scheduler/rpcapi.py
15:58:43 @Gary - yep, it was all there and working, and this file had been there since June ;-(
15:59:02 I didn't think it was worth an FFE though
15:59:24 I think it could have been worth a shot (nothing to lose)
15:59:32 i think that we have run out of time
15:59:47 But the general question was, what do folks think about having a scheduler/api.py ?
16:00:08 PhilD: I'm fine with it. I'm not sure how necessary it is without seeing it, but it fits the model we use elsewhere
16:00:26 PhilD: i agree
16:00:45 sorry guys i am going to have to end now. lets continue next week.
16:00:53 It's not *necessary* for the get_hints call - but at some point (query scheduler maybe) I'm sure we'll need it.
16:01:00 NP - bye all
16:01:12 #endmeeting
16:01:58 #endmeeting
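
For context on PhilD's closing question: the thin scheduler/api.py under discussion would be on the order of the sketch below. The get_hints method mirrors the proposed get-scheduler-hints change; the names are approximations, not the actual patch.

    from nova.scheduler import rpcapi as scheduler_rpcapi

    class API(object):
        """In-process facade so callers stop importing scheduler/rpcapi.py."""

        def __init__(self):
            self.scheduler_rpcapi = scheduler_rpcapi.SchedulerAPI()

        def get_hints(self, context):
            # Today a trivial pass-through, but it gives one place to add
            # caching, policy checks, or a query-scheduler backend later.
            return self.scheduler_rpcapi.get_hints(context)
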