15:00:46 <n0ano> #startmeeting gantt
15:00:47 <openstack> Meeting started Tue Jul  1 15:00:46 2014 UTC and is due to finish in 60 minutes.  The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:51 <openstack> The meeting name has been set to 'gantt'
15:00:56 <bauzas> o/
15:01:00 <n0ano> anyone here to talk about the scheduler?
15:01:17 <yjiang5> o/
15:02:21 <n0ano> hmm, small group today (maybe we can get lots done :-)
15:02:35 * n0ano watches the solar panels being installed on my roof
15:02:39 <ericfriz> Hi all
15:02:45 <n0ano> #topic code forklift
15:02:46 <LisaZangrando> Hello
15:02:49 <MarcoVerlato> Hi
15:02:55 * bauzas also writes slides at the same time :)
15:02:58 <schwicke> I'm new here.
15:03:12 <n0ano> ericfriz, LisaZangrando nice you can make it, we'll get to you soon
15:03:21 <n0ano> schwicke, NP, I promise we don't bite :-)
15:03:38 <bauzas> n0ano: don't we ? :)
15:03:49 <schwicke> n0ano: hoping for the best :)
15:03:52 <LisaZangrando> ok thanks
15:04:03 <n0ano> bauzas, I thought we were good on the client and then john had some issues, do they look doable?
15:04:19 <bauzas> n0ano: well, we discussed today with johnthetubaguy
15:04:31 <bauzas> n0ano: about how we should do the steps to Gantt
15:04:48 <bauzas> n0ano: because he thought about possible issues and how we should do that
15:04:59 <bauzas> n0ano: long story short, there is an etherpad
15:04:59 <yjiang5> bauzas: I need to check the IRC history to see your discussion, right?
15:05:14 <johnthetubaguy> n0ano: hey, in my team planning meeting, but do shout at me if you want some answers
15:05:28 <bauzas> #link https://etherpad.openstack.org/p/gantt-nova-compute_nodes
15:05:46 <bauzas> so the main problem is:
15:05:56 <bauzas> what should we do with ComputeNode table ?
15:06:11 <bauzas> should it be a Scheduler table or a Nova table ?
15:06:31 <bauzas> as per the last findings, johnthetubaguy is thinking to leave ComputeNode in Nova
15:06:44 <bauzas> and only do updates in the client
15:06:50 <yjiang5> bauzas: how often will we access the compute table?
15:06:58 <n0ano> for compatibility reasons I think it should probably stay with nova for now, maybe in the future it can be moved into gantt
15:07:18 <bauzas> n0ano: so that means we're going to keep the compute_nodes table
15:07:40 <johnthetubaguy> n0ano: gantt will need its own table, with a different structure, and that seems fine
15:07:57 <bauzas> ok, please all review https://etherpad.openstack.org/p/gantt-nova-compute_nodes and make comments if any
15:08:16 <johnthetubaguy> PCI stats need the ComputeNode table at the moment, for the PCI devices stuff, and I reckon that means it has to stay in Nova for the medium term
15:08:18 <yjiang5> bauzas: n0ano: shouldn't this be within the compute_node object's scope? If everything is kept in the compute_node object, then no matter how we do the implementation, we simply change the compute_node object?
15:08:28 <bauzas> if we all agree to keep compute_nodes, I'll backport johnthetubaguy's change into the 82778 patch
15:08:56 <yjiang5> johnthetubaguy: you mean PCI stats or PCI dev tracker?
15:09:04 <bauzas> well, to be precise, I already made that, I need to restore a previous patchset
15:09:10 <n0ano> unfortunately, I think johnthetubaguy is right and we should stay that way for now
15:09:51 <yjiang5> johnthetubaguy: why don't we hide all these things behind the compute node object?
15:09:56 <n0ano> bauzas, then why did you change it originally, won't the same objections apply?
15:10:04 <bauzas> ok, my main concern is keeping the roadmap, so I'll go with these changes for https://review.openstack.org/#/c/82778
15:10:32 <bauzas> n0ano: the problem is that there was no clear consensus
15:10:45 <bauzas> n0ano: so I made lots of proposals over here
15:10:54 <bauzas> n0ano: now, I'll stick with the proposal
15:11:02 <bauzas> n0ano: there is another side effect to it
15:11:25 <bauzas> johnthetubaguy and I agreed that we should possibly do the fork once that patch gets merged
15:11:32 <n0ano> the PCI issue is a strong argument (to me anyway) so I'd just say nova owning the table is the new consensus and we try and make it work
15:11:43 <bauzas> and then work on Gantt directly
15:11:53 <bauzas> but that requires some code freeze in Nova
15:12:00 <bauzas> johnthetubaguy: you agree ?
15:12:13 <yjiang5> n0ano: if we remove the compute table from nova, we can change PCI for it also.
15:12:35 <bauzas> johnthetubaguy: I mean, we take your scenario, go ahead, do the split, work on Gantt for feature parity
15:12:45 <n0ano> yjiang5, I'd say that's something we do later, after we do the split into gantt
15:12:55 <n0ano> bauzas, +1
15:13:00 <bauzas> johnthetubaguy: that means that Nova will possibly have some code freeze
15:13:10 <bauzas> johnthetubaguy: and *that* is a big turn in mind
15:13:46 <toan-tran> bauzas: +1, who knows how long it takes to finish it
15:14:00 <n0ano> bauzas, not necessarily a code freeze, we just have to backport changes from nova to gantt after the split
15:14:01 <bauzas> because the idea was to do some pre-work on sched-db https://review.openstack.org/89893
15:14:21 <bauzas> but johnthetubaguy got -1 to it
15:14:43 <bauzas> n0ano: my only worry is about the level of backports needed
15:15:02 <bauzas> n0ano: and the idea was to do the steps *before*, to prevent these backports
15:15:36 <n0ano> bauzas, a concern but I think it's doable, the steps we are doing reduce the number of backports needed rather than eliminate them
15:15:39 <bauzas> the main problem is about filtering on aggregates and instances
15:16:04 <bauzas> ok so https://review.openstack.org/82778 is the top prio and then we split
15:16:14 <n0ano> +1
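(For reference, a minimal sketch of the approach agreed above, under the assumption that the ComputeNode table stays in Nova and the scheduler only receives updates forwarded through a thin client library; the names SchedulerClient, FakeReportTransport and update_resource_stats are illustrative, not the actual sched-lib API proposed in review 82778.)

```python
# Minimal sketch only -- SchedulerClient, FakeReportTransport and
# update_resource_stats are illustrative names, not the sched-lib API
# actually proposed in review 82778.

class FakeReportTransport(object):
    """Stand-in for whatever carries updates to the scheduler
    (RPC today, a Gantt REST API after the split)."""
    def update_resource_stats(self, context, key, stats):
        print("scheduler notified for %s: %s" % (key, stats))


class SchedulerClient(object):
    """Facade nova-compute talks to instead of writing scheduler-owned tables."""
    def __init__(self, transport):
        self.transport = transport

    def update_resource_stats(self, context, host, node, stats):
        # The resource tracker has already persisted the ComputeNode row in
        # Nova's own DB; only the derived stats are forwarded to the scheduler,
        # so the scheduler side can be re-pointed at Gantt later without
        # touching the callers.
        self.transport.update_resource_stats(context, (host, node), stats)


if __name__ == "__main__":
    client = SchedulerClient(FakeReportTransport())
    client.update_resource_stats(None, "compute-1", "compute-1",
                                 {"vcpus": 16, "memory_mb": 65536})
```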
15:16:30 <bauzas> n0ano: we need to think about all the steps for setting up a CI, etc.
15:16:35 <toan-tran> n0ano: some of the mechanisms will need to change, like aggregates; these will prevent some nova patches from going into gantt
15:16:36 <bauzas> an API and a client :)
15:17:03 <bauzas> toan-tran: there are some blueprints for porting the aggs stats to sched using extensible RT
15:17:20 <bauzas> toan-tran: that would avoid the scheduler having to call the Nova API for it
15:17:30 <bauzas> toan-tran: which is good
15:17:43 <toan-tran> bauzas: just an example, but thanks for the info :)
15:17:44 <bauzas> toan-tran: but until that, Gantt won't support aggregates filtering
15:18:12 <bauzas> n0ano: still happy with that ?
15:18:30 <toan-tran> my point is since we don't even have all the changes figured out, there are risks that some new patches cannot be backported to gantt
15:18:31 <n0ano> bauzas, works for me, just means we'll have some feature parity work still for gantt
15:18:40 <bauzas> we can possibly vote on it ?
15:18:54 <bauzas> give me the chair on the meeting, I'll arrange a vote
15:19:03 <bauzas> #chair bauzas
15:19:12 <n0ano> #chair bauzas
15:19:13 <openstack> Current chairs: bauzas n0ano
15:19:18 <bauzas> #help vote
15:20:18 <n0ano> do you need anything from me, you have the chair
15:20:20 <bauzas> #vote Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity with nova-scheduler ?
15:20:35 <bauzas> strange
15:20:41 <bauzas> the bot is unhappy
15:21:35 <n0ano> well, +1 from me, no matter what the bot is doing
15:21:59 <toan-tran> +1 for me too
15:22:04 <bauzas> #startvote Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity with nova-scheduler ?
15:22:05 <openstack> Begin voting on: Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity with nova-scheduler ? Valid vote options are Yes, No.
15:22:06 <openstack> Vote using '#vote OPTION'. Only your last vote counts.
15:22:15 <mspreitz> bauzas: you mean Gantt not feature parity at first, but will be later?
15:22:16 <bauzas> dammit, forgot the right tag :)
15:22:20 <bauzas> #undo
15:22:21 <openstack> Removing item from minutes: <ircmeeting.items.Help object at 0x2a27910>
15:22:31 <bauzas> #vote Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity at first with nova-scheduler ?
15:22:34 <n0ano> mspreitz, yes, that is the plan
15:22:48 <bauzas> #startvote Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity at first with nova-scheduler ?
15:22:49 <openstack> Already voting on 'Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity with nova-scheduler '
15:23:00 <bauzas> #endvote
15:23:01 <openstack> Voted on "Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity with nova-scheduler ?" Results are
15:23:12 <bauzas> #startvote Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity at first with nova-scheduler ?
15:23:14 <openstack> Begin voting on: Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity at first with nova-scheduler ? Valid vote options are Yes, No.
15:23:15 <openstack> Vote using '#vote OPTION'. Only your last vote counts.
15:23:19 <bauzas> #vote yes
15:23:22 <yjiang5> #vote No
15:23:24 <n0ano> #vote yes
15:23:24 <toan-tran> #vote yes
15:23:30 <toan-tran> #vote Yes
15:23:57 * bauzas eventually found out how to set up a vote...
15:24:03 <bauzas> mspreitz: ?
15:24:14 <mspreitz> not sure, I  came in late, I think I will abstain
15:24:17 <bauzas> ok
15:24:20 <bauzas> #endvote
15:24:21 <openstack> Voted on "Do we agree to split scheduler code without waiting work on isolate-sched-db, which implies Gantt to not be feature-parity at first with nova-scheduler ?" Results are
15:24:39 <bauzas> awesome...
15:24:54 <bauzas> anyway, we have a majority over here
15:24:58 <yjiang5> n0ano: bauzas: need leave now. talk to you guys later.
15:25:07 <bauzas> sure, thanks yjiang5
15:25:15 <n0ano> bot is weird but my count was 3-1 so yes wins
15:25:25 <yjiang5> bauzas: bye.
15:25:27 <n0ano> yjiang5, later
15:25:42 <n0ano> bauzas, so, you have clear direction for now?
15:25:50 <bauzas> #action bauzas to deliver a new patchset for sched-lib based on keeping ComputeNode in Nova
15:26:12 <bauzas> n0ano: yup
15:26:23 <n0ano> cool, let's move on then
15:26:31 <n0ano> #topic Fair Share scheduler
15:26:31 <bauzas> n0ano: we need to sync up next week to see what to do with the split itself
15:26:41 <n0ano> ericfriz, you still here?
15:26:55 <ericfriz> yes, i'm here!
15:27:11 <LisaZangrando> me too
15:27:15 <n0ano> so, I hope everyone read ericfriz's email about his fair share scheduler idea
15:27:26 <schwicke> yep
15:27:38 <toan-tran> me too
15:27:46 <n0ano> the idea looks interesting, I'm curious is this just a new filter or are you changing the scheduler itself?
15:29:04 <bauzas> ericfriz: could you just summarize your idea ?
15:29:28 <ericfriz> It's not a filter, but it's a change to the scheduler algorithm
15:30:03 <LisaZangrando> please take a look at slide #12
15:30:21 <toan-tran> LisaZangrando: can you provide the link here?
15:30:29 <schwicke> ericfriz: if I got it right, you are using the scheduler from slurm, correct ?
15:30:37 <LisaZangrando> the schema shows the new architecture
15:30:56 <bauzas> #link https://github.com/CloudPadovana/openstack-fairshare-scheduler
15:31:10 <ericfriz> yes, SLURM's Priority MultiFactor
15:31:26 <bauzas> it appears to me that your proposal is really close to what Blazar already does :)
15:31:34 <LisaZangrando> no, the scheduler implements the same scheduling algorithm as SLURM
15:31:53 <bauzas> ie. you have a reservation and the system will handle it
15:33:10 <bauzas> #link https://wiki.openstack.org/wiki/Blazar
15:33:21 <LisaZangrando> which kind of reservation?
15:33:41 <mspreitz> ericfriz: does a user request in your design have a start time and/or end time or duration?
15:33:47 <bauzas> virtual instances or physical compute_node
15:34:25 <mspreitz> LisaZangrando: you mean slide #12 of https://agenda.infn.it/getFile.py/access?contribId=17&sessionId=3&resId=0&materialId=slides&confId=7915 ?
15:35:12 <toan-tran> LisaZangrando: I quickly scanned your document
15:35:32 <toan-tran> and got the feeling that you want to sort users' requests based on priority
15:35:40 <ericfriz> mspreitz: user request has no duration when it's queued
15:35:54 <toan-tran> but users' requests are asynchronous
15:35:55 <jaypipes> #link https://review.openstack.org/#/c/103598/ <-- totally different way of approaching the scheduler. Just for kicks and giggles.
15:36:15 <toan-tran> you're assuming that there are not enough resources for current requests? so that they have to wait in a queue?
15:36:47 <ericfriz> toan-tran: yes, it is.
15:37:18 <toan-tran> ericfriz: as you said, current nova scheduler does handle requests FIFO
15:37:51 <toan-tran> so I think your patch targets the nova-scheduler Manager rather than the nova-scheduler Scheduler  :)
15:39:00 <bauzas> toan-tran: I just thought about Blazar because it seems the whole idea is to say "as a user, I want to start an instance but I want to guarantee that I'll have enough resources for it"
15:39:18 <bauzas> so I'll wait until all the conditions are met
15:39:34 <bauzas> that's what I call a lease
15:39:42 <toan-tran> bauzas: yes, but Blazar focuses on time condition
15:39:51 <bauzas> ie. a strong contract in between the user and the system
15:40:01 <bauzas> toan-tran: not exactly
15:40:03 <toan-tran> here they're talking about priority, who gets the resources first
15:40:31 * johnthetubaguy is free to talk if its useful later in the meeting
15:40:35 <bauzas> toan-tran: the Blazar lease is about granting resources for a certain amount of time
15:40:49 <bauzas> johnthetubaguy: nah we agreed on your approach
15:41:03 <bauzas> johnthetubaguy: I'll do a new patchset tomorrow so you'll review it
15:41:06 <johnthetubaguy> bauzas: OK
15:41:09 <johnthetubaguy> thanks
15:41:27 <LisaZangrando> briefly: every user request is assigned a priority value calculated by considering the share allocated to the user by the administrator and the effective resource usage consumed in the recent past. All requests are inserted in a priority queue and processed in parallel by a configurable pool of workers without violating the priority order.
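(As a rough illustration of the priority scheme Lisa describes: the actual FairShareScheduler follows SLURM's multifactor priority plugin, whereas the 2**(-usage/share) factor form and the numbers below are assumptions for illustration only.)

```python
# Simplified sketch of a fair-share priority queue, not the FairShareScheduler
# code itself: priority comes from the administrator-assigned share and recent
# usage, and requests are pulled from the queue by a pool of workers.

import heapq
import itertools

def fairshare_factor(normalized_share, normalized_usage):
    # Users who consumed less than their share recently get a factor near 1;
    # heavy recent consumers decay toward 0 (the shape SLURM's plugin uses).
    if normalized_share <= 0.0:
        return 0.0
    return 2.0 ** (-normalized_usage / normalized_share)

class FairShareQueue(object):
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # FIFO tie-break among equal priorities

    def push(self, request, share, recent_usage):
        priority = fairshare_factor(share, recent_usage)
        # heapq is a min-heap, so negate to pop the highest priority first.
        heapq.heappush(self._heap, (-priority, next(self._seq), request))

    def pop(self):
        _, _, request = heapq.heappop(self._heap)
        return request

if __name__ == "__main__":
    q = FairShareQueue()
    q.push("req-from-light-user", share=0.3, recent_usage=0.05)
    q.push("req-from-heavy-user", share=0.3, recent_usage=0.60)
    print(q.pop())  # the light user's request comes out first
```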
15:41:39 <schwicke> I think the proposal is useful in situations where resources are limited and where the provider has an interest in getting its resources used all the time.
15:42:11 <schwicke> not being an expert on blazar but to me it seems to address a different use case
15:43:16 <bauzas> LisaZangrando: by saying "shares", you mean quotas ?
15:43:19 <toan-tran> schwicke: well, if the user does not care much about time constraints, then yes, you're right
15:43:48 <LisaZangrando> bauzas: yes
15:43:50 <schwicke> toan-tran: certainly correct
15:44:11 * n0ano wishes bauzas would quit stealing my questions :-)
15:44:22 <LisaZangrando> bauzas: yes, share in batch system terminology
15:44:36 <n0ano> LisaZangrando, then what is a quota, CPU usage, mem usage, disk usage?
15:44:58 <toan-tran> n0ano: quicker next time! otherwise you'll lose
15:45:23 <n0ano> toan-tran, age is slowing down my fingers
15:45:24 <schwicke> n0ano: quotas on those are ceilings.
15:45:43 <schwicke> They define the maximum of what a user can have I'd say, right, Lisa ?
15:46:03 <bauzas> schwicke: that's what we call quotas in OpenStack :)
15:46:09 <schwicke> :)
15:46:15 <mspreitz> I'm a little confused here, it looks like the FairShare design is about a priority queue to hand out things sooner or later, not limit usage
15:46:24 <bauzas> mspreitz: +1
15:46:27 <toan-tran> mspreitz: +1
15:46:37 <LisaZangrando> it is a % of resources assigned to a project/user
15:46:40 <bauzas> and the "sooner or later" sounds familiar to me...
15:46:42 <mspreitz> so it's about "share" not "quota"
15:46:59 <schwicke> yes
15:47:03 <toan-tran> please correct me if I'm wrong
15:47:23 <toan-tran> FairShare targets a situation in which there are not enough resources for everybody
15:47:27 <n0ano> LisaZangrando, the implication is that your changes will affect more than just the scheduler, you have to set up a mechanism for specifying and allocating these shares
15:47:38 <toan-tran> so the scheduler has to decide who to give resources to, and how much
15:47:59 <toan-tran> nothing to do with quota  , right ?
15:48:15 <johnthetubaguy> is there a good description of the use cases for the fairshare scheduler anywhere?
15:48:34 <bauzas> johnthetubaguy: https://agenda.infn.it/getFile.py/access?contribId=17&sessionId=3&resId=0&materialId=slides&confId=7915
15:48:41 <LisaZangrando> toan-tran: correct
15:49:11 <mspreitz> bauzas: those slides do not have use case in them
15:49:38 <mspreitz> bauzas: my mistake
15:49:48 <mspreitz> there is text about use case, it is pretty generic
15:50:00 <mspreitz> page 16 and 17
15:50:02 <johnthetubaguy> mspreitz: +1
15:50:10 <johnthetubaguy> I don't see what problem it is trying to solve
15:50:25 <johnthetubaguy> I am sure there is one, I just don't see it right now
15:50:51 <mspreitz> page 16 and 17 are really statements of technology goals, not illustrations of usage
15:51:17 <ericfriz> The FairShareScheduler is used in our OpenStack installation, named "Cloud Padovana"
15:51:31 <johnthetubaguy> it talks about queuing the requests of users, there is generally never a backlog, so it doesn't matter about the order, you just place things when you get the request, so there must be something bigger that is required here
15:51:41 <bauzas> so, could we ask you to provide some information for next week?
15:52:02 <johnthetubaguy> ericfriz: so are you trying to share resources between users, and evict people who are using too many resources?
15:52:16 <n0ano> looks like a clear definition of the use cases/problems you are solving would be nice to have
15:53:04 <ericfriz> johnthetubaguy: yes, that is a use case
15:53:14 <mspreitz> ericfriz: tell us about the people using Cloud Padovana and what problems they would have if your solution were not in place
15:53:40 <johnthetubaguy> ericfriz: OK, it's quite a different "contract" with users from what nova offers, so we need a nice description of that ideally
15:53:45 <mspreitz> ericfriz: I did not notice anything about eviction
15:54:04 <n0ano> mspreitz, +1
15:54:58 <ericfriz> mspreitz: scientific teams. When there are no more resources, the user requests fail.
15:55:02 <johnthetubaguy> I am assuming you want someone to have 10% of all resources, should they request them, and others can use that space if they are not using them, and so it boils down to "spot instance" style things, but I don't really see that described anywhere
15:55:50 <bauzas> I'm also thinking about something mentioned in the thread, deferred booting
15:56:30 <mspreitz> ericfriz: what is the nature of these scientific jobs?  Can they tolerate allocation of some VMs now, a few more later, and a few more later?
15:56:31 <johnthetubaguy> ericfriz: it probably seems obvious with a grid computing hat, but doesn't fit too well into nova right now
15:56:45 <bauzas> you would probably require the Keystone trusts mechanism, hence my idea about blazar
15:57:05 <bauzas> and what johnthetubaguy said still makes me think about Blazar...
15:57:25 <LisaZangrando> The "illusion" of having unlimited resources ready to be used and always available is one of the key concepts underlying the Cloud paradigm. OpenStack refuses further requests if the resources are not available.
15:57:25 <bauzas> Blazar == Climate, for the records
15:58:18 <bauzas> LisaZangrando: so you want to guarantee them on a best-effort basis ?
15:58:49 <LisaZangrando> we want to guarantee all requests are processed
15:58:49 <mspreitz> LisaZangrando: Does this illusion include maybe being spontaneously evicted?
15:58:55 <johnthetubaguy> LisaZangrando: well, sure, but we could make things a bit more "griddy" for highly utilised clouds, it's just going to involve introducing a new type of flavor, like "spot instances": extra instances above your quota, but they could get killed at any point
15:59:13 <n0ano> bauzas, more like don't ever fail a request, just pend it until it can be satisfied
15:59:17 <bauzas> we're running out of time, we need to conclude
15:59:27 <ericfriz> Blazar uses time condition, FairshareScheduler has no time condition for extracting the user requests from the queue.
15:59:42 * n0ano refers back to his last comment about stealing questions
15:59:57 <bauzas> but Blazar implements some best-effort mode where you define your contract :)
15:59:59 <n0ano> indeed, it's the top of the hour so we'll have to conclude
16:00:03 <toan-tran> LisaZangrando: as a public cloud provider ( <== Cloudwatt ), I can tell you that we're trying our best to not be in the situation of needing FS
16:00:15 <toan-tran> but I can see its value
16:00:31 <schwicke> maybe as an action item a compilation of use cases would be useful I think; circulate that and then review it later?
16:00:39 <n0ano> I want to thank everyone, good discussion, I'd like to continue the fair share discussion next week, hopefully we've given you guys stuff to think about
16:00:41 <bauzas> schwicke: +1
16:00:58 <schwicke> can that be action-itemed
16:01:04 <n0ano> #action schwicke to come up with clear use cases for next weeks meeting
16:01:11 <toan-tran> so you should come up with a good use case, and be careful not to lean too much on the grid philosophy
16:01:11 <bauzas> awesome
16:01:16 <n0ano> tnx everyone
16:01:20 <n0ano> #endmeeting