15:00:05 <bauzas> #startmeeting gantt
15:00:06 <openstack> Meeting started Tue Aug 19 15:00:05 2014 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:09 <openstack> The meeting name has been set to 'gantt'
15:00:27 <bauzas> hi, who's here for discussing nova scheduler efforts ?
15:00:36 <mspreitz> me
15:01:05 <bauzas> n0ano is having a meeting conflict so I'll be chairing today
15:02:42 <bauzas> ok, still waiting one more min
15:04:01 <jaypipes> o/
15:04:05 <jaypipes> hi folks
15:04:07 <jaypipes> sorry late
15:04:20 <bauzas> no problem, we haven't yet started
15:04:34 <bauzas> only mspreitz and me seem to be present today :)
15:04:44 <mspreitz> and Jay
15:04:57 <bauzas> was talking to jaypipes :)
15:05:03 <bauzas> ok, I guess we can start then
15:05:07 * jaypipes asked ndipanov to hop in here.
15:05:11 <bauzas> little agenda, but still :)
15:05:20 <bauzas> jaypipes: cool thanks
15:05:23 * ndipanov lands
15:05:24 <jaypipes> is PaulMurray still on holiday?
15:05:32 <jaypipes> and where's Mr. Moustache?
15:05:33 <bauzas> jaypipes: seems so
15:05:46 <bauzas> jaypipes: my sources tell me that Paul is somewhere in France
15:05:53 <jaypipes> darn Europeans with all your vacation :P
15:05:58 <ndipanov> haha
15:06:14 <bauzas> jaypipes: so I suspect that he liked the country so much that he won't be there for years
15:06:25 <jaypipes> hehe
15:06:33 <bauzas> n0ano has administrative tasks IIUC
15:06:34 <jaypipes> well, he'll be there in November :)
15:06:49 <bauzas> anyway, let's start
15:06:49 <Yathi> hi
15:06:54 <bauzas> Yathi: \o
15:07:12 <bauzas> #topic Forklift Status
15:07:23 <bauzas> so much fun here
15:07:36 <bauzas> so, basically, a quick status, as usual
15:08:13 <bauzas> https://review.openstack.org/82778 and https://review.openstack.org/104556 are identified as priorities for J-3 reviews
15:08:26 <bauzas> still waiting approvals tho
15:08:38 <bauzas> both of them are related to bp/scheduler-lib stuff
15:09:03 <bauzas> another bp is on-going
15:09:14 <bauzas> with the spec to be validated
15:09:22 <bauzas> https://review.openstack.org/89893
15:09:59 <bauzas> changes have been proposed, https://review.openstack.org/#/q/status:open+topic:bp/isolate-scheduler-db,n,z
15:10:16 <jaypipes> bauzas: I have reservations about some of the code in those two patches. will review with comments today.
15:10:24 <bauzas> jaypipes: sure, please do
15:10:49 <bauzas> about isolate-scheduler-db, the main concern is about its usage of ERT (Extensible Resource Tracker)
15:10:51 <ndipanov> I do as well but mine are well known :)
15:11:00 <bauzas> ndipanov: :)
15:11:11 <ndipanov> although not sure it's the 2 patches I am referring to
15:11:13 <jaypipes> bauzas: my concerns also revolve around ERT.
15:11:34 <jaypipes> bauzas: actually, let me restate...
15:11:42 <bauzas> ndipanov: it's not the same bp
15:12:00 <ndipanov> then nothing... carry on
15:12:00 <bauzas> ndipanov: the former two are creating a new client
15:12:09 <ndipanov> haven't looked at those
15:12:13 <bauzas> jaypipes: sure, please do
15:12:41 <jaypipes> bauzas: my concerns around the isolate scheduler DB patches are that the fundamentals of the API -- the API structure and the parameters passed between conductor/api and the scheduler -- need to be cleaned up before creating a client lib. And the ERT stuff made the interfaces worse than they already were.
15:12:50 <bauzas> I opened a thread in -dev ML for discussions also http://lists.openstack.org/pipermail/openstack-dev/2014-August/043466.html
15:13:55 <bauzas> jaypipes: by creating client lib, you also refer to bp/scheduler-lib ?
15:14:01 <bauzas> s/creating/saying
15:14:13 <ndipanov> I could not possibly agree more with jaypipes
15:14:23 <jaypipes> yes, I will try to reply to the ML. I am stepping on some toes with my comments, though, and am treading a tightrope between comments and antagonism.
15:14:47 <jaypipes> bauzas: creating client lib == https://review.openstack.org/#/c/82778/
15:14:54 <bauzas> sooooo
15:15:10 <bauzas> sounds like we opened Pandora's box
15:15:23 <jaypipes> bauzas: well, technically ERT opened it. :P
15:15:39 <ndipanov> jaypipes, I really don't want to be antagonistic either - but even if we stall for one cycle and get things right(er) based on real feedback - it's a net win for gantt imho
15:16:06 <jaypipes> ndipanov: I agree (that's actually been what I said repeatedly in Oregon as well).
15:16:18 <bauzas> jaypipes: hence the plan we discussed
15:16:30 <bauzas> jaypipes: I mean, we identified some work to do
15:16:35 <jaypipes> bauzas: yes, agreed.
15:17:00 <jaypipes> bauzas: for instance, I am 100% supportive of the removal of the direct DB and objects calls from the nova/scheduler/ code
15:17:14 <bauzas> btw. nice blogpost from mikal here http://www.stillhq.com/openstack/juno/000012.html summarizing what was discussed about the scheduler at the nova meetup
15:17:34 <bauzas> jaypipes: glad to hear I have sponsors :)
15:17:40 <jaypipes> bauzas: the issue I have is that the calling structures for the API calls (currently internal, but will become external once the split goes forward) are awkward and not future-proof.
15:18:04 <bauzas> that's why we identified the need to iterate on that
15:18:21 <bauzas> but if we take the strategy, the work for Kilo is about creating a python lib
15:18:23 <jaypipes> bauzas: I believe the ERT work is an "anti-iteration" of that, though.
15:18:38 <bauzas> so that means that external API will be very loosy
15:19:00 <mspreitz> you mean lossy, bad?
15:19:12 <bauzas> mspreitz: uh, my bad
15:19:26 <bauzas> mspreitz: I mean, that will be very lightweight
15:20:05 <bauzas> so, my concerns are about the alternatives
15:20:10 <bauzas> I'm not fully pro-ERT :)
15:20:43 <bauzas> jaypipes: ndipanov: what are your thoughts on what should be done first ? (I'm not saying "rewrite RT and deliver it in Scheduler" :D )
15:20:53 <ndipanov> well
15:20:57 <ndipanov> this is how I see it
15:21:07 <ndipanov> if we are going to stick with "optimistic scheduling"
15:21:08 <bauzas> provided we identified that scheduler needs to have a clear way to get info from other nova bits
15:21:10 <ndipanov> and even if we are not
15:21:22 <bauzas> ndipanov: could you please link again your paper ?
15:21:39 <bauzas> ndipanov: it was pretty interesting, btw. :)
15:21:51 <ndipanov> we need a way to agree on what data goes to the scheduler, and after that to the compute nodes
15:21:59 <bauzas> ndipanov: agreed
15:22:01 <ndipanov> and what data goes from compute nodes to the scheduler
15:22:22 <bauzas> ndipanov: I think the statement of what's required in Nova filters has been done in the sped
15:22:25 <bauzas> spec
15:22:47 <ndipanov> and we need to make sure that this data can be retrieved in an efficient manner
15:22:56 <bauzas> ndipanov: agreed too
15:23:06 <bauzas> ndipanov: lemme give you all the spec rst file
15:23:11 <ndipanov> k
15:23:16 <bauzas> ndipanov: so you'll see all deps
15:23:28 <ndipanov> once you go down the road of agreeing on data
15:23:36 <bauzas> (at least the ones I identified, I'm not bugproof :) )
15:23:54 <ndipanov> I think you will see that the RT will likely need to live in the scheduler, even though it will be called from the computes
15:23:54 <Yathi> have you considered keeping the data completely outside of the scheduler, as an external db service?
15:23:56 <bauzas> https://review.openstack.org/#/c/89893/11/specs/juno/isolate-scheduler-db.rst
15:24:00 <ndipanov> in case we do optimistic scheduling
15:24:19 <ndipanov> which is what we do now - no locking in the sched, but may retry on computes
15:24:25 <bauzas> ndipanov: I was seeing RT as a client for updating the Scheduler
15:24:52 <bauzas> ie. RT and Scheduler need to have same view
15:25:03 <bauzas> and RT pushes updates to Scheduler
15:25:19 <bauzas> so, even if Scheduler goes stale, it goes back to RT for claiming with the correct values
15:25:57 <bauzas> is johnthetubaguy around ?
15:26:08 <ndipanov> what jaypipes was proposing with one of his POCs is to not do claims and retries on the host
15:26:20 <bauzas> ndipanov: yeah, I know
15:26:49 <bauzas> ndipanov: I was just mentioning another approach which was to keep claims (and RT) in Compute
15:27:00 <johnthetubaguy> bauzas: I am semi-available but behind on my email
15:27:24 <jaypipes> ndipanov: well, it was proposing to do a final retry/check on the host, but do claims (and return those claims over the Scheduler API) in the scheduler itself.
15:27:29 <bauzas> johnthetubaguy: cool, we're just debating how the RT/Scheduler relationship fits together
15:27:46 <ndipanov> jaypipes, even better
15:27:59 <ndipanov> then this would totally be an implementation detail of the resource tracker
15:28:00 <jaypipes> ndipanov: so we do a tight loop on the scheduler side, with optimistic locking on the compute node resources, and then just do a retry/exception logic on the compute node itself.
15:28:46 <johnthetubaguy> jaypipes: this sounds like what we badly called two-phase commit before, or did I misunderstand the proposal?
15:29:10 <johnthetubaguy> jaypipes: ah, your loop is inside the scheduler, not in the conductor
15:29:13 <bauzas> #link http://eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
15:29:18 <jaypipes> johnthetubaguy: no, no 2-phase commit at all.
15:29:35 <ndipanov> jaypipes, do link that patch here :)
15:29:37 <jaypipes> johnthetubaguy: right. the retry claim loop is entirely in the scheduler.
15:29:41 <johnthetubaguy> jaypipes: honestly, what we called it wasn't two-phase commit either
15:29:52 <jaypipes> https://review.openstack.org/#/c/103598/
15:30:00 <jaypipes> johnthetubaguy: yes, understood :)
15:30:38 <jaypipes> anyway, the above PoC code was just that.. for demo purposes. It includes a bunch of code that shows how to model resources properly without ERT too.
15:31:13 <ndipanov> ah I even starred it
15:31:50 <jaypipes> it really should be broken down into two parts:
15:32:21 <jaypipes> a) changing  the scheduler APIs to use resource models and a real class (no nested dicts) for modeling requested resources, launch policies, and conditions
15:32:32 <jaypipes> b) having the scheduler do the claim process, not the compute node
15:33:19 <ndipanov> the way I see it - you guys need to do a) and b) can come later
15:33:30 <ndipanov> but a) is something that needs to be done or we will regret it
15:33:36 <johnthetubaguy> jaypipes: one alternative is to move all claims to the conductor, then see how they fit into the scheduler, where possible? but maybe that's more work than we need
15:33:46 <jaypipes> right, that's what I've been saying. and doing a) after a split is gonna just be painful
15:33:56 <johnthetubaguy> jaypipes: just thinking of resize claims vs boot claims, but maybe that split is silly
15:34:24 <ndipanov> johnthetubaguy, you need all the data that the sched had in order to do either one
15:34:30 <jaypipes> yeah
15:34:32 <johnthetubaguy> yeah, ignore me I think, straight to scheduler makes more sense, it's a given slot you are trying to reserve either way around
15:35:43 <bauzas> jaypipes: correct me if I'm wrong, but a) is just stopping sending blobs ?
15:35:49 <jaypipes> well, look, what I've been saying is that we need to get these resource models and resource/launch request models done first, then work on claim stuff. The problem with ERT is that it throws away good resource modeling in favor of yet more nested dicts of stuff.
15:35:53 <johnthetubaguy> ndipanov: so in the back of my head, it's keeping the single scheduler running fast, so giving it less work to do, but really we need to make multiple schedulers work, and frankly I had bet on this claim process on the compute being the locking mechanism to fix that, so this fits those two things together nicely
15:37:07 <ndipanov> jaypipes, and also does not provide any way to go from: this was requested by the user-> this is the data we all see
15:37:14 <jaypipes> yup
15:37:16 <johnthetubaguy> jaypipes: so I see the current ERT as a small step towards refactoring the existing code, not a finished thing, we need to split up the big blob of code into smaller chunks where it is clear what you do to add new resources, and agreed we need something better than random dicts
15:37:28 <bauzas> johnthetubaguy: +1
15:37:34 <jaypipes> johnthetubaguy: sorry, I disagree. I see it as a step backwards.
15:37:39 <ndipanov> johnthetubaguy, as for the claims being on compute - I don't think it's a bad design
15:38:24 <ndipanov> but what is bad design - is that there is no clear way to do the same thing in sched and in the claim
15:38:27 <bauzas> jaypipes: I mean, I can understand that nested dicts are evil, but why not consider ERT updating resource models ?
15:38:30 <mspreitz> Those of us who want to make joint decisions will need a way to make joint claims
15:38:36 <ndipanov> without doing select * from join...join...
15:39:00 <bauzas> ndipanov: hence the idea that RT and Scheduler should have the same model
15:39:05 <jaypipes> ndipanov: I do. the placement engine needs to have a holistic view of the system's resources, and having claims handled on the compute node means the placement engine has out-of-date info and cannot make quick decisions (must rely on retry exceptions being raised from the compute)
15:39:33 <johnthetubaguy> jaypipes: I think I agree about your issues with the interface, I just saw that as something we need to improve and evolve into a strictly versioned system, there are certainly safer ways down that path for the same kind of code split
15:39:37 <jaypipes> bauzas: because the thing that is extensible about ERT does not need to be extensible? :)
15:40:21 <ndipanov> jaypipes, that is fair but that is a design decision, and one we can walk away from for a better design (fewer trade-offs) - and orthogonal to the idea of data modeling (and querying)
15:40:23 <jaypipes> bauzas: resources don't need to be extensible. they need to be properly modeled.
15:40:23 <bauzas> jaypipes: extensible is just another word for on-demand
15:40:29 <jaypipes> bauzas: no...
15:40:42 <johnthetubaguy> jaypipes: I want to be able to reduce the reported resources to a bare minimum for my filters, but let's not try to go there
15:40:51 <jaypipes> bauzas: extensible, in the case of ERT, means resources are classes that are loaded as plugins in stevedore, and that is totally useless IMO
15:41:28 <jaypipes> bauzas: instead, we need to properly model resources that we know are used in Nova: cpus, memory, NUMA placement, disk, etc.
15:41:41 <bauzas> and aggregates, flavors, instances...
15:42:05 <jaypipes> bauzas: and making those resources "plugins" does not make those things suddenly proper models. in fact, it makes it even more loosely defined and non-standardized/inconsistently-applied
15:42:35 <bauzas> jaypipes: ok, let's ban the word "plugins" and replace it with "classes"
15:42:38 <johnthetubaguy> jaypipes: yeah, I wish that bit wasn't implemented already, it's the ability to reduce the traffic to a minimum that I would like, and perhaps only reporting the deltas does that anyway, but maybe let's not go there right now, I love the claims discussion
15:42:56 <mspreitz> NUMA placement is not a resource... it is the observation that you can not factor a node's resources into orthogonal sets
15:43:11 <mspreitz> orthogonal dimensions
15:43:19 <dansmith> johnthetubaguy: I'm not really following this closely, but I too worry about it being *more* work to go from ERT to versioned/stable data than it would be from what we had before
15:43:27 <jaypipes> mspreitz: no, that is not correct. if an instance consumes a certain socket/core/thread, it is consumed as a whole.
15:44:12 <johnthetubaguy> dansmith: yeah, thats a good point, it seemed easier in my head, but happy to bow to the consensus on that
15:44:36 <ndipanov> mspreitz, not sure I follow...
15:44:38 <bauzas> so should we consider making use of what we already version ?
15:44:49 <bauzas> ie. update Scheduler with objects ?
15:44:53 <dansmith> johnthetubaguy: I went from "on board" when it used objects, to "meh" when it didn't, and made the slow slide to -1 over the course of watching it
15:45:05 <mspreitz> never mind, I was thinking of a more general kind of NUMA, I guess
15:45:49 <jaypipes> bauzas: yes, definitely, but there's a number of things that are not objects -- for example, a "LaunchRequest" and a "Resource" (and subclasses) are not objects yet.
15:45:49 <ndipanov> in the sense where it defines access times/bandwidth that you could then somehow schedule on?
15:45:56 <ndipanov> mspreitz, ^
15:46:18 <bauzas> jaypipes: what do you mean by LaunchRequest ?
15:46:28 <bauzas> and a ComputeNode is a resource
15:46:34 <ndipanov> bauzas, I assume all the data we need but don't have in a single place right now
15:46:39 <jaypipes> bauzas: the thing currently called "request_spec" in the scheduler APIs.
15:46:45 <ndipanov> like filter_specs and request_spec
15:46:48 <ndipanov> yes that
15:46:50 <jaypipes> bauzas: but made into a real class, not a random set of nested dicts.
15:46:59 <ndipanov> jaypipes, +1000
15:47:05 <johnthetubaguy> jaypipes: +1
15:47:12 <bauzas> jaypipes: I was expecting/fearing this answer...
15:47:16 <ndipanov> ok +1
15:47:18 <jaypipes> https://review.openstack.org/#/c/103598/4/nova/placement/__init__.py <-- see PlacementRequest class.
15:47:39 <mspreitz> ndipanov: I was thinking of NUMA as referring to non-uniform access to main memory; I think the discussion here is focusing only on cache, which is bound to core
15:47:40 <jaypipes> bauzas: don't fear the reaper.
15:47:49 <dansmith> lol
15:47:53 <bauzas> ok, time is running fast (for the reaper too)
15:48:06 <jaypipes> bauzas: don't fear the reaper.
15:48:09 <jaypipes> gah...
15:48:16 <jaypipes> up key
15:48:21 <bauzas> we need to somewhat conclude on that topic, even if the last topic is open discussion
15:48:31 <bauzas> so
15:48:39 <bauzas> wrt what has been discussed
15:48:40 <mspreitz> needs more cowbells
15:49:01 <ndipanov> bauzas, that's what I've been trying to say all along  - for me without this (modeling data first) - we are just postponing the pain
15:49:08 <jaypipes> mspreitz: ++ :)
15:49:19 <bauzas> jaypipes: do you agree to find some time to discuss a real *change* with me ?
15:49:23 <johnthetubaguy> bauzas: jaypipes: I like the retry loop for gaining claims living inside the scheduler, for the record
15:49:28 <bauzas> jaypipes: of course, it will deserve a spec...
15:49:39 <jaypipes> bauzas: absolutely. that's why I keep showing up here :)
15:49:41 <mspreitz> I do have one thing for opens, or ML
15:49:51 <bauzas> ok so
15:50:08 <jaypipes> bauzas: I just think ERT makes it harder to get to where we need to be.
15:50:12 <mspreitz> I'd like to tighten up the arguments around smart or solver scheduler
15:50:30 <bauzas> #action bauzas and jaypipes to propose a resource model for scheduler
15:50:35 <bauzas> bing.
15:50:40 <johnthetubaguy> jaypipes: dansmith: I guess that's the point of contention, is ERT a step backwards or forwards
15:50:52 * ndipanov supports that and will even help if it's after thursday
15:51:04 <jaypipes> ++
15:51:18 <bauzas> ok, happy us, we have an action
15:51:21 <jaypipes> johnthetubaguy: ndipanov and I are pretty strong on the backwards side.
15:51:42 <bauzas> I'm proposing to discuss on ERT when PaulMurray is back
15:51:54 <ndipanov> yeah - I think it opens us up to more pain later for skimping on proper design now
15:52:00 <dansmith> yep
15:52:01 <bauzas> at least let's not revert his change until he can discuss it
15:52:03 <mspreitz> My opinion is that the central logic can be very generic: for each resource you have capacity and demands.  Could be handled with dicts.
15:52:17 <jaypipes> bauzas: yes, I agree completely with waiting for Paul to be back.
15:52:19 <johnthetubaguy> jaypipes: dansmith: right, I kinda thought it was a baby step forward, albeit with some unfortunate baggage, but happy to go with the majority on this, I agree there are other much easier routes forward, just this one had effort on it already
15:52:32 <dansmith> johnthetubaguy: cool
15:52:45 <bauzas> ... well, at least until Frogs free an Englishman from jail
15:53:00 <bauzas> ok, next topic so
15:53:03 <bauzas> 7 mins left
15:53:07 <bauzas> #topic open discussion
15:53:11 <bauzas> mspreitz: ?
15:53:15 <mspreitz> ok...
15:53:44 <mspreitz> I wonder if we can separate the issues of more sophisticated placement criteria from the issue of simultaneous vs. sequential
15:54:05 <mspreitz> I also wonder if Yathi has a runtime complexity argument in favor of simultaneous
15:54:13 <ndipanov> bauzas, as for the paper that I linked to you the other day just google "google omega paper"
15:54:29 <bauzas> ndipanov: I gave the link in that discussion
15:54:30 <bauzas> :)
15:54:36 <bauzas> ndipanov: see ^
15:54:40 <mspreitz> That is, I think we can do sophisticated placement criteria with scheduler hints, if we are willing to accept sequential solving.
15:55:01 <mspreitz> The argument about simultaneous vs. sequential solving is a possibly separable thing
15:55:23 <mspreitz> Yathi: are you still here?
15:55:24 <ndipanov> mspreitz, sequential as in - we have a queue and some kind of a lock on all resources?
15:55:26 <bauzas> mspreitz: I think that's a good question which deserves rediscussing the Solver Scheduler bp
15:55:42 <Yathi> hi yes
15:55:55 <mspreitz> by sequential I mean how we do it now, with no attempt to gather a bunch of things together for a joint placement decision
15:56:03 <johnthetubaguy> ndipanov: hmm, interesting, thanks
15:56:34 <Yathi> simultaneous gives you a way to keep a unified view, which could be lost when done sequentially
15:56:38 <ndipanov> johnthetubaguy, of course not all of that applies here - but it does make a nice taxonomy of different sched designs and their tradeoffs
15:56:45 <mspreitz> I want to be precise about the loss
15:56:58 <mspreitz> one loss is this: you risk picking a poorer solution
15:57:05 <bauzas> Yathi: simultaneous would possibly require a locking mechanism
15:57:06 <johnthetubaguy> ndipanov: right, that's always handy, shared terminology
15:57:07 <mspreitz> another possible loss is this: you spend more time solving
15:57:15 <mspreitz> I want to understand if the second is so
15:57:22 <ndipanov> I especially liked the 2 level approach where you have the resource master and the schedulers that see a subset of resources
15:57:44 <ndipanov> that the master lets them see
15:57:45 <johnthetubaguy> mspreitz: once you do additions and subtractions, you end up having to do both I guess, so maybe we do sequential first?
15:58:08 <johnthetubaguy> ndipanov: that's what cells does today (albeit badly)
15:58:19 <jaypipes> johnthetubaguy: correct.
15:58:31 <jaypipes> johnthetubaguy: sharding vs. unified view.
15:58:31 <Yathi> anything that will not result in a loss can be handled sequentially, I agree
15:58:36 <mspreitz> johnthetubaguy: I could easily see a roadmap that starts with more sophisticated placement criteria and switches from sequential to simultaneous later
15:58:54 <Yathi> but these will not increase solving time when done simultaneously either
15:58:57 <bauzas> mspreitz: IMHO, simultaneous needs to be covered outside of Nova
15:59:23 <bauzas> mspreitz: because I see it as something asking for a "lease", and I think you know what I mean
15:59:28 <mspreitz> I am kinda lost in the cross conversation
15:59:33 <mspreitz> can we pls follow up in the ML?
15:59:41 <bauzas> mspreitz: sure, open a thread
15:59:51 <mspreitz> ok
15:59:51 <Yathi> sure
15:59:53 <jaypipes> ++
16:00:07 <bauzas> folks, thanks for your help, much appreciated
16:00:12 <jaypipes> ty bauzas :)
16:00:12 <ndipanov> bauzas, np
16:00:25 <bauzas> see you next week, and jaypipes don't plan to take vacations soon :)
16:00:26 <ndipanov> thank you for caring about this
16:00:32 <johnthetubaguy> jaypipes: the cells sharding is more useful for "soft" people reasons like isolating infrastructure into like-typed failure zones as you add capacity, so you can spot failure patterns more easily, etc
16:00:37 <bauzas> bye all
16:00:39 <bauzas> #endmeeting