20:02:09 <ttx> #startmeeting tc
20:02:10 <openstack> Meeting started Tue Sep 10 20:02:09 2013 UTC and is due to finish in 60 minutes.  The chair is ttx. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:02:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:02:12 <markwash> o/
20:02:13 <mordred> o/
20:02:14 <openstack> The meeting name has been set to 'tc'
20:02:17 <annegentle> o/
20:02:17 <ttx> Our agenda:
20:02:25 <ttx> #link https://wiki.openstack.org/wiki/Governance/TechnicalCommittee
20:02:38 <ttx> #topic Savanna incubation request: initial discussion
20:02:44 <ttx> #link http://lists.openstack.org/pipermail/openstack-dev/2013-September/014623.html
20:02:50 <ttx> #link https://wiki.openstack.org/wiki/Savanna/Incubation
20:02:59 <ttx> SergeyLukjanov: hi!
20:03:13 <SergeyLukjanov> ttx, howdy!
20:03:19 <mikal> Hi
20:03:27 <akuznetsov> hi
20:03:30 <ttx> So we usually consider incubation over two meetings. The first week is an initial discussion so that the main issues can emerge
20:03:46 <ttx> and at the second one we usually conclude that discussion and vote
20:04:06 <ttx> So this week is mostly about Q&A
20:04:22 <ttx> Personally I had a question about the scope. Savanna is a single project but proposes two very different services: cluster and data operations
20:04:39 <ttx> That sounds like two very separate use cases to me
20:04:47 <hub_cap> yes the former seems very similar to trove mission
20:04:54 <ttx> So far we had provisioning stuff like Trove
20:05:04 <ttx> and data-oriented stuff like Marconi
20:05:06 <mikal> I would like to hear more about plans for heat as well
20:05:17 <zaneb> mikal: ++
20:05:22 <ttx> but no project that would handle both at the same time
20:05:22 <vishy> o/
20:05:33 <SergeyLukjanov> let's start from the question about cluster and data ops
20:05:37 <ttx> could you explain the value of bundling those two activities in the same project ?
20:05:52 <SergeyLukjanov> all data ops are built around the Hadoop eco
20:06:34 <SergeyLukjanov> and the Hadoop cluster provisioning process is very complex, additionally, we need not only Hadoop cluster but a lot of other tools that work on Hadoop
20:06:36 <jd__> i.e. relying on Heat?
20:06:49 <ErikB> Specifically, Savanna is focused around enabling setup, provisioning, configuration and deployment of Hadoop on OpenStack and related data operations that would be executed on Hadoop
20:07:22 <hub_cap> yes but trove is focused on enabling setup, prov, config and deployment of X on OpenStack
20:07:28 <hub_cap> where X is a datastore
20:07:37 <akuznetsov> Savanna main goal is the elastic data processing, but without cluster operation it is unreachable. So the first step was creation of stuff for cluster operation
20:07:43 <ttx> sounds like a product definition more than a project definition... how much code would be shared between those two types of ops ?
20:07:54 <SergeyLukjanov> hub_cap, we're not targeting Savanna as data store provider
20:08:19 <ttx> (not saying you don't need both, just wondering how much sense it makes as a single project)
20:08:34 <SergeyLukjanov> ttx, in fact we want to provide data ops using Hadoop eco, but we need to provision complex cluster to do it
20:08:40 <hub_cap> im not sure what you mean by data store provider? youre targetting a "spin up a hadoop ecosystem", yes?
20:08:41 <russellb> another way to look at it, is that the cluster part could be considered part of the OpenStack Deployment program
20:08:51 <mordred> hub_cap: hadoop isn't storage, it's processing
20:09:02 <SergeyLukjanov> mordred, yep, absolutely
20:09:07 <ttx> russellb: or some kind of trove-like project that would leverage heat
20:09:08 <mattf> SergeyLukjanov, ErikB, a little background of Hadoop might help to level the understanding
20:09:09 <rnirmal> actually it's both..
20:09:11 <russellb> ttx: yes
20:09:23 <hub_cap> im pretty sure its more than just processing
20:09:24 <vishy> ok so the problem here is that cluster management and configuration is a shared problem
20:09:33 <hub_cap> correct vishy well put
20:09:38 <vishy> which is partially solved by trove and heat
20:09:43 <ttx> russellb: I can see a use case for someone to spin up hadoop clusters rather than submit direct jobs... it just sounds like a very different type of user
20:09:53 <ruhe> ElasticDataProcessing allows to execute MapReduce jobs on demand. It means that Hadoop cluster will be provisioned specifically for the job, and destroyed once job is complete.
20:10:03 <hub_cap> fwiw, i dont want to touch a data api w/ a 10 ft pole..., so i see savanna filling that gap
20:11:16 <mikal> So the use case here is that a user sends an api request saying "map reduce this please" and savanna brings up the jobs and kicks them off?
20:11:29 <ruhe> mikal right
20:11:31 <hub_cap> is there no use case for long running hadoop clusters?
20:11:32 <mikal> I guess I don't understand why Savanna shouldn't be using heat to orchestrate that process
20:11:51 <zaneb> I've heard it said that Hadoop is too complicated for Heat to deploy... I would love to at least find out what we're missing
20:11:54 <SergeyLukjanov> hub_cap, we have such use case
20:11:58 <hub_cap> or even hadoop provisioning for customers
20:11:59 <ruhe> hub_cap, yes there is such use case
20:12:01 <ttx> mikal: but there is also a use case where a user sends api requests to bring up a whole hadoop cluster
20:12:02 <hub_cap> to use on their own
20:12:32 <mikal> ttx: a hadoop cluster that lives longer than a single map reduce you mean?
20:12:38 <rnirmal> mikal: yes
20:12:40 <shardy> ttx: that can still use heat behind the scenes tho, no?
20:12:47 <ttx> shardy: definitely
20:12:50 * gabrielhurley is late to the meeting
20:12:52 <mikal> shardy: agreed
20:12:53 <mattf> shardy, +1
20:13:11 <ttx> mikal: yes, and for which you'd use classic hadoop tools to submit jobs
20:13:26 <mikal> ttx: fair enough
20:13:27 <hub_cap> or some data api savanna provides ttx?
20:13:46 <ErikB> Hadoop can be fairly difficult to configure and deploy. Savanna provides the mechanism to deploy the Hadoop infrastructure (composed of multiple services, configuration, topology) on OpenStack leveraging distribution specific constructs. Each distribution (Apache, MapR, Cloudera, HDP) tends to provide their own mechanism for deployment and management which is what Savanna provides a framework for. Duplicating this in Heat
20:13:53 <SergeyLukjanov> Hadoop cluster provisioning is a very complex process with tons of configs and Savanna provides an ability for users to create templates
20:13:58 <asavu> IMHO Savanna is like EMR + Netflix Genie tightly integrated. I'm not sure Heat can solve the orchestration problem completely but I agree can be part of the solution
20:14:15 <hub_cap> do you feel hadoop is more complicated than setting up a cassandra or mongo cluster?
20:14:30 <hub_cap> because Trove is going to tackle those, as blueprints are sitting in our queue
20:14:32 <ruhe> hub_cap definitely
20:14:34 <rnirmal> I might be wrong but currently heat doesn't have the option to do a post processing operation. which is needed for cluster configuration
20:14:43 <hub_cap> oh oh oh Trove can :)
20:14:48 <demorris> SergeyLukjanov: Trove would need that same capability to configure clusters and templates that describe the complexities of the different cluster deployments
20:14:49 <mikal> I worry that saying "we can model this in heat" indicates a heat bug instead of a need for a new orchestration system
20:14:51 <jd__> enhancing Heat might be a better solution though asavu
20:14:54 <hub_cap> i know heat has deferred things too
20:14:56 <russellb> rnirmal: but part of being an OpenStack project is to work with other projects to fill gaps :)
20:14:59 <akuznetsov> hub_cap yes, a hadoop cluster contains several services with circular dependencies
20:15:00 <shardy> asavu: as mentioned by zaneb, we'd like to understand the parts you think you can't solve with heat atm
20:15:10 <SergeyLukjanov> hub_cap, we're provisioning not only Hadoop, but Hadoop eco using some Hadoop management consoles
20:15:12 <rnirmal> russellb: agree... just wanted to point it out
20:15:32 <aignatov_> yes, hadoop(hdfs, mr service and services for data processing) has more complexity than deployment of cassandra
20:15:35 <hub_cap> sure SergeyLukjanov / akuznetsov and im sure cassandra would be as such too... are there no ecosystem tools wrt it?
20:15:38 <russellb> and i think one of the expectations if you were to be incubated would be to work with projects to fill gaps so that you can build on them as much as possible
20:15:47 <hub_cap> +1 russellb
20:15:52 <shardy> mikal: Agree, probably we just need to better understand what's missing/needed in Heat
20:16:12 <ttx> My point is that while I see a value for the data API, I question the value of a hadoop-specific cluster thing. That would overlap with a lot of Heat/Trove space
20:16:13 <mikal> shardy: yes. I think an analysis of that is something I'd like to see for part two of this discussion.
20:16:20 <akuznetsov> the main issue is that Heat does not support circular dependencies
20:16:21 <russellb> so part of the Q&A is trying to establish some vision for where this is headed, and how it might integrate
20:16:26 <shardy> mikal: +1
20:16:30 <mordred> yah
20:16:34 <hub_cap> ttx +1. and id love a cassandra data api too built on top of heat/trove ;)
20:16:39 <ruhe> we actually have a wiki page for such questions about "Why don't you use Heat?" - https://wiki.openstack.org/wiki/Savanna/WhyNotHeat
20:16:44 <mordred> I mean, when trove came to us, it was not using heat, and we said, dude, you should use heat
20:16:45 <SergeyLukjanov> hub_cap, there are a lot of tools that work on Hadoop - Hive, Pig, Oozie
20:16:45 <hub_cap> maybe savanna fits for a "data api"
20:16:50 <mikal> ruhe: /me looks
20:16:58 <hub_cap> SergeyLukjanov: sure, and i assume we could install them
20:17:09 <hub_cap> we have a postprocessing guest that is in charge of this
20:17:13 * ttx looks
20:17:20 <mordred> "Savanna currently maintains Grizzly+ compatibility. " - if you got integrated, would that still be a goal?
20:17:21 <russellb> all of this isn't necessarily arguing where you should be *right now*, just where you should go :)
20:17:22 <russellb> to be clear ...
20:17:25 <hub_cap> and it keeps services online and reports failures
20:17:27 <mordred> russellb: ++
20:17:28 <asavu> shardy afaik aws cloudformation doesn't have all the semantics we need e.g arbitrary script execution, vendor API interaction etc.
20:17:32 <hub_cap> russellb: +1 billino
20:17:40 <hub_cap> *billion
20:18:10 <mordred> "Circular dependencies - we should generate ‘/etc/hosts’ for all instances in provisioned cluster." - I believe os-*-config will be your friends there
20:18:19 <ruhe> agree, that at some point we'll need to use Heat for provisioning. It's just a matter of time
20:18:28 <mordred> ah - good
20:18:30 <mordred> "Once Heat fulfills all these requirements we will be able and should use Heat for VM provisioning. "
20:18:39 <SergeyLukjanov> mordred, we're currently planning to guarantee only H support in 0.3 release (mid October)
20:18:40 <ttx> So... basically if Heat needs to be improved to be used as a basis for Savanna... then maybe it makes sense to wait for that to happen before filing Savanna for incubation. Incubation is about INTEGRATING with existing integrated projects to form a coherent whole.
20:19:02 <shardy> asavu: we're working on lots of native, non cloudformation-compatible functionality atm, your requirements (and contributions) would be very valuable to that process
20:19:02 <ttx> projects can completely exist outside of incubation
20:19:06 <markmcclain> +1
20:19:14 <shardy> rather than just rolling your own everything
20:19:18 <mordred> ttx: or, perhaps part of integrating is a list of things that need to be done on both sides to come out of integrating
20:19:30 <hub_cap> i would still see a lot of overlap between the clustering api trove has proposed and the savanna clustering api
20:19:33 <mordred> it's hard to say "hey, heat, we need this feature for X" if X isn't on heat's radar per-se
20:19:38 <hub_cap> rather than saying "just heat"
20:19:43 <SergeyLukjanov> ttx, integration with Heat is one of our goals for I cycle
20:19:52 <zaneb> to be clear, the Heat team is working on a new template format with the explicit goal of being able to describe something like a Hadoop deployment
20:20:07 <russellb> i think these sorts of things are fine to work on during incubation
20:20:09 <zaneb> i.e. this is the canonical example of what we want to be able to support
20:20:11 <ttx> mordred: so you would incubate and let them there until proper integration is achieved ?
20:20:20 <mordred> ttx: that's the point of incubation, no?
20:20:21 <russellb> that's what we asked of trove
20:20:26 <hub_cap> yar
20:20:26 <russellb> ttx: +1
20:20:33 <shardy> mordred: we welcome feedback from users and potential contributors about what features they need
20:20:39 <jd__> ttx: mordred: +1
20:20:42 <hub_cap> so does no one see the overlap between trove/savanna wrt the clustering?
20:20:50 <dolphm> hub_cap: ++
20:20:51 <russellb> i mean, we should have reasonable confidence that they can/will achieve what we're asking, and are off to a good start
20:20:52 <ttx> mordred: sure, as long as the stated goal of savanna is to achieve that integration
20:20:56 <mordred> hub_cap: sure- I'd love to see you guys working together
20:21:18 <hub_cap> +1 mordred it would make trove a better product
20:21:19 <mordred> hub_cap: explicitly working on solving that - either by one of you consuming the other, or by spinning off a third thing that both of you consume
20:21:21 <mordred> or something
20:21:37 <ttx> mordred: i.e. it makes sense to incubate if it's to work on integration. Not so much if it's to explain that they can't use Heat because of A and B
20:21:40 <hub_cap> absolutely mordred... something can be consenus-ified
20:21:48 <mattf> SergeyLukjanov, wouldn't you say that a core goal of savanna is to integrate with other openstack projects?
20:21:53 <rnirmal> also cluster provisioning doesn't just apply to hadoop.. it's hadoop today.. spark tomorrow and a whole lot of possible other tools/projects
20:21:58 <mordred> ttx: I think their 'why not heat' is already actually a 'why not heat right now'
20:21:58 <vishy> hub_cap: what "clustering" do you need that couldn't be provided by heat?
20:22:02 <ttx> but Sergey said it's one of their goals, so I think we are fine
20:22:03 <hub_cap> +1 rnirmal
20:22:05 <mattf> imho it's a core reason savanna is being done in the openstack community instead of outside
20:22:05 <gabrielhurley> sounds like that shared clustering feature might be a good candidate for a shared library
20:22:21 <hub_cap> gabrielhurley: that would work
20:22:26 <SergeyLukjanov> ttx, yep integration with other OS projects is our goal
20:22:41 <hub_cap> vishy: we use heat for clustering, for installation of a cluster
20:22:49 <hub_cap> doing things like master/slave promotion for mysql
20:22:50 <russellb> i think part of incubation should be viewed as more attentive guidance from us on how/where a project should integrate
20:22:51 <hub_cap> or failover
20:23:07 <shardy> mordred: I'd like to see a roadmap of "$stuff we need to migrate to heat in the medium term"
20:23:08 <hub_cap> complex things require knowledge from "within" so to speak, being a guest
20:23:13 <russellb> i think that works well ... gets it more on everyone's radar
20:23:16 <vishy> ok so promotion and failover
20:23:18 <demorris> there is also benefit to a common API for clustering so we don't end up with two completely distinct clustering API provisioning methods
20:23:35 <vishy> that isn't what I think of when i think clustering
20:23:35 <hub_cap> to start, vishy, thats 2 things i can think of
20:23:37 <demorris> unless the use case is so different that it calls for it, but I don't see it as such just yet for the API
20:23:50 <jgriffith> demorris: +1
20:23:54 <hub_cap> vishy: clustering a data store
20:23:56 <vishy> because if it is just launch a group of vms i think heat handles that just fine
20:23:57 <jmaron> I think you need a clearer definition of "cluster" and/or "clustering". In the hadoop world it's more than the provisioning of VMs - it's the provisioning and configuration of a slew of data services on top of those hosts (Note that hadoop isn't necessarily cloud/VM aware)
20:24:00 <SergeyLukjanov> hub_cap, I hope that we'll use Heat for cluster provisioning and potentially for [auto]scaling support
20:24:12 <akuznetsov> possible clustering should be done on Heat side, for example cluster for j2ee application
20:24:15 <hub_cap> vishy: totally agree, thats what we are doing in trove :)
20:24:20 <ruhe> cluster deployment is one of the simplest things in Savanna. there are lots of details related to Hadoop - integration with Swift, HDFS block placement
20:24:21 <vishy> so specifically it is atomically configuring things
20:24:23 <hub_cap> install = heat
20:24:31 <vishy> which is going to require something like zookeeper, no?
20:24:56 <hub_cap> possibly :)
20:25:02 <mordred> can't heat already handle that? or will do soon?
20:25:06 <vishy> that doesn't sound like it specifically belongs to one project so shared library might be the way to go
20:25:22 <hub_cap> im all for shared lib
20:25:25 <hub_cap> shared = better right? :)
20:25:29 <vishy> have to wonder if it actually fits into the taskflow library
20:25:31 <ruhe> hub_cap, so the idea is to develop clustering support in Heat which then could be used by Trove and Savanna?
20:25:48 <hub_cap> possibly? we are already going to use heat for clustering support
20:25:57 <vishy> ruhe: +1
20:25:59 <dmakogon_> hub_cap: cluster_provisioning lib is VERY(!!!) good point
20:26:01 <hub_cap> i see it as savanna uses trove for clustering / post processing etc...
20:26:14 <dmakogon_> hub_cap: and it could be like the part of heat
20:26:16 <hub_cap> and if you want a "hadoop prov api only" you can use trove
20:26:23 <vishy> the location of the clustering library is a minor point
20:26:23 <hub_cap> which obviously will use heat to prov
20:26:26 * ttx sees a lot of discussions needed between heat, trove and savanna guys in HK
20:26:33 <hub_cap> yes plz!
20:26:38 <ttx> but I like what I'm seeing
20:26:38 <hub_cap> ttx ^ ^
20:26:49 <hub_cap> i think there is much overlap
20:26:53 <hub_cap> and i dont like duplicating work
20:27:03 <dmakogon_> hub_cap +1
20:27:03 <dolphm> ttx: ++
20:27:08 <SergeyLukjanov> hub_cap, Trove is 'Database as a Service', but Hadoop isn't a DB
20:27:22 <hub_cap> is it not, at the heart of it?
20:27:27 <mordred> ttx: ++
20:27:31 <hub_cap> is there not a (or many) ways to process and retrieve data
20:27:35 <hub_cap> and a storage engine
20:27:42 <zaneb> it is and it isn't ;)
20:27:43 <ruhe> hub_cap, its main goal is bigdata processing
20:27:46 <hub_cap> and a plethora of tools avail
20:27:46 <dmakogon_> SergeyLukjanov: but Hive/HBase - yes !
20:28:07 <hub_cap> absolutely, and i dont want to touch processing in Trove
20:28:08 <ttx> I think at the very least, even with clustering completely stripped off, savanna would make sense standalone as a data API
20:28:19 <hub_cap> ttx:++
20:28:30 <hub_cap> id love to see savanna as the data api in openstack
20:28:32 <dmakogon_> ttx: good point
20:28:38 <mikal> ttx: I agree. I just think savanna should be as thin as possible to reduce duplication of effort
20:28:48 <hub_cap> mikal: ++
20:28:51 <russellb> the deployment still has to be solved somewhere
20:28:54 <ttx> mikal: sounds like one of my lines
20:28:58 <ruhe> hub_cap, i guess we need a definition of clustering
20:29:02 <hub_cap> id love to see savanna tackle cassandra and mongo in the future
20:29:06 <dmakogon_> ruhe: yes
20:29:08 <hub_cap> in terms of data api
20:29:13 <gabrielhurley> thin++
20:29:19 <russellb> those might be different data APIs though ...
20:29:26 <ttx> russellb: oh sure. But it's a problem space that several others are exploring... and all those people need to talk around a beer to solve it
20:29:35 <russellb> fair enough
20:29:39 <hub_cap> ttx beer+whiteboard
20:29:41 <dolphm> mikal: ++
20:29:41 <SergeyLukjanov> hub_cap, we're planning to support cassandra as external data source
20:29:43 <dmakogon_> hub_cap: cassandra/mongo via savanna ??
20:29:53 <hub_cap> dmakogon_: the data api
20:29:58 <ttx> SergeyLukjanov: I also have a slight concern about you being the author of more than half of the commits
20:30:00 <hub_cap> not the prov/clustering
20:30:03 <ttx> It's not as extreme as Designate (59% instead of 84%) but it still looks a bit brittle to me
20:30:06 <markwash> where can I learn more about the savanna data api?
20:30:09 <akuznetsov> hub_cap we will have cassandra and mongo as one of the data source for edp
20:30:17 <ttx> SergeyLukjanov: are you superman ?
20:30:28 <ttx> SergeyLukjanov: are there buses near where you live ?
20:30:34 <hub_cap> akuznetsov: and you will need clusters for those correct?
20:30:56 <hub_cap> and trove is going to solve prov'ing those clusters in its near future (dmakogon_ is foaming at the mouth)
20:30:56 <akuznetsov> markwash https://wiki.openstack.org/wiki/Savanna/EDP
20:31:03 <SergeyLukjanov> ttx, it's mostly related to the initial state of the project, you can take a look for the last 3 months percentage
20:31:16 <dmakogon_> hub_cap: correct me if i'm wrong, trove could use savanna (in future) to provision clusters of cassandra ?
20:31:19 <zaneb> https://github.com/stackforge/savanna/contributors
20:31:30 <russellb> and I keep typoing Savanna as Savannah because of Savannah, GA
20:31:32 <russellb> :(
20:31:41 <hub_cap> i see it as the opposite dmakogon_, savanna being a data api uses trove to prov the clusters
20:31:45 <hub_cap> and then does magic data stuff w them
20:31:48 <ruhe> hub_cap, i'm not sure if database_as_a_service is a proper tool to provision data_processing_tool
20:31:57 <mattf> ttx, savanna has active development from mirantis, red hat and hortonworks. state is mirantis seeded the project and has a lot of historical commits.
20:31:58 <SergeyLukjanov> ruhe, agreed
20:32:14 <vishy> hub_cap: it sounds like you are starting to see trove as cluster_provisioning_as_a_service
20:32:23 <markwash> akuznetsov: so is it basically "post up a hive/pig script after establishing your data sources" ?
20:32:28 <rnirmal> maybe that needs to be split out into its own service then
20:32:29 <ttx> SergeyLukjanov: agree that recent data shows a good trend
20:32:29 <dmakogon_> hub_cap: we can do it in any way
20:32:31 <hub_cap> well we are clustering nosql datastors as a service vishy :)
20:32:46 <hub_cap> man i cant type
20:32:46 <vishy> i don't think it clearly "belongs" to any project today
20:32:47 <dmakogon_> hub_cap: savanna via trove, trove via savanna
20:32:57 <mikal> mattf: there are a lot more mirantis people though right? I don't think that's bad (and all credit to mirantis), but do you think the project would survive if for some reason it stopped being a priority for mirantis?
20:32:58 <dmakogon_> hub_cap: multiple dual support
20:33:02 <vishy> i think we all agree that clurster provisioning is important
20:33:04 <akuznetsov> markwash not only
20:33:16 <vishy> * cluster
20:33:16 <hub_cap> +1 for clusters
20:33:19 <vishy> and it goes somewhere
20:33:23 <jmaron> the provisioning of VMs in savanna is already fairly well partitioned as an API/service. trove/heat are not precluded from playing a potential role. However, it is only a portion of what is required to configure a hadoop deployment (hate using cluster - seems to be a term with specific connotation in this crowd ;) )
20:33:23 <ruhe> anyway, Savanna is all about integration with various Hadoop vendors. I'm worried that splitting development into two separate projects will make things really complex. both will need integration with various Hadoop distros
20:33:30 <hub_cap> vishy: ++
20:33:32 <ttx> vishy: I wouldn't mind trove to expand scope to support generic clustering
20:33:33 <ErikB> mikal - savanna is a priority for Hortonworks
20:33:36 <mattf> mikal, i think the community is growing, will continue to grow and will be sustaining outside of mirantis, yes
20:33:40 <vishy> and trove/heat/savanna/workflow can fight to the death about who gets to own it
20:33:50 <hub_cap> vishy: cage match?
20:33:56 <dmakogon_> mikal: savanna has contributors from RedHat, so it could survive in any way)
20:33:58 <akuznetsov> savanna already has a lot of stuff for clustering: anti-affinity groups, networks, etc.
20:33:58 <ttx> vishy: otherwise everyone will keep on reinventing it
20:33:59 <vishy> oh and tripleo
20:34:05 <vishy> since they have to provision clusters as well
20:34:11 <vishy> :o
20:34:19 <hub_cap> lol
20:34:25 <ruhe> vishy, it seems like Heat is the right tool to provision cluster, others are for tight integration with software running inside VMs
20:34:25 <russellb> zomg clusters
20:34:26 <mordred> yeah - but that's just heat really :)
20:34:31 * hub_cap hands vishy the wrench for the meeting
20:34:33 * mordred clusters himself
20:34:41 * markwash noms on clusters of nuts
20:34:41 <hub_cap> oh god now were all doomed
20:34:51 <dmakogon_> ruhe: +1 for clusters provisioning
20:35:00 <vishy> ruhe: the issue is provisioning the vms is easy
20:35:04 <ttx> wow 34min in and it's already toasted
20:35:09 <SergeyLukjanov> ruhe, yep and we're planning to dig into Heat and try to contribute the missing features to it to be able to use it for provisioning in Savanna
20:35:21 <hub_cap> ttx fwiw, i wouldnt mind expanding scope cuz its going to be cassandra/mysql/mongo/redis/etc/etc/etc in trove
20:35:27 <vishy> it is configuring the services to know about each other, do elections, etc.
20:35:27 <vishy> that is hard
20:35:28 <hub_cap> and savanna will have hadoop
20:35:34 <dmakogon_> SergeyLukjanov: good idea
20:35:38 <vishy> and although the software is different there is definitely a lot of overlap in these things
20:35:42 <demorris> hub_cap: + Vertica CE
20:35:44 <annegentle> I think big data/ map reduce use cases are really valuable, and would like to see heat orchestrating to help other projects laser focus on use cases
20:35:44 <ruhe> vishy, not only provision, but apply configs, for instance host names in /etc/hosts for the whole cluster
20:36:07 <hub_cap> yes that should be the same w/ cassandra right ruhe?
20:36:18 <mikal> ruhe: why wouldn't you just bring up an instance running bind and point everyone to that (or something other than lots of copies of /etc/hosts)?
20:36:19 <ruhe> hub_cap, right
20:36:23 <hub_cap> we plan on supporting in a generic way, cuz im sure mongo/cassandra will need
20:36:32 <vishy> ruhe: imo that is part of configuring the software
20:36:32 <ttx> annegentle: yes, and I wouldn't mind Savanna contributors to contribute the missing stuff they need to existing projects :) Yay cross-openstack collaboration
20:36:52 <annegentle> ttx: yep
20:36:55 <mordred> ++
20:37:10 <jgriffith> vishy: ++
20:37:14 <vishy> a lot of the difficulty would be avoided if we had integrated dns and autodiscovery
20:37:21 <jgriffith> I think there needs to be a clearer distinction here
20:37:22 * vishy has a whole bag of wrenches
20:37:33 <hub_cap> geez you sure do
20:37:34 <ttx> so basically I think there is value in incubating savanna, if only to get all those devs to show up at the design summit and see how they can best fit
20:37:43 <hub_cap> +1 ttx
20:37:46 <hub_cap> so um, state of clustering at the summit?
20:37:50 <dmakogon_> hub_cap: SergeyLukjanov: i see the next situation: trove supports HBase/Hive - that means that trove gets a Hadoop cluster provisioned via Savanna and then installs Hive/HBase on that cluster
20:37:52 <russellb> vishy: huge +1 to auto discovery ...
20:37:56 <dmakogon_> hub_cap:+1
20:38:15 * ttx looks up schedule to make sure heat/trove and savanna don't run at the same time
20:38:18 <russellb> vishy: we keep inventing new methods to do that by hand, kinda getting silly
20:38:35 <mordred> vishy: dns++
20:38:37 <ruhe> dmakogon_, don't hurry :) HBase provisioning is the most complicated thing i've ever seen in my life
20:38:42 <russellb> and yes, dns++ too
20:38:48 <russellb> is anyone helping with dns yet?
20:38:51 <ruhe> I mean, getting it done right
20:38:52 <russellb> (sorry, another topic)
20:38:58 <hub_cap> hah
20:39:00 <vishy> ruhe: no way it is more complicated than configuring openstack!
20:39:05 <ttx> err. trove and heat run at the same time, sigh
20:39:10 <dmakogon_> ruhe: i know, i've deployed it by hand a lot of times
20:39:14 <ruhe> vishy, good catch
20:39:25 <hub_cap> ttx FAIL
20:39:27 <mordred> russellb: no worries - we can talk about infinite number of things in parallel in this meeting
20:39:37 <hub_cap> mordred: hows the weather?
20:39:51 <mordred> hub_cap: great! I got some torchy's tacos yesterday and a bowl of queso
20:39:59 <SergeyLukjanov> agreed with need to discuss where clustering part should be done
20:40:02 <mordred> ttx: rework the whole schedule now!
20:40:08 <ttx> on it
20:40:08 <mattf> mordred, super linear scaling eh?
20:40:12 <dolphm> mordred: /jealous
20:40:34 <hub_cap> yes id love to see 1 project support clustering, there is no need to reinvent
20:40:40 <rnirmal> heat can provision a cluster with some work maybe... but some X needs to configure it and X needs to manage the lifecycle of a cluster... be it hadoop, cassandra or spark.. and I see a split between savanna and trove..... can we work towards solving that first
20:40:43 <hub_cap> and id love to see 1 project support a data api
20:40:49 <dmakogon_> hub_cap: +1 to shared lib
20:40:55 <hub_cap> +1 to single project
20:41:04 <ruhe> rnirmal +1
20:41:10 <hub_cap> rather than shared lib w/ the same api's between 2 diff projects heh
20:41:15 <dmakogon_> i think this shouldn't be a standalone project
20:41:28 <dmakogon_> just some algorithms
20:41:32 <hub_cap> oh no a data api is quite valid :)
20:42:34 <zaneb> hub_cap: shared lib (vs. shared service) seems quite reasonable to me?
20:42:43 <dmakogon_> or heat should provision clusters for next usage in terms of current project
20:42:45 <hub_cap> yes zaneb
20:43:36 <ttx> mordred: I'll propose to swap trove and ironic on https://docs.google.com/spreadsheet/ccc?key=0AmUn0hzC1InKdDdPRXFrNjV4SW91SWF5N2gwYnRHYWc#gid=1 -- sounds like the most limited change that would solve it
20:43:44 <ErikB> +1 rnirmal - this is the value that Savanna adds.
20:43:47 <dmakogon_> hub_cap: shared lib is easy to reuse without any integration
20:43:53 <ruhe> to understand requirements of such shared lib (or service) we'll need to understand requirements from both Trove and Savanna
20:43:56 <russellb> ironic and heat overlap is probably rough too
20:44:16 <russellb> since the folks interested in baremetal, are also interested in tripleo, which are interested in heat
20:44:21 <hub_cap> +1 ruhe
20:44:32 <ttx> russellb: there are more heat slots than ironic slots though, so they can still attend some of it
20:44:41 <russellb> ttx: ah, cool, probably fine then
20:44:50 <SergeyLukjanov> looks like no ideas for clustering discussion now and the best solution is to setup clustering discussion at design summit and apply the decision in I cycle
20:44:51 <hub_cap> hmmm seems like we are starting to see program vs project
20:45:05 <hub_cap> +1 SergeyLukjanov we have submitted one for trove :)
20:45:20 <mordred> hub_cap: ++
20:45:20 <hub_cap> http://summit.openstack.org/cfp/details/54
20:45:29 <dmakogon_> SergeyLukjanov: +1
20:45:34 <russellb> ttx: we should just serialize the whole thing and have the design summit never end
20:45:52 <hub_cap> when it is over is when it begins
20:45:55 <ttx> russellb: sounds like paradise
20:46:51 <ttx> OK, so it would be great to start this discussion a bit this week so that we can see the premises of this collaboration by the time we finally vote on this (next week or the week after)
20:47:11 <russellb> good call
20:47:28 <dmakogon_> hub_cap: SergeyLukjanov: we could discuss clustering together, in terms of trove/heat/savanna/ironic
20:47:31 <annegentle> ttx: how does the election timing and vote for incubation line up?
20:47:38 <annegentle> ttx: I can't remember when elections are
20:47:40 <ttx> and unless someone has another concern to raise, we can go to open discussion now
20:47:43 <ttx> annegentle: next topic
20:47:45 <hub_cap> yes dmakogon_
20:47:51 <hub_cap> go go go ttx
20:47:58 <mordred> we have another topic? jeez
20:48:05 <ttx> not really
20:48:07 <ttx> #topic Open discussion
20:48:12 <ttx> I set up the pages for the PTL and TC elections in the next weeks:
20:48:13 <hub_cap> we are gonna discuss mordred's hatred for open discussion
20:48:18 <ttx> #link https://wiki.openstack.org/wiki/PTL_Elections_Fall_2013
20:48:22 <dmakogon_> hub_cap: SergeyLukjanov: more than +100500 for shared lib for clustering
20:48:23 <ttx> #link https://wiki.openstack.org/wiki/TC_Elections_Fall_2013
20:48:33 <ttx> I'm looking for volunteers for filling the election official roles, especially for the PTL election
20:48:43 <ttx> Can be difficult since you should not be running for any of the PTL positions to be a PTL election official...
20:49:02 <SergeyLukjanov> are there any other questions about savanna? (we've discussed only one feature of savanna, and not a main one…)
20:49:11 <russellb> dang that's a lot of PTL positions :-)
20:49:17 <russellb> elections like woah
20:49:36 <ttx> annegentle: to answer your question... we won't start renewing the TC until October 4 so we still have a few meeting dates possible
20:49:46 <russellb> SergeyLukjanov: i think we need to continue on the ML with what came up so far, and we'll continue the discussion next week
20:49:48 <hub_cap> russellb: openstack is growing up :) or out!
20:49:57 <annegentle> ttx: ok thx
20:50:48 <ttx> the TC elections run after the PTL elections, which increases the odds recently-elected PTLs would get an electoral boost in the TC election (feature, not bug)
20:51:13 <mordred> this may not really be a TC thing - but since you're all here - we're going to make a stronger push to finish moving to testr next cycle - because there are some things we want to do with subunit streaming processing in the gate (which will result in quicker response time to failures and shorter gate resets)
20:51:24 <ttx> In other news, TC members should review the initial governance repo commit at:
20:51:29 <ttx> #link https://review.openstack.org/#/c/44489/
20:51:44 <ttx> once that is set we will use that for voting
20:52:09 <mikal> Yay!
20:52:40 <ttx> Anything else, anyone ?
20:52:55 <markwash> testr should have coverage, and I *want* to help :-)
20:53:14 <ttx> depending on how LinuxCon/CloudOpen gets finally scheduled and how many people are stuck, we may cancel next week's meeting
20:53:48 <markmcclain1> Won't a good portion of us be there?
20:54:00 <russellb> how do you people attend so many conferences and still get work done?
20:54:03 <ttx> I count at least 3
20:54:09 <ttx> russellb: work ?
20:54:12 <markwash> russellb: you answered your own question methinks
20:54:22 <russellb> heh.
20:54:24 <mordred> russellb: I usually put in a full day's work while at conferences - it's just at different/odd times
20:54:28 <mordred> markwash: ++
20:54:59 <clarkb> testr coverage?
20:55:21 <clarkb> I feel like the word coverage is far too overloaded here. What are we talking about?
20:55:24 <markwash> clarkb: coverage measurements, that is
20:55:28 <gabrielhurley> russellb: also relevant are your definitions of "work" and "done"
20:55:34 <markwash> like, using testr tests, I can measure the code coverage of my unit tests
20:55:37 <ttx> so the final discussion/vote on savanna might just wait for the September 24 meeting. Skip or notskip will be discussed on the TC mailing-list
20:55:42 <clarkb> markwash: we have it doing that today
20:55:50 <markwash> clarkb: say what? sorry I'm out of date
20:56:00 <clarkb> markwash: that was one of the requirements to use testr with nova
20:56:15 <mordred> markwash: yup. we're all fancy like that
20:56:20 <clarkb> markwash: basically we swap in coverage.py for python and run the test runners that way then combine coverage afterwards
20:56:20 <markwash> <3
20:56:22 <clarkb> works great
20:56:27 <mordred> (sorry, I was ++ing "want to help")
20:56:27 <ttx> I even count 4, notmyname will be there
20:56:34 <notmyname> yes
20:56:48 <ttx> like I said on another thread, more PTL/TC members talking there than at an openstack summit :/
20:56:59 <mordred> yah. that makes me sad
20:57:18 <mordred> I spend most of my year talking at conferences, and I've only ever given half of one talk at an openstack one
20:57:28 <mordred> kinda funny that
20:57:43 <gabrielhurley> people only *think* they want to hear from me. I'll show them... I'll show them all!
20:57:49 <notmyname> are there project updates during the conference this time?
20:58:00 <mordred> notmyname: post-conference webinar things
20:58:00 <notmyname> if not, that removes all the PTL talks
20:58:04 <ttx> notmyname: I think they want to do the webinar thing again
20:58:07 <notmyname> mordred: should be both
20:58:21 <russellb> it's painful to do it at the conference
20:58:21 <mordred> notmyname: the request came through to not do them in person because of time
20:58:26 <russellb> no time to let the stuff soak
20:58:27 <ttx> notmyname: I placed a "TC panel" proposal so that we appear somewhere
20:58:31 <russellb> haven't even finished talking yet and you have to present it?
20:59:04 <mordred> russellb: I like giving the infra update at the start of the summit - more time for beer that way :)
20:59:07 <russellb> i'm glad we're not doing it there (with the current layout anyway)
20:59:22 <russellb> if we split or offset them, sure :)
20:59:26 <notmyname> it's an opportunity to talk about what's been happening, perhaps a couple of ideas that were talked about, and to brag on contributors.
20:59:28 <ttx> we are looking into a one-day offset for the next one
20:59:39 <ttx> hopefully more for the one after that
20:59:41 <russellb> ttx: that's a start
20:59:43 <mordred> ++
20:59:53 <russellb> whatever we can get is ++ from me
20:59:57 <ttx> like conf mon-Thy and design cummit tue-fri
21:00:06 <ttx> err conf mon-thu
21:00:17 <ttx> and summit*
21:00:21 <ttx> and #endmeeting
21:00:24 <ttx> #endmeeting