18:04:13 <aignatov2> #startmeeting Savanna
18:04:14 <openstack> Meeting started Thu May 23 18:04:13 2013 UTC.  The chair is aignatov2. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:04:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:04:18 <openstack> The meeting name has been set to 'savanna'
18:05:38 <SergeyLukjanov> We worked on the extension and templates blueprints last week
18:05:40 <aignatov2> This week we updated the blueprints related to the Pluggable Provisioning Mechanism
18:06:05 <aignatov2> Now they have more details
18:06:28 <aignatov2> Nadya, can you post links to these documents?
18:07:40 <Nadya> yes, sure https://wiki.openstack.org/wiki/Savanna/PluggableProvisioning
18:08:01 <Nadya> #link https://wiki.openstack.org/wiki/Savanna/PluggableProvisioning
18:10:06 <Nadya> and the second updated wiki page is about Templates #link https://wiki.openstack.org/wiki/Savanna/Templates
18:10:33 <aignatov2> #link https://wiki.openstack.org/wiki/Savanna/Templates
18:10:48 <jmaron> I just posted some thoughts about templates to the savanna-all list
18:12:55 <jmaron> I'd welcome comments/thoughts
18:12:58 <aignatov2> we are looking through it
18:13:02 <jmaron> thanks!
18:17:36 <aignatov2> besides your mail Jon, I'd like to mention some other items.
18:18:13 <akuznetsov> such parameters can be placed in the cluster config, not in a node group config
18:18:14 <aignatov2> We are mostly finished with instructions for diskimage-builder to construct images with Apache Hadoop inside
18:18:33 <aignatov2> It will support both Ubuntu and CentOS distros
18:18:46 <akuznetsov> possibly we should add a nested layout to the cluster config
18:18:54 <akuznetsov> for example HDFS
18:19:21 <akuznetsov> it should contain parameters applicable to all nodes in the cluster
18:19:33 <akuznetsov> e.g. replication factor
18:20:48 <ErikB> In Jon's example on checkpoint.period, are you saying that this should be in the cluster config?
18:21:11 <jmaron> I'm not sure what you mean.  service configuration properties can be defined at both the cluster level as well as the node group level
18:21:54 <jmaron> so any "GENERAL" or service specific property can be specified at both cluster and node level
18:22:17 <akuznetsov> for example, mapred child heap size is only applicable to the node group
18:22:31 <mattf> aignatov2, when you have something even mostly working re diskimage-builder, fire it my way. i'll take it for a test drive through RDO. fyi, i still think you should check the recipe into the savanna.git repo.
18:22:40 <jmaron> it can be specified globally at the cluster, but overridden at the node level if desired
18:22:58 <jmaron> if specified at the cluster, it's essentially a global default
18:24:10 <aignatov2> mattf, we just have some trouble with DIB and have not tested it well
18:24:29 <aignatov2> sometimes clusters are not working with it :)
18:25:03 <aignatov2> mostly it is the same DIB elements which I sent you last week)
18:25:25 <ruhe> mattf, we're going to create a new repo in stackforge for all the tools we build for Savanna
18:25:40 <mattf> if they aren't working in that the instance doesn't get a vm priv ip, i found a fix for that. i needed it to get ubuntu instances running on rdo.
18:26:18 <mattf> ruhe, what's the hesitation for including the image creation in the savanna repo?
18:27:24 <jmaron> in our view the user would have a cluster configuration panel and a node group configuration panel.  In both instances the same properties can be specified.  they'll end up in a cluster or node group template.  In both cases, they'd be associated to a "GENERAL" grouping or a service grouping
18:27:41 <ruhe> mattf, we would like to keep the current repo only for savanna core. savanna-python-client, savanna-horizon-plugin will be separate projects. that's how all the OS projects manage additional components
18:27:44 <aignatov2> mattf, I think these scripts are a separate project like savanna-pythonclient, savanna-horizon etc
18:28:25 <SergeyLukjanov> ruhe +1
18:29:26 <mattf> currently there's a tight coupling between the image and savanna-api. until there's a cleaner separation, it seems wise to keep the two together. the hope is they'll more easily be kept in sync.
18:29:56 <akuznetsov> but node group can override parameters like replication factor
18:30:03 <akuznetsov> in your case
18:30:20 <mattf> take for instance the default root password. if it's changed in the image builder it must also be changed in savanna-api. best to do it as a unit in one repo.
18:30:31 <aignatov2> in the future we may have more scripts and instructions, elements and so forth, and savanna repo is an independent project with its integration tests, for instance
18:30:53 <dmitryme> Jon, in your model user can redefine a property like HDFS replication factor in Node Group
18:30:55 <ruhe> mattf, we're definitely going to decouple these things in ongoing implementation of plugins
18:31:05 <aignatov2> mattf, root password is a temporary solution, we plan to reset it after deployment for security reasons
18:31:07 <jmaron> akuznetsov:  yes.  do you object to that capability?
18:31:18 <aignatov2> in the duture releases
18:31:21 <dmitryme> this looks like a way to shoot oneself in the foot
18:31:23 <aignatov2> *future
18:31:37 <jmaron> Ot
18:31:38 <akuznetsov> yes because the replication factor is cluster parameter
18:32:16 <jmaron> ok.  I suppose we could add the scope parameter for such properties
18:32:35 <mattf> root pw is just a specific example. w/o defined expectations between savanna-api <-> instance, we risk the two getting out of sync if they live apart.
18:33:32 <ruhe> mattf, there is already a set of integration tests we run on each commit. we'll know that something is wrong instantly :)
18:33:45 <mattf> so i propose keeping them together, until the time there's a well defined interface, or they're entirely stable.
18:33:57 <ErikB> akuznetsov - do you see the issue that was brought up by Jon on the property collision that is possible in the current proposal?
18:34:17 <mattf> ruhe, do those integration tests build a new image for testing w/ or use a stock image?
18:34:30 <dmitryme> Erik, sure we see it
18:34:41 <jmaron> that was just one example.  there are others as well.  we believe the correct grouping is at a service level
18:34:44 <akuznetsov> yes we see it
18:34:57 <dmitryme> actually we were thinking to avoid it in the following way:
18:35:40 <dmitryme> when plugin specifies config applicable_target="service:hdfs", we show it in the UI only once in a Node Group
18:35:47 <jmaron> to be clear: properties are defined as associated to services or general, scoped to node or cluster
18:36:50 <jmaron> it would also have to be presented only once at the cluster level, if applicable
18:36:56 <dmitryme> i.e. we are not going to give the user a way to specify it twice in a single Node Group
18:37:12 <dmitryme> Jon, yep, for cluster as well
18:38:23 <ruhe> mattf, currently we don't build images for each commit. but i think that in a couple of weeks we'll have image-builder decoupled from any hard-code
18:38:26 <jmaron> that would fix that issue, but I'm wondering about the templates:  would properties still be grouped by component?
18:40:22 <Nadya> component = process?
18:40:26 <dmitryme> Jon, component is "task tracker", "datanode", right?
18:40:26 <jmaron> yes
18:40:34 <dmitryme> aha, I see
18:40:59 <mattf> ruhe, how would you do an integration test that involves image creation if the integration test is for savanna-api (lives w/ savanna-api repo) and the image creation lives in a separate repo?
18:43:57 <dmitryme> Jon, we want some properties to be grouped by process (like processes heap sizes) and some - by service (like fs.checkpoint.dir)
18:45:44 <jmaron> the problem is that we actually don't even have that information in most instances.  Hadoop configuration is essentially service based, so there is actually no documentation about the component/process association.  we've spoken to internal developers, and the indication is that that would be a fairly significant documentation effort that isn't planned currently
18:46:31 <jmaron> so there are two fundamental problems:  1.  grouping by component isn't generally done by hadoop and 2.  the information isn't even available
18:47:58 <jmaron> but as UI developers, given the information associated with the config object, you can make some design decisions (pagination, filtering, etc) to make a usable interface.  Trying to make the hadoop model align with the UI requirements is not the way to do that, IMHO
18:50:01 <ruhe> mattf, that's a good question. i think we should have a pool of images. i understand your point, but we really want to keep savanna similar to other OS projects. for instance, every project has python client which is kept separate from the main code. and at the same time python client can be used internally to handle requests from Horizon UI
18:51:35 <mattf> ruhe, python client makes sense. it forces you to have a stable API for each project/repo. i think it's premature to say the image & savanna-api is stable enough to live apart. it's also not entirely clear to me that there's major motivation to stabilize that interface right now.
18:53:23 <mattf> consider keeping them together for now. in any event, i'm keen to try it out asap. i've the email instructions to work from atm, would love a repo to work with too -- i'm guessing there will be rdo specific changes i'll have to make.
18:54:16 <ruhe> mattf, sure. we'll try to get that repo asap
18:56:24 <Nadya> Jon, an example.  we have 2 machines: big and small. We run a mapred service. So there are 2 tt processes, one on the 1st and one on the 2nd. They can process different numbers of tasks
18:57:57 <Nadya> so we need to configure the 1st machine to run 10 tasks and the second to run 5. But it is one service
18:58:40 <Nadya> *map tasks
18:58:55 <jmaron> I'm not sure you can even do that based on the selection of one flavor per node group.  How do you propose to have different machine sizes as part of the same node group?
18:59:34 <jmaron> so in this case, you'd define a big machine group of one, a small machine group of one, and configure the service parameters for each accordingly
18:59:56 <Nadya> service may be run on 2 machines which belong to different node groups
19:01:00 <kgriffs> hey folks
19:01:03 <aignatov2> guys, we are out of time
19:01:08 <jmaron> in that case you can specify cluster level parameters for the service as default values, and then node group overrides per node
19:01:19 <jmaron> node group
19:01:21 <aignatov2> #endmeeting savanna
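
The override model discussed above (service properties set at the cluster level act as defaults, node groups may override them, and cluster-scoped properties such as the HDFS replication factor may not be overridden) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the CLUSTER_ONLY set, and the dict layout are hypothetical and are not Savanna's API. The property names dfs.replication and mapred.tasktracker.map.tasks.maximum are standard Hadoop 1.x settings, used here to mirror the replication factor and big/small machine examples.

```python
# Hypothetical sketch; none of these names come from Savanna itself.
# Properties that the plugin would mark as cluster-scoped (assumption).
CLUSTER_ONLY = {("hdfs", "dfs.replication")}


def effective_config(cluster_cfg, node_group_cfg, cluster_only=CLUSTER_ONLY):
    """Merge per-service cluster defaults with node group overrides."""
    # Start from a copy of the cluster-level defaults, grouped by service.
    merged = {svc: dict(props) for svc, props in cluster_cfg.items()}
    for svc, props in node_group_cfg.items():
        for name, value in props.items():
            # Reject overrides of properties scoped to the whole cluster.
            if (svc, name) in cluster_only:
                raise ValueError(
                    f"{svc}:{name} is cluster-scoped and cannot be overridden"
                )
            merged.setdefault(svc, {})[name] = value
    return merged


cluster = {
    "hdfs": {"dfs.replication": 3},
    "mapred": {"mapred.tasktracker.map.tasks.maximum": 5},
}
# A "big" node group raises the map task limit; a "small" one keeps the default.
big = effective_config(
    cluster, {"mapred": {"mapred.tasktracker.map.tasks.maximum": 10}}
)
small = effective_config(cluster, {})
```

Keeping the scope check separate from the merge is one possible way to represent the "scope parameter" idea raised in the meeting; a real plugin would presumably carry the scope on each config object instead of in a global set.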