18:00:52 <SergeyLukjanov> #startmeeting sahara
18:00:52 <openstack> Meeting started Thu Oct 30 18:00:52 2014 UTC and is due to finish in 60 minutes.  The chair is SergeyLukjanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:53 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:56 <openstack> The meeting name has been set to 'sahara'
18:00:56 <SergeyLukjanov> ping sahara folks
18:01:14 <elmiko> o/
18:01:19 <tmckay> SergeyLukjanov, crobertsrh has lost power, and mattf is out
18:01:25 <aignatov> o/
18:01:26 <jodah> o/
18:01:29 <SergeyLukjanov> tmckay, ack
18:01:32 <NikitaKonovalov> o/
18:01:52 <sreshetnyak> o/
18:02:36 <SergeyLukjanov> #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
18:02:44 <SergeyLukjanov> #topic sahara@horizon status (croberts, NikitaKonovalov)
18:03:08 <SergeyLukjanov> NikitaKonovalov you're the only presenter for this topic
18:05:04 <SergeyLukjanov> okay, let's move on
18:05:08 <SergeyLukjanov> #undo
18:05:09 <openstack> Removing item from minutes: <ircmeeting.items.Topic object at 0x2f5bdd0>
18:05:16 <SergeyLukjanov> #topic News / updates
18:05:18 <SergeyLukjanov> folks, please
18:05:29 <SergeyLukjanov> I've been mostly working on preparing for the summit
18:05:56 <elmiko> i'm making good progress on bug#1306505, i've also been talking with the OSSG folks and the API working group. i've got some ideas about security and api to talk about at summit.
18:06:16 <SergeyLukjanov> #1306505
18:06:22 <elmiko> oh, also trying to work up a spec for api v2
18:06:36 <SergeyLukjanov> elmiko, that's great
18:06:47 <jodah> elmiko i'd love to have a look at the spec when it's ready
18:07:20 <elmiko> SergeyLukjanov: i think there is some interesting activity happening surrounding api standardization
18:07:23 <tmckay> interesting question on openstack-dev about dockerizing Sahara.  We should discuss at summit whether to make dockerized Sahara a feature for kilo
18:07:26 <elmiko> jodah: definitely
18:07:30 <tmckay> not sure what the issues may be
18:07:45 <sreshetnyak> I'm working on various bug fixes
18:07:50 <NikitaKonovalov> SergeyLukjanov, my Internet connection is dropping frequently, status for Sahara UI is good, there are few minor comments
18:07:51 <SergeyLukjanov> tmckay, dockerized sahara itself or dockerized hadoop cluster?
18:08:09 <jodah> SergeyLukjanov tmckay both ideally :)
18:08:17 <tmckay> SergeyLukjanov, unclear from the email, I asked for clarification
18:08:25 <NikitaKonovalov> I'm also trying to keep backports up-to-date for stable/juno
18:08:30 <tmckay> jodah, ++, might be two different features
18:08:34 <elmiko> SergeyLukjanov, tmckay, i think the email was talking about hadoop nodes being containerized
18:09:01 <SergeyLukjanov> elmiko, tmckay, okay, so, anyway both of the dockerizations are interesting
18:09:02 <tmckay> I think so too, the specific question was about changing hostnames
18:09:05 <tmckay> so I think it was nodes
18:09:19 <jodah> dockerizing sahara is perhaps more about dockerizing openstack/devstack entirely
18:09:20 <SergeyLukjanov> tmckay, yeah, sounds like that (just saw an email)
18:09:35 <elmiko> i think we have 2 paths for containers; 1. sahara controller container, should definitely watch the kolla project, 2. node containers, should definitely watch nova-docker
18:10:11 <tmckay> so the issue in question is, how do we get around docker hostname change restrictions in launched nodes
18:10:13 <elmiko> i listened in on the kolla meeting earlier this week, and they are preparing to add a sahara container
18:10:28 <tmckay> could just be a single issue, but I am guessing there might be other docker issues once we get into it
18:10:55 <tmckay> I'd like to learn more about docker anyway :)  EDP is so yesterday :)
18:10:59 <elmiko> tmckay: yea, we'll have to visit how we use the node hostname in the controller
18:11:25 <tmckay> elmiko, I'm wondering if we can switch most references to ip address, and defer hostname
18:11:35 <tmckay> maybe that would help
18:11:39 <tmckay> anyway ... summit
18:11:45 <tmckay> we won't solve here
18:11:51 <elmiko> tmckay: yea, that might work, or do something with container names.
18:11:58 <SergeyLukjanov> tmckay, elmiko, not always, some services require dns, as I remember zookeeper requires dns
18:12:41 <elmiko> SergeyLukjanov: dns + docker, i know it's a sticky issue. we've had some folks run into issues with reverse-dns lookup and containers
18:12:55 <tmckay> hey, my service provider just hooked me up for international.  So I can text you all in Paris :)  Account was too new, didn't show up on the web page ...
18:13:09 <elmiko> tmckay: lol, nice
18:13:36 * SergeyLukjanov planning to buy some local pre-paid sim card
18:13:41 <tmckay> oh, other issue for me ....
18:14:15 <tmckay> playing with Scala class that can be imported by Spark jobs for convenience in setting hadoop configs for swift access (lots of names in that sentence)
18:14:16 <SergeyLukjanov> do we need to discuss something about design summit?
18:14:32 <tmckay> haven't gotten too far yet, beyond learning enough Scala to do it :)
18:15:05 <tmckay> SergeyLukjanov, oh, that's a good idea on the sim card
18:15:27 <SergeyLukjanov> tmckay, lebara was good enough last time I was in paris
18:15:41 <elmiko> SergeyLukjanov: i'm happy with the session text for security stuff, i talked with OSSG and i think we'll have good communication/cooperation from them when needed.
18:15:45 <tmckay> SergeyLukjanov, I missed sign up for the Mirantis party.  Can you get me in? :)
18:16:05 <SergeyLukjanov> elmiko, great!
18:16:14 <SergeyLukjanov> tmckay, I think we'll be able to do it
18:16:20 <tmckay> I may try to clean up the EDP pad a little, but the basic topics are there
18:16:41 <SergeyLukjanov> tmckay, okay
18:16:50 <SergeyLukjanov> #topic Design Summit @ Paris
18:16:54 <tmckay> I think maybe there is not too much to do for EDP in Kilo
18:16:58 * SergeyLukjanov need to update integration part
18:17:06 <tmckay> but, I could be wrong
18:17:12 <aignatov> tmckay: thx you for composing awesome summary for me ;)
18:17:16 <SergeyLukjanov> tmckay, it's a good question
18:17:33 <tmckay> aignatov, you're welcome
18:17:35 <SergeyLukjanov> tmckay, probably it's time to expose supported job types in API and remove hardcode from Horizon
18:17:50 <elmiko> SergeyLukjanov: +1
18:18:00 <tmckay> SergeyLukjanov, yes.  I think there is cleanup we can do, but I'm not sure there are big, new features
18:18:11 <SergeyLukjanov> tmckay, yeah
18:18:58 <tmckay> alazarev spearheaded very nice refactoring, that makes it easy to add storm, fake plugin, spark, etc etc
18:19:11 <aignatov> but it could all be discussed in your section
18:19:21 <aignatov> I think we can talk about not only the kilo part
18:19:27 <aignatov> but about edp future
18:19:38 <SergeyLukjanov> aignatov, ++
18:19:48 <elmiko> will we have time outside the meetup session to talk about api v2?
18:19:48 <tmckay> agreed
18:19:59 <tmckay> sure, in the pod, at the parties
18:20:01 <aignatov> elmiko: yes
18:20:08 <tmckay> during lunch, breakfast
18:20:11 <elmiko> lol
18:20:24 <elmiko> i remember the parties from icehouse, no work is getting done there ;)
18:20:29 <SergeyLukjanov> elmiko, we could talk about it a lot on the meetup
18:20:52 <SergeyLukjanov> and some folks are unable to chair sessions after such parties ;)
18:21:02 <elmiko> lol!
18:21:16 <tmckay> hmm, I think not too many are leaving on Friday so we can go all night if we have to
18:21:23 <elmiko> i think there are some interesting ideas we should consider implementing, like async object endpoints
18:21:57 <tmckay> we really should add scaling to the CLI
18:22:01 <tmckay> it's still missing, I believe
18:22:14 <tmckay> no technical reason, just not done
18:22:18 <alazarev_> tmckay, +1
18:22:41 <SergeyLukjanov> tmckay, +1
18:22:50 <aignatov> +1
18:22:52 <SergeyLukjanov> and probably cleanup the cli implementation
18:22:53 <alazarev_> tmckay, there is plenty of work that can be done in the python client
18:23:00 <SergeyLukjanov> yeah
18:23:02 <tmckay> which pad should we add that to?  Or just make sure there is still a blueprint?
18:23:15 <tmckay> maybe we should start a client pad
18:23:18 <tosky> migrate to the unified client, if the project is still alive?
18:23:35 <tmckay> tosky, unfamiliar with unified client
18:23:46 <SergeyLukjanov> tmckay, it sounds like the client is one more very important area to invest in for Kilo
18:24:05 <tosky> tmckay: https://wiki.openstack.org/wiki/OpenStackClient
18:24:06 <elmiko> something that was brought up, that would really help with cli, is having more documented json templates
18:24:21 <jodah> documenting each parameter for each resource
18:24:23 <tmckay> +1 I think we have an "extend and improve" dev cycle, just make everything better, fill in gaps
18:24:36 <tmckay> maybe not too many headline features
18:24:51 <SergeyLukjanov> tmckay, yup, it'll be great to extend and improve all the things ;)
18:24:56 <tmckay> but, we have to have some headlines, or people get bored :)
18:25:30 <SergeyLukjanov> #topic Open discussion
18:26:17 <SergeyLukjanov> we have a question raised by our new contributors from Shanghai - to make meeting time more friendly for Asia tz
18:27:10 <SergeyLukjanov> the easiest solution is to hold every other week's meeting at a different time
18:27:22 <tmckay> good idea
18:27:25 <SergeyLukjanov> I think i'll follow up on it on the mailing list
18:27:35 <tmckay> iirc, Asia is +13 or +14 from EST
18:28:06 <tosky> Asia what timezone?
18:28:29 <tmckay> well, I meant Japan :)
18:28:36 <tmckay> but there is more :)
18:29:01 <alazarev_> the main question I have, will they regularly attend the meeting?
18:29:29 <SergeyLukjanov> it's a good question too
18:29:41 <alazarev_> I have never seen them here
18:29:51 <dmitryme> speaking about shanghai, it has timezone UTC/GMT +8 hours
18:29:52 <jodah> maybe they're asleep? :)
18:30:10 <SergeyLukjanov> yeah, I think so
18:30:13 <alazarev_> if this is not important enough to attend meeting one time at 3AM...
18:30:45 <alazarev_> will they attend _regularly_ in convenient time?
18:31:07 <SergeyLukjanov> who knows
18:32:11 <crobertsrh> What is the proposed convenient time?
18:32:31 <jodah> Perhaps ask them for a proposed convenient time range
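A rough worked example of the arithmetic behind the time zone question, assuming only the two facts stated in the log: the meeting started at 18:00 UTC on Thu Oct 30 2014, and Shanghai is at UTC+8. The helper name `local_start` and the fixed-offset zone object are illustrative, not anything the team used.

```python
from datetime import datetime, timedelta, timezone

# Meeting start as recorded by MeetBot: Thu Oct 30 18:00 2014 UTC.
meeting_utc = datetime(2014, 10, 30, 18, 0, tzinfo=timezone.utc)

# Fixed offset taken from the discussion (Shanghai, UTC+8).
# A fixed offset is enough here; no daylight-saving rules are modelled.
SHANGHAI = timezone(timedelta(hours=8), name="UTC+8")


def local_start(start_utc, tz):
    """Return the meeting start expressed in the given time zone."""
    return start_utc.astimezone(tz)


shanghai_start = local_start(meeting_utc, SHANGHAI)
# The start lands at 02:00 on the following calendar day in Shanghai.
print(shanghai_start.isoformat())
```

Under these assumptions the current slot falls in the middle of the night for Shanghai, which is why an alternating slot came up.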
18:32:41 <aignatov> crobertsrh: you found your power? ;)
18:32:49 <elmiko> lol
18:32:50 <crobertsrh> Yes :)
18:34:36 <tmckay> crobertsrh, blizzard there?
18:34:43 <tmckay> It is October, after all
18:34:51 <crobertsrh> Nothing but sunshine today.  I have no idea what happened
18:35:01 <elmiko> weird...
18:35:17 <elmiko> tmckay: actually, the forecast for halloween is snow
18:35:36 <tmckay> same here I think, but only in the mountains
18:35:50 <jodah> I'm new to the project, trying to get familiar with things :) I was reading the storm blueprint and it got me wondering about what is or isn't in scope for Sahara? What sorts of things are fair game to provision? Because I know Heat sees some of this stuff as potentially its area.
18:36:22 <SergeyLukjanov> jodah, sahara is data processing as a service
18:36:57 <SergeyLukjanov> jodah, so, we're doing not only provisioning but also operations like correct scaling/decommissioning, configuration + EDP
18:37:19 <jodah> sure. EDP applies to hadoop though, not really storm
18:37:33 <jodah> ...or stream processing, at least looking at the EDP API
18:37:43 <SergeyLukjanov> we'd like to apply it to Storm too
18:37:54 <jodah> for deploying topologies and such?
18:37:57 <SergeyLukjanov> jodah, currently it's applied to both Hadoop and Spark
18:38:12 <tmckay> jodah, why not storm?  You can run jobs in storm, right?
18:38:18 <SergeyLukjanov> jodah, storm topology == hadoop job execution
18:38:44 <jodah> sure
18:39:03 <tmckay> yeah, as long as you have data to process, and you can start it and cancel it and check its progress, it qualifies
18:39:16 <tmckay> EDP has only "run, cancel, status" at a high level
18:39:30 <jodah> the concepts somewhat line up
18:39:32 <tmckay> and "input" and "output", or an argument list
18:39:48 <jodah> aside from storm though, i'm curious just generally - about the scope of sahara
18:39:53 <tmckay> run => "go" or "start"
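The "run, cancel, status" description above can be sketched as a small interface. This is a minimal illustration under stated assumptions, not Sahara's actual EDP code: the names `JobEngine`, `JobStatus`, and `InMemoryEngine` are hypothetical, and the toy engine only tracks state instead of talking to Hadoop, Spark, or Storm.

```python
from abc import ABC, abstractmethod
from enum import Enum


class JobStatus(Enum):
    """Hypothetical job states for this sketch."""
    RUNNING = "running"
    CANCELLED = "cancelled"


class JobEngine(ABC):
    """Hypothetical high-level contract: run, cancel, status."""

    @abstractmethod
    def run(self, job_id, inputs=None, outputs=None, args=()):
        """Start a job with optional input/output or an argument list."""

    @abstractmethod
    def cancel(self, job_id):
        """Stop a running job."""

    @abstractmethod
    def status(self, job_id):
        """Report the current state of a job."""


class InMemoryEngine(JobEngine):
    """Toy engine that records state only; a stand-in for a real framework."""

    def __init__(self):
        self._jobs = {}

    def run(self, job_id, inputs=None, outputs=None, args=()):
        self._jobs[job_id] = JobStatus.RUNNING
        return job_id

    def cancel(self, job_id):
        # Only a running job can be cancelled in this sketch.
        if self._jobs.get(job_id) is JobStatus.RUNNING:
            self._jobs[job_id] = JobStatus.CANCELLED

    def status(self, job_id):
        return self._jobs[job_id]


# Usage: a Storm topology would be submitted like any other job.
engine = InMemoryEngine()
engine.run("topology-1", args=["--workers", "2"])
state_after_run = engine.status("topology-1")
engine.cancel("topology-1")
state_after_cancel = engine.status("topology-1")
```

Anything that can be started, cancelled, and polled for progress fits this shape, which is the argument made here for treating a Storm topology like a Hadoop job execution.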
18:40:04 <jodah> is it to encompass any input->process->output data pipeline?
18:40:35 <SergeyLukjanov> jodah, it's impossible to cover inf list of data processing frameworks
18:40:52 <crobertsrh> I think to encompass "many" would be a better way to say it
18:40:53 <SergeyLukjanov> so, for now we're mostly hadoop, hadoop-like and very popular frameworks :)
18:41:02 <SergeyLukjanov> crobertsrh, ++
18:41:03 <jodah> the EDP resources are obviously very hadoop specific - ex: job-binaries, whereas storm has its own specific concepts.
18:41:20 <jodah> i'm sure they can be generalized, i'm just looking at it right now
18:41:29 <tmckay> but, in general, 1) provision a cluster for analytics and 2) simple interface to run analytics
18:42:43 <jodah> fair enough :)
18:42:49 <tmckay> jodah, I'll read up on storm, not too familiar.  It would be an interesting look at the Sahara concepts, maybe expose unintended biases
18:43:23 <jodah> i just came from doing a bunch of stream processing / messaging work, including with storm, but hadoop is a newer area to me :)
18:43:32 <jodah> related question, If we're provisioning Hadoop, and Storm, should we be provisioning ZooKeeper clusters separately (since they are often shared across services in production)? Does this get a bit too close to what heat can do?
18:44:04 <tmckay> jodah, I am also thinking about mesos, and a cluster that maybe has multiple applications running on it
18:44:18 <jodah> exactly
18:44:45 <jodah> similar with ZK - we share our ZK clusters for kafka and storm. i've seen deployments that share them for hadoop as well, since the throughput is so low
18:45:38 <tmckay> jodah, will you be at summit?
18:45:44 <jodah> unfortunately i will not
18:45:51 <tmckay> pity
18:45:53 <jodah> i'm sure i will at the following summit though
18:45:55 <tmckay> :)
18:46:05 <tmckay> okay, great
18:46:17 <jodah> i just started working on some of this stuff with HP cloud. i'll be around a lot :)
18:46:17 <elmiko> fwiw i think it's an interesting idea we should investigate
18:47:04 <jodah> with ZK in particular though, does that start to overlap too much with heat?
18:47:30 <jodah> provisioning things that don't really get re-configured much or scaled up/down
18:48:16 <tmckay> well, sahara uses heat under the hood (can use heat)
18:48:16 <SergeyLukjanov> jodah, I think it'll look like we could specify an external ZK cluster for the clusters provisioned by sahara
18:48:31 <jodah> that would make sense
18:48:33 <tmckay> so, if we have things that just map through to heat, that will be fine imho
18:48:43 <SergeyLukjanov> yeah
18:48:53 <SergeyLukjanov> and due to the fact that we use heat
18:49:00 <jodah> or somehow delegate to a heat stack to standup the ZK cluster that might be needed for hadoop/storm
18:49:01 <SergeyLukjanov> and sahara could be used from heat
18:50:28 <jodah> One more thought on storm - aside from lifecycle management, scaling up/down, etc., has there been any talk about doing more stream processing related abstractions on top of storm/spark-streaming, similar to what other services like kinesis do?
18:51:09 <SergeyLukjanov> jodah, it's a topic to talk about
18:51:11 <jodah> it's hard to generalize storm since the concepts are so specific, but perhaps a few things could be done
18:51:30 <tmckay> jodah, have you talked to tellesnobrega?  He is doing the storm plugin
18:51:56 <jodah> no, but i'd like to
18:52:38 <tmckay> jodah, he's been doing most of the work.  Maybe you can collaborate
18:52:55 <tmckay> most may even be "all"
18:52:56 <jodah> sounds good
18:53:19 <jodah> since i'm new to the project i'm not sure which areas i'll be asked to focus on, but that is certainly possible :)
18:54:52 <tmckay> 5 minutes folks, anything else?
18:54:59 * tmckay pretends to be Sergey :)
18:55:13 <SergeyLukjanov> if not we could end it 5 mins earlier and have extra coffee
18:55:22 <tmckay> ++
18:55:27 <elmiko> +1
18:55:51 <SergeyLukjanov> #endmeeting