18:00:52 #startmeeting sahara
18:00:52 Meeting started Thu Oct 30 18:00:52 2014 UTC and is due to finish in 60 minutes. The chair is SergeyLukjanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:53 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:56 The meeting name has been set to 'sahara'
18:00:56 ping sahara folks
18:01:14 o/
18:01:19 SergeyLukjanov, crobertsrh has lost power, and mattf is out
18:01:25 o/
18:01:26 o/
18:01:29 tmckay, ack
18:01:32 o/
18:01:52 o/
18:02:36 #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
18:02:44 #topic sahara@horizon status (croberts, NikitaKonovalov)
18:03:08 NikitaKonovalov you're the only presenter for this topic
18:05:04 okay, let's move on
18:05:08 #undo
18:05:09 Removing item from minutes:
18:05:16 #topic News / updates
18:05:18 folks, please
18:05:29 I've been mostly working on preparing for the summit
18:05:56 i'm making good progress on bug#1306505, i've also been talking with the OSSG folks and the API working group. i've got some ideas about security and api to talk about at summit.
18:06:16 #1306505
18:06:22 oh, also trying to work up a spec for api v2
18:06:36 elmiko, that's great
18:06:47 elmiko i'd love to have a look at the spec when it's ready
18:07:20 SergeyLukjanov: i think there is some interesting activity happening surrounding api standardization
18:07:23 interesting question on openstack-dev about dockerizing Sahara. We should discuss at summit whether to make dockerized Sahara a feature for kilo
18:07:26 jodah: definitely
18:07:30 not sure what the issues may be
18:07:45 I'm working on various bug fixes
18:07:50 SergeyLukjanov, my Internet connection is dropping frequently, status for Sahara UI is good, there are a few minor comments
18:07:51 tmckay, dockerized sahara itself or dockerized hadoop cluster?
18:08:09 SergeyLukjanov tmckay both ideally :)
18:08:17 SergeyLukjanov, unclear from the email, I asked for clarification
18:08:25 I'm also trying to keep backports up-to-date for stable/juno
18:08:30 jodah, ++, might be two different features
18:08:34 SergeyLukjanov, tmckay, i think the email was talking about hadoop nodes being containerized
18:09:01 elmiko, tmckay, okay, so, anyway both of the dockerizations are interesting
18:09:02 I think so too, the specific question was about changing hostnames
18:09:05 so I think it was nodes
18:09:19 dockerizing sahara is perhaps more about dockerizing openstack/devstack entirely
18:09:20 tmckay, yeah, sounds like that (just saw an email)
18:09:35 i think we have 2 paths for containers; 1. sahara controller container, should definitely watch the kolla project, 2. node containers, should definitely watch nova-docker
18:10:11 so the issue in question is, how do we get around docker hostname change restrictions in launched nodes
18:10:13 i listened in on the kolla meeting earlier this week, and they are preparing to add a sahara container
18:10:28 could just be a single issue, but I am guessing there might be other docker issues once we get into it
18:10:55 I'd like to learn more about docker anyway :) EDP is so yesterday :)
18:10:59 tmckay: yea, we'll have to visit how we use the node hostname in the controller
18:11:25 elmiko, I'm wondering if we can switch most references to ip address, and defer hostname
18:11:35 maybe that would help
18:11:39 anyway ... summit
18:11:45 we won't solve here
18:11:51 tmckay: yea, that might work, or do something with container names.
18:11:58 tmckay, elmiko, not always, some services require dns, as I remember zookeeper requires dns
18:12:41 SergeyLukjanov: dns + docker, i know it's a sticky issue. we've had some folks run into issues with reverse-dns lookup and containers
18:12:55 hey, my service provider just hooked me up for international.
So I can text you all in Paris :) Account was too new, didn't show up on the web page ...
18:13:09 tmckay: lol, nice
18:13:36 * SergeyLukjanov planning to buy some local pre-paid sim card
18:13:41 oh, other issue for me ....
18:14:15 playing with a Scala class that can be imported by Spark jobs for convenience in setting hadoop configs for swift access (lots of names in that sentence)
18:14:16 do we need to discuss something about design summit?
18:14:32 haven't gotten too far yet, beyond learning enough Scala to do it :)
18:15:05 SergeyLukjanov, oh, that's a good idea on the sim card
18:15:27 tmckay, lebara was good enough last time I was in paris
18:15:41 SergeyLukjanov: i'm happy with the session text for security stuff, i talked with OSSG and i think we'll have good communication/cooperation from them when needed.
18:15:45 SergeyLukjanov, I missed sign up for the Mirantis party. Can you get me in? :)
18:16:05 elmiko, great!
18:16:14 tmckay, I think we'll be able to do it
18:16:20 I may try to clean up the EDP pad a little, but the basic topics are there
18:16:41 tmckay, okay
18:16:50 #topic Design Summit @ Paris
18:16:54 I think maybe there is not too much to do for EDP in Kilo
18:16:58 * SergeyLukjanov need to update integration part
18:17:06 but, I could be wrong
18:17:12 tmckay: thx you for composing awesome summary for me ;)
18:17:16 tmckay, it's a good question
18:17:33 aignatov, you're welcome
18:17:35 tmckay, probably it's time to expose supported job types in API and remove hardcode from Horizon
18:17:50 SergeyLukjanov: +1
18:18:00 SergeyLukjanov, yes.
I think there is cleanup we can do, but I'm not sure there are big, new features
18:18:11 tmckay, yeah
18:18:58 alazarev spearheaded very nice refactoring, that makes it easy to add storm, fake plugin, spark, etc etc
18:19:11 but it could all be discussed in your section
18:19:21 I think we can talk about not only the kilo part
18:19:27 but about edp future
18:19:38 aignatov, ++
18:19:48 will we have time outside the meetup session to talk about api v2?
18:19:48 agreed
18:19:59 sure, in the pod, at the parties
18:20:01 elmiko: yes
18:20:08 during lunch, breakfast
18:20:11 lol
18:20:24 i remember the parties from icehouse, no work is getting done there ;)
18:20:29 elmiko, we could talk about it a lot on the meetup
18:20:52 and some folks are unable to chair sessions after such parties ;)
18:21:02 lol!
18:21:16 hmm, I think not too many are leaving on Friday so we can go all night if we have to
18:21:23 i think there are some interesting ideas we should consider implementing, like async object endpoints
18:21:57 we really should add scaling to the CLI
18:22:01 it's still missing, I believe
18:22:14 no technical reason, just not done
18:22:18 tmckay, +1
18:22:41 tmckay, +1
18:22:50 +1
18:22:52 and probably cleanup the cli implementation
18:22:53 tmckay, there is plenty of work that can be done in python client
18:23:00 yeah
18:23:02 which pad should we add that to? Or just make sure there is still a blueprint?
18:23:15 maybe we should start a client pad
18:23:18 migrate to the unified client, if the project is still alive?
18:23:35 tosky, unfamiliar with unified client
18:23:46 tmckay, it sounds like the client is one more very important area to invest in for Kilo
18:24:05 tmckay: https://wiki.openstack.org/wiki/OpenStackClient
18:24:06 something that was brought up, that would really help with cli, is having more documented json templates
18:24:21 documenting each parameter for each resource
18:24:23 +1 I think we have an "extend and improve" dev cycle, just make everything better, fill in gaps
18:24:36 maybe not too many headline features
18:24:51 tmckay, yup, it'll be great to extend and improve all the things ;)
18:24:56 but, we have to have some headlines, or people get bored :)
18:25:30 #topic Open discussion
18:26:17 we have a question raised by our new contributors from Shanghai - to make meeting time more friendly for Asia tz
18:27:10 the easiest solution is to hold the meeting at a different time every other week
18:27:22 good idea
18:27:25 I think i'll follow up on it on the mailing list
18:27:35 iirc, Asia is +13 or +14 from EST
18:28:06 Asia what timezone?
18:28:29 well, I meant Japan :)
18:28:36 but there is more :)
18:29:01 the main question I have, will they regularly attend the meeting?
18:29:29 it's a good question too
18:29:41 I have never seen them here
18:29:51 speaking about shanghai, it has timezone UTC/GMT +8 hours
18:29:52 maybe they're asleep? :)
18:30:10 yeah, I think so
18:30:13 if this is not important enough to attend meeting one time at 3AM...
18:30:45 will they attend _regularly_ at a convenient time?
18:31:07 who knows
18:32:11 What is the proposed convenient time?
18:32:31 Perhaps ask them for a proposed convenient time range
18:32:41 crobertsrh: you found your power? ;)
18:32:49 lol
18:32:50 Yes :)
18:34:36 crobertsrh, blizzard there?
18:34:43 It is October, after all
18:34:51 Nothing but sunshine today. I have no idea what happened
18:35:01 weird...
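[Editor's note on the time-zone arithmetic in the open discussion: a minimal sketch, assuming the meeting stays at 18:00 UTC as in this log's header, that converts the start time to local time in Shanghai and Tokyo. It uses fixed UTC offsets (+8 and +9; neither city observes daylight saving time) instead of a time-zone database, and the helper name `local_start` is made up for illustration.]

```python
from datetime import datetime, timedelta, timezone

# Meeting start taken from the log header: Thu Oct 30 2014, 18:00 UTC.
MEETING_UTC = datetime(2014, 10, 30, 18, 0, tzinfo=timezone.utc)

# Assumption: fixed offsets are good enough here because neither city uses DST.
OFFSETS = {
    "Shanghai": timezone(timedelta(hours=8)),
    "Tokyo": timezone(timedelta(hours=9)),
}


def local_start(city):
    """Return the meeting start as a local datetime for the given city."""
    return MEETING_UTC.astimezone(OFFSETS[city])


for city in OFFSETS:
    start = local_start(city)
    print(city, start.strftime("%Y-%m-%d %H:%M"))
```

[Under these assumptions the 18:00 UTC slot falls at 02:00 in Shanghai and 03:00 in Tokyo, both on the following calendar day.]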
18:35:17 tmckay: actually, the forecast for halloween is snow
18:35:36 same here I think, but only in the mountains
18:35:50 I'm new to the project, trying to get familiar with things :) I was reading the storm blueprint and it got me wondering about what is or isn't in scope for Sahara? What sorts of things are fair game to provision? Because I know Heat sees some of this stuff as potentially its area.
18:36:22 jodah, sahara is data processing as a service
18:36:57 jodah, so, we're doing not only provisioning but also operations like correct scaling/decommissioning, configuration + EDP
18:37:19 sure. EDP applies to hadoop though, not really storm
18:37:33 ...or stream processing, at least looking at the EDP API
18:37:43 we'd like to apply it to Storm too
18:37:54 for deploying topologies and such?
18:37:57 jodah, currently it's applied to both Hadoop and Spark
18:38:12 jodah, why not storm? You can run jobs in storm, right?
18:38:18 jodah, storm topology == hadoop job execution
18:38:44 sure
18:39:03 yeah, as long as you have data to process, and you can start it and cancel it and check its progress, it qualifies
18:39:16 EDP has only "run, cancel, status" at a high level
18:39:30 the concepts somewhat line up
18:39:32 and "input" and "output", or an argument list
18:39:48 aside from storm though, i'm curious just generally - about the scope of sahara
18:39:53 run => "go" or "start"
18:40:04 is it to encompass any input->process->output data pipeline?
18:40:35 jodah, it's impossible to cover an infinite list of data processing frameworks
18:40:52 I think to encompass "many" would be a better way to say it
18:40:53 so, for now we're mostly hadoop, hadoop-like and very popular frameworks :)
18:41:02 crobertsrh, ++
18:41:03 the EDP resources are obviously very hadoop specific - ex: job-binaries, whereas storm has its own specific concepts.
18:41:20 i'm sure they can be generalized, i'm just looking at it right now
18:41:29 but, in general, 1) provision a cluster for analytics and 2) simple interface to run analytics
18:42:43 fair enough :)
18:42:49 jodah, I'll read up on storm, not too familiar. It would be an interesting look at the Sahara concepts, maybe expose unintended biases
18:43:23 i just came from doing a bunch of stream processing / messaging work, including with storm, but hadoop is a newer area to me :)
18:43:32 related question, If we're provisioning Hadoop, and Storm, should we be provisioning ZooKeeper clusters separately (since they are often shared across services in production)? Does this get a bit too close to what heat can do?
18:44:04 jodah, I am also thinking about mesos, and a cluster that maybe has multiple applications running on it
18:44:18 exactly
18:44:45 similar with ZK - we share our ZK clusters for kafka and storm. i've seen deployments that share them for hadoop as well, since the throughput is so low
18:45:38 jodah, will you be at summit?
18:45:44 unfortunately i will not
18:45:51 pity
18:45:53 i'm sure i will at the following summit though
18:45:55 :)
18:46:05 okay, great
18:46:17 i just started working on some of this stuff with HP cloud. i'll be around a lot :)
18:46:17 fwiw i think it's an interesting idea we should investigate
18:47:04 with ZK in particular though, does that start to overlap too much with heat?
18:47:30 provisioning things that don't really get re-configured much or scaled up/down
18:48:16 well, sahara uses heat under the hood (can use heat)
18:48:16 jodah, I think it'll look like we could specify an external ZK cluster for the clusters provisioned by sahara
18:48:31 that would make sense
18:48:33 so, if we have things that just map through to heat, that will be fine imho
18:48:43 yeah
18:48:53 and due to the fact that we use heat
18:49:00 or somehow delegate to a heat stack to standup the ZK cluster that might be needed for hadoop/storm
18:49:01 and sahara could be used from heat
18:50:28 One more thought on storm - aside from lifecycle management, scaling up/down, etc., has there been any talk about doing more stream processing related abstractions on top of storm/spark-streaming, similar to what other services like kinesis do?
18:51:09 jodah, it's a topic to talk about
18:51:11 it's hard to generalize storm since the concepts are so specific, but perhaps a few things could be done
18:51:30 jodah, have you talked to tellesnobrega? He is doing the storm plugin
18:51:56 no, but i'd like to
18:52:38 jodah, he's been doing most of the work. Maybe you can collaborate
18:52:55 most may even be "all"
18:52:56 sounds good
18:53:19 since i'm new to the project i'm not sure which areas i'll be asked to focus on, but that is certainly possible :)
18:54:52 5 minutes folks, anything else?
18:54:59 * tmckay pretends to be Sergey :)
18:55:13 if not we could end it 5 mins earlier and have extra coffee
18:55:22 ++
18:55:27 +1
18:55:51 #endmeeting
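[Editor's note on the EDP discussion around 18:39: a minimal sketch of the "run, cancel, status" contract described there, suggesting how a Hadoop job and a Storm topology might sit behind one interface. Every name here (`JobEngine`, `FakeStormEngine`, the status strings, the topology name) is hypothetical and is not Sahara's actual API; the in-memory engine only simulates state changes.]

```python
from abc import ABC, abstractmethod
import itertools


class JobEngine(ABC):
    """Hypothetical interface with the three operations named in the log."""

    @abstractmethod
    def run(self, job, args):
        """Start a job with an argument list and return its id."""

    @abstractmethod
    def cancel(self, job_id):
        """Request that a running job stop."""

    @abstractmethod
    def status(self, job_id):
        """Return a status string for the job."""


class FakeStormEngine(JobEngine):
    """In-memory stand-in; a real engine would talk to a cluster."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def run(self, job, args):
        job_id = next(self._ids)
        self._jobs[job_id] = "RUNNING"
        return job_id

    def cancel(self, job_id):
        # Only a running job changes state when cancelled in this sketch.
        if self._jobs[job_id] == "RUNNING":
            self._jobs[job_id] = "KILLED"

    def status(self, job_id):
        return self._jobs[job_id]


engine = FakeStormEngine()
job_id = engine.run("word-count-topology", ["--input", "in", "--output", "out"])
print(engine.status(job_id))
engine.cancel(job_id)
print(engine.status(job_id))
```

[The point of the sketch is only that anything which can be started, cancelled, and polled fits the same shape; inputs and outputs travel as an argument list, as suggested in the discussion.]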