14:01:24 #startmeeting sahara
14:01:25 Meeting started Thu Jul 2 14:01:24 2015 UTC and is due to finish in 60 minutes. The chair is SergeyLukjanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:26 #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
14:01:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:29 The meeting name has been set to 'sahara'
14:01:57 #topic sahara@horizon status (crobertsrh, NikitaKonovalov)
14:02:07 #link https://etherpad.openstack.org/p/sahara-reviews-in-horizon
14:02:24 There is finally a change created to move Sahara panels to contrib
14:02:39 #link https://review.openstack.org/#/c/197363/
14:02:43 \o/
14:03:14 so once that one is merged, we'll need to rebase everything
14:03:45 actually it's better to start rebasing now
14:04:12 NikitaKonovalov, ++ for starting rebasing now
14:04:19 because the current changes won't be merged anyway
14:04:25 NikitaKonovalov, what's the progress of moving to contrib?
14:04:25 I have several patches added, one related to the node process tab
14:04:38 for horizon I meant
14:04:58 SergeyLukjanov: the patch is on review, and guess what? no reviews so far
14:05:12 :(
14:05:45 and now it's in merge conflict
14:05:46 I hope we'll get it merged fast
14:06:17 david-lyle, could we help somehow with a patch that is moving sahara to contrib?
14:06:20 vgridnev: it's now a good moment to rebase your changes on top of the contrib change
14:06:41 NikitaKonovalov, ok
14:07:03 anyway I'll chack that that all our changes are rebased
14:07:12 check*
14:08:34 NikitaKonovalov, ack
14:08:49 NikitaKonovalov, and please ensure that our etherpad contains all links to the CRs
14:08:50 no more news from me
14:08:55 ok, thx
14:08:57 SergeyLukjanov: sure
14:09:06 #topic spark version status (elmiko)
14:09:14 hey
14:09:19 elmiko, hey ;)
14:09:30 given the recent changes proposed from venza, i want to talk about deprecating 0.9.1
14:09:33 should we support older versions of spark?
14:09:33 #link https://review.openstack.org/#/c/195054 introduces 1.3 version of spark, but sahara plugin only lists 0.9.1 and 1.0.0 (https://review.openstack.org/#/c/196686 adds 1.3.1 to the sahara plugin -- venza), and we have no way to regenerate older versions
14:09:37 and what our plans are for the future
14:09:58 i don't see much point in supporting more than 1 older version of spark
14:10:10 elmiko, agreed
14:10:14 elmiko: agree
14:10:18 actually versions 0.9.1 and 1.0.0 are no longer available from spark's website
14:10:25 right, that too
14:10:38 imo, we really need to get 1.4 support =)
14:10:46 we could keep them as deprecated (only for ops on the already provisioned clusters)
14:11:01 and keep only latest version active
14:11:11 SergeyLukjanov: +1
14:11:49 we did not test 1.4 yet, but 1.3.1 works well, that's why my patch is for 1.3.1
14:12:01 SergeyLukjanov: +1
14:12:03 I hope to have fully working Spark support in HDP and CDH someday (with EDP etc.) and then we'll probably be able to use them instead of a separate plugin, what do you guys think about it?
14:12:04 i'm talking more optimistically venza =)
14:12:23 venza: do you plan to push 1.4 for Liberty, and if yes, will it be 1.4 only or 1.4+1.3?
14:12:30 i think we should still have a "vanilla" spark plugin at some point
14:12:49 another random idea - to merge vanilla hadoop and spark plugins
14:12:50 yes, it is like the vanilla hadoop plugin
14:13:05 oh, spark based on CDH HDFS
14:13:05 i've been experimenting with a bare fedora spark 1.4 image, and i could see value in having a separate, simple, spark plugin
14:13:34 lately we are running compute only clusters without local hdfs
14:13:44 imo, we should someday have "vanilla hadoop" and "vanilla spark", but maybe not as the same plugin
14:14:37 tosky: I plan to work on 1.4 right after 1.3 gets merged
14:14:59 tosky: probably is is again a simple patch
14:15:05 *it
14:15:08 yea
14:16:30 sounds like we are in agreement about deprecating then
14:16:44 yup
14:16:50 yes
14:17:11 we'll probably have to talk in tokyo about the future of the spark plugin though?
14:17:23 #agreed keep only latest spark version active and deprecate older versions
14:17:32 that's the summary I think ^^
14:17:37 +1
14:17:40 +1
14:18:01 great
14:18:09 #topic News / updates
14:19:16 keystone session spec is finally up, #link https://review.openstack.org/#/c/197743/
14:19:23 i have some poc code working locally with this
14:19:30 I've been working on NameNode HA support for HDP. And it is not working due to a strange Ambari behavior when starting NameNodes
14:19:37 Made some important backports to the stable/kilo branch in sahara, researching several bugs, and also horizon, etc.
14:19:54 i've been working on resources shared across tenants and protected from updates/deletions
14:20:00 NikitaKonovalov: uh, for 2.2? We found an issue in 2.0.6, Oozie is not properly configured
14:20:07 An Ambari bug is here https://issues.apache.org/jira/browse/AMBARI-12235 if anyone is interested
14:20:12 tosky: 2.2
14:20:15 We successfully moved to the in-tree Devstack plugin, so, it's now very easy to improve its integration
14:20:17 We are working on Sahara/Manila integration and I just cooked up a slide deck on google docs for your reference.
14:20:22 https://docs.google.com/presentation/d/1qv2L1AVJ2BuBfBX33bE2hBu4KROV4l5h5xVHG9S9Sp8/edit?usp=sharing
14:20:31 NikitaKonovalov: for reference, https://bugs.launchpad.net/sahara/+bug/1470841
14:20:31 Launchpad bug 1470841 in Sahara "NameNode HA for HDP2 does not set up Oozie correctly" [Undecided,New] - Assigned to Ethan Gafford (egafford)
14:20:39 working on cluster verification checks #link https://review.openstack.org/#/c/196713/
14:20:54 tosky: I'll keep that in mind, thanks
14:20:54 Please take a look and let me know your comments.
14:20:59 I am working on NameNode HA support for CDH. hope I can finish the code soon.
14:21:04 weiting, thx!
14:21:10 kchen, cool!
14:21:22 kchen, are you going to impl HA for other services?
14:21:25 NikitaKonovalov: egafford is working on it (but I don't see him online), so you can coordinate with him
14:21:37 @Sergey, yes for Yarn.
14:21:47 Others are not in the current plan
14:21:51 kchen: do you have a change on review for NameNode HA?
14:22:48 @Nikita not yet.
14:22:56 hope I can submit one patch soon.
14:23:19 kchen: ok, I was just curious to see how CDH does HA
14:23:33 egafford: I have updated recurrence edp job spec
14:23:33 I have already submitted a spec and a patch to add a CM API for HDFS HA on CDH
14:23:45 AndreyPavlov, how is it going with the grenade code?
14:24:00 AndreyPavlov, have you tested it?
14:24:11 huichun: Cool; I'll take another look. More eyes on the recurrence spec would be handy, btw; it's important.
14:24:14 kchen: https://review.openstack.org/#/c/197551/ here is the change I have for HDP right now
14:24:29 i've tested it as part of grenade
14:24:39 NikitaKonovalov: cool, I will check that.
14:25:18 #topic Open discussion
14:25:26 it worked fine, but haven't checked it after plugins implementation
14:25:50 just a question about version deprecation
14:25:52 NikitaKonovalov: https://review.openstack.org/#/c/196929/ this spec shows how to do HDFS HA for CDH
14:26:14 it means the version is still listed, but no tests are run?
14:26:30 how is a user made aware that version X is deprecated?
14:26:43 All, one thing I am not clear on: whether we have a plan to align all job/job_execution into job_template/job everywhere: code, docs.
14:26:54 Recently I was reading the documents, and found the names very confusing.
14:27:08 kchen, it could be fully done only with a v2 API
14:27:16 venza, I suppose we can add some validation rules about deprecation of some plugin versions
14:27:16 kchen: thanks I'll have a look
14:27:23 kchen, because we need to change the API endpoints and object names for it
14:27:27 kchen: yea, i have ideas about that for v2
14:27:44 SergeyLukjanov: ok. but Horizon already changed the names in the UI, right?
14:27:48 venza: when vanilla 2.4 was deprecated, the version was listed but you couldn't create any new clusters
14:27:59 tosky: ok
14:28:30 changing topic, I will propose a small spec to allow the scenario test runner to consume mako templates and convert the yaml files we have in etc/scenario to this format, so that sahara-ci won't need to do the replacement
14:28:52 I will need some feedback from degorenko, I guess
14:29:03 kchen, SergeyLukjanov, i should have a first draft of the v2 api spec up within the next 2 weeks. it has a roadmap for some of these changes (including renaming job templates/jobs)
14:29:38 i have a proposed strategy for a number of changes we should implement over the remaining L and then M cycles
14:29:42 tosky: so if I want to do scenario tests on my local machine, we do not need to do the replacement by hand, right?
14:29:51 kchen: exactly!
14:29:56 kchen: that's the good side effect
14:30:33 tosky: sounds great :)
14:31:32 tosky, could you probably write a spec for it?
14:31:46 hehe, that's exactly what we talked about last night!
14:31:52 SergeyLukjanov: that's what I'm doing right now :)
14:31:59 great
14:32:09 yeah, I remember we were talking about it last night ;)
14:32:13 or day
14:33:40 i
14:33:42 i
14:33:45 lol
14:33:47 ?
14:34:04 so silent :)
14:34:13 i'll plead once again, i could use reviews on https://review.openstack.org/#/c/197743/ and https://review.openstack.org/#/c/179393/ =)
14:34:20 sorry, mistyped a few
14:35:04 While we're on begging for reviews, it'd be great to finish off review on https://review.openstack.org/#/c/187809/ : there's more work to do on this change during the cycle in client/UI/integration tests, but it's currently pending this review.
14:36:28 that should be our topic after open discussion, "review begging" ;)
14:36:47 Heh.
14:37:02 * elmiko shakes his cup
14:40:30 egafford, I'll take a look at it today again and it looks like I'll approve it :)
14:41:17 Thanks SergeyLukjanov! I just want to make sure there's enough time to get all the bits in place in L.
14:41:36 egafford, that's important
14:42:04 folks, propose corresponding changes to client asap to make us able to release client with at least a few features at a time ;)
14:42:17 SergeyLukjanov: about storing Spark images. There is 1.0.0 on mirantis web space linked in the documentation
14:43:10 good question venza!
14:43:39 I have a 1.3.1 on google drive, but that is not good for documentation
14:45:15 venza, we could put it to the storage
14:45:53 SergeyLukjanov: you mean update the image on mirantis server?
14:45:56 I'm still thinking about how to safely publish images to tarballs.o.o
14:46:07 venza, yeah
14:46:41 I will remove the link from the documentation for 1.0.0 since we are deprecating it, then we will see
14:46:48 the main issue is that if we publish images after each merged change - nothing will be tested and it's not safe
14:47:54 +1
14:48:43 what's our option though, i think it's nice if we provide some images but we will always have the problem of validation.
14:49:00 it doesn't seem friendly to ask all users to create images, but it almost seems like the best path.
14:49:53 elmiko: That doesn't really fix the validation issue, though, so much as defer the problem away from the dev team toward support teams.
14:50:00 i think that we want users to create their own images, we should make it easier, more straightforward
14:50:01 right
14:50:30 egafford: but otoh, who's doing all this validation work on the images, and how to keep up with newer versions in light of that?
14:50:53 elmiko, yeah
14:51:28 tellesnobrega: Yeah, I have this dream of having an image generation service that's shared by Trove, Sahara, Octavia, etc.
14:51:28 elmiko: I don't have a fabulous solution to that problem either, really, other than gating at major releases.
14:51:41 egafford, + 1
14:51:43 right now, before publishing, we're asking esikachev to generate all images, test them and upload :)
14:52:02 and we appreciate that esikachev is doing that =)
14:52:13 elmiko, SergeyLukjanov: Backward (and even forward) compatibility will always be rough here without a vast matrix.
14:53:03 egafford: agreed, my concern is that in the time it takes us to produce validated images, for spark let's say, we will want to be pushing a newer version. and will anyone even want the older versions?
14:54:09 assuming our images are for users who are trying out sahara, why isn't it fair to just announce "these are experimental images, for production usage you should use something else"
14:54:39 elmiko, good idea
14:54:40 i mean, if someone is going into production with sahara they should have a partner, or know how to create/validate images, or something...
14:54:52 we could probably publish all latest images
14:55:16 just make sure to announce "experimental, use at your own risk" ;)
14:55:20 and document that these are just experimental images
14:55:20 elmiko: Well, this is a classic reliability vs. shiny tradeoff. Some folks will want the new shiny even at risk of pain, some will want things to just work immediately and will flee from Sahara if they don't.
14:55:28 +1 for publishing with the disclaimer
14:55:43 more than that, we could some day add some CI for these images
14:55:50 probably periodic jobs
14:55:54 elmiko, i agree with that
14:56:21 egafford: agreed, but how much assurance can we provide when we are talking about the depth of, essentially, free validation work?
14:57:20 good question :)
14:57:27 btw 3 mins left
14:58:04 elmiko: I think it's not at all unreasonable to assert that at major and point releases, we have a version of sahara-image-elements (which depends on a version of DIB) which we know to be able to create a functional image.
14:59:00 it makes sense to me if we have the juno release with supported validated images, and the kilo release with ..., but for master work it should be experimental images. imo
14:59:01 here at the university, we are starting to announce sahara so other researchers can use it, and we will probably need different images, having a way to generate them quickly would be nice, i mean, we don't want to have to have a setup just to build images, and we don't want to have to clone three projects to generate images every time
14:59:14 just on the direction of DIY images
14:59:39 tellesnobrega: but wouldn't it make more sense for your organization to provide a set of images that are blessed for all to ue?
14:59:42 *use
15:00:03 yes, of course
15:00:14 etoews: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
15:00:16 #endmeeting