#openstack-meeting-3 log

14:00:48 <SergeyLukjanov> #startmeeting sahara
14:00:50 <openstack> Meeting started Thu Oct  8 14:00:48 2015 UTC and is due to finish in 60 minutes.  The chair is SergeyLukjanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:51 <SergeyLukjanov> #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
14:00:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:54 <openstack> The meeting name has been set to 'sahara'
14:01:05 <crobertsrh> Hello \o/
14:01:13 <esikachev> hi
14:01:13 <SergeyLukjanov> let's wait for a few more minutes
14:01:27 <elmiko> o/
14:01:36 <vgridnev> hi
14:02:21 <SergeyLukjanov> #topic sahara@horizon status (crobertsrh, vgridnev)
14:02:30 <SergeyLukjanov> #link https://etherpad.openstack.org/p/sahara-reviews-in-horizon
14:02:41 <crobertsrh> I added one bug fix to that list this week.
14:02:49 <crobertsrh> It has to do with filtering by plugin name.
14:03:07 <crobertsrh> It's kind of ugly, but actually not a regression.
14:03:18 <vgridnev> nothing from me on this topic
14:03:50 <crobertsrh> Soon, I will have a change posted for adding shares to node group templates.  Reviews will be appreciated.
14:04:17 <SergeyLukjanov> cool, thx
14:04:19 <SergeyLukjanov> #topic News / updates
14:04:54 <elmiko> mainly been working on talks for summit, also a little help with coordinating some outreachy applicants for sahara
14:05:27 <vgridnev> working on ambari plugin improvements
14:05:27 <egafford> Also primarily on summit talks and discussions.
14:05:47 <AndreyPavlov> hi guys, have been working on CLI and grenade job
14:05:55 <esikachev> working on SPI for cluster verification checks, testing liberty
14:06:07 <crobertsrh> I've been working on adding manila share functionality to horizon (node/cluster templates and running clusters)
14:06:34 <tosky> some time to test Liberty
14:07:12 <tosky> and finding bugs, like the one mentioned by crobertsrh (which would be nice to backport to Liberty, if it's not a problem, once it's fixed): https://bugs.launchpad.net/horizon/+bug/1503235
14:07:12 <openstack> Launchpad bug 1503235 in OpenStack Dashboard (Horizon) "[Sahara] Filter (on strings?) not working at least for Node Group and Cluster Template pages" [Undecided,In progress] - Assigned to Chad Roberts (croberts)
14:07:34 <vgridnev> crobertsrh, I viewed Trevor's change for filtering in sahara for cluster templates, it looks like it breaks backward compat.
14:07:38 <vgridnev> tosky, ^^
14:07:47 <elmiko> #link https://review.openstack.org/#/c/232067/
14:07:52 <elmiko> that's the review in question
14:07:58 <tosky> vgridnev: what exactly?
14:08:01 <sreshetn1ak> I'm working on support api-paste.ini
14:08:05 <crobertsrh> vgridnev:  He was striving to not break backward compat, but I haven't looked at it yet.
14:08:26 <elmiko> yea, it looked like from his tests that it would not break compat
14:09:28 <vgridnev> so, if we have 2 cluster templates, with names "abs" and "abss", and then filter by name with arg "abs", we will get 2 cluster templates as a result
14:09:52 <tosky> yes, but this is what happens when you use the same search box in Compute, for example
14:10:14 <tosky> consistency is more important, I personally think that the current behavior is incorrect; I would say users expect substring matching
14:10:54 <egafford> tosky: I think vgridnev's point is that existing users will see a change in response. Still, that's the goal in this case; to fix the bug, behavior does have to change.
14:10:55 <vgridnev> tosky, but initially it was perfect matching, so we are changing the output result
14:11:22 <tosky> vgridnev: yes, to be consistent with similar search filters in horizon
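
The two behaviors being compared above can be illustrated with a minimal Python sketch. It is not Sahara's or Horizon's actual filtering code; the function names and the dict-based templates are assumptions made only for illustration, using vgridnev's "abs"/"abss" example.

```python
def filter_exact(templates, name):
    """Keep only templates whose name equals `name` (the 'perfect match' behavior)."""
    return [t for t in templates if t["name"] == name]


def filter_substring(templates, name):
    """Keep templates whose name contains `name` (the substring behavior)."""
    return [t for t in templates if name in t["name"]]


# Hypothetical data mirroring the example from the discussion.
templates = [{"name": "abs"}, {"name": "abss"}]

exact = filter_exact(templates, "abs")          # one template: "abs"
substring = filter_substring(templates, "abs")  # two templates: "abs" and "abss"
```

A caller that expects at most one result from a name filter would see a different response under substring matching, which is the backward-compatibility concern raised here.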
14:11:26 <egafford> So either we live with it until we rev the API for backward compat, or we fix the bug.
14:12:07 <tosky> egafford: it's not a matter of API, it's behavior; if we say that it can't be changed (i.e. perfect match only), it will never be changed regardless of the internal code implementation
14:12:20 <egafford> I think in this case it's reasonable to call the initial version wrong, and worth fixing, rather than reasonable and worth preserving (even if we'd rather do something else.)
14:12:40 <crobertsrh> +1 for consistency across openstack
14:14:11 <egafford> tosky: Well, vgridnev's argument is that we shouldn't change the response (behavior) of an existing API. Response is part of API too; we could decide to only change at a version switch. Still, +1 to consistency across openstack, as crobertsrh says.
14:14:26 <tosky> uhm uhm
14:14:29 <egafford> I think fixing it is a win.
14:15:10 <tosky> and a new API? I guess no one wants that
14:15:24 <elmiko> i want that ;)
14:15:25 <tosky> the goal (from my point of view) is consistency in search behavior
14:15:26 <egafford> tosky: Well, elmiko wants that.
14:15:28 <egafford> :)
14:15:31 <tosky> in Horizon, at least
14:15:45 <tosky> not talking about 2.0 :)
14:15:52 <tosky> I was talking about a new method here
14:15:57 <egafford> (So do I.) tosky: Yeah, we agree, I'm just mapping out the alternative argument.
14:16:20 <elmiko> tosky: so, you are saying add a new method for the glob style filtering?
14:16:30 <vgridnev> we can't be sure that no one relies on the current behavior of Sahara
14:16:56 <tosky> yes, but on the other side we don't have microversioning - but we did add new methods to the 1.1 API in the past, right?
14:17:25 <elmiko> yea, we could add an entirely new method without breaking the contract
14:17:39 <elmiko> vgridnev: +1
14:17:47 <egafford> elmiko: Where do you even put that in a sensible REST API though?
14:17:47 <tosky> so, if it is possible, maybe we could leave get_cluster_templates as it is and introduce new get_cluster_templates_filter (the same for other methods)
14:18:01 <egafford> I mean, there are crazy places to put it, but sensible ones? Harder.
14:18:15 <elmiko> egafford: yea, i know. this is a difficult question
14:18:32 <tosky> and then, in horizon, look for the new one and fallback in case it is not there
14:18:33 <tosky> that would make it impossible to backport to Liberty, though
14:18:47 <tosky> vgridnev: would that ^ be acceptable?
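
tosky's idea of looking for the new method and falling back when it is absent could be sketched as follows. The client classes are stand-ins written for this example, not the real Sahara client, and `get_cluster_templates_filter` is only the method name floated in the discussion; it is hypothetical.

```python
class LegacyClient:
    """Stand-in for a service that only offers the original exact-match listing."""

    def __init__(self, names):
        self._names = list(names)

    def get_cluster_templates(self, name):
        return [n for n in self._names if n == name]


class NewerClient(LegacyClient):
    """Stand-in for a service that also exposes the proposed substring method."""

    def get_cluster_templates_filter(self, name):
        return [n for n in self._names if name in n]


def list_cluster_templates(client, name):
    """Prefer the new method when the client has it, else fall back."""
    filter_method = getattr(client, "get_cluster_templates_filter", None)
    if callable(filter_method):
        return filter_method(name)
    # Older service: only the original exact-match call is available.
    return client.get_cluster_templates(name)


names = ["abs", "abss"]
legacy_result = list_cluster_templates(LegacyClient(names), "abs")
newer_result = list_cluster_templates(NewerClient(names), "abs")
```

With this shape the caller gets substring results only where the service supports them, so the same search can return different results depending on the deployment.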
14:19:05 <tosky> (we all know that we will have 2.0 in M, so this will be moot, right? :D)
14:19:18 <tosky> ok, joking aside
14:19:27 <elmiko> i dunno, we might have to wait on this. because we will also need to change the filter stuff for node groups, i'd hate to see us propagate a bunch of bad api endpoints just to fix this for now.
14:20:40 <elmiko> i think we could add a new filtering parameter to the current endpoint
14:21:08 <elmiko> which would allow us to distinguish between exact match and glob match
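
elmiko's suggestion of one endpoint with an extra parameter selecting exact or glob matching might look roughly like this. The `match` parameter name and its values are assumptions for illustration, not an agreed API; the glob side uses Python's standard `fnmatch.fnmatchcase`.

```python
from fnmatch import fnmatchcase


def filter_by_name(templates, name, match="exact"):
    """Filter templates by name.

    `match` is a hypothetical parameter: "exact" keeps the existing behavior
    as the default, "glob" opts in to shell-style pattern matching.
    """
    if match == "exact":
        return [t for t in templates if t["name"] == name]
    if match == "glob":
        return [t for t in templates if fnmatchcase(t["name"], name)]
    raise ValueError("unsupported match mode: %r" % (match,))


templates = [{"name": "abs"}, {"name": "abss"}]

default_result = filter_by_name(templates, "abs")             # exact: "abs" only
glob_result = filter_by_name(templates, "abs*", match="glob")  # both templates
```

Because the default stays exact, existing callers would see no change in response, and only callers that pass the new parameter get the broader matching.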
14:21:11 <vgridnev> I think this issue is not so critical for liberty; it doesn't break anything, it's just probably a bit strange
14:21:17 <elmiko> +1
14:21:30 <elmiko> also, our api docs are making me sad =(
14:22:00 <SergeyLukjanov> elmiko outdated?
14:22:19 <egafford> Yeah, if there's uncertainty among the team, waiting does seem to make sense. And vgridnev, your point is good and well-taken; GetOne to GetMany is a real, substantive change.
14:22:22 * SergeyLukjanov just returned home tonight and has jet lag
14:22:29 <elmiko> SergeyLukjanov: i think we are missing some details
14:22:35 <elmiko> (on the api-ref site)
14:23:01 <SergeyLukjanov> heh, that's sad
14:23:55 <SergeyLukjanov> btw I'm proposing to agree on preliminary list of design summit sessions on the next IRC meeting (Oct 15)
14:23:57 <elmiko> SergeyLukjanov: imo, we just need to start creating our api docs in the specs repo, like keystone and others have done.
14:24:30 <elmiko> in rst format, to make it easier for updating
14:24:35 <SergeyLukjanov> elmiko, and copying to api-ref or dropping api-ref?
14:25:15 <tosky> elmiko: I thought api doc was autogenerated from the APIs and comments, isn't it the case?
14:25:30 <elmiko> SergeyLukjanov: there is much discussion about what should happen to api-ref with the docs team, but one of the suggestions concerning documenting apis is for more projects to keep their api references up to date in their own repos.
14:25:35 <elmiko> tosky: not yet
14:26:04 <elmiko> there is a massive discussion about moving to an autogenerated format for the api-ref site, but for the projects themselves to keep descriptive long-form api references in their own repos
14:26:23 <elmiko> an example, https://github.com/openstack/keystone-specs/tree/master/api
14:26:35 <elmiko> in the v2 api spec, i've proposed we do the same
14:27:00 <tosky> wouldn't it make sense to add also the long-form API references as comments? I mean, it's the doxygen style
14:27:03 <elmiko> this will make it much easier for developers to keep the descriptive docs up-to-date when adding new things
14:27:21 <SergeyLukjanov> #topic Open discussion
14:27:38 <rickflare> hello
14:27:52 <rickflare> I just wanted to introduce myself
14:28:00 <elmiko> tosky: that does make sense, and if you look at some of the oslo libs that is what they do.
14:28:21 <tosky> elmiko: can't we go in that direction then?
14:28:26 <elmiko> tosky: imo, we should still move towards making our docs in the specs repo. it allows for a much more in-depth description of the api
14:29:14 <SergeyLukjanov> rickflare, o/
14:29:23 * elmiko waves at rickflare
14:29:39 <SergeyLukjanov> elmiko, we should evaluate the option of creating v2 api in specs repo
14:29:44 * rickflare waves back o/
14:29:59 <SergeyLukjanov> elmiko, I like the idea, but I'm afraid that we'll need to duplicate it in api-ref
14:30:33 <tosky> elmiko: why can't we have in-depth as API comments directly?
14:30:35 <elmiko> SergeyLukjanov: yea, but api-ref is a pain in the ass. we need to keep that up to date for now, but it will change in the future.
14:30:47 <elmiko> tosky, SergeyLukjanov, look at this review for more ideas https://review.openstack.org/#/c/214817/
14:30:52 <rickflare> So guys let me give you a little insight into my background etc. I am a system engineer/cloud engineer. I have been involved with and using Linux for about 12 years now.
14:30:53 <elmiko> tosky: we can
14:31:01 <elmiko> tosky: it's a deeper question though
14:31:21 <tosky> hi rickflare
14:31:33 <elmiko> welcome rickflare
14:31:50 <rickflare> I am extremely passionate about Hadoop and other cloud/big data technologies. I am looking to contribute to Sahara as this is the best group of FOSS people I've worked with online.
14:31:54 <SergeyLukjanov> elmiko, oh, swagger, it's not bad
14:32:22 <rickflare> I am currently in the process of trying to implement saltstack and puppet into our current cluster deployments.
14:32:49 <egafford> o/ rickflare!
14:33:00 <crobertsrh> Glad to have ya here rickflare
14:33:41 <elmiko> tosky: look at line 39-64 in that review
14:34:06 <rickflare> In my process I have been amazed at how awesome this project is and I can only see great things ahead. I hope to bring years of experience managing large Hadoop clusters.
14:34:12 <elmiko> tosky: that stuff is way too in-depth to put in the comments. it will make the code more comments than code
14:34:32 <elmiko> rickflare: \o/
14:34:47 <tosky> elmiko: which is not a bad thing (more comment than code) - well, not for compiled languages at least
14:35:34 <elmiko> tosky: the point is, the proposed solution is to use swagger to auto-generate the api references with a supplemental document written in rst that will have more in-depth explanations of how to use the api
14:35:39 <rickflare> I feel the key to Sahara's success will hinge on ease of use for admins and developers etc. I am extremely passionate about this and I am more than willing to get my hands dirty and really help out.
14:36:22 <elmiko> rickflare: we could definitely use more admin/ops perspectives on how to improve sahara =)
14:37:11 <SergeyLukjanov> yeah and we're really looking for this kind of feedback :)
14:38:44 <rickflare> so one of the things that I think should be addressed is the process in which one makes images.
14:39:10 <egafford> rickflare: :D
14:39:13 <SergeyLukjanov> :)
14:39:25 <SergeyLukjanov> yeah, it's the area where we need to do smth
14:39:40 <crobertsrh> rickflare:  Any chance you'll be in Tokyo for the summit?
14:39:49 <rickflare> It's a huge hurdle, and for newbies who are not extremely knowledgeable about tox it will be a huge turn-off
14:39:49 <crobertsrh> we will be talking about image generation there
14:39:58 <egafford> I hope to talk about that at summit coming up. If you have any ideas for specific implementations, I'd really love to know; I want to sound out all the options before us to get a sense of which to explore in the next 6months.
14:40:23 <rickflare> crobertsrh I really wish I could be there. I am planning on attending the Austin summit for sure.
14:41:44 <rickflare> so the all-in-one tox command is fine but it needs to be much more modular in its approach. I am certain admins are constantly going to want to slipstream software into these images, so to speak, and it's going to be a bummer if they cannot do that with ease.
14:42:34 <pino|work> rickflare: http://libguestfs.org/virt-customize.1.html there you go
14:42:48 <elmiko> we have been talking about allowing extra parameters to pass through sahara-image-elements into diskimage-builder
14:43:18 <rickflare> thanks pino!
14:44:07 <rickflare> I've also noticed, esp. with devstack, that if one builds a large cluster, i.e. 20 nodes or so, MariaDB seems to get overwhelmed quickly.
14:45:04 <rickflare> I have had instances in which clusters would not delete etc. because of too many connections to MariaDB. I've also altered the allowed number of connections in my.cnf, only to see the same effects.
14:45:16 <rickflare> I'm in the process of opening up a bug report about this
14:45:51 <elmiko> i think there has got to be some leeway for devstack too though. devstack is not meant for production work.
14:46:06 <elmiko> imo, 20+ node cluster is not ideal for devstack
14:46:19 <rickflare> The one thing I love is how fluid and natural the process in Horizon is in deploying a cluster. It just works and that's wonderful.
14:46:45 <rickflare> elmiko understood, thats why i brought it up here because I was not sure it was of real concern or not.
14:47:17 <elmiko> building large clusters on devstack seems low prio to me
14:47:43 <elmiko> i just feel like you will spend more time fighting with devstack than getting work done ;)
14:48:23 <rickflare> ive also seen this with 5 or 7 node clusters as well though
14:48:32 <crobertsrh> Yes, 1 node devstack is hard enough  sometimes :)
14:48:56 <rickflare> well im sorry I might have confused what I was saying
14:49:10 <rickflare> I was referring to clusters with 5 or 7 nodes in them
14:49:15 <elmiko> ok, 5-7 node, that we should look into
14:49:18 <crobertsrh> rickflare:  Is the problem special for sahara clusters, or is it something you'd see with 7 or 8 unconnected vms too?
14:52:14 <rickflare> it's only with sahara clusters and per my inspections it seems to be stemming from heat and only during the delete command.
14:52:32 <egafford> rickflare: Huh, interesting.
14:52:57 <rickflare> So once horizon reports that a cluster has been deleted it seems that heat is still actively communicating heavily with Mariadb
14:53:17 <SergeyLukjanov> rickflare, heat generates high load on openstack services apis
14:53:21 <rickflare> and it overwhelms it and then it locks up.
14:53:46 <egafford> We do create a lot of resources, and those resources have DB footprints in Heat and other services; deletion is often quicker than spinup, and Heat may not have much in the way of rate-limiting on those calls.
14:53:47 <SergeyLukjanov> so, we've been seeing heat kill some parts of the cloud itself during cluster creation and removal ;)
14:54:03 <SergeyLukjanov> MQ is very affected too
14:54:14 <rickflare> SergeyLukjanov :) ok so its not just me.
14:55:08 <rickflare> I think perhaps we can rate limit or load balance mariadb to deal with this? I am currently in the process of starting to test more of this using RDO versus devstack.
14:55:29 <SergeyLukjanov> we're running 50-200 nodes Sahara clusters regularly and seeing different issues
14:55:39 <egafford> rickflare: Sadly, a lot of the rate limiting would have to be in heat.
14:55:49 <egafford> From Sahara's side, it's really just a very few calls.
14:55:50 <SergeyLukjanov> yeah
14:56:14 <elmiko> also, rate limiting or load balancing to the db is kinda outside sahara purview. we can provide advice, and i think this exists in the operator manual, but that seems like the extent of our actions.
14:56:29 <SergeyLukjanov> actually, we're working with Heat folks to make a kind of limiting (batching) for the requests to guarantee that there will be no DDOS for the APIs
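
The general idea of batching requests so a service is not hit with everything at once can be sketched as below. This is not Heat's implementation and says nothing about how the Heat work mentioned above is designed; `delete_in_batches`, `in_batches`, and the parameters are hypothetical names for illustration only.

```python
import time


def in_batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def delete_in_batches(resource_ids, delete_one, batch_size=5, pause_seconds=0.0):
    """Call `delete_one` for every id, pausing between batches.

    `delete_one` is a caller-supplied function (an assumption of this sketch)
    that issues a single delete request. Returns the number of batches sent.
    """
    batches = list(in_batches(resource_ids, batch_size))
    for index, batch in enumerate(batches):
        for resource_id in batch:
            delete_one(resource_id)
        # Pause only between batches, not after the last one.
        if pause_seconds > 0 and index < len(batches) - 1:
            time.sleep(pause_seconds)
    return len(batches)


# Example with a fake delete function that just records the ids it was given.
deleted = []
node_ids = ["node-%d" % i for i in range(7)]
batch_count = delete_in_batches(node_ids, deleted.append, batch_size=3)
```

Smaller batches with a pause trade slower cluster removal for a lower peak load on the database and message queue behind the APIs.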
14:56:38 <elmiko> +1
14:56:41 <rickflare> that is awesome!
14:57:10 <egafford> SergeyLukjanov: Yeah, Sahara as attack vector in a public cloud would make us only so attractive to operators. :)
14:57:11 <SergeyLukjanov> rickflare and you still have an option to try the direct engine
14:57:43 <elmiko> egafford: but it's not a sahara only problem, you could do the same by creating some crazy heat requests
14:57:44 <rickflare> SergeyLukjanov I will reach out to you offline to learn more about that.
14:57:50 <egafford> elmiko: Oh, absolutely.
14:57:57 <egafford> It needs to be on Heat's side.
14:58:09 <elmiko> yea, i just fail to see how we can make a sahara change to affect this
14:58:20 <egafford> Totally agreed.
14:58:39 <egafford> A savvy attacker could make a much more cumbersome Heat template than we do.
14:59:05 <SergeyLukjanov> it could be any load to openstack api
14:59:13 <rickflare> I think we must continue to focus on the junior system admin who is going to have the responsibility of installing and maintaining this software. I think keeping that in mind will go a long way in ensuring we are flexible and easy to understand. Esp. because Hadoop, Storm, etc. are all monsters of their own.
14:59:15 <elmiko> yea, these are not sahara specific problems
14:59:26 <SergeyLukjanov> our MQ solution sucks IMO
14:59:36 <egafford> SergeyLukjanov: Heh.
14:59:37 <elmiko> rickflare: yea, but what you are describing is a more general openstack issue. not a sahara issue
14:59:56 * regXboi looks at clock
15:00:00 <SergeyLukjanov> #endmeeting