#cloudkitty log

09:00:43 <peschk_l> #startmeeting cloudkitty
09:00:44 <openstack> Meeting started Fri Dec  7 09:00:43 2018 UTC and is due to finish in 60 minutes.  The chair is peschk_l. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:45 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:47 <openstack> The meeting name has been set to 'cloudkitty'
09:01:06 <peschk_l> Hello everybody!
09:01:16 <peschk_l> Linkid, jferrieu, huats are you there ?
09:02:04 <peschk_l> Today's topics can be found on https://etherpad.openstack.org/p/cloudkitty-meeting-topics, feel free to add any during the meeting
09:02:56 <jferrieu> Hi everybody
09:03:29 <peschk_l> Linkid, you wanted to talk about the state of the WSME interface, we'll talk about it once you're here :)
09:03:52 <peschk_l> Let's start with the storage then
09:04:04 <peschk_l> #topic v2 storage interface
09:04:41 <peschk_l> As you know, an InfluxDB backend for the v2 storage interface will soon be integrated to cloudkitty
09:05:14 <peschk_l> The gnocchi backend will be removed soon after, as it is not performant enough
09:06:13 <peschk_l> The influxDB backend works pretty well and is performant but there is a drawback: clustering / HA is not in the community version of influxDB
09:06:59 <peschk_l> As a consequence, we need to support at least one other backend, with support for clustering
09:08:04 <peschk_l> Until now, we considered the following backends: PostgreSQL, Cassandra, KaïrosDB, Elasticsearch, MongoDB
09:09:48 <peschk_l> postgres was eliminated because clustering can be tedious, and it doesn't seem to be used much in the openstack community
09:11:26 <peschk_l> Cassandra seems to offer enough flexibility for our needs. However, it does not support grouping
09:11:48 <peschk_l> so cassandra isn't an option
09:13:35 <peschk_l> Elasticsearch is rather complex: it seems like it would suit our needs in terms of flexibility (grouping, sorting, filtering...) but is pretty complex to install
09:14:39 <peschk_l> However, the elastic stack is used by many companies, and elasticsearch has also been suggested during the project onboarding in Berlin
09:15:53 <peschk_l> So elasticsearch seems to be a good option
09:17:38 <peschk_l> Next interesting candidate: MongoDB. It has historically been used by the telemetry project, and has been replaced by gnocchi afterwards because of performance issues
09:19:17 <huats> Honestly using MongoDB will send a very bad message to the OpenStack world, due to the "history" in ceilometer...
09:19:19 <peschk_l> Here is a quote from one of Julien Danjou's blog posts (the openstack ceilometer gnocchi experiment) "We soon started to encounter scalability issues in many of the read requests made via the REST API. A lot of the requests requires the data storage to do full scans of all the stored samples. Indeed, the fact that the API allows you to filter on any fields and also on the free-form metadata (meaning non indexed key/values tuples) has
09:19:19 <peschk_l> a terrible cost in terms of performance (as pointed before, the metadata are attached to each sample generated by Ceilometer and is stored as is)."
09:19:49 <peschk_l> However their needs aren't the same as ours, at all
09:20:03 <peschk_l> huats: I'm not sure they will care
09:20:33 <peschk_l> huats: But it's true, it would be weird if we encounter the same issues as them
09:21:45 <huats> I am not speaking of the openstack developers, but of users who faced issues with ceilo before the switch to gnocchi
09:22:17 <huats> and those users, who are our potential users, won't see the usage of Mongo as a good thing.
09:23:03 <peschk_l> My point on MongoDB is: Ceilometer had issues because the datamodel needed to be as flexible as possible, and no indexing was done. Cloudkitty also needs to be flexible because collected metrics can have many forms. We won't be able to handle the indexing because the fields on which we group can vary, especially at the beginning when operators modify those often
09:23:42 <peschk_l> However, our data model will always have more or less the same form (indexing of the "groupby" fields, no indexing of the metadata)
09:24:42 <peschk_l> If the datamodel is well explained in the documentation, operators will be able to build their own indexes easily
09:25:30 <huats> I agree that we might not face the same issues... But if you switch to Mongo you won't be able to avoid giving a wrong signal to users who have faced the issues with ceilo... and thus they will avoid CK
09:26:09 <huats> they will stop at the first feeling, and not try to understand that they can really improve the way it is used
09:26:39 <peschk_l> huats: so you mean we should use elasticsearch ?
09:27:38 <peschk_l> indexing will still be up to the operators, but many people have experience with this solution
09:29:37 <peschk_l> anyway, we can still re-discuss this later, let's move on to the next topic
09:29:58 <huats> With the various elements you brought here, I think it may be a better choice. Indeed it is more complex to setup, but as you said it is widely used and appreciated by companies
09:30:02 <huats> sure move on
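As an illustration of the data model discussed above (indexed "groupby" fields, unindexed metadata), here is a minimal Python sketch. Everything in it is hypothetical and not taken from the cloudkitty codebase: the field names (project_id, type, flavor), the POINTS list and both helper functions are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical data points. "groupby" holds the fields operators group on,
# "metadata" holds free-form attributes that are stored but not indexed.
POINTS = [
    {"groupby": {"project_id": "p1", "type": "instance"},
     "metadata": {"flavor": "m1.small"}, "price": 1.5},
    {"groupby": {"project_id": "p1", "type": "volume"},
     "metadata": {"size": "10"}, "price": 0.5},
    {"groupby": {"project_id": "p2", "type": "instance"},
     "metadata": {"flavor": "m1.large"}, "price": 0.5},
]


def build_index(points, field):
    """Map each value of a groupby field to the positions of matching points."""
    index = defaultdict(list)
    for pos, point in enumerate(points):
        index[point["groupby"].get(field)].append(pos)
    return dict(index)


def total_by(points, field):
    """Sum prices per value of a groupby field; metadata is never inspected."""
    totals = defaultdict(float)
    for point in points:
        totals[point["groupby"].get(field)] += point["price"]
    return dict(totals)
```

Calling build_index(POINTS, "project_id") stands in for an index an operator would create on the backend for a field they group on; changing the grouped field means building a different index, which is the operator-side work mentioned in the discussion.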
09:30:13 <peschk_l> #topic State of the WSME api
09:31:22 <peschk_l> As Linkid mentioned a few days ago, WSME should not be used anymore: http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000004.html
09:32:12 <peschk_l> The part about WSME ends with "Does anyone still use this?  Use Flask instead.". Good news, that's what we intend to do in the v2 API
09:33:59 <peschk_l> I believe that Flask-RESTful + Voluptuous will be a good solution for our needs, see spec https://review.openstack.org/#/c/614275/
09:34:26 <peschk_l> We'll keep the v1 API as it is now, and deprecate it as soon as v2 is stable
09:35:00 <peschk_l> Once WSME is officially unmaintained, we'll remove the v1 API
09:35:14 <peschk_l> Does anybody have some remarks about this ?
09:36:42 <peschk_l> oops forgot one part: paste (library to build WSGI pipelines) should also be replaced by Flask. That's also planned in v2
09:37:22 <peschk_l> I'll wait a minute for remarks and move on to the next topic, code cleanup
09:38:29 <peschk_l> OK, let's move on then
09:38:35 <peschk_l> #topic code cleanup
09:39:33 <peschk_l> (link to the story: https://storyboard.openstack.org/#!/story/2004400). Some parts of cloudkitty have not been maintained for several releases
09:40:42 <peschk_l> I propose to delete these parts from cloudkitty in order to keep the codebase as small as possible
09:41:09 <peschk_l> A lot of new features are planned, and it will become hard to maintain if we don't pay attention
09:42:31 <peschk_l> Unmaintained parts are: the writer (report generation can now be done with the client), the fake collector (which reads CSV files), and the meta collector, which has never been used anyway
09:43:40 <peschk_l> The fake scope fetcher is also unmaintained, and the gnocchi transformer is not used anymore
09:44:49 <peschk_l> Does anybody have some remarks on this ?
09:46:48 <Linkid> hi
09:46:54 <Linkid> (sorry, I'm late)
09:47:09 <peschk_l> hi Linkid! (no worries)
09:47:34 <peschk_l> Linkid: some remarks about the previous topics ?
09:47:56 <Linkid> if some features are not used, we can remove them (or deprecate them), I think
09:49:03 <Linkid> about WSME and Flask, I think it could be interesting to see what other projects do, and maybe share our future Flask API in another component
09:49:34 <Linkid> (but maybe it is too early)
09:50:14 <Linkid> about CSV export, I read here that some people still use it
09:50:48 <peschk_l> It's definitely interesting to look at what other projects do, but sharing the Flask API will require some extra work, which we can't do for now
09:50:52 <Linkid> ah, you're talking about CSV import
09:51:15 <huats> Yes import... the export is just needed :)
09:51:30 <Linkid> yes, of course, it is too early to make another component ;)
09:51:50 <Linkid> let's see how we can deal with it for cloudkitty
09:52:45 <peschk_l> Linkid, huats: I was talking about the export. The writer has some bugs, and exports can be done through the client. We can provide a v2 api endpoint with some additional features compared to what's possible right now
09:53:13 <jferrieu> I think that using Flask can make cloudkitty even more attractive to developers communities, it can be a good thing as it's getting quite popular
09:53:26 <peschk_l> everything in the writer is hard-coded, it is not possible to use it outside of an openstack context
09:54:19 <huats> peschk_l: I agree, the functionalities of the writer have been integrated in the client, so we can remove it. But I think it would be important to add to the client a simple way to produce the same file as the one produced by the writer
09:54:50 <peschk_l> huats: https://docs.openstack.org/python-cloudkittyclient/latest/usage.html#csv-report-generation this gives the exact same result through the client
09:55:46 <huats> I know, but can't we do some kind of alias of it in the client directly ?
09:56:10 <huats> it would ease the migration and the usage
09:56:33 <peschk_l> huats: we could, but I'd rather do this with the v2 api (pagination, filters...)
09:57:04 <peschk_l> And I'm sure jferrieu would be glad to implement such a v2 endpoint ;)
09:57:55 <huats> I am not against doing that in V2 as long as we don't have a cycle without it
09:58:10 <jferrieu> peschk_l: why not
09:58:24 <huats> I would be in favor of removing the writer during the same release we add that endpoint
09:58:45 <peschk_l> there will be a deprecation period anyway, so the writer will probably be completely deleted in T
09:59:16 <huats> ok
09:59:17 <peschk_l> in S, a warning telling users to do report generation through the client will be enough
09:59:23 <huats> yes
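The deprecation step described above (a warning in S, removal in T) could look roughly like the following Python sketch. The function name generate_report and the message text are hypothetical; this is not the actual cloudkitty writer code, only the standard library's warnings mechanism applied to the idea.

```python
import warnings


def generate_report():
    """Hypothetical stand-in for the legacy writer entry point."""
    # Tell users to move to the client before the writer is removed.
    warnings.warn(
        "The writer is deprecated and will be removed in a future release; "
        "generate reports with the client instead.",
        DeprecationWarning,
        stacklevel=2,
    )
    return "report"
```

The stacklevel=2 argument makes the warning point at the caller rather than at the deprecated function itself, which is usually what helps users find the code they need to migrate.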
10:01:08 <peschk_l> Well, I think we talked about all the topics, and we are exactly on time. Quick update about cloudkitty.io: I'm still working on it but it should be ready soon
10:02:01 <peschk_l> If anybody has some remarks he'd like to add to conclude, I'll wait a bit before I end the meeting
10:04:40 <peschk_l> OK then, that's it for today. Thanks everybody!
10:04:48 <peschk_l> #endmeeting