15:00:08 #startmeeting monasca
15:00:08 Meeting started Wed Nov 11 15:00:08 2015 UTC and is due to finish in 60 minutes. The chair is rhochmuth. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:12 The meeting name has been set to 'monasca'
15:00:17 o/
15:00:21 o/
15:00:26 o
15:00:32 o/
15:00:33 o/
15:00:40 cheers!
15:00:48 o/
15:01:13 So, sorry about the mix-up last week on the time change
15:01:14 o/
15:01:26 cheers!
15:01:29 you were not the only one :)
15:01:36 o/
15:01:39 Nice to see everyone here today
15:01:45 Looks like we'll have a good meeting then
15:01:57 looks like we have big news, right?
15:02:01 Please update the agenda
15:02:14 #topic tent
15:02:35 Congratulations to everyone, Monasca is in the Big Tent.
15:02:46 https://review.openstack.org/#/c/213183/
15:02:47 congrats! way to stick with the process rhochmuth!
15:03:07 congratulations to everyone!
15:03:32 So, I'm not sure I have too much to say.
15:03:39 :)
15:03:45 just take a bow
15:03:50 :)
15:03:55 good job!
15:03:56 :)
15:03:57 I think we are doing good on process, with improvements in areas that we can still make
15:04:18 Hopefully, we'll get some more developers on-board with the project as a result
15:04:26 that would be great!
15:04:34 big tent should be a booster
15:04:36 for sure
15:04:52 We can also be a part of the official OpenStack process like having our own sessions at the next summit
15:05:00 +1
15:05:03 +1
15:05:07 +1
15:05:14 It will also help us with other projects, like Congress
15:05:31 +2 ;-)
15:06:24 So, unless any questions, maybe we should move on, but again Thanks to everyone for all the support!
15:06:41 nice work
15:06:54 #topic twc
15:07:02 that's me
15:07:08 you are up
15:07:11 first update on perf
15:07:33 one of the enhancements i'd been working on, and actually got working for vertica was pre-join projections
15:07:48 but found a bug where the db crashed when you'd update statistics
15:07:49 the agenda says it is deprecated
15:07:58 yup -- so abandoning that http://my.vertica.com/docs/7.2.x/HTML/index.htm#Authoring/NewFeatures/_VersionIndependent/DeprecatedFunctionality.htm%3FTocPath%3DHP%2520Vertica%25207.2.x%2520New%2520Features%7CDeprecated%2520and%2520Retired%2520Functionality%7C_____1
15:08:13 i guess it wasn't a widely used feature or stable
15:08:17 wow, that didn't last long
15:08:33 didn't they only add it in 7.2
15:08:37 yeah, i don't think it would buy us much -- not as much as some app caching
15:08:49 at least 7.1
15:08:55 so, i just started looking at that this morning
15:09:04 the caching that is
15:09:10 sweet -- initial thoughts?
15:09:11 i don't think it will be difficult
15:09:18 you're my hero
15:09:25 i didn't do it yet
15:09:31 :)
15:09:52 i'll start playing with it, unless deklan wants a reprieve
15:09:57 from devstack
15:09:57 would love to hear your thoughts -- i can help identify the queries that are repeated...
15:10:09 devstack is never ending
15:10:43 let me try writing some code and then i'll put something up for review
15:10:50 perfect, thx!
15:11:22 well that covers 2.5 in my list
15:11:39 next topic -- persister weirdness
15:11:42 opened https://bugs.launchpad.net/monasca/+bug/1511793
15:11:43 Launchpad bug 1511793 in Monasca "java version of monasca persister appears to have memory leak" [Undecided,New]
15:11:59 not sure it's a memory leak, but about once a week, we simply see things 'stall' in the persister
15:12:01 the persister has a cache in it, could that be the problem?
15:12:30 not sure, just know nothing gets to the db, kafka lag seems to keep up, nothing in the logs
15:12:39 could use some ideas for how to triage
15:12:43 can you do a 'kill -3' against the process to get a thread dump the next time it happens?
15:12:44 restart fixes it, for a while
15:12:52 will do ddieterly
15:13:03 so, this is occurring once a week
15:13:04 i have some jstack output already
15:13:15 yes, about that, happened yesterday
15:13:23 very interesting
15:13:25 worst part is, data is lost
15:13:34 you can also try setting the cache to a low size
15:13:35 the persister should immediately fail if it can't write to the database
15:13:43 there shouldn't be any loss of data
15:13:43 restart, huge gaps in data, depending on how long it takes to notice
15:13:58 if a db write fails, a sql exception should be caught and the persister exits
15:14:17 doesn't seem to be happening, more like things are hung
15:14:18 and the offset in Kafka should not have been updated
15:14:30 so, on restart it should start where it was the last time
15:14:55 the persister could be in a hung state, but the kafka offsets should not be advanced then
15:15:07 ok, that process dump will help
15:15:12 it's possible a bug i found in our consumer lag monitoring could have masked that fact, and kafka lag is big for certain partitions of the metrics topic
15:15:16 i'll do that
15:15:37 i'll have better consumer lag checking in our next week's deploy to prod
15:15:50 thx for the help on that one, it's gnarly
15:15:59 ok -- next topic?
15:16:06 rbak - that's u
15:16:12 yep
15:16:17 so the only window that we know of where data could be lost is if the persister fails right after a db write
15:16:23 that is a successful db write
15:16:35 then you would end up with duplicate data
15:17:10 ok, i'm done, rbak you are up
15:17:17 So currently if you submit a datapoint with some dimensions, e.g. hostname, and later add more dimensions, e.g. hostname and region, the first metric becomes impossible to query
15:17:38 yeah
15:17:47 i don't consider that a bug
15:17:55 why not?
15:17:57 that is the way influxdb works
15:18:26 we removed the ability to return multiple metrics in a query
15:18:31 because of influxdb
15:18:39 The problem is that it causes grafana to error and display nothing when you hit one of these cases without the merge flag.
15:18:59 Because it tries to query every dimension set, but one isn't actually valid
15:19:22 but if you query the one with the hostname and region do you get the data?
15:19:33 influxdb has performance issues when we try to sort out the different series based on dimensions
15:19:51 we might want to see if they fixed that in later releases
15:20:02 fabiog: For the metric with both dimensions yes, but not for the metric with only hostname set
15:20:33 rbak: I think it is right, if you want the old too you need to apply the merge flag
15:21:26 So, currently the API says you can't get multiple series/metrics back in a single query
15:21:32 fabiog: But you might not want to merge if they are separate data
15:21:33 Each one needs to be uniquely identified
15:21:53 I tested "GET /v2.0/metrics/names" with dimensions(only one), but I can't get result.
15:22:05 rbak: but they are not, since they have the same hostname
15:22:05 seems like at a minimum, we need some error handling around this -- grafana will just barf, yes rbak?
15:22:22 We should return an error
15:22:23 fabiog: In this example yes, but that isn't always the case
15:22:29 bklei: That's correct
15:22:56 rhochmuth: I think a 409 conflict error would be appropriate. You are querying for things that are not unique ...
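Editor's note: the dimension-set problem discussed above can be illustrated with a small in-memory sketch. This is a toy model under stated assumptions, not Monasca's or InfluxDB's actual code: the store layout, `matching_series`, `query`, and `AmbiguousQuery` are hypothetical names, and the 409 status only mirrors the suggestion made in the meeting. It assumes a query matches every series whose dimensions contain all of the queried dimensions, which is why the hostname-only series can no longer be selected on its own.

```python
from typing import Dict, FrozenSet, List, Tuple

Measurement = Tuple[int, float]                      # (timestamp, value)
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]   # (metric name, dimension set)


class AmbiguousQuery(Exception):
    """Hypothetical error; the meeting proposed mapping this case to HTTP 409."""
    status = 409


def matching_series(store: Dict[SeriesKey, List[Measurement]],
                    name: str,
                    query_dims: Dict[str, str]) -> List[SeriesKey]:
    """Return keys of all series whose dimensions include every queried dimension."""
    wanted = frozenset(query_dims.items())
    return [key for key in store if key[0] == name and wanted <= key[1]]


def query(store: Dict[SeriesKey, List[Measurement]],
          name: str,
          query_dims: Dict[str, str],
          merge_metrics: bool = False) -> List[Measurement]:
    """Return measurements for one series, or for all matches when merging."""
    keys = matching_series(store, name, query_dims)
    if len(keys) > 1 and not merge_metrics:
        # More than one dimension set matches, so the result is not unique.
        raise AmbiguousQuery(f"{len(keys)} series match; set merge_metrics")
    return sorted(point for key in keys for point in store[key])


# The same metric reported first with hostname only, later with hostname + region.
store = {
    ("cpu.idle", frozenset({("hostname", "h1")})): [(1, 90.0)],
    ("cpu.idle", frozenset({("hostname", "h1"), ("region", "r1")})): [(2, 80.0)],
}
```

In this model, querying `{"hostname": "h1"}` without the merge flag raises the conflict, while the same query with `merge_metrics=True` returns both points. Querying with both dimensions returns only the newer series, matching what was reported in the meeting.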
15:23:21 this certainly makes me nervous to add dimensions
15:23:31 paints us into a corner
15:24:03 so, if you add dimensions, then you've got problems, i agree
15:24:41 it is a trivial change for Vertica, but I didn't think that InfluxDB could support this
15:24:47 which is why we removed from Vertica
15:24:50 if you add dimensions, is it to the same logical series? if so, then you can use the merge-metrics flag
15:25:24 It might be or might not, especially if you are adding dimensions in order to split out the data into something more specific
15:25:26 i think it's more an issue when our users build graphs adhoc
15:25:43 we could give the java api an 'influxdb compatibility mode'
15:26:15 then we could allow the java api to do the things that twc wants when it is not in influxdb compatibility mode
15:26:30 or a metrics list flag that says, ignore un-queryable dimension sets
15:27:09 don't want to kill the whole hour on this topic
15:27:18 maybe we could do a launchpad design sketch for this feature
15:27:25 agreed ddieterly
15:27:28 That works.
15:27:48 next twc topic -- hoping for some support to add start/end time to metric list call
15:27:52 here's the scenario
15:28:15 a dashboard does a metrics list for a time period, then does a stats call for all dimension sets
15:28:19 bklei: but if in your queries you always have the hostname and merge you will always get the right result ..
15:28:41 yes fabio, this is a case where we aren't merging
15:29:20 so the dashboard ends up querying needlessly for data that isn't there if there aren't measurements
15:29:34 this happens a lot for a MaaS dashboard where VMs are short lived
15:29:35 bklei: what I am saying is that forcing the merge does not change the queries that have the least amount of common dimensions ...
15:30:26 so, you are trying to get a list of the metrics that are active in some time period
15:30:29 that could be fabiog, we'll include you in the etherpad discussion on that?
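Editor's note: the start/end time proposal for the metric list call can be sketched as a simple filter. This is only an illustration of the idea described above, assuming an in-memory store; `list_active_metrics` and the store layout are hypothetical and do not reflect the change under review or the Monasca API. The point is that dimension sets with no measurements in the requested window are dropped before the dashboard issues its per-series statistics calls.

```python
from typing import Dict, FrozenSet, List, Tuple

Measurement = Tuple[int, float]                      # (timestamp, value)
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]   # (metric name, dimension set)


def list_active_metrics(store: Dict[SeriesKey, List[Measurement]],
                        name: str,
                        start_time: int,
                        end_time: int) -> List[Dict[str, str]]:
    """Return dimension sets with at least one measurement in [start_time, end_time]."""
    active = []
    for (series_name, dims), points in store.items():
        if series_name != name:
            continue
        if any(start_time <= ts <= end_time for ts, _ in points):
            active.append(dict(dims))
    return active


# Two short-lived VMs: only vm-a reported inside the window 0..300.
vm_store = {
    ("cpu.utilization", frozenset({("hostname", "vm-a")})): [(100, 12.5), (200, 14.0)],
    ("cpu.utilization", frozenset({("hostname", "vm-b")})): [(900, 33.0)],
}

active_sets = list_active_metrics(vm_store, "cpu.utilization", 0, 300)
```

With this filter a dashboard would issue one statistics call (for vm-a) instead of two, which is the saving described for short-lived VMs.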
15:30:35 yes rhochmuth
15:30:42 bklei: sure
15:31:04 this is one of the perf enhancements we'd like to make at twc before going live with maas
15:31:20 i'm ok with the change
15:31:28 this review isn't done, but at least works on the vertica side and accomplishes what i'm describing
15:31:34 https://review.openstack.org/#/c/241626/
15:31:36 probably should have a blueprint
15:31:46 i'm just testing the influxdb/python
15:31:47 the other issue is Tempest tests and Python API
15:31:51 yup
15:31:55 awesome
15:32:03 do you have the tempest tests working
15:32:17 can add tempest too, haven't got that far -- at least not in devstack
15:32:21 it should be really easy to add
15:32:27 devstack isn't required to run them
15:32:43 will do that for sure, and if you want a blueprint i'll start one and link to my review
15:32:57 there are directions on how to install/run
15:33:15 all that should change is the endpoint and user credentials to match your environment
15:33:16 bueno -- i need to ramp up there, so thx for the directions
15:33:28 awesome, i'm sure it'll work 1st time :)
15:33:55 i think that's it for that topic?
15:34:09 ok
15:34:11 rbak -- grafana 2.0 update?
15:34:30 I have a monasca plugin for grafana 2.5. The branch here: https://github.com/rbak1/grafana-plugins/tree/master/datasources
15:34:41 Hello everybody
15:34:51 hi
15:34:54 Feel free to pull it down and give me feedback.
15:35:03 I'm still making changes, but once it's stable I'll submit a pull request.
15:35:06 https://github.com/rbak1/grafana-plugins/tree/master/datasources
15:35:29 This plugin is for the standalone grafana, and does not currently integrate with horizon. It needs a keystone token in the datasource to talk to monasca.
15:35:47 I'm currently looking at integrating the built in grafana auth with keystone. That way it can get its own keystone token.
15:36:03 It might also be possible to leverage the grafana concept of "organizations" to provide per tenant dashboards, so I'm looking at this as well.
15:36:30 https://blueprints.launchpad.net/monasca/+spec/grafana2x
15:36:36 jobrs created a blueprint for this. I'll try to keep that updated with how things are going.
15:37:02 Thanks Ryan.
15:37:03 rhochmuth: rbak: we should see if the grafana community may be interested in supporting a Monasca datasource
15:37:18 I think that is the plan
15:37:35 fabiog: It should be easy to get this pulled into the plugins repo.
15:37:53 rbak: ok
15:37:59 Later we can also try to get into the main repo if we want. That's what gnocchi is trying to do now
15:38:44 Thanks for the update ryan, should we move on
15:38:55 Sure, that's all I have
15:39:00 #topic logging
15:39:17 Log-management integration into monasca-vagrant / devstack plugin ?
15:39:26 we would like to integrate our ansible roles into monasca-vagrant
15:39:45 That would be great.
15:39:56 and then start working on devstack integration
15:40:38 does Monasca plan to support both installers?
15:40:52 should it be the reverse? i assumed devstack will cause us to deprecate monasca-vagrant?
15:41:16 well, but we have ansible roles already
15:41:22 well, there is always the possibility of monasca-vagrant getting deprecated
15:41:30 as a result of devstack
15:41:36 today we still need it for other reasons
15:41:39 ok
15:42:09 what's the current status of devstack plugin? I think rally needs this for adding monasca tests
15:42:23 it's just we had some questions after the summit, on how to install the logging
15:42:29 the devstack plugin works with both java and python implementations
15:42:39 we need to fix up some of the smoke tests at this point
15:43:00 Vertica is not supported in DevStack
15:43:14 ddieterly: can you enable different parts, like enable monasca_metrics monasca_events or something like that
15:43:16 you should be able to run the smoke tests against monasca running from the devstack plugin
15:43:31 fabiog: no, not at this time
15:43:35 As far as the overall DevStack plugin it works, but we are still relatively new to this DevStack thing, so there could be issues that you run into
15:44:22 we need to test the vagrant devstack next
15:44:41 So, where should the log api devstack plugin live
15:44:56 i was thinking in the monasca-log-api repo
15:45:03 ideally that would be its own plugin
15:45:08 ok
15:45:46 i don't know how it works, how does it integrate with the monasca devstack plugin?
15:45:58 And for the monasca-vagrant, it would be in ansible-monasca-log-api
15:46:12 and also monasca-log-agent
15:46:14 you will need to create a separate plugin that does what the monasca-api devstack/plugin does
15:46:17 monasca-log-schema
15:46:26 monasca-elkstack
15:46:50 ok
15:46:50 ideally, each separate repo has a devstack/plugin
15:47:08 we may need to deviate from the ideal if it makes sense
15:47:31 ok, additionally should we add monasca-log-api into governance project?
15:47:45 yes
15:47:59 nice
15:48:02 i think monasca-log-api should be added
15:48:03 who's doing it? witek?
15:48:22 i will push the change
15:48:30 great! thanx!
15:48:31 thanks witek
15:48:49 the goal is to have the plugin in a repo be able to run in a gate job
15:49:09 so that any changes to the repo go thru a suite of integration tests using the plugin to setup the env
15:49:26 i get it
15:50:21 ok, i think that's all for logging
15:50:26 thx
15:50:39 #topic tokyo
15:50:49 news from tokyo
15:51:08 i think we should reach out to patrick petit from Mirantis
15:51:38 why?
15:51:44 He's doing the LMA plugin for Fuel, right?
15:52:29 https://github.com/openstack/fuel-plugin-lma-collector
15:52:53 he was doing work with Heka, and seemed interested in Monasca,
15:53:04 So, that is one follow-up item I had
15:53:36 Fabio has done great work in getting connected with Congress
15:53:49 fabiog: comments
15:53:52 ?
15:54:21 yes
15:54:46 what we discussed is the ability for Congress to create alarms in Monasca based on Policies
15:55:14 then Congress will receive Webhook notifications when the alarms go off and act on the related Policy
15:55:28 they liked the idea a lot, but they want to move in steps
15:55:43 so what we agreed at the summit is to start integrating Congress with Monasca using a driver
15:56:00 drivers are their standard polling mechanism to get data out of the OpenStack services
15:56:21 I have a WIP around this: https://review.openstack.org/#/c/241826/
15:57:11 I will start interacting with them to understand what API they would like to see polling, in my mind Metrics and Statistics (and maybe Alarms) are viable, but Measurements will kill them
15:57:17 they store all this data in SQL ...
15:57:21 any comments?
15:57:46 I guess move on then
15:58:09 Thanks to all the presenters. Great job.
15:58:23 watched most on youtube -- nice work everyone
15:58:40 lots of monasca interest there?
15:58:48 oh yes
15:59:09 awesome
15:59:17 So, unfortunately we are going to have to end the meeting
15:59:29 Do we need to get together outside of weekly meeting?
15:59:30 ciao!
15:59:34 ciao
15:59:48 perhaps mailing list discussions?
15:59:54 remind me the mail alias?
16:00:03 Sounds good. We need to address your design question
16:00:20 #endmeeting