15:02:27 #startmeeting monasca
15:02:28 Meeting started Wed Mar 9 15:02:27 2016 UTC and is due to finish in 60 minutes. The chair is rhochmuth. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:02:29 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:02:31 The meeting name has been set to 'monasca'
15:02:35 o/
15:02:43 o/
15:02:43 morning
15:02:43 hi
15:02:44 o/
15:02:48 running a little late this morning
15:02:50 o/
15:02:50 sorry
15:03:10 it's 7 am here, late is good :-)
15:03:55 #topic summit
15:04:18 meant to mention that the agenda is posted at https://etherpad.openstack.org/p/monasca-team-meeting-agenda
15:04:26 Agenda for Wednesday March 9, 2016 (15:00 UTC)
15:04:26 1. Austin Summit Monasca Sessions
15:04:26 2. Monasca Agent discussion from previous week with SAP regarding overriding dimensions
15:04:26 3. Stale alarms for metrics that don't exist anymore (deleted vms) -- how should we address?
15:04:26 4. Multiple metrics per http request on statistics and measurements resource.
15:04:27 1. See review at https://review.openstack.org/#/c/289675/
15:04:27 5. Brief status update for Anomaly Detection
15:04:46 So, it looks like we had a decent number of Monasca related sessions
15:04:48 accepted
15:04:56 Thanks everyone
15:05:22 hi
15:05:28 hi jobrs
15:05:43 here for topic 2.
15:05:50 i've put in for four design summit sessions at the summit
15:06:08 assuming that occurs, we should be able to have some discussions there
15:06:37 nice!
15:06:56 so, i'll keep you posted on what i get, and then we can adjust the agenda as we get closer
15:07:45 the summary is overall planning/status, discussion on new features and performance, logging api/implementation, and monasca/networking/broadview
15:08:05 if we run over, then we can find open spots to discuss more
15:08:16 does that sound reasonable?
15:08:32 sounds good to me
15:08:58 sounds good here
15:09:05 if possible, i would like to add anomaly detection
15:09:18 in new features?
15:09:30 ho_away: if you'll be attending then i'll request another spot for that topic
15:09:41 unfortunately, the folks from bristol won't be there
15:09:54 so, i was worried about general attendance
15:10:14 but i'll request another spot, and we can discuss that topic in detail with whoever attends
15:10:20 i know, i didn't get permission for it yet but i will go there.
15:10:37 i'll definitely be up for a discussion and planning on that topic
15:10:58 after i get permission i will let you know thanks!
15:11:03 ok, thanks
15:11:14 #topic agent
15:11:30 jobrs: this is a carry over from last week
15:11:36 i guess we broke you
15:11:44 yep
15:11:53 so, do you have a proposal
15:12:09 I shared some links last time, they are in the logs
15:12:19 we could add a parameter to adjust the behaviour
15:12:29 keep the old or the new
15:12:41 based on a parameter
15:12:48 would that be acceptable?
15:12:59 I would be happy already if we would have a shared view on what the 'service' dimension is good for
15:13:14 a) openstack service
15:13:19 b) technical service
15:13:50 a) means that plugins for generic components/services do not know the answer
15:14:17 b) means that plugins for generic components set it themselves
15:14:33 but b) also means that there is no standard dimension for openstack services as registered in the ks catalog
15:14:53 This is what we've been trying to do
15:14:57 b) also means that we have quite some redundancy between component and service - what is the difference at all?
15:15:24 to me the previous behavior was more consistent
15:15:44 have 'user' set 'service' if it is a generic component
15:16:04 and not do something like component='mysql', service='mysql', process='mysqld'
15:16:06 service = compute, networking, …, when the entity being monitored corresponds to a specific openstack component, such as nova-api
15:16:16 exactly
15:16:23 not done yet
15:16:40 it is for monasca at least
15:17:00 then service = mysql, rabbitmq, …, if it is a "shared" service, unless the shared component isn't really being shared
15:17:23 it is often the case that mysql and rabbitmq are shared across many openstack services
15:17:38 but it can be the case that it is deployed 1-to-1 with a service
15:17:39 agreed, and in that case I believe it is fair that the one configuring the agent is taking care to set the --service parameter on monasca-setup
15:18:22 So, component should correspond to the specific "process" that is being monitored
15:18:28 in any case the plugin cannot possibly know
15:18:29 so, for Nova
15:18:39 service=compute, component = nova-api, ...
15:18:49 For "mysql"
15:18:52 service = mysql
15:18:56 component = mysqld
15:19:14 In some cases, the values are the same
15:19:21 there is a process dimension
15:19:56 the plugin cannot know for what service mysql is used
15:20:11 same with apache
15:20:36 etc., so plugins should IMHO set dimensions sparingly
15:20:45 But, it can be overridden
15:20:46 or at least not override 'service'
15:20:57 no, it cannot - no longer
15:21:17 the order was reversed
15:21:32 got it
15:21:44 i think we just coded to what worked for our deployment
15:22:03 same with us :-)
15:22:22 we ran into a problem because there were plugins that were not setting the service dimension
15:22:39 this created problems in the ui
15:23:01 so, in that case what we wanted was to always supply a service dimension = "uncategorized"
15:23:01 sure, but this can be fixed when configuring the agent, no?
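Note on the override order discussed above: the disagreement is about which side wins when both the agent configuration (for example --service on monasca-setup) and a plugin set the same dimension. A minimal sketch of the two orders follows. It is an illustration only; merge_dimensions and the plugin_wins flag are hypothetical names and not part of monasca-agent. It assumes the "old" behaviour was agent configuration overriding plugin values and the "new" behaviour is the reverse, which is how the log reads.

```python
def merge_dimensions(agent_dims, plugin_dims, plugin_wins=False):
    """Combine agent-level and plugin-level dimensions into one dict.

    Hypothetical helper, not the monasca-agent API.
    plugin_wins=False: values from the agent configuration take precedence
                       (assumed to be the "old" behaviour in the log).
    plugin_wins=True:  values set by the plugin take precedence
                       (assumed to be the "new" behaviour in the log).
    """
    if plugin_wins:
        merged = dict(agent_dims)
        merged.update(plugin_dims)   # plugin values overwrite agent values
    else:
        merged = dict(plugin_dims)
        merged.update(agent_dims)    # agent values overwrite plugin values
    return merged


# The operator runs mysql only for nova and says so in the agent config;
# the generic mysql plugin cannot know that and reports service=mysql.
agent_dims = {"service": "compute"}
plugin_dims = {"service": "mysql", "component": "mysqld"}

old_style = merge_dimensions(agent_dims, plugin_dims)
new_style = merge_dimensions(agent_dims, plugin_dims, plugin_wins=True)
```

With a switch like this, a deployment that relies on the operator-supplied service value keeps working, while deployments that want plugin-defined values can opt in.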
15:23:30 i'm concerned about automatically changing the default dimensions -- if we do that i'd prefer the change be config file driven so old behavior continues to work
15:23:49 maybe this belongs to the UI layer? not sure
15:23:56 we've run into issues with dimension changes and bloat
15:24:30 us too, that is a big issue in my opinion
15:24:32 so, my proposal is to restore the old behaviour, and then create an option to enable the new behaviour
15:24:52 +1
15:24:53 i'm trying to get rbrandt
15:25:15 he's not around
15:25:16 adds complexity
15:25:24 there is an "old" bug: A metrics graph becomes not to appear when adding/deleting a dimension https://bugs.launchpad.net/monasca/+bug/1485859
15:25:24 Launchpad bug 1485859 in Monasca "A metrics graph becomes not to appear when adding/deleting a dimension." [Undecided,Triaged]
15:25:51 that means changing dimensions is not a good idea...
15:26:18 that is what I meant by big issue. you cannot force people not to add dimensions, that is what they are good for
15:26:46 what we wanted to have happen was to always have a dimension of service=uncategorized
15:26:57 if one wasn't supplied
15:27:27 to me this looks like a presentation-layer problem
15:27:29 i'm not exactly sure at this point, why that ended up modifying the default behaviour
15:28:01 so, i'll check with rbrandt and come up with a proposal to fix
15:28:10 if that sounds ok
15:29:17 it isn't a presentation layer problem, btw
15:29:33 the problem occurs when searching for metrics and alarms
15:29:51 we don't support being able to get metrics and alarms that don't have a supplied dimension
15:29:51 I was just talking of the specific case of the service dimension
15:30:05 agreed -- if this is the issue rbak found -- you end up with metrics you can't query for without merging
15:30:13 other than that I believe it is a bigger issue which will not be solved by the default service domain value
15:30:37 +1
15:30:38 so, you can't search for the absence of a
dimension
15:30:48 today
15:30:55 and there is no way to do that in some databases
15:30:58 like influxdb
15:31:38 so, we wanted to supply a default dimension = uncategorized everywhere
15:31:53 but that broke the original behaviour
15:31:55 but it is not just "service"
15:32:09 yes, it applies to any dimension
15:32:36 so this is what I do not like about the option, it does not really fix the problem (for us)
15:32:51 but from a ui perspective we usually only group by hostname, service,
15:33:00 why?
15:33:29 why doesn't it work?
15:33:41 we have other dimensions
15:34:09 e.g. in kubernetes: namespace, ressource_controller, ...
15:34:37 the great thing about dimensions is that you can have your own ones
15:35:25 Roland: Is the number of dimensions fixed (and should not be changed)? Will all unused dimensions have the default value "uncategorized"?
15:37:38 Besides going back to the old behaviour, which doesn't fix the problem I'm trying to address, is there a specific proposal that we can implement?
15:37:55 We have a problem
15:38:10 I'm just looking for a specific way to resolve at this point that is implementable
15:38:22 sure
15:38:47 unfortunately I am not an influxdb expert
15:38:48 but in your case, can't we just add service=uncategorized in the monasca agent in case the service is not set from the plugin and not via agent config?
15:39:33 ok, i'll look into that
15:39:53 i don't have an answer right now
15:39:54 that would imply the db defaults to that value when items are added or removed, correct? Is that the case now?
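Note on the service=uncategorized suggestion above: the idea is a fallback that applies only when neither the plugin nor the agent configuration supplied a service, so every metric can still be found by a service filter. A small sketch, assuming the dimensions have already been combined into one dict; with_default_service and DEFAULT_SERVICE are hypothetical names used only for illustration.

```python
DEFAULT_SERVICE = "uncategorized"  # fallback value named in the discussion


def with_default_service(dimensions, default=DEFAULT_SERVICE):
    """Return a copy of `dimensions` that always carries a 'service' key.

    Hypothetical helper: an existing 'service' value is never overwritten,
    the default is filled in only when the key is missing.
    """
    result = dict(dimensions)
    result.setdefault("service", default)
    return result


no_service = with_default_service({"hostname": "host-1"})
has_service = with_default_service({"hostname": "host-1", "service": "compute"})
```

As pointed out in the log, this covers only the service dimension. Custom dimensions such as the kubernetes ones would still be absent on some metrics, so it does not solve searching for a missing dimension in general.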
15:39:56 the global configuration is overridden by the plugins
15:40:05 but, if it is possible, we'll try to do that
15:40:15 my proposal would be to make merge-metrics a default
15:40:20 er, added or updated
15:40:40 ok, i like that idea
15:41:00 er, ...my default
15:41:05 not sure why we didn't do that, but i'll investigate and try to get back to that
15:41:48 so you tell the API which dimensions should be expanded
15:41:58 and the remaining ones are merged
15:42:05 that gives some stability
15:42:51 rbak's grafana plugin is exposing this behavior to the user, so it is possible it seems
15:43:31 if the api would support it, too, then the db-driver could do optimizations to reduce the number of queries (mid-term)
15:46:05 ok, i'll look at the code and work with rbrandt and see what we come up with
15:46:09 sound good?
15:46:51 sounds great
15:46:55 thank you
15:47:06 ok, thanks
15:47:11 switching topics
15:47:19 #topic stale alarms
15:47:28 that's me
15:47:34 sorry time's up
15:47:38 just kidding
15:47:41 anyone else encounter this? :)
15:47:44 lol
15:47:47 yes,
15:47:48 so here's the scenario
15:48:08 the overview page in horizon, when tracking vms with alarm defs
15:48:08 well, this is one of the top issues we've been looking at
15:48:17 goes gray after a vm goes away
15:48:25 and requires a manual alarm delete step
15:48:48 so we could have a prune process specific to this, but i wonder if there's a better idea/solution
15:48:54 to handle stale stuff in the UI
15:49:01 there is a better way
15:49:09 good, i'm all ears
15:49:27 but first let me say that we are solving this in our helion distribution using a script and cron job
15:49:36 that is a short-term solution
15:49:48 because the better solution is harder
15:49:50 right, and we could do the same here -- we'd welcome you sharing ^^
15:50:07 i'll try and get that script open-sourced somewhere
15:50:26 gracias, and there's discussion about a more elegant solution?
15:50:27 the way we were intending was to use the Events API
15:50:34 maybe discussion in austin?
15:50:46 Sure, we can discuss
15:50:48 aah, so lifecycle trigger
15:50:50 i like that
15:50:53 Correct
15:51:02 the Events API would receive all VM lifecycle events
15:51:36 that's clean -- so we can do the cron/script as a workaround till then
15:51:40 and the events engine would have a handler associated with it to delete the alarm, when the VM end event occurs
15:52:06 The only alternative right now is a cron/script
15:52:26 the script actually invokes the nova api to determine the VMs that have been deleted
15:52:36 so it is not purely time based
15:52:38 that would work
15:52:40 nice
15:53:06 the problem with the events api/engine is that it is not getting any development time right now
15:53:15 so, it could be a long wait
15:53:25 for that to be delivered
15:53:28 so, I presume these events events are defined by nova and they are publishing to OSLO?
15:53:34 correct
15:53:40 s/events events/events/
15:53:48 openstack notification, VM lifecycle events
15:53:56 there is a wiki that describes them all
15:54:02 but i don't have the link
15:54:06 right now
15:55:10 so, is that enough of a discussion
15:55:12 on that topic
15:55:14 thx
15:55:23 we are running out of time
15:55:34 #topic multiple metrics
15:55:39 So, i posted some code
15:55:42 it isn't complete
15:55:58 if bklei and rbak could look at it that would be good
15:56:26 what i would like to do is add a query parameter, "multiple_metrics" or something similar
15:56:37 yeah, looks good to me if we add the param to differentiate behavior
15:56:45 to enable returning multiple metrics in a single measurements or statistics resource
15:56:53 we'll definitely be using that
15:56:57 I'll take a look as soon as the meeting is done.
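Note on the cron/script workaround for stale alarms described above: the script itself was not shared in the meeting, so the following is only a sketch of the core decision it would have to make, namely which alarms refer solely to VMs that no longer exist. It assumes each alarm is a dict with an "id" and a "metrics" list whose entries carry a "dimensions" dict, and that the VM is identified by a "resource_id" dimension; both are assumptions, as is the name find_stale_alarms. Fetching the alarm list from Monasca, fetching the active instances from nova, and deleting the alarms are deliberately left out.

```python
def find_stale_alarms(alarms, active_vm_ids, dimension="resource_id"):
    """Return ids of alarms whose metrics all point at VMs that are gone.

    Hypothetical helper. `alarms` is an iterable of dicts shaped like
    {"id": ..., "metrics": [{"dimensions": {...}}, ...]} (assumed layout).
    `active_vm_ids` is the set of instance ids that still exist.
    Alarms without the VM dimension are never reported as stale.
    """
    stale = []
    for alarm in alarms:
        vm_ids = {
            metric["dimensions"].get(dimension)
            for metric in alarm["metrics"]
        }
        vm_ids.discard(None)  # metrics that are not tied to a VM
        if vm_ids and vm_ids.isdisjoint(active_vm_ids):
            stale.append(alarm["id"])
    return stale


example_alarms = [
    {"id": "a1", "metrics": [{"dimensions": {"resource_id": "vm-1"}}]},
    {"id": "a2", "metrics": [{"dimensions": {"resource_id": "vm-2"}}]},
    {"id": "a3", "metrics": [{"dimensions": {"hostname": "host-1"}}]},
]
stale_ids = find_stale_alarms(example_alarms, active_vm_ids={"vm-1"})
```

Because the decision is driven by the set of existing instances rather than by alarm age, this matches the point made in the log that the workaround is not purely time based.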
15:56:59 this will probably improve your overall query performance 10X at least
15:57:01 if not more
15:57:08 i'm actually hoping for 100X
15:57:30 i don't think this is possible for influxdb using an in-database query
15:57:43 gonna be freaking awesome -- in vertica land :)
15:57:43 so, please take a look
15:58:27 I think we are out of time for anomaly detection
15:58:43 ok, next week :-)
15:58:48 sorry ho_away
15:58:52 you are up first next week
15:59:03 i'll also touch-base with luis
15:59:04 thanks!
15:59:29 yeah, i will send you email to have a meeting
15:59:31 jobrs: will get back to you
15:59:36 thx for hosting rhochmuth!
15:59:39 thx ho_away
15:59:45 thanks everyone
15:59:49 bye
15:59:51 thank you, looking forward to multiple metrics, too
15:59:54 later
15:59:56 thanks Roland
16:00:04 bye and thx
16:00:16 bye
16:00:16 #endmeeting
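Note on the "expand some dimensions, merge the rest" idea from the agent topic, which is closely related to returning multiple metrics in one response: the caller names the dimensions it wants kept apart, and every series that agrees on those values is merged into one. The sketch below shows that grouping on in-memory data only. merge_series is a hypothetical name, the (dimensions, measurements) input layout is an assumption, and nothing here reflects the code in the review linked in the agenda.

```python
from collections import defaultdict


def merge_series(series, expand):
    """Group series by the `expand` dimensions and merge everything else.

    Hypothetical helper. `series` is a list of (dimensions, measurements)
    pairs, where measurements is a list of (timestamp, value) tuples.
    Series that share the same values for every dimension in `expand`
    end up in one group; a dimension missing from a series counts as None.
    """
    groups = defaultdict(list)
    for dims, measurements in series:
        key = tuple((name, dims.get(name)) for name in expand)
        groups[key].extend(measurements)
    # Sort each merged group by timestamp so it reads as a single series.
    return {key: sorted(points) for key, points in groups.items()}


example_series = [
    ({"hostname": "h1", "service": "compute", "device": "sda"}, [(1, 0.5), (3, 0.7)]),
    ({"hostname": "h1", "service": "compute", "device": "sdb"}, [(2, 0.6)]),
    ({"hostname": "h2", "service": "compute"}, [(1, 0.1)]),
]
by_host = merge_series(example_series, expand=["hostname"])
```

Here the two h1 series collapse into one group because only hostname is expanded, while h2 stays separate. Adding or removing a dimension such as device on the agent side then no longer changes how many series the caller sees, which is the stability argument made in the log.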