13:00:49 #startmeeting monasca
13:00:49 Meeting started Tue Dec 10 13:00:49 2019 UTC and is due to finish in 60 minutes. The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:50 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:52 The meeting name has been set to 'monasca'
13:01:00 hello everyone
13:05:11 anyone around?
13:08:58 Hi, sorry for coming late.
13:09:12 hi Martin
13:09:19 Dobek took a day off
13:09:46 let's start then
13:09:49 sure
13:09:52 agenda:
13:09:56 https://etherpad.openstack.org/p/monasca-team-meeting-agenda
13:10:20 #topic Prometheus plugin update
13:10:28 https://github.com/stackhpc/stackhpc-monasca-agent-plugins
13:11:21 dougsz has published his extension for the Prometheus plugin
13:11:37 I went through the readme today
13:11:45 hey, sorry i'm late
13:11:51 hi Doug
13:11:57 nice work
13:12:13 ah, thanks, just a test really at the moment, but it seems feasible
13:12:59 the main motivation was to use the Ceph Prometheus endpoint to replace the Monasca Agent Ceph collector
13:13:14 but it should hopefully work for everything
13:13:41 I was wondering about the best approach to merge it into Monasca Agent
13:13:42 right, Ceph has added Prometheus instrumentation
13:14:02 The current Prometheus Monasca Agent plugin is heavily geared to k8s
13:14:11 which is probably not what a lot of people want
13:14:44 but it also allows static configuration, right?
13:14:55 yeah, in a very basic sense
13:15:09 I almost wonder if it should be the prometheus-k8s plugin
13:15:33 and then have a vanilla prometheus plugin like the prototype
13:15:42 The alternative is to have all the functionality in one plugin
13:16:00 are there that many conflicting points?
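The plugin discussed above scrapes Prometheus text-format endpoints such as the one Ceph exposes. As background, here is a minimal sketch of parsing that exposition format; this is illustrative only and not the plugin's actual code, and it ignores edge cases like escaped label values:

```python
import re

# Matches one Prometheus exposition-format sample line, e.g.
#   ceph_osd_up{ceph_daemon="osd.0"} 1.0
SAMPLE_RE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
                       r'(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)')


def parse_exposition(text):
    """Yield (metric_name, labels_dict, float_value) from a scrape body."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):  # skip HELP/TYPE comment lines
            continue
        m = SAMPLE_RE.match(line)
        if not m:
            continue
        labels = {}
        if m.group('labels'):
            for key, val in re.findall(r'(\w+)="([^"]*)"', m.group('labels')):
                labels[key] = val
        yield m.group('name'), labels, float(m.group('value'))
```

Each parsed sample maps naturally onto a Monasca metric (name, dimensions, value), which is why a generic Prometheus plugin can cover Ceph and other instrumented services alike.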
13:16:52 There is some whitelisting capability, but only for the k8s thing
13:16:59 which would need unifying
13:17:45 I see
13:17:46 `monasca.io/whitelist`
13:17:51 slightly weird naming
13:19:06 and the metric types thing is for k8s
13:19:16 but could be useful for general endpoints
13:19:19 I thought `auto_detect_endpoints` could control whether we use static endpoint configuration or K8s auto-detection
13:19:52 yeah - I mean it works
13:20:38 but it seems like a chunk of the k8s-specific config needs to be pulled down and made available to the non-k8s bit
13:20:59 and you have to worry about backwards-compatible option naming etc.
13:21:24 hence my thoughts about making a new plugin
13:21:47 so you'd prefer to rename the existing one to prometheus-k8s, and publish the prototype as prometheus?
13:22:13 something like that
13:22:50 probably best not to rename the existing one though, for backwards compatibility
13:23:05 fine for me, if that's much easier to implement
13:23:37 thanks - it's much easier for me too, given limited time
13:25:34 the plugin name is not that important after all, we could document both options in the same section and point to two different configuration files
13:26:07 that's true - I can push something to that effect
13:26:18 chaconpiza: any thoughts?
13:27:12 for context... is the goal of this new plugin to handle the new version of Ceph?
13:28:29 new Ceph versions offer Prometheus instrumentation, so yes, we can monitor Ceph with this new Prometheus plugin
13:28:53 but also any other Prometheus endpoints
13:29:15 chaconpiza: Ceph was the motivation (because the existing plugin parses the Ceph CLI, which changes from release to release)
13:29:36 yes, I remember.
13:29:46 but yeah, as witek says, to make it easier to use Prometheus endpoints in general without running another Monasca service to compute rates etc.
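To make the two modes under discussion concrete, a hypothetical agent plugin config could look like the sketch below. Only `auto_detect_endpoints` and the whitelist idea come from the discussion; the `metric_endpoint` and `whitelist` key names, the port, and the overall layout are illustrative assumptions, not the plugin's actual schema:

```yaml
# Hypothetical monasca-agent Prometheus plugin configuration (sketch only)
init_config: null
instances:
  # k8s mode: discover annotated pods/services automatically
  - auto_detect_endpoints: true

  # static mode: scrape a fixed endpoint, e.g. the Ceph mgr Prometheus module
  - auto_detect_endpoints: false
    metric_endpoint: "http://ceph-mgr:9283/metrics"   # illustrative URL
    whitelist:                                        # assumed key name
      - ceph_osd_up
      - ceph_cluster_total_used_bytes
```

Documenting both instance shapes in one section, as proposed later in the meeting, would avoid having to rename the existing plugin.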
13:30:04 then it sounds good as a long-term solution
13:31:21 I'm a little worried about performance with larger setups, where we'd like to scrape several endpoints and define multiple aggregations
13:31:59 in the long term, aggregation on the Monasca server would be better
13:32:36 but that's more work, so I'm happy with this plugin
13:33:08 One advantage of this is that the whitelist can greatly reduce the amount of data going into Monasca
13:33:31 right, that's very useful
13:33:37 Many Prometheus endpoints produce vast amounts of data at scale (e.g. Ceph: 1M scrapes for a 10-node cluster)
13:34:33 But I agree, people may want the agent to be lightweight and shift the compute burden to the central Monasca deployment
13:34:52 do you remember this component?
13:34:56 https://github.com/monasca/monasca-aggregator
13:35:07 yeah
13:35:31 we could implement it scalably with Faust
13:36:03 That's a good idea
13:36:45 so many nice things we can do :)
13:37:12 I was reading through the doc of your Prom plugin
13:37:29 the `counter` section is somewhat misleading
13:38:06 any feedback much appreciated :)
13:38:36 will you be proposing it upstream soon?
13:39:00 yeah, I will do that, probably the best place to discuss
13:39:17 very nice, thanks a lot!
13:39:44 can we move on?
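The server-side aggregation mentioned above amounts to windowed roll-ups of the metric stream. A plain-Python sketch of a tumbling-window sum, the kind of computation monasca-aggregator performs (this is not its actual code; a Faust implementation would run the same logic partitioned over Kafka):

```python
from collections import defaultdict


def aggregate_window(samples, window_s=60):
    """Tumbling-window sum per metric name.

    samples: iterable of (timestamp_s, metric_name, value)
    returns: {(window_start_s, metric_name): summed_value}
    """
    sums = defaultdict(float)
    for ts, name, value in samples:
        # align each sample to the start of its window
        window_start = int(ts // window_s) * window_s
        sums[(window_start, name)] += value
    return dict(sums)
```

Moving this off the agent keeps scraping lightweight while the central deployment absorbs the compute cost, which matches the trade-off discussed above.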
13:39:56 please do, thanks
13:40:02 #topic review
13:40:18 we've made some progress on reviews this week
13:40:41 the merging of the DevStack plugin landed
13:41:10 also the change updating ELK has been updated
13:41:48 here's our board:
13:41:51 https://storyboard.openstack.org/#!/board/190
13:42:14 I started looking at periodic notifications
13:42:25 https://storyboard.openstack.org/#!/story/2006837
13:42:38 the changes have been up for much too long already
13:43:11 another one needing attention is IPv6 support:
13:43:17 https://review.opendev.org/673274
13:43:46 Adrian has submitted a new change deleting the old plugin from monasca-log-api:
13:43:52 https://review.opendev.org/690527
13:44:23 do you have any more reviews you'd like to mention?
13:45:07 None from me
13:45:10 #topic new bugs
13:45:20 we have one new bug report this week
13:45:25 https://storyboard.openstack.org/#!/story/2006984
13:45:37 it's about upgrading the DB schema
13:46:04 yes, that's mine, I will push a fix soon
13:46:23 nice, thanks dougsz
13:46:42 I should have spotted that really - I think the alembic step just needs to query existing plugins which are configured and skip deleting the associated types
13:47:23 Worth knowing about if anyone is upgrading anytime soon
13:48:01 that's from Queens to any other version?
13:48:19 yeah, we did Queens -> Rocky -> Stein and hit it then
13:48:56 do you know at which step?
13:49:13 I mean -> R or -> S
13:49:37 I *think* it is in Rocky where we got rid of built-in notification types
13:50:34 No, Stein actually
13:51:25 ok, thanks, we're upgrading to Rocky but I think we didn't hit it
13:52:06 #topic AOB
13:52:16 We suddenly had a big increase in memory consumption from InfluxDB 1.3.4 in a pre-production env
13:52:31 We noticed that the monasca-metric-agent was wrongly configured in the Nova URL for the "http_status" metric.
13:53:05 So it was producing the metrics with a long string in "value_meta", besides having "1" as the metric "value".
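The migration fix dougsz describes above boils down to one guard: before dropping built-in notification types, exclude any type that a configured plugin still relies on. A hypothetical sketch of that filtering step (the type names and function are illustrative, not the actual alembic migration):

```python
def types_safe_to_delete(built_in_types, configured_plugin_types):
    """Return the built-in notification types that no configured plugin
    still uses, i.e. the only ones the migration may drop.

    Sketch of the fix described in the meeting, not monasca code.
    """
    return set(built_in_types) - set(configured_plugin_types)


# Illustrative example: if the EMAIL plugin is still configured,
# only the unused types may be dropped by the schema upgrade.
BUILT_IN_TYPES = {'EMAIL', 'PAGERDUTY', 'WEBHOOK'}  # assumed set
```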
13:53:21 we are wondering whether InfluxDB has trouble processing a big number of points with this long "value_meta"
13:53:34 chaconpiza: Issue with the detection plugin not configuring the URL correctly?
13:54:09 In our dev machines the detection plugin works well and ends up with correct URLs for Keystone, Nova, Cinder, etc.
13:54:30 value_meta limit is ~2kb right? the main issue I have seen is with too many unique dimensions
13:54:31 I am not sure how the pre-production team got it
13:54:58 because of the cardinality?
13:55:01 yeah
13:55:31 We are simulating the metric-agent with: https://github.com/monasca/monasca-perf/blob/master/scale_perf
13:55:38 agent_simulator.py, but setting a long string in "value_meta"
13:55:52 in order to reproduce the issue
13:58:12 we will keep you informed in case we can break InfluxDB because of a big "value_meta"
13:58:44 thanks, it's probably worth investigating InfluxDB 1.7, as 1.3 has a security issue where dimension values are leaked across tenants
13:59:11 I think it would be better to improve the auto-detection script to configure the agent as needed
14:00:04 so far all metric-agent configurations were fixed manually and then restarted
14:00:56 in pre-prod a single tenant is being used
14:01:13 OK, please keep us updated
14:01:22 our time is up
14:01:30 thanks for the discussions
14:01:43 thanks all, bye
14:01:48 thanks
14:01:52 thanks, bye
14:01:55 #endmeeting
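A simple way to spot metrics like the misconfigured `http_status` one before they reach InfluxDB is to check the serialized size of `value_meta` against the ~2 KB limit mentioned above. A sketch, assuming JSON serialization and the 2048-byte figure from the discussion (this is not the agent's actual validation code):

```python
import json

VALUE_META_MAX_BYTES = 2048  # the ~2 KB limit mentioned in the meeting


def value_meta_oversized(value_meta):
    """Return True if a metric's value_meta dict would exceed the assumed
    size limit once serialized to JSON. Sketch, not monasca-agent code."""
    return len(json.dumps(value_meta).encode('utf-8')) > VALUE_META_MAX_BYTES
```

Running a check like this over the simulator's generated metrics would confirm whether the reproduction payloads actually exceed the limit, separately from the cardinality question raised above.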