13:00:46 <witek> #startmeeting monasca
13:00:47 <openstack> Meeting started Tue Feb 11 13:00:46 2020 UTC and is due to finish in 60 minutes.  The chair is witek. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:50 <openstack> The meeting name has been set to 'monasca'
13:00:56 <witek> hello everyone
13:01:02 <adriancz> Hi
13:01:05 <chaconpiza> Hello
13:01:08 <Dobroslaw> hi
13:01:45 <witek> the agenda seems to be light
13:01:49 <witek> https://etherpad.openstack.org/p/monasca-team-meeting-agenda
13:01:55 <witek> let's start
13:02:04 <witek> #topic ujson replacement
13:02:47 <witek> I've sent a question to requirements team at openstack-discuss
13:02:56 <witek> http://lists.openstack.org/pipermail/openstack-discuss/2020-February/012376.html
13:03:59 <witek> orjson does not seem to be a good option for inclusion in global requirements
13:04:10 <witek> because of its complicated build process
13:04:52 <witek> do we have any performance evaluation results already?
13:05:24 <chaconpiza> I compared ujson, json and simplejson in devstack
13:05:48 <chaconpiza> all these are available in devstack already
13:06:21 <chaconpiza> This is the way I did the checks:
13:06:37 <chaconpiza> 1. After stacking devstack, I stopped the log-agent, metric-agent and the persister.
13:06:50 <chaconpiza> 2. I modified the scale_perf/agent_simulator.py from Monasca-perf in order to work with devstack.
13:07:06 <chaconpiza> 3. I sent 690 calls with 1000 points each, using 4 Python processes. Every point had this form:
13:07:18 <chaconpiza> {'dimensions': {'cloud_name': 'monasca',
13:07:18 <chaconpiza> 'cluster': 'compute',
13:07:18 <chaconpiza> 'component': 'vm',
13:07:18 <chaconpiza> 'container': 'container_0',
13:07:18 <chaconpiza> 'control_plane': 'ccp',
13:07:19 <chaconpiza> 'hostname': 'agent_0',
13:07:20 <chaconpiza> 'resource_id': '34c0ce14-9ce4-4d3d-84a4-172e1ddb26c4',
13:07:22 <chaconpiza> 'service': 'service_0',
13:07:24 <chaconpiza> 'tenant_id': '71fea2331bae4d98bb08df071169806d',
13:07:28 <chaconpiza> 'zone': 'nova'},
13:07:30 <chaconpiza> 'name': 'aaa.perf_428',
13:07:32 <chaconpiza> 'timestamp': 1581424933000,
13:07:34 <chaconpiza> 'value': 781}
13:07:40 <chaconpiza> where: 'name' can be [aaa.perf_000 to aaa.perf_999]
13:07:54 <chaconpiza> 'value' is a random number in [000 to 999]
13:08:03 <chaconpiza> and 'hostname': ['agent_0' to 'agent_3']
13:08:20 <chaconpiza> This point is quite similar to those we have in production.
13:08:37 <chaconpiza> Cardinality: 1000 metrics X 4 agents X 1 container X 1 service = 4000
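The point layout and cardinality described above can be reproduced with a short sketch. This is not the agent_simulator.py from monasca-perf; the function name `make_point` is hypothetical, and the fixed dimension values are copied from the example point in the log.

```python
import itertools
import random
import time


def make_point(metric_index, agent_index):
    """Build one measurement shaped like the example point in the log."""
    return {
        "dimensions": {
            "cloud_name": "monasca",
            "cluster": "compute",
            "component": "vm",
            "container": "container_0",
            "control_plane": "ccp",
            "hostname": f"agent_{agent_index}",
            "resource_id": "34c0ce14-9ce4-4d3d-84a4-172e1ddb26c4",
            "service": "service_0",
            "tenant_id": "71fea2331bae4d98bb08df071169806d",
            "zone": "nova",
        },
        "name": f"aaa.perf_{metric_index:03d}",
        "timestamp": int(time.time() * 1000),  # milliseconds since epoch
        "value": random.randint(0, 999),
    }


# 1000 metric names x 4 agents; container and service are fixed.
points = [make_point(m, a) for m, a in itertools.product(range(1000), range(4))]

# A series is identified here by metric name plus hostname, since all
# other dimensions are constant in this test setup.
unique_series = {(p["name"], p["dimensions"]["hostname"]) for p in points}
print(len(unique_series))  # 4000
```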
13:08:50 <chaconpiza> 4. Changed the batch_size of the metrics in the persister from 30 to 1000
13:09:07 <chaconpiza> 5. Added a log.warn at the json.loads call https://github.com/openstack/monasca-persister/blob/master/monasca_persister/repositories/utils.py#L21
13:09:57 <chaconpiza> 6. Started the persister manually, redirecting the output to a file, and let it consume all the 690000 points waiting in kafka.
13:10:26 <chaconpiza> 7. Checked the time difference in the log from the first deserialization (log.warn) to the last.
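Step 7 amounts to subtracting the first log timestamp from the last. A minimal sketch follows, assuming a log line that starts with a `YYYY-MM-DD HH:MM:SS.mmm` timestamp and contains a marker string; both the sample lines and the marker text are hypothetical, and the real persister log format may differ.

```python
from datetime import datetime

# Hypothetical log excerpt; not taken from an actual persister run.
SAMPLE_LOG = """\
2020-02-11 12:00:00.000 WARNING deserializing message
2020-02-11 12:00:30.500 WARNING deserializing message
2020-02-11 12:01:53.250 WARNING deserializing message
"""


def elapsed_seconds(log_text, marker="deserializing message"):
    """Return seconds between the first and last line containing marker."""
    stamps = [
        # The first 23 characters hold the assumed timestamp format.
        datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")
        for line in log_text.splitlines()
        if marker in line
    ]
    if len(stamps) < 2:
        raise ValueError("need at least two matching log lines")
    return (stamps[-1] - stamps[0]).total_seconds()


print(elapsed_seconds(SAMPLE_LOG))
```

Note that an interval measured this way covers everything the persister does between the first and last message, not only the deserialization call itself.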
13:11:21 <chaconpiza> I repeated the process for simplejson and json, by changing the import at
13:11:27 <chaconpiza> https://github.com/openstack/monasca-persister/blob/master/monasca_persister/repositories/utils.py#L16
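For a quick comparison outside devstack, the same library swap can be approximated in-process. This is a rough micro-benchmark sketch, not the Kafka-based procedure described above; the helper names `load_available` and `time_loads` are hypothetical, and simplejson and ujson are only measured if they happen to be installed.

```python
import importlib
import json
import time


def load_available(names=("json", "simplejson", "ujson")):
    """Import whichever of the candidate libraries are installed."""
    libs = {}
    for name in names:
        try:
            libs[name] = importlib.import_module(name)
        except ImportError:
            pass  # library not installed; skip it
    return libs


def time_loads(lib, payloads):
    """Time lib.loads over all payloads and return elapsed seconds."""
    start = time.perf_counter()
    for raw in payloads:
        lib.loads(raw)
    return time.perf_counter() - start


# A reduced version of the example point, serialized once up front.
point = {
    "dimensions": {"hostname": "agent_0", "zone": "nova"},
    "name": "aaa.perf_428",
    "timestamp": 1581424933000,
    "value": 781,
}
payloads = [json.dumps(point)] * 10000

timings = {name: time_loads(lib, payloads)
           for name, lib in load_available().items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.3f} s")
```

Absolute numbers from such a loop are not comparable to the devstack results, since no Kafka consumption or database writes are involved.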
13:12:21 <chaconpiza> The results: json took 2 min 20 sec in the first test and 2 min 22 sec in the second test.
13:13:02 <chaconpiza> ujson took 1 min 53 sec in the first test and 1 min 51 sec in the second test.
13:13:23 <chaconpiza> simplejson took 1 min 59 sec in both tests.
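The reported timings can be put side by side with a little arithmetic. The sketch below only uses the figures quoted above, converted to seconds; averaging the two runs per library is an assumption made here for illustration. On these numbers simplejson comes out roughly 6% slower than ujson and the standard json module roughly 26% slower.

```python
# Run times in seconds, converted from the figures quoted in the meeting.
runs = {
    "json": [140, 142],        # 2 min 20 sec, 2 min 22 sec
    "ujson": [113, 111],       # 1 min 53 sec, 1 min 51 sec
    "simplejson": [119, 119],  # 1 min 59 sec, twice
}

TOTAL_POINTS = 690 * 1000  # 690 calls with 1000 points each

means = {name: sum(values) / len(values) for name, values in runs.items()}
baseline = means["ujson"]

for name, mean in sorted(means.items(), key=lambda kv: kv[1]):
    slowdown = 100 * (mean / baseline - 1)
    rate = TOTAL_POINTS / mean
    print(f"{name}: {mean:.0f} s, {slowdown:+.1f}% vs ujson, {rate:.0f} points/s")
```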
13:14:52 <witek> from this test simplejson seems to be a good compromise
13:15:15 <chaconpiza> and at least in devstack it is already available without any change
13:16:09 <witek> we could test rapidjson as well, but I don't expect it to be much faster than ujson
13:16:32 <chaconpiza> Yes, I have all the mini-infrastructure to test it
13:17:22 <chaconpiza> *Note that I only tested the deserialization (json.loads), I left the serialization (json.dumps) using ujson.
13:17:36 <chaconpiza> https://github.com/openstack/monasca-persister/blob/a4addd0f5e8c60f631a77fd280f2810b8f222203/monasca_persister/repositories/influxdb/metrics_repository.py#L48
13:18:47 <witek> that's fine, we were interested in deserialization because of the persister being the bottleneck
13:18:55 <chaconpiza> cool
13:20:26 <witek> any other comments on that?
13:21:01 <Dobroslaw> well, it's good if there is no need to add it to global requirements and it is fast enough
13:21:49 <witek> agree
13:22:15 <chaconpiza> Probably the profiling tests on the internet show differences because of the size of the string to be deserialized.
13:22:48 <witek> sure, it depends strongly on the structure of the object
13:23:38 <chaconpiza> yes like the nesting level of the object
13:24:10 <chaconpiza> Apart from the dimensions, we use a 'flat' object
13:25:23 <witek> so do we agree on using simplejson, or do you still want to test rapidjson as well?
13:26:10 <chaconpiza> can rapidjson be installed easily with pip?
13:26:19 <adriancz> I think simplejson should be good enough
13:26:21 <chaconpiza> I can do the check just after the meeting
13:29:10 <witek> I'm fine with simplejson as well, we can still change it in the future if it proves to cause any problems
13:29:44 <witek> #topic aob
13:29:58 <witek> do we have any other topics for today?
13:30:38 <chaconpiza> not from my side
13:31:42 <witek> OK, let's wrap it up then
13:31:52 <witek> thanks for joining
13:31:57 <witek> and for the tests
13:32:04 <chaconpiza> thanks
13:32:10 <witek> see you next time
13:32:10 <Dobroslaw> thank you
13:32:11 <bandorf> Thanks, bye everybody
13:32:19 <witek> #endmeeting