16:00:38 #startmeeting neutron_performance
16:00:39 Meeting started Mon Aug 12 16:00:38 2019 UTC and is due to finish in 60 minutes. The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:40 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:42 The meeting name has been set to 'neutron_performance'
16:02:13 o/
16:03:43 hey
16:03:55 very hot summer in Paris, isn't it?
16:04:29 it is better now luckily :)
16:04:46 we did have a day with 42.5C (aka hot hot hot)
16:05:02 but back in mid July you were hitting 40C, right? around Bastille Day
16:05:32 yes, I had good timing on my holidays :)
16:05:55 are homes, offices, cars and trains prepared for that kind of heat?
16:06:55 well, the offices were quite popular for a few days (and cars)
16:07:15 here in Austin, we hit 40+ most of July, all of August and 2 or 3 weeks in September. but our homes, offices and cars are prepared for it
16:07:32 but houses and public transportation were not a good place to be for a few days
16:07:53 yeah, the south of France has more houses with AC, but around Paris it is still very rare
16:08:07 that was not a productive day :'(
16:08:35 ok, let's assume the others are not attending
16:08:40 :)
16:08:52 yeah, it sounds like a middle-of-August attendance level
16:08:53 so let's have a quick one this week, in the spirit of the summer slowdown
16:09:07 #topic Updates
16:10:01 * bcafarel still catching up on previous meetings' logs
16:10:14 last time I mentioned that the osprofiler output was dominated by traces for SQLAlchemy, which don't correlate very well with lines in the code
16:10:33 so I sent this message to the ML: http://lists.openstack.org/pipermail/openstack-discuss/2019-July/008104.html
16:12:05 I didn't get any responses, so this morning I forwarded the message directly to akurilin's email address
16:12:14 let's see if we get a follow-up from him
16:12:40 in the meantime, given the lack of response, I decided to take an alternative path
16:13:48 * haleyb wanders in late
16:14:37 I decided to instrument the create_port and update_port code in the ML2 plugin with calls to cProfile
16:16:00 so to get detailed profiling just in those calls?
16:16:07 essentially, I add profile = cProfile.Profile() and profile.enable() at the start of those methods
16:17:09 and then profile.disable() and profile.dump_stats('/tmp/xxx') at the end of the methods, where xxx changes depending on whether it is create port or update port
16:17:37 with this I get a couple of files that I can then analyze with pstats
16:18:18 I sort the profile by CPU cumtime and then print the top 100
16:18:27 this is what I get:
16:18:50 for create_port: http://paste.openstack.org/show/756173/
16:19:11 for update_port: http://paste.openstack.org/show/756174/
16:19:44 by the way, I call these two methods as part of a script that creates and binds a port
16:20:21 since that has been the focus for our performance exercise this cycle
16:21:17 in the case of the port create, you get a lot of SQLAlchemy stuff at the top
16:21:35 but notice that there are 3 methods from the IPAM
16:22:21 if you add their cumtime together, they are close to the top consumers of cumulative CPU time
16:22:34 so it seems we need to dig deeper there
16:22:58 and by the way, they might be the culprits of the high utilization in the SQLAlchemy code
16:24:05 in the case of update_port
16:24:18 again, we get a lot of SQLAlchemy code at the top
16:24:45 I remember from some time ago a post by jd about transforming cProfile data into call trees (it could help here to show whether they do trigger a lot of SQLAlchemy calls)
16:25:17 bcafarel: yeah, I'll get to that in a minute
16:25:25 :)
16:26:12 going back to the update_port trace, we also see that we spend a lot of time notifying from the DVR scheduler and DHCP
16:26:23 which is kind of surprising
16:26:43 so again, it seems we need to dig deeper there
16:27:36 then I came across jd's blog post https://julien.danjou.info/guide-to-python-profiling-cprofile-concrete-case-carbonara/
16:28:05 * mlavalle ended up buying his book
16:28:33 https://scaling-python.com/
16:29:11 a worthy investment, yes (and this was indeed the link I was thinking about)
16:29:26 so at this point I am generating call trees
16:30:06 but it seems the data needs some massaging before being able to generate an actionable call tree
16:30:40 so my next step is to refine a bit the way I am gathering data with cProfile in the code
16:30:50 I am probably:
16:31:54 1) going to create a decorator for the methods that we want to profile. This decorator is still going to dump the output in /tmp, but I am going to add a timestamp, so we can save the output of several calls
16:32:58 2) with the above decorator, I am then going to run the create and bind port scenario against my deployment, increasing the load gradually
16:33:28 3) with this, I am going to correlate the rally reports with the cProfile output / call trees
16:33:39 and see if we can come up with more precise data
16:34:20 the very neat thing about this is that we can point very precisely to where in the code the bottlenecks are
16:35:06 so that's my update for today
16:35:16 any comments?
16:36:22 none, that looks promising
16:36:46 cool
16:36:57 anything else we should discuss today?
16:38:21 nothing for me, I did not find much time to work on perf topics recently :/
16:38:31 ok, cool
16:38:54 I really got excited last night when I started seeing all this come together
16:39:04 I think we are on the right path
16:41:05 ok
16:41:14 let's it a meeting
16:41:20 call it^^^
16:41:22 :)
16:41:25 have a nice week
16:41:30 thanks for attending
16:41:36 same, see you around!
16:41:36 #endmeeting
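[Editor's note: the cProfile instrumentation and the timestamped-dump decorator discussed in the meeting could be sketched roughly as below. This is a minimal, hypothetical illustration, not Neutron code: `profile_method`, the stand-in `create_port`, and the file-naming scheme are assumptions; in the actual exercise the decorator would wrap ML2's `create_port`/`update_port`.]

```python
import cProfile
import functools
import glob
import os
import pstats
import tempfile
import time

# The meeting log dumps straight into /tmp; using tempfile here for portability.
OUT_DIR = tempfile.gettempdir()

def profile_method(tag):
    """Profile each call to the wrapped method and dump the stats to a
    timestamped file, so the output of several calls can be kept."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            profile = cProfile.Profile()
            profile.enable()
            try:
                return func(*args, **kwargs)
            finally:
                profile.disable()
                # Millisecond timestamp keeps successive dumps apart.
                fname = '%s-%d.prof' % (tag, int(time.time() * 1000))
                profile.dump_stats(os.path.join(OUT_DIR, fname))
        return wrapper
    return decorator

# Stand-in for the real target method.
@profile_method('create_port')
def create_port():
    return sum(range(1000))

create_port()

# Analysis step as described in the log: sort by cumulative CPU time,
# print the top 100 entries for each dumped profile.
for path in glob.glob(os.path.join(OUT_DIR, 'create_port-*.prof')):
    pstats.Stats(path).sort_stats('cumulative').print_stats(100)
```

Dumping one file per call (rather than overwriting a single path) is what lets the per-call profiles be correlated later with the rally reports as the load increases.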