15:30:00 <DinaBelova> #startmeeting Performance Team
15:30:00 <openstack> Meeting started Tue Apr 18 15:30:00 2017 UTC and is due to finish in 60 minutes.  The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:30:01 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:30:04 <openstack> The meeting name has been set to 'performance_team'
15:30:09 <rcherrueau> o/
15:30:19 <DinaBelova> rcherrueau good evening sir :)
15:30:31 <DinaBelova> tovin07 akrzos o/
15:30:31 <tovin07> hello o/
15:30:51 <akrzos> DinaBelova: o/
15:31:03 <DinaBelova> let's get started I guess :)
15:31:08 <DinaBelova> #topic Action Items
15:31:15 <DinaBelova> last time we had only one action item on rcherrueau
15:31:29 <rcherrueau> hum ...
15:31:29 <DinaBelova> regarding adding new testing methodology to the perf docs
15:31:37 <rcherrueau> ongoing :)
15:31:46 <DinaBelova> yeah, I thought so as well :)
15:31:51 <DinaBelova> so let's keep it :)
15:32:00 <DinaBelova> #action rcherrueau add OpenStack testing under networking delays (e.g. multisite deployment) methodology to performance docs (openstack under WAN)
15:32:07 <DinaBelova> #topic Current progress on the planned tests
15:32:21 <DinaBelova> in the meanwhile rcherrueau please share your current progress :)
15:32:26 <rcherrueau> We have our first results :)
15:32:34 <DinaBelova> yay :)
15:32:42 <rcherrueau> The deployment model we use is always the same: control, network and volume services are on the same nodes. computes are on dedicated nodes.
15:32:58 <rcherrueau> Then, we add latency between computes and the control node, and run Rally scenarios.
15:33:09 <rcherrueau> The latency variations are: 10, 25, 50, 100 and 150ms.
15:33:20 <rcherrueau> First experiments show that a bigger latency implies a longer VM boot time and delete time.
15:33:33 <rcherrueau> This was expected. Here are some results: at 10 ms latency, the average boot time is 22 sec, whereas at 150 ms it is 30 sec.
15:33:43 <rcherrueau> Respectively 5 sec and 8 sec for delete time.
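The log does not say how the latency between the computes and the control node is injected; a common Linux approach is tc netem. A minimal sketch, assuming root privileges and a hypothetical interface name eth0 carrying the control-plane traffic (applied on one side only, this adds one-way delay):

```python
import subprocess

# Hypothetical interface name; use the NIC that carries control-plane traffic.
IFACE = "eth0"

def set_latency(delay_ms: int) -> None:
    """Add or replace a netem qdisc delaying all egress traffic on IFACE."""
    subprocess.run(
        ["tc", "qdisc", "replace", "dev", IFACE, "root", "netem",
         "delay", f"{delay_ms}ms"],
        check=True,
    )

def clear_latency() -> None:
    """Remove the netem qdisc so traffic goes back to normal."""
    subprocess.run(["tc", "qdisc", "del", "dev", IFACE, "root"], check=True)

if __name__ == "__main__":
    # Latency values from the experiment described above.
    for delay_ms in (10, 25, 50, 100, 150):
        set_latency(delay_ms)
        # ... run the Rally scenario here for this latency level ...
        clear_latency()
```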
15:33:50 <DinaBelova> yeah, this looks expected
15:33:57 <rcherrueau> I expect the time difference during boot comes from Glance. No idea for the delete time.
15:34:11 <rcherrueau> except oslo_messaging communications
15:34:21 <rcherrueau> I can say more soon, because I used OSProfiler at the same time to produce traces.
15:34:37 <rcherrueau> I have to dig into that and make a diff between two OSProfiler traces to see which functions are responsible for this difference.
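A minimal sketch of such a trace diff, assuming each trace has been exported to JSON (e.g. with `osprofiler trace show --json`) and that every node carries an `info` dict with `name`, `started` and `finished` fields plus a `children` list; the exact field names may differ between OSProfiler versions, and the file names below are hypothetical:

```python
import json
from collections import defaultdict

def walk(node, acc):
    """Accumulate duration per trace-point name.

    Assumes each node has an 'info' dict with 'name', 'started' and 'finished'
    (milliseconds since trace start) and a 'children' list, which is roughly
    the layout of an OSProfiler JSON export.
    """
    info = node.get("info", {})
    name, started, finished = info.get("name"), info.get("started"), info.get("finished")
    if name and started is not None and finished is not None:
        acc[name] += finished - started
    for child in node.get("children", []):
        walk(child, acc)
    return acc

def load(path):
    with open(path) as f:
        return walk(json.load(f), defaultdict(float))

def diff(low_path, high_path):
    """Print per-name duration differences, biggest increases first."""
    low, high = load(low_path), load(high_path)
    names = set(low) | set(high)
    for name in sorted(names, key=lambda n: high.get(n, 0.0) - low.get(n, 0.0), reverse=True):
        print(f"{high.get(name, 0.0) - low.get(name, 0.0):+10.1f} ms  {name}")

if __name__ == "__main__":
    # Hypothetical exports of a 10 ms-latency and a 150 ms-latency trace.
    diff("trace_10ms.json", "trace_150ms.json")
```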
15:34:41 <DinaBelova> cool, this should give us the specific place
15:34:48 <rcherrueau> yep
15:34:57 <rcherrueau> For next week I plan to test Neutron, and I also wanna see how latency affects OpenStack when you have many clients and thus many messages on the bus (by varying concurrency in Rally).
15:35:10 <rcherrueau> If I have the time, I also wanna run Shaker.
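For the concurrency sweep mentioned above, one option is to generate a Rally task per concurrency level and start each in turn. A minimal sketch, assuming an already configured Rally deployment, the stock NovaServers.boot_and_delete_server scenario, and placeholder flavor/image names:

```python
import copy
import json
import subprocess
import tempfile

# Placeholder flavor/image names; replace with what the deployment provides.
TASK_TEMPLATE = {
    "NovaServers.boot_and_delete_server": [{
        "args": {"flavor": {"name": "m1.tiny"}, "image": {"name": "cirros"}},
        "runner": {"type": "constant", "times": 100, "concurrency": 1},
        "context": {"users": {"tenants": 1, "users_per_tenant": 1}},
    }]
}

for concurrency in (1, 5, 10, 25, 50):
    task = copy.deepcopy(TASK_TEMPLATE)
    task["NovaServers.boot_and_delete_server"][0]["runner"]["concurrency"] = concurrency
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump(task, f)
        path = f.name
    # One Rally run per concurrency level; compare the reports afterwards.
    subprocess.run(["rally", "task", "start", path], check=True)
```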
15:35:23 <rcherrueau> That's all for Inria
15:35:29 <DinaBelova> ack, thank you sir
15:35:30 <DinaBelova> thanks again
15:35:48 <DinaBelova> akrzos, sir, do you have any news regarding the gnocchi testing?
15:36:18 <akrzos> Yeah continuing to try and scale up
15:36:34 <akrzos> Hit an issue where gnocchi lost the coordinator on one controller
15:36:49 <akrzos> thus if that's 3 controllers
15:36:56 <akrzos> you lose 1/3 of your capacity
15:37:01 <akrzos> not sure why that occurred yet
15:37:23 <akrzos> been able to get ~5k instances
15:37:27 <akrzos> still trying to get to 10k
15:37:29 <DinaBelova> any data from the monitoring? can't it be some overload somewhere?
15:37:34 <akrzos> also removed the collector
15:37:49 <akrzos> but now it looks like the rpc settings are not necessarily optimal for agent-notification
15:37:58 <akrzos> nor for the gnocchi api to receive so many requests
15:38:41 <akrzos> not really sure from the monitoring, other than it's easily fixable by restarting the metricd processes
15:38:49 <akrzos> but you have to catch it
15:39:06 <DinaBelova> okay, what's your current feeling? Do you think it's still possible to reach 10k?
15:39:36 <akrzos> there are still plenty of resources on this setup, it's getting the services to use them
15:39:55 <akrzos> i have like 4 days now to figure it out
15:40:00 <akrzos> so probably not likely
15:40:06 <akrzos> unfortunately
15:40:09 <DinaBelova> :(
15:40:30 <DinaBelova> that's sad, but let's hope you'll overcome this issue
15:40:36 * akrzos fingers crossed
15:40:42 <DinaBelova> true
15:41:30 <DinaBelova> ok, so from mirantis side I was able to confirm that we're switching to Mirantis Cloud Platform (MCP) usage in our scale and performance tests
15:41:44 <DinaBelova> this is a new experience for us, so we need to learn all the tips and tricks first :)
15:41:53 <DinaBelova> before going to the tests themselves
15:42:41 <DinaBelova> so I suspect in the near future we'll be trying to deploy it at various scales and build automation around this process, with the usual set of baseline tests run against it
15:43:29 <DinaBelova> so I suspect my updates won't be really interesting for the next month or so :D
15:43:44 <DinaBelova> I think that's all from my side
15:43:59 <DinaBelova> #topic Open Discussion
15:44:06 <DinaBelova> tovin07 anything to talk about?
15:44:11 <tovin07> yes
15:44:33 * tovin07 getting link
15:45:12 <tovin07> #link Rally + OSProfiler https://review.openstack.org/#/c/456278/
15:45:36 * DinaBelova adds this change to the review list
15:45:39 <tovin07> Last week, the Rally PTL created a spec for this feature
15:45:57 <tovin07> rcherrueau had some comments on this patch already
15:46:05 <rcherrueau> I wrote a small comment on this one about the form. But on the substance, this is really good.
15:46:29 <DinaBelova> sadly I did not take a look on it so far
15:46:36 <DinaBelova> but will do
15:47:19 <tovin07> besides, I tried some tests with Rally (with OSProfiler enabled) to measure the overhead of OSProfiler in a DevStack environment
15:47:36 <DinaBelova> tovin07 oh, interesting
15:47:52 <DinaBelova> I suspect the influence depends a lot on the chosen storage backend
15:48:18 <DinaBelova> tovin07 what is your current feeling on this?
15:48:19 <tovin07> currently, I'm trying with Redis
15:49:29 <tovin07> with some small tests, I saw that it adds about 0-7% overhead to the test time
15:49:56 <tovin07> with zero-optimization, I think it’s a good result :D
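For reference, the overhead figure is just the relative difference between the mean scenario duration with and without profiling; a trivial sketch with placeholder numbers (not the actual measurements from this meeting):

```python
def overhead_pct(baseline_s: float, profiled_s: float) -> float:
    """Relative overhead of running the same Rally scenario with OSProfiler enabled."""
    return (profiled_s - baseline_s) / baseline_s * 100.0

# Placeholder durations in seconds, purely illustrative.
print(f"{overhead_pct(22.0, 23.1):.1f}% overhead")  # -> 5.0% overhead
```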
15:50:21 <DinaBelova> how does it depend on the trace points number?
15:50:42 <DinaBelova> is the dependency linear?
15:51:05 <tovin07> I don't have a detailed answer for this yet :/
15:51:30 <DinaBelova> ok, please keep us updated :)
15:51:52 <tovin07> yup
15:52:01 <tovin07> that’s all from me this week
15:52:06 <DinaBelova> tovin07 thank you
15:52:17 <tovin07> How was your Easter?
15:52:22 <DinaBelova> ok, anything else to cover? rcherrueau akrzos?
15:52:26 <DinaBelova> tovin07 fine, thanks :)
15:52:29 <rcherrueau> nop
15:52:46 <DinaBelova> ok, so thanks everyone for joining today, let's have a good week :)
15:52:50 <DinaBelova> bye!
15:52:51 <tovin07> thanks
15:52:52 <tovin07> :D
15:52:56 <DinaBelova> #endmeeting