16:00:02 #startmeeting Performance Team
16:00:02 Meeting started Tue Aug 9 16:00:02 2016 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:06 The meeting name has been set to 'performance_team'
16:00:20 hey folks!
16:00:26 bvanhav indeed :)
16:00:41 going to push new release today
16:01:28 ok, let's see who's around
16:01:41 o/
16:01:49 gokrokve o/
16:01:53 hi
16:01:57 hi o/
16:02:04 rohanion, rcherrueau nice to see you :)
16:02:21 same here :)
16:02:37 okay, so let's get started with usual topics
16:03:11 #topic Performance docs update
16:03:28 lots of stuff happened during last week
16:03:39 we were able to merge almost all changes on review
16:03:57 and I cleaned up a bit our LP bugs list
16:04:04 #link https://bugs.launchpad.net/performance-docs
16:04:32 so right now if a bug is assigned to me, feel free to reassign it to yourself and start working on it
16:04:36 :)
16:04:46 #link https://review.openstack.org/#/c/334964/
16:05:08 this change is going to make our life less painful next merge party
16:05:20 yaaay, merge party!
16:05:29 :D
16:05:45 so we won't need to rebase all changes after each merged commit
16:06:20 rohanion please find Lenya and remind him about https://bugs.launchpad.net/performance-docs/+bug/1609924 :)
16:06:20 Launchpad bug 1609924 in OpenStack Performance Docs "500 nodes lab description is missing" [Medium,Triaged] - Assigned to Leontiy Istomin (listomin)
16:06:53 it's a bit strange that we're running our tests against the 500 nodes lab and no lab description has been published here yet
16:07:20 yeah will do
16:07:54 #info next week will be dedicated to minor fixes for the bugs from the list https://bugs.launchpad.net/performance-docs
16:08:10 * rohanion bookmarked
16:08:31 so I would say that's all regarding the performance docs themselves
16:08:49 #topic Current progress on the planned tests
16:09:15 so as I mentioned some meetings ago, Q3 is a logical continuation of the work done in Q2
16:09:37 so today we've continued Neutron testing
16:09:45 on the 250 nodes lab
16:10:19 so more results will be coming in the next 2 weeks or so
16:10:50 that will be extended scalability tests, performance measurements under load, etc.
16:11:06 also in parallel we're working on baseline performance testing
16:11:23 #link http://docs.openstack.org/developer/performance-docs/test_plans/control_plane/plan.html
16:11:30 against 500 nodes
16:11:45 and its comparison with the 250 nodes runs we did already
16:12:04 extended reliability testing is planned later :)
16:13:30 #info working on extended Neutron testing and on baseline performance testing on 500 nodes. Reliability tests, 1000 nodes emulation and others are planned in 2 weeks or so
16:13:52 any updates on current test runs from other folks?
16:13:57 rcherrueau ?
16:14:02 Yes
16:14:12 We have a first solution that deploys a distributed OpenStack over g5k (a grid) using kolla.
16:14:25 You can find the code on github
16:14:29 #link https://github.com/BeyondTheClouds/kolla-g5k
16:14:36 ack, bookmarked
16:14:37 DinaBelova, hi, how are you?
good to interact again... just curious, is there an established ETA / date for osprofiler w/ Neutron and Swift? any future plans to integrate w/ Juniper Contrail?
16:14:53 oh darn swift
16:14:58 We actually test with a small number of controllers and go with the `fake_node` option to multiply the number of compute nodes.
16:15:01 we forgot about it T_T
16:15:11 #link https://github.com/openstack/kolla/blob/master/doc/nova-fake-driver.rst
16:15:19 kristian__ neutron is already supported :)
16:15:35 and you're right, Swift is not yet supported
16:15:46 I'd say it's up to rohanion to define the ETA
16:15:54 ...just confirming to make sure. and that is releasable for us, yes :)
16:16:39 do we have to integrate osprofiler with swift or just patch it to work with the new driver architecture?
16:16:42 more interested in Juniper contrail integration these days...
16:17:14 regarding Contrail - the issue is that osprofiler is a python library, so we'll be able to hook into its python client... but not into Juniper Contrail itself using the profiler
16:18:18 yes, so coding integration is blocked there; how about with juniper's REST API interface?
16:18:29 so right now I'm not sure how we can include contrail-specific tracing details in the osprofiler trace
16:18:52 kristian__ well, I think (??)
we need to check how osprofiler works with the neutron drivers
16:18:58 DinaBelova: maybe we can use osprofiler_web with juniper's REST API
16:19:10 good points, guys
16:19:38 rohanion - no need, we'll just wrap already wrapped stuff in the neutron drivers code - it looks like it'll be enough to check how drivers are wrapped in neutron
16:20:06 kristian__ good point anyway, let us check
16:20:19 np, thank you
16:20:41 #action DinaBelova rohanion check how we can estimate contrail profiling out of Neutron code using osprofiler
16:21:12 ok, so let's jump to osprofiler itself :)
16:21:14 #info check if osprofiler is compatible with juniper's REST API
16:21:26 #topic OSProfiler weekly update
16:21:40 so hey! merge party finished today :)
16:21:47 yeah we had a great merge party last week :)
16:21:57 and I'm really glad to say I'm going to cut a new release today :)
16:22:06 #info new osprofiler release coming today
16:22:22 this will unblock lots of stuff
16:22:31 bvanhav ^^
16:22:46 currently we basically have only one major change on review
16:22:47 too bad I'm not in the US, we could have had a release party lol
16:22:57 #link https://review.openstack.org/#/c/340936/
16:23:01 rohanion indeed :D
16:23:11 Alex is away
16:23:32 I'll post the link to openstack/releases in this chat once it's ready
16:23:32 the Elasticsearch driver is almost ready
16:23:47 rohanion true, but I don't want to delay the release because of this
16:24:05 he is going to cover it with tests and then the change will be ready to merge
16:24:13 yes, there's no need to delay
16:24:15 I think it'll be more logical to cut one more release with elasticsearch once it lands
16:24:28 rohanion thanks for the information
16:24:29 elasticsearch does not change anything in the base architecture
16:24:34 u r welcome
16:24:36 indeed
16:24:48 will it be v1.4.0?
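[Editor's note] The driver-wrapping approach discussed above boils down to a decorator that emits start/stop events around each call. A minimal pure-Python sketch of that pattern (an illustration, not the real osprofiler API; the trace point name is hypothetical):

```python
import functools
import time

# Collected trace events; the real osprofiler ships these to a
# collector backend (e.g. messaging) instead of an in-memory list.
EVENTS = []

def trace(name):
    """Wrap a function so a start/stop event surrounds every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            EVENTS.append((name + "-start", time.time()))
            try:
                return func(*args, **kwargs)
            finally:
                EVENTS.append((name + "-stop", time.time()))
        return wrapper
    return decorator

@trace("neutron.create_port")   # hypothetical trace point name
def create_port(port_name):
    # Stand-in for a real driver method.
    return {"name": port_name, "status": "ACTIVE"}
```

Since the wrapper only brackets the call, wrapping an already-wrapped driver method (as discussed above) simply nests another start/stop pair in the trace.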
16:25:07 yep, I think so
16:25:14 in fact we may cut 2.0.0
16:25:17 because our changes do not look like a minor update
16:25:28 true
16:25:29 keeping in mind that the arch was changed a lot
16:25:31 agree
16:25:35 2.0.0 sounds better
16:25:45 so elastic will be in 2.0.1
16:25:46 #info so it'll be the 2.0.0 release
16:25:51 agree
16:25:53 * rohanion hopes so
16:26:23 I'll be looking forward to it no matter which release number it ends up being
16:26:30 ok, so it looks like that's all about osprofiler
16:26:34 bvanhav true :)
16:26:51 anything else to add here?
16:27:04 I have a question about osprofiler. We are interested in measuring which methods of the database api (nova/db/api.py) are called most often. Do you think that osprofiler is a good candidate for that?
16:27:14 yes
16:27:16 cool
16:27:25 rcherrueau yes it is, but nova support has not been merged yet
16:27:41 this needs to be rebased https://review.openstack.org/#/c/254703/
16:27:45 ok, when will the merge take place?
16:27:52 we already count the database and wsgi operations
16:28:11 ah
16:28:17 got your point
16:28:44 rcherrueau not in Newton for sure - it's frozen for nova
16:28:57 it's not a change in nova
16:29:14 ok thanks
16:29:20 it looks like a change in osprofiler's 'stats' field
16:29:30 rohanion it is a change for nova
16:29:49 as we need to teach Nova how to use osprofiler
16:30:01 exactly the same as it was for cinder and others
16:30:11 ah, that's what you're talking about
16:30:15 yes, right
16:30:17 rohanion :D
16:30:29 I was talking about changing the stats
16:30:50 :) I see. For Nova I was really glad to get help from Mark
16:31:13 but this doesn't change the fact that the feature freeze for Nova lasts until the end of Newton
16:31:36 but we can always switch to a dev branch :)
16:31:41 ok, so it looks like we're done with osprofiler
16:31:50 seems so
16:32:00 rohanion hehe, we can, but the question is whether rcherrueau can do it :)
16:32:16 Why can't I?
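[Editor's note] rcherrueau's call-frequency question ("which methods of nova/db/api.py are called most often") can also be answered with a plain counting wrapper while nova's osprofiler support is unmerged. A self-contained sketch (the db-api function names here are hypothetical stand-ins):

```python
import collections
import functools

# How many times each wrapped function has been called.
CALL_COUNTS = collections.Counter()

def counted(func):
    """Increment a per-function counter on every call, then delegate."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALL_COUNTS[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper

# Hypothetical stand-ins for nova/db/api.py entry points.
@counted
def instance_get(uuid):
    return {"uuid": uuid}

@counted
def instance_get_all():
    return []
```

After a test run, `CALL_COUNTS.most_common()` ranks the db-api methods by call frequency; osprofiler would additionally give per-call timings and the request context each call belongs to.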
16:32:21 rcherrueau btw, what OpenStack are you running? Mitaka?
16:32:26 the dev branch is public, no?
16:32:27 rcherrueau I'm asking :)
16:32:33 yes, mitaka
16:32:33 yes
16:32:50 it's somewhere in review.openstack.org/...
16:32:59 rcherrueau - I can find the patch set number in the history of the change
16:33:06 is it a problem to run osprofiler over mitaka?
16:33:15 that was workable against mitaka
16:33:19 great
16:33:29 rcherrueau nope, we just need to identify which patch set it was
16:33:36 lemme find it and send it to you
16:33:46 thank you DinaBelova
16:33:57 it will work with mitaka but you won't be able to use any drivers but messaging/ceilometer
16:34:01 * notmyname reads scrollback and notices "swift"
16:34:03 let me know when/if you have swift questions
16:34:25 #action DinaBelova find the patch set # of https://review.openstack.org/#/c/254703/ with pure Mitaka support and link it to rcherrueau
16:34:44 rohanion: thanks for the clarification
16:34:46 notmyname ack, none so far - as we haven't started working on swift support yet :)
16:35:15 #topic Open Discussion
16:35:23 ok, so let's jump to open discussion
16:35:39 anything else to share?
16:35:49 I wanna talk about the 1000 node perf
16:35:55 can I?
16:35:55 sure
16:35:57 :)
16:36:26 So just like I said before, we go with kolla for the deployment of OpenStack
16:36:54 so yeah, container-based installation
16:37:20 We use kolla to deploy some controller nodes and then take advantage of the `fake_driver` option to get many computes
16:37:22 popular topic - at Mirantis we're working on CCP (containerized control plane) as well
16:37:30 Is any one of you familiar with this option?
16:37:39 #link https://github.com/openstack/kolla/blob/master/doc/nova-fake-driver.rst
16:37:39 rcherrueau I used it some time ago
16:37:47 any specific questions?
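[Editor's note] For reference, the kolla nova-fake-driver doc linked above drives this through a couple of switches in `globals.yml`. A sketch (variable names as described in that doc; worth double-checking against your kolla version):

```yaml
# /etc/kolla/globals.yml -- spawn fake nova-compute containers
enable_nova_fake: "yes"        # use nova's fake.FakeDriver instead of libvirt
num_nova_fake_per_node: 5      # fake nova-compute containers per physical host
```

Under the hood this configures nova-compute with `compute_driver = fake.FakeDriver`, which acknowledges every hypervisor operation without ever launching a real VM.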
16:38:00 if I can't answer, I'll ask alwex to help
16:38:10 as he did similar testing in Q2
16:38:25 DinaBelova: we don't know how this option works exactly and we want to be sure that doing a perf test with this option is equivalent to doing the same test with real nodes.
16:38:41 rcherrueau well, it's not, in fact
16:38:48 arf ...
16:39:09 the fake driver emulates all operations towards the hypervisor, but doesn't actually perform them
16:39:33 it was a little bit expected, in fact
16:39:34 so fake nodes have no libvirt running, and no REAL VMs starting
16:40:02 and this means that some specific network operations and setup don't happen either
16:40:31 ok thanks
16:40:36 do you know if we can (easily) deploy many computes on the same physical node with kolla?
16:40:59 rcherrueau this means we're emulating the "control plane" for the hypervisor and everything underneath, but no real VMs will be created
16:41:42 rcherrueau sadly not an expert here :( , I'm pretty sure there will be issues with locating several compute containers on the same node
16:42:07 I think it'd be better to ask in the kolla channel
16:42:21 DinaBelova: ok thanks for your responses
16:42:28 I will do so
16:42:46 rcherrueau the only 100% workable way I know to place several computes on the same physical node is to place them in VMs
16:43:02 this is something we're going to do in the next 2-3 weeks or so
16:43:09 against the 500 nodes lab
16:43:10 ok
16:43:17 ah ok
16:43:29 so we'll place ~2 VMs per node and have a real 1000 nodes cluster
16:43:44 ok got it
16:43:46 but I don't know how to manage this with docker
16:43:56 rcherrueau sorry for not helping much
16:44:21 no, don't worry. I will ask on the mailing list
16:44:25 ack, thanks
16:44:31 ok, anything else?
16:44:43 yep
16:44:55 go ahead :)
16:45:08 We run the performance tests using rally.
16:45:21 especially the `boot-and-delete` scenario
16:45:50 Which tests do you use on the 500 nodes lab?
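[Editor's note] The `boot-and-delete` scenario mentioned above is rally's standard `NovaServers.boot_and_delete_server`; a typical task file looks like this (a sketch — flavor/image names and runner numbers are placeholders, not from the meeting):

```yaml
# rally task file (sketch)
NovaServers.boot_and_delete_server:
  - args:
      flavor:
        name: "m1.tiny"
      image:
        name: "cirros-0.3.4-x86_64"   # hypothetical image name
    runner:
      type: "constant"
      times: 50        # total iterations
      concurrency: 5   # parallel boot/delete pairs
    context:
      users:
        tenants: 2
        users_per_tenant: 2
```

Each iteration boots one server and deletes it, so the scenario stresses the control plane (API, scheduler, conductor) rather than the data plane.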
16:45:57 rcherrueau ack, can you share details regarding the number of nodes, cloud topology and the results?
16:46:18 rcherrueau we are using what we call the baseline performance test suite - http://docs.openstack.org/developer/performance-docs/test_plans/control_plane/plan.html
16:46:41 actually no, not now. We only ran it over a small number of instances (less than 50).
16:47:11 rcherrueau ack - so for nova we're stressing it by booting and deleting VMs with security groups
16:47:19 ok thanks
16:47:33 as secgroups are really something that can ruin everything in specific cases
16:47:54 ok
16:48:00 we've published the plugin we used here https://github.com/openstack/performance-docs/tree/master/doc/source/test_plans/control_plane/plugins
16:48:10 rcherrueau ^^
16:48:11 nice, bookmarked ;)
16:48:31 so don't be surprised to see the NovaPerformancePlugin.boot_attach_and_delete_server_with_secgroups scenario :)
16:48:40 it is used from this plugin
16:48:52 OK, I will take a look
16:49:15 all other scenarios are standard
16:49:20 cool :)
16:49:57 rcherrueau anything else?
16:50:09 no
16:50:20 ok, so it looks like we're done for today :)
16:50:23 guys please wait for a minute :)
16:50:28 rohanion a-ha :D
16:50:30 Alex is going to join
16:50:45 aignatyev hey :)
16:50:51 nice to see you here :)
16:50:52 hi :)
16:50:55 Welcome Alex!
16:51:06 hello
16:51:13 please meet another osprofiler monster :)
16:51:22 nice to meet you all :)
16:51:22 aignatyev can you please share info about what's left to be done for https://review.openstack.org/#/c/340936/ ?
16:51:27 so we can understand the ETA
16:51:48 I need to write 1-2 tests for retrieving reports from es
16:51:55 there's already 1 test, but I need more
16:51:58 will be done tomorrow
16:52:59 aignatyev ok, so I'll check the driver against my test lab once it's finalized
16:53:04 probably tomorrow as well
16:53:05 thanks
16:53:11 you are welcome
16:53:14 I'll check it as well
16:53:19 rohanion ack
16:53:32 ok, so we had a really nice discussion today
16:53:38 thanks everyone for coming
16:53:47 bye!
16:53:51 #endmeeting