16:00:08 #startmeeting neutron_performance
16:00:09 Meeting started Mon Jun 3 16:00:08 2019 UTC and is due to finish in 60 minutes. The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:10 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:13 The meeting name has been set to 'neutron_performance'
16:00:37 hello
16:00:58 hi
16:01:06 hi
16:01:55 I know slaweq and bcafarel won't attend today's meeting
16:02:25 so I think we can get going
16:02:50 o/
16:03:14 there he is
16:03:27 #topic Updates
16:03:47 rubasov: do you have an update for today?
16:03:52 mlavalle: yes
16:04:14 just an hour ago I uploaded the rally scenario for port binding I promised
16:04:19 here's the topic
16:04:21 https://review.opendev.org/#/q/topic:rally-bind
16:04:36 the main stuff is in the rally-openstack change
16:05:09 when you review it, please consider whether it's realistic enough for our goals
16:05:54 that's about it
16:06:42 at first glance, this looks good
16:07:06 Yeah, I have only had a chance for a quick look, but it seems sound out of the gate
16:08:32 just leave me comments there if you think anything else should be included
16:08:53 and the good news is that each one of the operations you are calling in https://review.opendev.org/#/c/662781/1/rally_openstack/scenarios/neutron/network.py is "atomic"
16:09:04 so we will be able to see them in the rally report
16:09:26 yep, they are all measured one by one
16:10:07 because the alternative is to create VMs
16:10:27 but in that case we lose the ability to measure our primitive operations individually
16:11:45 the downside is that we are not measuring the "wiring" of the vifs in the agent
16:11:49 yeah, that's quite different from port binding
16:12:00 may be interesting in itself though
16:12:17 yeah, we also need to capture that
16:12:24 somehow
16:12:59 I guess what we need is what happens after the port plug in neutron, right?
16:13:11 right
16:13:29 unfortunately that's not exposed in the API, so rally may not be the right tool for it
16:13:43 actually I do not really know what would be the right tool for it
16:13:59 correct, what rally can measure is REST API operations
16:14:26 but maybe that part of the circuit we can capture with osprofiler
16:14:45 if we had a scenario that spins up instances
16:15:04 we can measure a full VM boot and look at the osprofiler output
16:15:11 correct
16:15:26 I can add another scenario for that
16:15:35 I still think that the scenario you proposed today is very valuable
16:15:48 because it allows us to isolate that part of the circuit
16:16:06 this will be much easier to interpret
16:16:12 but it's not the full thing
16:16:19 but we also need to capture the agent side
16:16:50 we could just re-use one of the nova scenarios that already exist in rally-openstack
16:16:58 just add it to our rally job
16:17:35 yep, likely it's already around somewhere
16:17:42 I'll look around
16:18:22 rubasov: https://opendev.org/openstack/rally-openstack/src/branch/master/rally_openstack/scenarios/nova
16:18:49 and https://opendev.org/openstack/rally-openstack/src/branch/master/rally-jobs/nova.yaml
16:20:43 yep, this sounds like just the one we may need: NovaServers.boot_server_and_list_interfaces
16:21:02 yes
16:21:22 we are only interested in the boot_server half
16:21:37 and what osprofiler can tell us about it
16:22:34 anything else for today, rubasov?
16:22:41 that's all from me
16:22:58 great job! thanks very much!
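As a reference for the discussion above, here is a minimal sketch of the kind of rally-openstack scenario being described, with each port operation wrapped in its own atomic action so it shows up as a separate row in the rally report. This is illustrative only, assuming rally's atomic.ActionTimer API and rally-openstack's scenario base class; it is not rubasov's actual patch, and the scenario name, host name, and structure are made up.

    from rally.task import atomic
    from rally_openstack import scenario


    @scenario.configure(name="NeutronNetworks.create_and_bind_ports_sketch")
    class CreateAndBindPortsSketch(scenario.OpenStackScenario):
        """Sketch: time port create and port binding as separate atomics."""

        def run(self, ports_per_network=1):
            neutron = self.clients("neutron")

            with atomic.ActionTimer(self, "neutron.create_network"):
                network = neutron.create_network(
                    {"network": {"name": self.generate_random_name()}})

            for _ in range(ports_per_network):
                with atomic.ActionTimer(self, "neutron.create_port"):
                    port = neutron.create_port(
                        {"port": {"network_id": network["network"]["id"]}})

                # Setting binding:host_id is what triggers port binding on
                # the server side (normally an admin-only operation); the
                # host name here is a placeholder.
                with atomic.ActionTimer(self, "neutron.bind_port"):
                    neutron.update_port(
                        port["port"]["id"],
                        {"port": {"binding:host_id": "fake-host-0"}})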
16:23:19 never mind
16:23:34 so on my side, I made progress with the EnOS deployment on my big PC
16:24:18 at this point, I am deploying 1 control node, 1 network node and 10 computes
16:24:40 and I have about 55% memory utilization
16:25:13 so I feel confident that I can scale this up with the memory I have (64GB) to about 20 computes
16:25:46 and I can add another 64GB of memory, so probably I can get up to 50 or 60 computes
16:26:01 but before adding more memory, I want to stress this config
16:26:13 I want to make sure the CPU is not the limiting resource
16:26:21 which it doesn't seem to be
16:26:36 with 32 threads, CPU utilization is very low
16:27:29 My next steps are to max out the current memory with 20 computes
16:27:46 and then start running the scenario that rubasov just proposed
16:28:15 so I have several questions for jrbalderrama
16:29:02 Of course, I'll try my best; msimonin (the maintainer) is out of office today
16:29:17 1) Does enos install rally and rally-openstack from the stable branch (stein)?
16:31:10 1. by default everything is taken from a stable branch; alternatively you can define the repo path and branch
16:31:51 the question is relevant because I want to test with rubasov's patch
16:32:05 maybe just let enos install from the branch
16:32:38 and take the relevant files from rubasov's patch and put them on the installed rally-openstack
16:32:58 ?
16:33:13 I will try that and let you know how it goes
16:33:19 That is possible; actually, in the past we had some configs with some local patches applied after install
16:33:33 ok
16:34:20 2) I had trouble again booting up instances: http://paste.openstack.org/show/752412/
16:34:40 please see lines 3 to 6
16:35:20 I used the images that enos gave me, cirros-uec and debian-19
16:35:28 I mean, debian-9
16:36:55 It is strange. I cannot answer that right now. If we have access to your configuration file we can try to reproduce the behaviour here and let you know
16:37:09 but still, as you can see in the nova scheduler log lines, the ImagePropertiesFilter doesn't find computes
16:37:26 you mean reservation.yaml
16:37:28 ?
16:38:28 yes
16:38:44 ok, I'll email it to you as soon as the meeting is over
16:39:44 OK thanks
16:41:09 3) I came across this paper https://hal.inria.fr/hal-01415522v2/document, which some members of your team wrote. If you look at page 12, the last paragraph in the left column states that "We use the “fake driver” capability of Nova to deploy 50 nova-compute containers per physical node, thus allowing to reach 1,000 fake compute-nodes with 20 physical machines"
16:41:32 FYI, I just checked the stein test environment I have installed on my PC and I got public and private subnets.
16:43:18 so if I can deploy 20 computes on my big PC and configure the nova fake driver, I could achieve a 1000-node scale test according to the paper
16:43:21 right?
16:44:03 The paper you mention is probably related to the presentation at the summit: https://www.openstack.org/summit/barcelona-2016/summit-schedule/events/15977/chasing-1000-nodes-scale
16:44:13 yes
16:44:35 my conclusion is that there wasn't a test with 1000 real nodes
16:44:50 they were simulated with the nova fake driver, right?
16:45:40 You are right, those are not real. I can double check with msimonin about it to confirm.
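For context on the "fake driver" in the paper: nova ships a no-op virt driver that is selected through the compute_driver option in nova.conf, which is how one process can pretend to be a compute node without running real VMs. Below is a rough sketch of the neutron-oriented variant mlavalle proposes a few lines further down: a fake driver that also plugs one OVS port into br-int per fake instance, so the agent-side wiring gets exercised too. The class, the spawn() signature, and the tap naming are assumptions for whatever nova release is in use, not tested code.

    import subprocess

    from nova.virt import fake


    class VifPluggingFakeDriver(fake.FakeDriver):
        """Sketch: fake driver that plugs one br-int port per instance VIF."""

        def spawn(self, context, instance, image_meta, injected_files,
                  admin_password, allocations, network_info=None,
                  block_device_info=None):
            super().spawn(context, instance, image_meta, injected_files,
                          admin_password, allocations,
                          network_info=network_info,
                          block_device_info=block_device_info)
            for vif in network_info or []:
                # Neutron-style tap device name: "tap" + start of port id.
                dev = ("tap" + vif["id"])[:14]
                # An internal OVS port carrying these external-ids is what
                # the ovs agent watches for, so adding it triggers the
                # binding/wiring path we want to measure.
                subprocess.check_call([
                    "ovs-vsctl", "--may-exist", "add-port", "br-int", dev,
                    "--", "set", "Interface", dev, "type=internal",
                    "external-ids:iface-id=%s" % vif["id"],
                    "external-ids:iface-status=active",
                    "external-ids:attached-mac=%s" % vif["address"],
                ])

Error handling and cleanup are omitted; a real version would also need to remove the port when the fake instance is deleted (e.g. in destroy()).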
He is one of the authors; however, lately we have been working with VMs to deploy nodes, scaling to hundreds
16:46:21 so my complete question would be: how do I configure that fake driver in my deployment?
16:47:33 from our point of view (Neutron) the fake driver is not going to be good enough, because if there are no instances, there are no vifs to be plugged into br-int
16:47:57 If I remember right, it was more a hack than a deployment, so there are no real nodes but openstack does not detect that.
16:48:35 indeed, for neutron it is not the appropriate way to scale (they had different goals at that time)
16:48:46 but if we create another fake driver that just plugs a vif into br-int for each fake instance, then we can simulate the entire port creation, binding and wiring circuit
16:50:01 I guess if we can provide that driver it could work. But again, are VMs not enough?
16:50:02 in that fake driver we can just create the vifs the same way we create them for the DHCP and L3 agents: https://github.com/openstack/neutron/blob/master/neutron/agent/linux/interface.py#L261
16:50:49 My assumption is that with the fake driver, you try to avoid the actual creation of VMs
16:51:14 because that allows you to simulate 20 computes per physical machine
16:51:16 right?
16:51:35 Yes, that is right
16:51:48 otherwise, if you are actually creating VMs, you need real physical machines
16:52:28 rubasov, haleyb, njohnston: does it make sense to use a fake driver that just plugs vifs so we can trigger port wiring?
16:52:50 exactly; for the physical machine situation we use the Grid5000 testbed
16:53:32 mlavalle: it sounds like it allows us to test vif plugging at a scale we can't reach otherwise, right?
16:53:50 rubasov: exactly
16:53:50 in your case, once you validate your test on your machine with a couple of nodes, we can go to the next level with Grid5000
16:54:29 yes, I agree, I think that exercises the basics of port binding sufficiently
16:54:46 ok, I just wanted to share my thoughts. Let's explore them further
16:55:06 sure
16:55:29 in the meantime, jrbalderrama, if I can get some guidance as to how the fake driver was configured in an EnOS deployment, it would be great
16:56:25 4) Last question: how do I add osprofiler to my deployment?
16:56:26 I will come back to you once I have more elements about it
16:56:40 I found this: https://github.com/BeyondTheClouds/enos-scenarios/tree/master/osprofiler
16:57:04 but it doesn't seem complete and recent enough. Could I get some guidance over email on how to do it?
16:57:35 4. We made some progress on that at some point, but we never released it because we never used it
16:57:57 I don't mind hacking a bit
16:58:14 If I can just get some guidance, it would be great
16:58:17 No problem. We can work together on that
16:58:25 cool
16:58:34 That's all I have for today
16:58:43 anything else we should discuss today?
16:59:00 I added one more change to the topic I mentioned
16:59:10 rubasov: Great!
16:59:15 I hope it enables the nova boot vm scenario in our gate
16:59:29 Thanks for the quick turnaround :-)
16:59:36 :-)
16:59:49 ok guys, have a great week!
16:59:54 thanks for attending!
16:59:57 let's see if the zuul results also say I added it :-)
17:00:06 #endmeeting
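On question 4, here is a minimal sketch of what osprofiler instrumentation looks like from Python. It assumes the notifier/collector configuration (messaging driver, [profiler] sections in the services) is handled elsewhere, e.g. by the deployment tooling; the HMAC key and the function below are placeholders, not EnOS's actual integration. Each trace point emits paired start/stop notifications that the collector assembles into the cross-service tree (API, RPC, agent) rally can attach to its report.

    from osprofiler import profiler

    # The HMAC key must match the one configured in the services'
    # [profiler] sections; "SECRET_KEY" is a placeholder.
    profiler.init(hmac_key="SECRET_KEY")


    @profiler.trace("bind-port")
    def bind_port(neutron, port_id, host):
        # Clients that integrate with osprofiler forward trace headers,
        # so server-side trace points join the same trace as this one.
        return neutron.update_port(
            port_id, {"port": {"binding:host_id": host}})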