16:00:08 <mlavalle> #startmeeting neutron_performance
16:00:09 <openstack> Meeting started Mon Jun  3 16:00:08 2019 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:13 <openstack> The meeting name has been set to 'neutron_performance'
16:00:37 <rubasov> hello
16:00:58 <haleyb> hi
16:01:06 <jrbalderrama> hi
16:01:55 <mlavalle> I know slaweq and bcafarel won't attend today's meeting
16:02:25 <mlavalle> so I think we can get going
16:02:50 <njohnston> o/
16:03:14 <mlavalle> there he is
16:03:27 <mlavalle> #topic Updates
16:03:47 <mlavalle> rubasov: do you have an update for today?
16:03:52 <rubasov> mlavalle: yes
16:04:14 <rubasov> just an hour ago I uploaded the rally scenario for port binding I promised
16:04:19 <rubasov> here's the topic
16:04:21 <rubasov> https://review.opendev.org/#/q/topic:rally-bind
16:04:36 <rubasov> the main stuff is in the rally-openstack change
16:05:09 <rubasov> when you review it please consider if it's realistic enough for our goals
16:05:54 <rubasov> that's about it
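(For context, a minimal sketch of the shape such a scenario typically takes in rally-openstack. The class and scenario names below are illustrative, not the code under review in the rally-bind topic; it assumes the NeutronScenario helpers and binds the port by updating binding:host_id with the admin client.)

    # Illustrative sketch only -- not the code under review in the rally-bind
    # topic. It assumes rally-openstack's NeutronScenario helpers and binds the
    # port by updating binding:host_id with the admin client.
    from rally_openstack import scenario
    from rally_openstack.scenarios.neutron import utils


    @scenario.configure(name="NeutronNetworks.create_and_bind_port")
    class CreateAndBindPort(utils.NeutronScenario):

        def run(self, hypervisor_hostname):
            # each helper below is an "atomic" action, so it shows up as a
            # separate line in the rally report
            network = self._create_network({})
            self._create_subnet(network, {}, "10.2.0.0/24")
            port = self._create_port(network, {})
            # binding:host_id is admin-only; hypervisor_hostname is a task
            # argument naming an existing compute host
            self.admin_clients("neutron").update_port(
                port["port"]["id"],
                {"port": {"binding:host_id": hypervisor_hostname}})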
16:06:42 <mlavalle> at first glance, this looks good
16:07:06 <njohnston> Yeah, I have only had a chance for a quick look but it seems sound out of the gate
16:08:32 <rubasov> just leave me comments there if you think anything else should be included
16:08:53 <mlavalle> and the good news is that each one of the operations you are calling in https://review.opendev.org/#/c/662781/1/rally_openstack/scenarios/neutron/network.py is "atomic"
16:09:04 <mlavalle> so we will be able to see them in the rally report
16:09:26 <rubasov> yep they are all measured one by one
16:10:07 <mlavalle> because the alternative is to create VMs
16:10:27 <mlavalle> but in that case we lose the ability to measure our primitive operations individually
16:11:45 <mlavalle> the downside is that we are not measuring the "wiring" of the vifs in the agent
16:11:49 <rubasov> yeah that's quite different from port binding
16:12:00 <rubasov> may be interesting in itself though
16:12:17 <mlavalle> yeah, we also need to capture that
16:12:24 <mlavalle> somehow
16:12:59 <rubasov> I guess what we need is what happens after port plug in neutron, right?
16:13:11 <mlavalle> right
16:13:29 <rubasov> unfortunately that's not exposed on the api, so rally may not be the right tool for it
16:13:43 <rubasov> actually I do not really know what would be the right tool for it
16:13:59 <mlavalle> correct, what rally can measure is rest api operations
16:14:26 <mlavalle> but maybe that part of the circuit we can capture with osprofiler
16:14:45 <mlavalle> if we had a scenario that spins instances
16:15:04 <rubasov> we can measure a full vm boot and look at the osprofiler output
16:15:11 <mlavalle> correct
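(A minimal sketch of how osprofiler is typically wired up, assuming Redis as the trace store; the hmac key, connection string, image and flavor names are illustrative. Getting EnOS to lay this down is a separate question, covered under point 4 later in the meeting.)

    # in neutron.conf and nova.conf on all nodes (values are illustrative)
    [profiler]
    enabled = True
    trace_sqlalchemy = True
    hmac_keys = SECRET_KEY
    connection_string = redis://127.0.0.1:6379

    # boot a server with profiling enabled; the client prints a trace id
    openstack server create --image cirros --flavor m1.tiny --network private \
        --os-profile SECRET_KEY vm1
    # render that trace id into an HTML report
    osprofiler trace show --html --connection-string redis://127.0.0.1:6379 \
        --out trace.html <trace_id>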
16:15:26 <rubasov> I can add another scenario for that
16:15:35 <mlavalle> I still think that the scenario you proposed today is very valuable
16:15:48 <mlavalle> because it allows us to isolate that part of the circuit
16:16:06 <rubasov> this will be much easier to interpret
16:16:12 <rubasov> but it's not the full thing
16:16:19 <mlavalle> but we also need to capture the agent side
16:16:50 <mlavalle> we could just re-use one of the nova scenarios that already exist in rally-openstack
16:16:58 <mlavalle> just add it to our rally job
16:17:35 <rubasov> yep likely it's already around somewhere
16:17:42 <rubasov> I'll look around
16:18:22 <mlavalle> rubasov: https://opendev.org/openstack/rally-openstack/src/branch/master/rally_openstack/scenarios/nova
16:18:49 <mlavalle> and https://opendev.org/openstack/rally-openstack/src/branch/master/rally-jobs/nova.yaml
16:20:43 <rubasov> yep this sounds like just the one we may need: NovaServers.boot_server_and_list_interfaces
16:21:02 <mlavalle> yes
16:21:22 <mlavalle> we are only interested in the boot_server half
16:21:37 <mlavalle> and what osprofiler can tell us about it
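(A sketch of what the corresponding entry in the rally job file might look like, following the format of the nova.yaml linked above; flavor/image names, runner counts and context sizes are illustrative.)

    NovaServers.boot_server_and_list_interfaces:
      -
        args:
          flavor:
            name: "m1.tiny"
          image:
            name: "^cirros.*-disk$"
        runner:
          type: "constant"
          times: 5
          concurrency: 2
        context:
          users:
            tenants: 2
            users_per_tenant: 2
          network: {}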
16:22:34 <mlavalle> anything else rubasov today?
16:22:41 <rubasov> that's all from me
16:22:58 <mlavalle> great job! thanks very much!
16:23:19 <rubasov> no problem
16:23:34 <mlavalle> so on my side I made progress with the EnOS deployment on my big PC
16:24:18 <mlavalle> at this point, I am deploying 1 control node, 1 network node and 10 computes
16:24:40 <mlavalle> and I have about 55% memory utilization
16:25:13 <mlavalle> so I feel confident that I can scale this up with the memory I have (64GB) to about 20 computes
16:25:46 <mlavalle> and I can add another 64GB of memory, so probably I can get up to 50 or 60 computes
16:26:01 <mlavalle> but before adding more memory, I want to stress this config
16:26:13 <mlavalle> I want to make sure the CPU is not the limiting resource
16:26:21 <mlavalle> which it doesn't seem to be
16:26:36 <mlavalle> with 32 threads, CPU utilization is very low
16:27:29 <mlavalle> My next steps are to max out the current memory with 20 computes
16:27:46 <mlavalle> and then start running the scenario that rubasov just proposed
16:28:15 <mlavalle> so I have several questions for jrbalderrama
16:29:02 <jrbalderrama> Of course, I'll try my best; msimonin (the maintainer) is out of office today
16:29:17 <mlavalle> 1) Does enos install rally and rally-openstack from stable branch (stein)?
16:31:10 <jrbalderrama> 1. by default everything is taken from a stable branch; alternatively you can define the repo path and branch
16:31:51 <mlavalle> the question is relevant because I want to test with rubasov's patch
16:32:05 <mlavalle> maybe just let enos install from branch
16:32:38 <mlavalle> and take the relevant files from rubasov's patch and put them on the installed rally-openstack
16:32:58 <mlavalle> ?
16:33:13 <mlavalle> I will try that and let you know how it goes
16:33:19 <jrbalderrama> That is possible, actually in the past we had some configs with some local patches applied after install
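(One possible way to apply the under-review change on top of an installed rally-openstack, assuming patchset 1 of the change linked earlier; the gerrit ref is derived from the change number. Alternatively, the changed files can simply be copied over the installed package, as discussed above.)

    # illustrative: pull patchset 1 of change 662781 into a local checkout
    git clone https://opendev.org/openstack/rally-openstack
    cd rally-openstack
    git fetch https://review.opendev.org/openstack/rally-openstack \
        refs/changes/81/662781/1
    git cherry-pick FETCH_HEAD
    # then either pip install this tree into the environment EnOS uses for
    # rally, or copy the changed files over the installed rally_openstack
    pip install .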
16:33:33 <mlavalle> ok
16:34:20 <mlavalle> 2) I had trouble again booting up instances: http://paste.openstack.org/show/752412/
16:34:40 <mlavalle> please see lines 3 to 6
16:35:20 <mlavalle> I used the images that enos gave me, cirros-uec and debian-19
16:35:28 <mlavalle> I mean, debian-9
16:36:55 <jrbalderrama> It is strange. I cannot answer that right now. If we have access to your configuration file we can try to reproduce the behaviour here and let you know
16:37:09 <mlavalle> but still, as you can see in the nova scheduler log lines, the ImagePropertiesFilter doesn't find computes
16:37:26 <mlavalle> you mean reservation.yaml
16:37:28 <mlavalle> ?
16:38:28 <jrbalderrama> yes
16:38:44 <mlavalle> ok, I'll email to you as soon as the meeting is over
16:39:44 <jrbalderrama> OK thanks
16:41:09 <mlavalle> 3) I came across this paper https://hal.inria.fr/hal-01415522v2/document, that some members of your team wrote. If you look at page 12, the last paragraph on the left column states that "We use the 'fake driver' capability of Nova to deploy 50 nova-compute containers per physical node, thus allowing to reach 1,000 fake compute-nodes with 20 physical machines"
16:41:32 <jrbalderrama> FYI I just checked the test environment with stein I have installed on my PC and I got public and private subnets.
16:43:18 <mlavalle> so if I can deploy 20 computes on my big PC, and configure the nova fake driver, I could achieve a 1000-node scale test according to the paper
16:43:21 <mlavalle> right?
16:44:03 <jrbalderrama> The paper you mention is probably related to the presentation at the summit:  https://www.openstack.org/summit/barcelona-2016/summit-schedule/events/15977/chasing-1000-nodes-scale
16:44:13 <mlavalle> yes
16:44:35 <mlavalle> my conclusion is that there wasn't a test with 1000 real nodes
16:44:50 <mlavalle> they were simulated with the nova fake driver, right?
16:45:40 <jrbalderrama> You are right, those are not real. I can double-check with msimonin to confirm; he is one of the authors. However, lately we have been working with VMs to deploy nodes, scaling to hundreds
16:46:21 <mlavalle> so my complete question would be how do I configure that fake driver in my deployment
16:47:33 <mlavalle> from our point of view (Neutron) the fakedriver is not going to be good enough, because if there are no instances, there are not vifs to be plugged to br-int
16:47:57 <jrbalderrama> If I remember right it was more a hack than a deployment, so there are no real nodes but OpenStack does not detect that.
16:48:35 <jrbalderrama> indeed for neutron it is not the appropriate way to scale (they had different goals at that time)
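(For reference, the stock Nova fake driver is normally enabled per nova-compute service via nova.conf; how EnOS templates this into its containers is the part that still needs confirmation from msimonin.)

    # nova.conf on each (fake) nova-compute service
    [DEFAULT]
    compute_driver = fake.FakeDriver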
16:48:46 <mlavalle> but if we create another fakedriver that just plugs a vif to br-int for each fake instance, then we can simulate the entire port creation, binding, wiring circuit
16:50:01 <jrbalderrama> I guess if we can provide that driver it could work. But again, aren't VMs enough?
16:50:02 <mlavalle> in that fakedriver we can just create the vifs the same way we create them for the dhcp and L3 agents: https://github.com/openstack/neutron/blob/master/neutron/agent/linux/interface.py#L261
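(A very rough sketch of that idea only, not an existing driver: a variant of Nova's FakeDriver whose spawn() plugs one device into br-int per VIF, using neutron's own interface driver, the same one the DHCP and L3 agents use, so the OVS agent has something to wire up. The class name, spawn() signature details and the tap naming are illustrative assumptions.)

    # Very rough sketch of the idea only, not an existing driver.
    from oslo_config import cfg
    from neutron.agent.linux import interface
    from nova.virt import fake


    class VifPluggingFakeDriver(fake.FakeDriver):
        """Fake virt driver that still creates a real vif on br-int."""

        def __init__(self, virtapi):
            super().__init__(virtapi)
            self._if_driver = interface.OVSInterfaceDriver(cfg.CONF)

        def spawn(self, context, instance, image_meta, injected_files,
                  admin_password, allocations, network_info=None,
                  block_device_info=None, **kwargs):
            # keep the fake "instance" bookkeeping from the stock driver
            super().spawn(context, instance, image_meta, injected_files,
                          admin_password, allocations, network_info,
                          block_device_info, **kwargs)
            # plug one tap-like device per VIF so port binding is followed by
            # real wiring on the compute node
            for vif in network_info or []:
                self._if_driver.plug(
                    network_id=vif["network"]["id"],
                    port_id=vif["id"],
                    device_name="tap" + vif["id"][:11],
                    mac_address=vif["address"],
                    bridge="br-int")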
16:50:49 <mlavalle> My assumption is that with the fakedriver, you try to avoid the actual creation of VMs
16:51:14 <mlavalle> because that allows you to simulate 20 computes per physical machine
16:51:16 <mlavalle> right?
16:51:35 <jrbalderrama> Yes that is right
16:51:48 <mlavalle> otherwise, if you are actually creating VMs, you need real physical machines
16:52:28 <mlavalle> rubasov, haleyb, njohnston: does it make sense to use a fake driver that just plugs vifs so we can trigger port wiring?
16:52:50 <jrbalderrama> exactly, for physical machines situation we use the Grid5000 testbed
16:53:32 <rubasov> mlavalle: it sounds like it allows us to test vif plugging at a scale we can't reach otherwise, right?
16:53:50 <mlavalle> rubasov: exactly
16:53:50 <jrbalderrama> in your case, once you validate your test on your machine with a couple of nodes, we can go to the next level with Grid5000
16:54:29 <njohnston> yes, I agree, I think that exercises the basics of port binding sufficiently
16:54:46 <mlavalle> ok, I just wanted to share my thoughts. Let's explore them further
16:55:06 <jrbalderrama> sure
16:55:29 <mlavalle> in the meantime, jrbalderrama, if I can get some guidance as to how the fakedriver was configured in an EnOS deployment, it would be great
16:56:25 <mlavalle> 4) Last question: how do I add osprofiler to my deployment?
16:56:26 <jrbalderrama> I will come back to you once I have more details about it
16:56:40 <mlavalle> I found this: https://github.com/BeyondTheClouds/enos-scenarios/tree/master/osprofiler
16:57:04 <mlavalle> but it doesn't seem complete and recent enough. Could I get some guidance over email on how to do it?
16:57:35 <jrbalderrama> 4. We made some progress on that at some point but we never released it because we never used it
16:57:57 <mlavalle> I don't mind hacking a bit
16:58:14 <mlavalle> If I can just get some guidance, it would be great
16:58:17 <jrbalderrama> No problem. We can work together on that
16:58:25 <mlavalle> cool
16:58:34 <mlavalle> That's all I have for today
16:58:43 <mlavalle> anything else we should discuss today?
16:59:00 <rubasov> I added one more change to the topic I mentioned
16:59:10 <mlavalle> rubasov: Great!
16:59:15 <rubasov> I hope it enables the nova boot vm scenario in our gate
16:59:29 <mlavalle> Thanks for the quick turnaround :-)
16:59:36 <rubasov> :-)
16:59:49 <mlavalle> ok guys, have a great week!
16:59:54 <mlavalle> thanks for attending!
16:59:57 <rubasov> let's see if the zuul results also say I added it :-)
17:00:06 <mlavalle> #endmeeting