#openstack-lbaas log

20:00:08 <johnsom> #startmeeting Octavia
20:00:09 <openstack> Meeting started Wed Oct 18 20:00:08 2017 UTC and is due to finish in 60 minutes.  The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:13 <openstack> The meeting name has been set to 'octavia'
20:00:14 <Alex_Staf_> johnsom, yep I already told my manager that my hours will be partly US oriented
20:00:21 <johnsom> Hi all
20:00:24 <xgerman_> o/
20:00:26 <longstaf_> hi
20:00:34 <johnsom> #topic Announcements
20:00:35 <nmagnezi> o/
20:00:40 <Alex_Staf_> o/
20:00:45 <jniesz> hi
20:01:16 <johnsom> Starting off, Queens MS1 release is posted.  I'm not sure when they will approve and release that due to the zuul issues, but it's posted.
20:01:22 <johnsom> #link https://review.openstack.org/513072
20:01:35 <johnsom> Also, Newton EOL coming - delayed by gate issues / zuul v3
20:02:00 <johnsom> Newton EOL should have been today, but I suspect it will get delayed a bit as well.
20:02:41 <xgerman_> TC votes
20:02:42 <johnsom> TC elections are open for voting.  If you are a foundation member and have contributed to OpenStack you should have received a ballot email.
20:02:56 <johnsom> <It's on the agenda...>
20:03:04 <nmagnezi> ack.
20:03:23 <Alex_Staf_> checking
20:03:37 <johnsom> I also wanted to mention that I have started work on moving the Octavia projects over to using in-repo zuul v3 configs.
20:03:55 <johnsom> Side note, if you use gmail the ballots are going into the spam folder....
20:04:05 <Alex_Staf_> help with the email topic pls
20:04:09 <xgerman_> PTG dates have been announced: week of 2/26 in Dublin
20:04:13 <xgerman_> Ireland
20:04:33 <Alex_Staf_> it's going to be a cold PTG
20:04:45 <rm_work> o/
20:04:52 <xgerman_> whiskey will keep you warm
20:04:57 <johnsom> The subject is "Poll: Queens TC Election"
20:05:11 <Alex_Staf_> xgerman_, u , sir, r a wise man :)
20:06:01 <johnsom> The impact of the Zuul v3 work is that for a short time (as short as I can make it) we will be running duplicates of some of the gates: one with the legacy- prefix and one without.  The legacy- jobs are the old auto-generated jobs that will go away.
20:06:36 <johnsom> I have started with neutron-lbaas as Octavia has nlbaas gates.
20:06:50 <johnsom> So far, it's going ok, but slower than I would like due to the zuul issues
20:07:34 <johnsom> Yep, PTG in Dublin.  Makes me nervous as I know some OpenStack locals that might try to break me.
20:07:48 <johnsom> Any other announcements today?
20:08:04 <xgerman_> birthdays, weddings ?
20:08:12 <johnsom> Ha
20:08:22 <johnsom> Neither here...
20:08:31 <johnsom> #topic Brief progress reports / bugs needing review
20:09:00 <johnsom> You have probably seen the patches go by, I am working on getting our zuul v3 house in order.
20:10:00 <johnsom> I also did some investigation into the OVH host gate failures, but ran out of creative ideas other than turning off KVM if the hostname has "-ovh-" in it.  Something about some of their hosts makes kvm die.
20:10:11 <rm_work> Ah while I remember, Alex_Staf_ can you stay on after the meeting for a bit if you can? wanted to talk to you about tempest stuff
20:10:23 <johnsom> The google implies an AMD issue, but I don't know
20:10:39 <johnsom> Also I wanted to highlight the provider driver spec is up for review:
20:10:45 <johnsom> #link https://review.openstack.org/509957
20:10:48 <Alex_Staf_> rm_work, sure it is part of the topics
20:11:15 <johnsom> Please give it a browse and comment so we can take a second pass addressing comments, having discussions, etc.
20:11:36 <johnsom> Any other progress updates folks would like to share?
20:12:11 <Alex_Staf_> johnsom, added to task list to view the doc
20:12:21 <johnsom> Cool, thanks
20:12:36 <johnsom> #topic Automated tests - Structure guidelines Doc is needed
20:12:46 <johnsom> Alex_Staf_ you have the floor...  grin
20:12:53 <Alex_Staf_> Ok that one's mine
20:13:19 <Alex_Staf_> after submitting the initial patch to the octavia git, and not the plugin (my bad)
20:13:37 <Alex_Staf_> I saw that I have many gaps that prevent me from going forward
20:14:01 <Alex_Staf_> the fact that I am junior python writer is not helping so your help and guidance will be needed
20:14:17 <johnsom> Not a problem, we are happy to help
20:14:18 <Alex_Staf_> additionally we need those guidelines to be written
20:14:28 <rm_work> Yeah, so we have been working on a patch for almost three months now for the new tempest testing framework for octavia
20:14:40 <rm_work> It's here: https://review.openstack.org/#/c/486775/
20:14:49 <rm_work> Any testing you write should be based on that patch
20:15:01 <rm_work> It provides all of the framework that should be required
20:15:12 <Alex_Staf_> rm_work, cool
20:15:18 <rm_work> Mostly that is a collaboration between JudeC, kong and myself
20:15:20 <Alex_Staf_> rm_work, I will look into that
20:15:28 <Alex_Staf_> rm_work, add me to that list :)
20:15:28 <johnsom> Well, there is a chain of patches: https://review.openstack.org/#/q/project:openstack/octavia-tempest-plugin+status:open
20:15:36 <rm_work> I recommend you pull that down and try it out, and give us your comments ;)
20:15:49 <rm_work> yes, so, there is a full range of tests based on this commit too
20:15:52 <Alex_Staf_> I saw those and commented on several
20:15:52 <rm_work> as johnsom points out
20:15:56 <rm_work> we need to rebase them
20:16:03 <rm_work> we have been stabilizing the first one
20:16:23 <rm_work> but now that it looks right, we can move on to rebasing and tweaking those other ones
20:16:28 <johnsom> Yeah, start with the LB patch, the others may need updates based on changes made to it
20:16:30 <nmagnezi> rm_work, will that framework use the octavia python client? (i know some tempest stuff is using tempest made clients)
20:16:31 <rm_work> that was our plan for the rest of this week
20:16:38 <rm_work> nmagnezi: no
20:17:04 <rm_work> nmagnezi: in fact, tempest guidelines specifically expect you to not use any client but the tempest ones
20:17:32 <nmagnezi> rm_work, i never understood the motivation for this
20:17:33 <rm_work> because the clients are specifically made to interact correctly with the API
20:17:35 <nmagnezi> but okay, good to know
20:17:44 <rm_work> and tempest wants to be able to send custom stuff for negative tests
20:17:51 <rm_work> and trying to break the API / hit edge cases
20:17:57 <rm_work> which using the clients would explicitly prevent
20:18:06 <johnsom> nmagnezi I was thinking the same thing, but in reading more on the tempest docs, they want to use a built in REST client so you can do more negative testing than the SDK or client would allow.
20:18:07 <Alex_Staf_> rm_work, tempest has functions for the Octavia API ?
20:18:22 <rm_work> Alex_Staf_: it does as of https://review.openstack.org/#/c/486775/ :)
20:18:25 <johnsom> Yeah, what he said... grin
20:19:02 <rm_work> the goal of tempest is not "for everything to work correctly", it's "to break everything if possible" :P
20:19:09 <rm_work> which means don't use known-good clients
20:19:13 <johnsom> These patches should be creating a tempest "service client" for octavia per: https://docs.openstack.org/tempest/latest/plugin.html#service-clients
20:19:20 <rm_work> yep, they do that
20:19:28 <johnsom> +1
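A minimal sketch of the point made above about why tempest prefers a thin REST client for negative tests. Every name here (fake_api, ValidatingClient, RawRestClient) is a hypothetical stand-in and is not taken from tempest, the octavia-tempest-plugin patch, or the Octavia python client. The assumption is only that an SDK-style client validates arguments before sending, so a malformed request never reaches the API under test.

```python
# Hypothetical stand-ins only; none of these names come from tempest or
# from the Octavia client libraries.


def fake_api(body):
    """Pretend API server: return an HTTP-like status for a create request."""
    lb = body.get("loadbalancer", {})
    if not isinstance(lb.get("vip_subnet_id"), str):
        return 400  # the server itself is expected to reject bad input
    return 201


class ValidatingClient:
    """Mimics an SDK-style client that checks arguments before sending."""

    def create_loadbalancer(self, vip_subnet_id):
        if not isinstance(vip_subnet_id, str):
            # The request is stopped here, so the API is never exercised.
            raise TypeError("vip_subnet_id must be a string")
        return fake_api({"loadbalancer": {"vip_subnet_id": vip_subnet_id}})


class RawRestClient:
    """Mimics a thin REST client that sends whatever body the test builds."""

    def post(self, body):
        return fake_api(body)


def negative_test_reaches_server():
    """True if the malformed request reached the API and was rejected there."""
    status = RawRestClient().post({"loadbalancer": {"vip_subnet_id": 12345}})
    return status == 400


def validating_client_blocks_request():
    """True if the SDK-style client refused to send the malformed request."""
    try:
        ValidatingClient().create_loadbalancer(12345)
    except TypeError:
        return True
    return False
```

With the raw client the test observes the API's own 400 response; with the validating client the test only ever observes the client's exception.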
20:19:41 <Alex_Staf_> rm_work, CRUD is not scenario testing we should move it to "API"
20:19:42 <nmagnezi> rm_work, in my mind, it would have made more sense to somehow have a way to switch off the checks in the official client and use it also for testing (but that's not for this discussion i guess)
20:20:13 <rm_work> Alex_Staf_: API testing is for testing the API interactions, which our CRUD actually isn't
20:20:21 <nmagnezi> yeah, a scenario test is more of a "user story" i think
20:20:21 <rm_work> it's testing the backend processes
20:20:27 <Alex_Staf_> rm_work, hmmm
20:20:29 <rm_work> CRUD *is* a user story
20:20:42 <johnsom> nmagnezi There might be some interesting philosophical questions with SDKs and clients to be had over beverages with the QA team
20:20:42 <rm_work> we *do have* API tests, that test the API explicitly, in the Octavia functional test suite
20:21:09 <Alex_Staf_> rm_work,  a tricky one cus scenario should test not only creation but LB process for configuration scenario
20:21:09 <rm_work> the CRUD tests in Tempest are not actually testing the API
20:21:29 <rm_work> we know the API works from the other tests
20:21:53 <rm_work> these tests make sure the worker and queue and other-system integrations like nova and neutron work in reality
20:21:55 <Alex_Staf_> rm_work, ok I got u , so we need to add some LB verification to the creation process
20:22:04 <rm_work> we also have tests for traffic
20:22:23 <johnsom> rm_work I really like our existing API functional tests, but wonder if the release independent nature of the tempest plugin means we need "traditional" tempest API tests too.
20:22:27 <rm_work> but just being able to actually create, update and delete a LB is quite a lot more complex than just what the API does
20:23:10 <rm_work> johnsom: yeah, we discussed this before -- if we did API tests in tempest, it's probably a good idea, but we need to do them separately with a noop config
20:23:18 <rm_work> i'd be in favor of adding a set of those
20:23:27 <Alex_Staf_> rm_work,  ok I can live with that even though the terminology is different from what I am familiar with.
20:23:40 <johnsom> Ok, I am cool with that being a future item too.  We have coverage with functional API
20:23:45 <rm_work> yes
20:23:47 <rm_work> definitely future
20:24:20 <Alex_Staf_> where can I look at the tests for traffic ?
20:24:27 <Alex_Staf_> same link ?
20:24:27 <rm_work> Alex_Staf_: when you get more deeply familiar with how octavia works on the backend, you'll understand better why I call CRUD a scenario :P
20:24:30 <rm_work> yes
20:24:39 <rm_work> https://review.openstack.org/#/c/486775/31/octavia_tempest_plugin/tests/v2/scenario/test_basic_ops.py
20:24:42 <johnsom> Alex_Staf_ If you have not seen them, Octavia is a bit different than other projects in that our functional gates spin up an API server and run against that with no backend controller worker.
20:24:44 <rm_work> those are the traffic tests
20:24:55 <rm_work> right now there's one, I have a second one WIP on top of it in a different CR
20:25:24 <johnsom> #link https://github.com/openstack/octavia/tree/master/octavia/tests/functional/api
20:25:30 <Alex_Staf_> rm_work, I know actually but u added a new logic to my understanding
20:25:32 <rm_work> but I still need to figure out some logistics around that one (it's for testing failover, and right now it is designed specifically for ACTIVE_STANDBY, which we don't run in our gates at all (though we should!)
20:25:59 <rm_work> )
20:26:05 <johnsom> Right, Active/Standby tests have been long needed
20:26:28 <johnsom> Zuulv3 could make that actually possible
20:26:43 <nmagnezi> indeed. would be really great to have such  tests
20:26:43 <Alex_Staf_> whats zuul (v3)
20:26:55 <rm_work> yes, though also at the moment it only actually works with the FLIP driver, since the FLIP driver updates the DB on failover, but the default upstream driver does not
20:27:04 <rm_work> I wanted to discuss a thought about how to make that happen, actually
20:27:15 <johnsom> #link https://docs.openstack.org/infra/manual/zuulv3.html
20:27:34 <Alex_Staf_> tnx, added to the task list
20:27:35 <johnsom> It's a new gate testing system for OpenStack.  It went live Sunday
20:28:41 <johnsom> So back to the tempest plugin, Alex_Staf_ you were asking about guidelines, I pointed you to the tempest plugin docs, were there more guidelines you think we as Octavia need to create?
20:29:31 <Alex_Staf_> Where the CRUD functions should be written for each object, for example
20:30:07 <Alex_Staf_> johnsom, after u told me that there will be object_client files it is clear to me . So stuff like those
20:30:38 <Alex_Staf_> I think a brief explanation with example could be good .
20:31:20 <nmagnezi> johnsom, what are our goals for the tempest plugin as far as queens goes anyway?
20:31:41 <johnsom> I think once we have the LB patch landed it will be easier to follow the tempest plugin docs.
20:32:01 <Alex_Staf_> johnsom, agree
20:32:13 <johnsom> nmagnezi Good question, give me a second
20:32:19 <nmagnezi> np
20:32:54 <johnsom> For the "official" queens goal on the tempest plugin the completion criteria is here:
20:32:55 <johnsom> https://governance.openstack.org/tc/goals/queens/split-tempest-plugins.html#completion-criteria
20:33:48 <nmagnezi> i see "Switch gating jobs to use the new plugin project instead of the bundled one"
20:33:51 <johnsom> For us, I would really like to see at least basic CRUD coverage for the main objects, LB, listener, L7, etc. running and gating
20:34:11 <nmagnezi> i guess we'll need to dedicate some resources to make it happen
20:34:44 * nmagnezi glares at Alex_Staf_
20:34:45 <nmagnezi> :)
20:34:48 <Alex_Staf_> L7 has a zillion possible scenarios
20:34:55 <johnsom> So far, rm_work, JudeC, and kong have been working on that, more is always helpful!
20:35:12 <Alex_Staf_> o/   \o/   \o
20:35:28 <rm_work> yeah we haven't gotten to L7 yet lol
20:35:41 <johnsom> Yeah, I'm not expecting we will have the mythical "full coverage" in queens, but at least a basic CRUD set would be good.
20:35:44 <Alex_Staf_> rm_work, I have a matrix for that I will share
20:35:54 <xgerman_> tests are a good way of getting people started on the project
20:36:06 <nmagnezi> xgerman_, +1
20:36:12 <rm_work> probably we can have you work on that one :P just use that first LB testing patch as a base
20:36:22 <rm_work> and look at how we were adding the other tests to it
20:36:27 <rm_work> like pool / listener
20:37:05 <Alex_Staf_> cool I will study the testing patch and follow that
20:37:22 <Alex_Staf_> prepare yourselves for a rain of questions :)
20:37:26 <johnsom> Really, the only other must-do item (IMO) is get a provider driver interface in place for queens
20:37:39 <rm_work> yes
20:37:57 <johnsom> Act/Act is high-want
20:38:04 <johnsom> Again, IMO
20:38:28 <johnsom> Alex_Staf_ Don't be shy, we will all probably learn something
20:38:48 <xgerman_> +1
20:38:59 <openstackgerrit> Adam Harwell proposed openstack/octavia-tempest-plugin master: Create scenario tests for loadbalancers  https://review.openstack.org/486775
20:39:11 <rm_work> ^^ addressed the latest round of concerns from you and kong
20:39:22 <johnsom> Ok, any other items on the tempest plugin work?
20:39:24 <rm_work> err, johnsom and kong
20:39:27 <Alex_Staf_> https://ibb.co/mhCuMR
20:39:48 <rm_work> Alex_Staf_: yeah we may actually want to do some sort of dynamic testing with that
20:40:08 <rm_work> basically, provide that matrix in actual programmatic matrix form, and have tempest do every permutation
20:40:19 <rm_work> (for an API test, at least)
20:40:41 <rm_work> though it'd be nice to do it with traffic as well, but that'd require more individual tests i think
20:40:42 <Alex_Staf_> rm_work, sounds smarter :) interesting how this happens
20:40:47 <johnsom> What ever happened to that DDT stuff that was started in neutron-lbaas?  I know it never made it far enough to actually be a gate.
20:40:58 <Alex_Staf_> rm_work, my plan is to test it with traffic
20:41:06 <xgerman_> nah, it was in the gate but testing took too long
20:41:21 <Alex_Staf_> it did not play nice =\
20:41:23 <rm_work> johnsom: basically it died i think, fnaval_ was the driver for it IIRC and he got pulled off
20:41:25 <johnsom> #link https://specs.openstack.org/openstack/qa-specs/specs/tempest/ddt-testing.html
20:41:27 <Alex_Staf_> not on my setups at least
20:41:53 <fnaval_> hi Alex_Staf_ - were you able to find what you were looking for earlier?
20:41:54 <xgerman_> well, it was testing a ton of combinations and that took long, so let’s focus on the happy case
20:42:26 <rm_work> if we reuse a LB for that, it should be OK
20:42:26 <nmagnezi> rm_work, looks like he's back :)
20:42:31 <kong> rm_work: ack
20:42:31 <Alex_Staf_> fnaval, I am getting there . Have a big patch to read :)
20:42:39 <kong> rm_work: will review and test today
20:42:42 <fnaval> cool! =)
20:42:49 <xgerman_> rm_work — reusing stuff is always questionable in tests
20:43:32 <rm_work> xgerman_: yeah but in this case, we can have one that tries to test all permutations
20:43:42 <Alex_Staf_> xgerman_, it is better to recreate for testing
20:43:43 <rm_work> since the issue will really be "does the traffic pass correctly", not "does the LB error"
20:44:07 <rm_work> we have other tests that we can run individually to make sure the LB correctly accepts the config
20:44:23 <fnaval> I really liked that it did/does permutations so that it can find the edge cases.   I think tagging happy test cases and running them would be better (but leaving in the ddt stuff).
20:44:33 <xgerman_> ok, I am just wary since if we find something odd it might be tough to reproduce/fix
20:45:04 <johnsom> Yes and no, I think we should look at optimizing the tests around a single LB where it makes sense.  We may be able to spin up more compute hosts with zuulv3 which would help with the test time by parallelizing more.
20:46:41 <johnsom> LB create is high overhead and has a time penalty.
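A minimal sketch of the "matrix in programmatic form" idea discussed above, assuming a small illustrative set of L7 rule fields. The value lists below are examples chosen for the sketch and are not the matrix Alex_Staf_ shared, and check_rule is a hypothetical callback standing in for whatever a real test would do against a single, reused load balancer.

```python
import itertools

# Illustrative values only; a real matrix would come from the L7 spec.
RULE_TYPES = ["HOST_NAME", "PATH", "HEADER"]
COMPARE_TYPES = ["EQUAL_TO", "STARTS_WITH", "CONTAINS"]
INVERT = [False, True]


def l7_rule_permutations():
    """Yield one rule definition per combination of the three axes."""
    for rule_type, compare_type, invert in itertools.product(
            RULE_TYPES, COMPARE_TYPES, INVERT):
        yield {
            "type": rule_type,
            "compare_type": compare_type,
            "invert": invert,
        }


def run_matrix(check_rule):
    """Run check_rule on every permutation and collect the ones that fail.

    Collecting failures (instead of stopping at the first) keeps the list of
    failing permutations visible, which helps when a reused load balancer
    makes a failure harder to reproduce in isolation.
    """
    failures = []
    for rule in l7_rule_permutations():
        if not check_rule(rule):
            failures.append(rule)
    return failures
```

Because the expensive load balancer create happens once outside run_matrix, the cost per permutation is only the rule update and the traffic check, which is the trade-off johnsom and xgerman_ weigh above.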
20:47:35 <johnsom> Are we ready to move on to Open Discussion or is there more to discuss here?
20:47:57 <johnsom> #topic Open Discussion
20:48:08 <johnsom> Ok, other topics today?
20:48:09 <rm_work> Maintenance API / Amphora-AZ
20:48:35 <rm_work> I've had a spec stewing for an API around doing maintenance on AZ/HV ...
20:48:42 <rm_work> but I think I've been convinced we don't really need it
20:48:58 <johnsom> #link https://review.openstack.org/509933
20:49:00 <rm_work> so long as I can get agreement to merge one of the two patches I've got up for returning Amphora AZ data
20:50:04 <johnsom> I lean towards the query nova approach, but I need to circle back on those. I have been distracted with all of these gate issues.
20:50:04 <rm_work> The first one stores the data locally in our DB at create time, so it can be very quickly searched/filtered on:
20:50:06 <rm_work> #link https://review.openstack.org/510225
20:50:33 <rm_work> The second approach queries nova along with every request, and returns responses from that data:
20:50:35 <rm_work> #link https://review.openstack.org/511045
20:50:57 <johnsom> Every request to this admin API for amphora details....
20:51:06 <rm_work> yes, good distinction
20:51:07 <johnsom> Not every API request
20:51:08 <rm_work> not like, EVERY call :P
20:51:27 <rm_work> It will be guaranteed accurate, while technically the first approach *could* drift, if people are doing live-migrates and such, though I don't think that's especially likely
20:52:13 <rm_work> I'm generally more in favor of the first approach (we store it in the DB) because I think the risk/reward tradeoff is clear for me, but possibly with some deployers it might skew the other way
20:52:56 <rm_work> I feel like the second one (query nova) is less contentious... less efficient in the 95% case, but more accurate in the 5% case
20:53:14 <rm_work> and since we get to always go with the LCD here...
20:53:26 <rm_work> My guess is that's the one people will generally vote for ;)
20:53:26 <jniesz> for the first approach is there an option to re-sync?
20:53:34 <rm_work> I didn't have one, though we could do that
20:53:36 <jniesz> if it happens to get out of sync since create
20:53:45 <rm_work> just adds complexity
20:53:55 <rm_work> I'm sure housekeeping could do it as a periodic
20:54:19 <johnsom> I lean more towards the query nova approach for two reasons. 1. Single point of truth in nova. 2. Platform folks may have maintenance procedures for the hosts/hypervisors that just live-migrate whatever instances are running for maintenance, so we would have inaccurate info in the DB.
20:54:20 <xgerman_> let’s not overthink this — let’s go with nova and if we see issues we can add on
20:56:16 <johnsom> True, it's probably easier to change to storing than to change the other way (no DB contraction for example)
20:56:22 <rm_work> yep
20:56:40 <rm_work> so ok, please review https://review.openstack.org/511045 ?
20:56:40 <jniesz> yea that makes sense
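A minimal sketch of the two approaches compared above, assuming an in-memory stand-in for nova. FakeNova, StoredAzLookup, and QueryAzLookup are hypothetical names for illustration and do not reflect the code in either patch; resync() models the periodic housekeeping idea raised by jniesz and rm_work.

```python
class FakeNova:
    """Stand-in for the compute service: maps instance id to availability zone."""

    def __init__(self):
        self.zones = {}

    def get_az(self, compute_id):
        return self.zones[compute_id]

    def live_migrate(self, compute_id, new_az):
        self.zones[compute_id] = new_az


class StoredAzLookup:
    """Approach 1: copy the AZ into a local table at amphora create time."""

    def __init__(self, nova):
        self.nova = nova
        self.db = {}

    def on_amphora_create(self, compute_id):
        self.db[compute_id] = self.nova.get_az(compute_id)

    def get_az(self, compute_id):
        # Fast and filterable, but only as fresh as the last write.
        return self.db[compute_id]

    def resync(self):
        """Optional periodic refresh that corrects any drift."""
        for compute_id in self.db:
            self.db[compute_id] = self.nova.get_az(compute_id)


class QueryAzLookup:
    """Approach 2: ask the compute service on each admin API request."""

    def __init__(self, nova):
        self.nova = nova

    def get_az(self, compute_id):
        return self.nova.get_az(compute_id)


def demo():
    nova = FakeNova()
    nova.zones["amp-1"] = "az-a"
    stored = StoredAzLookup(nova)
    stored.on_amphora_create("amp-1")
    queried = QueryAzLookup(nova)

    # An operator live-migrates the instance outside of Octavia's control.
    nova.live_migrate("amp-1", "az-b")
    before = (stored.get_az("amp-1"), queried.get_az("amp-1"))

    stored.resync()
    after = stored.get_az("amp-1")
    return before, after
```

In this sketch the stored value stays at the old zone after the migration until resync() runs, while the query approach follows nova immediately at the cost of one compute lookup per request.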
20:56:57 <johnsom> Ok, so  please everyone, review the specs rm_work listed and give your input.
20:57:09 <nmagnezi> #link https://review.openstack.org/511045
20:57:41 <rm_work> As I said, I think the maintenance API spec is dead -- planning to abandon it assuming the Amp-az patch merges, and no one else expresses interest
20:57:42 <johnsom> We have a few more minutes, any other topics?
20:57:55 <rm_work> I might put up another spec for an AZ-evacuate call to do pre-failure though
20:58:39 <johnsom> Ok folks, thanks for the strong turn out today!
20:58:59 <johnsom> #endmeeting