20:00:08 #startmeeting Octavia
20:00:09 Meeting started Wed Oct 18 20:00:08 2017 UTC and is due to finish in 60 minutes. The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:10 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:13 The meeting name has been set to 'octavia'
20:00:14 johnsom, yep I already told my manager that my hours will be partly US oriented
20:00:21 Hi all
20:00:24 o/
20:00:26 hi
20:00:34 #topic Announcements
20:00:35 o/
20:00:40 o/
20:00:45 hi
20:01:16 Starting off, Queens MS1 release is posted. I'm not sure when they will approve and release that due to the zuul issues, but it's posted.
20:01:22 #link https://review.openstack.org/513072
20:01:35 Also, Newton EOL coming - delayed by gate issues / zuul v3
20:02:00 Newton EOL should have been today, but I suspect it will get delayed a bit as well.
20:02:41 TC votes
20:02:42 TC elections are open for voting. If you are a foundation member and have contributed to OpenStack you should have received a ballot email.
20:02:56
20:03:04 ack.
20:03:23 checking
20:03:37 I also wanted to mention that I have started work on moving the Octavia projects over to using in-repo zuul v3 configs.
20:03:55 Side note, if you use gmail the ballots are going into the spam folder....
20:04:05 help with the email topic pls
20:04:09 PTG dates have been announced: week of 2/26 in Dublin
20:04:13 Ireland
20:04:33 it's going to be a cold PTG
20:04:45 o/
20:04:52 whiskey will keep you warm
20:04:57 The subject is "Poll: Queens TC Election"
20:05:11 xgerman_, u , sir, r a wise man :)
20:06:01 Impact from the Zuul V3 work is that for a short time (as short as I can) we will be running duplicates of some of the gates. One is the legacy- and one not. legacy- is the old auto-generated job that will go away.
20:06:36 I have started with neutron-lbaas as Octavia has nlbaas gates.
20:06:50 So far, it's going ok, but slower than I would like due to the zuul issues
20:07:34 Yep, PTG in Dublin. Makes me nervous as I know some OpenStack locals that might try to break me.
20:07:48 Any other announcements today?
20:08:04 birthdays, weddings ?
20:08:12 Ha
20:08:22 Neither here...
20:08:31 #topic Brief progress reports / bugs needing review
20:09:00 You have probably seen the patches go by, I am working on getting our zuul v3 house in order.
20:10:00 I also did some investigation into the OVH host gate failures, but ran out of creative ideas other than turning off KVM if the hostname has "-ovh-" in it. Something about some of their hosts makes kvm die.
20:10:11 Ah while I remember, Alex_Staf_ can you stay on after the meeting for a bit if you can? wanted to talk to you about tempest stuff
20:10:23 The google implies an AMD issue, but I don't know
20:10:39 Also I wanted to highlight the provider driver spec is up for review:
20:10:45 #link https://review.openstack.org/509957
20:10:48 rm_work, sure it is part of the topics
20:11:15 Please give it a browse and comment so we can take a second pass addressing comments, having discussions, etc.
20:11:36 Any other progress updates folks would like to share?
20:12:11 johnsom, added to task list to view the doc
20:12:21 Cool, thanks
20:12:36 #topic Automated tests - Structure guidelines Doc is needed
20:12:46 Alex_Staf_ you have the floor... grin
20:12:53 Ok that one's mine
20:13:19 after submitting initial patch to the octavia git , and not the plugin ( my bad )
20:13:37 I saw that I have many gaps that prevent me from going forward
20:14:01 the fact that I am junior python writer is not helping so your help and guidance will be needed
20:14:17 Not a problem, we are happy to help
20:14:18 additionally we need those guidelines to be written
20:14:28 Yeah, so we have been working on a patch for almost three months now for the new tempest testing framework for octavia
20:14:40 It's here: https://review.openstack.org/#/c/486775/
20:14:49 Any testing you write should be based on that patch
20:15:01 It provides all of the framework that should be required
20:15:12 rm_work, cool
20:15:18 Mostly that is a collaboration between JudeC, kong and myself
20:15:20 rm_work, I will look into that
20:15:28 rm_work, add me to that list :)
20:15:28 Well, there is a chain of patches: https://review.openstack.org/#/q/project:openstack/octavia-tempest-plugin+status:open
20:15:36 I recommend you pull that down and try it out, and give us your comments ;)
20:15:49 yes, so, there are a full range of tests based on this commit too
20:15:52 I saw those and commented on several
20:15:52 as johnsom points out
20:15:56 we need to rebase them
20:16:03 we have been stabilizing the first one
20:16:23 but now that it looks right, we can move on to rebasing and tweaking those other ones
20:16:28 Yeah, start with the LB patch, the others may need updates based on changes made to it
20:16:30 rm_work, will that framework use the octavia python client? (i know some tempest stuff is using tempest made clients)
20:16:31 that was our plan for the rest of this week
20:16:38 nmagnezi: no
20:17:04 nmagnezi: in fact, tempest guidelines specifically expect you to not use any client but the tempest ones
20:17:32 rm_work, i never understood the motivation for this
20:17:33 because the clients are specifically made to interact correctly with the API
20:17:35 but okay, good to know
20:17:44 and tempest wants to be able to send custom stuff for negative tests
20:17:51 and trying to break the API / hit edge cases
20:17:57 which using the clients would explicitly prevent
20:18:06 nmagnezi I was thinking the same thing, but in reading more on the tempest docs, they want to use a built in REST client so you can do more negative testing than the SDK or client would allow.
20:18:07 rm_work, tempest has functions for the Octavia API ?
20:18:22 Alex_Staf_: it does as of https://review.openstack.org/#/c/486775/ :)
20:18:25 Yeah, what he said... grin
20:19:02 the goal of tempest is not "for everything to work correctly", it's "to break everything if possible" :P
20:19:09 which means don't use known-good clients
20:19:13 These patches should be creating a tempest "service client" for octavia per: https://docs.openstack.org/tempest/latest/plugin.html#service-clients
20:19:20 yep, they do that
20:19:28 +1
20:19:41 rm_work, CRUD is not scenario testing we should move it to "API"
20:19:42 rm_work, in my mind, it would have made more sense to somehow have a way to switch off the checks in the official client and use it also for testing (but that's not for this discussion i guess)
20:20:13 Alex_Staf_: API testing is for testing the API interactions, which our CRUD actually isn't
20:20:21 yeah, a scenario test is more of a "user story" i think
20:20:21 it's testing the backend processes
20:20:27 rm_work, hmmm
20:20:29 CRUD *is* a user story
20:20:42 nmagnezi There might be some interesting philosophical questions with SDKs and clients to be had over beverages with the QA team
20:20:42 we *do have* API tests, that test the API explicitly, in the Octavia functional test suite
20:21:09 rm_work, a tricky one cus scenario should test not only creation but LB process for configuration scenario
20:21:09 the CRUD tests in Tempest are not actually testing the API
20:21:29 we know the API works from the other tests
20:21:53 these tests make sure the worker and queue and other-system integrations like nova and neutron work in reality
20:21:55 rm_work, ok I got u , so we need to add some LB verification to the creation process
20:22:04 we also have tests for traffic
20:22:23 rm_work I really like our existing API functional tests, but wonder if the release independent nature of the tempest plugin means we need "traditional" tempest API tests too.
20:22:27 but just being able to actually create, update and delete a LB is quite a lot more complex than just what the API does
20:23:10 johnsom: yeah, we discussed this before -- if we did API tests in tempest, it's probably a good idea, but we need to do them separately with a noop config
20:23:18 i'd be in favor of adding a set of those
20:23:27 rm_work, ok I can live with that even though the terminology is different from what I am familiar with.
20:23:40 Ok, I am cool with that being a future item too. We have coverage with functional API
20:23:45 yes
20:23:47 definitely future
20:24:20 where can I look at the tests for traffic ?
20:24:27 same link ?
20:24:27 Alex_Staf_: when you get more deeply familiar with how octavia works on the backend, you'll understand better why I call CRUD a scenario :P
20:24:30 yes
20:24:39 https://review.openstack.org/#/c/486775/31/octavia_tempest_plugin/tests/v2/scenario/test_basic_ops.py
20:24:42 Alex_Staf_ If you have not seen them, Octavia is a bit different than other projects in that our functional gates spin up an API server and run against that with no backend controller worker.
20:24:44 those are the traffic tests
20:24:55 right now there's one, I have a second one WIP on top of it in a different CR
20:25:24 #link https://github.com/openstack/octavia/tree/master/octavia/tests/functional/api
20:25:30 rm_work, I know actually but u added a new logic to my understanding
20:25:32 but I still need to figure out some logistics around that one (it's for testing failover, and right now it is designed specifically for ACTIVE_STANDBY, which we don't run in our gates at all (though we should!)
20:25:59 )
20:26:05 Right, Active/Standby tests have been long needed
20:26:28 Zuulv3 could make that actually possible
20:26:43 indeed. would be really great to have such tests
20:26:43 whats zuul (v3)
20:26:55 yes, though also at the moment it only actually works with the FLIP driver, since the FLIP driver updates the DB on failover, but the default upstream driver does not
20:27:04 I wanted to discuss a thought about how to make that happen, actually
20:27:15 #link https://docs.openstack.org/infra/manual/zuulv3.html
20:27:34 tnx, added to the task list
20:27:35 It's a new gate testing system for OpenStack. It went live Sunday
20:28:41 So back to the tempest plugin, Alex_Staf_ you were asking about guidelines, I pointed you to the tempest plugin docs, were there more guidelines you think we as Octavia need to create?
20:29:31 Where the CRUD functions should be written for each object, for example
20:30:07 johnsom, after u told me that there will be object_client files it is clear to me . So stuff like those
20:30:38 I think a brief explanation with example could be good .
20:31:20 johnsom, what are our goals for the tempest plugin as far as queens goes anyway?
20:31:41 I think once we have the LB patch landed it will be easier to follow the tempest plugin docs.
20:32:01 johnsom, agree
20:32:13 nmagnezi Good question, give me a second
20:32:19 np
20:32:54 For the "official" queens goal on the tempest plugin the completion criteria is here:
20:32:55 https://governance.openstack.org/tc/goals/queens/split-tempest-plugins.html#completion-criteria
20:33:48 i see "Switch gating jobs to use the new plugin project instead of the bundled one"
20:33:51 For us, I would really like to see at least basic CRUD coverage for the main objects, LB, listener, L7, etc. running and gating
20:34:11 i guess we'll need to dedicate some resources to make it happen
20:34:44 * nmagnezi glares at Alex_Staf_
20:34:45 :)
20:34:48 l7 has a zillion possible scenarios
20:34:55 So far, rm_work, JudeC, and kong have been working on that, more is always helpful!
20:35:12 o/ \o/ \o
20:35:28 yeah we haven't gotten to L7 yet lol
20:35:41 Yeah, I'm not expecting we will have the mythical "full coverage" in queens, but at least a basic CRUD set would be good.
20:35:44 rm_work, I have a matrix for that I will share
20:35:54 tests are a good way of getting people started on the project
20:36:06 xgerman_, +1
20:36:12 probably we can have you work on that one :P just use that first LB testing patch as a base
20:36:22 and look at how we were adding the other tests to it
20:36:27 like pool / listener
20:37:05 cool I will study the testing patch and follow that
20:37:22 prepare yourselves for a questions rain :)
20:37:26 Really, the only other must-do item (IMO) is get a provider driver interface in place for queens
20:37:39 yes
20:37:57 Act/Act is high-want
20:38:04 Again, IMO
20:38:28 Alex_Staf_ Don't be shy, we will all probably learn something
20:38:48 +1
20:38:59 Adam Harwell proposed openstack/octavia-tempest-plugin master: Create scenario tests for loadbalancers https://review.openstack.org/486775
20:39:11 ^^ addressed the latest round of concerns from you and kong
20:39:22 Ok, any other items on the tempest plugin work?
20:39:24 err, johnsom and kong
20:39:27 https://ibb.co/mhCuMR
20:39:48 Alex_Staf_: yeah we may actually want to do some sort of dynamic testing with that
20:40:08 basically, provide that matrix in actual programmatic matrix form, and have tempest do every permutation
20:40:19 (for an API test, at least)
20:40:41 though it'd be nice to do it with traffic as well, but that'd require more individual tests i think
20:40:42 rm_work, sounds smarter :) interesting how this happens
20:40:47 What ever happened to that DDT stuff that was started in neutron-lbaas? I know it never made it far enough to actually be a gate.
20:40:58 rm_work, my plan is to test it with traffic
20:41:06 nah, it was in the gate but testing took too long
20:41:21 it did not play nice =\
20:41:23 johnsom: basically it died i think, fnaval_ was the driver for it IIRC and he got pulled off
20:41:25 #link https://specs.openstack.org/openstack/qa-specs/specs/tempest/ddt-testing.html
20:41:27 not on my setups at least
20:41:53 hi Alex_Staf_ - were you able to find what you were looking for earlier?
20:41:54 well, it was testing a ton of combinations and that took long -- so let’s focus on the happy case
20:42:26 if we reuse a LB for that, it should be OK
20:42:26 rm_work, looks like he's back :)
20:42:31 rm_work: ack
20:42:31 fnaval, I am getting there . Have a big patch to read :)
20:42:39 rm_work: will review and test today
20:42:42 cool! =)
20:42:49 rm_work -- reusing stuff is always questionable in tests
20:43:32 xgerman_: yeah but in this case, we can have one that tries to test all permutations
20:43:42 xgerman_, it is better to recreate for testing
20:43:43 since the issue will really be "does the traffic pass correctly", not "does the LB error"
20:44:07 we have other tests that we can run individually to make sure the LB correctly accepts the config
20:44:23 I really liked that it did/does permutations so that it can find the edge cases. I think tagging happy test cases and running them would be better (but leaving in the ddt stuff).
20:44:33 ok, I am just wary since if we find something odd it might be tough to reproduce/fix
20:45:04 Yes and no, I think we should look at optimizing the tests around a single LB where it makes sense. We may be able to spin up more compute hosts with zuulv3 which would help with the test time by parallelizing more.
20:46:41 LB create is high overhead and has a time penalty.
20:47:35 Are we ready to move on to Open Discussion or is there more to discuss here?
20:47:57 #topic Open Discussion
20:48:08 Ok, other topics today?
20:48:09 Maintenance API / Amphora-AZ
20:48:35 I've had a spec stewing for an API around doing maintenance on AZ/HV ...
20:48:42 but I think I've been convinced we don't really need it
20:48:58 #link https://review.openstack.org/509933
20:49:00 so long as I can get agreement to merge one of the two patches I've got up for returning Amphora AZ data
20:50:04 I lean towards the query nova approach, but I need to circle back on those. I have been distracted with all of these gate issues.
20:50:04 The first one stores the data locally in our DB at create time, so it can be very quickly searched/filtered on:
20:50:06 #link https://review.openstack.org/510225
20:50:33 The second approach queries nova along with every request, and returns responses from that data:
20:50:35 #link https://review.openstack.org/511045
20:50:57 Every request to this admin API for amphora details....
20:51:06 yes, good distinction
20:51:07 Not every API request
20:51:08 not like, EVERY call :P
20:51:27 It will be guaranteed accurate, while technically the first approach *could* drift, if people are doing live-migrates and such, though I don't think that's especially likely
20:52:13 I'm generally more in favor of the first approach (we store it in the DB) because I think the risk/reward tradeoff is clear for me, but possibly with some deployers it might skew the other way
20:52:56 I feel like the second one (query nova) is less contentious... less efficient in the 95% case, but more accurate in the 5% case
20:53:14 and since we get to always go with the LCD here...
20:53:26 My guess is that's the one people will generally vote for ;)
20:53:26 for the first approach is there an option to re-sync?
20:53:34 I didn't have one, though we could do that
20:53:36 if it happens to get out of sync since create
20:53:45 just adds complexity
20:53:55 I'm sure housekeeping could do it as a periodic
20:54:19 I lean more towards the query nova approach for two reasons. 1. Single point of truth in nova. 2. Platform folks may have maintenance procedures for the hosts/hypervisors that just live-migrate whatever instances are running for maintenance, so we would have inaccurate info in the DB.
20:54:20 let’s not overthink this -- let’s go with nova and if we see issues we can add on
20:56:16 True, it's probably easier to change to storing than to change the other way (no DB contraction for example)
20:56:22 yep
20:56:40 so ok, please review https://review.openstack.org/511045 ?
20:56:40 yea that makes sense
20:56:57 Ok, so please everyone, review the specs rm_work listed and give your input.
20:57:09 #link https://review.openstack.org/511045
20:57:41 As I said, I think the maintenance API spec is dead -- planning to abandon it assuming the Amp-az patch merges, and no one else expresses interest
20:57:42 We have a few more minutes, any other topics?
20:57:55 I might put up another spec for an AZ-evacuate call to do pre-failure though
20:58:39 Ok folks, thanks for the strong turn out today!
20:58:59 #endmeeting
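
Note on the 20:40:08 suggestion: expressing the L7 test matrix in programmatic form and walking every permutation, with a tagged "happy case" subset as proposed at 20:44:23, could look roughly like the sketch below. The axis values are illustrative assumptions and are not the matrix that was shared in the meeting; `l7_rule_permutations` and `happy_path` are hypothetical helper names, not part of tempest or the octavia-tempest-plugin.

```python
import itertools

# Illustrative axes only (assumption): not the matrix linked in the meeting.
RULE_TYPES = ("HOST_NAME", "PATH", "HEADER")
COMPARE_TYPES = ("EQUAL_TO", "STARTS_WITH", "REGEX")
INVERT = (False, True)


def l7_rule_permutations():
    """Yield one rule description per combination of the axes above."""
    for rule_type, compare_type, invert in itertools.product(
            RULE_TYPES, COMPARE_TYPES, INVERT):
        yield {
            "type": rule_type,
            "compare_type": compare_type,
            "invert": invert,
        }


def happy_path(cases):
    """Hypothetical 'happy case' tag: non-inverted exact matches only."""
    return [
        case for case in cases
        if case["compare_type"] == "EQUAL_TO" and not case["invert"]
    ]


all_cases = list(l7_rule_permutations())
smoke_cases = happy_path(all_cases)
print(len(all_cases), len(smoke_cases))
```

Running only `smoke_cases` on every change and the full `all_cases` list less often is one possible way to keep the permutation coverage while bounding test time, which was the concern raised about the earlier DDT jobs.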