16:00:11 <johnsom> #startmeeting Octavia
16:00:12 <openstack> Meeting started Wed Aug 19 16:00:11 2020 UTC and is due to finish in 60 minutes.  The chair is johnsom. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:15 <openstack> The meeting name has been set to 'octavia'
16:00:24 <ataraday_> hi
16:00:31 <aannuusshhkkaa> holla!
16:00:38 <johnsom> Hi everyone! Thanks for joining. We have a full agenda today.
16:00:38 <haleyb> hi
16:00:42 <cgoncalves> hi
16:00:43 <gthiemonge> hi
16:01:10 <johnsom> #topic Announcements
16:01:38 <johnsom> Just a quick reminder, registration is open for both the Summit and PTG. Please register as it is free and virtual!
16:01:46 <johnsom> #link https://openinfrasummit2020.eventbrite.com/
16:01:54 <johnsom> #link https://october2020ptg.eventbrite.com/
16:02:12 <johnsom> There are at least three Octavia presentations proposed for the Summit.
16:02:26 <johnsom> One is on new features, one on the A10 driver, and one on the F5 driver
16:02:38 <johnsom> I don't know if they got accepted yet, but good to see
16:03:09 <johnsom> Also I want to highlight some upcoming milestone dates. Feature freeze is coming up soon!
16:03:13 <johnsom> #link https://releases.openstack.org/victoria/schedule.html
16:03:34 <johnsom> Final octavia-lib release is the week of August 31st (not feature freeze, the final release)
16:03:51 <johnsom> Final python-octaviaclient release is the week of September 7th (not feature freeze, the final release)
16:04:02 <johnsom> All other repositories go into feature freeze the week of September 7th
16:04:22 <johnsom> So, if you have feature work you want to get into Victoria, time is running out.
16:04:45 <johnsom> I plan to set up a section in the priority review list this week to highlight feature work that has an upcoming deadline
16:05:10 <johnsom> Any questions/comments about feature freeze?
16:05:51 <johnsom> Any other announcements this week?
16:06:16 <johnsom> #topic Brief progress reports / bugs needing review
16:06:39 <ataraday_> Mostly doing reviews and updating some of the amphorav2/default ciphers changes.
16:07:00 <ataraday_> I would like to raise some amphorav2 topics, but I will postpone them until Open Discussion.
16:07:00 <johnsom> I have been working on an internal project a bit over the last week. Also continuing on the tempest updates, reviews, and reviving some of my older patches that are up for review.
16:08:11 <cgoncalves> reviews, ALPN and HTTP/2 work all around (octavia, client, SDK, tempest), fixing the gate
16:08:27 <aannuusshhkkaa> We are awaiting reviews on #link https://review.opendev.org/#/c/737111/ the amphora driver refactor patch
16:08:28 <johnsom> I need to wrap up creating this internal training stuff then I will focus on the priority reviews list.
16:08:57 <aannuusshhkkaa> and on #link https://review.opendev.org/#/c/742294/ the new metrics (response time) patch
16:09:49 <johnsom> Cool, any other updates?
16:10:06 <aannuusshhkkaa> we have also written the driver to integrate InfluxDB #link https://review.opendev.org/#/c/746822/ and would be able to merge that on top of these other patches.
16:10:56 <johnsom> #topic vPTG timeslots
16:11:19 <johnsom> So virtual PTG planning has started. I have signed us up to have a session again.
16:11:46 <johnsom> I have tentatively selected the same days and timeslot for our sessions. Tue Oct 27th and 28th 13:00-17:00 UTC
16:12:05 <johnsom> I wanted to ask if that is good for everyone or if we should do another doodle poll?
16:12:24 <ataraday_> works fine for me
16:12:53 <gthiemonge> same here
16:13:31 <cgoncalves> works for me but in the spirit of fairness we could change to another time to be nice to our US Western folks
16:13:38 <rm_work> my schedule adjusts to the things I need to do, so fine by me
16:14:03 <johnsom> Ah, I'm an odd person out, so I will be flexible and order extra coffee
16:14:32 <johnsom> Ok, anyone else that this time slot doesn't work for?
16:14:44 <johnsom> That seems like a pretty good quorum.
16:15:24 <johnsom> You know it is customary for the newest core members to run the PTG right?
16:15:33 <johnsom> Just kidding. grin
16:15:46 <gthiemonge> lol
16:16:16 <johnsom> Ok, unless I hear from anyone in the next day or so, I will leave those timeslots on the vPTG calendar.
16:16:54 <johnsom> We could also do an additional session, but I think we did ok with the two time slots. Do we need more time or is this good?
16:17:28 <cgoncalves> thank you for organizing our PTG again
16:17:47 <johnsom> You are welcome
16:18:09 <johnsom> Ok. I think we are good on PTG topics for the week.
16:18:17 <johnsom> #topic Community goals
16:18:32 <johnsom> I wanted to do a quick review of the Victoria goals as they are coming due.
16:18:57 <johnsom> Goal 1 was to remove the legacy zuul jobs and migrate to zuulv3 only.
16:19:28 <johnsom> I think we have accomplished this goal. Does anyone know of a case we still have running legacy zuul?
16:19:46 <johnsom> If I remember right it was only grenade that was left.
16:20:44 <cgoncalves> yeah, I think so
16:20:56 <johnsom> Ok, I will consider that one complete as I think Carlos took care of all of the grenade jobs. (Thank you!)
16:21:08 <johnsom> Goal 2 is test on Ubuntu Focal
16:21:16 <johnsom> This one I think we have work to do.
16:21:44 <johnsom> I have tested the control plane running on Focal. There was a problem with Barbican, but I think that is fixed now.
16:22:10 <johnsom> So we should be ok to switch the control plane (nodepool hosts) over to focal where we haven't already.
16:22:26 <johnsom> What I know is broken is using Focal for the amphora image.
16:22:54 <johnsom> On boot, the lb-mgmt-net configured by nova/neutron fails to come up inside the image.
16:23:13 <johnsom> I tested this again yesterday to see if Ubuntu had fixed the bug, but it appears to still be a problem.
16:24:32 <johnsom> Is anyone interested in looking into this issue? Do we have anyone from Canonical here?
16:25:47 <johnsom> My guess is there is a bug in Focal where, if the ifupdown compatibility package is installed, the cloud-init/netplan/systemd configuration doesn't work. This is purely a guess though
16:26:47 <johnsom> Should we just fail this goal and leave focal broken? We would have to use a legacy OS image for the amphora, but it should work.
16:27:32 <cgoncalves> I am reading the goal. the way I read it, the goal is limited to the nodepool host running Focal
16:27:38 <cgoncalves> #link https://governance.openstack.org/tc/goals/selected/victoria/migrate-ci-cd-jobs-to-ubuntu-focal.html
16:27:54 <johnsom> I think we should consider moving our default gate hosts to centos, but I don't want to do that mid-cycle. I would prefer that to happen at the start of a cycle
16:27:59 <cgoncalves> so, we would not be failing the goal if we run Focal controller + CentOS/Bionic amphora
16:29:02 <johnsom> Yeah, it's the "spirit" of the goal though....
16:29:43 <rm_work> yeah I think it was intended to mean we run a Focal amp...
16:29:45 <johnsom> I wish we had someone from Canonical that could help with debugging this...
16:29:50 <rm_work> at least, I would have assumed that
16:29:56 <cgoncalves> (I don't recall a similar goal for when CentOS 8 was released...)
16:30:22 <johnsom> True, but centos 8 is included in the PTI for this release:
16:30:52 <johnsom> #link https://governance.openstack.org/tc/reference/runtimes/victoria.html
16:31:10 <cgoncalves> correct. it was also in the PTI for Ussuri
16:31:54 <johnsom> Ok, so I think a proposal on the floor is to move forward with focal control plane, bionic amphora, and have a non-voting failing focal amp check job?
16:32:17 <cgoncalves> LGTM
16:32:59 <rm_work> eugh, that seems gross
16:33:03 <rm_work> but i don't see a way around it
16:33:04 <johnsom> I may try reaching out to the only person I know at Canonical and see if they know someone that can work on it.
16:33:22 <rm_work> i don't have the cycles or the knowledge to be useful with the ubuntu (or any) networking stack internals
16:34:05 <johnsom> Yeah, I think this is going to take some time to figure out and I can't take that time away from my other priorities for it.
16:34:53 <johnsom> I will take the action item to setup the gates as discussed in the proposal above.
16:35:11 <cgoncalves> thanks
16:35:20 <johnsom> Thank you for the discussion on this.
16:35:30 <johnsom> #topic Kolla updates for Octavia need review
16:35:56 <johnsom> I wanted to highlight that the Kolla team has been doing work to improve the Octavia support in Kolla. This is excellent!
16:36:04 <johnsom> #link https://review.opendev.org/#/q/project:openstack/kolla-ansible+topic:bp/implement-automatic-deploy-of-octavia
16:36:22 <johnsom> They have some patches up for review and I would like us to support them in this effort.
16:36:43 <johnsom> Please if you have some time, help them review these patches.
16:37:17 <johnsom> It will help folks new to OpenStack have a better out-of-box experience when deploying with Kolla.
16:37:40 <johnsom> Consider it some "building the community" time. grin
16:38:28 <johnsom> #topic Open Discussion
16:38:49 <ataraday_> I’ve got a couple of things regarding amphorav2.
16:38:58 <johnsom> Ok, thank you for hanging in as we go through this full agenda. Other topics? I think ataraday_ had some.
16:39:09 <johnsom> ataraday_ All yours
16:39:16 <ataraday_> I looked into the problem of https://etherpad.opendev.org/p/octavia-amphora-v2 and left some notes, but I think we can discuss it here.
16:39:33 <ataraday_> This issue can happen if state of task cannot be written to db properly - so it will keep retrying to do it.
16:39:53 <ataraday_> So, if we have DBConnection errors - nothing can be written so retries are acceptable.
16:40:12 <ataraday_> The main problem is when task is completed successfully, but it cannot write its state to db
16:40:24 <ataraday_> making the loop
16:40:39 <ataraday_> Like when some of the objects are not JSON serializable, as in the repro example.
16:40:56 <johnsom> Yeah, I think DB issues are ok to retry. The issue I had was an endless loop creating network ports until my cloud ran out of IP addresses.
16:40:56 <ataraday_> But as I see it, this is something we should guarantee: that all our tasks pass the correct type of data.
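[editor's note: the guarantee ataraday_ describes — that task results must be storable before the persistence write is attempted — can be illustrated with a minimal sketch. The helper name and call site below are hypothetical, not the actual taskflow or Octavia code.]

```python
import json


def ensure_json_serializable(result, task_name):
    """Fail fast if a task result cannot be serialized for the
    persistence backend, rather than letting the state write be
    retried forever after the task has already succeeded."""
    try:
        json.dumps(result)
    except (TypeError, ValueError) as exc:
        raise TypeError(
            f"Task {task_name!r} returned a non-JSON-serializable "
            f"result: {exc}") from exc
    return result


# A plain dict is fine; a set (for example) is not JSON serializable
# and would trigger the retry loop described above.
ok = ensure_json_serializable({"port_id": "abc123"}, "CreatePort")
```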
16:41:37 <johnsom> It "fork tested" my cloud and brought it down.
16:42:01 <ataraday_> Or it is possible with some other scenario?
16:42:02 <johnsom> Granted, it was a coding error that led to it, but it's a very scary thing to be a potential.
16:42:22 <ataraday_> I propose https://review.opendev.org/#/c/744156/ to taskflow as a workaround
16:42:34 <johnsom> I was wondering if there was a re-dispatch limit we could set.
16:43:08 <ataraday_> Not really a fix - but at least we won't retry endlessly
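[editor's note: the workaround behavior discussed here — retrying transient persistence failures, but with a bound instead of an endless loop — can be sketched generically. The function, exception, and limit below are illustrative assumptions, not the actual taskflow change in the linked patch.]

```python
import time


class PersistenceError(Exception):
    """Stand-in for a transient failure writing task state to the DB."""


def save_with_retry(save_fn, result, max_attempts=5, delay=0.1):
    """Retry a persistence write a bounded number of times, then
    re-raise instead of looping forever (and, e.g., re-creating
    network ports until the cloud runs out of IPs)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return save_fn(result)
        except PersistenceError:
            if attempt == max_attempts:
                raise
            time.sleep(delay)
```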
16:43:33 <johnsom> Thank you for the patch. I will review it. (I am a bad taskflow core as I only look for patches there every once in a while).
16:43:59 <ataraday_> johnsom, Thanks!
16:44:45 <ataraday_> One more thing about the amphorav2 failover refactor - I have some comments there - will you have time to resolve them?
16:44:52 <ataraday_> Or I should step in?
16:45:14 <johnsom> I saw them, I have not had time to resolve/investigate them. Thank you for the review.
16:46:49 <johnsom> There is a relationship between the notification types that I know exists, that we haven't implemented, and that I don't yet fully understand. I feel like I need to spend some quality time with the taskflow code to understand how we need to tie those two things together. Like a notification should also have a hook for the timer notification.
16:47:36 <johnsom> If you have ideas or time to look at that, I am open of course. You can post followup patch proposals any time.
16:48:12 <ataraday_> OK, I am just concerned that the number of amphorav2 patches up for review has gotten a bit stuck.
16:48:24 <johnsom> Yeah, I agree.
16:49:23 <ataraday_> That's all from me for today :D
16:49:59 <johnsom> ataraday_ Ok, thank you. Please keep reminding us and updating the priority patch review list to highlight those v2 patches.
16:50:41 <johnsom> Any other topics today?
16:50:45 <ZhuXiaoYu> I have uploaded a POC of multi-active mode on: https://review.opendev.org/#/c/746688 please have a look if you have time
16:51:13 <johnsom> Excellent, thank you for working on that.
16:51:26 <johnsom> That is a reminder too, we still have an open specification.
16:51:48 <openstackgerrit> Carlos Goncalves proposed openstack/octavia-tempest-plugin master: Switch default SUT to CentOS 8  https://review.opendev.org/746996
16:51:52 <johnsom> #link https://review.opendev.org/#/c/723864/
16:52:11 <johnsom> Thank you Carlos for your review. Others, please consider taking some time to review it.
16:53:55 <cgoncalves> leveraging Neutron ECMP support is a good idea. my main concern is with the amphora health check monitoring
16:54:07 <openstackgerrit> Carlos Goncalves proposed openstack/octavia-tempest-plugin master: Switch default SUT to CentOS 8  https://review.opendev.org/746996
16:54:30 <ZhuXiaoYu> yes, I see those comments
16:54:31 <johnsom> I think my last concern/comment was about the DR proposal that requires member servers to have a special configuration.
16:56:05 <ZhuXiaoYu> It is OK with DVR; neutron will pick a namespace to set the configurations
16:58:51 <johnsom> Ok, we are coming up on time for the meeting. Any other comments on the specification or other topics?
16:59:41 <johnsom> Thank you everyone for hanging on for the long agenda! Have a great week.
16:59:52 <johnsom> #endmeeting