16:00:12 #startmeeting Octavia
16:00:13 Meeting started Wed Jun 5 16:00:12 2019 UTC and is due to finish in 60 minutes. The chair is cgoncalves. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:14 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:17 The meeting name has been set to 'octavia'
16:00:24 o/
16:01:04 \o
16:01:28 Hi folks!
16:01:34 hi
16:01:56 copy cat
16:01:58 rm_work asked me if I could chair the meeting today. he was not sure he could attend today
16:02:10 #topic Announcements
16:02:33 o/
16:02:42 I don't have anything to announce. anyone?
16:03:10 I don't have anything either
16:03:26 nope
16:03:32 Nope
16:03:54 just a reminder that the CFP for the Open Infrastructure Summit Shanghai is still open
16:04:17 deadline is July 2, 2019 at 11:59pm PT
16:04:35 FYI, this week is milestone 1 for Train
16:05:00 We don't cut milestone releases by default anymore, but if someone needs one, let us know.
16:05:08 #link https://releases.openstack.org/train/schedule.html
16:05:42 Ann Taraday proposed openstack/octavia master: Use retry for AmphoraComputeConnectivityWait https://review.opendev.org/662791
16:05:44 thanks
16:05:46 #topic Brief progress reports / bugs needing review
16:06:24 I have wrapped up all of the client unset patches and the octavia API patches to handle the unsets better.
16:06:52 I have not done tags, as I found that the octavia API implementation of tags is not yet complete.
16:07:09 I was out of office for most of the time since the last meeting. I'm proposing a couple of patches in different OpenStack projects to remove neutron-lbaas support
16:07:17 I would like to ask for review of https://review.opendev.org/662791 - this change solves an issue with redis jobboard expiry. If the concept is OK, etc...
16:07:30 I also started moving some tags code up to osc-lib so we can use it when we get there:
16:07:32 #link https://review.opendev.org/#/c/662859/
16:08:23 nice!
16:08:43 my time on octavia is focused on addressing the behavior described in this story https://storyboard.openstack.org/#!/story/2005512 as well as the failover behavior in the case of DB unavailability that i described at the PTG. am hopeful that once those are addressed, the production deployment i've prepared will be placed in service
16:08:56 I wonder how this aligns with tags in "openstack image"
16:09:02 I am now starting work on the log offloading patches. I put together a plan yesterday.
16:09:08 And johnsom's change https://review.opendev.org/#/c/659689 - can we merge this one? I have to rebase all my work on it and have a long queue of changes..
16:09:18 Maybe in open discussion I can share the proposed log format and get feedback.
16:09:51 cgoncalves probably not at all. lol
16:10:27 ataraday_ From my perspective https://review.opendev.org/#/c/659689 is ready for review/merge
16:10:28 #link https://review.opendev.org/#/c/659689
16:12:03 FYI, this is the "tags" spec we should be implementing:
16:12:05 at first glance, johnsom's patch looks large but it isn't that hard to review. mostly file renamings
16:12:05 #link https://specs.openstack.org/openstack/api-wg/guidelines/tags.html
16:12:24 neutron seems to have it, thus why I want to share OSC code with them
16:12:43 makes sense
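For reference, the api-wg guideline linked above models tags as plain strings carried in a "tags" list on the resource, with a tags sub-resource for individual operations and query parameters for filtering. A rough sketch of that shape applied to the octavia v2 API (the paths below are illustrative assumptions; the meeting only establishes that octavia's tags implementation is incomplete):

    PUT    /v2/lbaas/loadbalancers/{lb_id}/tags       {"tags": ["red", "prod"]}   # replace all tags
    PUT    /v2/lbaas/loadbalancers/{lb_id}/tags/red   # add a single tag
    GET    /v2/lbaas/loadbalancers/{lb_id}/tags/red   # 204 if present, 404 if not
    DELETE /v2/lbaas/loadbalancers/{lb_id}/tags/red   # remove a single tag
    GET    /v2/lbaas/loadbalancers?tags=red,prod      # LBs having all listed tags
    GET    /v2/lbaas/loadbalancers?tags-any=red,prod  # LBs having at least one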
16:13:09 cgoncalves It's mostly copies and renames. It creates the "v2" amphora driver for our parallel development with jobboard.
16:14:09 right. the patch is passing on all jobs (except centos, for unrelated reasons). should be relatively safe to merge it
16:15:14 That is my perspective as well
16:15:27 I'll review it this week and I encourage everyone to do the same, pretty please :)
16:15:54 anything else on this topic?
16:16:05 hi all,
16:16:44 CBR09, hi
16:16:58 ok, seems not
16:16:59 #topic Open Discussion
16:17:17 johnsom, you wanted to talk about log offloading?
16:17:18 Can octavia support an ssl cert with the tcp protocol?
16:17:36 So as I mentioned earlier in the meeting, I'm starting on log offloading
16:17:51 For the user flow logs, I'm going to propose this format:
16:17:58 project_id lb_id listener_id client_ip client_port date_time request_str status bytes_read %[ssl_c_verify] %{+Q}[ssl_c_s_dn] pool_id member_id processing_time termination_state
16:18:20 Which is a hybrid of the haproxy and apache log formats, plus OpenStack stuff.
16:18:36 I also put "driver"-specific fields at the end.
16:18:43 So an example would look like:
16:19:12 2f7cf2a1-d521-400e-a7c8-5611304723e8 b2f76e55-adfb-4aae-93eb-152c82adabec bfe473c0-d9f5-4cf0-aff8-90da27265cb3 10.0.1.2 33317 [06/Feb/2009:12:14:14.655] "GET /index.html HTTP/1.1" 200 2751 0 "/C=FR/ST=Ile de France/L=Jouy en Josas/O=haproxy.com/CN=client1/emailAddress=ba@haproxy.com" 15c7f458-aeec-42b1-921c-2bf24f62cf8c a7cb861f-9254-4c73-9929-332b81b40abf 109 ----
16:19:36 CBR09 Not at this time. It is technically possible, but has not been implemented yet.
16:20:25 Please let me know if you have feedback on that log format.
16:20:41 i think that looks reasonable
16:20:41 would it be possible for operators to set a different format?
16:20:57 Yes, that is on my list of things to fix in the current patch
16:21:14 great
16:21:16 I have 12 things I identified yesterday that need work
16:21:54 We have a good PoC, but I want to make it user friendly and allow easy customization
16:22:03 this RFE isn't strictly specific to the amphora driver, right? other providers can also stream logs
16:22:38 @johnsom: yea, thank you, I think we should have that
16:22:56 Right, there are some things to figure out with the other drivers, but yes, this should be general enough that other drivers could implement it.
16:22:58 I can see haproxy supports that
16:23:13 +1
16:23:19 Yeah, all of the LBs do, we just need to do the work
16:23:51 Ok, thanks for the feedback!
16:23:58 one more question, sorry :)
16:24:05 please
16:24:39 I guess the sink destination for admin logs will be configured in the configuration file
16:25:07 what do you propose for tenant logs?
16:25:23 Correct, I want to have two "destination" config settings. One for "user flow logs" and one for "admin logs". They could be the same, could be different, up to the operator.
16:25:28 an additional API parameter at POST/PUT?
16:26:15 At this point I am only targeting these as operator settings, not end user settings of a destination. That becomes a problem with networking, etc.
16:26:41 hmm, maybe I asked an invalid question. I was thinking of the possibility of Octavia sending the logs to an external log sink system
16:26:51 right
16:26:57 ok, thank you
16:27:33 That would be a different RFE IMO. This is really targeting operators that have ELK available for users, etc.
16:27:43 agreed
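For reference, the user-flow format proposed at 16:17:58 maps almost field-for-field onto a haproxy log-format directive. A minimal sketch, assuming octavia fills the project/LB/listener IDs in as literals when it renders each listener's haproxy.cfg, and that backends and servers are named with the octavia pool and member IDs so that %b and %s resolve to pool_id and member_id:

    # hypothetical rendering of the proposed user flow log format
    log-format "PROJECT_ID LB_ID LISTENER_ID %ci %cp [%t] %{+Q}r %ST %B %[ssl_c_verify] %{+Q}[ssl_c_s_dn] %b %s %Tt %tsc"

The two operator-facing destinations johnsom describes would then be plain oslo.config options; the section and option names below are assumptions, not names settled in the meeting:

    [amphora_agent]
    user_log_targets = 192.0.2.10:514
    admin_log_targets = 192.0.2.11:514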
16:28:29 anything else anyone would like to discuss? we are on open discussion after all :)
16:29:24 is anyone operating a fleet larger than ~450 VIPs (Active/Standby) in a single region?
16:29:36 CBR09 Out of curiosity, what protocol are you wanting the TCP TLS for? It might be nice to know a use case.
16:29:49 s/VIPs/LoadBalancers/
16:30:08 colin-, not that I'm aware of
16:30:25 I know of some larger deployments, but I don't operate them.
16:30:39 colin-, are you hitting some sort of limitation?
16:31:25 @johnsom: the use case is when I just want to use the LB for SSL termination and the app protocol is tcp
16:31:52 I can't use https with ssl like octavia currently supports
16:32:02 CBR09 Ok, so just generic. Not like SMTP, or something more specific
16:32:13 i think some of the symptoms i've described in the story above are exacerbated by the size of the fleet, and it makes me wonder if we are in uncharted territory with our scale. this is also a relatively "vanilla" deployment of stable/rocky with mostly default configuration settings, save for some significant relaxation of heartbeat/failover timing
16:34:48 @johnsom: for example I need the mqtt protocol, so I need tcp plus ssl to secure it
16:34:49 I think the unique situation you have is how your network (VXLAN) behaves with the neutron ports (macs moving around, etc.)
16:34:50 I can't be of much help to comment on such large scale deployments
16:35:05 CBR09 Ok, thanks.
16:35:10 mostly just a straw poll, best case scenario would be getting feedback from someone operating at a larger scale but that is probably unrealistic
16:36:19 one example in https://github.com/lelylan/haproxy-mqtt/blob/master/haproxy.cfg
16:36:44 colin-, have you tried asking the OpenStack Ops team?
16:36:56 CBR09 Never would have guessed that one, so it's good to know for testing when it's implemented
16:38:29 the behavior of the vxlan was extreme, indeed, but not that peculiar. in my opinion it is more peculiar to blithely proceed with bringing up a new resource when the opportunity for resource collision may exist by virtue of our employing allowed address pairs
16:38:29 I think mqtt is so popular in the IoT world : D
16:39:14 allowed address pairs is a strange concept on its own....
16:39:19 the bulk/cascading failover behavior, while being quite unrelated to duplicate mac addresses, is equally concerning, but we've established that database unavailability can have unpredictable results
16:39:46 CBR09 Agreed
16:40:50 I think octavia should have secure tcp (tcp+ssl) to support that use case
16:40:53 cgoncalves: i'm not familiar with the distinction between that and this channel, sorry :)
16:40:55 what about websocket?
16:41:21 CBR09 I don't disagree, someone just needs to do the implementation work. It's been on the roadmap for a while.
16:41:21 do you load balance the websocket protocol with octavia?
16:42:52 colin-, the Ops team mostly consists of operators running openstack deployments. I just thought you might try asking on their list to see if someone is running octavia at a scale like yours and collect feedback
16:43:39 CBR09 I think we do have support for it, but there may be a missing timeout setting for tuning.
16:46:26 ok, anything else? otherwise we can end the meeting
16:47:01 alright, thank you everyone for joining and participating!
16:47:01 #endmeeting
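A footnote on the TCP+TLS request discussed above: terminating TLS in front of a plain TCP pool is what the linked lelylan haproxy.cfg demonstrates for MQTT. A minimal raw-haproxy sketch of the behavior CBR09 asked for (octavia did not support this at the time; the certificate path and addresses are illustrative):

    frontend mqtt_in
        mode tcp
        # terminate TLS here, forward plain MQTT to the pool
        bind *:8883 ssl crt /etc/haproxy/certs/mqtt.pem
        default_backend mqtt_brokers

    backend mqtt_brokers
        mode tcp
        balance roundrobin
        server broker1 10.0.0.11:1883 check
        server broker2 10.0.0.12:1883 check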