14:00:13 #startmeeting neutron_l3
14:00:13 Meeting started Wed Jul 10 14:00:13 2019 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:14 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 The meeting name has been set to 'neutron_l3'
14:00:20 o/
14:00:28 #chair haleyb
14:00:29 Current chairs: haleyb liuyulong
14:01:01 hi
14:01:59 The nickname 'liuyulong' is already in use!
14:02:12 OK
14:02:22 #topic Announcements
14:03:30 I have no announcements today.
14:03:43 If you have any, please go ahead.
14:04:46 OK, let's move on.
14:04:53 #topic Bugs
14:05:16 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007455.html
14:05:22 Bence Romsics (rubasov) was our bug deputy the week before last, thank you.
14:05:35 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-July/007577.html
14:05:41 Bernard Cafarelli (bcafarel), last week's bug deputy, thank you as well.
14:06:45 #link https://bugs.launchpad.net/neutron/+bug/1835044
14:06:46 Launchpad bug 1835044 in neutron "[Queens] Memory leak in pyroute2 0.4.21" [High,Won't fix] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:06:54 This is now marked as Won't Fix.
14:06:58 yes
14:07:09 But for this popular rpm repo, we still do not have a new version of pyroute2 for the Queens release.
14:07:11 because we can't modify stable requirements
14:07:12 #link http://mirror.centos.org/centos/7.6.1810/cloud/x86_64/openstack-queens/
14:07:18 It is still python2-pyroute2-0.4.21-1.el7.noarch.rpm
14:07:33 hi
14:07:40 each company should fix this
14:07:54 we are pushing the changes for our RPM repos
14:08:19 but this won't be changed in a stable branch unless this is a security problem
14:08:52 So the R and S repos have the new version?
14:09:05 in devstack/requirements yes
14:09:08 not Q
14:09:17 0.5.2 vs 0.4.21
14:10:05 Our environments are all running Queens, so we do need a repo fix.
: )
14:10:39 ralonsoh, thank you for working on this.
14:10:46 #link https://bugs.launchpad.net/neutron/+bug/1834308
14:10:47 Launchpad bug 1834308 in neutron "[DVR][DB] too many slow query during agent restart" [Medium,Confirmed] - Assigned to LIU Yulong (dragon889)
14:10:47 my pleasure
14:11:02 I will submit a fix for the DVR-related DB queries.
14:11:31 Our DBA helped me get some slow query logs.
14:13:15 sorry lost the connection again....
14:13:39 We noticed 300k+ slow queries (0.5s+) during an ovs-agent restart on 30 nodes.
14:14:08 Yeah, most of them are related to DVR
14:14:32 next one may be related to this.
14:14:34 #link https://bugs.launchpad.net/neutron/+bug/1835663
14:14:34 Launchpad bug 1835663 in neutron "Some L3 RPCs are time-consuming especially get_routers" [Medium,Confirmed]
14:14:42 As you noticed, it is really slow.
14:14:47 http://logs.openstack.org/11/669111/4/check/neutron-tempest-plugin-dvr-multinode-scenario/dc3af26/controller/logs/screen-q-l3.txt.gz#_Jul_07_04_18_11_791730
14:15:24 IMO, 37s for a single RPC is not acceptable for a production environment. My ops colleagues will complain. : )
14:16:55 Slow DB queries on the Neutron server side may be one reason.
14:18:32 liuyulong: ack, and that log was from a check job? that's pretty bad
14:19:29 For this log, maybe the upstream CI neutron server just hit its bottleneck. It cannot answer too many RPC calls concurrently.
14:19:52 haleyb, yes, it is bad
14:20:20 #link https://review.opendev.org/#/c/669111/
14:20:33 ralonsoh, slaweq, hi, this patch ^^
14:21:01 there is an implementation for this function
14:21:03 The time cost wrapper, I left some comments
14:21:16 I'll review it after the meeting
14:21:52 ralonsoh, yes, it's good to know we have a similar function already.
14:22:46 Let me quote the comment here:
14:23:03 but it cannot distinguish individual calls of the same RPC, so I will still add a wrapper here which calls that function inside. And a log for the function start is needed as well.
We need to know precisely when each call starts and ends.
14:23:35 yes exactly, I think that if you override the message argument to the oslo.utils time_it function and add the generated uuid, then you get the benefit of the function
14:24:22 njohnston, agree
14:24:34 but you could do that without a separate decorator
14:25:52 njohnston, how to enable the start log without a new decorator?
14:26:33 We may want to see what happened between start and end.
14:26:58 time_it just logs the duration.
14:27:33 @osloutils.time_it(message="time-cost: %(seconds).02f seconds to run function '%(func_name)s', uuid=" + uuidutils.generate_uuid())
14:29:12 I see, so you believe there is value in having the "call: start" separate from the "call: ended, time = %d" log messages
14:29:19 I can see your point
14:31:02 I agree with liuyulong that separate logs for start and end can be useful
14:31:09 And I have another concern: 'StopWatch' is used in 'time_it'. It looks pretty complicated, and I don't know if it will cause problems in RPC calls.
14:31:29 this is just a context manager
14:33:16 OK, I will refactor this decorator.
14:33:30 It will be useful for upstream CI
14:35:48 I have no other bugs today; last week was a bit quiet for L3.
14:36:41 I still have one bug in L3
14:36:45 #link https://bugs.launchpad.net/neutron/+bug/1732458
14:36:46 Launchpad bug 1732458 in neutron "deleted_ports memory leak in dhcp agent" [Medium,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:36:57 #link https://review.opendev.org/#/c/521035/
14:37:19 (CI is not passing but I'm rechecking)
14:37:41 A very old one
14:39:09 Recently we have hit many DHCP exceptions during upgrades and restarts.
14:39:58 And I'm considering removing the DHCP agent from our local environment.
14:40:44 config_drive or L2-agent self-service DHCP looks more friendly to large-scale clouds.
14:41:35 liuyulong: there is an RFE about a distributed dhcp agent
14:41:50 let me find it
14:42:03 Yes, it all comes from our ops' complaints.
14:42:04 liuyulong: what do you mean by 'L2-agent self-service DHCP'?
14:42:22 slaweq, did you mean the OVN related RFE?
14:42:25 liuyulong: https://bugs.launchpad.net/neutron/+bug/1806390
14:42:26 Launchpad bug 1806390 in neutron "[RFE] Distributed DHCP agent " [Wishlist,In progress] - Assigned to Yang Youseok (ileixe)
14:42:37 liuyulong: no, this one isn't related to OVN
14:42:38 amotoki, it is a local implementation.
14:43:10 liuyulong: okay, is it a kind of distributed one?
14:43:14 amotoki, since the OVS agent has full knowledge of port IPs and MACs.
14:43:27 amotoki: I remember when I was at OVH we also had something like that - neutron-ovs agent was spawning simple udhcpd service for each port on host - and that worked very well :)
14:44:07 liuyulong: slaweq: thanks. it reminds me of nova-network dhcp stuff per compute node.
14:44:58 the proposed distributed dhcp agent would be similar.
14:45:03 https://review.opendev.org/#/c/658414/9/specs/train/ml2ovs-ovn-convergence.rst@38
14:45:18 I left a comment here, but no response for now. : )
14:45:33 ML2+OVS+DVR and OVN
14:46:18 liuyulong: i will look at your comment...
14:46:43 OK, next topic
14:46:58 #topic Routed Networks
14:47:22 I'm now interested in how this will work for an external network with multiple segments.
14:47:36 Yes, I mean the public (provider) network for router gateways and floating IPs.
14:48:58 I also left some comments here: https://review.opendev.org/#/c/657170/
14:49:02 No response.
14:49:22 mlavalle, tidwellr, wwriverrat: your turn now.
14:50:52 No updates?
14:51:01 Next topic
14:51:15 #topic On demand agenda
14:51:54 I have one more thing about OVN and dvr.
14:52:00 #link https://blueprints.launchpad.net/neutron/+spec/openflow-based-dvr
14:52:08 Maybe we shoud add a note for this BP, or mark it as something like not-complete or abandoned.
14:52:12 s/should
14:52:40 And also abandon the related gerrit patch.
14:55:25 yes, i don't think that will be implemented
14:55:28 +1. it clarifies the current situation and it is useful especially for operators.
14:57:02 OK, time is up, let's stop here.
14:57:10 Thank you guys.
14:57:15 #endmeeting
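
Note on the time-cost wrapper discussed under the Bugs topic (14:22 to 14:33): the sketch below is one possible shape for a decorator that logs both the start and the end of each call and tags both lines with an id generated per call. It is not the code from review 669111 and does not use oslo.utils; it relies only on the Python standard library, and the name `timecost` and the log wording are assumptions made for illustration. One thing it tries to make visible: an id built inside the decorator argument, as in the 14:27:33 line, would be evaluated once when the function is decorated, so every call of that function would share it, whereas here the id is created inside the wrapper on each call.

```python
import functools
import logging
import time
import uuid

LOG = logging.getLogger(__name__)


def timecost(func):
    """Log the start and end of every call, tagged with a per-call id.

    Hypothetical helper for illustration only.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Generated on each call, so concurrent calls of the same
        # function can be told apart in the log.
        call_id = uuid.uuid4().hex
        LOG.debug("time-cost: call %s of '%s' started",
                  call_id, func.__name__)
        start = time.monotonic()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised.
            elapsed = time.monotonic() - start
            LOG.debug("time-cost: call %s of '%s' ended, took %.2f seconds",
                      call_id, func.__name__, elapsed)
    return wrapper
```

Applied as `@timecost` above a function, each invocation would produce one "started" and one "ended" line sharing the same id, so anything logged between the two can be attributed to that call.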