15:00:27 <slaweq> #startmeeting neutron_ci
15:00:28 <openstack> Meeting started Wed Jul  1 15:00:27 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:30 <slaweq> hi
15:00:31 <openstack> The meeting name has been set to 'neutron_ci'
15:00:32 <ralonsoh> hi
15:02:30 <slaweq> let's wait a few more minutes for the others
15:02:37 <maciejjozefczyk> \o
15:03:56 <slaweq> bcafarel: njohnston: ping :)
15:04:09 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:04:25 <bcafarel> o/
15:04:37 <bcafarel> sorry, I was listening to the Edge session
15:04:42 <slaweq> ahh
15:04:43 <slaweq> ok
15:04:48 <slaweq> I forgot about it
15:04:58 <slaweq> ok, let's start
15:05:00 <slaweq> #topic Actions from previous meetings
15:05:05 <slaweq> bcafarel to check gate status on rocky and queens (uwsgi problem)
15:05:39 <bcafarel> ok, so the patch in neutron itself is not enough, we need multiple fixes in devstack
15:05:58 <bcafarel> the latest iteration in devstack itself looks good to go https://review.opendev.org/#/c/735615/
15:06:43 <bcafarel> then we will need https://review.opendev.org/#/c/738851/ or something like that on top - with just the Depends-On I still see failures that should be fixed by the devstack one
15:06:52 <bcafarel> hopefully a recheck once it is merged will be greener
15:07:14 <bcafarel> and once rocky is finally back on track, similar backports for older branches :)
15:07:24 <slaweq> ok
15:07:39 <slaweq> so it seems like we are still far from a green gate for queens and rocky
15:08:11 <njohnston> o/
15:08:47 <bcafarel> yup :/ but there is progress at least
15:09:00 <slaweq> thx bcafarel for taking care of it
15:09:20 <slaweq> ok, next one
15:09:25 <slaweq> maciejjozefczyk will check test_ovsdb_monitor.TestNBDbMonitorOverTcp.test_floatingip_mac_bindings failure in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:09:40 <maciejjozefczyk> fixed
15:09:50 <slaweq> thx maciejjozefczyk :)
15:09:55 <maciejjozefczyk> #link https://review.opendev.org/#/c/738415/
15:10:06 <slaweq> so next one
15:10:09 <slaweq> ralonsoh will check get_datapath_id issues in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:10:41 <slaweq> maciejjozefczyk: one question, should we backport it to ussuri?
15:10:42 <ralonsoh> slaweq, sorry, I didn't start debugging
15:11:23 <maciejjozefczyk> slaweq, hmm we can
15:11:45 <slaweq> maciejjozefczyk: ok, can You propose a backport?
15:11:49 <maciejjozefczyk> slaweq, clicked :D
15:11:52 <slaweq> thx
15:12:05 <slaweq> ralonsoh: sure, I know You were busy with other things
15:12:12 <slaweq> will You try to check that next week?
15:12:15 <ralonsoh> sure
15:12:28 <slaweq> #action ralonsoh will check get_datapath_id issues in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:12:29 <slaweq> thx
15:12:37 <slaweq> ok, next one
15:12:42 <slaweq> slaweq will check errors with non existing interface in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:12:51 <slaweq> I didn't have much time to check this one
15:13:57 <slaweq> but from what I looked at today, it seems to me like maybe some tests are "overlapping" and another test cleaned up some port/bridge
15:14:26 <slaweq> I will probably add some additional debug logging to be able to investigate it more when it happens again
15:14:45 <slaweq> and I will try to continue working on it this week
15:14:49 <slaweq> #action slaweq will check errors with non existing interface in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
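A minimal sketch of the extra debug logging slaweq mentions above, assuming a small helper dropped into the failing functional tests; the helper name and placement are illustrative assumptions, not the actual patch. The idea is to dump interface state at key points so the logs show whether another test's cleanup already removed the port/bridge:

    import logging
    import subprocess

    LOG = logging.getLogger(__name__)


    def log_interface_state(note=""):
        # Hypothetical helper: dump all links so that a later
        # "interface does not exist" failure can be compared against
        # what actually existed at this point in the test run.
        result = subprocess.run(
            ["ip", "-o", "link", "show"],
            capture_output=True, text=True, check=False)
        LOG.debug("interface state (%s):\n%s", note, result.stdout)

Calling this from a test's setUp and again right before the failing assertion would bracket the window in which the port or bridge disappears.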
15:15:01 <slaweq> and the last one
15:15:02 <slaweq> maciejjozefczyk to check failing neutron-ovn-tempest-ovs-master-fedora periodic job
15:15:11 <maciejjozefczyk> yeah
15:15:15 <maciejjozefczyk> #link https://review.opendev.org/#/c/737984/
15:15:47 <slaweq> I saw that ovn jobs were failing on this patch this morning
15:15:54 <maciejjozefczyk> the issue is trivial, the ovs compilation code was duplicated in both the ovn and ovs modules
15:15:56 <slaweq> so let's wait and see how it goes now :)
15:16:06 <maciejjozefczyk> yeah, slaweq, this should be fine now
15:16:14 <maciejjozefczyk> (I hope so) :D
15:16:39 <bcafarel> and I think that one may be interesting for the focal transition too
15:16:43 <maciejjozefczyk> This is about cleaning up and unifying the way we compile ovs/ovn, so next time we'll need to fix one function instead of two
15:16:48 <maciejjozefczyk> bcafarel, indeed
15:17:25 <slaweq> ++
15:17:27 <slaweq> thx
15:18:03 <slaweq> ok
15:18:17 <slaweq> those are all the actions from last week
15:18:23 <slaweq> let's move on to the next topic
15:18:24 <slaweq> #topic Stadium projects
15:18:59 <slaweq> with the migration to zuul v3, stadium projects are actually in good shape now
15:19:07 <slaweq> as only the neutron-ovn-grenade job is still missing
15:19:22 <njohnston> \o/
15:19:35 <slaweq> anything else You want to discuss about stadium projects today?
15:21:12 <slaweq> ok, so let's move on
15:21:18 <slaweq> #topic Stable branches
15:21:23 <slaweq> Ussuri dashboard: http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:21:25 <slaweq> Train dashboard: http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:22:05 <slaweq> from what I was seeing this week, the branches which are not EM are running pretty well now
15:22:09 <bcafarel> I mostly looked at rocky this week, but I think ussuri to stein were OK
15:22:14 <slaweq> and the EM ones are red due to known reasons
15:23:14 <slaweq> I thought about one small improvement to save some gate resources: wdyt about moving all non-voting jobs to the experimental queue in the EM branches?
15:23:55 <bcafarel> :) I was thinking about that when filling https://review.opendev.org/#/c/738851/
15:24:20 <slaweq> :)
15:24:28 <slaweq> so there are at least 2 of us
15:24:42 <slaweq> that would save about 4-6 jobs, mostly multinode
15:24:55 <slaweq> so quite a few VMs spawned to test each patch
15:25:16 <bcafarel> probability is low that someone will work on fixing them in EM, and I don't think anyone checks their results in backports
15:25:26 <slaweq> exactly
15:25:37 <slaweq> almost nobody is checking non-voting jobs in master
15:27:31 <slaweq> ralonsoh: njohnston any thoughts?
15:27:43 <slaweq> if You are ok with this I will propose such a patch this week
15:27:44 <ralonsoh> agree with removing them
15:28:18 <njohnston> I absolutely agree
15:28:27 <njohnston> There is no reason to be using those resources
15:28:47 <slaweq> ok
15:28:49 <slaweq> thx
15:28:54 <slaweq> so I will propose such a change
15:29:12 <slaweq> #action slaweq to move non-voting jobs to experimental queue in EM branches
15:29:32 <slaweq> anything else regarding stable branches? or can we move on?
15:29:50 <bcafarel> nothing from me
15:30:44 <njohnston> nothing from me
15:30:54 <slaweq> ok, let's move on then
15:30:56 <slaweq> #topic Grafana
15:31:02 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:32:09 <slaweq> I don't see any serious problems there
15:33:16 <slaweq> I pushed a small patch to update the dashboard https://review.opendev.org/738784
15:34:32 <slaweq> ok, let's talk about some specific issues
15:34:34 <slaweq> #topic Tempest/Scenario
15:34:48 <slaweq> I found a few failures for which I opened LPs
15:34:56 <slaweq> the first one is
15:35:04 <slaweq> tempest.api.network.admin.test_routers.RoutersAdminTest.test_create_router_set_gateway_with_fixed_ip
15:35:09 <slaweq> in the neutron-tempest-dvr-ha-multinode-full job
15:35:19 <slaweq> it happens very often
15:35:28 <slaweq> like e.g.: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_2c4/734876/1/check/neutron-tempest-dvr-ha-multinode-full/2c466a0/testr_results.html
15:35:35 <slaweq> Bug reported https://bugs.launchpad.net/neutron/+bug/1885897
15:35:35 <openstack> Launchpad bug 1885897 in neutron "Tempest test_create_router_set_gateway_with_fixed_ip test is failing often in dvr scenario job" [High,Confirmed]
15:35:51 <slaweq> in fact this is the main reason this job fails so often
15:36:34 <slaweq> any volunteer to check that one?
15:36:43 <slaweq> if not, I will try to check it this week
15:36:44 <ralonsoh> sorry, not this week
15:37:10 <slaweq> ok, I will check it
15:37:29 <slaweq> #action slaweq to investigate failures in test_create_router_set_gateway_with_fixed_ip test
15:37:41 <slaweq> I think this may be some tempest cleanup issue
15:37:45 <slaweq> or something like that
15:37:50 <slaweq> as it happens very often
15:38:11 <slaweq> the next one is related to a qos test
15:38:25 <slaweq> I found it in the neutron-tempest-plugin-scenario-openvswitch job but I think it may happen in other jobs too
15:38:30 <slaweq> neutron_tempest_plugin.scenario.test_qos.QoSTest.test_qos_basic_and_update
15:38:35 <slaweq> https://20f4a85411442f4e3555-9f5a5e2736e26bdd8715596753fafe10.ssl.cf2.rackcdn.com/734876/1/check/neutron-tempest-plugin-scenario-openvswitch/a31f86b/testr_results.html
15:38:42 <slaweq> Bug reported https://bugs.launchpad.net/neutron/+bug/1885899
15:38:42 <openstack> Launchpad bug 1885899 in neutron "test_qos_basic_and_update test is failing" [Critical,Confirmed]
15:38:55 <slaweq> seems like nc wasn't spawned properly - maybe we should add additional logging in https://github.com/openstack/neutron-tempest-plugin/blob/master/neutron_tempest_plugin/common/utils.py#L122 ?
15:40:14 <slaweq> does anyone want to check that?
15:40:27 <ralonsoh> but I remember you changed the way to spawn nc
15:40:31 <ralonsoh> making it more reliable
15:40:45 <slaweq> yes
15:40:57 <ralonsoh> sorry, but next week
15:41:01 <ralonsoh> not this one
15:41:18 <slaweq> ok, let's keep it unassigned, maybe someone will want to check it
15:41:29 <slaweq> I marked it as critical because it impacts voting jobs
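The extra logging slaweq suggests at 15:38 could look like the sketch below: a wrapper around the predicate handed to neutron_tempest_plugin's wait_until_true helper, so the job log records every poll instead of just the final timeout. The wrapper name and the usage are assumptions for illustration, not merged code:

    import logging

    LOG = logging.getLogger(__name__)


    def logged_predicate(predicate, label):
        # Hypothetical wrapper: log each evaluation of the check that
        # wait_until_true polls, so a timeout shows how many attempts
        # ran and what each returned (e.g. whether nc ever came up).
        def wrapped():
            result = predicate()
            LOG.debug("wait_until_true check %r returned %s", label, result)
            return result
        return wrapped

At a call site this would wrap the existing check, e.g. utils.wait_until_true(logged_predicate(check_nc_listening, "nc listening"), ...), where check_nc_listening stands in for whatever callable the test already passes.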
15:41:47 <slaweq> ok, next one
15:42:07 <slaweq> this one is related only to the ovn based jobs, where the test neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers is failing
15:42:14 <slaweq> like e.g. https://4ec598fcefc6b0367120-6910015cdc6b96c34eca0ab65a68e7f2.ssl.cf5.rackcdn.com/696926/18/check/neutron-ovn-tempest-full-multinode-ovs-master/c1c51ca/testr_results.html
15:42:25 <slaweq> Bug reported: https://bugs.launchpad.net/neutron/+bug/1885898
15:42:25 <openstack> Launchpad bug 1885898 in neutron "test connectivity through 2 routers fails in neutron-ovn-tempest-full-multinode-ovs-master job" [High,Confirmed]
15:42:39 <slaweq> maciejjozefczyk: will You have time to take a look at this?
15:43:10 <maciejjozefczyk> slaweq, You have my sword
15:43:20 <slaweq> maciejjozefczyk: thx a lot
15:43:33 <bcafarel> :)
15:43:34 <slaweq> #action maciejjozefczyk to check neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers failure in ovn jobs
15:44:43 <slaweq> maciejjozefczyk: there is also another failure in ovn based jobs
15:44:44 <slaweq> neutron_tempest_plugin.scenario.test_trunk.TrunkTest.test_trunk_subport_lifecycle
15:44:48 <slaweq> https://7bea12b2d1429b68c6c8-10caedded388001c6bbc38619ca4b324.ssl.cf2.rackcdn.com/737047/8/check/neutron-ovn-tempest-full-multinode-ovs-master/592e31b/testr_results.html
15:44:56 <slaweq> Bug reported: https://bugs.launchpad.net/neutron/+bug/1885900
15:44:56 <openstack> Launchpad bug 1885900 in neutron "test_trunk_subport_lifecycle is failing in ovn based jobs" [Critical,Confirmed]
15:44:57 <maciejjozefczyk> slaweq, yeah I noticed that bug :(
15:45:08 <maciejjozefczyk> is it failing pretty often now?
15:45:14 <slaweq> You probably won't have time to work on both this week, but please keep it in mind :)
15:45:54 <slaweq> I saw it a couple of times at least in the last week
15:46:12 <slaweq> see http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%20%5C%22line%20240%2C%20in%20test_trunk_subport_lifecycle%5C%22
15:46:47 <slaweq> and that's all I had for today
15:47:03 <slaweq> I opened an LP for each of the failures so we can track them there
15:47:07 <maciejjozefczyk> slaweq, thanks for the link
15:47:16 <slaweq> maciejjozefczyk: yw :)
15:47:26 <slaweq> anything else You want to talk about today?
15:48:42 * bcafarel looks for correct tab to copy link
15:49:31 <bcafarel> I started to change a few bits for focal: https://review.opendev.org/#/c/738163/
15:49:46 <bcafarel> still early WIP but if you want to add stuff, please do so
15:50:18 <slaweq> thx bcafarel
15:50:21 <maciejjozefczyk> sure bcafarel thanks
15:50:30 <maciejjozefczyk> ovn fails in all the cases :D
15:50:41 <slaweq> I added myself to the reviewers to be up to date with this :)
15:50:50 <slaweq> damn ovn :P
15:50:51 <njohnston> +1
15:50:58 <bcafarel> :) yes it needs more work on the skip ovs/ovn compilation bits
15:51:02 <slaweq> we should move it to the stadium project ;P
15:51:12 <bcafarel> ahah
15:51:19 <slaweq> maciejjozefczyk: wdyt?
15:51:21 <slaweq> :D
15:51:50 <njohnston> ROFL
15:53:00 <slaweq> ok, I think we can finish this meeting now
15:53:04 <slaweq> thx for attending
15:53:08 <slaweq> and see You next week :)
15:53:12 <slaweq> #endmeeting