15:00:17 <slaweq> #startmeeting neutron_ci
15:00:18 <openstack> Meeting started Tue Jan 12 15:00:17 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:19 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:19 <slaweq> hi
15:00:22 <openstack> The meeting name has been set to 'neutron_ci'
15:00:55 <bcafarel> hi again
15:00:59 <ralonsoh> hi
15:01:51 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:01:58 <slaweq> please open and we can start
15:02:27 <slaweq> #topic Actions from previous meetings
15:02:33 <slaweq> slaweq to update grafana dashboard
15:02:40 <slaweq> Patch https://review.opendev.org/c/openstack/project-config/+/767470
15:02:44 <slaweq> it's merged already
15:02:59 <slaweq> and that was the only action from last meeting
15:03:08 <slaweq> so I think we can move on
15:03:14 <slaweq> next topic is
15:03:15 <slaweq> #topic Stadium projects
15:03:34 <slaweq> any ci related topics for stadium projects?
15:03:50 <bcafarel> most are still red with pip fun I think
15:04:16 <bcafarel> I saw some patches starting to appear for them, not sure if they are working (or merged?)
15:05:10 <slaweq> bcafarel: but is that for master or for stable branches?
15:05:12 <bcafarel> https://review.opendev.org/c/openstack/networking-bgpvpn/+/769657 for example
15:05:24 <lajoskatona> Hi
15:05:45 <bcafarel> slaweq: master, I will send patches to drop l-c jobs on stable soonish ( ralonsoh taking the neutron ones)
15:05:55 <ralonsoh> right
15:06:16 <slaweq> ok, I will check those l-c patches for master branch then
15:07:18 <slaweq> anything else regarding stadium or we can move on?
15:08:09 <slaweq> so let's move on
15:08:13 <bcafarel> +1
15:08:17 <slaweq> #topic Stable branches
15:08:21 <slaweq> Victoria dashboard: https://grafana.opendev.org/d/HUCHup2Gz/neutron-failure-rate-previous-stable-release?orgId=1
15:08:24 <slaweq> Ussuri dashboard: https://grafana.opendev.org/d/smqHXphMk/neutron-failure-rate-older-stable-release?orgId=1
15:08:33 <slaweq> except that l-c issue, I think all is good there
15:08:36 <slaweq> right?
15:08:52 <bcafarel> indeed, I saw some failures but nothing too bad
15:09:22 <bcafarel> stein is waiting for rocky swift grenade fix (it was W+1 30 min ago), and then all branches should be back in working order
15:09:39 <bcafarel> pending l-c cleanup (fix for it is merged up to train at the moment)
15:09:49 <slaweq> ++
15:09:56 <slaweq> thx bcafarel for taking care of it
15:10:25 <slaweq> I think we can move on to the next topic then
15:10:27 <slaweq> #topic Grafana
15:10:34 <slaweq> #link https://grafana.opendev.org/d/PfjNuthGz/neutron-failure-rate?orgId=1
15:10:34 <bcafarel> and haleyb and a few others too :) it will be nice to forget that part
15:11:10 <haleyb> hi
15:11:14 <slaweq> hi haleyb :)
15:11:25 <haleyb> are there gate failures? :-p
15:11:29 <bcafarel> :)
15:11:31 <slaweq> thx for helping with pip issues :)
15:11:51 <bcafarel> haleyb: no I was just pointing out you helped/suffered a lot with that new fancy pip resolver too
15:11:53 <bcafarel> :)
15:11:54 <haleyb> i feel like it's been Thor's hammer kind of firedrill
15:12:03 <ralonsoh> good work on this!
15:12:05 <ralonsoh> thanks
15:12:12 * haleyb is still suffering with gate things
15:12:38 <slaweq> haleyb: we all suffer with gate things :P
15:13:04 <slaweq> but, speaking about gate and grafana
15:13:15 <slaweq> things look much better IMO this week
15:13:19 <haleyb> nice
15:13:21 <slaweq> or even this year ;)
15:14:07 <slaweq> I saw surprisingly many patches merged recently without rechecking dozens of times :)
15:14:45 <bcafarel> nice
15:15:19 <slaweq> do You have anything related to our dashboard?
15:15:31 <slaweq> or should we move on to the specific issues I found recently?
15:15:45 <bcafarel> nothing from me
15:15:59 <ralonsoh> no
15:17:26 <slaweq> ok, so let's move on
15:17:32 <slaweq> #topic functional/fullstack
15:17:46 <slaweq> those jobs are still the most often failing ones
15:17:54 <slaweq> first functional
15:18:22 <slaweq> I again saw this error 500 during network creation in ovn tests: https://zuul.opendev.org/t/openstack/build/476b4b1684df45bca7ecebbd2d7353b9/logs
15:18:41 <slaweq> but that was only once and I'm not sure if otherwiseguy's patch was already merged then or not yet
15:18:56 * otherwiseguy looks
15:20:01 <slaweq> IIRC it was this patch https://review.opendev.org/c/openstack/neutron/+/765874
15:20:12 <slaweq> and it was merged Jan 5th
15:20:23 <slaweq> and failure which I saw was from Jan 4th
15:20:35 <slaweq> so now we should be good with that issue finally
15:21:03 <otherwiseguy> ah, yeah.
15:21:05 * otherwiseguy crosses fingers
15:21:33 <ralonsoh> the problem, I think, is that this is not working with wsgi
15:21:44 <ralonsoh> because we don't call "post_fork_initialize"
15:21:58 <ralonsoh> but I think lucas is investigating this
15:22:24 <slaweq> yes, he is working on that issue with uwsgi
15:22:34 <otherwiseguy> I remember the functional test base manually calling post_fork_initialize?
15:23:36 <otherwiseguy> https://github.com/openstack/neutron/blob/f21b8950f8a51e81e389543fb482cc6cf445b882/neutron/tests/functional/base.py#L297
15:23:38 <ralonsoh> yes, in _start_ovsdb_server_and_idls
15:23:43 <ralonsoh> exactly
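The post_fork_initialize discussion above can be sketched as a toy pattern: per-worker setup that normally runs after a fork has to be invoked manually when no fork happens (as under uwsgi, or in the functional test process). All names here (FakeMechDriver, start_ovsdb_server_and_idls) are illustrative stand-ins, not Neutron's actual API.

```python
class FakeMechDriver:
    """Toy stand-in for a mechanism driver with per-process setup."""

    def __init__(self):
        self.idl_connected = False

    def post_fork_initialize(self):
        # In the real driver this (re)connects the OVSDB IDL in each
        # worker after forking; here we just record that it ran.
        self.idl_connected = True


def start_ovsdb_server_and_idls(driver):
    # Mirrors what the functional test base has to do: call the
    # post-fork hook explicitly, because the test process never forks
    # API workers (the situation described above for wsgi).
    driver.post_fork_initialize()
    return driver.idl_connected
```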
15:25:36 <slaweq> ok, let's move on
15:25:44 <slaweq> I also found issue with neutron.tests.functional.agent.common.test_ovs_lib.BaseOVSTestCase.test_update_minimum_bandwidth_queue_no_qos_no_queue
15:25:48 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_4e0/769880/2/check/neutron-functional-with-uwsgi/4e09826/testr_results.html
15:26:10 <slaweq> did You see such failures already?
15:26:16 <otherwiseguy> ralonsoh and I talked about this one I believe.
15:26:20 <ralonsoh> yes
15:26:21 <ralonsoh> one sec
15:26:35 <otherwiseguy> I think we discovered that two tests were using the same port name and maybe that was causing an issue?
15:26:39 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/770154
15:26:48 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/769975
15:26:54 <slaweq> ahh, right
15:26:58 <ralonsoh> both patches should help
15:26:59 <slaweq> I saw this patch today
15:27:16 <slaweq> both already merged
15:27:21 <slaweq> so we should be ok with those
15:27:26 <slaweq> thx ralonsoh
15:27:30 <ralonsoh> yw!
15:27:46 <otherwiseguy> yay ralonsoh :)
15:27:57 <slaweq> and that's all regarding the functional job
15:28:00 <slaweq> now fullstack
15:28:05 <slaweq> here I found one issue
15:28:10 <slaweq> with neutron.tests.fullstack.test_qos.TestMinBwQoSOvs.test_min_bw_qos_port_removed
15:28:17 <slaweq> and I saw it twice:
15:28:21 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_18a/740569/2/check/neutron-fullstack-with-uwsgi/18a1d60/testr_results.html
15:28:24 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f87/749012/15/check/neutron-fullstack-with-uwsgi/f87df94/testr_results.html
15:29:24 <ralonsoh> I'll take a look at this
15:29:33 <slaweq> ralonsoh: thx
15:29:39 <ralonsoh> at least I'll add some logs to print the qoses and queues
15:29:51 <slaweq> in the logs there is a RowNotFound error: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_f87/749012/15/check/neutron-fullstack-with-uwsgi/f87df94/controller/logs/dsvm-fullstack-logs/TestBwLimitQoSOvs.test_bw_limit_qos_port_removed_egress_.txt
15:30:24 <ralonsoh> maybe I need to do the same as in FT, add a waitevent
15:30:26 <slaweq> in both cases there is the same error
15:30:35 <slaweq> ralonsoh: maybe
15:30:41 <ralonsoh> perfect, that's "good"
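The "waitevent" idea mentioned above can be illustrated with a minimal, self-contained sketch: instead of reading a row immediately after triggering an operation (and risking RowNotFound), the test blocks until a notification for the expected row arrives. This uses plain threading.Event and only sketches the pattern; it is not ovsdbapp's or Neutron's actual event classes.

```python
import threading


class RowCreatedWaitEvent:
    """Block a test until a matching row (e.g. a QoS queue) appears."""

    def __init__(self, match_name):
        self.match_name = match_name
        self._event = threading.Event()

    def notify(self, row_name):
        # Called from the (simulated) DB-notification thread whenever
        # a row is created; unblocks waiters on a name match.
        if row_name == self.match_name:
            self._event.set()

    def wait(self, timeout=5):
        # Returns True if the row showed up within the timeout.
        return self._event.wait(timeout)
```

A test would register such an event before triggering the operation under test, then call wait() rather than polling the DB and hitting RowNotFound on slow nodes.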
15:30:57 <slaweq> #action ralonsoh will check fullstack test_min_bw_qos_port_removed issues
15:31:11 <slaweq> thank You
15:31:16 <slaweq> #topic Tempest/Scenario
15:31:29 <slaweq> here I found just one issue in neutron-tempest-plugin-scenario-ovn
15:31:36 <slaweq> https://bfc2304b36c89dd5efde-d71f4126f88f4263fd488933444cea49.ssl.cf1.rackcdn.com/740569/2/check/neutron-tempest-plugin-scenario-ovn/026535a/testr_results.html
15:32:24 <slaweq> but I saw this issue only once so far
15:32:32 <slaweq> did You see it too maybe?
15:32:48 <ralonsoh> no
15:33:33 <otherwiseguy> I suppose I can take a look at it.
15:33:51 <slaweq> otherwiseguy: thx a lot
15:34:09 <slaweq> I will report that in LP and I will give You link to LP later
15:34:21 <otherwiseguy> I can't just keep complaining about CI and not fix things I suppose. :p
15:34:34 <slaweq> otherwiseguy: everyone is doing that :P
15:34:57 <slaweq> thx a lot for Your help, it's really appreciated :)
15:35:25 <slaweq> ok, those are all the issues regarding scenario jobs for today
15:35:33 <slaweq> those jobs seem to be pretty stable recently IMO
15:35:38 <slaweq> let's move on
15:35:45 <slaweq> #topic Periodic
15:36:13 <slaweq> I noticed that the neutron-ovn-tempest-ovs-master-fedora periodic job has been failing 100% of the time for a few days
15:36:21 <slaweq> I opened bug https://bugs.launchpad.net/neutron/+bug/1911128
15:36:22 <openstack> Launchpad bug 1911128 in neutron "Neutron with ovn driver failed to start on Fedora" [Critical,Confirmed]
15:37:03 <slaweq> otherwiseguy: can You maybe take a look at that one? :)
15:37:24 <otherwiseguy> slaweq: sure :)
15:38:34 <slaweq> it looks to me like maybe ovn isn't started at all there
15:38:36 <slaweq> idk
15:38:45 <slaweq> but it's failing like that every day on the fedora job
15:38:49 <slaweq> thx otherwiseguy
15:38:52 <otherwiseguy> yeah: CRITICAL neutron [None req-4c1185cb-214e-4848-91b8-ea3b529f1d30 None None] Unhandled error: neutron_lib.callbacks.exceptions.CallbackFailure: Callback neutron.plugins.ml2.drivers.ovn.mech_driver.mech_driver.OVNMechanismDriver.pre_fork_initialize-627113 failed with "Could not retrieve schema from ssl:10.4.70.225:6641"
15:39:00 <otherwiseguy> that doesn't seem good. :p
15:39:11 <slaweq> #action otherwiseguy to check fedora ovn periodic job issue
15:39:37 <slaweq> ok, those are all the ci related things from me for today
15:39:45 <slaweq> do You have anything else You want to discuss today?
15:39:53 <ralonsoh> nope
15:40:16 <bcafarel> stable ci should work better soon, seeing the series of "drop l-c" patches appearing in #openstack-neutron :)
15:40:30 <slaweq> yeah, I saw it :)
15:40:37 <slaweq> thx ralonsoh and bcafarel for sending them
15:40:42 <ralonsoh> yw
15:40:54 <slaweq> thx for attending the meeting
15:40:58 <slaweq> and see You online
15:40:58 <ralonsoh> bye
15:40:59 <bcafarel> and lajoskatona too
15:41:01 <slaweq> o/
15:41:02 <bcafarel> o/
15:41:03 <slaweq> #endmeeting