15:00:14 <slaweq> #startmeeting neutron_ci
15:00:16 <openstack> Meeting started Wed Apr 15 15:00:14 2020 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:17 <slaweq> hi
15:00:17 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:20 <openstack> The meeting name has been set to 'neutron_ci'
15:00:39 <njohnston> o/
15:01:19 <slaweq> hi njohnston
15:01:26 <ralonsoh> hi
15:01:27 <bcafarel> o/
15:01:31 <njohnston> hello, how are you?  Did you have a good dyngus day?
15:01:33 <slaweq> hi
15:01:39 <slaweq> njohnston: yes, thx
15:01:42 <slaweq> I had
15:01:49 <slaweq> do You also have it in the US?
15:02:12 <njohnston> my father's family is from Buffalo, New York which has one of the most active dyngus day celebrations in the US
15:02:23 <slaweq> nice :)
15:02:37 <lajoskatona> Hi
15:03:18 <slaweq> I was splashing water on the kids from the neighborhood from the window on the first floor
15:03:21 <slaweq> it was fun :)
15:03:24 <slaweq> hi lajoskatona :)
15:03:31 <slaweq> ok, lets start meeting
15:03:43 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:03:52 <slaweq> #topic Actions from previous meetings
15:04:01 <slaweq> first action
15:04:02 <slaweq> slaweq to continue investigation of fullstack SG test broken pipe failures
15:04:31 <slaweq> I found out that in some cases it may be a race condition between starting the client and reading from it
15:05:28 <slaweq> so the solution IMHO is to handle BrokenPipeError in the same way as e.g. RuntimeError
15:05:35 <njohnston> makes sense
15:05:40 <slaweq> patch is here https://review.opendev.org/#/c/718781
15:05:49 <slaweq> please review if You will have some time
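A minimal sketch of the fix described above, assuming a hypothetical fullstack helper that exposes a read_stdout() method which can fail while the freshly spawned client process is still starting up; the real change is in https://review.opendev.org/#/c/718781, and the names here are illustrative only:

    import time


    def read_when_ready(client, attempts=10, interval=1):
        # Tolerate the race between starting the client and reading from it.
        # client.read_stdout() is a hypothetical call; the point is that
        # BrokenPipeError gets the same retry treatment the helper already
        # gives RuntimeError, instead of failing the test immediately.
        for _ in range(attempts):
            try:
                return client.read_stdout()
            except (RuntimeError, BrokenPipeError):
                # Client process not ready yet -- wait a bit and retry.
                time.sleep(interval)
        raise RuntimeError("client did not become readable in time")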
15:06:06 <slaweq> next one
15:06:08 <slaweq> slaweq to check server termination on multicast test
15:06:17 <slaweq> I finally dug into it
15:07:12 <slaweq> and I found out that in the tempest "all-plugin" env there is a 1200 second timeout set for each test
15:07:38 <slaweq> and in the case of our tests which require the advanced image that may simply not be enough
15:07:51 <slaweq> so we are hitting this timeout e.g. during the cleanup phase
15:08:11 <bcafarel> where do we ask again for more powerful hardware? ;)
15:08:13 <slaweq> I proposed patch https://review.opendev.org/#/c/719927/ to increase this timeout for some tests only
15:08:41 <slaweq> but as I talked with gmann today, it seems that a better way would be to do it like e.g. ironic did: https://opendev.org/openstack/ironic/src/branch/master/zuul.d/ironic-jobs.yaml#L414
15:08:51 <slaweq> and set a higher timeout globally, "per job"
15:09:02 <slaweq> so I will update my patch and do it that way
15:09:18 <slaweq> bcafarel: I think it's more a matter of nested virtualisation
15:09:26 <slaweq> it would be much better if that finally worked fine
15:09:49 <bcafarel> oh yes
15:10:15 <slaweq> anyway, that's all about that one from me
15:10:21 <slaweq> questions/comments?
15:10:25 <lajoskatona> bcafarel: from fortnebula there is an option to fetch bigger VMs
15:10:43 <lajoskatona> but that is just temporary for testing things
15:10:58 <bcafarel> slaweq: so you will update to use a longer overall timeout? (which may help for later "long" tests)
15:11:00 <lajoskatona> at least it was when I last needed more MEM
15:11:09 <bcafarel> nice I did not know that
15:11:22 <bcafarel> though hopefully just larger timeouts will be enough
15:11:52 <slaweq> bcafarel: yes, I will update it to set a longer per-test timeout in the job definition
15:12:03 <slaweq> it's not "job's timeout" but "test's timeout"
15:12:26 <bcafarel> ok sounds good
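A rough sketch of what that per-test timeout means in practice, assuming the job-level setting (as in the linked ironic jobs file) ultimately lands in the standard OS_TEST_TIMEOUT environment variable; this mirrors the usual tempest/oslotest pattern but is illustrative, not the exact code:

    import os

    import fixtures
    import testtools


    class TimeLimitedTestCase(testtools.TestCase):
        # Illustrative base class: every test gets its own wall-clock budget.

        def setUp(self):
            super().setUp()
            try:
                timeout = int(os.environ.get('OS_TEST_TIMEOUT', 0))
            except ValueError:
                timeout = 0
            if timeout > 0:
                # Abort any single test (setup, body and cleanups) that runs
                # longer than this many seconds -- the 1200s ceiling above.
                self.useFixture(fixtures.Timeout(timeout, gentle=True))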
15:12:57 <slaweq> ok, next one
15:12:59 <slaweq> slaweq to ping yamamoto about midonet gate problems
15:13:05 <slaweq> I sent him an email last week
15:13:10 <slaweq> but he didn't reply
15:13:42 <slaweq> I will try to catch him during drivers meeting because for now networking-midonet's gate is still broken
15:14:20 <slaweq> #action slaweq to ping yamamoto about midonet gate problems
15:14:23 <lajoskatona> I looked for him as well to ask about taas, as reviews have stalled there
15:14:54 <slaweq> he's not very active in the community these days
15:15:38 <slaweq> and I think he is the only "active" maintainer of networking-midonet
15:16:21 <slaweq> ok, last one from previous week
15:16:23 <slaweq> bcafarel to check and update stable branches grafana dashboards
15:17:26 <bcafarel> done in https://review.opendev.org/#/c/718676/ - in the end it was almost a full rewrite
15:17:38 <slaweq> thx bcafarel
15:17:44 <bcafarel> stable dashboards are quite close to neutron master ones now:
15:17:53 <bcafarel> http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:18:00 <bcafarel> http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:18:16 <njohnston> thanks bcafarel!
15:19:09 <slaweq> I added links to meeting agenda
15:19:31 <slaweq> ok, I think that this is all regarding actions from last week
15:19:34 <slaweq> #topic Stadium projects
15:19:46 <slaweq> any updates about zuulv3 ?
15:21:19 <slaweq> ok, I guess this silence means "no updates" :)
15:21:35 <njohnston> nothing from me
15:21:36 <lajoskatona> not from me at least
15:21:46 <njohnston> networking-odl still has https://review.opendev.org/#/c/672925/ pending
15:21:56 <slaweq> ok
15:22:04 <njohnston> and networking-midonet still has nothing IIUC
15:22:10 <slaweq> I also don't have any updates about IPv6-only testing
15:22:34 <slaweq> regarding stadium issues, I know only about this one with networking-midonet which we already talked about
15:22:49 <slaweq> any other issues/questions regarding stadium and ci?
15:24:08 <njohnston> nope
15:24:31 <bcafarel> not from me either
15:24:34 <ralonsoh> no
15:24:53 <slaweq> ok, lets move on
15:24:58 <slaweq> #topic Stable branches
15:25:03 <slaweq> Train dashboard: http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:25:05 <slaweq> Stein dashboard: http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:25:56 <slaweq> I don't see too much data on those dashboards so far
15:26:28 <njohnston> there rarely is
15:27:28 <ralonsoh> I need to review neutron-ovn-tempest-ovs-release, it's failing in all jobs
15:27:55 <slaweq> ralonsoh: but are You talking about stable branches or master?
15:28:02 <ralonsoh> sorry, master
15:28:06 <ralonsoh> my bad
15:28:24 <bcafarel> fullstack hopefully will be better with the backports of your fixes, slaweq
15:28:40 <slaweq> bcafarel: we will see :)
15:28:56 <slaweq> there is also this rally issue https://bugs.launchpad.net/neutron/+bug/1871596
15:28:56 <openstack> Launchpad bug 1871596 in neutron "Rally job on stable branches is failing" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
15:28:56 <bcafarel> grenade, yes, is one of the common recheck causes in stable (though luckily not that bad)
15:29:03 <slaweq> but we should be good with it now, right?
15:29:29 <bcafarel> yep andreykurilin has merged the change to fix rocky
15:29:37 <slaweq> great
15:29:40 <bcafarel> (and networking-ovn stein/rocky also)
15:29:44 <slaweq> thx lajoskatona and bcafarel for taking care of this
15:29:59 <bcafarel> just waiting for some rechecks to complete to mark it as "back in working order"!
15:29:59 <slaweq> bcafarel: so we can close this LP, right?
15:31:43 <lajoskatona> good to hear :-)
15:31:52 <slaweq> today I got a couple of +1s from zuul for various stable branches, so IMO it's fine now
15:32:02 <bcafarel> ok checking last results it does indeed look good
15:32:09 <bcafarel> one LP down
15:32:14 <slaweq> \o/
15:32:20 <lajoskatona> ok, then I will abandon my rally devstack plugin capping patch
15:32:25 <slaweq> thx lajoskatona
15:33:01 <slaweq> anything else related to stable branches for today?
15:34:12 <slaweq> ok, lets move on
15:34:14 <slaweq> #topic Grafana
15:34:42 <slaweq> I still need to update my patch https://review.opendev.org/#/c/718392/ - thx bcafarel for review
15:35:26 <bcafarel> getting to the correct position in the file will probably take longer than fixing my nit :)
15:35:42 <slaweq> ralonsoh: speaking of the ovn jobs, it seems that all of them except "slow" have rising failure rates today
15:35:50 <slaweq> so probably there is some problem with those jobs
15:36:42 <slaweq> https://e93bf74b2537cfa96a59-e4f20cff14b59b3a1c5b0d28b2b173f9.ssl.cf5.rackcdn.com/719765/3/check/neutron-ovn-tempest-ovs-release/0982c80/testr_results.html
15:37:13 <slaweq> seems that this subnetpool test is the culprit
15:37:41 <ralonsoh> I'll take a look today
15:37:53 <slaweq> ralonsoh: thx
15:38:08 <slaweq> #action ralonsoh to check ovn jobs failure
15:38:48 <slaweq> except that, I think things look pretty good this week
15:39:14 <slaweq> even the functional jobs are finally working better, thanks to maciejjozefczyk's and ralonsoh's work
15:40:56 <slaweq> anything else regarding grafana? do You see anything which worries You?
15:41:53 <maciejjozefczyk> ralonsoh, I can help if you need any :)
15:42:52 <slaweq> thx maciejjozefczyk for volunteering :)
15:43:20 <ralonsoh> thanks a lot!
15:43:38 <slaweq> ok, regarding other issues, I don't have anything really new for today
15:44:11 <slaweq> so if You don't have anything else to talk about today, I will give You some time back :)
15:44:43 <bcafarel> I won't complain that we have less to talk about in this meeting :)
15:44:51 <maciejjozefczyk> :)
15:44:52 <slaweq> bcafarel: :)
15:45:11 <njohnston> +100
15:45:41 <slaweq> ok, so thx for attending and see You online :)
15:45:42 <slaweq> o/
15:45:45 <slaweq> #endmeeting