16:00:15 #startmeeting neutron_ci
16:00:16 Meeting started Tue Sep 10 16:00:15 2019 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:18 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:20 welcome again
16:00:20 The meeting name has been set to 'neutron_ci'
16:00:24 hi
16:01:15 mlavalle will not be here today
16:01:31 but let's wait a few more minutes for njohnston, bcafarel, and others
16:01:39 * slaweq will be back in 2 minutes
16:01:52 o/ sorry, did not see the time
16:02:35 really quickly, I wanted to point out that some of octavia's CI problems were due to an ubuntu kernel bug in ovs that was causing the kernel to panic. It's possible neutron and others are seeing that too (if the job is retried)
16:02:55 that bug has been fixed; we need to update our mirrors and rebuild images (complicated by a fileserver outage the other day that we are still trying to recover from)
16:03:16 ok, I'm back
16:03:22 #link https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1842447
16:03:23 Launchpad bug 1842447 in linux (Ubuntu) "Kernel Panic with linux-image-4.15.0-60-generic when specifying nameserver in docker-compose" [Undecided,Confirmed]
16:03:25 fyi
16:03:47 clarkb: johnsom: thx for the heads up
16:03:49 I was going to ask after my meeting, as we are not seeing new images today
16:03:55 but I haven't seen it in neutron jobs (yet)
16:04:34 but we will keep an eye on it for sure :)
16:04:47 ok, let's get going with the meeting agenda
16:04:49 #topic Actions from previous meetings
16:04:56 first one
16:04:57 mlavalle to continue investigating router migrations issue
16:05:06 he told me that he is still investigating
16:05:15 so I will just assign it to him for next week too
16:05:19 #action mlavalle to continue investigating router migrations issue
16:05:26 next one
16:05:28 slaweq to check reasons of failures of neutron-tempest-plugin-scenario-openvswitch job
16:05:42 I didn't have time
but it looks much better now, so I hope we will be good with this job :)
16:05:56 next one
16:05:58 ralonsoh to report bug and investigate failing test_get_devices_info_veth_different_namespaces functional test
16:06:06 o/ sorry I am late
16:06:18 slaweq, that's solved in your patch
16:06:25 ahh, right :)
16:06:27 ok
16:06:32 thx ralonsoh
16:06:36 np!
16:06:42 ok, next one
16:06:43 slaweq to check reason of failure of neutron.tests.functional.agent.test_firewall.FirewallTestCase.test_rule_ordering_correct
16:06:50 it was the issue which should be fixed with https://review.opendev.org/#/c/679428/
16:06:55 so nothing more to check there
16:07:03 and the last one
16:07:05 slaweq to add mariadb periodic job
16:07:13 I proposed https://review.opendev.org/681202
16:07:24 but it also requires https://review.opendev.org/#/c/681200/1 and https://review.opendev.org/#/c/681201/
16:07:55 +1 to this
16:08:06 ralonsoh: let's first check if that will work as expected :)
16:08:17 I will continue this work during next week
16:08:34 ok, let's move on to the next topic then
16:08:35 #topic Stadium projects
16:08:41 Python 3 migration
16:08:43 Stadium projects etherpad: https://etherpad.openstack.org/p/neutron_stadium_python3_status
16:08:51 I think we already talked about it on the neutron meeting
16:08:59 +1
16:09:06 njohnston: anything You want to add about it?
16:09:36 slaweq: just that I have not had a chance to check in with yamamoto about midonet
16:10:20 that's all
16:10:21 ok, yes, midonet is the last "almost not touched" one, right?
16:10:26 yes
16:10:42 I rarely see yamamoto online, it seems
16:10:47 we still have some time, but IMO we should finish this work before the end of this year
16:10:57 as python 2 is EOL on 1.1.2020
16:11:34 ok, and the second stadium projects topic
16:11:36 tempest-plugins migration
16:11:38 Etherpad: https://etherpad.openstack.org/p/neutron_stadium_move_to_tempest_plugin_repo
16:11:52 I know that tidwellr is still making some progress with neutron-dynamic-routing
16:11:59 tidwellr: do You need help on this?
16:12:06 same state as last week I believe; vpnaas is -W from mlavalle
16:12:51 tidwellr did mention on his review that he was seeing worrisome errors from neutron https://review.opendev.org/#/c/652099
16:13:59 yes, I saw 1 error on this patch today
16:14:18 ok, tidwellr: if You need any help, please ping me :)
16:14:41 anything else You want to talk about regarding stadium projects CI today?
16:15:04 slaweq: I'm looking into it, but I could use some help
16:15:29 I thought it would be simple, but these issues aren't seeming so simple anymore :)
16:15:32 tidwellr: ok, I will try to look into it this week too
16:15:47 nothing else from me
16:16:01 and the interesting question is why it's not failing in the old, legacy jobs
16:17:37 slaweq: I'm seeing intermittent failures where BGP peering simply doesn't start, but then on a recheck everything begins peering and the test in question passes
16:18:07 but we don't see this behavior with the legacy jobs in neutron-dynamic-routing
16:18:09 BGP peering is something run in a docker container, right?
16:18:23 the BGP agent peers with a docker container
16:18:54 ok, I will take a look into the logs
16:18:58 the BGP agent will always be the one to initiate the peering
16:19:02 maybe I will find something
16:19:10 good luck :)
16:19:24 thx tidwellr :)
16:19:29 and thx for working on this
16:19:33 ok, let's move on
16:19:37 next topic
16:19:39 #topic Grafana
16:19:45 #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:19:57 (sorry that I forgot to send it at the beginning)
16:20:56 currently we have 2 main issues in CI
16:21:02 1. problems with rally
16:21:32 which is failing 100% of the time due to a jsonschema versions mismatch
16:22:10 2. neutron-tempest-iptables_hybrid-fedora, which is not even starting (RETRY LIMIT) 100% of the time
16:22:31 because we were running it on F28, which is now EOL, and there are no repositories for it anymore
16:22:43 for both cases there are patches to fix them
16:22:58 for the rally issue we are waiting for a new rally release
16:23:25 one question, if possible (about the Fedora issue)
16:23:32 ralonsoh: sure
16:23:54 why, instead of using F29, don't we try F30?
16:24:01 slaweq: for the Fedora issue, is this DNM patch the fix or is there a different one?
16:24:03 https://review.opendev.org/#/c/681213/
16:24:25 yes, why don't we switch to F30?
16:24:32 instead of F29
16:24:54 maybe Brian can help us (he is not here now)
16:24:54 my DNM patch was sent only to test if haleyb's change to devstack will fix this job
16:25:04 haleyb, yes he is!
16:25:07 but it's not the fix for the issue
16:25:13 ah ok.
16:25:18 the fix is proposed by haleyb to the devstack repo
16:25:24 https://review.opendev.org/#/c/662529/5
16:25:37 re f30, I don't know that it is quite ready yet. We are adding it in nowish iirc
16:25:44 according to a comment from ianw there, he had some issues with F30
16:25:45 (but once it is ready you should feel free to test on it)
16:25:48 and doug is looking at the barbican failure today
16:25:51 that's why we are changing to F29 now
16:25:57 ooook, thanks!!
16:26:04 haleyb, o/
16:26:08 i didn't see any f30 support, so stopped at f29
16:26:16 perfect
16:27:22 ralonsoh: njohnston: is that clear for You now?
16:27:33 it is, for sure
16:27:34 slaweq: yes, thanks
16:28:00 great :)
16:28:10 haleyb: thx for fixing this
16:28:34 np, wish everything had merged sooner, started as a periodic ovn job failure
16:28:49 but also, to unblock our gates ASAP, I sent a patch today: https://review.opendev.org/#/c/681186/
16:29:09 later we will be able to revert it when the proper fixes in the external repos land
16:29:31 ok
16:29:41 other than those 2 issues, we are quite fine
16:29:56 even the functional/fullstack jobs are in quite good shape this week
16:31:42 and that's all from me according to grafana
16:31:53 do You have anything else about grafana today?
16:33:07 nope
16:33:09 ok, let's move on
16:33:11 #topic fullstack/functional
16:33:30 today I found one new (for me) issue in functional tests
16:33:37 and it happened at least twice
16:33:47 it was on different tests
16:33:54 but the same error in the test's log
16:34:00 https://0668011f33af6364883c-c555fae2d8c498523cc4b2c363541725.ssl.cf1.rackcdn.com/679852/11/gate/neutron-functional/6b7c424/controller/logs/dsvm-functional-logs/neutron.tests.functional.agent.linux.test_linuxbridge_arp_protect.LinuxBridgeARPSpoofTestCase.test_arp_protection_port_security_disabled.txt.gz
16:34:02 or
16:34:06 https://148a66b404dde523de26-17406e3478c64e603d8ff3ea0aac16c8.ssl.cf5.rackcdn.com/680393/1/check/neutron-functional-python27/59e721e/controller/logs/dsvm-functional-logs/neutron.tests.functional.agent.linux.test_l3_tc_lib.TcLibTestCase.test_clear_all_filters.txt.gz
16:34:14 did You see something like that before?
16:34:31 no, that is a new one for me... very strange...
16:34:51 this can be (I guess) because you are deleting the qos registers
16:35:03 tbh this ovsdb error may be a red herring, as it could happen during cleanup
16:35:12 and the rules are not there anymore
16:35:25 but that's the only thing which was common in those 2 failed tests for me
16:35:34 yes, this could happen
16:35:41 I can check this tomorrow
16:35:49 thx ralonsoh
16:35:51 do you have a bug ref?
16:35:55 ralonsoh: no
16:36:02 ok, I'll do it
16:36:07 ok, thx a lot
16:36:22 #action ralonsoh to report bug and check issue with ovsdb errors in functional tests
16:36:49 and I also found (happened once) an issue with killing an external process during cleanup
16:36:52 I reported a new bug here: https://bugs.launchpad.net/neutron/+bug/1843418 - not very urgent
16:36:54 Launchpad bug 1843418 in neutron "Functional tests shouldn't fail if kill command will have "no such process" during cleanup" [Medium,In progress] - Assigned to Slawek Kaplonski (slaweq)
16:37:19 I know that bcafarel and ralonsoh already reviewed the proposed patch
16:37:30 I didn't have time to check those reviews yet
16:37:50 slaweq, I propose to implement an os.kill() method with and without privsep
16:37:56 if root is True/False
16:38:21 there are several places in Neutron where a shell to execute "kill" is spawned
16:38:46 ralonsoh: that is a good idea
16:38:54 I will do it this way
16:39:00 perfect!
16:39:22 that will be nice :)
16:39:37 very good +1
16:39:38 thx a lot for this idea and for reviewing the patch
16:40:20 that's all regarding functional/fullstack jobs from me
16:40:26 anything else You want to add?
16:41:47 ok, if not, that was all from me for today
16:42:03 as I don't have anything new regarding scenario jobs
16:42:14 do You have anything else You want to talk about today?
16:42:26 if not, I think I can give You back about 15 minutes :)
16:43:15 o/
16:43:23 yay
16:43:24 o/
16:43:34 ok, let's finish then
16:43:35 bye
16:43:37 thx for attending
16:43:41 o/
16:43:45 #endmeeting
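
[Editor's note: the helper ralonsoh proposed at 16:37:50 — call os.kill() directly instead of spawning a shell "kill", routing through privsep when root is required, and tolerating the "no such process" race from bug 1843418 — could look roughly like the sketch below. The names `kill_process` and `privileged_kill` are illustrative, not Neutron's actual API; the real implementation would decorate the privileged variant with oslo.privsep's entrypoint decorator.]

```python
import os
import signal


def privileged_kill(pid, sig):
    # Stand-in for a privsep-wrapped kill; in real Neutron code this
    # would be decorated with oslo.privsep's @entrypoint so it runs
    # with root privileges in the privsep daemon.
    os.kill(pid, sig)


def kill_process(pid, sig=signal.SIGKILL, run_as_root=False):
    """Send `sig` to `pid`, ignoring processes that already exited.

    Avoids spawning a shell to run "kill", and swallows the
    "no such process" error that can race with cleanup.
    """
    try:
        if run_as_root:
            privileged_kill(pid, sig)
        else:
            os.kill(pid, sig)
    except ProcessLookupError:
        # The process died on its own before cleanup ran; treat that
        # as success instead of failing the test (bug 1843418).
        pass
```

This keeps the calling code identical whether or not root is needed, and callers no longer have to parse "kill" stderr to distinguish a harmless already-dead process from a real failure.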