15:00:54 #startmeeting neutron_ci
15:00:54 Meeting started Tue Jan 16 15:00:54 2024 UTC and is due to finish in 60 minutes. The chair is ykarel. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:54 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:54 The meeting name has been set to 'neutron_ci'
15:00:57 \o
15:01:00 hello wncslln. Looking at the neutron bug list there are 22 low-hanging-fruit bugs https://bugs.launchpad.net/neutron/+bugs?field.tag=low-hanging-fruit
15:01:04 bcafarel, lajoskatona, mlavalle, mtomaska, ralonsoh, ykarel, jlibosva, elvira
15:01:09 hi
15:01:15 o/
15:01:34 o/
15:01:38 o/
15:02:46 let's start with the topics
15:02:47 #topic Actions from previous meetings
15:02:54 ralonsoh to push patch for pyroute command handling
15:03:26 sorry, I have the patch but didn't push it
15:03:30 it is not working locally
15:04:07 ok np, thx for looking into it
15:04:24 ralonsoh to check if functional test failures are related to the backport
15:04:43 yes, the errors are legit
15:04:49 I'm going to stop these backports
15:04:57 but I still don't know the cause
15:05:15 once investigated, if I can solve it, I'll re-propose these patches
15:05:35 thx much ralonsoh
15:05:51 slaweq to check with the CirrOS guy to see if this is some known issue related to the kernel panic trace
15:06:11 slaweq is not around but he already shared his findings
15:06:46 he checked both ChatGPT and a CirrOS guy :)
15:06:47 The kernel panic message you provided indicates a general protection fault during the boot process of the Cirros operating system. The error occurs in the __kmalloc function, which is a kernel memory allocation function. The stack trace also shows that the issue is related to RSA key handling and signature verification.
15:06:48 Here are a few suggestions to troubleshoot and resolve the issue:
15:06:48 Memory Issues:
15:06:48 Make sure that the virtual machine has sufficient memory allocated. If possible, try increasing the allocated memory and see if the issue persists.
15:07:06 So it seems to me that it was an issue with not enough memory given to the guest OS. According to info from the "Cirros guy" who I also asked about it, we should now give 256 MB of memory to the flavor used for Cirros. See https://github.com/cirros-dev/cirros/issues/53 for more details.
15:07:11 you mean use a larger flavor?
15:07:43 that's what the suggestion is
15:08:07 but looking at that issue link I see that for bios/x86_64 128 MB is enough, and that's what we are using in these jobs
15:08:13 for that, the number of tests running in parallel should perhaps also be decreased
15:08:22 ah, ok
15:09:30 considering this point I am not sure how much it is worth updating the flavor in these jobs, as we have seen that issue rarely (an illustrative sketch of where such a change would land follows after the log)
15:10:01 there is a mention that for UEFI it would need 256 MB, but looking at the logs it doesn't seem we use that
15:11:56 so I think we can keep a watch, and if we see it more frequently we can attempt that change, wdyt?
15:12:45 +1
15:12:48 +1
15:12:50 do we have an LP bug?
15:13:28 no, we don't
15:13:41 seen just once, so I haven't reported it yet
15:14:03 I'll ping slaweq to do this because he has all the information
15:14:10 k thx
15:14:29 #topic Stable branches
15:14:39 bcafarel, any update here?
15:15:22 all looking good from what I checked! and now, as haleyb commented, we have one less branch to worry about with ussuri EOL quite close
15:15:45 that's good
15:16:20 just one thing I noticed: in xena some linuxbridge job is timing out, seen in the check and periodic queues, not consistently though
15:16:50 possibly just slow nodes
15:17:42 #topic Stadium projects
15:17:51 all green in periodic-weekly
15:17:58 lajoskatona anything else to share here?
15:18:06 all is green as I checked
15:18:20 perhaps I'll mention here the py311 addition to setup.cfg:
15:18:39 https://review.opendev.org/q/topic:%22py3.11%22+status:open,25
15:19:02 Merged openstack/neutron stable/wallaby: "ebtables-nft" MAC rule deletion failing https://review.opendev.org/c/openstack/neutron/+/905538
15:19:11 thx lajoskatona
15:19:18 I added the extra line to the stadiums, so please check the list when you have a few extra minutes and let's add py311 to the networking projects' coverage :-) (a setup.cfg example follows after the log)
15:19:33 ++
15:19:35 that's it for stadiums
15:19:53 #topic Rechecks
15:20:25 quite a good number of patches merged, with a few rechecks
15:20:43 1 bare recheck out of 7, let's keep doing it
15:20:54 #topic fullstack/functional
15:21:03 test_delete_route
15:21:09 https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_060/905105/1/gate/neutron-functional-with-uwsgi/0603eb7/testr_results.html
15:21:25 so an older issue which got noticed again https://bugs.launchpad.net/neutron/+bug/1988037
15:21:29 I've added some extra logs
15:21:32 Rodolfo pushed a patch to collect more info if it fails https://review.opendev.org/c/openstack/neutron/+/905291
15:21:42 thx ralonsoh
15:22:04 next one
15:22:05 ovsdbapp.exceptions.TimeoutException: Commands ... exceeded timeout 10 seconds, cause: Result queue is empty
15:22:11 https://ce5c304e9b3a7699cf9c-f4e1e757f6e1e9727e757a7fa4b37fa7.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/d9449f2/testr_results.html
15:22:16 https://a08072a22aec7dbb4aca-289b34983b13ece36a1a19c591f4d0ce.ssl.cf1.rackcdn.com/905332/1/check/neutron-functional-with-uwsgi/8770081/testr_results.html
15:22:44 do you recall seeing a similar issue?
15:23:23 wondering if it is just due to a slow node
15:23:54 I have never seen anything similar, to tell the truth
15:24:10 also, looking into the dstat output around the time of the failure I noticed some CPU wait for I/O
15:25:28 there is indeed a 10-second gap in the server
15:25:41 https://ce5c304e9b3a7699cf9c-f4e1e757f6e1e9727e757a7fa4b37fa7.ssl.cf2.rackcdn.com/periodic/opendev.org/openstack/neutron/master/neutron-functional-with-uwsgi-fips/d9449f2/controller/logs/openvswitch/ovs-vswitchd_log.txt
15:26:03 2024-01-12T03:29:08.878Z|10296|bridge|INFO|bridge br-test-1fa6b16: using datapath ID 0000ded23192b84f
15:26:03 2024-01-12T03:29:08.878Z|10297|connmgr|INFO|br-test-1fa6b16: added service controller "punix:/var/run/openvswitch/br-test-1fa6b16.mgmt"
15:26:03 2024-01-12T03:29:18.814Z|10298|connmgr|INFO|test-bre94359f5<->tcp:127.0.0.1:16002: 9 flow_mods 10 s ago (9 adds)
15:26:20 but most probably because the test was waiting
15:28:44 not sure there is anything we can do for this issue
15:30:26 for now I will just monitor, and if I see it again I will report an LP bug to track the further investigation (a note on where the 10-second timeout comes from follows after the log)
15:30:46 #topic Periodic
15:30:56 OVN source (main branch) jobs broken
15:31:12 #link https://bugs.launchpad.net/neutron/+bug/2049488
15:31:18 Fix https://review.opendev.org/c/openstack/neutron/+/905645
15:31:34 already merged, so jobs should be back to green now
15:31:53 that's it on failures
15:31:58 #topic Grafana
15:32:08 let's have a quick look at Grafana
15:32:16 https://grafana.opendev.org/d/f913631585/neutron-failure-rate
15:33:13 looks fine, doesn't it?
15:33:50 yes, looks good, just the check queue has some spikes
15:33:55 I think so
15:34:23 and when I was checking failures there, they were mostly patch-related or ones we already discussed
15:34:26 so all good here
15:34:39 #topic On Demand
15:34:48 anything you would like to raise here?
15:35:12 no thanks
15:35:33 not from me
15:36:31 ok, in that case let's close and give everyone 24 minutes back
15:36:32 thx all
15:36:40 #endmeeting
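
On the CirrOS kernel panic discussed under the action items: the meeting agreed to keep monitoring rather than change anything, so the fragment below is only an illustrative sketch of where such a change would land, assuming the jobs select the guest flavor through tempest's standard [compute] options. The flavor IDs shown are hypothetical placeholders.

    # tempest.conf (illustration only; not a change agreed in the meeting)
    [compute]
    # point the tests at a flavor created with 256 MB RAM instead of the
    # 128 MB currently used for bios/x86_64 CirrOS guests
    flavor_ref = <id-of-a-256MB-flavor>      # hypothetical flavor ID
    flavor_ref_alt = <id-of-a-256MB-flavor>  # hypothetical flavor ID

As noted in the discussion, bumping the flavor RAM would likely also mean lowering the number of tests run in parallel so the test node does not run out of memory.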
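On the py3.11 addition to setup.cfg mentioned under stadium projects: for the networking projects this is the extra trove classifier line. A minimal sketch follows, with the surrounding classifier list shortened since it varies per project and branch.

    # setup.cfg fragment (classifier list shortened for illustration)
    [metadata]
    classifier =
        Programming Language :: Python :: 3
        Programming Language :: Python :: 3.10
        Programming Language :: Python :: 3.11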
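On the ovsdbapp TimeoutException seen in the functional jobs: the 10 seconds in the error message matches the default of neutron's ovsdb_timeout option for the OVS agent, though the meeting did not confirm that link, so treat it as an assumption. A reference sketch of where that knob lives:

    # openvswitch_agent.ini (reference only; the meeting decided to keep
    # monitoring rather than touch this)
    [OVS]
    # default is 10 seconds; raising it would mostly hide slow-node I/O waits
    ovsdb_timeout = 10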