15:00:17 #startmeeting neutron_ci
15:00:17 Meeting started Tue Jul 6 15:00:17 2021 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:17 The meeting name has been set to 'neutron_ci'
15:00:27 hi
15:00:28 hi again
15:00:30 hi (again)!
15:00:37 Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:00:39 Please open it now :)
15:01:37 ok, let's start
15:01:40 #topic Actions from previous meetings
15:01:46 amotoki to clean failing jobs in networking-odl rocky and older
15:01:50 Hi
15:01:52 hi
15:02:03 hi
15:02:27 zuul configuration errors happen in networking-odl, neutron-fwaas and -midonet, stable/rocky or older.
15:03:09 networking-odl rocky and queens have been done.
15:03:25 the others are under review, except networking-odl ocata.
15:03:52 networking-odl ocata needs to be EOL'ed as neutron ocata is already EOL. lajoskatona prepared the release patch.
15:03:59 Just for reference, the original mail: http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023321.html
15:04:21 and the patch: https://review.opendev.org/c/openstack/releases/+/799473
15:04:53 thx amotoki and lajoskatona for taking care of all of that
15:05:08 no problem
15:05:22 https://review.opendev.org/c/openstack/releases/+/799472 will also clean up newton or older unofficial branches currently under the neutron governance.
15:05:23 actually there's another release patch for older branches from elod: https://review.opendev.org/c/openstack/releases/+/799472
15:05:30 that's all from me
15:05:38 yeah, exactly as amotoki wrote it :-)
15:05:44 thx
15:05:48 ok, so next one
15:05:50 lajoskatona to start EOL process for networking-odl rocky and older
15:06:01 I assume it's already covered as well :)
15:06:12 yeah
15:06:30 thx
15:06:32 so next one
15:06:33 I haven't sent a mail, as I assume the original mail from You, slaweq, covered odl as well
15:06:35 ralonsoh to check if there is a better way to check dns nameservers in cirros
15:06:50 still testing how to do it in cirros
15:06:53 in a different way
15:06:56 ok
15:07:10 I will add it for You for next time so we don't forget
15:07:12 ok?
15:07:13 sure
15:07:19 #action ralonsoh to check if there is a better way to check dns nameservers in cirros
15:07:21 thx ralonsoh
15:07:29 and those are all the actions from last week
15:07:35 #topic Stadium projects
15:08:04 lajoskatona: any other updates/issues with stadium
15:08:10 *stadium's ci
15:08:32 The most interesting thing is the EOL of old branches, which we covered
15:08:47 cleanups++
15:09:20 and the gerrit server restart (http://lists.openstack.org/pipermail/openstack-discuss/2021-July/023434.html ) which you mentioned in the previous meeting, so after that taas will be under the openstack/ namespace
15:09:52 great
15:09:55 thx
15:09:59 I think we can move on
15:10:01 while fixing the networking-l2gw location in zuul, I noticed a lot of failures in old stadium branches. I moved those jobs to the experimental queue, for the record.
15:10:11 is it the right action?
15:10:41 amotoki: if we can't/don't have time to fix such failing jobs, then yes
15:10:52 I tried to fix them where they had simple, straightforward fixes, but otherwise I did so.
15:10:55 I think it's a good approach to move such jobs to the experimental queue
15:11:00 I am fine with it, those fixes are really time-consuming
15:11:12 thx a lot
15:11:17 if you remember the devstack-gate and pip/pip3 issues, it would be nice to check them, for example in neutron-fwaas.
15:11:37 +1, if it can be easily fixed that is nice, but for EM branches we mostly keep the lights on
15:11:55 I will try to check but I don't promise anything
15:12:39 most failing jobs still use the legacy way and it takes time :(
15:13:06 yeah, that is not something I'm really familiar with
15:13:19 :)
15:14:09 anyway, let's fix it separately :) I think we can move on
15:14:37 thx
15:14:39 so next topic
15:14:42 #topic Stable branches
15:14:53 bcafarel: anything regarding stable branches' CI in neutron?
15:15:02 IMO it has been working pretty ok recently
15:15:09 but maybe I missed some issues there
15:15:14 indeed, backports are getting in quite nicely
15:15:24 one that went back to rocky required a few rechecks
15:15:45 but nothing specific, just an array of separate tests (including compute and storage) failing
15:16:02 and that branch is low on backports so no need to dig further atm
15:16:29 on the full-support branches almost all patches are in at the moment, which is nice for the incoming stable releases :)
15:17:00 ++
15:17:03 great news
15:17:10 #topic Grafana
15:17:17 #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:18:08 in grafana the failure rates seem to be pretty ok for all of the jobs recently
15:19:50 I think we can move on, unless You see any issue in grafana and want to talk about it now
15:20:18 nothing from me
15:20:29 that looks good to me
15:20:39 ok, let's move on
15:20:41 #topic fullstack/functional
15:20:41 also, this is with the updated job names, right?
15:21:04 bcafarel: I think it is
15:21:31 regarding those jobs I have only one small update, about the fullstack issue https://bugs.launchpad.net/neutron/+bug/1933234
15:21:49 I just confirmed today, with extra logs, why this test is failing
15:22:29 basically, the router is still being processed and its ports aren't yet added to the RouterInfo.internal_ports cache
15:22:45 when the network_update RPC message comes
15:23:02 due to that, it can't find the port attached to that router on that network
15:23:11 and the router update isn't scheduled
15:23:22 good catch
15:23:28 so this is in fact a real bug, not a test issue
15:23:35 indeed, a race condition
15:23:51 how are we updating the network before adding the port to the internal cache?
15:23:51 and I don't know yet how to fix it
15:24:28 ralonsoh: so the router is being processed and reaches https://github.com/openstack/neutron/blob/3764969b82c6e7b8c74172a1ec4d230ce4ddedcc/neutron/agent/l3/router_info.py#L636
15:24:38 there the port should be added to the internal ports cache
15:24:42 right
15:24:54 but in the meantime, network_update is called: https://github.com/openstack/neutron/blob/3764969b82c6e7b8c74172a1ec4d230ce4ddedcc/neutron/agent/l3/agent.py#L602
15:25:13 ok, I see
15:25:16 and if this port isn't in the cache yet, it fails to schedule the router update
15:25:31 and there is nothing to say "stop, we are updating this..."
15:26:06 yes, I think I will add some kind of flag (lock) in https://github.com/openstack/neutron/blob/3764969b82c6e7b8c74172a1ec4d230ce4ddedcc/neutron/agent/l3/router_info.py#L1256
15:26:26 and then we can check in network_update whether it's processing that router info or not
15:26:36 if yes, we can wait a bit before looking for ports
15:26:45 but I haven't implemented anything yet
15:26:53 I have it on my todo list for tomorrow :)
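(For illustration, a minimal sketch of the kind of guard slaweq describes above. All names and signatures here, including the `_ports_ready` event and the `resync` callback, are hypothetical stand-ins, not the actual neutron code or the final fix:)

```python
import threading

# Hypothetical sketch only: a readiness flag on RouterInfo that
# network_update() can wait on, so it does not inspect a half-built
# internal_ports cache. Not the actual neutron implementation.


class RouterInfo(object):
    def __init__(self, router_id):
        self.router_id = router_id
        self.internal_ports = []
        # Cleared while process() rebuilds the cache, set again once
        # internal_ports is consistent.
        self._ports_ready = threading.Event()
        self._ports_ready.set()

    def process(self, ports):
        self._ports_ready.clear()
        try:
            # Stand-in for the real work that populates internal_ports.
            self.internal_ports = list(ports)
        finally:
            self._ports_ready.set()


def network_update(router_infos, network_id, resync):
    """Schedule an update for every router with a port on network_id."""
    for ri in router_infos:
        # Wait briefly for any in-flight process() run instead of
        # reading the cache mid-update and missing the port.
        if not ri._ports_ready.wait(timeout=5):
            # Still processing after the grace period; a real fix would
            # requeue the update here rather than drop it.
            continue
        if any(p.get('network_id') == network_id for p in ri.internal_ports):
            resync(ri)
```

(A real fix would also have to integrate with the L3 agent's router processing queue rather than use a bare callback, but the core idea, "wait a bit with looking for ports", is the same.)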
15:27:28 regarding https://bugs.launchpad.net/neutron/+bug/1930401 and the privsep timeout
15:27:49 https://review.opendev.org/c/openstack/neutron/+/794994 is failing with the old timeouts :-(
15:28:17 I started to check, but have no (quick) idea what should be there
15:28:25 but this is because what was failing is not the privsep command
15:28:35 but the daemon spawn process, right?
15:28:44 the daemon was not spawning
15:28:44 ++
15:29:26 ralonsoh: yeah
15:29:31 the timeout will prevent a deadlock during command execution
15:29:46 but it will not prevent or mitigate the problem we have here
15:31:45 so we still need to have 2 dhcp agents in those tests, which helped us a lot to work around that issue
15:32:13 yes, for now, until we know what is preventing the daemon from starting
15:32:23 k
15:32:45 ok
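(To illustrate the distinction ralonsoh draws here, a plain-subprocess analogy of the two failure modes; this is deliberately NOT oslo.privsep's API, just a sketch of why a per-call timeout only helps once the helper process is already running:)

```python
import subprocess

# Hypothetical analogy: "spawning the daemon" vs "running a command
# against an already-running daemon" are distinct failure modes, and a
# per-command timeout only protects the second one.


def call_with_timeout(cmd, timeout=10):
    try:
        proc = subprocess.Popen(cmd)
    except OSError as exc:
        # The failure seen in the gate: the daemon never spawns at all.
        # No per-command timeout can mitigate this case.
        raise RuntimeError("daemon failed to spawn: %s" % exc)
    try:
        # The case a command timeout does cover: a call that hangs
        # (e.g. a deadlock) after the daemon is already up.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        raise
```

(In this framing, the fullstack failures correspond to the spawn branch, which is why keeping two DHCP agents remains the workaround for now.)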
15:33:43 #topic Tempest/Scenario
15:33:59 here I just wanted to ask You to review https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/799648
15:34:13 it's a small patch, but it is causing issues in the tripleo-based jobs which run ovn
15:34:18 thx in advance
15:34:48 regarding issues in those jobs, I didn't find anything new worth discussing today
15:35:34 and that's all from me for today regarding CI jobs
15:35:44 do You have anything else You want to discuss?
15:35:49 I'm fine
15:36:14 same here
15:36:39 so one last thing - do You want to cancel next week's meeting, or does anyone want to chair it?
15:37:03 (I'm fine with having a "free" week)
15:37:08 +1
15:37:22 if there is nothing catastrophic, of course
15:37:30 worst case, we know where to ping people :)
15:37:38 hehehe yes
15:37:52 ok, good
15:37:59 so I will cancel next week's meeting
15:38:09 and I think we are done for today
15:38:24 thx a lot for attending and for keeping our ci up and running :)
15:38:31 it seems really good recently
15:38:41 have a great week and see You online
15:38:44 o/
15:38:46 bye
15:38:47 #endmeeting