15:00:21 <slaweq> #startmeeting neutron_ci
15:00:21 <opendevmeet> Meeting started Tue Dec 14 15:00:21 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:21 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:21 <opendevmeet> The meeting name has been set to 'neutron_ci'
15:00:21 <lajoskatona> sorry frickler :-)
15:00:29 <ralonsoh> hi again
15:00:33 <slaweq> welcome to another meeting :)
15:00:36 <lajoskatona> frickler: I'll try to find some time to check the trunk discussion
15:00:39 <lajoskatona> Hi
15:01:24 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:01:30 <obondarev> hi
15:01:34 <ykarel> hi
15:01:35 <mlavalle> o/o/
15:01:53 <slaweq> I think we can start now
15:01:57 <bcafarel> hi again
15:01:58 <slaweq> #topic Actions from previous meetings
15:02:06 <slaweq> slaweq to check missing grafana logs for periodic jobs
15:02:19 <slaweq> actually ykarel already proposed fix https://review.opendev.org/c/openstack/project-config/+/820980
15:02:23 <slaweq> thx ykarel
15:02:51 <slaweq> next one
15:02:52 <slaweq> ralonsoh to check  https://bugs.launchpad.net/neutron/+bug/1953481
15:03:08 <ralonsoh> #link https://review.opendev.org/c/openstack/neutron/+/820911
15:03:27 * dulek doesn't want to interrupt, but the CI is broken at the moment due to the pecan bump in upper-constraints.txt. Ignore if you're already aware.
15:04:02 <ralonsoh> actually this is the right meeting to share that update
15:04:12 <slaweq> thx ralonsoh for the fix
15:04:13 <mlavalle> thanks dulek
15:04:24 <lajoskatona> dulek: thanks, so rechecking is useless now?
15:04:29 <slaweq> and thx dulek for the info - it's very good timing for that :)
15:04:38 <dulek> I think it's useless.
15:04:43 <dulek> I mean recheck.
15:04:43 <lajoskatona> ok, thanks
15:04:44 <ralonsoh> good to know
15:04:52 <mlavalle> yeah, chances are it's useless :-)
15:05:12 <dulek> The conflict is caused by:
15:05:14 <dulek> The user requested pecan>=1.3.2
15:05:16 <dulek> The user requested (constraint) pecan===1.4.1
15:05:31 <dulek> This is the error. And yes, I can't really understand it either, because I don't think these actually conflict.
15:05:41 <ykarel> dulek, log link? just to check if it failed recently, as that issue was fixed for some provider some time back
15:05:48 <frickler> ah, that was likely an issue with pypi, see discussion in #opendev
15:05:49 <lajoskatona> mlavalle: https://youtu.be/SJUhlRoBL8M
15:05:55 <dulek> https://0d4538c7b62deb4c15ac-e6353f24b162df9587fa55d858fbfadc.ssl.cf5.rackcdn.com/819502/3/check/openstack-tox-pep8/a4f6148/job-output.txt
15:05:59 <frickler> hopefully solved by refreshing the CDN
15:06:03 <ykarel> approx 2 hours back
15:06:18 <slaweq> ++ thx ykarel and frickler
15:06:19 <ykarel> ^ logs are older than that
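To sanity-check dulek's point that the two pecan requirements quoted above should not actually conflict, one can evaluate the specifier directly with the `packaging` library. This is a minimal illustrative sketch, not part of the meeting tooling; only the version numbers come from the error above.

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# requirements ask for pecan>=1.3.2; upper-constraints pins pecan===1.4.1.
# Version 1.4.1 satisfies >=1.3.2, so the two lines do not genuinely
# conflict, which is consistent with the failure being a stale PyPI/CDN
# index rather than a real dependency problem.
requested = SpecifierSet(">=1.3.2")
pinned = Version("1.4.1")
print(requested.contains(pinned))  # True
```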
15:07:47 <slaweq> ok, let's move on with the meeting
15:07:49 <slaweq> #topic Stable branches
15:07:58 <slaweq> bcafarel any updates?
15:08:10 <slaweq> I think it hasn't been in bad shape recently
15:08:16 <bcafarel> indeed I think we are good overall
15:08:33 <bcafarel> I was off yesterday so still checking a few failed runs in train, but nothing that looks 100% reproducible :)
15:09:03 <slaweq> great, thx
15:09:09 <slaweq> #topic Stadium projects
15:09:20 <slaweq> anything in the stadium that we should discuss today?
15:09:23 <slaweq> lajoskatona
15:09:26 <lajoskatona> everything is ok as far as I know
15:09:33 <opendevreview> Rodolfo Alonso proposed openstack/neutron master: Remove the expired reservations in a separate DB transaction  https://review.opendev.org/c/openstack/neutron/+/821592
15:09:48 <slaweq> that's great, thx :)
15:09:56 <slaweq> #topic Grafana
15:09:57 <lajoskatona> perhaps some advertisement: if you have time please check tap-as-a-service reviews: https://review.opendev.org/q/project:openstack%252Ftap-as-a-service+status:open
15:10:11 <slaweq> lajoskatona sure
15:10:46 <slaweq> lajoskatona all of them? or just those which don't have merge conflicts now?
15:10:55 <mlavalle> several of them have merge conflicts
15:12:21 <lajoskatona> slaweq: sorry, the recent ones; for some reason that link is not working for me now
15:12:32 <slaweq> k
15:13:41 <lajoskatona> https://review.opendev.org/q/project:openstack/tap-as-a-service+status:open+-age:8week
15:13:53 <lajoskatona> sorry for spamming the meeting....
15:14:32 <slaweq> lajoskatona I will review them tomorrow
15:14:47 <lajoskatona> slaweq: thanks
15:15:16 <slaweq> ok, let's get back to grafana
15:15:21 <slaweq> http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:16:24 <slaweq> overall things don't look bad IMO
15:16:49 <slaweq> and I also saw the same while looking at the results of the failed jobs from last week
15:17:03 <opendevreview> Maor Blaustein proposed openstack/neutron-tempest-plugin master: Add test_create_and_update_port_with_dns_name  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/821079
15:17:26 <slaweq> today I also ran my script to check the number of recent rechecks
15:17:32 <slaweq> and we are improving I think
15:17:47 <slaweq> week 2021-47 was 3.38 rechecks on average
15:17:57 <slaweq> week 2021-48 - 3.55
15:18:05 <slaweq> week 2021-49 - 2.91
15:18:22 <mlavalle> nice, trending in the right direction!
15:18:26 <ralonsoh> much better than before
15:18:29 <opendevreview> Bernard Cafarelli proposed openstack/neutron stable/train: DNM: test tempest train-last tag  https://review.opendev.org/c/openstack/neutron/+/816597
15:18:31 <slaweq> at least it's not 13 rechecks on average to get a patch merged, as it was a few weeks ago
15:18:32 <lajoskatona> +1
15:18:52 <mlavalle> it's a big improvement
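The recheck averages above come from a script slaweq runs against Gerrit review data. A rough sketch of how such a counter could work is shown below; this is an illustration only, assuming the public Gerrit REST API at review.opendev.org, and the query, sample size, and matching rule are assumptions rather than the actual script.

```python
import json
import requests

GERRIT = "https://review.opendev.org"
# Query and sample size are illustrative; the real script may filter by week.
QUERY = "project:openstack/neutron status:merged"


def fetch_merged_changes(limit=50):
    # o=MESSAGES includes change messages, which is where "recheck" comments land.
    resp = requests.get(
        f"{GERRIT}/changes/",
        params={"q": QUERY, "o": "MESSAGES", "n": limit},
        timeout=30,
    )
    # Gerrit prefixes JSON responses with ")]}'" (XSSI protection); drop that line.
    return json.loads(resp.text.split("\n", 1)[1])


def average_rechecks(changes):
    counts = [
        sum("recheck" in m.get("message", "").lower()
            for m in change.get("messages", []))
        for change in changes
    ]
    return sum(counts) / len(counts) if counts else 0.0


if __name__ == "__main__":
    print(f"average rechecks per merged patch: "
          f"{average_rechecks(fetch_merged_changes()):.2f}")
```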
15:18:53 <slaweq> and one more thing
15:18:57 <obondarev> cool!
15:19:08 <slaweq> I spent some time last week going through the list of "gate-failure" bugs
15:19:15 <ykarel> cool
15:19:16 <slaweq> #link https://tinyurl.com/2p9x6yr2
15:19:21 <slaweq> I closed about 40 of them
15:19:34 <slaweq> but we still have many open
15:19:40 <slaweq> so if You have some time, please check that list
15:19:51 <ralonsoh> sure
15:19:55 <slaweq> maybe something is already fixed and we can close it
15:20:06 <slaweq> or maybe You want to work on some of those issues :)
15:20:20 <mlavalle> so the homework is to close as many as possible?
15:20:31 <mlavalle> ohh, now I understand
15:20:36 <slaweq> also, please use that list to check whether You hit an already known issue before rechecking a patch
15:20:38 <lajoskatona> slaweq: thanks, and thanks for closing so many of the old bugs
15:21:14 <slaweq> I want to remind You that we should not only recheck patches but also try to identify the reason for the failure and open a bug, or link to the existing one, in the recheck comment :)
15:21:22 <slaweq> that's what we agreed last week on the ci meeting
15:22:32 <slaweq> anything else You want to talk about regarding Grafana or related stuff?
15:22:37 <slaweq> if not, I think we can move on
15:22:57 <lajoskatona> nothing from me
15:23:28 <ykarel> just that I pushed https://review.opendev.org/c/openstack/project-config/+/821706 today
15:23:33 <ykarel> to fix the dashboard with recent changes
15:24:26 <lajoskatona> ykarel: thanks
15:25:30 <slaweq> thx
15:25:48 <slaweq> #topic fullstack/functional
15:25:49 <frickler> I merged that already, feel free to ping me for future updates
15:26:05 <ykarel> Thanks frickler
15:26:10 <slaweq> I opened one new bug related to functional job failures: https://bugs.launchpad.net/neutron/+bug/1954751
15:26:24 <slaweq> I noticed it twice this week, I think
15:26:32 <slaweq> so probably we will need to investigate that
15:27:02 <slaweq> I will add some extra logs there to better understand what's going on in that test and why it's failing
15:27:16 <slaweq> for now it's not easy to tell exactly what the problem was there
15:27:49 <slaweq> #action slaweq to add some extra logs to the test, related to https://bugs.launchpad.net/neutron/+bug/1954751 to help further debugging
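The "extra logs" in the action item would be additional debug output around the steps the test performs, so that a gate failure leaves more context behind. A minimal, hypothetical sketch follows; the class, method, and fields are illustrative only and not the actual patch for bug 1954751.

```python
from oslo_log import log as logging

LOG = logging.getLogger(__name__)


class HypotheticalFunctionalTestHelper(object):
    # Hypothetical example, not the real test touched by bug 1954751.

    def _check_resource_state(self, resource):
        # Log the intermediate state so a gate failure leaves enough
        # information to tell what the test actually observed.
        LOG.debug("Resource %s is in state %s (details: %s)",
                  resource.get("id"), resource.get("status"), resource)
        return resource.get("status") == "ACTIVE"
```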
15:28:51 <slaweq> #topic Tempest/Scenario
15:29:27 <slaweq> regarding scenario jobs, the only issue I noticed that is really impacting us often now is job timeouts https://bugs.launchpad.net/neutron/+bug/1953479
15:29:40 <slaweq> I saw such timeouts at least 2 or 3 times again this week
15:29:52 <slaweq> ykarel
15:29:57 <slaweq> proposed https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/821067
15:30:02 <slaweq> to use nested virt in those jobs
15:30:19 <slaweq> and wanted to discuss what do You think about it
15:30:31 <slaweq> IMO that sounds worth trying (again)
15:30:37 <ralonsoh> for sure yes
15:30:41 <slaweq> maybe this time it will work for us better
15:30:59 <ralonsoh> do we have figures to compare performance?
15:31:04 <lajoskatona> my concern/question is regarding the availability of these nodes
15:31:05 <slaweq> the only issue with that, IIUC, is that there is a limited pool of nodes/providers which provide that capability
15:31:10 <ykarel> yes, initial results are good with that, but it's worth trying it on all patches
15:31:13 <slaweq> so jobs may wait longer to be executed
15:31:29 <lajoskatona> if it will be a bottleneck, with jobs waiting for nodes/resources
15:31:33 <ykarel> those scenario jobs are now taking approx 1 hour less to finish
15:31:49 <ralonsoh> and are less prone to CI failures, right?
15:32:02 <ykarel> yes, I have not seen any SSH timeout failure in those tests yet
15:32:17 <lajoskatona> sounds good
15:32:25 <ralonsoh> hmmm I know the availability is an issue, but sounds promising
15:32:27 <obondarev> +1 let's try
15:32:32 <ralonsoh> +1 right
15:32:36 <mlavalle> +1
15:32:50 <lajoskatona> +1, we can revisit if we see a bottleneck :-)
15:32:59 <slaweq> yes, that's also my opinion about it - better to wait a bit longer than to recheck :)
15:33:08 <ralonsoh> right!
15:33:09 <ykarel> yes availability is a concern, but if we have less queue time then we should be good
15:33:32 <mlavalle> yes, the overall effect might be a net reduction in wait time and increase in throughput
15:33:38 <ykarel> and another issue is when all the providers that provide those nodes are down
15:33:42 <ykarel> we will have an issue
15:33:56 <ykarel> but if that happens rarely we can switch to/away from those nodes
15:34:06 <ralonsoh> another fire in OVH? I hope not
15:34:13 <slaweq> LOL
15:34:17 <mlavalle> LOL
15:35:02 <ykarel> will send a separate patch to easily allow switching to/reverting from those nodes
15:36:07 <slaweq> ykarel thx a lot
15:36:12 <slaweq> ok, let's move on
15:36:13 <slaweq> #topic Periodic
15:36:35 <slaweq> I see that since yesterday neutron-ovn-tempest-ovs-master-fedora seems to be broken (again)
15:36:41 <slaweq> does anyone want to check it?
15:36:50 <slaweq> if not, I can do it
15:37:01 <mlavalle> I'll help
15:37:28 <slaweq> thx mlavalle
15:37:36 <slaweq> #action mlavalle to check failing neutron-ovn-tempest-ovs-master-fedora job
15:38:00 <slaweq> so that's all I had for today
15:38:06 * mlavalle has missed seeing his name associated with an action item in the CI meeting :-)
15:38:18 <slaweq> there is still that topic about ci improvements from last week
15:38:41 <slaweq> but TBH I didn't prepare anything for today as I thought that maybe it would be better to talk about it next week on a video call
15:38:43 <slaweq> wdyt?
15:38:55 <slaweq> should we continue the discussion today or next week on video?
15:38:55 <mlavalle> sounds good
15:38:58 <ralonsoh> perfect, I like video calls for CI meetings
15:39:20 <mlavalle> what video service do we use?
15:39:34 <lajoskatona> mlavalle: jitsi
15:39:41 <mlavalle> cool!
15:40:39 <slaweq> ok, so let's continue that topic next week then
15:40:53 <lajoskatona> I like the video personally, but it's extra work for you to keep the logs written
15:41:06 <slaweq> if there are no other topics for today, I can give You back about 20 minutes
15:41:26 <mlavalle> +1
15:41:28 <ralonsoh> see you tomorrow!
15:41:29 <slaweq> lajoskatona it's not a big thing to do TBH, and I like video meetings too :)
15:41:33 <ykarel> +1
15:41:47 <slaweq> thx for attending the meeting today
15:41:54 <slaweq> have a great day and see You online :)
15:41:56 <mlavalle> o/
15:41:59 <slaweq> #endmeeting