15:01:40 #startmeeting openstack-helm
15:01:44 Meeting started Tue Nov 17 15:01:40 2020 UTC and is due to finish in 60 minutes. The chair is lamt. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:45 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:47 The meeting name has been set to 'openstack_helm'
15:02:14 #link https://etherpad.opendev.org/p/openstack-helm-weekly-meeting agenda
15:02:20 Hello
15:02:31 o/
15:02:41 \o
15:02:42 o/
15:03:02 Our fearless leader is feeling unwell. He asked me to host this week's meeting
15:03:31 I am glad he/she decided to take a day off
15:04:12 I hope they are getting some rest
15:04:32 I hope *he* is too.
15:04:59 I think we can get started.
15:05:29 Quick reminder, there is no meeting next week
15:05:41 #topic OSH compute gate failure
15:06:05 This is a follow-up from last week, but the gate is still broken
15:06:32 X)
15:06:35 so no OSH patches can merge
15:07:05 oh no!
15:07:59 Do we know how it is broken?
15:08:26 Not exactly, the conjecture is that some system library in Bionic
15:08:37 starts to act up
15:09:01 to be precise, causing openvswitch to gag when it can't create threads
15:09:32 Sounds tricky :/
15:09:40 let me try that tweak today
15:09:52 I tried to update all the old stuff in the gate (minikube version, helm version, k8s version)
15:10:14 all signs point to a system issue though
15:10:22 (I think they should be updated anyway) - here was my last attempt that sort of worked
15:10:41 oh really... did that actually help?
15:10:45 https://review.opendev.org/#/c/762361/
15:10:55 Updating the host to focal worked
15:11:02 "worked" - but
15:11:09 it introduced new problems
15:11:33 it crashes ceph, and some other python/C library errored with Rocky
15:12:35 I assume some system library difference between focal and bionic was the cause, but I wasn't able to pinpoint it
15:12:59 hmm.... so at least it confirms our suspicion
15:13:00 if anyone wants to take a look, I'd appreciate that
15:13:21 I will take another crack at it today
15:14:00 Thanks - if upgrading to focal isn't the correct path, we have to find the fix for ovs
15:14:07 so any other host we can try on besides vexx and focal?
15:14:36 I believe Andrii tried to address the max task parameter, but from what I can tell it was never limited
15:14:43 so setting it to infinity doesn't do anything
15:15:15 I tried the 32gb node, and same error - it doesn't look like it is constrained by system memory.
15:17:07 Thanks miniroy for taking a look
15:17:29 I wonder if something in the image changed.....
15:17:37 but these are great data points to have
15:17:58 I tried that too - I reverted the ovs image to the date the last gate passed
15:18:04 same failure :/
15:18:31 =(
15:19:19 others can chime in as well - but I exhausted all the possibilities for why ovs keeps failing
15:20:18 we don't have local access to any of these hosts, right?
15:20:30 I do not think so
15:22:20 if it is not a system library - it is some constraint preventing pthread creation
15:22:51 any idea what kernel is running on focal?
15:24:00 ansible_kernel: 5.4.0-53-generic
15:24:39 oh and that works actually... wow
15:24:58 it breaks other things
15:25:03 like ceph and rocky
15:25:09 with cffi
15:26:10 so should we move ahead and try to get it to work with focal? or try to get it working with vexx?
15:26:52 I will let others chime in - both require a bit of work
15:27:33 it is asking if you want to fix errors A or errors B.
15:28:44 well let me poke at that ovs/vexx issue more today then
15:28:55 thanks miniroy
15:28:58 sounds like it is probably more work at this point to upgrade to focal
15:29:11 looks like an onion to me I am afraid
15:29:25 the alternative is to fix ovs
15:29:38 ceph is probably just the first layer I am afraid
15:29:56 yup
15:30:14 in the absence of our fearless leader, I say let's focus on fixing ovs for now
15:30:30 ++
15:31:16 we can take this offline to troubleshoot
15:31:41 +
15:31:43 let's move on
15:31:50 #topic Open Discussion
15:32:28 there is no other agenda item - so opening the floor for discussion
15:34:12 if nothing else, we can end the meeting. Everyone have a great rest of the week.
15:34:19 #endmeeting
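
The compute gate topic mentions openvswitch failing when it cannot create threads, and a max task parameter that appears never to have been limited. One way to check for a constraint on thread creation would be to dump the usual Linux limits on an affected host. The following is only a minimal sketch and was not run during the meeting. The function names are hypothetical, and the cgroup path assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup, which may not match the gate hosts.

```python
import resource
from pathlib import Path


def read_limit(path):
    """Return the stripped contents of a limit file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def format_rlimit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def collect_thread_limits():
    """Gather common Linux limits that can cause thread creation to fail."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        # Per-user cap on processes and threads for the current process.
        "rlimit_nproc_soft": format_rlimit(soft),
        "rlimit_nproc_hard": format_rlimit(hard),
        # System-wide caps exposed by the kernel.
        "kernel_threads_max": read_limit("/proc/sys/kernel/threads-max"),
        "kernel_pid_max": read_limit("/proc/sys/kernel/pid_max"),
        # Assumption: cgroup v2 layout; the file is absent in other layouts.
        "cgroup_pids_max": read_limit("/sys/fs/cgroup/pids.max"),
    }


if __name__ == "__main__":
    for name, value in collect_thread_limits().items():
        print(f"{name}: {value if value is not None else 'not available'}")
```

Running the same script on a bionic host and a focal host, and inside the openvswitch container as well as on the host itself, would show whether any of these limits differ between the failing and working setups. If they all match, that would point back toward the system library conjecture.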