13:00:13 #startmeeting kolla
13:00:13 Meeting started Wed Jun 28 13:00:13 2023 UTC and is due to finish in 60 minutes. The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:13 The meeting name has been set to 'kolla'
13:00:15 #topic rollcall
13:00:19 o/
13:00:41 o/
13:00:56 \o
13:01:04 o/
13:01:25 \~o~\
13:01:45 o/
13:02:26 o/
13:02:49 \o
13:03:25 #topic agenda
13:03:25 * Announcements
13:03:25 * Review action items from the last meeting
13:03:25 * CI status
13:03:25 * Release tasks
13:03:26 * Regular stable releases (first meeting in a month)
13:03:26 * Current cycle planning
13:03:28 * Additional agenda (from whiteboard)
13:03:28 * Open discussion
13:03:30 #topic Announcements
13:03:38 As discussed previously - I'm off for the next three weeks
13:03:48 I cancelled weekly meetings for that time
13:04:06 #topic Review action items from the last meeting
13:04:19 mnasiadka to raise EOL for stable/wallaby and send email to the ML - done
13:04:25 #topic CI status
13:04:39 I guess green on average
13:04:46 fixes ongoing here and there
13:04:55 #topic Release tasks
13:05:18 I think we restored using master on the master branch
13:05:43 So no other tasks planned in the calendar
13:05:55 #topic Regular stable releases
13:06:04 Do we want to post stable releases next week?
13:06:26 or maybe now, lest we forget?
13:06:27 was there wallaby release before eol? or do we not do such?
13:06:40 that would be good imho, there were some users here asking for bugs that were closed in git, but not on pypi releases
13:06:40 we can't do releases when branch is EM
13:06:41 wallaby was long em, no releases
13:06:48 ok
13:07:02 frickler: willing to raise a patch or do we need another volunteer?
13:07:18 if erlang/rmq got sorted out on stable/* then maybe release?
13:07:39 I can do that, just need to know what to wait for.
erlang rmq seems sensible
13:07:51 yeah, it got sorted-ish, we need to sort out Ubuntu/aarch64 it seems
13:08:01 I _guess_ rmq should be stable, but I just moved a bug about crashing rmq to kolla-ansible:
13:08:15 https://bugs.launchpad.net/bugs/2023668
13:08:59 isn't 3.9 EOL?
13:09:08 at least it was claimed the installation was done according to the kolla docs, so I would be interested in finding out why this crashes on yoga
13:09:27 https://www.rabbitmq.com/versions.html
13:09:37 didn't we have this last week already? they're using binary build it seems
13:09:39 I think I found 3.9 referenced in some docs/files on an older branch of ours?
13:09:48 binary, uh
13:10:15 anyway, bug is there, we can have a look
13:10:25 #topic Current cycle planning
13:10:39 I see kevko has revived Let's Encrypt - so there's a chance we'll merge it finally
13:10:39 Debian/bookworm?
13:10:49 hrw: stage is yours for Bookworm ;)
13:11:00 just started to debug why it is not working out of the box on my env :P
13:11:11 mnasiadka: it is https://review.opendev.org/q/topic:for-debian-bookworm-upgrade as it was before
13:11:34 that's kolla side
13:11:48 what about nodepool instances and testing on those?
13:12:01 mnasiadka: building we can do on anything
13:12:31 still need to look into building nodepool image
13:13:03 hrw: we can, if you're fine we drop Debian in kolla-ansible - let's just build images :)
13:13:06 Merged openstack/kolla-ansible master: loadbalancer: Add option to not define track script https://review.opendev.org/c/openstack/kolla-ansible/+/887020
13:13:48 Given we have vacation season ahead - I'm pretty sure we're going to be last minute merging kolla-ansible support
13:14:36 Anyway, let's not forget about building nodepool image, I think it's fine if we don't have a mirror and download over the internet (just like in Rocky case)
13:14:55 mnasiadka: patches can be held with 2 CR+2, V+1 and without W+1
13:14:55 the support matrix at https://docs.openstack.org/kolla/xena/support_matrix doesn't indicate btw that we don't support binary builds of rabbitmq containers, should we change that?
13:15:01 mirror is done
13:15:07 ah, mirror is done, fantastic
13:15:10 mnasiadka: but without reviews what is the point of doing anything further?
13:15:17 sorry, maybe defer my point to "open discussion"
13:15:27 I'm reviewing those patches
13:15:35 hrw: I'll have a look later today
13:15:55 also maybe we want RP+1 for those? or +2?
13:16:18 I did reviews there as well, there are open discussion points, without answer, so I didn't look again yet
13:17:27 hrw: it would also help if those would be passing Zuul ;)
13:17:42 mnasiadka: sure
13:18:06 so run a recheck, let's see fresh results, and try to get somewhere :)
13:18:19 Ok, podman guys are not here today - but I've seen some updates
13:18:31 INFO:kolla.common.utils.openstack-base: The user requested cachetools
13:18:34 INFO:kolla.common.utils.openstack-base: The user requested (constraint) cachetools===5.3.0
13:18:47 that kind of bug is "not my fault" ;D
13:18:53 I notice rabbitmq 3.10 in bookworm is only supported upstream until end of july, this year, is the backporting in debian good?
:)
13:18:55 hrw: if it's ord rax nodepool provider - they have some weird networking issues towards pypi
13:19:02 btw, yoga is unbuildable on localhost
13:19:16 kevko: I'm still amazed how it works in CI
13:19:18 but passing in CI ...do you know why ?
13:19:30 SvenKieske: it can/will move to 3rdparty repo
13:19:42 kevko: even the experimental job that doesn't use infra wheel is failing?
13:19:47 horizon is really unbuildable on localhost ..just try it ...i am fixing it in my downstream repo by limiting setuptools
13:20:06 i really don't know :D ... just try it on localhost
13:20:13 Is anybody able to help kevko - or do we just merge the backport?
13:20:16 kevko: mhm I know some ppl who will be very sad about it
13:20:32 kevko: do you have any patch I can look at?
13:20:40 Michal Nasiadka proposed openstack/kolla-ansible stable/2023.1: loadbalancer: Add option to not define track script https://review.opendev.org/c/openstack/kolla-ansible/+/887069
13:20:49 Michal Nasiadka proposed openstack/kolla-ansible stable/zed: loadbalancer: Add option to not define track script https://review.opendev.org/c/openstack/kolla-ansible/+/887170
13:21:42 SvenKieske: https://review.opendev.org/c/openstack/kolla/+/873913
13:21:57 SvenKieske: it's abandoned for now i think
13:22:09 mnasiadka thanks, yeah that one
13:22:24 ty
13:22:26 not abandoned, I tried building locally long time ago - and it also failed for me
13:22:35 so the mystery is - how does it work in zuul
13:22:39 Michal Nasiadka proposed openstack/kolla stable/yoga: Pin setuptools=67.2.* https://review.opendev.org/c/openstack/kolla/+/873913
13:23:13 ok then, let's move on
13:23:26 #topic Additional agenda (from whiteboard)
13:23:43 gkoper: Infoblox designate driver (mdns running as root)
13:23:54 gkoper: stage is yours
13:24:08 o/
13:24:11 (stage fright phase)
13:24:14 Infoblox does not support changing the port used to communicate with MDNS to request zone transfers (AXFRs)
13:24:14 Therefore MDNS containers need to be bound
to port 53. Kolla built designate containers are starting with user designate, so are unable to bind the service to privileged port 53
13:24:25 Dirty workaround is to locally build containers to start with root user
13:24:25 (This poses security risk)
13:24:35 Another approach is to use CAP_NET_BIND_SERVICE to provide the capability for user designate to bind a service to a privileged port (0-1023) [Testing now]
13:24:56 I think we're doing something similar for something related to prometheus - to be able to run ping
13:24:57 We also found some issues while templating pools.yml
13:25:13 Marcin Juszkiewicz proposed openstack/kolla master: Move to Debian 12 'bookworm' https://review.opendev.org/c/openstack/kolla/+/886088
13:25:28 gkoper: one thing at a time
13:25:55 building your own containers is the documented solution for this
13:26:00 frickler: I know you -2d a patch to run as root - but that's logical - do you see any problems with using CAP_NET_BIND_SERVICE?
13:26:23 frickler: we build downstream one set of container images, wouldn't want to have a separate one for infoblox, and separate mdns for bind ;-)
13:26:35 well it provides privileges that most deployments don't need
13:26:40 could we make that optional? I'd have to check what we do for prometheus
13:26:53 in prometheus we setcap for blackbox exporter binary
13:27:02 * SvenKieske is also thinking about how to make this conditional
13:27:11 frickler: https://github.com/openstack/kolla/blob/70f74eb64101431e23d56c6a7df96d7aab37ce2f/docker/prometheus/prometheus-blackbox-exporter/Dockerfile.j2#L32
13:27:12 it still runs as unprivileged user
13:27:33 which should be fine
13:27:36 As for the pools.yml templating we ran into wrong templating of ns_records (list was used as single record resulting in string[fqdn].
- easyfix) and nameservers templated using fqdn resulting in designate-manage failing with: "(proper fqdn) is not IP address or host name"
13:27:43 darmach: one thing at a time
13:27:56 ^ nothing that can't be fixed when we are done with port 53 bind
13:28:04 let's get some agreement on the root stuff
13:28:24 so I think it would be okay to setcap this as well, should be documented of course.
13:28:52 * SvenKieske wondering if this really works with podman
13:29:00 it would only add privileges to bind a low port, which most probably is not a security issue
13:29:11 it is
13:29:33 ok then, can we make it optional for greater good?
13:29:39 any privilege that is not needed is a security issue
13:29:56 Then I'm pretty sure majority of our containers is insecure in those terms ;)
13:30:01 well, even "needed" ones are :)
13:30:03 I'm not sure we want to do much special casing for a weird non-free backend
13:30:03 and running as root is a lesser one?
13:30:44 if I'm not mistaken the default is to run all containers in the host network namespace, correct?
13:30:44 frickler: we support it out of the box with some kolla-ansible variables, so it would make sense to make it easier and better
13:31:02 mnasiadka: or drop support for it completely?
13:31:12 I can agree with weird non-free backend - there are customers out there in the wild with infoblox deployed though...
13:31:17 frickler: as you can see - SHPC needs that for its customers
13:31:45 so as long as there's somebody that wants to maintain it - I don't see a reason to drop it
13:32:39 so ... why is building a custom container locally not feasible?
13:32:53 can we talk about the actual issue, please?
13:32:53 can we then add a special container build upstream
13:33:13 and deploy a different container in the infoblox case?
13:33:29 SvenKieske: the issue is there's an infoblox backend in designate, which is not tested in designate CI by the way, and requires designate-mdns to run on port 53.
13:33:49 i know...
13:33:51 the other backend, which is bind (and probably powerdns) does not require that
13:34:07 what indeed might be a problem, if my assumption above is correct regarding the host network namespace, is that other services might already be bound to port 53
13:34:36 that's deployment specific, if somebody wants infoblox, then he needs to deal with that in his env ;)
13:35:10 currently we direct users to build it on their own, which is most probably fine - but still requires root user, so we basically direct them to do insecure installation of designate
13:35:40 mnasiadka: well not really, if per default, e.g. systemd-resolved is listening on localhost:53 and you deploy designate with k-a and infoblox and it breaks the default local resolver I would indeed refuse to merge such patches
13:35:40 so if we amend the doc to use CAP_NET_BIND_SERVICE that is enough for you?
13:36:49 not really, since we use one source of downstream images for N clients
13:36:49 it doesn't make much sense to patch something, if the patch doesn't work in major deployment scenarios, so that should at least be tested and be guaranteed to work
13:37:32 so ... extra container? designate-mdns-insecure?
13:37:41 just joking on the name of course
13:38:04 extra container - why not, and some logic in kolla-ansible to use it when infoblox is enabled
13:38:11 ack
13:38:14 or the same container and some extend_start logic to do setcap
13:38:22 under the provision that this works at all (I have doubts): couldn't we introduce a conditional, that, _if_ infoblox is enabled the container gets (re)started with cap_net_bind_service?
13:38:22 just let's agree on one of those
13:38:34 but that would modify the container at runtime?
13:38:45 at restart time ;)
13:39:04 it would, but is that something new?
we remove default http certs at start time ;)
13:39:18 we mangle opensearch-dashboards plugins at start time
13:39:20 * SvenKieske thinking about which is more pain, a second container or to modify the existing one
13:39:28 "or the same container and some extend_start logic to do setcap" < I quite like that, we can give it a try and test how it works
13:40:02 but it really should only setcap on the infoblox conditional
13:40:23 sounds mostly fine to me, and imho less maintenance burden than a whopping complete container, no?
13:40:24 SvenKieske: second container with one extra layer that runs one command sounds funny, but that's also nothing new in Kolla land
13:40:24 so if that works, I think that should be acceptable, then
13:40:34 ok then
13:40:41 Great
13:41:02 no new images, optional setcap in the existing one based on an ENV variable and we're fine
13:41:08 great
13:41:33 darmach: the other issue seems like pools.yaml template misconfiguration, just propose a new patch and let's discuss it there?
13:42:13 Yes, going to do that. It's nothing complicated.
13:42:15 there is https://review.opendev.org/c/openstack/kolla-ansible/+/878270 already, which I need to get back to
13:42:38 please check if that covers your issue, too
13:42:40 darmach: can you also have a look at ^^?
13:42:47 yeah, that would be good :)
13:43:09 * SvenKieske praying for more designate maintainers
13:43:17 from my perspective that patch needs to be backwards compatible, and now it's not.
13:43:50 but let's discuss in the patch itself
13:43:57 Will do, looping in that template was exactly what I was thinking about.
13:43:57 maybe it needs to be split into some parts, too
13:44:27 Maybe we could split-out the pools.yml part, and I could take a stab at it.
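The ns_records problem reported at 13:27:36 (a list rendered as one record) can be illustrated with a minimal sketch. It is written in Python rather than the Jinja used by the real template, and it is not the patch under review. The function names are hypothetical, the `hostname`/`priority` keys and the trailing dot are assumed to match Designate's pools.yaml layout, and accepting both a single string and a list is an assumed way of staying backwards compatible.

```python
from typing import Dict, List, Union


def build_ns_records(ns_record: Union[str, List[str]]) -> List[Dict[str, object]]:
    """Turn one FQDN or a list of FQDNs into one ns_records entry each."""
    # Assumption: a bare string is the legacy single-record form, so wrap it.
    hostnames = [ns_record] if isinstance(ns_record, str) else list(ns_record)
    records = []
    for priority, hostname in enumerate(hostnames, start=1):
        # Assumption: names are written as absolute FQDNs (trailing dot).
        fqdn = hostname if hostname.endswith(".") else hostname + "."
        records.append({"hostname": fqdn, "priority": priority})
    return records


def render_ns_records(ns_record: Union[str, List[str]]) -> str:
    """Render the entries as a YAML fragment, one list item per record."""
    lines = ["ns_records:"]
    for record in build_ns_records(ns_record):
        lines.append(f"  - hostname: {record['hostname']}")
        lines.append(f"    priority: {record['priority']}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example names are placeholders, not values from the meeting.
    print(render_ns_records(["ns1.example.org", "ns2.example.org."]))
```

Looping per hostname is the point of the sketch: each list element becomes its own record instead of the whole list being stringified into one.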
13:44:39 frickler: might be, but we need to support designate_ns_record, or at least have prechecks saying you need to rework your config
13:45:35 ok then, let's move on
13:45:37 we have an edge case, that needs to support more than 1 ns_group to be updated on Infoblox (WAN and LAN)
13:45:38 #topic Open discussion
13:46:01 gkoper: you can override pools.yaml just as other files in kolla-ansible
13:46:09 let's just cover the usual case, not the oddities
13:46:27 sure, i think Jakub already had an idea how to template that.
13:46:32 let's move on.
13:46:46 We could use the loop in jinja as @frickler did - to create separate pools
13:47:17 regarding the crashing rmq 3.9 release I just asked in the bug report to provide reproducer steps as I'm 90% sure we don't mention this release anywhere.
13:47:33 I also advised to maybe not use binary builds
13:48:35 SvenKieske: we deprecated it in Yoga and removed in Zed, if they like using deprecated content - fine by me ;)
13:49:30 maybe we should make our support matrix more clear? or did we already? I think we did..half
13:50:05 we just dropped the distinction binary/source from our image support matrix, maybe a big disclaimer shouting "we do not support binary images anymore" would be good?
13:50:49 maybe not worth the effort, I guess there were maybe not even ten users with binary problems, I think.
13:50:59 that bothered to ask, at least :D
13:53:50 yeah well, binary builds always had their own issues
13:54:04 ok, anything else?
13:56:01 thanks for coming - see you again 26th July
13:56:15 in case of an urgent review request - please bug bbezak and mgoddard ;-)
13:56:19 #endmeeting
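The outcome agreed at 13:41:02 (no new image, an optional setcap driven by an ENV variable) can be sketched as a dry run. This is only an illustration in Python, not the extend_start shell logic Kolla would actually carry. The variable name `DESIGNATE_MDNS_BIND_PRIVILEGED`, the function name and the target path are hypothetical. Two caveats apply under the same hedge: running setcap needs privileges the designate user may not have, and file capabilities attach to the executable that is actually exec'd, so for an interpreted service the target would likely be the interpreter rather than the service script.

```python
import os
from typing import List, Mapping, Optional

# Hypothetical opt-in variable; not an existing Kolla or kolla-ansible setting.
OPT_IN_VAR = "DESIGNATE_MDNS_BIND_PRIVILEGED"


def setcap_command(env: Mapping[str, str], binary: str) -> Optional[List[str]]:
    """Return the setcap argv when the opt-in variable is truthy, else None."""
    enabled = env.get(OPT_IN_VAR, "false").strip().lower() in ("1", "true", "yes")
    if not enabled:
        # Default: leave the image untouched so most deployments gain no privilege.
        return None
    # cap_net_bind_service allows binding ports below 1024, such as 53.
    return ["setcap", "cap_net_bind_service=+ep", binary]


if __name__ == "__main__":
    # Dry run only: print what would be executed instead of executing it.
    print(setcap_command(os.environ, "/path/to/executable"))
```

Keeping the default off matches the concern raised in the meeting that any privilege most deployments do not need should stay optional.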