13:00:09 #startmeeting kolla
13:00:09 Meeting started Wed Oct 18 13:00:09 2023 UTC and is due to finish in 60 minutes. The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:09 The meeting name has been set to 'kolla'
13:00:13 #topic rollcall
13:00:53 o/
13:00:58 \o
13:01:16 o/
13:01:16 o/
13:01:23 \o
13:01:24 o/
13:01:27 o/
13:03:32 #topic Agenda
13:03:32 * Announcements
13:03:32 * CI status
13:03:32 * Release tasks
13:03:32 * Current cycle planning
13:03:34 * Additional agenda (from whiteboard)
13:03:34 * Open discussion
13:03:37 #topic Announcements
13:03:43 PTG is next week
13:03:51 #link https://etherpad.opendev.org/p/kolla-caracal-ptg
13:04:00 \o/
13:04:04 Sign up please in L37
13:04:07 * mmalchuk ready
13:04:16 And please add topics (L101 and beyond)
13:04:37 #topic CI status
13:04:50 I think it's mainly green, not counting our Ansible breakage
13:04:51 Juan Pablo Suazo proposed openstack/kolla-ansible master: Enable the Fluentd Plugin Systemd https://review.opendev.org/c/openstack/kolla-ansible/+/875983
13:04:56 Juan Pablo Suazo proposed openstack/kolla master: Adds TAAS Neutron plugin to support OVS port mirrors https://review.opendev.org/c/openstack/kolla/+/885151
13:04:59 Trying to wrap my head around it, but it's not going to be an easy one
13:05:08 Did anybody have a look at that as well?
13:05:33 not yet. maybe building a simple reproducer will be needed?
13:05:35 not really, I still suspect a real upstream bug, or did we change anything in that area?
13:06:00 We did not, that's the usual "we fixed that bug in Ansible" :D
13:06:01 there is an issue upstream, SvenKieske, have a link?
13:06:01 there are also already user reports about this on the ML, I answered with the workaround
13:06:27 #link https://github.com/ansible/ansible/issues/81945
13:06:39 right
13:07:21 Anyway, let's move on
13:07:26 #topic Release tasks
13:07:35 Time to have a look at them - we merged switching the sources to Bobcat
13:08:28 seems there's still no sign of RDO repos on the CentOS Stream mirror
13:08:49 I'll wait instead of using trunk.rdoproject.org
13:08:58 Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` grafana https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:09:02 no other release tasks per se - just reviewing existing patches
13:09:39 Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` volume https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:09:54 not sure I can finish the fluentd stuff in time... currently have a problem checking stuff inside the container. I think it's best to split fluentd out of the "common" role, which will require some more work.
13:10:13 not sure if I should move this point to "open discussion"?
13:10:47 good idea to split
13:10:58 it's ok to discuss it here
13:11:04 we should probably move this cycle
13:11:09 but it seems we don't have to
13:11:46 yeah, so if there are no objections I'd move fluentd into a dedicated role, I'm also on vacation Friday and with PTG coming up... how much time is left? :D
13:12:07 Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` volume https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:12:57 I guess I take the id Software dev approach: "It's done when it's done".
13:13:08 it doesn't need to be done before the PTG
13:13:18 we're cycle trailing, so we still have some time to release
13:13:29 although I would prefer we do it sooner than last minute :)
13:14:16 is fluentd the only upgrade missing? I lost track, are we good with the prometheus plugins?
13:14:18 yeah sure; I just need to do some other stuff as well - mainly writing docs it seems - will check internally what has higher priority, maybe the answer is even fluentd.
13:14:23 or yesterday)
13:15:02 afaik I asked last meeting - or the meeting before that - if someone could take a look at the prometheus plugins, no? I have lost track as well.
13:15:47 That might be looked into even last minute, although I would prefer that we finally have a proper solution (I remember hrw started a script for checking versions on GitHub and updating sources.py)
13:16:09 I might have a look into that after that crappy Ansible breakage
13:17:56 ok then
13:18:04 Podman - seems it's waiting for some reviews, probably mine
13:18:07 Let's Encrypt the same
13:18:17 frickler: do you have any cycles for looking at those two?
13:18:19 https://github.com/openstack-exporter/openstack-exporter/releases is at least at the latest :)
13:18:23 It would be nice to merge them this cycle
13:18:55 I can check podman
13:19:43 be sure to also look at the (linked?) ansible-collection-kolla change for podman, I guess that's also still open
13:20:02 ack
13:20:19 ah it's in the Depends-On: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/852240
13:20:22 https://review.opendev.org/c/openstack/kolla/+/887347 can be merged
13:20:35 I commented on the ack patch - I think we need podman-based jobs there
13:20:41 a-c-k it is
13:21:05 yeah, I agree, there should be some testing going on :)
13:21:27 Ok, Let's Encrypt - I'll ask bbezak to review those next week
13:21:53 and let's try to get those in, even if the code is not perfect-ish ;)
13:22:07 #topic Additional agenda
13:22:24 4 patches from jsuazo (the same ones again)
13:22:25 let's see
13:22:34 https://review.opendev.org/c/openstack/kolla/+/885151
13:22:56 +2 from me
13:23:09 https://review.opendev.org/c/openstack/kolla-ansible/+/885417
13:23:14 (k-a side of TAAS)
13:23:25 already has +2 from me
13:23:34 frickler: willing to have a look, or should we wait for bbezak?
13:24:09 better ask bbezak, I'd be too picky for this
13:24:21 understood
13:24:27 https://review.opendev.org/c/openstack/kolla-ansible/+/844614 - Glance/Cinder-backup S3
13:25:10 ready 2 weeks ago
13:26:15 commented, but basically looks good
13:26:42 ok then
13:26:48 nothing more on the whiteboard
13:26:55 #topic Open discussion
13:27:06 I'll cancel next week's meeting because we'll have the PTG sessions
13:27:11 Anything else?
13:27:27 if someone has free time, and it's maybe also a topic for PTG: https://review.opendev.org/c/openstack/kolla-ansible/+/898543
13:28:01 just a hack to enable quorum queues; all the feedback I read was that they are really nice to have
13:30:00 interesting. is this tested under heavy load?
13:30:15 well, the questions I have are 1) do we test that in CI 2) do we want to switch it to default in C 3) migration docs for users - since it's breaking-ish
13:31:10 but since classic queue mirroring is deprecated for removal in 4.0 - it's either quorum queues or streams
13:31:35 Seems there is some movement in oslo.messaging
13:31:38 #link https://review.opendev.org/c/openstack/oslo.messaging/+/888479
13:31:44 2 - no
13:31:44 mmalchuk: untested from my side (the patchset), from what I understand OVH does use quorum queues under heavy load, but they don't use k-a
13:31:44 this enough, so we need this in k-a
13:31:44 we can start with adding the flag and allow this for new deployments. migration can be done next cycle then
13:31:44 no docs yet :) there's also an open bug to enable streams from OVH, but afaik that's not even implemented in oslo
13:31:47 #link https://review.opendev.org/c/openstack/oslo.messaging/+/890825
13:31:48 frickler: that was my intention as well :)
13:31:50 frickler +1
13:32:10 similar to what we did with the HA flag for rabbit
13:32:11 frickler: would feel safer if we had at least one CI job that uses it
13:32:36 I agree on the CI job, didn't have an immediate idea how that would best be implemented
13:32:49 I don't think we need to test all possible configuration combinations. having a job when we switch the default is good enough IMO
13:32:49 mnasiadka: test the HA flag or quorum queues?
13:32:59 quorum queues
13:33:37 Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` volume https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:35:19 we don't need to test all possible configuration combinations, but it would be useful to test this - with a vision that we'll move to this as default since queue mirroring will be gone in RMQ 4.0
13:35:47 like let's use quorum on Ubuntu and HA on Rocky - or something similar
13:36:39 sure; would that just entail a set of jobs with it enabled? but which scenario should this run? I've never written upstream CI jobs "from scratch" so I would be grateful for some pointers on how that should look
13:36:54 I would still prefer to defer that to the next cycle. but feel free to add a patch for that on top of SvenKieske's, it shouldn't block that change though
13:37:12 I can see both sides of the argument
13:37:46 I can add a warning doc "this is untested" :D on the other hand, when I looked at other deployment projects, they enabled quorum queues without additional tests ;)
13:38:24 you said OVH uses it
13:38:26 ¯\_(ツ)_/¯
13:38:32 they don't use k-a
13:38:43 I can see people raising bugs about it and if we can't test it in CI - and no one uses that in production from the Kolla community - we're kind of not supporting it? :)
13:38:44 k-a only deployment tool
13:38:47 if you search for quorum queues on the ML there are some happy users :)
13:39:24 to have some real testing, one would at least need to do upgrades, if not host failovers. for upgrades we need to add the flag now so we can test next cycle
13:39:36 my two cents: do not enable it by default
13:39:39 fwiw the SCS project is interested in quorum queues because there was some perceived instability in certain upgrade scenarios, so I hope I get some testers there as well
13:40:07 do not enable it by default just yet
13:40:19 mmalchuk: the current patch disables it
13:40:23 just have minimal testing coverage
13:40:25 frickler: agree
13:40:47 okay, so a basic (HA?) job that just tests if enabling the flags works?
13:41:00 well, I guess multinode would be best
13:41:09 seems like a good compromise?
13:41:13 but singlenode would at least tell us it works
13:41:14 :)
13:41:17 yeah multinode, that's what I meant
13:41:24 because now we can only assume it works
13:42:00 Ok, I'll think of something
13:42:12 it should work ;) it's at least supported by oslo :D I'll add some very basic test, probably not this week, so if someone beats me to it, I'm also fine with that :)
13:42:13 Not counting that we just changed the default to HA
13:42:29 and mattcrees spent some time to enable it in upgrade jobs including a cleanup ;)
13:42:54 and probably moving from HA to quorum involves the same dance
13:43:11 yeah, it's a little different, but in practice you have to recreate all the queues.
13:43:15 what about adding it to the "experimental" pipeline for now? Won't guarantee it won't regress, but at least that way the job lives in the repo.
13:43:56 or switch multinode (not multinode-upgrade) to quorum queues and fix the upgrade bit in C
13:44:11 this way we will have some better testing coverage and a path forward
13:44:12 let's discuss further stuff in the patch review, what do you think?
13:44:18 but anyway, we've spent too much time on this :)
13:44:31 yes, let's discuss there
13:44:35 * kevko is upgrading to zed right now, but watching
13:44:45 Merged openstack/kayobe stable/2023.1: bifrost: Populate bifrost host vars on deprovision https://review.opendev.org/c/openstack/kayobe/+/898561
13:45:33 ok then
13:45:42 I don't think there is anything more
13:45:59 we still lack Kayobe reviews
13:46:01 https://review.opendev.org/c/openstack/kayobe/+/861397
13:46:07 and https://review.opendev.org/c/openstack/kayobe/+/879554
13:46:17 half a year
13:46:31 almost
13:46:56 Passed that on internally in SHPC
13:47:11 thanks
13:47:18 Let's finish for today - thank you all for coming and speak to you on Monday :)
13:47:21 #endmeeting
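
For readers following the quorum-queues thread above, a minimal sketch of the opt-in discussed there, as it could look in /etc/kolla/globals.yml. The variable name is assumed from the patch under review (https://review.opendev.org/c/openstack/kolla-ansible/+/898543), mirroring the existing om_enable_rabbitmq_high_availability flag, and had not merged at the time of the meeting:

    # globals.yml -- hypothetical opt-in for new deployments only; the flag name
    # om_enable_rabbitmq_quorum_queues is taken from the patch under review and
    # could still change before merge
    om_enable_rabbitmq_quorum_queues: true
    # quorum queues replace classic queue mirroring, so the HA flag stays off here
    om_enable_rabbitmq_high_availability: false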
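
And on the open question of what a CI job for this might look like: one possible shape is a non-voting multinode job that only flips the new flag on top of an existing scenario. All names below (parent job, override variable) are hypothetical placeholders rather than existing kolla-ansible Zuul definitions; the real jobs wire globals.yml overrides through their own templates:

    # Hypothetical Zuul job sketch -- names and the override hook are placeholders
    - job:
        name: kolla-ansible-ubuntu-multinode-quorum-queues
        parent: kolla-ansible-ubuntu-multinode   # assumed multinode base job
        voting: false                            # start non-voting or in the experimental pipeline
        vars:
          kolla_ansible_globals_overrides:       # placeholder for the real override mechanism
            om_enable_rabbitmq_quorum_queues: true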