14:00:01 #startmeeting kolla
14:00:01 Meeting started Wed Feb 7 14:00:01 2024 UTC and is due to finish in 60 minutes. The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:01 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:01 The meeting name has been set to 'kolla'
14:00:04 #topic rollcall
14:00:04 o/
14:00:08 \o
14:00:18 o/
14:00:30 |o
14:00:31 \o
14:00:33 \o
14:00:35 o/
14:00:46 \o/
14:01:14 \o
14:01:14 o/
14:01:21 o/
14:01:38 o/
14:02:04 #topic agenda
14:02:04 * CI status
14:02:05 * Release tasks
14:02:05 * Regular stable releases (first meeting in a month)
14:02:05 * Current cycle planning
14:02:06 * Additional agenda (from whiteboard)
14:02:06 * Open discussion
14:02:08 #topic CI status
14:02:28 So, we broke OVN jobs - fixing that with https://review.opendev.org/c/openstack/kolla-ansible/+/908166
14:02:49 Rocky9 Ironic CI is suffering from ironic-api vs ironic-inspector race
14:02:57 any other CI issues that I haven't noticed?
14:03:07 And centos9 decided to break libvirt
14:03:11 cephadm jobs?
14:03:30 last time I checked those were failing
14:03:47 yeah, those fail from time to time, but maybe there's something new in them
14:03:53 anyway, CI needs some love
14:03:57 yoga is gone to unmaintained, so upgrade jobs in zed will be failing, too. I pushed patches already
14:04:14 Ok, let's merge those if they pass
14:04:23 #topic Release tasks
14:04:48 R-8 week - nothing planned for it in release schedule
14:05:07 #topic Regular stable releases
14:05:16 Did we get in the stable releases in Jan?
14:05:21 yes
14:05:40 fantastic, so then it's time for Feb releases (excluding Yoga of course)
14:05:44 Any volunteer?
14:06:07 can do
14:06:16 aka will do
14:06:18 updating the docs to drop yoga would also be nice
14:06:34 good idea
14:06:45 #topic Current cycle planning
14:06:53 We bumped Ansible
14:07:00 jovial: Would be nice to do the same in Kayobe
14:07:12 Sounds like a good idea
14:07:13 I started working on the OVN BGP Agent
14:07:38 The same for Ubuntu 24.04 LTS - so we don't do it last minute
14:08:11 slightly OT but might be good to be aware of, this OVN SNAT bug, if you haven't seen it already: https://bugs.launchpad.net/bugs/2051935
14:08:14 SLURP patch would like some reviews I guess - https://review.opendev.org/c/openstack/kolla-ansible/+/905322
14:08:24 I think I'll add secure RBAC for ironic to the release tasks, as it looks we need to push it there this release
14:08:48 SvenKieske: that's probably a bit unusual setup, as in two levels of routers
14:08:49 +1 I'm happy to review the rbac and service role stuff, want to be done with it :D
14:09:34 true, regarding the bug, but still a little bit disturbing.
14:09:38 anybody working on anything from the list?
14:09:56 list == whiteboard L231
14:10:26 I pestered some folks from OSBA regarding mirrors at fosdem
14:10:54 Any luck?
14:10:57 might be we actually get new mirrors for a more stable CI, but I believe it when I see it (I guess I will need to do more talking still)
14:11:24 Ok then, good luck ;)
14:11:28 it was promised to me under the influence of some alcohol, so let's see how the promise holds up once everyone is sober ;)
14:12:07 Ok, let's move to topics from whiteboard
14:12:11 #topic Additional agenda (from whiteboard)
14:12:21 (SvenKieske): https://bugs.launchpad.net/kolla-ansible/+bug/2049762 (service token verification in cinder wrong?)
14:12:45 SvenKieske: I guess after bbezak's work - we could just stop setting service_token = admin?
14:12:52 yeah
14:13:24 we should
14:13:31 I proposed a singular patch for that, I was curious if our CI would break, it didn't, at least not obvious.
I guess it's a matter of taste if we want two patches for that
14:14:19 ok, we should get all service roles/tokens/ironic system scope patches as RP+1 and start reviewing them
14:14:20 I'm fine either way, we probably need to discuss the service role patch distinctly. I think it can be much simpler than it currently is.
14:14:35 bbezak: can you group them in a topic and do RP+1?
14:14:36 big +1 from me on getting this stuff over the line :)
14:14:44 service role is ready to review https://review.opendev.org/c/openstack/kolla-ansible/+/815577
14:14:48 the other one I have some more ideas
14:15:15 but yeah, I'll group them in the topic
14:15:21 goodie, thanks
14:15:37 (bbezak): Service role discussion - https://review.opendev.org/c/openstack/kolla-ansible/+/815577/
14:15:43 anything to discuss here?
14:16:29 indeed, there were some questions from frickler and SvenKieske if we need admin role still for service users
14:16:38 and we probably don't for some services
14:16:55 however not all projects implemented service role support
14:17:04 https://etherpad.opendev.org/p/rbac-goal-tracking#L48
14:17:44 but if this isn't ready upstream, do we need to adopt it at all already?
14:18:15 ironic needs it, cinder apparently too for this service_token
14:18:33 frickler: well some projects (cinder/nova) do regard our current handling of admin role as a security bug, if you read https://bugs.launchpad.net/kolla-ansible/+bug/2049762
14:18:58 so then only change the accounts for those projects? also hurray for openstack doing wildly inconsistent stuff once again
14:19:24 Seems sane to adopt it for the projects that support it
14:19:25 yeah, I think we should implement what works today, and track the rest of the projects
14:19:27 I think we should configure stuff with the minimum needed roles possible, obviously. and if we need to maybe split up the existing service_ks_register role for that, fine.
14:19:55 bbezak: seems you went in a nice rabbit hole
14:20:05 :)
14:20:10 I actually don't think we should let users override this, but if users are already using it, we can't of course deprecate this functionality this fast.
14:20:33 I adopted old change that did the same thing and polished it with new services etc.
14:20:34 it is a rabbit hole, for sure. at least I learned some stuff about ironic and rbac in keystone :)
14:20:48 I'm fine with going with service role only for ironic/cinder for now
14:20:57 let's see if it will work
14:21:03 I'll focus on ironic
14:21:15 There's Neutron mentioned in the rbac goals
14:21:17 and SvenKieske could for cinder with his patch for service tokens
14:21:47 Problem with service tokens in cinder is that we need to backport this all the way to zed (unmaintained/yoga ?)
14:21:59 So maybe the question is what is the minimum set cinder needs
14:22:02 okay for me, but then I guess I need to adapt it to explicitly use the "service" role. I'm not sure I understand our config merging code in this regard :D
14:22:15 mnasiadka: ack
14:22:36 I thought that adding service roles for all services is a solution that would solve our issues in the future
14:22:50 bbezak: as long as those services support it :)
14:22:51 but we could do that selectively too
14:22:58 and it seems some support it in 2023.1, some 2023.2, etc
14:23:14 adding service role won't hurt :)
14:23:14 So seems like fantastic mess
14:23:34 yeah well, if it's not needed and supported, then maybe it doesn't make sense
14:23:42 I agree
14:23:56 so maybe we should have per service/service group patches
14:24:01 75% agree :)
14:24:17 I know that's more work, but this way we can decide what to backport
14:24:30 but yeah it is a mess. Ironic being in fact only service with system scope is somewhat breaking my mind
14:24:36 bbezak: I think in the long run your approach is fine, I don't know if we need to patch each service for that, though. maybe have three widgets for this?
$service_role_default=service $service_role_not_migrated_yet=admin $service_role_user_override_beware_here_be_dragons=foobar
14:24:39 but that's different story
14:25:07 bbezak: you are not alone in that, I still don't know if I understood all this really (I think I didn't)
14:25:52 maybe someone needs to draw a nice flowchart how this works :D
14:26:32 Well, I think it might make sense to implement service roles for those projects, that support that today
14:26:33 I guess the service user doesn't get any more perms with the service role over admin. So I can see bbezak's point of doing it in a big bang.
14:26:39 we have a list on the etherpad
14:27:28 I don't think just adding the role fixes anything, still we need some per-service configuration (e.g. cinder.conf) entries, right?
14:27:40 I was under the impression bbezak did refresh that etherpad, is the information in there current, or stale?
14:27:57 old etherpad with system scope is stale
14:28:08 mnasiadka: at least for some services I think a customization is currently needed (not 100% sure)
14:28:13 mnasiadka: adding service roles is just initial thing yes
14:28:42 ironic for instance needs also that - https://review.opendev.org/c/openstack/kolla-ansible/+/908007
14:28:53 if not then system scope member user
14:29:20 yeah, I get that - and on Ironic side they enabled enforcing new defaults
14:29:32 meaning enforcing system scope
14:29:53 I'm not a fan of adding a role to a user just because 7 years later they might support service roles
14:30:23 ok, I'll add it just for ironic.
will check if it will cope with just service role, or it will need admin too
14:31:51 I can have a look on Neutron, as in how to switch it to use service role in service-to-service communication
14:31:59 furthermore the initial change for adding service roles to all projects was somewhat agreed within the comments of then PTL
14:32:11 but I agree that things pivoted since then
14:32:23 out of system scope most importantly
14:33:23 We assumed every project will implement it in a reasonable time
14:33:31 Now it seems it's not that simple
14:33:59 scope implementation and service role are different phases
14:34:11 and scope died, so :)
14:34:22 well, system scope died
14:34:26 yeah
14:34:32 ok, I'm done with secure rbac for now thx :)
14:34:44 xD
14:34:45 #link https://governance.openstack.org/tc/goals/selected/consistent-and-secure-rbac.html#change-in-scope-implementation
14:35:28 ok, let's move on
14:35:39 (mnasiadka): Add unmaintained/* reviews to Gerrit dashboards (another section or stable backports)
14:35:55 basically gerrit bot doesn't announce patches to that branch
14:36:03 and we don't see them in the review dashboard as well
14:36:23 so either we get a new section in the review dashboard called Unmaintained branches
14:36:34 or we change the query for stable backports
14:36:39 or we basically don't care
14:36:42 which one do we choose?
14:37:10 I'm confused (again) why do we want notifications on unmaintained branches? wasn't the point that the "unmaintained" team handles those?
14:37:18 * frickler don't core
14:37:35 so I would opt for don't care, but maybe someone has a compelling reason? I can't think of any though.
14:37:35 ehm ... care even ... but core is also not wrong ;)
14:37:48 I'm confused with Kayobe - there is 2 commits in stable/yoga which are not in unmaintained/yoga
14:37:51 Maybe another question
14:37:56 mmalchuk: this one later
14:38:13 Who wants to maintain unmaintained/yoga for Kolla/Kolla-Ansible apart from SHPC?
14:38:26 me
14:38:45 I'm always care for backports
14:38:50 I may contribute drive-by backports on a case by case basis, but I wouldn't count that officially :)
14:39:13 then likely you should create a kolla-unmaintained-core group and amend the gerrit acls
14:39:27 I can find an example patch after the meeting
14:39:43 makes sense
14:40:14 frickler: what if we would prefer that the kolla-core group is core for unmaintained branches? :)
14:40:51 mnasiadka: well I would prefer to not be core for unmaintained
14:41:11 shouldn't kolla-core be cleaned up either way? I swear I've seen people in their where their last patch/contribution was somewhere in 2017 or so?
14:41:23 that's yet another topic
14:41:24 there*
14:41:41 frickler: but by default what are the ACLs? kolla-core has rights or some openstack-unmaintained-core?
14:41:50 the latter only
14:41:53 ok
14:42:08 I'll create kolla-unmaintained-core and kayobe-unmaintained-core
14:42:41 and add current cores for starters, nobody is forced to review anything - just like in the usual EM branches
14:43:11 mnasiadka: cf. I169e52d5fb545c675549ce06fef1ca2f8eb1de86
14:45:16 frickler: thanks
14:45:19 ok, let's go forward
14:45:28 (dougszu): Discuss sending all service and infra logs to journald (don't shoot).
14:45:36 dougszu: please elaborate ;-)
14:45:44 about 2 commits above unmaintained/yoga in Kayobe?
14:46:05 mmalchuk: they will be merged in unmaintained/yoga once requirements repo has unmaintained/yoga
14:46:08 So basically, oslo.log can output additional logging info, that we don't currently get: https://docs.openstack.org/oslo.log/latest/admin/journal.html
14:46:12 now nothing is mergable
14:46:45 There are two ways to get the extra info - write out logs in JSON format, which means they are less readable on the box, or send everything to journald
14:47:12 dougszu: I'm personally a big +1 on this, as it streamlines the logging infrastructure more.
I don't know about the implementation though, but I guess I already commented on the patch.
14:47:20 only one +2 and +w for unmaintained branches ! :)
14:47:24 mnasiadka stable/yoga would be dropped as after merge?
14:47:31 mmalchuk: yup
14:47:37 thanks
14:47:46 kevko: once they start working we can go back to that
14:47:49 thanks Sven - I could look at alternatives to writing direct to the journal in the patch
14:48:18 I have a slightly complicated question - you know that RH-clones do not persist journal across reboots?
14:48:23 Is there a link to the patch?
14:48:49 there is no patch yet - some thoughts on the etherpad: https://etherpad.opendev.org/p/KollaWhiteBoard#L63
14:48:53 Persisting the journal can be enabled though, right?
14:48:58 mnasiadka: well that's configurable, isn't it? just create /var/log/journal
14:49:04 IIRC all you need to do is create the default directory
14:49:18 frickler: true, but still that's something that needs to be included
14:49:30 I assume we're speaking about not logging anymore to /var/log/kolla?
14:49:39 well we should probably do a robust config, not just create the directory, that means taking care that it doesn't overflow etc.
14:50:07 Correct - I am proposing to stop logging to /var/log/kolla, and hand over everything to journald
14:50:21 we could also for the first part redirect /var/log/journal to /var/log/kolla, it's also configurable which directory to use, probably better for older users
14:50:42 SvenKieske: that's not the same format, not really human readable
14:50:42 the location is totally orthogonal to the mechanism being used..
14:50:43 i like var/log/kolla :)
14:50:53 me too
14:50:54 it is not only a matter of the directory, also text format vs. journal format
14:50:58 sure it's human readable, just use "journalctl" ;)
14:51:12 SvenKieske: cat vs journalctl, err...
no
14:51:24 if i need to choose between journalctl and tail ..i am voting for tail
14:51:28 but this is still a third orthogonal problem, can we please stop mixing problem spaces all the time? :)
14:51:50 journalctl is pretty good if you take time to read the man page
14:51:58 journalctl has some nice filtering options such as log priority
14:52:18 so we have three problems: a) using journald b) which location on the FS to log to c) which binary format to use (utmp is also a binary log file btw)
14:52:23 We all understand that, but you know that proposal is like dropping Docker?
14:52:26 filtering bad with multiline logs
14:53:14 mmalchuk: there is no multi-line regex with this approach, it should fix some issues with that
14:53:22 so which problem do we want to talk about? all at once? because you can of course configure journald to output plain text and the location is configurable, so this really has nothing to do
14:53:22 I'm pretty sure there will be a lot of people disliking that approach
14:53:25 why to not provide user an option if he want to do a or b ?
14:53:36 maintenance
14:54:05 You could perhaps configure fluentd to write back out kolla logs from the journal
14:54:14 So, we have 6 more minutes.
14:54:24 Don't we only care about giving the services access to the journal socket. Where the logs are stored are up to how you configure the host OS.
14:54:35 Unless dougszu can formulate the proposal in depth in a separate etherpad - we will need to discuss that at the PTG.
14:55:02 PTG sounds good, thanks all
14:55:09 I can't see an option we stop writing text files to /var/log/kolla/$service without a proper research, asking users on the ML and providing long deprecation phase
14:55:18 grep -ri error /var/log/kolla :D
14:55:21 well the current state of affairs is at least very inconsistent, afaik, correct me if I'm wrong but: a) we have logs shipped with fluentd into opensearch b) we have(?)
some local only logs in /var/log/{kolla}, c) we have stuff like docker logs which are not persisted anywhere afaik. d) I honestly don't know what podman does e) we have journald installed by default for stuff like kernel/systemd logs by
14:55:23 default afaik anyway..
14:55:29 SvenKieske: PTG
14:55:46 It's a very big change
14:55:54 Unless we are going to support both modes
14:56:08 it doesn't have to be. stop making things complicated. like, really!
14:56:14 :D
14:56:21 Supporting both looks like it may be possible, right?
14:56:44 SvenKieske: yeah and only b is the place where you are sure where all logs are present :D
14:56:45 s/syslog-ng/journald/ works, you know. if you do the proper config dance.
14:56:46 mission impossible)
14:56:49 Well, yes - but one of them won't have test coverage :)
14:56:53 Just use log_file and use_journal at the same time?
14:56:57 kevko: not true, there aren't all logs in b)
14:57:10 SvenKieske: okay + docker logs
14:57:15 but I agree it seems to be a PTG topic, or for some larger meeting at least
14:57:52 SvenKieske: fluent + opensearch works ...but until we will not drop parsing and regexp and will not use python fluent logger ...it's 75% working logging system
14:57:55 the current logging state is a mess, to be honest. but still we need some careful planning to improve upon it and don't make it worse :)
14:58:10 I want to ask, i started working on refactoring docker worker to not use low level client and instead use client similar with podman, i guess logical steps are to first merge this refactor and then try to put as many functions into container worker. What do you think about it?
14:58:33 kevko: last time I looked there are no kernel logs in fluentd? :D it's a mess! ;)
14:58:34 halomiva: I think we refactored Kolla to do the same, right?
14:59:09 halomiva: sounds sane on the surface at least
14:59:11 SvenKieske: nobody stops you to run syslog-ng and forward logs to fluentd
14:59:20 mnasiadka: i think yes
14:59:35 halomiva: so fine, I've seen the patch - is it ready for reviews?
14:59:53 mnasiadka: right, and you can do the very same thing with journald, so that's not really a big thing, if you just talk about using journald (without all the binary log file blabla)
15:00:13 ok, it's 16:00
15:00:13 waiting for tests to finish, i tested it locally on basic deployment and it worked so we will see after tests
15:00:26 It's the first meeting in last 2 years when we did use the full hour
15:00:31 Thanks for coming guys!
15:00:37 #endmeeting
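
A minimal sketch of the three-variable idea floated at 14:24:36 (a default "service" role, an "admin" fallback for projects that have not migrated yet, and an explicit user override). Every name below is hypothetical and none of them is a kolla-ansible variable; the set of migrated services is a placeholder based on the ironic/cinder focus agreed in the meeting, not a statement about upstream support.

```python
from typing import Optional

# Hypothetical defaults mirroring the three "widgets" from the discussion.
SERVICE_ROLE_DEFAULT = "service"
SERVICE_ROLE_NOT_MIGRATED_YET = "admin"

# Placeholder for this sketch only: services assumed to accept the service role.
MIGRATED_SERVICES = {"ironic", "cinder"}


def resolve_service_role(service: str, user_override: Optional[str] = None) -> str:
    """Pick the Keystone role to assign to a service user.

    Precedence: an explicit override wins, then "service" for migrated
    projects, then "admin" for everything else.
    """
    if user_override:
        return user_override
    if service in MIGRATED_SERVICES:
        return SERVICE_ROLE_DEFAULT
    return SERVICE_ROLE_NOT_MIGRATED_YET


if __name__ == "__main__":
    # "example_service" is a made-up name standing in for a non-migrated project.
    for name in ("ironic", "cinder", "example_service"):
        print(name, resolve_service_role(name))
```

Keeping the override as the highest-precedence input matches the concern raised at 14:20:10 that existing user overrides cannot be deprecated quickly, while a per-service set leaves room to decide what to backport.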