19:01:28 #startmeeting infra
19:01:28 Meeting started Tue Nov 16 19:01:28 2021 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:28 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:28 The meeting name has been set to 'infra'
19:01:51 o/
19:02:11 #link http://lists.opendev.org/pipermail/service-discuss/2021-November/000296.html Our Agenda
19:02:23 o/
19:03:22 #topic Announcements
19:03:35 Gerrit User Summit is happening virtually December 2&3
19:04:20 also openinfra live keynotes tomorrow and thursday
19:04:46 also virtual
19:04:53 15:00-17:00 utc both days, though may run longer
19:05:05 basically this year's replacement for the openinfra summit
19:05:10 Anyway thought I would throw that out there as I've always missed previous gerrit user summits as they don't seem to get as much advertisement
19:05:17 and others may be interested in attending
19:06:21 Also next week is a US holiday week. I wasn't planning to take the whole week off, just the end of the week, but then school was cancelled at the last minute for the whole week
19:06:40 i expect to be around most of the week, in case anything comes up
19:06:49 This means I'm going to try and avoid being around during the whole week to spend time with family. I'm inclined to cancel our team meeting next week as a result but happy for others to run a meeting if they like
19:06:52 though will likely be busy thursday, maybe friday
19:08:24 i'd do it but given low attendance doesn't seem worth it
19:08:35 seems better to force some time away :)
19:08:46 wfm, we'll cancel next week then. See you all here in 2 weeks
19:08:52 #topic Actions from last meeting
19:08:53 yeah, i'm fine cancelling
19:08:59 #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-11-09-19.01.txt minutes from last meeting
19:09:07 We don't seem to have recorded any specific actions
19:09:12 #topic Topics
19:09:23 #topic Improving OpenDev's CD Throughput
19:09:51 I've got new motivation to pick this up again and really do hope to do so this week (I was sick last week otherwise I'd like to think I would've looked at these)
19:10:07 Yesterday we landed a change to our base inventory stuff which caused all the jobs to run and it took a couple of hours
19:10:30 Right now we are in good shape when we run jobs in a targeted manner but when we do those base updates it is painful waiting for everything to complete.
19:11:49 Anyway that's the motivation. I really do hope to look at the changes and figure out why CI isn't super happy with the later changes
19:11:58 #topic Gerrit Account Cleanups
19:12:26 I still haven't heard from the most recent user I updated so assume that was fine. I haven't done the mass emails yet but need to explicitly block off time for that. Maybe week after next and just work through those
19:12:44 It is too easy to deprioritize this effort, but it is very close to being done so I should just get it done
19:14:30 #topic Zuul Multi Scheduler
19:14:40 Zuul is still running with two schedulers and seems significantly more stable
19:14:53 zuul01 and zuul02 are the two hosts and zuul02 is the primary and runs zuul-web
19:15:08 been running that way since the weekend, and we even had a zero-downtime rolling restart of the schedulers
19:15:10 zuul-web has been updated to talk to zk directly which means you should always get a consistent view from the status pages now
19:16:00 This is feeling like it is stabilising now, and maybe we'll get another zuul release in the near future (though there are a few inflight fixes that we will ideally restart opendev on to sanity check first, good news is those can hopefully be done without downtime now)
19:17:34 Anyway just an update on that as it's been a thing lately. Hopefully less and less a thing as we stabilize it further
19:18:31 #topic User management on our systems
19:18:48 Thank you fungi and ianw and frickler and others for the reviews on this. We've landed the first changes in starting to clean this up
19:19:14 unnecessary users should now be removed and we've adjusted the uid and gid range on our systems (though any uids/gids outside of that range haven't been moved)
19:19:27 #link https://review.opendev.org/c/opendev/system-config/+/816769/ Give gerritbot and matrix-gerritbot a shared user
19:19:32 is the next step in that process I think
19:19:53 please review that one carefully as I suspect this will act as a sort of template as we go through and update other services to do similar
19:21:51 separately, if we want to work on shifting mariadb uids over to match the services the db is running for, I think we can probably start that work in parallel though I expect it to be fiddly
19:22:03 thanks, sorry reviewing of these got a bit distracted, but will do
19:22:08 thanks.
19:22:17 re mariadb I suspect that etherpad would be a good first candidate
19:22:31 since it is fairly self contained and doesn't have a lot of moving parts like gitea or gerrit
19:23:04 #topic Caching of openstack/openstack on our DIB image builds
19:23:12 with mariadb, the idea would be to have a specific mariadb user, or use the same user as the etherpad service runs under?
19:23:15 #undo
19:23:15 Removing item from minutes: #topic Caching of openstack/openstack on our DIB image builds
19:23:35 sorry, i'm typing slowly today
19:23:50 fungi: I think use the same user as the etherpad service. Since all of the services running etherpad have rw access to the db there isn't much to gain from separating them
19:24:10 if we ran a central db and shared it with many services I would say they should be separate but in this case we run a mariadb separately for each service
19:24:23 and this helps simplify things. Though we could split users if there was a good reason to
19:24:41 yeah, makes sense as long as multiple services don't share a common db server
19:25:11 thanks, that answers my question
19:25:22 mariadb already runs as 999 though right?
19:26:00 ianw: yes
19:26:02 which unfortunately conflicts with the range of accounts that might get taken by distro packages of things
19:26:12 right the concern is we don't want the overlap with system usage
19:26:30 we should shift it to our uses and having it run as the same user as the service seems fine since the db is rw and separate for each service
19:26:37 hence our desire to carve up separate ranges
19:27:34 ++
19:29:26 #topic Caching of openstack/openstack on our DIB image builds
19:29:44 I wasn't sure if this would be fixed or not by the time of the meeting so I had this on the agenda to discuss simply removing openstack/openstack from caching
19:30:04 But ianw was able to track this down to a weird git interaction (I kinda think it might be a bug?) between using the git dir option and how submodules are checked
19:30:35 thankfully there is the -C option as an alternative to the git dir option and that should fix things. The dib change to switch to -C has landed but I think we need a release and new nodepool images to take advantage of it
19:30:56 it might be a bug that git made it so confusing as to what was going on, but it does appear to be mostly our fault
19:32:04 at least git is written in C and builds with make, so it wasn't a multi-day setup to instrument it and figure out what it thought it was doing :)
19:32:54 ya I mean it seems like git should be resilient to people passing flags like that
19:33:08 either by saying "I can't function in this situation" explicitly or by figuring out a way to function
19:33:09 i'll probably do a dib point release and update nodepool, because it's very racy getting images built atm
19:34:41 sounds good, and thank you for looking into that. An upside to using containers this way I guess
19:34:45 i think openstack/openstack project is fine, but it's still not clear to me how it's actually setup
19:35:05 ianw: gerrit has a magic flag where it will auto update submodules in a repo if it hosts the submodule repos too
19:35:26 ianw: so basically if you add a submodule to that repo and it is on our gerrit, gerrit will make commits to it automatically as we merge changes to those repos
19:35:47 (it's an explicit config option on the repo that has to be set) The idea when it was set up was that it could be used to track a log of the order things land in
19:35:59 but it never really got used and has become very bit rotten (it lacks repos iirc)
19:36:43 ahh, yeah that's what i was wondering, what maintains it
19:38:08 it looks like there's a "generate-gitmodules.py"
19:38:19 but, it doesn't look like this auto-runs or auto-proposes?
19:39:54 I think it is in gerrit itself
19:40:11 ianw: https://gerrit-review.googlesource.com/Documentation/user-submodules.html
19:41:43 in theory the openstack tc is supposed to be maintaining the list of included submodules in it
19:42:18 $ git review
19:42:18 ssh://iwienand@review.opendev.org:29418/openstack/openstack.git did not work. Description: fatal: Upload denied for project 'openstack/openstack'
19:42:31 there might be some automation in openstack/governance, like the script they run which creates github projects for mirroring into
19:42:33 interesting. it looks like AJaeger used to mostly run the script and update .gitmodules in there
19:43:15 anyway, worth checking with the openstack tc to see if anyone's maintaining it
19:43:55 "2. configure the submodule to allow having a superproject subscribed" was the step i was unclear on
19:44:26 submodule.enableSuperProjectSubscriptions seems like it defaults to true, so that explains that bit
19:45:05 i guess we have done something in All-Projects for this repo?
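For readers following the 19:30 exchange about the git dir option versus -C: git documents that --git-dir only tells it where the repository data lives, and that without a --work-tree the current directory is treated as the top of the working tree, whereas -C makes git behave as if it had been started in the given path. The sketch below builds both command forms and, given a git binary on PATH, asks each where it thinks the work tree is. Whether this is the exact interaction that broke the submodule check in dib is an assumption; the helper names are made up for illustration and are not taken from dib.

```python
import subprocess
import tempfile
from pathlib import Path


def git_dir_argv(repo, *args):
    """Command that only points git at the repository data (.git)."""
    return ["git", f"--git-dir={Path(repo) / '.git'}", *args]


def git_c_argv(repo, *args):
    """Command that makes git act as if it was started inside `repo`."""
    return ["git", "-C", str(repo), *args]


def show_toplevel_both_ways():
    """Report what each invocation style says the work tree root is.

    Hypothetical demo helper: needs git on PATH. Both commands are run
    from an unrelated directory so any difference becomes visible.
    """
    with tempfile.TemporaryDirectory() as repo, \
            tempfile.TemporaryDirectory() as elsewhere:
        subprocess.run(["git", "init", "--quiet", repo], check=True)
        results = {}
        for label, build in (("--git-dir", git_dir_argv), ("-C", git_c_argv)):
            proc = subprocess.run(
                build(repo, "rev-parse", "--show-toplevel"),
                cwd=elsewhere,
                capture_output=True,
                text=True,
            )
            # Keep stderr too, in case a given git version refuses to answer.
            results[label] = (proc.stdout.strip(), proc.stderr.strip())
        return results
```

Calling show_toplevel_both_ways() on a machine with git installed should make it easy to see whether the two styles agree about the work tree for a given git version.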
19:45:53 that could be
19:46:04 I wasn't involved in the setup so don't recall how it was done
19:47:30 #topic Open Discussion
19:47:31 fungi: i think the answer empirically is "no" https://opendev.org/openstack/openstack/commits/branch/master/.gitmodules
19:47:51 Figured I'd open it up to any other discussion
19:49:08 clarkb: https://review.opendev.org/c/opendev/system-config/+/817301 was a quick one from your comment on doing a mark/unmark/mark cycle on the gerrit testing
19:50:14 ah yup +2
19:50:26 i had been under the impression this repo was automatically keeping itself up to date
19:50:46 given that it isn't, i guess i'll post to the list and propose retiring it
19:51:09 one of the main reasons they've wanted it kept around is the "cncf landscape"
19:51:39 for a project to be included on the landscape, it must be represented by one (and only one) repo on github
19:52:09 so the github mirror of the openstack/openstack superrepo is how the lf/cncf measures the level of "activity" for the openstack project
19:52:57 interesting
19:53:23 since i couldn't push an update to the .gitmodules, i'm guessing it's locked down to some group
19:53:51 the first thing i'm thinking is that the proposal bot should run the project update
19:54:23 push = group Release Managers
19:54:32 exclusiveGroupPermissions = Push
19:54:54 that's in the [access "refs/for/refs/*"] section of gerrit/acls/openstack/openstack.config
19:55:31 or, could we do something more like have zuul run it in a periodic job?
19:55:49 but where would the zuul job get permissions to propose/push the change?
19:56:14 could use the same account which pushes tags for openstack
19:56:21 it's a member of release managers
19:57:55 seems reasonable
19:58:36 you'd still need someone to review them? or are you saying just push directly?
19:58:47 if it ran and just +2 +W'd its change it would seem to keep it in sync
19:59:00 yeah, i'm thinking that reality of people reviewing is low
19:59:26 we could adjust the acl to give create = group Release Managers in [access "refs/heads/*"] and bypass reviewing, i suppose
19:59:52 since it's based on the YAML file that is reviewed it seems low risk
20:01:14 is it running any testing from project-config?
20:01:40 i don't think so
20:01:56 https://review.opendev.org/c/openstack/openstack/+/741207 ... just noops
20:01:58 which is another reason direct pushing to the branch might be more sensible
20:02:28 ok, well i'll put it on the todo. could be an interesting exercise in zuul jobbing
20:10:22 seems like we've probably finished meeting. thanks clarkb
20:10:36 #endmeeting
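
As a rough illustration of the automation idea floated in open discussion (a periodic job regenerating .gitmodules from an already-reviewed project list), here is a minimal sketch of the rendering step only. It is not the existing generate-gitmodules.py, which was not shown in the meeting. The function name, the relative "../name" URL layout, and the "branch = ." line (which Gerrit's submodule documentation describes as following the superproject's branch) are assumptions made for the example, and the project names are placeholders.

```python
def render_gitmodules(projects):
    """Render .gitmodules text for an iterable of 'namespace/name' strings.

    Hypothetical helper: the stanza layout is an assumption, not a copy
    of what openstack/openstack actually uses.
    """
    stanzas = []
    # De-duplicate and sort so repeated runs produce identical output,
    # which keeps an automated job from proposing no-op changes.
    for project in sorted(set(projects)):
        short_name = project.split("/")[-1]
        stanzas.append(
            f'[submodule "{short_name}"]\n'
            f"\tpath = {short_name}\n"
            # Assumes submodules sit next to the superproject on the server.
            f"\turl = ../{short_name}\n"
            f"\tbranch = .\n"
        )
    return "".join(stanzas)


if __name__ == "__main__":
    # Placeholder input; a real job would read the reviewed YAML list.
    print(render_gitmodules(["openstack/nova", "openstack/neutron"]), end="")
```

Keeping the output deterministic is the main design choice here: a job that pushes directly to the branch, as suggested at 20:01:58, would only need to commit when the rendered text differs from the file already in the repo.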