19:01:24 <clarkb> #startmeeting infra
19:01:25 <openstack> Meeting started Tue May  5 19:01:24 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:28 <openstack> The meeting name has been set to 'infra'
19:01:35 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2020-May/000017.html Our Agenda
19:01:59 <ianw> o/
19:02:01 <diablo_rojo> o/
19:02:10 <clarkb> #topic Announcements
19:02:28 <clarkb> Just the standing reminder that we are meeting here now. I can probably drop that at this point
19:02:38 <clarkb> #topic Actions from last meeting
19:02:46 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-28-19.01.txt minutes from last meeting
19:02:58 <clarkb> There were no actions from last meeting (maybe need to get better at tracking those)
19:03:12 <fungi> it's a cool meetbot feature
19:03:17 <clarkb> I think we can dive straight into the agenda for today
19:03:19 <clarkb> #topic Priority Efforts
19:03:25 <fungi> but i kinda like having no action items ;)
19:03:26 <clarkb> #topic Update Config Management
19:03:30 <clarkb> fungi: me too :)
19:03:47 <clarkb> First up is a check in on the Gerrit and particularly Gerritbot situation
19:03:56 <clarkb> People keep noticing that gerritbot's config isn't updating
19:04:05 <clarkb> mordred: ^ anything new to share there or things people can help with?
19:04:16 <mordred> uh
19:04:29 <mordred> no - mostly still slogging through the zuul/nodepool rollout
19:05:12 <mordred> so I have made zero progress on gerrit/gerritbot since last week
19:05:24 <clarkb> mordred: is the plan there to deploy it in a container on eavesdrop?
19:05:31 <mordred> BUT - as soon as we rollout nodepool and then do your reorg patch, I'll work on that
19:05:33 <mordred> yes
19:05:44 <mordred> so it should be easy - once there are slightly fewer plates spinning
19:05:48 <clarkb> k
19:05:59 <clarkb> also I suppose if anyone wanted to help they could start with changes to build those docker image(s)?
19:06:01 <mordred> you know - and fewer memory leaks
19:06:06 <mordred> already done
19:06:12 <clarkb> ah perfect
19:06:25 <mordred> done and merged even: https://review.opendev.org/#/c/715635/
19:06:35 <mordred> so it's really just adding the ansible to run gerritbot on eavesdrop
19:06:54 <mordred> but - that's new jobs, so I think is an after-clarkb-zuul.yaml-split sort of thing
19:07:12 <mordred> else we might lose our minds ;)
19:07:18 <clarkb> #info Next up for Gerritbot is to add ansible to run the gerritbot docker image on eavesdrop. Will want to do it after the system-config zuul config reorg
19:07:50 <clarkb> Lets talk about that then. The bit we are waiting for before reorging system-config's zuul configs is getting nodepool running via ansible and docker
19:08:08 <clarkb> mordred: I think most of the changes for that are ready to go? it's just a matter of deploying them?
19:08:46 <mordred> yup
19:08:50 <clarkb> maybe we can pick a launcher and a builder to convert, test that everything works on them, then roll out to the other launchers and builders? my biggest concern is that with openstack doing release things nowish we want to avoid big disruptions, but a staged rollout should be pretty safe
19:09:11 <mordred> yeah - we'll have chowns to do like we did for zuul
19:09:26 <mordred> but - it's launchers and builders - so one being down for a minute isn't a big deal
19:09:38 <clarkb> yup rolling updates on those is pretty small impact
19:09:44 <mordred> clarkb: maybe we put everything in to emergency, then just work them one at a time
19:09:49 <clarkb> ++
19:10:27 <clarkb> then when that is rolled out I can refresh my system-config reorg and hopefully land that quickly before it conflicts with anything
19:11:19 <clarkb> The last thing I wanted to talk about here was trying to minimize user impacts for a bit to help openstack's release process go as smoothly as possible
19:11:32 <mordred> ++
19:11:38 <clarkb> I know we want to restart gerrit and we'll want to restart zuul services on the scheduler once new images land to pick up the jemalloc update
19:12:04 <clarkb> if we want to do that in the next week and a half we should coordinate with the openstack release team
19:12:19 <clarkb> Both should be relatively low impact but zuul-scheduler in particular will dump queues restarting jobs
19:12:47 <mordred> clarkb: we could do the scheduler restart on the weekend when there's less traffic
19:13:17 <mordred> the gerrit restart should be super easy - that one we can probably do whenever - no new functionality - really just the repl config
19:14:18 <clarkb> ya, maybe plan both for friday (release things go quiet on fridays iirc)
19:14:29 <clarkb> I'll ask the release team if that works for them
19:14:52 <mordred> sounds good to me
19:15:11 <clarkb> great. Anything else on the subject of our config management manuevers?
19:15:22 <mordred> oh - re: nodepool, we've also got arm images working through the system - so we should also be able to update nb03 to ansible/docker
19:15:43 <fungi> gerrit restart still needs the release team to avoid pushing release changes though, since in the past they've had trouble with replication lag and a gerrit restart will force a full replication run, right?
19:15:44 <mordred> and the other nb's can get rebuilt as opendev and go ansible/docker too
19:15:55 <fungi> so anything which merges shortly after the restart may not get replicated for a while
19:15:57 <mordred> fungi: it should not force a full repl
19:16:00 <clarkb> fungi: we don't force replication on restart anymore
19:16:07 <mordred> yeah - we finally fixed that
19:16:11 <fungi> oh, excellent, i had clearly forgotten that!
19:16:13 <clarkb> but we'll clear it with them anyway
19:17:37 <clarkb> #topic OpenDev
19:18:17 <clarkb> I made a call for Service Coordinator volunteers with a deadline set for the end of April. Last week I promised I would volunteer to make that more official, and I did not see any other volunteers
19:18:51 <clarkb> I think that means I'm it, but wanted to get some acknowledgement of that beyond me asserting it if that makes sense
19:19:11 <fungi> congratudolences!
19:20:39 <clarkb> Are there any other opendev related topics? I haven't heard anything new on citycloud/vexxhost network trouble so I assume that got sorted?
19:20:40 <AJaeger> thanks clarkb for volunteering!
19:20:42 <ianw> ++
19:20:44 <fungi> oh, in other opendev news, i've started on a poc for reporting aggregate user activity statistics for our services
19:20:48 <fungi> #link https://review.opendev.org/724886 Create new project for OpenDev Engagement Stats
19:21:00 <fungi> once that is approved i can push my initial work in there
19:21:27 <fungi> so far it's just measuring gerrit changes created/merged, revisions pushed, comments added, et cetera
19:21:39 <fungi> queried from the rest api
19:21:53 <fungi> planning to add stuff for counting our mailing list archives, irc logs, and so on
19:22:21 <clarkb> oh related to new projects we fixed the long standing bug where jeepyb didn't update gerrit acls for retired projects
19:22:24 <AJaeger> stackalytics version 3? ;)
19:22:27 <clarkb> fungi: ^ thank you for getting that in and running it
19:22:39 <fungi> not per-user stats, no
19:22:45 <fungi> and nothing related to affiliation
19:23:18 <AJaeger> fungi: so overall commits for all of openstack, all of Zuul?
19:23:25 <fungi> this is more like how many reviewers interacted with projects in a given namespace, how many reviews were done, et cetera
19:23:34 <fungi> yep
19:23:52 <fungi> the gerrit stuff so far has per-namespace and total across the gerrit deployment
19:23:53 <clarkb> more a project health than comparison metrics
19:24:06 <fungi> and can chunk by month, quarter or year
19:24:06 <AJaeger> understood now
19:25:20 <fungi> the idea is that we can at least provide some additional useful numbers to folks who are investing in our services (whether volunteering their time, donating server quota, ...)
19:25:42 <fungi> so that they can see what kind of impact we're having
19:26:28 <fungi> anyway, once the code's up i'll start up a conversation on what direction(s) we should take it in
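The aggregation fungi describes (changes per namespace, chunked by month) can be sketched roughly like this. This is a hedged illustration, not the actual PoC code: the record layout and the count_changes() helper are assumptions; the real tool queries the Gerrit REST API directly.

```python
# Sketch of per-namespace, per-month change counting, assuming change
# records have already been fetched from the Gerrit REST API.
from collections import Counter

def count_changes(changes):
    """Tally merged changes per (namespace, month) from Gerrit change records."""
    counts = Counter()
    for change in changes:
        # Namespace is the part of the project name before the first slash,
        # e.g. "opendev/system-config" -> "opendev".
        namespace = change["project"].split("/", 1)[0]
        # Gerrit timestamps start "YYYY-MM-DD ..."; the first 7 chars give the month.
        month = change["submitted"][:7]
        counts[(namespace, month)] += 1
    return counts

sample = [
    {"project": "opendev/system-config", "submitted": "2020-05-01 12:00:00"},
    {"project": "opendev/gerritbot", "submitted": "2020-05-03 09:30:00"},
    {"project": "zuul/zuul", "submitted": "2020-04-28 16:45:00"},
]
print(count_changes(sample))
```

Chunking by quarter or year would just widen the timestamp slice or map months onto quarters.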
19:27:32 <clarkb> #topic General Topics
19:27:38 <clarkb> First up is the PTG
19:27:47 <clarkb> #link https://virtualptgjune2020.eventbrite.com Register if you plan to attend. This helps with planning details.
19:28:11 <clarkb> It is free registration. This is mostly for planning and related bookkeeping purposes
19:28:34 <clarkb> I've requested three 2 hour chunks of time as mentioned previously and haven't been told no so I expect we'll get those
19:28:46 <fungi> and it makes sure "attendees" have agreed to the event's code of conduct
19:28:54 <clarkb> Monday 1300-1500 UTC, Monday 2300-0100 UTC, Wednesday 0400-0600 UTC with Monday being June 1, 2020
19:28:59 <fungi> as much as it can anyway
19:29:15 <clarkb> #action clarkb prep PTG agenda etherpad
19:29:27 <clarkb> I've been meaning to do ^ and now that things are getting close I should actually do that
19:29:53 <clarkb> I'll send a link for that out to the mailing list once I've got it up
19:30:05 <clarkb> Hope to see you there and please register when you have a moment if you plan to attend
19:30:18 <clarkb> Next is the wiki server. fungi any updates there?
19:30:45 <fungi> nothnig
19:30:59 <fungi> or some similarly-spelled word
19:31:04 <clarkb> fungi: you're still next: Gerrit reviewers plugin
19:31:10 <clarkb> #link https://review.opendev.org/#/q/topic:gerrit-plugins+is:open
19:31:13 <fungi> ahh, so this is an interesting one
19:31:41 <fungi> jim sommerville from the starlingx project reached out to me about this because i was the only user in #opendev...
19:31:48 <fungi> ...on oftc
19:31:58 <fungi> (i'm squatting the channel there in case we ever need it)
19:32:37 <fungi> not sure what prompted him to /join it, though i did suggest to him that in the future the equivalent channel on freenode was a better bet for finding us
19:32:52 <corvus> o/
19:33:21 <fungi> anyway, it sounds like folks in some of the starlingx teams are looking for better ways to organize their review backlogs and notifications
19:33:49 <fungi> and he found the reviewers gerrit plugin and was asking whether it was something we'd consider adding
19:34:01 <fungi> so i took a look, it seems to be maintained
19:34:06 <clarkb> when I saw this go on the agenda I did check that the plugin has gerrit 2.13 versions and it does
19:34:18 <fungi> i pushed up a couple of preliminary changes to add it to our images
19:34:30 <fungi> marked wip for now until we could talk it through
19:34:48 <fungi> there's a big missing piece, which is figuring out how to manage the configuration for it
19:34:53 <clarkb> from a technical standpoint I think my only concern would be if we need to rebuild the plugins like we did for melody and storyboard ics?
19:35:02 <clarkb> fungi: oh its configurable?
19:35:24 <clarkb> when I looked at it it seemed to look for a file in repos and if that was there add the listed users as reviewers
19:35:41 <fungi> a quick summary is that the project-specific configs are where you map certain file or branch patterns in a repo to specific reviewers or groups (i think we'd maybe just support the latter, for sanity)
19:35:58 <fungi> these are implemented similarly to how acls and other project options are set
19:36:14 <fungi> so i think we'd need to handle them the same way we do acl configs with manage-projects
19:36:24 <corvus> "Per project configuration of the @PLUGIN@ plugin is done in the reviewers.config file of the project."
19:36:45 <corvus> my guess is that's in the refs/meta/config branch
19:36:46 <fungi> and probably auto-create any new groups they reference in reviewers.config just like we do for groups referenced in an acl
19:36:53 <mordred> I think it should be fine to roll that out as part of >=2.14 - I don't want to try to build new stuff for 2.13
19:36:55 <corvus> but i don't know that
19:36:55 <fungi> corvus: yeah, that's what it looked like to me as well
19:37:17 <mordred> like - last time I tried it was a complete failure which is why the dockerfile for 2.13 just downloads the war
19:37:38 <clarkb> mordred: ya thats why I was concerned about needing to build things
19:37:41 <fungi> so anyway, it's not just as simple as adding the plugin, doesn't look like (though that part seems straightforward and easy, thus the changes i've pushed so far)
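For reference, a per-project reviewers.config for the plugin (stored in the project's refs/meta/config ref, as corvus guessed) looks something like the fragment below. The filter syntax follows the plugin's documented git-config style of Gerrit change queries; the group names here are hypothetical examples, not anything configured in OpenDev.

```
[filter "branch:master file:^doc/.*"]
  reviewer = starlingx-docs-core
[filter "file:^releasenotes/.*"]
  reviewer = starlingx-release
```

Each matching filter adds the listed reviewers (accounts or groups) to new changes automatically, which is why manage-projects would need to deploy these files and auto-create referenced groups the same way it does for ACLs.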
19:37:48 <corvus> we're planning on stopping at 2.16 for a bit; how about we open the window for new plugins there?
19:38:05 <clarkb> fungi: also we might want to make sure we understand what starlingx's needs are to ensure that plugin would help them
19:38:08 <fungi> that seems reasonable to me, for sure
19:38:11 <mordred> wfm. and I don't see anything about this plugin that would make me think it's a bad idea
19:38:39 <clarkb> corvus: ++
19:38:50 <fungi> right, it seemed like a reasonable request, i wanted to dig into it some more, but since it looks like it'll require additional automation i wanted to make sure to start talking about it
19:39:22 <fungi> i can start up a thread on the service-discuss ml if that'll help us get through the specifics
19:39:34 <fungi> and i agree, waiting until 2.16 is probably warranted
19:39:47 <fungi> so as not to further complicate our upgrades
19:40:00 <fungi> i can update my wip patches to remove it from <2.16
19:40:31 <fungi> anyway, that answers my questions for now, i think
19:40:52 <corvus> fungi: thanks, this was a good discussion :)
19:41:12 <fungi> i didn't know if it warranted a full spec, but i'll write up the use case and how i would envision it working and send it to the ml
19:42:00 <fungi> and solicit feedback from the starlingx crew there
19:42:10 <clarkb> ++
19:42:12 <clarkb> thanks!
19:42:49 <clarkb> #topic Open Discussion
19:43:02 <clarkb> That was all I had on the agenda. Anything else before we close the meeting?
19:43:16 <mordred> fungi: it's worth noting that we're motivated to get upgraded to 2.16 - so we're not talking about years here
19:43:28 <fungi> yup
19:43:53 <fungi> we've found some more stale afs volume locks from broken vos releases resulting from the 2020-04-28 afs01.dfw.o.o outage
19:44:15 <fungi> i took care of mirror.fedora earlier today, corvus fixed mirror.centos yesterday
19:44:22 <fungi> i've got mirror.opensuse running now
19:44:33 <fungi> and mirror.yum-puppet needs doing as well
19:44:34 <clarkb> mirror.opensuse and mirror.yum-puppetlabs in particular seem to be the remaining locked but no transaction cases
19:44:35 <fungi> there may be others
19:44:49 <clarkb> fungi: I don't think there are based on my read of vos listvldb and vos status
19:44:58 <fungi> i recall i (or someone) fixed some of the docs/static site volumes last week
19:45:00 <clarkb> (someone should double check it though)
19:45:20 <fungi> thanks for checking that, clarkb
19:46:34 <ianw> matches up with http://grafana.openstack.org/d/ACtl1JSmz/afs?orgId=1 , thanks
19:47:28 <fungi> yeah, i think the fedora one should clear when the current rsync and vos release pulse ends
19:47:52 <fungi> the manual vos release in these cases seems to only redo/finish the incomplete vos release from the time of the crash
19:48:00 <fungi> so doesn't actually switch to the current data
19:49:16 <clarkb> the second vos release will do the catch up
19:49:22 <fungi> yep
19:49:32 <fungi> which i've just been letting the mirror-update cronjobs take care of
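The recovery sequence discussed above can be sketched as a dry run: find volumes that are locked with no active transaction (via vos listvldb and vos status), then unlock and release each one twice. This prints the OpenAFS commands an admin would run rather than executing them, since they need admin tokens and an AFS cell; the volume names are the ones from this meeting.

```shell
# Dry-run sketch of the stale-lock cleanup: prints the vos commands an
# admin would run.  Drop the echo to actually execute (needs AFS admin
# tokens).  Locked volumes can be spotted with:
#   vos listvldb | grep -B1 LOCKED
#   vos status <fileserver>   # shows any in-flight transactions
cleanup_stale_locks() {
    for vol in "$@"; do
        echo vos unlock -id "$vol"
        # The first release only redoes/finishes the release that was
        # interrupted by the outage...
        echo vos release -v "$vol"
        # ...so a second release is needed to catch up to current data
        # (or just let the mirror-update cronjobs do it).
        echo vos release -v "$vol"
    done
}

cleanup_stale_locks mirror.opensuse mirror.yum-puppetlabs
```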
19:50:58 <clarkb> alright last call for any other items otherwise I'll call it a few minutes early today
19:51:35 <ianw> if i could request some reviews on https://review.opendev.org/#/q/status:open+topic:cleanup-test-image-python
19:51:39 <ianw> #link https://review.opendev.org/#/q/status:open+topic:cleanup-test-image-python
19:51:50 <ianw> that will help me make progress getting rid of pip-and-virtualenv
19:53:40 <ianw> there's also some nodepool things out there that could use some eyes too
19:54:35 <ianw> #link https://review.opendev.org/724214
19:54:52 <ianw> is another one just to get afs testing on focal
19:55:07 <ianw> focal did the same thing as bionic unfortunately and shipped a ~pre version of openafs
19:55:34 * fungi grumbles
19:55:40 <clarkb> a different version though I expect
19:55:43 <fungi> not the same ~pre version i guess?
19:55:48 <ianw> the arm64 job @ https://review.opendev.org/724439 fails due to pip-and-virtualenv
19:55:53 <corvus> ianw: know how well it works?
19:56:22 <ianw> being removed on that platform.  i need to fix ensure-tox, as we install tox for the zuul user and then try to run it as root which blows up
19:57:04 <ianw> corvus: it being openafs?  we put in 1.8.5 packages to the ppa and it passes testing, at least
19:57:25 <corvus> ya.  thx.  so we just need to get one of the mirrors on it and see how it holds up over time?
19:57:48 <corvus> also sounds like we'll be wanting the new executors on it too
19:58:05 <fungi> great point
19:58:13 <ianw> yep that would be a good start.  although our bionic hosts are on 1.8 series too, so i would expect things are ok
19:58:46 <clarkb> ya iirc 1.8 can client to 1.6 but we can't mix fileservers between 1.6 and 1.8 ?
19:58:56 <clarkb> so it's the upgrade of the fileservers that will be "fun"
19:59:15 <clarkb> and we are at time. Thanks everyone
19:59:18 <clarkb> #endmeeting