19:01:10 <clarkb> #startmeeting infra
19:01:11 <openstack> Meeting started Tue Jun 23 19:01:10 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:14 <openstack> The meeting name has been set to 'infra'
19:01:23 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2020-June/000042.html Our Agenda
19:01:35 <clarkb> #topic Announcements
19:02:11 <ianw> o/
19:02:13 <clarkb> On Thursday the OpenStack Foundation is doing two rounds of community updates. One for more Europe-friendly timezones and the other for Asia-Pacific timezones. The Americas are sort of stuck in the middle
19:02:19 <clarkb> feel free to join those if interested
19:02:38 <clarkb> (though really the audience is people not involved in daily happenings so maybe boring for you all :) )
19:03:09 <clarkb> also that is relative to my Thursday. Local day of week may differ
19:04:03 <clarkb> #topic Actions from last meeting
19:04:06 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-06-16-19.01.txt minutes from last meeting
19:04:11 <clarkb> We didn't record any actions
19:04:17 <clarkb> #topic Specs approval
19:04:23 <clarkb> #link https://review.opendev.org/#/c/731838/ Authentication broker service
19:04:31 <clarkb> Still not ready for approval but worth calling out
19:04:32 <fungi> i've started to pick it back up, haven't pushed a new patchset though
19:05:00 <clarkb> fungi: thanks. Probably worth a call out when a new patchset does show up so we can take a look quickly
19:05:15 <fungi> still digesting recent comments, but more comments don't need to wait for me to update the spec either
19:05:33 <clarkb> this is true
19:06:12 <clarkb> #topic Priority Efforts
19:06:23 <clarkb> #topic Update Config Management
19:06:50 <clarkb> haven't seen a whole lot on this front, but I've also been distracted with image updates and openstack testing stuff
19:07:11 <clarkb> Is there anything worth calling out on this subject? Maybe the plan for reducing "run all the jobs when inventory updates" problems?
19:07:35 <clarkb> mordred: ^ that caught corvus again on Friday when trying to land the zk config updates (which eventually applied and everything was fine but the number of jobs was unexpected)
19:08:34 <fungi> ianw has started work on containerizing our grafana deployment
19:08:57 <clarkb> ah cool. I ninja updated my local copy of the agenda to talk about grafana's nodepool dashboards in a bit too
19:09:18 <ianw> can update in there, it is coming together i think
19:09:24 <fungi> i also have a half-baked change underway to move our reprepro mirroring from puppet to ansible, i need to pick that back up and hack on it some more (unless anyone's just dying to take it over)
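For flavor, the sort of Ansible task such a reprepro port might contain (purely a sketch; the actual change in progress will differ, and the config path, user, and mirror name here are placeholders):

```yaml
# Hypothetical sketch of a reprepro mirror update task; everything named
# here (paths, user, distribution) is illustrative, not the real change.
- name: Update the ubuntu package mirror with reprepro
  command: reprepro --confdir /etc/reprepro/ubuntu update
  become: true
  become_user: reprepro
```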
19:09:24 <clarkb> mordred: is step0 there getting the split-up puppet-else changes landed?
19:09:35 <clarkb> mordred: and if so are those ready for review?
19:10:50 <mordred> not even
19:11:16 <mordred> I think step0 can actually be just copying over the inventory/ file matchers from the system-config-run jobs
19:11:24 <mordred> we have smaller file matchers for them already
19:12:00 <clarkb> #info Copy file matchers from system-config-run jobs to infra-prod jobs to reduce number of jobs that run when inventory/ is updated
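As a rough illustration of that step, a Zuul job can be limited with a "files" matcher so it only runs when matching paths change; the job and path names below are invented, not the real system-config definitions:

```yaml
# Illustrative only: an infra-prod job narrowed with the same kind of
# "files" regex matchers the system-config-run jobs already carry, so a
# touch under inventory/ only triggers jobs whose files actually changed.
- job:
    name: infra-prod-service-example
    parent: infra-prod-playbook
    files:
      - inventory/service/group_vars/example.*
      - playbooks/service-example.yaml
      - playbooks/roles/example/.*
```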
19:12:19 <fungi> another good one to tackle soon might be storyboard... mordred's already added image building jobs for it, so assuming those are functional the actual deployment ansible for them might not be too hard (though there are rather a lot of different containers associated with it)
19:13:14 <clarkb> fungi: that will also convert us to python3 for that service which would be nice
19:13:32 <clarkb> Alright last call on config management changes
19:13:39 * diablo_rojo sneaks in the back and sits down
19:14:06 <corvus> wait we can sit during this meeting?
19:14:15 <clarkb> I'm sitting
19:14:16 <fungi> i thought it was a standing meeting
19:14:24 <clarkb> sometimes I stand
19:14:24 <fungi> but clarkb is a chair
19:14:35 <fungi> let's table this
19:14:44 <clarkb> #topic OpenDev
19:14:56 <clarkb> First up we upgraded Gitea to version 1.12.0
19:15:22 <clarkb> This includes caching of info that Gitea uses to render repo UI pages, which should speed those page loads up quite a bit. But it's still a cache, so the first load is still slow
19:15:34 <clarkb> we've seen this help quite a bit for repos like nova already though
19:16:04 <clarkb> I had some small concern that the caching would increase memory usage on those servers but I've not seen that being a problem yet according to cacti
19:16:25 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2020-May/000026.html Advisory Board thread.
19:16:44 <clarkb> I've bumped this thread and am setting a soft deadline of July 2, after which we'll proceed with the volunteers we have at that point
19:17:05 <clarkb> soft deadline because we can always add new members later, but I didn't want to sit in this holding pattern forever
19:18:33 <clarkb> The last OpenDev item I had was related to git branches. The topic of potentially renaming branches has come up in a couple of contexts and I wrote an etherpad with details on what that would mean for us as a hosting platform and our hosted repos
19:18:42 <clarkb> #link https://etherpad.opendev.org/p/opendev-git-branches
19:19:14 <clarkb> There are some potentially painful technical challenges, which that etherpad tries to call out so that we can provide reasonable guidance to people who may consider such changes
19:20:10 <clarkb> Anything else to talk about on the subject of OpenDev? or does anyone want to dig into anything above?
19:21:43 <clarkb> #topic General topics
19:21:50 <clarkb> #topic Grafana Nodepool Graphs
19:22:20 <clarkb> We've discovered that we're no longer producing templated nodepool provider graphs that function
19:22:47 <clarkb> looking at the dashboard json it appears that we aren't setting a data source properly. I wrote a change to the nodepool-ovh dashboard to force the data source and that doesn't seem to have helped
19:23:14 <clarkb> The change I wrote did end up switching the data source value from null to OpenStack though
19:23:29 <clarkb> which maybe means the problem is unrelated to data sources
19:23:53 <clarkb> ianw: ^ maybe it makes more sense to debug once we've got the deployment tooling switched out?
19:24:00 <fungi> ossh grafana
19:24:11 <ianw> yeah, so i saw that and started to debug too and i guess yak shaving took over
19:24:20 <fungi> heh, this is not my shell terminal ;)
19:24:41 <ianw> however, i think it probably makes more sense to start debugging latest grafana via containers
19:25:32 <ianw> i have it mostly working, building a container based on upstream grafana and including grafyaml in it
19:26:06 <ianw> it's easy to deploy locally and should work the same in production
19:26:14 <clarkb> that is likely to aid in debugging
19:26:26 <clarkb> graphite is publicly accessible so we can point at production data easily
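Pointing a local container at that public graphite takes only a one-file Grafana provisioning datasource, along these lines (a sketch; the datasource name and defaults are assumptions):

```yaml
# Grafana datasource provisioning sketch, e.g. dropped into
# /etc/grafana/provisioning/datasources/ in a local container, so
# templated dashboards can be debugged against production data.
apiVersion: 1
datasources:
  - name: graphite
    type: graphite
    url: https://graphite.opendev.org
    access: proxy
    isDefault: true
```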
19:26:42 <fungi> is the suspicion at this point that our regression is due to the recent grafana 7.x releases?
19:26:55 <ianw> i think so
19:27:02 <clarkb> fungi: ya I'm assuming that grafyaml isn't supplying the necessary info to make the templating work
19:27:16 <ianw> #link https://review.opendev.org/737397
19:27:19 <ianw> that's the base container
19:27:29 <clarkb> what our grafyaml does is tell grafana to query graphite for a list of nodepool provider regions. Then using that list it produces graphs for each entry in the list
19:27:35 <ianw> #link https://review.opendev.org/737406
19:27:42 <clarkb> from what I can tell we aren't producing a valid input list to the graphs
19:27:50 <ianw> that will test it.  both are really active wip
19:28:25 <clarkb> ok, I'll try not to worry about it too much until we've got the deployment stuff redone as that will simplify debugging
19:28:48 <ianw> i agree, let's work from that common base
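For context on the templating being debugged above, the grafyaml pattern is roughly this: a query-backed template variable asks graphite for the list of provider regions, and each panel then interpolates $region. This is a from-memory sketch; the field names and graphite paths are approximate, not the actual project-config dashboards:

```yaml
# Approximate grafyaml shape for a templated nodepool dashboard; the
# stats paths and titles are illustrative.
dashboard:
  title: 'Nodepool: Example Provider'
  templating:
    - name: region
      type: query
      datasource: graphite
      query: 'stats.timers.nodepool.launch.provider.*'
      refresh: true
  rows:
    - title: 'Launches in $region'
      panels:
        - title: 'Ready nodes ($region)'
          type: graph
          targets:
            - target: 'stats.timers.nodepool.launch.provider.$region.ready.count'
```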
19:30:04 <clarkb> #topic Etherpad Upgrade to 1.8.4 or 1.8.5
19:30:30 <clarkb> Bringing this up to say I'm planning on holding off until the end of next week since the opendev event next week will use the etherpad
19:30:43 <clarkb> once that is done it should be very safe to use a hacky 1.8.4 :)
19:31:00 <clarkb> the change to do the upgrade is WIP'd with a similar message and shouldn't land early
19:31:04 <fungi> 1.8.4 is still the current release at this time
19:31:07 <clarkb> yes
19:31:54 <fungi> i suppose "1.8.5" here could be a stand-in for tip of develop branch
19:32:11 <clarkb> ya 1.next might be most accurate
19:32:44 <clarkb> if something after 1.8.4 arrives before the end of next week I'll respin to test and deploy that. Otherwise I'll land our change as-is on 1.8.4 with the local css fix
19:33:07 <mordred> ++
19:34:13 <clarkb> #topic DNS cleanup
19:34:19 <clarkb> #link https://etherpad.opendev.org/p/rax-dns-openstack-org First pass of record removals has been done. Could use a second pass.
19:34:40 <clarkb> ianw thank you for putting this together. From what I can see it's all working as expected. I did a first pass of removals based on what was noted on the etherpad too.
19:34:58 <clarkb> Since then fungi has annotated more things on the etherpad and I think we need a second pass at cleanup. I was planning to do that todayish
19:36:04 <clarkb> Calling this out so that if anyone else wants to do a pass, they do it soon and it gets caught in my next set of removals
19:36:31 <clarkb> I should also share it with the foundation sysadmins again and see if they want to remove anything
19:37:11 <clarkb> #topic Getting more stuff off of python2
19:37:21 <clarkb> #link https://etherpad.opendev.org/p/opendev-tools-still-running-python2
19:37:54 <clarkb> I ran out of steam on this last week, but wanted to call it out again in case others have notes to add about things they know need python2 attention
19:38:11 <clarkb> Not incredibly urgent, but crowd sourcing tends to help with this sort of problem space
19:38:33 <clarkb> #topic Wiki Upgrade
19:38:41 <clarkb> fungi anything new on the wiki upgrade?
19:38:47 <fungi> nope!
19:39:05 <clarkb> #topic Open Discussion
19:39:47 <mordred> please to review https://review.opendev.org/#/c/733967/ and https://review.opendev.org/#/c/737023/
19:40:17 <clarkb> As a general heads up I'm going to be helping with the opendev event next week and will be distracted during those hours. I'm also looking at taking July 6-10 as a staycation and will need someone else to chair the meeting on the 7th if we have a meeting
19:41:40 <fungi> in case folks hadn't seen, a while back i audited our listserv mta logs and determined that the long-running qq.com spam flood to the -owner aliases has *finally* abated. the starlingx list owners have requested we start allowing messages to the -owner alias for their list again so i proposed https://review.opendev.org/729649 but more generally we could think about lifting all of the blackhole aliases i think
19:44:28 <fungi> they do still receive some random spam on a daily or weekly basis (roughly proportional to their list activity volume it seems) so just turning it back on for all lists without warning could catch some folks by surprise
19:45:25 <fungi> we've also had issues in the past with people setting list owner addresses to e-mail service providers who happily report those moderation request messages as spam to blacklisting services, which creates unfortunate deliverability issues for us
19:45:57 <clarkb> neat
19:46:06 <fungi> anyway, just crossed my mind again, not sure if anybody has suggestions for how to go about that
19:46:37 <clarkb> fungi: maybe we should reach out to list moderators and ask them if they'd like it to be toggled?
19:46:45 <clarkb> we know who those people are and can ask directly I think
19:47:00 <fungi> yeah, i thought about that. there are a lot, but the deduplicated list of them might not be so many
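For reference, lifting one list's blackhole could look something like the following Ansible sketch (a guess at the shape, not the actual change in 729649; the list name, alias path, and handler are placeholders):

```yaml
# Hypothetical: flip one -owner alias from the blackhole back to normal
# mailman delivery. A "run newaliases" handler is assumed to exist.
- name: Restore the starlingx-discuss owner alias
  lineinfile:
    path: /etc/aliases
    regexp: '^starlingx-discuss-owner:'
    line: 'starlingx-discuss-owner: "|/usr/lib/mailman/mail/mailman owner starlingx-discuss"'
  notify: run newaliases
```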
19:47:59 <clarkb> Anything else? I'll give it another minute or two then call it
19:48:04 <clarkb> thank you everyone for your time
19:48:16 <fungi> oh, also it looks like a stale apache worker on static.opendev.org caused sslcheck to fire a warning about the cert for zuul-ci.org expiring in a month
19:48:41 <fungi> can move discussion of what to do for that and when to #opendev though
19:49:09 <clarkb> ++
19:49:14 <clarkb> #endmeeting