19:01:04 <clarkb> #startmeeting infra
19:01:05 <openstack> Meeting started Tue Jun  9 19:01:04 2020 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:06 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:08 <openstack> The meeting name has been set to 'infra'
19:01:12 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2020-June/000034.html Our Agenda
19:01:20 <clarkb> #topic Announcements
19:01:36 <clarkb> No announcements were listed
19:02:14 <mordred> o/
19:02:56 <clarkb> #topic Actions from last meeting
19:03:03 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-06-02-19.03.txt minutes from last meeting
19:03:57 <clarkb> Last week's meeting was informal and we ended up debugging the meetpad/jitsimeet/etherpad/xmpp case sensitivity thing
19:04:25 <clarkb> No explicit actions came out of it that we recorded. But I think it gave us a better understanding of what we can do to make that case handling difference less confusing
19:04:27 <corvus> o/
19:04:32 <fungi> seems like we have a plan for it though
19:04:54 <fungi> or at least some consensus of things we can do
19:04:59 <clarkb> ya I think what we've found is that case confusion is a thing and we should probably switch to enforcing lower case in etherpad to avoid that anyway
19:05:11 <clarkb> then we've got to deal with renaming/merging pads as necessary to handle that
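(A sketch of the lower-case enforcement discussed above, assuming the etherpad service sits behind an Apache vhost with mod_rewrite; illustrative, not the deployed config:

    RewriteEngine On
    # internal map function that lower-cases its input
    RewriteMap lc int:tolower
    # redirect any pad URL containing upper-case characters to the
    # lower-cased pad name
    RewriteCond %{REQUEST_URI} ^/p/.*[A-Z]
    RewriteRule ^/p/(.*)$ /p/${lc:$1} [R=301,L]
)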
19:06:06 <clarkb> #topic Specs approval
19:06:16 <clarkb> This spec isn't ready for approval yet, but I wanted to call it out
19:06:35 <clarkb> #link https://review.opendev.org/#/c/731838/ Central Authentication Service spec
19:06:45 <fungi> yeah, it needs some heavy editing
19:06:50 <clarkb> fungi: I think we half expect a new PS based on conversation we had at the PTG?
19:06:53 <fungi> good feedback in there from neal too
19:07:12 <fungi> yes, you can half expect it, but i fully intend to provide it ;)
19:07:28 <fungi> just might not come this week
19:07:38 <fungi> we'll see
19:08:13 <clarkb> thanks
19:08:18 <clarkb> #topic Priority Efforts
19:08:28 <clarkb> #topic Update Config Management
19:08:58 <clarkb> The main topic I wanted to bring up here was the reorganization of our ansible inventory, groups, *vars, and base playbook
19:09:34 <clarkb> What we've realized is that the vast majority of the base playbook is not service specific. It configures admin users and exim for email and ntp and so on.
19:10:10 <clarkb> But the playbook runs against all hosts, which means if any one of them fails then the playbook fails. This can then cause problems if you wanted letsencrypt to run on a specific host or zuul to be updated and those hosts were fine
19:10:43 <clarkb> in order to make that more reliable we've split the iptables role out of base as it is service specific and put that into our service roles. Then we can decouple running base as a requirement before every service update
19:10:54 <clarkb> mordred: ^ is that a reasonable summary of the change? Anything else to add to that?
19:10:59 <mordred> I think that's great
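(Roughly, the resulting playbook layout looks like the following; file and role names here are illustrative, not the exact system-config layout:

    # playbooks/service-zuul.yaml: service playbooks now carry the
    # service-specific pieces themselves, including the firewall role
    - hosts: zuul
      roles:
        - iptables
        - zuul

    # playbooks/base.yaml keeps only generic host setup, so a failure
    # in one host's base run no longer blocks unrelated service updates
    - hosts: all
      roles:
        - users
        - exim
        - ntp
)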
19:11:28 <clarkb> from the operator side of things, be aware files have moved around and some config has been updated. You may need to rebase outstanding changes in system-config
19:12:34 <clarkb> Any other configuration management items to bring up?
19:13:22 <mordred> I think that's about it - we may have discovered we're actually ok to run zuul-executor in containers
19:13:37 <mordred> corvus is going to verify - but I think I found that to be true now on friday
19:13:50 <mordred> so I've got some patches up to do that
19:13:53 <clarkb> mordred: the thought there is we have to give the container some additional permissions?
19:14:23 <mordred> clarkb: turns out we don't seem to need anything past privileged
19:14:23 <corvus> locally i think i saw it working in bwrap but behaving weirdly inside docker itself.  but it sounds like mordred saw something different when trying on ze01
19:14:32 <mordred> yeah
19:14:47 <mordred> so it's possible there are differences wrt kernel version or docker version from the original test - or who knows
19:14:57 <mordred> but i did bwrap inside of docker and it SEEMED to do the right things
19:15:05 <corvus> based on what i saw, i think we should be "okay" to do it without the seccomp stuff, but i think it might be more comfortable with seccomp
19:15:15 <corvus> mordred: did you test out afs inside docker but not in bwrap?
19:15:27 <mordred> corvus: I think so?
19:15:31 <corvus> k
19:15:37 <mordred> corvus: but - let's double-check :)
19:15:46 <corvus> so if what mordred saw holds, then i agree, we should be gtg without anything else
19:16:00 <corvus> i'll do this after the meeting
19:16:22 <corvus> ^ = confirm mordred's tests
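(The manual verification being described amounts to something like the following, assuming the published zuul/zuul-executor image; a sketch of the check, not a recorded command:

    # run the executor image with full privileges so bwrap can create
    # its namespaces
    docker run --rm -it --privileged zuul/zuul-executor /bin/bash

    # inside the container, confirm bwrap itself functions
    bwrap --ro-bind / / --proc /proc --dev /dev /bin/true && echo "bwrap ok"
)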
19:16:29 <mordred> if that works - we'll just be down to nodepool builder on arm running non-containerized - and we need to swing back around to that issue anyway
19:16:56 <clarkb> the arm nodepool builder is hung up on the odd stream crossing we saw with multi arch docker builds right?
19:17:23 <mordred> yeah - which we need to reproduce and figure out what's going on
19:17:59 <ianw> i can probably make some time for at least reproduction
19:19:00 <clarkb> #topic OpenDev
19:19:20 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2020-May/000026.html Advisory Board thread.
19:19:52 <clarkb> The advisory board "recruiting" is still in progress. At the PTG we discussed that a gentle reminder to those who haven't responded is a good idea and then we'll move forward in a few weeks with who we get.
19:20:07 <clarkb> The thought is that by having some involvement we can generate interest and an example of what the system is there for
19:20:21 <clarkb> I plan to send out those gentle reminders today
19:20:59 <fungi> like a snowball rolling downhill
19:21:07 <corvus> in june?
19:21:21 <clarkb> On the service side of things Gitea 1.12.0 has had its second rc tag and I've got a change up to test a deployment of that. Looks like they've already added some additional bug fixes on top of that. We should hold off until the actual release I expect
19:21:25 <clarkb> corvus: in some parts of the world
19:21:33 <fungi> corvus: feel like taking a trip to chile? ;)
19:21:38 <corvus> fungi: yes
19:21:58 <clarkb> the good news is the templates have been very stable between rc1 and rc2 so any final release should be really close to ready and its just a matter of updating the tag I hope
19:22:31 <fungi> i've also got a change up for upgrading the version of etherpad. supposedly a major cause of the "broken" pads is addressed with it
19:22:41 <clarkb> I'm excited for this update as it adds caching of git commit info which should drastically speed up our rendering of repos with large histories like nova
19:22:51 <fungi> now that the ptg is done, this may be a good time for etherpad upgrades again
19:23:04 <clarkb> fungi: ++ I think we can land and deploy that as soon as we are happy with the change and its testing
19:23:49 <fungi> just double-checked and 1.8.4 is still the latest release
19:24:18 <corvus> what does "broken" mean?
19:25:12 <clarkb> corvus: i think like the clarkb-test etherpad on the old etherpad-dev server
19:25:19 <clarkb> corvus: etherpads that eventually stop serving correctly
19:25:58 <fungi> yeah, the ones which hang with "loading..."
19:26:22 <corvus> ack
19:26:30 <clarkb> Anything else on OpenDev or shoudl we moev on?
19:26:34 <clarkb> (I can't type today)
19:26:36 <fungi> i mentioned the change some weeks back in #opendev, but when we hit one of those there are telltale errors in the log which are referenced by the fix
19:26:54 <fungi> so fingers crossed anyway
19:27:56 <clarkb> #topic General Topics
19:28:07 <clarkb> #topic Project Renames
19:28:18 <clarkb> I want to start with this one to make sure we get a chance to talk about it
19:28:36 <clarkb> we had pencilled in June 12 which is this Friday. Unfortunately I've discovered I have a kid's doctor visit at ~1800 UTC that day
19:29:14 <clarkb> I'm happy to go ahead with it and help as I can (we can do it early friday or later friday and I'll be around) or move it to another day if we don't have enough people around
19:29:33 <clarkb> also we've added a few more renames since we last talked about this, the openstack foundation interop repos are getting moved now I guess
19:29:53 <fungi> also it sounds like the openstack tc may want to rename a few more repos out of the openstack namespace into the osf namespace (relating to osf board of directors committees/working groups)
19:30:01 <fungi> er, yeah what you just said
19:30:09 <clarkb> fungi: yup gmann added that to the list of things about half an hour ago
19:30:15 <fungi> perfect
19:30:45 <clarkb> do we have any volunteers for Friday other than myself?
19:30:54 <fungi> i'll be around
19:31:03 <fungi> happy to do renames
19:31:15 <clarkb> fungi: cool, do you have a preference on time and I'll do my best to be around to help?
19:31:31 <fungi> let's say not 18:00 utc in that case...
19:31:41 <clarkb> I can start as early as 1400 UTC, then have to cut off around 1730 UTC, and expect to be back around 2030 UTC
19:31:57 <clarkb> (it'll likely be shorter than that but you never know with those visits)
19:32:31 <corvus> i should be around but would like not to drive
19:32:33 <fungi> my schedule is wide open friday. are there other volunteers with time constraints? i could certainly accommodate either of those windows
19:32:57 <fungi> 21:00 would work for me if that helps others
19:33:49 <clarkb> That works for me and should give me plenty of padding on my schedule
19:34:02 <clarkb> why don't we go with that then. Thank you fungi !
19:34:13 <fungi> let's do that then, we can always do some prep earlier in the day in anticipation too
19:34:24 <clarkb> ++ thanks
19:34:51 <clarkb> Between now and then we'll want to construct the yaml input to the renaming process and commit it to opendev/project-config once the renames happen
19:34:58 <fungi> yep
19:35:01 <clarkb> I can help coordinate with you to make sure we are ready by Friday
19:35:09 <fungi> sounds good, thanks
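(For reference, the rename input being discussed is a small yaml file committed to opendev/project-config, roughly of this shape; the filename and repo names here are hypothetical placeholders, assuming the renames/ layout used for past renames:

    # renames/20200612.yaml (hypothetical example)
    - old: openstack/example-interop-repo
      new: osf/example-interop-repo
)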
19:35:24 <clarkb> #topic Pip and Virtualenv Next Steps
19:35:40 <clarkb> ianw: ^ Any update on this subject?
19:35:56 <clarkb> I believe I saw at least one project (octavia) testing that the changes don't break them which was reassuring
19:36:08 <ianw> yeah, i didn't get any complaints, and some people saying things worked
19:36:12 <ianw> #link https://review.opendev.org/734428
19:36:25 <ianw> that's the review to drop it, so ... i guess we just do it?  i'm not sure what else to do
19:37:11 <fungi> wfm
19:37:46 <clarkb> we've communicated it, at least some people have done testing and reinforced the expectation that this will be low impact, I think the next step is to land the change
19:38:49 <AJaeger> ++
19:38:56 <fungi> this is also early enough in openstack's release cycle that any resulting disruption can be addressed at a comfortable pace
19:39:10 <ianw> the one to watch for is if people say virtualenv is missing
19:39:24 <ianw> their best bet is to add "ensure-virtualenv" role
19:39:41 <AJaeger> ianw: please send an email once we merge the change
19:39:47 <clarkb> a followup to the announcement thread indicating we've landed the change would be good once that happens
19:39:48 <clarkb> AJaeger: ++
19:39:55 <ianw> will do
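(The ensure-virtualenv fix ianw mentions would look roughly like this in an affected job's pre-run playbook, using the zuul-jobs role of that name; a sketch:

    - hosts: all
      roles:
        # installs virtualenv from the distro or pip as appropriate
        - ensure-virtualenv
)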
19:40:29 <clarkb> anything else on this topic?
19:40:35 <ianw> no, thanks
19:41:06 <clarkb> #topic DNS Cleanup
19:41:20 <clarkb> ianw: did we end up publishing the contents for comment yet?
19:41:55 <ianw> it looks like the backup went into merge failure
19:41:57 <ianw> #link https://review.opendev.org/#/c/728739/
19:42:02 <ianw> but it would be good to merge that
19:42:16 <ianw> the one to look through is
19:42:17 <ianw> #link https://etherpad.opendev.org/p/rax-dns-openstack-org
19:43:24 <ianw> perhaps to make it more manageable, people can delete from that the things that should definitely stay, which will reduce it
19:43:26 <clarkb> thanks and I guess we can just mark that up with comments around what can be removed?
19:43:56 <clarkb> ah ya I see the note about removing things that should definitely stay, thanks
19:44:31 <clarkb> I'll try to take a look at that today
19:45:23 <clarkb> #topic PTG Recap
19:45:31 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2020-June/000035.html Recap Email
19:45:43 <clarkb> I wrote a long email trying to cover the important bits of the PTG for us
19:45:49 <clarkb> Overall I think it went well.
19:46:01 <clarkb> From an operations side meetpad seemed to work with most of its scaling issues being client side
19:46:20 <clarkb> there were some annoying things like the etherpad focus going away when people talked sometimes and needing to reconnect because all sound went away
19:46:38 <clarkb> but overall it held up and the groups using it seemed happy (though groups with more than 20 had less success)
19:47:12 <clarkb> As participants we managed to get through our agenda. I think the total of 6 hours was about correct for us
19:47:27 <clarkb> #link https://etherpad.opendev.org/p/June2020-PTG-Feedback Provide your PTG event feedback
19:47:28 <fungi> i was pleased with the way it worked out
19:47:42 <clarkb> the PTG organizers are soliciting feedback on the etherpad I just linked. Feel free to add your thoughts there
19:47:51 <corvus> i have heard from folks they'd like to continue (trying) to use meetpad in the future; i think we can/should wind down pbx in favor of meetpad
19:48:00 <clarkb> corvus: ++
19:48:20 <fungi> i concur
19:48:27 <clarkb> One of the things we talked about was getting off of python2 for our little tools and utilities as well as services.
19:48:30 <fungi> we lose the dial-in trunk though
19:48:36 <clarkb> I've started to try and put together an audit of the todo list around that
19:48:37 <clarkb> #link https://etherpad.opendev.org/p/opendev-tools-still-running-python2 Python2 Audit
19:48:50 <clarkb> fungi: jitsi meet supports that and I think we can even use the same number
19:48:57 <clarkb> fungi: but that is new config we need to sort out
19:49:14 <clarkb> (I don't know how it maps phone calls to meeting rooms as an example)
19:49:50 <clarkb> One thing that was missing from the virtual event was unwind/decompression time
19:49:54 <fungi> yeah, i figured it was something we could add
19:50:05 <clarkb> at the in person events there are game nights and dinner with people
19:50:16 <clarkb> I was wondering if anyone was interested in trying some virtual form of that
19:50:23 <clarkb> more likely to be game night than dinner :)
19:50:24 <fungi> also beer you don't have to pour yourself ;)
19:50:41 <fungi> i guess i can get over pouring my own
19:50:47 <clarkb> I've discovered hedgewars does remote multiplayer and maybe we can play a silly game of that with comms over meetpad
19:51:05 <clarkb> its an open source clone of worms armageddon
19:51:35 <clarkb> I'm open to other ideas or being told that there isn't sufficient interest
19:52:36 <clarkb> Anything else to call out from the PTG?
19:53:25 <clarkb> #topic Trusty Updates
19:53:35 <clarkb> fungi: want to quickly recap the comodo cert situation?
19:53:49 <fungi> sure
19:54:13 <fungi> as of june 1, the old comodo/addtrust certificate authority ca cert expired
19:54:46 <fungi> some of our sites used and still use certs which were validated through a chain including that as an intermediate
19:55:01 <fungi> one in particular is openstackid.org
19:56:30 <fungi> we discovered that on older python deployments, like that found on ubuntu trusty, the cert validation behavior of the requests module is to report a failure/exception if there is an expired cert in the chain bundle, even if another cert in the bundle is sufficient to validate the server's cert
19:56:57 <fungi> this was causing people to be unable to log into refstack.openstack.org
19:57:44 <fungi> it was ultimately "fixed" by updating the intermediate chain bundle on the openstackid.org server to no longer include the expired (and thus useless) addtrust cert
19:58:02 <fungi> leaving only the newer sectigo cert
19:58:32 <clarkb> and that is something we should apply to our other sectigo certs?
19:58:40 <fungi> this matches the current chain bundle recommended by sectigo (the ca of record for our non-le certs obtained from namecheap)
19:59:07 <fungi> it likely depends on what's out there accessing those sites
19:59:51 <fungi> we can safely remove the old addtrust ca from all our intermediate bundles, but a lot of the copies i found are stale from before we started moving stuff to le
20:00:08 <clarkb> ya so two layers of cleanup there I expect
20:00:15 <fungi> so we could consider generally cleaning up old data in our hiera
20:00:18 <clarkb> ++
20:00:25 <clarkb> and that takes us to the end of our allotted time
20:00:27 <clarkb> thank you everyone
20:00:35 <clarkb> Feel free to continue conversation in #opendev
20:00:41 <fungi> if someone knows a programmatic way to identify those, that would be great
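(One plausible approach using only coreutils and openssl, once the bundles are extracted from hiera to files; the path is a placeholder and this is a sketch rather than a vetted tool:

    for bundle in /tmp/extracted-bundles/*.pem; do
      tmp=$(mktemp -d)
      # split the pem bundle into one file per certificate
      csplit -s -z -f "$tmp/cert-" "$bundle" '/BEGIN CERTIFICATE/' '{*}'
      for cert in "$tmp"/cert-*; do
        grep -q 'BEGIN CERTIFICATE' "$cert" || continue
        # -checkend 0 exits nonzero when the cert is already expired
        openssl x509 -in "$cert" -noout -checkend 0 >/dev/null ||
          { echo "expired cert in $bundle:";
            openssl x509 -in "$cert" -noout -subject -enddate; }
      done
      rm -rf "$tmp"
    done
)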
20:00:46 <clarkb> but I'll end the meeting now to ensure people can eat breakfast/lunch or go to bed :)
20:00:53 <clarkb> #endmeeting