19:01:12 <clarkb> #startmeeting infra
19:01:12 <opendevmeet> Meeting started Tue Jul  6 19:01:12 2021 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:12 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:12 <opendevmeet> The meeting name has been set to 'infra'
19:01:18 <clarkb> #link http://lists.opendev.org/pipermail/service-discuss/2021-July/000264.html Our Agenda
19:01:30 <clarkb> The agenda went out a bit late due to yesterday's holiday observance but we do have an agenda :)
19:01:35 <clarkb> #topic Announcements
19:02:06 <clarkb> July 18 the Gerrit server will be upgraded. Update your firewall rules now if you need to do that (details are on the service-discuss mailing list)
19:02:26 <clarkb> I've also got this info going out on the foundation newsletter this week to try and spread the word
19:02:47 <clarkb> #topic Actions from last meeting
19:02:53 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2021/infra.2021-06-29-19.01.txt minutes from last meeting
19:03:26 <clarkb> I had an action to talk to the openstack tc about the next steps for the ELK stack. I have done this and been asked to bring it up at the TC meeting on Thursday. I updated their agenda and plan to be there to discuss the subject
19:03:41 <clarkb> It sounds like the board has asked them for timelines and a few details which I think we can help with too
19:03:58 <clarkb> #action someone write spec to replace Cacti with Prometheus
19:04:13 <clarkb> I don't think this has happened yet. But it's possible I may have time for that this week looking at my current todo list. We'll see
19:04:19 <fungi> someone is falling down on the job
19:04:46 <fungi> it's times like this i'm glad to be noone
19:04:59 <clarkb> careful we can make you a somebody
19:05:03 <fungi> oof
19:05:25 <clarkb> #topic Topics
19:05:32 <clarkb> #topic Gerrit Account Cleanup
19:05:48 <clarkb> This morning I retired 176 accounts that we identified as unused or unlikely to be used
19:06:14 <fungi> yes!
19:06:19 * fungi throws a quick party
19:06:20 <clarkb> This was based on account activity, age, and the situation the conflicting account is in. In many cases we could see the accounts haven't been used in almost a decade or one account was used then another took over
19:06:46 <clarkb> That should leave us with about 80 accounts where the situation is more complicated and we'll try to reach out to users for those.
19:07:10 <clarkb> The next step for these 176 is to wait 2 or 3 weeks then once we've given it time for people to raise any alarms we can remove the conflicting external ids from the retired accounts
19:07:23 <fungi> as usual, if folks complain their account has broken, we should probably start by grepping the id from the logs you saved, yeah?
19:07:24 <clarkb> I'd like to start reaching out to individuals in the ~80 remaining while we wait on that too
19:07:47 <clarkb> fungi: yes exactly. Then you should be able to revert the commit I pushed to refs/users/xy/abxy to set them back to the way they were
19:08:02 <fungi> awesome. thanks for confirming
19:08:14 <clarkb> I'm 99% sure I tested this and it is only the external ids refs where gerrit will reject the changes. reverts to the refs/users/* refs are fine
19:08:51 <clarkb> I did have a small network hiccup when retiring users so I had to rerun my retirement script for one user. That is the only oddity in the logs
19:09:03 <clarkb> but the revert process doesn't change for that user
19:10:14 <clarkb> #topic Review Upgrade
19:10:25 <clarkb> This has been announced for the end of day July 18 UTC time
19:10:42 <clarkb> as mentioned before I've got this info going out on the foundation newsletter to help advertise it.
19:10:56 <clarkb> ianw: are there changes we should be reviewing now (like the SSHFP cleanups?)
19:10:56 <fungi> this is the point in the meeting where i realize i'm actually away from home visiting extended family when the upgrade is scheduled to happen
19:11:15 <fungi> but i'll try to be online anyway, their internet connectivity willing
19:11:17 <ianw> yeah, if i could get eyes on
19:11:19 <clarkb> fungi: I expect we'll be fine. ianw and I can be around
19:11:27 <clarkb> but the help is appreciated
19:11:35 <ianw> #link https://etherpad.opendev.org/p/gerrit-upgrade-2021
19:11:48 <ianw> and particularly the two pre-merge things
19:12:06 <fungi> i've left a few notes on there, but will go over it more closely and check out the linked changes
19:12:37 <ianw> ahh, thank you, i will go through comments again today
19:12:37 <clarkb> ya I've gone over it too, but should go over it again and make sure I've reviewed the changes and followed up on any comments to the process
19:13:43 <clarkb> One thought I had was that we should maybe land https://review.opendev.org/c/opendev/system-config/+/799225 to update our gerrit image and fix the lp blueprints integration then when we do the upgrade we'll know we have the same happy image
19:13:59 <clarkb> er I mean land that then do a quick restart on prod before the upgrade (like this week?)
19:14:08 <fungi> #link https://review.opendev.org/799124 Good riddance to track-upstream and its cronjob
19:14:24 <fungi> that's related, since the new server seems to be opaquely failing to actually run its cronjob
19:14:48 <clarkb> ianw: ^ if you don't think those two changes are a problem for pre merge activities maybe go ahead and add them to the etherpad? Or let me know and I can add them
19:15:19 <ianw> ok, will look
19:15:51 <ianw> i think maybe the switch of zuul to review01.opendev.org will require a zuul restart?
19:16:04 <ianw> so it might be a good time to pull in the updated image too
19:16:43 <clarkb> yes updates to that portion of the zuul config will require a zuul restart
19:16:47 <ianw> i can do that on my monday when it's quiet
19:16:52 <clarkb> ok
19:17:20 <ianw> that gives it a week to bake in
19:17:36 <clarkb> might need to coordinate with corvus on zuul restarts as there is a lot of churn on the zuul side and we may have to do a full restart to safely update depending on the changes that land
19:17:46 <clarkb> (zuulv5 development is full steam ahead)
19:17:55 <ianw> indeed it is!
19:18:41 <clarkb> Anything else on this topic?
19:19:05 <ianw> not from me
19:19:28 <clarkb> #topic Draft matrix spec
19:19:58 <clarkb> A few of us had a call with Element Matrix Services (EMS) last week to discuss the possibility of using their hosted EMS platform
19:20:31 <clarkb> From what I understood they didn't have any problems with us doing the slightly hacky setup to only maintain admins and bot users on our server then have users use matrix.org or their own homeservers
19:21:04 <fungi> i have a feeling they're happy to see open source communities making use of it
19:21:21 <clarkb> Their suggestion to us for next steps and getting started is to spin up a trial instance on the lowest tier of their service. Get things set up and start interacting with it. Then if we want to use the slightly more expensive silver version (they think this will be our best choice) we can upgrade to that painlessly
19:21:28 <clarkb> fungi: yup they were super helpful in talking us through this
19:22:02 <clarkb> I then talked to jbryce at the foundation about this and he didn't think this would be a problem. I need to coordinate with him to set up the account and configure the payment details. Hoping to do that this week
19:22:09 <fungi> what does silver get us over the entry level?
19:22:40 <fungi> or is silver the lowest tier paid option?
19:22:54 <clarkb> fungi: Nickel is the lowest option. Silver is second lowest
19:22:54 <fungi> i see you said "trial" so i suppose that's time-limited?
19:22:59 <clarkb> https://element.io/pricing
19:23:11 <fungi> aww, missed opportunity there was no "dime" to go with "nickel"
19:23:12 <clarkb> ya nickel is free for the first month as a trial then you start paying for it
19:23:28 <clarkb> my understanding over why silver would probably be best was simply a matter of scale
19:23:55 <clarkb> we would probably be ok with nickel for zuul but as we grow that can change
19:24:00 <clarkb> it is something to figure out as we go and interact with it
19:24:27 <clarkb> They also noted that if we don't want our instance hosted in sweden you have to pay for Gold or higher but we don't think that is a problem
19:24:31 <corvus> oh hai
19:24:41 <clarkb> On our end the spec got updated to reflect the plan to try EMS
19:24:43 <clarkb> #link https://review.opendev.org/796156
19:24:44 <fungi> i like sweden
19:24:52 <clarkb> corvus: hello, I was just recounting what we learned from our call with EMS
19:25:14 <clarkb> and noted that I talked to jbryce about it and he seemed happy with it. Now I need to coordinate with him to get an account created with appropriate payment details
19:25:16 <ianw> when it says "active user" that means people with a @user:opendev.org address?
19:25:19 <corvus> lgtm
19:25:37 <corvus> ianw: yes, so basically, our bots/admin accounts
19:25:46 <clarkb> ianw: yes, and you can actually have a bunch of inactive versions of that in the system. They differentiate between actually active and you can log in and cold storage
19:26:31 <clarkb> (not sure the active vs inactive designation will end up being useful for us but it means you can create accounts pretty safely and then just be careful about what you activate)
19:26:58 <fungi> i guess if we needed occasional-use admin accounts that could come up
19:27:09 <corvus> clarkb: i think i have 2 questions: 1) how to proceed with spec approval?  2) do you want to wait for that before setting up the account/server, or go ahead and get started on that and we can start working on bots, etc?
19:27:35 <clarkb> corvus: I'd like to go ahead and get started with the account/server creation since we may learn something important doing that.
19:27:42 <corvus> fungi: yeah, i sort of feel like a single admin account that's either used by a bot or one of us manually when required will probably be fine for things like setting up rooms, etc.
19:28:13 <fungi> agreed, from what little i know so far
19:28:15 <clarkb> corvus: but then once the server is up and spec updated to accommodate any new info and no major issues pop up I think we can land the spec?
19:28:23 <corvus> clarkb: cool; everything about this can be reversed easily right up until we ask people in #zuul to move, so we have a lot of leeway if we're okay being casual about it.
19:28:24 <clarkb> corvus: maybe give the trial a couple of weeks and then land the spec?
19:28:32 <clarkb> corvus: good to know
19:28:44 <clarkb> also mordred moved homeservers recently and that seems to have gone reasonably well
19:29:06 <corvus> yeah, mordred also did a test of moving a room from his old homeserver to the new; even that worked without a hitch
19:29:12 <mordred> I moved a channel I'd created ... yeah that ^^
19:29:29 <mordred> now - moving homeservers was a bit more involved and didn't really transition state
19:29:42 <clarkb> I think I'm mostly worried about finding something about EMS that is a deal breaker for us and deciding we need to run it ourselves which will have a big impact on the spec
19:29:50 <clarkb> but once we are past that step I think we should land the spec
19:29:56 <clarkb> s/step/concern/
19:29:58 <fungi> i keep meaning to set up one since i have a private inspircd with some semi-used channels i'd like to add a bridge for eventually, and could have a vanity username that way too
19:30:02 <mordred> from a user account pov - basically I had @mordred:waterwanders.com and I created @mordred:inaugust.com then invited @mordred:inaugust.com to anything that @mordred:waterwanders.com was in
19:30:36 <mordred> but moving the room from waterwanders homeserver to inaugust homeserver went amazing
19:30:57 <clarkb> It does seem like the plan is congealing which is nice. I'll have to review the spec properly to indicate that
19:31:01 <mordred> I believe if we have a homeserver with EMS and we decide at a point in the future we need to run it ourselves they can work with us to export the data and do an actual move
19:31:06 <corvus> i was literally like "what's mordred talking about, this room is on inaugust" oh yeah, it didn't use to be.  ;)
19:31:12 <fungi> i guess the irc bridges work by emulating an irc server and networking with existing servers on the same irc network?
19:31:39 <mordred> that seemed like a lot to ask them for just my little homeserver, so I didn't do it :)
19:32:19 <corvus> fungi: i'm not 100% sure; but https://github.com/matrix-org/matrix-appservice-irc is apparently the software
19:32:33 <clarkb> fungi: I suspect that the integration is less coupled than that. They probably get connection limit exceptions for their bridge in the network then just emulate being a bunch of clients
19:32:37 <fungi> ahh, thanks! i'll give that a thorough look
19:33:06 <clarkb> Alright anything else to talk about on this subject?
19:33:46 <corvus> oh 1 thing
19:33:59 <corvus> feel free to sign up to make a replacement irc bot (see the spec)
19:34:14 <corvus> eavesdrop/statusbot are available (tristanC wrote a gerritbot)
19:34:26 <corvus> meetbot too, but is not needed for zuul
19:34:37 <corvus> clarkb: otherwise, next steps seem clear to me
19:34:41 <clarkb> and for eavesdrop we may not even need channel logging if we can just grab those directly from the matrix server
19:34:48 <fungi> is there still any benefit to merging the bots into a single codebase?
19:35:12 <clarkb> fungi: I'm not sure I know enough to say at this point :)
19:35:29 <fungi> at a minimum we ought to at least shoot for significant code reuse
19:35:37 <corvus> clarkb: probably the easiest way to do that though is to have a bot account join a room
19:35:47 <clarkb> corvus: ah makes sense since that is the "api"
19:35:58 <corvus> ultimately, i think having something writing an html file to disk is still useful for search engine indexing
19:36:20 <corvus> (so even if an individual user can use the in-client search feature, if we value the indexing, we should have an eavesdrop bot)
19:36:31 <corvus> (if we don't value search engine indexing, then, er, maybe we drop that :)
19:36:50 <mordred> yeah. history is great - but having browsable html archives has been nice so far
19:36:53 <ianw> it's always handy to be able to link to a prior conversation
19:37:15 <mordred> yeah. I mean - you can do web links to matrix history - but they take you to a matrix client
19:37:33 <clarkb> oh in that case ++ to having separate html
19:37:35 <corvus> so yeah, that's a neat feature, but separately the plain html archive is still nice
19:37:35 <mordred> and that's maybe a bit heavy for wanting to reference in some other context
19:37:43 <mordred> ++
19:38:37 <clarkb> We have a few more topics to get to so let's move on
19:38:41 <clarkb> #topic Gitea01 Backups
19:38:46 <corvus> https://matrix.to/#/!eitSLAJcQKeehLruKf:matrix.org/$AB4j0_Z_HOn_pOAWfXdHSI-eHvDE_MMlJD2za1-kspU?via=matrix.org
19:39:14 <fungi> seems the network connectivity between vexxhost regions is still broken
19:39:16 <clarkb> These continue to not be urgent to fix because we haven't done any recent project renames. However, ianw was looking at the issue and we plan to do renames in a few weeks
19:39:31 <clarkb> fungi: ianw: want to fill us in on what you found?
19:39:58 <fungi> short story is that some ipv6 addresses in sjc1 can't communicate with some ip addresses in ca-ymq-1 over some protocols
19:40:10 <ianw> yeah, there was that
19:40:17 <clarkb> ipv4 is fine though?
19:40:22 <ianw> but the backup does seem to be daily running.  i guess it's falling back to ipv4?
19:40:30 <fungi> as if flows are being load balanced between routers at layer 4 and at least one router has a stray route with a greedy/short prefix
19:40:42 <clarkb> ianw: well it is emailing us about the failures as recently as yesterday
19:40:49 <ianw> the file-system backup.  but then the db part fails
19:40:53 <clarkb> oh got it
19:41:08 <clarkb> I think well behaved applications are expected to fallback to ipv4 if v6 doesn't work
19:41:12 <ianw> i found some dump options that looked promising, but i have to admit i got sidetracked on the ipv6
19:41:13 <clarkb> openssh is probably well behaved in this way
19:42:03 <ianw> fungi: you double checked the ipv6 thing right?
19:42:26 <fungi> yeah, if memory serves, the !h is being returned by the first hop in ca-ymq-1
19:43:00 <fungi> which leads me to suspect the core routing in that region, but it's all a black box to me. mnaser was looking into it
19:43:39 <ianw> i'll ping again, i mentioned something at the end of last week
19:43:51 <clarkb> sounds good.
19:44:04 <clarkb> #topic Gerrit project renames
19:44:12 <clarkb> #link https://review.opendev.org/797990 rename playbook updates
19:44:32 <clarkb> That is a change we'll need to have in before we do renames. We're still a few weeks away from our week after the gerrit server upgrade so not in a rush but wanted to point it out
19:44:45 <clarkb> If we can get that reviewed that would be great
19:45:02 <clarkb> And if you know of any additional renames that should be considered now is the time to get them on the list
19:45:40 <fungi> all of osf/* should probably be renamed to openinfra/* but that's not urgent and i still need to talk to folks at the foundation about it
19:45:47 <clarkb> fungi: ok
19:45:58 <clarkb> I'll mention the rename scheduling to the TC on Thursday too
19:46:06 <fungi> good call
19:46:38 <clarkb> #topic Should we change our meeting time?
19:47:03 <clarkb> It occurred to me that this meeting time is not very good for ianw (at least I don't think it is) and ianw is one of the primary participants.
19:47:19 <clarkb> frickler mentions that they lurk the meetings to follow along even though they don't actively participate
19:47:39 <clarkb> I wanted to put it out there that I'm ok with trying to find a time that works better for others if that would be helpful
19:47:44 <fungi> i'm happy to do whatever time others would prefer
19:47:51 * fungi has no life
19:48:27 <fungi> though also i expect to miss the next meeting, possibly the next two
19:49:10 <clarkb> frickler will miss the next three as well and mentioned that may be a good time to experiment with non EU timezone friendly meetings
19:49:22 <ianw> i don't mind, in (my) summer the meeting moves 6-7am which is perfectly fine.  7-8am usually involves family, and at the tail end of 8-9 i have to do school run
19:50:14 <ianw> so from my POV earlier is better
19:50:57 <clarkb> ok I wanted to double check. Since we have had changes to who participates and could move things around if it helps
19:51:07 <clarkb> Sounds like the current time slot is fine though and we can keep it as is
19:51:41 <clarkb> That was all I had
19:51:43 <clarkb> #topic Open Discussion
19:51:48 <clarkb> Anything else before our hour is up?
19:52:05 <ianw> if i could get a couple of eyes on
19:52:06 <ianw> #link https://review.opendev.org/c/opendev/system-config/+/798400
19:52:16 <ianw> that adds paste to ansible
19:52:36 <clarkb> ++ I'll add that to my afternoon todo list.
19:52:37 <ianw> what i'm really interested in is the mariadb container behind that, same bits as for gerrit
19:53:08 <ianw> would be nice to have a bit more experience with it
19:54:43 <clarkb> Last call :) Otherwise I'll let everyone go find breakfast/lunch/dinner
19:55:12 <clarkb> #endmeeting