Monday, 2023-11-20

<ianw> that's interesting about "borg compact".  when i look at the borg source that comment seems to predate our use https://github.com/borgbackup/borg/commit/a8d52351bb41dbe82c6fef6010465e05cf4dc43e [03:40]
<ianw> https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2021-01-19.log.html#t2021-01-19T22:26:29 seems like where we discussed that [03:40]
<ianw> <clarkb> ianw: I run borg prune --verbose --list --prefix '{hostname}-' --show-rc --keep-daily    7 --keep-weekly   4 --keep-monthly  6 [03:40]
<ianw> that looks suspiciously like the command we use.  the conclusion i'm reaching is that i've probably never read that man page :) [03:41]
<ianw> however empirically we do know that pruning does actually create free space :) [03:42]
<tonyb> it does I think it's akin to "git prune" vs "git repack" [03:52]
<ianw> yeah, we probably should start running "compact" [03:58]
<ianw> it may, however, be a good time to rotate out the backup volumes before doing that ... [03:58]
<tonyb> we can look at that.  I don't pretend to understand the implications of that. [04:01]
<tonyb> I was kinda thinking we try it as a one-off on backup02 and see what happens. [04:02]
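
For reference, the prune invocation quoted above and the separate compact step under discussion can be sketched as argument lists. This is a minimal sketch, not the deployed backup script: the retention flags are copied from the quoted command, while the function names and the repository path are hypothetical, and passing the repository as a trailing positional argument assumes borg 1.x command syntax.

```python
import shlex


def borg_prune_argv(repo, prefix="{hostname}-", daily=7, weekly=4, monthly=6):
    """Build the prune command quoted in the log.

    The flags mirror the quoted command; `repo` is an assumption, since
    the log does not show which repository path is passed.
    """
    return [
        "borg", "prune", "--verbose", "--list",
        "--prefix", prefix, "--show-rc",
        "--keep-daily", str(daily),
        "--keep-weekly", str(weekly),
        "--keep-monthly", str(monthly),
        repo,
    ]


def borg_compact_argv(repo):
    """Build the separate compaction step for borg releases that ship `borg compact`."""
    return ["borg", "compact", repo]


EXAMPLE_REPO = "/path/to/example-repo"  # hypothetical path, not from the log

if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for argv in (borg_prune_argv(EXAMPLE_REPO), borg_compact_argv(EXAMPLE_REPO)):
        print(shlex.join(argv))
```

Printing rather than executing keeps a one-off trial, such as the one suggested for backup02, reviewable before anything touches a repository.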
<opendevreview> Roman Kuznecov proposed zuul/zuul-jobs master: DNM: Investigate POST_FAILURE error  https://review.opendev.org/c/zuul/zuul-jobs/+/901294 [10:31]
<fungi> tonyb: yeah, rotating the backup volumes is fairly straightforward. make and attach new volumes, change configuration to start putting all new backups on those [13:05]
<noonedeadpunk> hey! not sure if it's known or not, but are in-gerrit edits broken after friday's maintenance? [13:10]
<noonedeadpunk> as files in edit mode never open [13:10]
<fungi> noonedeadpunk: try a force refresh or cache clear, gerrit seems to cache its javascript client-side and it doesn't work across versions of the server [13:17]
<fungi> noonedeadpunk: https://groups.google.com/g/repo-discuss/c/DTrYQtY0j1k/m/7riBbIa5BwAJ [13:18]
<noonedeadpunk> fungi: ah, yes, I tried to force refresh on the change page, but not on edit window, that's apparently why it didn't help, thanks for the pointer! [13:20]
<fungi> yw [13:31]
<fungi> clarkb: ^ looks like we have another data point [13:31]
<tonyb> fungi: ianw: oh, that makes sense and explains the volume name on the host [13:59]
<fungi> tonyb: it's also convention, i think, that we delete the old-old volume around the same time. possibly before creating the new one in order to free up cinder quota, i don't recall [14:00]
<tonyb> good to know [14:01]
<tonyb> would anyone object to me sending a brief message to service-announce, with words to the effect of "since the update, in-browser edits may not work as expected.  we're investigating.  in the meantime do a full refresh on any page that is missing data." [14:03]
<tonyb> and then sending that link to the various discuss lists? [14:03]
<fungi> no objection from me. maybe clarkb can weigh in once it's daytime in his neighborhood [14:04]
<tonyb> sounds good [14:04]
<tonyb> .... that'll give me time to join some of the other lists [14:05]
*** dhill is now known as Guest7608 [14:28]
<fungi> our automated root e-mails from the backup servers about >90% volume usage seem to have ceased after friday's pruning [14:43]
<tonyb> Great success [14:43]
<fungi> infra-root: https://public-cloud.status-ovhcloud.com/incidents/ssscy5l81mtm [14:49]
<tonyb> Oh, we run borg in append-only mode right?  "If borg compact command is used on a repo in append-only mode, there will be no warning or error, but no compaction will happen." [14:49]
<fungi> looks like we missed the 2023-11-20 08:30-10:30 window for bhs [14:49]
<fungi> not sure if we saw any impact from it [14:49]
<fungi> there is another at the same time tomorrow affecting gra [14:50]
<fungi> it looks like they don't expect any outages/down-time from it, so i guess no need to temporarily remove them from log uploading [14:51]
<tonyb> "Customers must follow RFC rules to avoid having disruptions on their requests" .... What are RFC rules? [14:51]
<fungi> "This upgrade will add support for TLS 1.3 as well as strict compliance with RFC7540." [14:51]
<fungi> i assume they mean make sure your systems comply with ietf rfc 7540 prior to the maintenance time [14:51]
<tonyb> Ahh okay that makes sense [14:52]
<fungi> disappearing for an hour to run some errands, back soon [15:23]
<tonyb> Kk [15:23]
*** dmellado2 is now known as dmellado [15:58]
<clarkb> I have deleted my gerrit upgrade related autoholds [16:00]
<tonyb> Okay [16:01]
<clarkb> I'll work on changes to update our gerrit image builds next. Though we probably want to wait until after the long holiday weekend to land any of them [16:01]
<clarkb> tonyb: re append only I think that is correct and explains why we don't compact [16:01]
<tonyb> clarkb: That makes some sense at least. [16:02]
<opendevreview> Clark Boylan proposed openstack/project-config master: Switch jeepyb to building gerrit 3.8 images  https://review.opendev.org/c/openstack/project-config/+/901465 [16:10]
<opendevreview> Clark Boylan proposed opendev/system-config master: Switch gerrit 3.7 image to 3.8 in a couple places we missed  https://review.opendev.org/c/opendev/system-config/+/901466 [16:11]
<opendevreview> Clark Boylan proposed opendev/system-config master: Cleanup Gerrit 3.7 image jobs and disable Gerrit upgrade job  https://review.opendev.org/c/opendev/system-config/+/901467 [16:11]
<fungi> according to the boss, i have more errands to run, so disappearing for another hour-ish (seems like nothing's on fire at the moment) [16:14]
<opendevreview> Clark Boylan proposed opendev/system-config master: Add gerrit 3.9 image builds  https://review.opendev.org/c/opendev/system-config/+/901468 [16:18]
<opendevreview> Clark Boylan proposed opendev/system-config master: Add gerrit 3.8 to 3.9 upgrade testing  https://review.opendev.org/c/opendev/system-config/+/901469 [16:24]
<clarkb> https://review.opendev.org/c/openstack/project-config/+/901465 and https://review.opendev.org/c/opendev/system-config/+/901466 are small updates that will need to land before we can start testing the rest of the stack [16:24]
<clarkb> infra-root for the old hosts in emergency.yaml I'm basically doing git log -p in system-config and grepping for those names. Confirming that last mention of them was their removal from the inventory and in those cases I'll remove them from the emergency file [16:26]
<tonyb> clarkb: Sounds good to me. [16:27]
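
The check described above (search `git log -p` for a host name and confirm that its most recent mention is a removal) could be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure actually used: the function names, the sample log text, and the inventory file path in it are hypothetical. It relies on `git log` listing the newest commit first, so the first matching diff line in the output is the latest mention.

```python
import subprocess


def git_log_patches(repo_dir):
    """Return `git log -p` output for a checkout (newest commit first)."""
    result = subprocess.run(
        ["git", "log", "-p"],
        cwd=repo_dir, check=True, capture_output=True,
        encoding="utf-8", errors="replace",
    )
    return result.stdout


def last_mention_is_removal(log_patch_text, host):
    """Classify the most recent diff line that mentions `host`.

    Returns True if that line is a removal ("-"), False if it is an
    addition ("+"), and None if no added or removed line mentions the host.
    """
    for line in log_patch_text.splitlines():
        if host not in line:
            continue
        if line.startswith(("+++", "---")):
            continue  # file header lines, not content changes
        if line.startswith("-"):
            return True
        if line.startswith("+"):
            return False
    return None


# Hypothetical sample; the file path is invented for illustration.
SAMPLE_LOG = """commit 2222222
    Remove health01 from inventory

--- a/inventory/example-hosts.yaml
+++ b/inventory/example-hosts.yaml
-    health01.openstack.org:
commit 1111111
    Add health01

+    health01.openstack.org:
"""

if __name__ == "__main__":
    print(last_mention_is_removal(SAMPLE_LOG, "health01.openstack.org"))
```

Commit message lines are indented rather than prefixed with "+" or "-", so a mention there does not affect the result; only content changes are classified.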
<tonyb> clarkb: I was thinking about doing the 2 OVH mirror nodes at the same time (single zone change to add both, single inventory addition etc).  Is there any reason this is a bad idea? [16:29]
<clarkb> tonyb: no that should be fine [16:29]
<tonyb> Thanks for confirming [16:30]
<clarkb> after looking at the emergency file I cleaned up the openedge and citynetwork kna1 mirrors, health01, subunit-worker01/02, nb03.opendev.org [16:30]
<clarkb> I've left bridge.openstack.org in place since we probably need to double check that node is fully gone to avoid any accidental restarts [16:30]
<clarkb> though bridge.openstack.org is out of the inventory [16:31]
<tonyb> Back in about 30-45mins [16:49]
<fungi> okay, back for good [17:52]
<tonyb> bold claim ;P [17:52]
<fungi> well, for neutral at least, not for evil [17:53]
<fungi> probably chaotic good, but we'll see where this week goes [17:53]
<tonyb> chaotic seems probable [18:02]
<clarkb> for the meeting agenda I was going to drop mm3, keep gerrit 3.8 to recap the upgrade and followup changes above, discuss gitea ssh key rotation stuff. Anything else to add or drop? [18:02]
<tonyb> Do we want to discuss / think out upgrading the rax trove from mysql-5.7?  and/or self hosting the DB?  That's probably better as a mailing list discussion [18:04]
<fungi> on the mm3 front, we do have a non-urgent need to add ansible management of templates, but i doubt it requires discussion in the meeting [18:05]
<clarkb> fungi: ya I think we can just make that a part of the ansible vars for each list and proceed until that doesn't work for some reason. [18:05]
<fungi> tonyb: there are possibly multiple rax trove instances still in use, but i'm assuming you mean the one zuul is relying on (though i guess we could inventory and upgrade all of them if there are still any others) [18:06]
<clarkb> I think storyboard, zanata, and zuul are all trove still off the top of my head [18:06]
<fungi> i think we took sb's database local in order to improve performance, but maybe we never cleaned up the trove side [18:07]
<clarkb> ah [18:07]
<fungi> yeah, there's a mysqld running on the sb server [18:08]
<tonyb> Yeah I was thinking about zuul specifically. [18:08]
<clarkb> fungi: tonyb: https://review.opendev.org/c/openstack/project-config/+/901465 and https://review.opendev.org/c/opendev/system-config/+/901466 are two followup changes that should be quick to review and safe to land now (the project-config change is needed before CI jobs will run against the other changes as well) [18:09]
<fungi> both of those lgtm, thanks! [18:11]
<opendevreview> Merged openstack/project-config master: Switch jeepyb to building gerrit 3.8 images  https://review.opendev.org/c/openstack/project-config/+/901465 [18:24]
<clarkb> ok meeting agenda updates are in. Let me know if I missed anything [18:26]
<fungi> thanks! [18:40]
<opendevreview> Clark Boylan proposed openstack/project-config master: Add new Gerrit submodule java-prettify to Zuul  https://review.opendev.org/c/openstack/project-config/+/901479 [18:49]
<opendevreview> Merged opendev/system-config master: Switch gerrit 3.7 image to 3.8 in a couple places we missed  https://review.opendev.org/c/opendev/system-config/+/901466 [18:57]
<opendevreview> Clark Boylan proposed opendev/system-config master: Add gerrit 3.9 image builds  https://review.opendev.org/c/opendev/system-config/+/901468 [19:03]
<opendevreview> Clark Boylan proposed opendev/system-config master: Add gerrit 3.8 to 3.9 upgrade testing  https://review.opendev.org/c/opendev/system-config/+/901469 [19:03]
<clarkb> I feel like I have to refresh my memory on how we deal with submodules in gerrit each time they change. tldr is plugins we just copy paste into place and don't submodule init them. jgit, polymer-bridges, and now java-prettify we do submodule init but we rm the gerrit repo git remote first so that it does the init against the local filesystem paths [19:07]
<fungi> doesn't zuul-jobs have some helpers for dealing with submodules, or did i dream that? [19:12]
<clarkb> unsure. I think gerrit is sufficiently weird we wouldn't want a general solution though [19:15]
<fungi> fair [19:21]
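
The submodule handling summarized above can be sketched as an ordered list of commands. This is a rough illustration, not the actual job content: the function name, the remote name `origin`, and all paths are assumptions, and the order of copying plugins relative to removing the remote is a guess. The idea it tries to capture is that once the superproject has no remote configured, git resolves relative submodule URLs against the local checkout instead of the upstream server.

```python
def gerrit_submodule_steps(plugin_copies, submodule_paths, remote="origin"):
    """Return the command sequence for preparing a gerrit checkout.

    plugin_copies:   (source, destination) pairs copied into place and
                     never passed to `git submodule`.
    submodule_paths: submodules to initialize after the remote is removed.
    """
    # Plugins are copied directly rather than initialized as submodules.
    steps = [["cp", "-r", src, dest] for src, dest in plugin_copies]
    # Remove the remote first so the init resolves against local paths.
    steps.append(["git", "remote", "remove", remote])
    # Initialize only the submodules that are handled the submodule way.
    steps += [["git", "submodule", "update", "--init", path]
              for path in submodule_paths]
    return steps


if __name__ == "__main__":
    # All paths below are hypothetical placeholders.
    for step in gerrit_submodule_steps(
        plugin_copies=[("/src/example-plugin", "plugins/example-plugin")],
        submodule_paths=["modules/jgit", "modules/java-prettify"],
    ):
        print(" ".join(step))
```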
<tonyb> Looking at the OVH mirror nodes, and I see they're consistent with each other but different to the rax nodes.  logical volume names are different, as is the size of the volume:  https://paste.opendev.org/show/bJxUerJHsxoUb8qaFaFW/ [19:23]
<tonyb> While I get that this isn't a problem, as I'm rebuilding them should I a) stick with the same names/sizes or b) standardize on the rax names and sizes ; or c) $something else [19:25]
<clarkb> tonyb: the reason for the size difference is that the apache cache cleaner daemon couldn't keep up on rax (slower io iirc) so we gave it larger disk for more headroom [19:25]
<clarkb> tonyb: I think you can use consistent sizes within ovh as that hasn't been a problem there. [19:26]
<clarkb> tonyb: for naming I suspect that is a side effect of needing to do this manually :/ I kinda like the more generic volume naming in rax because we might one day choose to swap out apache for some other proxy cache instead [19:26]
<tonyb> naming is hard ;P [19:27]
<fungi> that's why i treat it as a drinking game [19:29]
<opendevreview> Tony Breeds proposed opendev/system-config master: Add a helper script for doing the LVM setup on mirror nodes.  https://review.opendev.org/c/opendev/system-config/+/901504 [19:57]
<opendevreview> Tony Breeds proposed opendev/system-config master: Add a helper script for doing the LVM setup on mirror nodes.  https://review.opendev.org/c/opendev/system-config/+/901504 [21:55]
<clarkb> tonyb: made some notes on ^ [22:00]
<tonyb> clarkb: replied [22:08]
<clarkb> thanks responded inline [22:10]
<tonyb> okay.  I'll address that in a bit. [22:19]
<tonyb> clarkb: thanks [22:19]
<corvus> tonyb: fyi the reasoning for append-only is for forensic analysis and recovery in the case of compromise [23:49]
<corvus> fungi: gerrit's zuul has some gerrit-specific helpers, that's probably what you're thinking of.  with a little bit of work they could be genericized, but they still only are appropriate for projects using submodule subscription, so there's a lot of caveats about their use.  :)  and they're not useful for opendev's gerrit because we aren't in that situation with gerrit (we don't have submodule subscriptions for the plugins) [23:53]
<corvus> s/not useful for opendev's gerrit/not useful for opendev's jobs-that-build-gerrit-itself/  :) [23:57]
<clarkb> fwiw https://review.opendev.org/c/openstack/project-config/+/901479 is the change we need to make the new submodule work in our builds of gerrit [23:57]
<clarkb> since that is in a trusted repo and requires a zuul config bump it isn't self testing and needs to be merged before it will work [23:58]
<corvus> i feel like we run into a high number of meta language issues working on zuul and gerrit [23:58]
