Thursday, 2023-06-22

fricklerthat seems to be happening daily for quite some time: error: f443f4.conf:1 duplicate log entry for /var/log/acme.sh/acme.sh.log07:26
fricklerhappening on multiple mirrors in /etc/cron.daily runs. /etc/logrotate.d has both acme.sh.log.f443f.conf and f443f4.conf07:31
fricklerwas there some change in the logrotate role? looks like this code hasn't changed in 4y https://opendev.org/opendev/system-config/blame/branch/master/playbooks/roles/letsencrypt-acme-sh-install/tasks/main.yaml#L37-L4107:34
fricklerah, ianw did this. seems the cleanup isn't working as planned https://review.opendev.org/c/opendev/system-config/+/87348107:36
fricklerI think I found the issue, commented on https://review.opendev.org/c/opendev/system-config/+/87348207:45
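Editor's sketch: the "duplicate log entry" error above means two files under /etc/logrotate.d declare the same log path. A minimal way to spot such collisions is shown below. The stanza contents are assumed, since the log only gives the two file names, and the parser handles only the simple "path {" form rather than the full logrotate syntax.

```python
from collections import defaultdict


def find_duplicate_log_entries(configs):
    """Map each log path to the config files declaring it, keeping only duplicates.

    `configs` maps a file name to that file's text. Only stanza headers of the
    form "<path> [<path> ...] {" on a single line are recognised.
    """
    owners = defaultdict(list)
    for filename, text in sorted(configs.items()):
        for line in text.splitlines():
            line = line.strip()
            if line.startswith("#") or not line.endswith("{"):
                continue
            # Everything before the opening brace is a whitespace-separated path list.
            for path in line[:-1].split():
                owners[path].append(filename)
    return {path: files for path, files in owners.items() if len(files) > 1}


if __name__ == "__main__":
    # Hypothetical stanza; the real file contents are not shown in the log.
    stanza = "/var/log/acme.sh/acme.sh.log {\n  compress\n  rotate 30\n}\n"
    configs = {"acme.sh.log.f443f.conf": stanza, "f443f4.conf": stanza}
    for path, files in find_duplicate_log_entries(configs).items():
        print(path, "declared in", ", ".join(files))
```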
fricklermnasiadka: Error 122 creating lock file '/afs/.openstack.org/mirror/ubuntu-ports/db/lockfile': Disk quota exceeded!10:40
mnasiadkafrickler: fantastic :)10:41
fricklerwill check whether we have a bit of breathing room for quota10:41
fricklerinfra-root: ^^ I was thinking about adding another 1T volume to the AFS lvs, should be pretty easy and not too much risk, any concerns with that?10:42
fricklermain reason would be getting space for bookworm mirroring before we make progress with those old cruft removals10:42
frickleraccording to grafana, ubuntu-ports has been full for more than a month. I'm still holding back on simply increasing quota because some other mirrors that we might want to give priority in case the lvm plan is dismissed or doesn't work out11:19
frickler*... because there are some other quotas (which are also very tight on headroom) ...11:20
* frickler should really train to better proofread11:20
noonedeadpunkhey folks. is it possible to restore etherpad to the version 2474 https://etherpad.opendev.org/p/r.24fab14385c0aa2db6fa7340a8b2aae7/timeslider#2474 ?11:31
noonedeadpunkas it was messed up with autotranslate...11:31
noonedeadpunkNeilHanlon: ^11:31
hemantHello, Help me set up kubernetes as a service in openstack.11:34
noonedeadpunkhemant: you should check with magnum team I believe in #openstack-containers11:35
funginoonedeadpunk: i'm not sure how to reverse the read-only hash to a pad id. what was the name of the pad?11:40
fungionce i have that, i should be able to revert it11:40
fungifrickler: the main risk is that every new cinder volume we add is one more possible point of failure any time there's a problem in the service provider's backend storage network. adding a 5th volume roughly increases the risk of an afs outage by another 25% over the current 4 (with the old static.o.o carrying 14 cinder volumes we saw it happening quite frequently)11:48
fricklerfungi: so is that an objection or just a comment? should we discuss possible alternatives?11:51
noonedeadpunkI see... I actually don't know - got URL like that (just without timeslider)11:52
noonedeadpunkMaybe Neil has a name of the pad11:52
fungifrickler: it's more an explanation of why we try basically any other alternative before giving up and adding yet another volume11:52
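Editor's sketch: the "roughly 25%" figure for going from 4 to 5 volumes can be checked with a simple model. It assumes each cinder volume fails independently with the same probability over some period and that any single failure takes AFS down. Both the model and the per-volume probabilities used here are illustrative assumptions, not measurements.

```python
def outage_probability(volumes, p_volume):
    """Probability that at least one of `volumes` independent volumes fails."""
    return 1.0 - (1.0 - p_volume) ** volumes


def relative_increase(before, after, p_volume):
    """Fractional growth in outage probability when going from `before` to `after` volumes."""
    return outage_probability(after, p_volume) / outage_probability(before, p_volume) - 1.0


if __name__ == "__main__":
    # The per-volume failure probabilities below are made-up illustrative values.
    for p in (0.001, 0.01, 0.05):
        print(f"p={p}: 4 -> 5 volumes raises outage risk by {relative_increase(4, 5, p):.1%}")
```

For small per-volume probabilities the outage risk is close to n times p, so the increase approaches 5/4 - 1 = 25%. It shrinks slightly as p grows.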
frickleris that the same pad that NeilHanlon mentioned yesterday? looks like it from the content11:52
fungii expect so, but i don't recall him saying what the actual pad url was in here at least11:53
fungianyway, on afs, if we've run out of things to delete (did the fedora mirror deletion happen yet?) then we should consider adding another volume11:54
fricklerfedora went from 400g to 150g a month ago, so not completely done yet, but also nothing that will help for very long, either11:56
fricklerwe could consider deleting things like xenial and also bionic is eol now, but we need to discuss how long we want to delay supporting bookworm if we want to wait for those11:57
fricklermaybe we can also drop centos and move to rocky only, but that also has some political implications so no quick win expected11:59
opendevreviewMerged openstack/project-config master: Remove the CLOSED_SERIES static list  https://review.opendev.org/c/openstack/project-config/+/88621511:59
opendevreviewMerged openstack/project-config master: Add the branch to the release metadata  https://review.opendev.org/c/openstack/project-config/+/88621611:59
mnasiadkarax pypi mirror/proxy seems to have some issues, multiple Kolla jobs are failing every now and then with similar output: https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_4da/886675/1/check/kolla-build-debian/4dada85/kolla/build/000_FAILED_kolla-toolbox.log12:23
frickleractually quite funny to see how translating to chinese (I'm guessing) and back, then to polish and back, transforms the content. "Red Hat" - "The cap" - "The constraint" ;)12:26
NeilHanlonfungi: yes. it's https://etherpad.opendev.org/p/resf-rocky-linux-git-c.o-changes12:45
mnasiadkaargh those polish people12:52
fungiNeilHanlon: got it, just a sec12:54
NeilHanlonthanks fungi, noonedeadpunk12:54
NeilHanlonfwiw we've stopped updating said etherpad as we've been informed there are.... Corporate Watchers12:55
NeilHanloni don't like to be watched 🙃12:55
fungi#status log Rolled the resf-rocky-linux-git-c.o-changes etherpad back to revision 2474 since it fell victim to an accidental bulk auto-translation12:57
opendevstatusfungi: finished logging12:57
fungiNeilHanlon: noonedeadpunk: ^12:57
NeilHanlonthank you!12:58
fungimy pleasure12:58
noonedeadpunkthanks!12:58
NeilHanloni'll say I'm sorta relieved that even y'all cannot reverse engineer the readonly link into the original key12:58
fungithere may be a way, but in true sysadmin tradition i'm too lazy to find out the details13:01
NeilHanlonhear hear13:02
fungieither the hash is stored in a column in the db or there's a stored key it's transformed with13:02
NeilHanlonit probably involves SQL and a JOIN or two13:02
NeilHanlonway too much effort13:02
fungiclearly the service knows how to direct the read-only url to the correct pad id, i've simply never bothered to look into the implementation details since usually there's someone who happens to know the normal pad id anyway13:03
fungimnasiadka: so python-openstackclient===6.2.0 isn't appearing in the proxied pypi index in dfw, from the looks of that error. i'll poke around13:05
fungithat release was from over 3 months ago, so not like it's new13:05
fungier, in ord not dfw13:07
fungimmm, also couldn't find python-dateutil===2.8.213:09
fungiand pytz===2022.413:11
fungialso it was installing openstacksdk 1.3.0 just fine several times, and then one time it ended up grabbing 0.57.0 instead13:15
mnasiadkayou mean it might be a pip resolver issue, not ord problem? it works fine in ovh now on a recheck13:18
fungii don't have proof, but it seems like one of pypi's fastly cdn endpoints is randomly returning truncated or ancient indices13:18
NeilHanlonyea, that happens...13:20
NeilHanlonI occasionally have to flush the Rocky CDN cache due to that13:21
NeilHanlonif you have the url for the index, you might be able to `curl -X PURGE https://the.url/and/path` if they are not restricting that verb, which should prompt a refresh of the content13:22
opendevreviewRodolfo Alonso proposed openstack/project-config master: Disable networking-odl jobs temporarily  https://review.opendev.org/c/openstack/project-config/+/88675013:22
fungiNeilHanlon: yeah, we do that when it's just one or a few, but in this case it seems to be all over the place (each of those packages i mentioned is a different index file)13:27
fungiif it doesn't seem to clear up, we can start the game of whack-a-mole13:28
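Editor's sketch: the `curl -X PURGE` suggestion above amounts to sending an HTTP request with the non-standard PURGE method. A minimal Python equivalent using only the standard library might look like the following. The URL is the placeholder from the discussion, and whether a given CDN honours unauthenticated PURGE requests is not something this sketch can guarantee.

```python
import urllib.request


def build_purge_request(url):
    """Build (but do not send) an HTTP PURGE request for `url`."""
    return urllib.request.Request(url, method="PURGE")


def purge(url, timeout=10):
    """Send the PURGE request and return the HTTP status code.

    Raises urllib.error.HTTPError if the server rejects the verb.
    """
    with urllib.request.urlopen(build_purge_request(url), timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Placeholder URL taken from the chat; nothing is sent here.
    req = build_purge_request("https://the.url/and/path")
    print(req.get_method(), req.full_url)
```

Purging index by index is the "whack-a-mole" case: one call per affected index URL.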
mnasiadkafungi: seems it got better, or our jobs landed somewhere else than ord - now grafana upstream is behaving weird ;-) (joys of building images)13:44
opendevreviewMerged openstack/project-config master: Disable networking-odl jobs temporarily  https://review.opendev.org/c/openstack/project-config/+/88675013:50
Clark[m]frickler I'm not opposed to adding 1tb to the two dfw afs file servers. Worst case we find that doesn't work well and we add a third fileserver instead or something. But that seems like a good easy starting point14:56
fricklero.k., I'll start with practicing that again on a test vm, my sysadmining has gotten a bit rusty. I assume we do not need a maintenance announcement or something? so my tomorrow morning would be fine?15:18
Clark[m]Yes that should all be possible without downtime I think. And you can possibly put the RW volume on a single fileserver then do the other fileserver first. Then move the RW volumes and then update the other15:21
Clark[m]I'm not sure if that is necessary. I suspect lvm is involved and we can just rely on that but I would have to look closer 15:22
Clark[m]I actually don't know how you expand the afs stuff. Maybe that requires downtime. Worth looking into15:24
fungicreating the cinder volume, attaching it to the instance, marking it as a pv, extending the vg onto it, expanding the lv, and then growing the ext4 fs are all doable live with no downtime, we have cut-n-paste steps for all of it documented in system-config15:25
fungiit's just mounted as /vicepa so i'm pretty sure afs will simply see the available space in the fs at that point15:25
Clark[m]Ah vicepa is ext4 and not something special then15:26
Clark[m]That was the missing bit of info15:26
fungi/dev/mapper/main-vicepa on /vicepa type ext4 (rw,relatime,nobarrier,errors=remount-ro)15:27
Clark[m]Great that makes it super straightforward 15:28
fungifrickler: https://docs.opendev.org/opendev/system-config/latest/sysadmin.html#cinder-volume-management15:28
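Editor's sketch: the live-grow sequence described above (mark the new disk as a PV, extend the VG, expand the LV, grow ext4) can be written out as a dry-run command list. The VG and LV names follow the `/dev/mapper/main-vicepa` mount shown in the log. The device name `/dev/xvdX` is a hypothetical placeholder, and this is not a copy of the documented system-config procedure, which should be followed for real changes.

```python
import shlex


def lvm_grow_commands(device, vg="main", lv="vicepa"):
    """Return the commands that grow an ext4 filesystem on LVM onto a new disk.

    Nothing is executed; the caller decides whether to print or run them.
    """
    lv_path = f"/dev/{vg}/{lv}"
    return [
        ["pvcreate", device],                      # mark the new disk as a physical volume
        ["vgextend", vg, device],                  # add it to the volume group
        ["lvextend", "-l", "+100%FREE", lv_path],  # give all free extents to the LV
        ["resize2fs", lv_path],                    # grow ext4 online to fill the LV
    ]


if __name__ == "__main__":
    # /dev/xvdX is a placeholder for whatever device the attached cinder volume gets.
    for cmd in lvm_grow_commands("/dev/xvdX"):
        print(shlex.join(cmd))
```

Creating the cinder volume and attaching it to the instance happen on the cloud side first and are left out here.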
opendevreviewRodolfo Alonso proposed openstack/project-config master: Update networking-odl templates according to documentation  https://review.opendev.org/c/openstack/project-config/+/88677015:51
fricklerfungi: thx, that's really helpful and spares me from sieving through web search hits15:53
frickleralso is there a reason we still have bridge.openstack.org in DNS? the server itself was deleted, right? I will also update that doc, but waiting till I'm done, maybe more nits come up15:54
frickleroh, it's still shutoff only. just saw that incidentally when doing the first preparation step ;)16:03
frickleralso it looks like at some point in time we replaced afs01.dfw.opendev.org/main04 with afs01.dfw.opendev.org/main05, should I re-create main04 or rather go for 06? I'd choose the first option unless there is a reason not to16:06
Clark[m]I don't think lvm will care since it relies on metadata in the volume itself16:15
Clark[m]So it's mostly just bookkeeping on our side. I'm fine either way but maybe fungi has more input 16:16
opendevreviewRodolfo Alonso proposed openstack/project-config master: Update networking-odl templates according to documentation  https://review.opendev.org/c/openstack/project-config/+/88677016:30
fungifrickler: it doesn't matter. recreate main04 or create main06, it's all the same. just so long as the volume name in cinder isn't the same as one that currently exists (on the instance side it's opaque regardless)16:33
opendevreviewMerged openstack/project-config master: Update networking-odl templates according to documentation  https://review.opendev.org/c/openstack/project-config/+/88677016:51
fricklerthx for confirming. with that thing settled, how much should we increase the mirror.ubuntu-ports quota? current value is 470g, so 550 or 600g? please also have a look at the other nearfull mirror volumes in grafana16:53
Clark[m]I would do at least 10%16:54
Clark[m]Rounding to nice numbers16:54
fungiseems reasonable. we can increase afs quotas fairly quickly and often as long as there's space to extend them into16:58
fricklero.k., starting with 550g and will check how much of that is used when the mirror update is finished17:08
frickler#status log increased the quota for the mirror/ubuntu-ports volume from 470G to 550G17:09
opendevstatusfrickler: finished logging17:09
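Editor's sketch: the "at least 10%, rounding to nice numbers" rule that led from 470G to 550G can be expressed as below. The 50G rounding step is an assumption chosen to reproduce that result. The helper also builds an OpenAFS `fs setquota` command, which takes the quota in kilobytes, so "G" is treated here as GiB, which is likewise an assumption.

```python
def next_quota_gb(current_gb, min_increase_pct=10, step_gb=50):
    """Smallest multiple of `step_gb` that is at least `min_increase_pct` above current."""
    # Integer arithmetic in hundredths of a GB avoids floating-point rounding surprises.
    needed = current_gb * (100 + min_increase_pct)
    step = step_gb * 100
    return -(-needed // step) * step_gb  # ceiling division


def setquota_command(path, quota_gb):
    """Build an `fs setquota` command; -max is expressed in kilobytes."""
    return ["fs", "setquota", "-path", path, "-max", str(quota_gb * 1024 * 1024)]


if __name__ == "__main__":
    new_quota = next_quota_gb(470)
    print(new_quota)
    print(" ".join(setquota_command("/afs/.openstack.org/mirror/ubuntu-ports", new_quota)))
```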
fungithanks!17:30
*** iurygregory_ is now known as iurygregory19:25
NeilHanlonsharing: https://rockylinux.org/news/2023-06-22-press-release/21:32
