Monday, 2021-07-19

clarkband so when we started the scheduler up again it got failures from those merger instances when fetching configs?00:00
ianwi didn't, only the scheduler.  that's a good point.  i'll do a full restart00:00
clarkbya ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) shows up in the zuul error report00:00
ianwi'm running the full restart playbook now00:01
clarkbok00:01
ianwi didn't think about the mergers00:01
ianwok that's finished00:02
clarkbits still running through its processes to start up though00:03
ianwyep00:03
clarkbianw: Unknown projects: opendev/meetbot00:05
clarkbin my paranoia I wonder if that wasn't synced over to the new server properly?00:06
clarkbhrm no it seems git/opendev/meetbot.git does exist00:06
clarkbit might be an order of operations thing loading configs?00:07
clarkbya I think its a cross tenant order of operations thing00:07
ianwloading, loading, loading ... 00:12
ianwok, seems back00:13
clarkbthe error list is much much smaller now too :)00:13
clarkbI think you can recheck your system-config change00:13
ianwyep, it's running now00:14
ianw\o/00:14
ianwi think time for a cup of tea and take a breath!00:14
clarkbI want to see the jobs actually start but I agree00:15
clarkbhttps://zuul.opendev.org/t/openstack/stream/5427ef9af78943c5aafe41ca8431fa99?logfile=console.log is the tox-docs job and it did just start00:15
clarkbI'm a bit worried that this extended queued time is due to the mergers taking longer to set up repos00:17
clarkbbut I guess it could also just be slow node launches. Far too early to say00:17
ianwthe rest are multi node jobs i think00:18
ianwnot quite i guess00:19
ianwwe have quite a few building nodes00:20
clarkbLooking at zuul logs I think the issue is noderequest fulfilment00:20
clarkbthe scheduler has only accepted 2 completed node requests00:21
clarkband another job just started. I just need to learn to be patient I think00:21
ianwinap-mtl01 has a bunch of building nodes that look exactly like what we want :)00:21
clarkbI wonder if we are hitting its image cache problems that we run into periodically where it has a slow period. I think everything is going as expected except for slow node boots and that is independent of our work today00:22
clarkbianw: I think I'll take a break now since stuff seems to be moving the right direction. I'll check in later00:23
ianw++ thank you!00:24
clarkb`grep 'Accepting node request' /var/log/zuul/debug.log` on the scheduler if you want to see its progress using nodesets00:24
clarkbthough I guess that isn't much different than checking the js dashboard00:25
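
The grep check above can also be sketched in Python to count accepted node requests. This is only an illustration: the phrase "Accepting node request" and the log path come from the command above, while the function name count_accepted and the sample lines are made up for the example and may not match the real debug.log format.

```python
from typing import Iterable

PHRASE = "Accepting node request"  # phrase taken from the grep command above


def count_accepted(lines: Iterable[str]) -> int:
    """Count log lines that mention an accepted node request."""
    return sum(1 for line in lines if PHRASE in line)


# Hypothetical sample lines; the real debug.log format may differ.
sample = [
    "DEBUG zuul.Scheduler: Accepting node request 100-0001",
    "DEBUG zuul.Scheduler: Something unrelated",
    "DEBUG zuul.Scheduler: Accepting node request 100-0002",
]
print(count_accepted(sample))
```

On the scheduler the same function could be fed an open file handle for /var/log/zuul/debug.log instead of the sample list.
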
ianwyeah it's building a ton of nodes, so i think it's just getting itself warmed up00:25
clarkband thank you for doing all the hard work to make this happen :)00:25
clarkbthere we go a bunch of jobs just started on the system-config change00:29
clarkband really need to take a break for dinner now.00:30
fungii'm not really around still, sorry, but all's looking okay?00:47
Clark[m]We've sorted through the issues that have come up so far. Currently waiting for zuul to merge the system config change to make review02 the review server in Ansible. Then we can run the playbook manually00:48
Clark[m]Then once that is happy we can reenable Ansible and do some cleanup00:49
Clark[m]I'm working on dinner then probably a walk then will check back in again00:49
ianwahh, that system-config job wasn't actually running the gerrit checks prior00:56
Clark[m]Ya it's different because you touched the config file00:57
opendevreviewIan Wienand proposed opendev/system-config master: review02: move out of staging group  https://review.opendev.org/c/opendev/system-config/+/79756300:58
ianwClarkb[m]: another attempt that updates the system-config-run job as well00:59
clarkb+201:00
Clark[m]ianw: looks like it is still failing01:33
ianwargh01:44
ianwgroupadd: GID '3000' already exists01:46
Clark[m]That's the Gerrit gid ?01:47
Clark[m]Maybe we are running twice for some reason?01:47
ianwcalling it review02 will help not have to fiddle fake letsencrypt certs01:50
ianw(in the system-config-run test)01:50
opendevreviewIan Wienand proposed opendev/system-config master: review02: move out of staging group  https://review.opendev.org/c/opendev/system-config/+/79756301:53
clarkbianw: any idea why Create Gerrit Group isn't the only thing creating a gid 3000?01:54
ianwno i'm going to watch this more closely 01:55
clarkbok01:55
ianwit is created as 3000 on review0201:55
clarkbI wonder if some package on focal that we install is creating a group after we shift the min gid and uid?02:01
clarkbianw: do you have a hold set up for the runs of ^02:01
clarkbmight be a good idea to see what /etc/group says about gid 300002:01
ianwyep :)02:04
clarkbgerrit has merged a change since the move (just one more indication that things are working overall)02:05
clarkbhttps://review.opendev.org/c/openstack/sushy/+/801034/ that one02:05
clarkbI'm going to double check it shows up on the giteas02:06
ianwyeah i think it's fine.  i really wish i'd noticed this job not running prior to this02:06
clarkbianw: I wonder if we should consider skipping ahead and reenqueuing zuul changes?02:08
clarkbfwiw that change showed up on the giteas just fine02:08
ianwyep if this run doesn't pass i'll skip ahead, re-enqueue the changes (there's only about 4) and make sure the backups are running02:15
ianwthe node doesn't appear to start with a gid 3000, so that's something02:17
clarkbianw: for general sanity should we remove the replication config on review01? maybe just comment it out in the file?02:20
clarkb(just wondering what happens if gerrit starts there again unexpectedly and I think the only real issue would be if it replicated)02:21
ianwsure, i can move that out of the way.  the apache is still serving the maintenance page so should be hard to merge anything02:26
ianwi've moved it to a .post-ugprade file02:26
clarkbthanks02:27
ianwgerrit2:x:3000:02:33
ianwso it made the group02:33
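
The /etc/group check discussed above (which group owns gid 3000) could be sketched like this. It assumes the standard name:password:gid:members layout of /etc/group; only the gerrit2 line comes from the log, the other sample entries and the helper names parse_group and groups_with_gid are hypothetical.

```python
from typing import List, NamedTuple


class GroupEntry(NamedTuple):
    name: str
    gid: int
    members: List[str]


def parse_group(text: str) -> List[GroupEntry]:
    """Parse /etc/group style text (name:password:gid:member,member)."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, _password, gid, members = line.split(":", 3)
        entries.append(
            GroupEntry(name, int(gid), [m for m in members.split(",") if m])
        )
    return entries


def groups_with_gid(text: str, gid: int) -> List[str]:
    """Return the names of all groups that use the given gid."""
    return [entry.name for entry in parse_group(text) if entry.gid == gid]


# Only the gerrit2 line is from the log; the rest is illustrative.
sample = "root:x:0:\nadm:x:4:syslog,ubuntu\ngerrit2:x:3000:\n"
print(groups_with_gid(sample, 3000))
```

More than one name in the result would point at the kind of duplicate that makes groupadd complain that the GID already exists.
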
clarkbthats good02:34
clarkbcould it have been a side effect of the LE failure somehow?02:34
ianwi think it must have been, but i'm not sure how02:35
ianwDestination directory /etc/netplan does not exist02:36
ianwsigh02:36
ianwit's going to be a bit more work getting the CI job and hence ansible run working02:37
clarkbseems like it is getting close now though as that is the last bit of the playbook isn't it?02:37
clarkbianw: I think you can split the netplan fix up out into another playbook and have that not run in the test job?02:38
ianwwe only want to do that on the production host02:38
clarkbbut do run it in the infra prod job02:38
clarkbya02:38
clarkbor use a testing flag and only do that when it is undefined or false?02:40
clarkbthat might be better for simplifying testing and keeping things consistent across hosts02:40
ianwi will re-enqueue the zuul changes, put review02 in emergency and allow ansible to start running again02:40
clarkbok02:40
ianwi'm happy the server is operational, it's now just making sure the ansible apply is idempotent and doesn't move it backwards :)02:41
clarkb++02:42
clarkbI'm working on an update to your change to do the its a test flag02:42
clarkbfor the netplan config02:42
opendevreviewClark Boylan proposed opendev/system-config master: review02: move out of staging group  https://review.opendev.org/c/opendev/system-config/+/79756302:45
clarkbianw: ^ something like that for the netplan issue maybe02:45
ianwthanks02:46
clarkbthat hasn't kicked the running jobs out of check yet02:47
clarkba new change has just entered check so why hasn't the new patchset of ^ bumped the old one out02:51
clarkbthat could be a bug in the zuul pipeline changes I guess02:51
clarkbnow its queued up. Ya I suspect some sort of starvation processing the pipelines02:53
ianwmaybe the last job there was in it's post phase or something02:54
clarkbwell we did redo the pipeline processing in zuul this last week02:54
clarkbso it could totally be something to do with that02:54
clarkbianw: do you know why we need to build the gerrit images in those changes too?02:56
clarkbhrm I bet we turned it on for test_gerrit.py but we don't really need it? probably helps in the long run02:56
clarkbI expect we had problems where we were updating the tests and trying to test new images with depends on or similar and it wasn't working02:56
ianwi think system-config-run-review depends on the images so it always builds them?  03:00
ianwi don't imagine we'll be taking the server down at this point, so i think we can announce that it is back online03:03
clarkbmaybe mention that we are still working through restoration of our config management processes so acl changes and new projects aren't possible yet03:05
ianwi might keep it simple and say the update is over, and if i can't get this sorted by EOD (which I should be able to) call that out03:17
clarkbok03:18
ianwI wonder if the "unknown" time remaining somehow has to do with the pause entering the gate 03:22
clarkbya zuul only shows a number there once all jobs have at least started03:24
ianw#status alert The maintenance of the review.opendev.org Gerrit service is now complete and service has been restored.  Please alert us in #opendev if you have any issues.  Thank you03:24
opendevstatusianw: sending alert03:24
clarkbso having a pause and then waiting for some stuff to happen causes that to happen in the zuul web ui03:25
-opendevstatus- NOTICE: The maintenance of the review.opendev.org Gerrit service is now complete and service has been restored. Please alert us in #opendev if you have any issues. Thank you03:25
clarkbianw: do alerts change the topic?03:25
clarkbdoesn't look like it. I guess03:25
ianwnot at the moment03:25
ianwsomething to do with acl permissions in oftc or something or other03:25
ianwoh, doh, there's an end command isn't there03:26
clarkbya you #status ok to end the alert03:26
clarkbwhich sets the topics back again iirc03:26
ianw#status ok03:27
clarkbI usually use #notice unless I know I want it in the topics03:27
clarkbit might not process that until it is done processing the alert (and you may need to reissue it?)03:27
ianwoh well it's in the checklist for next time :)03:28
ianwprobably it's good that it's been so long since i sent a global alert that i forgot!03:29
ianwreview jobs running now, fingers crossed03:29
opendevstatusianw: sending ok03:30
ianwsystem-config-run-review-3.2 success ! yay03:48
clarkbprogress03:48
ianwi've disabled the backup cron jobs on review01 and will get backups happening on 02 once 797563 merges and i run it03:54
clarkbianw: ok. Keep in mind having review02 in the emergency file makes running the playbook weird03:54
ianwyep i have a command that uses inventory out of my checkout 03:54
clarkbI think you may end up with a huge set of jobs because inventory changed that zuul will work through. If service-review is far down the list you might get away with just running the playbook after removing 02 from the emergency file03:55
clarkbianw: without the emergency file?03:55
ianwyeah, for just running review.  i'll run it by hand as i want to watch it03:55
clarkbgot it03:55
clarkbianw: I'm a little annoyed we'll get a new gerrit image we don't need, but at the same time we just updated the gerrit image so that should be fine for whenever we restart in the future04:01
ianwyeah, i'm not sure of a way around that04:04
opendevreviewIan Wienand proposed opendev/system-config master: gerrit: fix Launchpad credentials write  https://review.opendev.org/c/opendev/system-config/+/80122704:07
opendevreviewMerged opendev/system-config master: review02: move out of staging group  https://review.opendev.org/c/opendev/system-config/+/79756304:49
ianwyay, it's that easy04:51
*** ykarel|away is now known as ykarel04:53
*** dpawlik0 is now known as dpawlik05:10
ianwok i have run the review playbook against the new server and everything looks good.  replication config is setup, nothing out of order in the other configs, cron jobs are there for cleanup etc.05:17
ianwi'm taking the server out of emergency as it should be fine now05:17
opendevreviewIan Wienand proposed opendev/system-config master: backups: add review02.opendev.org  https://review.opendev.org/c/opendev/system-config/+/79756405:29
*** mgoddard- is now known as mgoddard06:04
*** amoralej|off is now known as amoralej06:10
opendevreviewMerged opendev/system-config master: backups: add review02.opendev.org  https://review.opendev.org/c/opendev/system-config/+/79756406:19
ianwlooks like package installation on review02 is actually borked due to https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/192691806:44
ianwi'm going to try the downgrade mentioned there06:44
ianwi think we actually might need to check all our focal systems for this06:45
jssfrGood morning everyone. First time contributor to OpenStack here. My company just signed the CCLA, with my address on the list. I am now looking at the gerrit UI to figure out how to apply this. The only choices I have are the "OpenStack Individual Contributor License Agreement" and two "externally managed" ones. Should I sign the ICLA (<https://docs.openstack.org/contributors/common/setup-gerrit.html#contributors-from-a-company-or-organization> seems to suggest that) or will an external process (which I may need to poke?) add some system CLA to my account once the CCLA by my employer has been processed?06:49
*** dpawlik5 is now known as dpawlik06:50
ianwjssfr: i'm no expert here, but *you* should sign the ICLA and the corporate one is an extra thing for company lawyers and the opendev foundation07:02
*** hashar is now known as Guest136507:02
*** hashar_ is now known as hashar07:02
ianwok, i finally got borg onto review02.  running initial backups now07:02
opendevreviewIan Wienand proposed opendev/system-config master: review02: skip ~gerrit2/tmp in backup  https://review.opendev.org/c/opendev/system-config/+/80123507:05
*** dpawlik7 is now known as dpawlik07:06
jssfrianw, aha, that is a viewpoint which fits my mental model *and* the stuff written on the page. Thanks!07:17
ianwit could be made more explicitly clear, i'd probably agree07:22
jssfrI mean the text as written is unambiguous, but combined with the slightly aged screenshots, I wasn't sure if the process is still up-to-date.07:25
ianwjssfr: i'm sure a contribution would be welcome to https://opendev.org/openstack/contributor-guide/src/branch/master/doc/source/common/setup-gerrit.rst :)07:26
ianwok, long day, but all 24 of the checklist points are marked off on https://etherpad.opendev.org/p/gerrit-upgrade-202107:42
ianwthe server is up; no complaints and it's processed quite a few changes now, it has had successful backup runs07:43
ianwnothing on the cleanup list can't wait07:44
ianwi'll try to check back in for the next few hours, but i'm mostly out now07:44
opendevreviewMerged opendev/system-config master: review02: skip ~gerrit2/tmp in backup  https://review.opendev.org/c/opendev/system-config/+/80123508:14
*** dpawlik3 is now known as dpawlik08:30
*** dpawlik4 is now known as dpawlik08:44
*** ykarel is now known as ykarel|lunch08:54
*** rpittau|afk is now known as rpittau09:31
*** kopecmartin is now known as kopecmartin|pto09:45
*** ykarel|lunch is now known as ykarel09:59
fungijssfr: just to clarify, the ccla is purely paperwork, and a best-effort/honor-system tracking of affiliations for contributors to official open infrastructure foundation projects. in contrast, the icla is required for all contributors to certain projects, for example openstack, and enforced in the code review system so that it prevents contributions from being pushed for repos under the governance of those projects unless you've agreed to it11:29
jssfraha, understood11:34
fungiinfra-root: just a reminder, i'm still away and on the road all day today, but should be around and start catching back up tomorrow12:26
*** amoralej is now known as amoralej|lunch12:52
opendevreviewAnanya Banerjee proposed opendev/elastic-recheck master: Run elastic-recheck container  https://review.opendev.org/c/opendev/elastic-recheck/+/72962313:06
*** sshnaidm|afk is now known as sshnaidm13:10
opendevreviewAnanya proposed opendev/elastic-recheck master: Run elastic-recheck container  https://review.opendev.org/c/opendev/elastic-recheck/+/72962313:14
mnaseris it me or does gerrit feel much more snappy/quick13:42
*** amoralej|lunch is now known as amoralej13:43
rm_workQuestion for folks -- when you do DB maintenance, do you just ... take down the DB briefly and expect OpenStack services to deal with retries or whatever for the duration? Do you have a more complex strategy? Turn off OpenStack services first? Keep the DB available via a mirrored DB setup using the read-only node?13:49
clarkbrm_work: we don't operate openstack services for the most part so not in a great position to answer14:28
rm_workheh yeah but this channel is a who's who of operators :D14:28
clarkbmnaser: our big theory for why the old gerrit was very slow was memory contention preventing gerrit and the operating system and the web server from having enough memory to all be happy at once. The new instance is larger (thank you mnaser and vexxhost!) allowing us to allocate more memory for each of those memory consumers14:28
clarkbmnaser: long story short I'm very glad to hear you think it is snappier and we thank you for the help in making that happen :)14:29
mnaserclarkb: yeah, i'm happy that it actually had a positive impact -- i tried removing a topic from a change and it happened instantly vs before which would take quite a bit of time :)14:29
clarkbrm_work: the two recent db maintenances we did (the gerrit move and a zuul upgrade that required a db migration) were both done with services down. Not ideal but things are getting better slowly.14:30
clarkbmnaser: another good test is dansmith's giant patch bombs :) pushing those has been very slow in the past14:31
clarkba series of changes or change updates to a large repo in particular14:31
clarkbsince things seem to be going well this morning I'm going to go find breakfast and do my normal startup routine. I'm hoping that I can then start hacking on testing of our project rename playbook today as well in prep for the planned renames sometime next week14:33
clarkbrm_work: back when lifeless was thinking about these problems I think he liked the idea of a transparent cutover using an intelligent proxy14:34
clarkbrm_work: I have no idea how feasible that is with the tooling available today, but basically you mirror the database then force all reads and writes to go through a proxy to keep things in sync. Then to cut over you have the proxy halt connections momentarily while you do a catch up on the new side and then remove the old side from the proxy14:35
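
The proxy cutover idea described above can be illustrated with a toy, single-process model. This is a sketch under heavy assumptions and not any real proxy's behaviour: plain dicts stand in for the two databases, and the CutoverProxy class with its backlog and held lists is entirely hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class CutoverProxy:
    """Toy proxy: mirror writes with lag, pause, catch up, then switch."""

    def __init__(self, old: Dict[str, str], new: Dict[str, str]) -> None:
        self.old: Optional[Dict[str, str]] = old
        self.new = new
        self.active = old  # backend currently serving reads and writes
        self.backlog: List[Tuple[str, str]] = []  # on old, not yet on new
        self.held: List[Tuple[str, str]] = []  # arrived while paused
        self.paused = False

    def write(self, key: str, value: str) -> None:
        if self.paused:
            # Connections are halted: hold the write instead of applying it.
            self.held.append((key, value))
            return
        self.active[key] = value
        if self.active is self.old:
            self.backlog.append((key, value))  # still to be mirrored

    def read(self, key: str) -> Optional[str]:
        # A real proxy would also hold reads while paused; the toy does not.
        return self.active.get(key)

    def pause(self) -> None:
        self.paused = True

    def catch_up_and_switch(self) -> None:
        # Replay everything the new side is missing, then drop the old side.
        for key, value in self.backlog:
            self.new[key] = value
        self.backlog.clear()
        self.active = self.new
        self.old = None

    def resume(self) -> None:
        self.paused = False
        held, self.held = self.held, []
        for key, value in held:
            self.write(key, value)


old_db = {"a": "1"}
new_db = {"a": "1"}  # initial mirror copy
proxy = CutoverProxy(old_db, new_db)
proxy.write("b", "2")  # lands on old, queued for the mirror
proxy.pause()
proxy.write("c", "3")  # held during the brief halt
proxy.catch_up_and_switch()
proxy.resume()
print(new_db)
```

The write held during the pause only ever reaches the new backend, which is the point of halting connections for the catch-up step.
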
rm_workyeah the thing i've run into is a DB team that thought it'd be helpful of them to use read-only mode for cutovers rather than a hard outage, and some services (at least octavia) that are coded to understand and retry on that, but write failures cause them to behave BADLY14:35
rm_worktrying to figure out if it's reasonable and normal to just do hard-down for maintenance on the DB briefly, and if most services play nice with that14:36
rm_work * yeah the thing i've run into is a DB team that thought it'd be helpful of them to use read-only mode for cutovers rather than a hard outage, and some services (at least octavia) that are coded to understand and retry on hard-outage, but write failures cause them to behave BADLY14:37
*** ykarel is now known as ykarel|away14:38
rm_worksorry for probably misusing your channel, I have a kind of bad habit of that since it's the best place I know of to catch a specific set of people 😅14:39
clarkbno problem, I just wanted to be clear that we don't really have direct experience with that problem and openstack. Though I suppose other channel lurkers may (like mnaser?)14:39
rm_workyeah I was about to ping him directly :P14:40
clarkbcorvus: yesterday when we were trying to get the system-config change to specify review02 as the new gerrit server tested and landed I pushed a new patchset for the change and zuul didn't evict the old patchset as quickly as I expect it would.15:20
clarkbcorvus: https://review.opendev.org/c/opendev/system-config/+/797563 is the change and it was patchset 5 in check when I pushed patchset 6. I don't think this is currently urgent but it occurred to me that that may indicate starvation in the pipeline processing loops?15:22
clarkbwanted to call it out in case others notice similar15:22
dtantsurhey! are there any mirror problems with opensuse nodes? https://zuul.opendev.org/t/openstack/build/4ba8493813d440998547da49825f7440/log/job-output.txt#67315:34
clarkbdtantsur: we may have synced bad state from our upstream mirror15:35
clarkblooks like we last synced opensuse 18 days ago. The upstream we are using has a different repomd.xml that points at a file present in the upstream dir http://mirror.us.leaseweb.net/opensuse/update/leap/15.2/oss/repodata/15:39
clarkbVLDB: vldb entry is already locked15:41
clarkbthat is why we aren't updating that volume. I'll dig into that15:41
dtantsurthank you!15:45
clarkbI don't see any running vos release for that on the system that does the vos releases. I've held the flock we use to do the mirror updates for opensuse and will break the vldb lock and manually run the mirror update script15:47
*** marios is now known as marios|out16:29
*** amoralej is now known as amoralej|off16:57
clarkbdtantsur: I think you should be good now. I'm rerunning the update manually one more time to convince myself that it is happy on the update side but the mirrors show the new content as expected17:03
dtantsurgreat, thanks! I'll create a test patch17:04
clarkbinfra-root I've updated https://gerrit-review.googlesource.com/c/gerrit/+/312302/ with tests and if those pass in upstream CI (figuring out how to run them locally was an experience) I'll see what I can do to get reviews upstream17:07
clarkbdtantsur: thank you for letting us know17:09
*** rpittau is now known as rpittau|afk17:38
opendevreviewChao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation  https://review.opendev.org/c/zuul/zuul-jobs/+/80137018:03
opendevreviewChao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation  https://review.opendev.org/c/zuul/zuul-jobs/+/80137018:04
opendevreviewmelanie witt proposed openstack/project-config master: Set launchpad bug Fix Released after adding comment  https://review.opendev.org/c/openstack/project-config/+/80137618:54
timburkei saw the all-clear notice went out a while ago, but i'm still getting redirects to the maintenance page when i go to https://review.opendev.org/ -- is that expected?19:22
timburkei'm also seeing errors like "ssh: connect to host review.opendev.org port 29418: Connection refused" if i use git/git-review, which makes me curious about how the patches just above got submitted :-/19:24
timburkemaybe i've got some stale dns? review.opendev.org and review.openstack.org both seem to resolve to 104.130.246.32 for me, fwiw19:27
opendevreviewmelanie witt proposed openstack/project-config master: Set launchpad bug Fix Released after adding comment  https://review.opendev.org/c/openstack/project-config/+/80137619:33
Clark[m]Yes, that is the old DNS record19:33
Clark[m]timburke: any idea what might be holding on to that value? We lowered the ttl to 5 minutes last week and prior to that it was 60 minutes. Both much shorter than the time between now and when we updated dns19:34
timburkeseems to be something on my end -- dig's telling me there's a TTL of 0 (!!) coming from SERVER: 127.0.0.53#53(127.0.0.53) :-(19:36
timburkedefinitely user error! turns out i've got something in /etc/hosts with a comment like "WTF IPv6 (Nov 2020)" 🤣19:37
timburkeignore me :-)19:37
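
The stale /etc/hosts entry that caught timburke can be spotted with a small check like the one below. It is a sketch that assumes the usual "address name [name ...]" layout of /etc/hosts; the sample file content is hypothetical (only the address and the comment text appear in the log), and hosts_overrides is an illustrative name.

```python
from typing import List


def hosts_overrides(hosts_text: str, hostname: str) -> List[str]:
    """Return the addresses that /etc/hosts style text pins for hostname."""
    wanted = hostname.lower()
    found = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        address, *names = line.split()
        if wanted in (name.lower() for name in names):
            found.append(address)
    return found


# Hypothetical file content; the address is the stale one quoted above.
sample = """\
127.0.0.1 localhost
# WTF IPv6 (Nov 2020)
104.130.246.32 review.opendev.org review.openstack.org
"""
print(hosts_overrides(sample, "review.opendev.org"))
```

An empty result means the name is not pinned locally and resolution falls through to DNS.
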
opendevreviewChao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation  https://review.opendev.org/c/zuul/zuul-jobs/+/80137019:59
opendevreviewChao Zhang proposed zuul/zuul-jobs master: Update commits since tag calculation  https://review.opendev.org/c/zuul/zuul-jobs/+/80137020:00
clarkbI've made my edits to the team meeting agenda. I'll hold off on sending it until ianw can check it for any missing important items/details20:32
clarkbplease add your edits soon though :)20:32
clarkbinfra-root rax rebooted eavesdrop01.opendev.org a few minutes ago. A heads up if you notice bots acting weird20:58
ianwo/22:24
clarkbianw: good morning. Its been really quiet. I think things went well. I did leave a -1 over a small thing on your change to fix the lp creds file22:25
ianwagenda lgtm thanks22:25
ianwyep i checked my mail this morning to see if i had a bunch of "revert" emails but thankfully not :)22:26
clarkbianw: I was also going to suggest maybe you double check backups and such now that the ~gerrit2/tmp exclusion landed and all the jobs for that should've run22:26
clarkbbut other than that I think it's mostly answering the occasional question that came up (timburke had an /etc/hosts override for review and another person was looking for firewall update details)22:26
ianwcan do22:27
clarkboverall looking really good. I've been working on my java as a result :) wrote tests for my openid fix and am communicating with a reviewer on that now. I'm hopeful we can get this landed22:27
ianw++ it would be great to not deal with that again!22:28
ianwsomehow we've got two scripts trying to backup the review db22:32
mordredthat seems less than optimal22:33
ianwoh, it's an old one from before it got the "use a local mariadb flag"22:34
mordredclarkb: what's your gerrit change link?22:34
mordredclarkb: nm. I see in backscroll22:34
clarkbmordred: https://gerrit-review.googlesource.com/c/gerrit/+/31230222:34
ianwthere is also a cron job for /usr/local/bin/track-upstream22:34
ianwwhich i think we removed right?22:34
clarkbianw: I know fungi was working on that, but I don't know if it landed /me looks in git logs22:34
clarkba change titled "Good riddance to track-upstream and its cronjob" did merge22:35
clarkbianw: I think you can remove that cronjob22:35
clarkbthat change was in system-config22:35
ianwI0d6edcc34f25e6bfe2bc41d328ac76618b59f62d yep; ok i'll remove the entry22:35
clarkbmordred: I was hoping to get some feedback on my assertions there before I push a new patchset, but I'll probably push a new patchset at EOD if I don't hear back before then just to keep things moving22:36
ianwok, root now runs only the two cron jobs for the backups22:36
clarkbagenda has been sent22:43
clarkbianw: one good side effect of keeping the maintenance banner up on review01 has been that it is abundantly clear you are talking to the wrong gerrit22:47
clarkbwe might want to update the text to say something like "This server has moved. If you are seeing this page then double check your DNS resolution and /etc/hosts file for review.opendev.org." ? Though that may be a one off22:48
ianwyeah we can change it to "if you are seeing this, you're in the wrong place" :)22:48
ianwif infra-root wants to audit their home dirs etc. for anything they feel is important and migrate it, we can probably shut it down after that22:49
clarkbI'll make a note to do that tomorrow22:50
clarkbI do want to preserve the gerrit account cleanup records I've been keeping. I can move those22:50
opendevreviewIan Wienand proposed openstack/project-config master: afs graphs: track openeuler mirror volume  https://review.opendev.org/c/openstack/project-config/+/80139722:58
ianwclarkb: ^ i think this is what pushed afs up recently and will give a more complete picture in the dashboard22:59
ianwit would probably be good to have a stacking graph that shows all the volumes usage in context23:00
clarkbianw: ya the opensuse mirror stopped updating (stale lock) and when I went looking I expected it was that mirror. There was talk elsewhere about maybe doing alma linux and debian is 5GB below its quota limit23:00
clarkbanyway wanted to discuss if we thought we needed more disk and if the mirror.yum-puppetlabs is used23:01
clarkbI went ahead and approved ^ since it seems straightforward23:02
clarkbianw: if only we could make the distros smaller :)23:04
fungiclarkb: ianw: i'm home and skimming nick highlights, but not really here properly until tomorrow... i did delete the track-upstream cronjobs on both the old and new review server, if you check there should have been a sudo crontab -e logged when i did it too. perhaps something put it back?23:06
fungii guess wait and see if it reappears23:06
ianwwe may have had a run against an older system-config at some point23:07
clarkbfungi: ya no rush or worries at the moment. I think it will likely just be a bunch of small updates here and there as we find things to improve23:09
fungiJul 15 13:06:18 review02 sudo:    fungi : TTY=pts/6 ; PWD=/home/fungi ; USER=root ; COMMAND=/usr/bin/crontab -e23:09
fungii wonder what replaced it23:09
clarkbfungi: if you do have a moment https://review.opendev.org/c/opendev/system-config/+/800274 is one that I'd like to land on Wednesday probably (I should have time to watch and monitor as it goes in). Again no rush given that timeframe but review always welcome23:12
fungii'll load it up in my gertty at least, maybe that'll remind me23:12
opendevreviewIan Wienand proposed opendev/system-config master: Point cacti at review02 explicitly  https://review.opendev.org/c/opendev/system-config/+/80139923:13
ianw^ that's one i just thought of, i'm pretty sure cacti is still hanging on to talking to the old server, but it's better to be clear about it23:13
clarkbianw: review.openstack.org points to the new server so it should be getting data from the new one now23:14
clarkbbut explicit is nice23:14
opendevreviewMerged openstack/project-config master: afs graphs: track openeuler mirror volume  https://review.opendev.org/c/openstack/project-config/+/80139723:19
ianwyeah i guess http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=27&rra_id=all looks right23:21
ianwi'm really not sure about the load average results though http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=26&rra_id=all23:21
clarkbthat looks about right given what I recall from memory before? It might be a bit lower now if we don't have as much memory/io/disk contention23:22
ianwalso the "used memory" doesn't seem to show up 23:23
ianwi wonder if cacti isn't so happy with something that focal is doing23:24
clarkbianw: sometimes bouncing the snmpd service on the host (review02 in this case) is sufficient to make things happy again23:25
clarkbbut it could also be due to the size of the values (they are much larger now)23:25
ianwi guess this is not worth too much effort given replacement plans23:26
opendevreviewMerged openstack/diskimage-builder master: Convert multi line if statement to case  https://review.opendev.org/c/openstack/diskimage-builder/+/73447923:31
