Thursday, 2020-10-15

*** gyee has quit IRC  [01:04]
*** rcernin has quit IRC  [02:26]
*** rcernin has joined #openstack-swift  [02:49]
*** evrardjp has quit IRC  [04:33]
*** evrardjp has joined #openstack-swift  [04:33]
*** mikecmpbll has quit IRC  [05:15]
*** mikecmpbll has joined #openstack-swift  [05:18]
*** dsariel has joined #openstack-swift  [05:56]
*** rpittau|afk is now known as rpittau  [07:22]
*** rcernin has quit IRC  [07:35]
*** rcernin_ has joined #openstack-swift  [07:35]
*** rcernin_ has quit IRC  [07:42]
*** MooingLemur has quit IRC  [07:45]
*** MooingLemur has joined #openstack-swift  [07:46]
*** mikecmpbll has quit IRC  [08:10]
*** tkajinam is now known as tkajinam|away  [09:17]
*** tkajinam|away is now known as tkajinam  [09:17]
*** rcernin_ has joined #openstack-swift  [09:56]
*** rcernin_ has quit IRC  [10:25]
*** rcernin_ has joined #openstack-swift  [10:42]
*** rcernin_ has quit IRC  [11:06]
*** fingo has quit IRC  [11:25]
*** rcernin_ has joined #openstack-swift  [11:59]
*** tkajinam has quit IRC  [14:20]
*** rcernin_ has quit IRC  [14:27]
*** dsariel has quit IRC  [15:11]
*** gyee has joined #openstack-swift  [15:39]
*** rpittau is now known as rpittau|afk  [15:52]
ormandj: is there a backport of https://opendev.org/openstack/swift/commit/754defc39c0ffd7d68c9913d4da1e38c503bf914 to ussuri?  [16:28]
ormandj: with victoria being 20.04 only, and that being a critical issue for us, we're hoping it's possible ;)  [16:29]
timburke: ormandj, not yet. i haven't checked how cleanly it would apply, but i can look into it. fwiw, though, i wholly expect victoria swift to work on older versions of ubuntu, and to play well with an otherwise-ussuri openstack install  [16:32]
ormandj: timburke: i think ubuntu cloud archive is only building for >=20.04  [16:40]
ormandj: we're working on getting that all together because some of the fixes in victoria are pretty huge for the big ticket issues we have  [16:40]
ormandj: but it's not an overnight process  [16:40]
timburke: huh. http://ubuntu-cloud.archive.canonical.com/ubuntu/dists/focal-updates/victoria/main/binary-amd64/Packages lists swift 2.25.1... though 2.26.0 is in focal-proposed, so i guess everything's on-track  [16:52]
timburke: i don't see any differences looking at the package dependencies (which makes sense; we make a point of not bumping deps unnecessarily) so i think you might be good to just pull down the victoria swift packages and install them on bionic  [16:54]
timburke: (fair warning: i've never tried it. indeed, i usually don't use distro packages at all -- i'm usually working from source, being a dev and all, and when we need to upgrade swift on our clusters, we build our own packages)  [16:56]
ormandj: timburke: yeah, we'll try to figure it out  [17:11]
ormandj: second one: testing a mass rebalance on dev nodes, some data has gone 404. using swift-get-nodes to get the location then checking each of the primary/handoffs, the data isn't there  [17:12]
ormandj: by mass rebalance i mean adding a new node into the ring and putting weight at 100% effectively at once  [17:12]
ormandj: is that expected due to partition location changes and self-rectifying as rebalancing completes?  [17:12]
ormandj: i didn't expect that, and i haven't used swift-get-nodes on the old ring files to see if the 'old' data still exists  [17:13]
ormandj: but we definitely serve 404s now for data that was there  [17:13]
timburke: ormandj, triple replica, right? how quickly did you rebalance the ring? ever since https://github.com/openstack/swift/commit/ce26e789 only one assignment should change per rebalance, so i would've expected the other two primary locations to still have it...  [17:20]
timburke: how quickly *and how many times*  [17:20]
ormandj: yes, triple replica  [17:22]
timburke: you had three beefy nodes before, right? is the new one roughly the same size as the others, or considerably larger?  [17:23]
ormandj: larger  [17:23]
ormandj: one ring change to add it, at full capacity, then a month later, one more ring change, then a week later one more  [17:24]
timburke: that seems perfectly reasonable -- do we know whether the object was still accessible at the intermediary stages?  [17:25]
ormandj: no, 404ing almost at the very beginning  [17:26]
ormandj: unfortunately don't have logs going back far enough to determine if a DELETE went through  [17:26]
timburke: i was just about to ask about a sanity check there :-)  [17:27]
ormandj: yeah, it's still in the container db  [17:27]
ormandj: but it _is_ possible a DELETE went through for it, and the container db just didn't get the update  [17:27]
ormandj: but i'd expect that to eventually have caught up, too  [17:27]
ormandj: if all the data itself is actually purged  [17:27]
timburke: you have this habit of answering my next question before i ask it :P  [17:27]
ormandj: i just looked at the old rings (backups) and the locations the data 'should' be in, checked those locations, it's definitely not there  [17:28]
ormandj: based on the old rings  [17:28]
ormandj: it's actually the same location the 'new' rings show it should exist in  [17:28]
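
[Editor's note: a minimal sketch of the old-vs-new ring comparison ormandj describes, using swift's Python ring API. The ring paths and the account/container/object names are placeholders, not values from this conversation.]

    # compare_rings.py -- which primary assignments changed for one object?
    from swift.common.ring import Ring

    ACCOUNT, CONTAINER, OBJ = 'AUTH_example', 'example-container', 'some/object'

    old_ring = Ring('/etc/swift/backups/object.ring.gz')  # saved pre-rebalance copy
    new_ring = Ring('/etc/swift/object.ring.gz')

    def primaries(ring):
        part, nodes = ring.get_nodes(ACCOUNT, CONTAINER, OBJ)
        return part, sorted((n['ip'], n['device']) for n in nodes)

    old_part, old_nodes = primaries(old_ring)
    new_part, new_nodes = primaries(new_ring)
    print('old part %s: %s' % (old_part, old_nodes))
    print('new part %s: %s' % (new_part, new_nodes))
    print('unchanged primaries: %s' % sorted(set(old_nodes) & set(new_nodes)))
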
timburke: when you deliver new rings, do you have a feel for how long the gap is between the first node getting the new ring and the last one getting it? rledisez had a ~30min window that led him to observe https://bugs.launchpad.net/swift/+bug/1897177  [17:30]
openstack: Launchpad bug 1897177 in OpenStack Object Storage (swift) "Race condition in replication/reconstruction can lead to loss of datafile" [High,In progress]  [17:30]
ormandj: about 5 seconds  [17:30]
timburke: yeah, negligible. good  [17:30]
timburke: check quarantine dirs?  [17:31]
ormandj: hm, protip on doing that?  [17:31]
ormandj: i did check the handoff nodes fwiw  [17:32]
ormandj: not seeing a quarantined dir in the /srv/node/driveID  [17:34]
timburke: hrm. yeah, i would've checked with something like `find /srv/node*/*/quarantined`  [17:35]
ormandj: yeah, no such directory  [17:36]
ormandj: big async_pending  [17:36]
timburke: might have a delete record to send to the container  [17:40]
timburke: how big is the container?  [17:40]
ormandj: i'm sure huge  [17:41]
ormandj: is there a way to look for a delete record pending?  [17:41]
ormandj: we want to make sure this is a result of a client operation  [17:41]
ormandj: not the server(s)  [17:41]
ormandj: but we don't have client logs going back 2 months  [17:41]
timburke: each file is just a pickled dict iirc -- https://review.opendev.org/#/c/725429/1/swift/cli/async_inspector.py almost seems too simple for me to bother pushing on ;-)  [17:43]
patchbot: patch 725429 - swift - Add a tool to peek inside async_pendings - 1 patch set  [17:43]
timburke: how's your reclaim age? might want to bump it up if we're worried about how big the async pile is getting...  [17:44]
timburke: https://gist.github.com/clayg/249c5d3ff580032a0d40751fc3f9f24b may be useful (both to get a feel for the state of the system, and as a starting point to go looking for a specific async)  [17:47]
timburke: though given the suffix/hash from swift-get-nodes, it probably wouldn't be so hard to find it anyway...  [17:49]
timburke: something like `find /srv/node*/*/async*/${SUFFIX}/${HASH}*`  [17:50]
cwright: timburke: reclaim_age is set to 2592000  [17:51]
timburke: then, assuming you find something, crack it open and make sure it really was for a delete  [17:51]
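
[Editor's note: a minimal sketch of "cracking open" an async_pending, assuming the on-disk format timburke describes (each file is a pickled dict); the path is a placeholder for whatever the find above turns up.]

    # peek_async.py <path-to-async_pending-file>
    import pickle
    import pprint
    import sys

    path = sys.argv[1]  # e.g. /srv/node/d1/async_pending/f00/<hash>-<timestamp>
    with open(path, 'rb') as fp:
        update = pickle.load(fp)
    pprint.pprint(update)
    # expect something like:
    # {'op': 'DELETE',           # or 'PUT'
    #  'account': 'AUTH_...', 'container': '...', 'obj': '...',
    #  'headers': {...}}
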
timburke: cwright, so 30 days -- that might not be long enough, if we're legit worried that this was deleted a couple months ago and never made it back to the container server...  [17:52]
*** whmcr has joined #openstack-swift  [17:55]
timburke: :-/ i should help clayg make that async stats script work on py3  [17:56]
whmcr: @timburke we're assuming that suffix & hash are the parts of the filepath that come after the partition, i.e. /srv/node/DRIVEID/objects/PARTITIONID/SUFFIX/HASH, if so no dice on that  [18:03]
timburke: yup, that was the idea  [18:04]
timburke: :-(  [18:04]
*** djhankb has quit IRC  [18:07]
timburke: ok, so https://gist.github.com/tipabu/abf38940d49d67d33fe98b957f9306a6 should work on py3 and i taught it to report the age of the oldest async it can find  [18:17]
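
[Editor's note: not the linked gist -- just a minimal sketch of the same idea, assuming the usual /srv/node/<dev>/async_pending*/<suffix>/<hash>-<timestamp> layout: count the pending updates and report the age of the oldest one.]

    # async_stats.py -- rough count and oldest-age report for async_pendings
    import glob
    import os
    import time

    now = time.time()
    count, oldest = 0, now
    for path in glob.glob('/srv/node/*/async_pending*/*/*'):
        count += 1
        name = os.path.basename(path)
        try:
            # filename is <hash>-<timestamp>; strip any _offset suffix
            ts = float(name.rsplit('-', 1)[1].split('_')[0])
        except (IndexError, ValueError):
            ts = os.path.getmtime(path)  # fall back to mtime if the name is odd
        oldest = min(oldest, ts)

    print('asyncs found: %d' % count)
    if count:
        print('oldest is %.1f hours old' % ((now - oldest) / 3600))
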
whmcr: running it now  [18:20]
whmcr: count is already >70k, oldest (so far at least) is from July  [18:21]
timburke: 😳  [18:23]
*** dsariel has joined #openstack-swift  [18:23]
timburke: 70k may or may not be something to worry about, but we definitely need to bring that reclaim age up  [18:23]
timburke: at whatever point the updater gets around to that async from July, it's not making any container requests; it's just going to delete it  [18:24]
timburke: how's the updater tuned? in particular, what've you got for workers, concurrency, objects_per_second?  [18:25]
timburke: probably also want to get a feel for your success rate for processing updates  [18:29]
cwright: timburke: those three settings are still using defaults: concurrency = 8, updater_workers = 1, objects_per_second = 50  [18:30]
timburke: my gut says you probably want to turn up workers, considering how dense your chassis are  [18:33]
timburke: ormandj, just had a thought, since you mentioned still having backups of the old rings. have you compared the results you get from swift-get-nodes between them? i would expect there to be at least one disk assignment that didn't change, though i suppose i could be wrong...  [18:36]
whmcr: yup, we've done that against pre-adding the node  [18:36]
whmcr: all match up  [18:36]
timburke: wait, so *none* of the assignments changed with the rebalance? this seems increasingly like it *must've* been deleted, but who knows when...  [18:37]
timburke: fwiw, one of the tricks we've got for server logs is to push them back into the cluster as part of log rotation, under a .logs account that only reseller admins can access  [18:40]
*** djhankb has joined #openstack-swift  [18:40]
timburke: might want to check recon cache, looking for object_updater_sweep time  [18:43]
whmcr: sorry, the drive assignments do change, but the files are not there on any of the versions we've checked  [18:44]
openstackgerrit: Tim Burke proposed openstack/python-swiftclient master: Remove some py38 job cruft  https://review.opendev.org/758479  [18:47]
timburke: whmcr, did all of the assignments change? just one? two?  [18:48]
whmcr: looks like one of the non-[Handoff] ones changes, and then all of the [Handoff] ones change  [18:49]
timburke: so the primaries (non-handoffs) that *didn't* change should be pretty authoritative -- if they don't have it (either in objects/PARTITIONID/SUFFIX/HASH or quarantined/objects/HASH) it was most likely deleted  [19:02]
timburke: when you found it in listings, was that from just one replica of the container DB, or looking across all of them?  [19:03]
whmcr: listing was from an s3 client doing a GET on the container for an object listing  [19:04]
timburke: might be worth doing direct listings to each container server with limit=1 and prefix=<object>, see if they seem to agree that it should exist  [19:08]
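
[Editor's note: a minimal sketch of the per-replica listing check suggested above, using swift's internal direct_client; the account/container/object names are placeholders.]

    # check_listings.py -- ask each container replica whether it lists the object
    from swift.common.direct_client import direct_get_container
    from swift.common.ring import Ring

    ACCOUNT, CONTAINER, OBJ = 'AUTH_example', 'example-container', 'some/object'

    ring = Ring('/etc/swift/container.ring.gz')
    part, nodes = ring.get_nodes(ACCOUNT, CONTAINER)
    for node in nodes:
        headers, listing = direct_get_container(
            node, part, ACCOUNT, CONTAINER, prefix=OBJ, limit=1)
        names = [o['name'] for o in listing]
        print('%s/%s: %s' % (node['ip'], node['device'],
                             'listed' if OBJ in names else 'not listed'))
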
timburke: or even drop into sqlite3 and query for it directly. if you go that route, note that you'll probably want to include an 'AND deleted in (0, 1)' clause to take advantage of the deleted,name index  [19:09]
timburke: but then you can also see the tombstone row (if it exists)  [19:10]
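
[Editor's note: a minimal sketch of the sqlite route, including the 'AND deleted in (0, 1)' clause so the deleted,name index gets used and any tombstone row shows up; the DB path and object name are placeholders.]

    # query_container_db.py -- look for one object row (live or tombstone)
    import sqlite3

    DB_PATH = '/srv/node/d1/containers/<part>/<suffix>/<hash>/<hash>.db'
    OBJ = 'some/object'

    conn = sqlite3.connect(DB_PATH)
    rows = conn.execute(
        'SELECT name, created_at, deleted FROM object '
        'WHERE name = ? AND deleted IN (0, 1)', (OBJ,)).fetchall()
    for name, created_at, deleted in rows:
        print(name, created_at, 'tombstone' if deleted else 'live')
    if not rows:
        print('no row at all for %r' % OBJ)
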
openstackgerrit: Tim Burke proposed openstack/swift master: Clarify some object-updater settings  https://review.opendev.org/758488  [19:23]
*** gregwork has quit IRC  [19:26]
ormandj: timburke: for clarity, reclaim_age being too short when asyncs haven't updated container.db means data that _should_ be purged won't be, if the async container update doesn't go through prior?  [20:00]
ormandj: i.e. dark data will be left on the system that shouldn't be there  [20:01]
timburke: so reclaim_age can bite you two ways at the object layer if it's too short: you might reap some object-server tombstones (*.ts files) before all of the *.data have had a chance to get cleaned up, leading to dark data -- OR you might give up on ever getting an async pending through to the container layer, leading to either dark data (if the async was for a PUT) or ghost listings (if the async was for a DELETE)  [20:06]
ormandj: copy. we'll set it really large then until all this is caught up ;) the key is making sure it wouldn't result in objects going missing that shouldn't be  [20:06]
timburke: at the container layer, a too-short reclaim age pretty much always leads to ghost listings, where one replica goes offline for a while then comes back and syncs with other copies that had & reclaimed a deleted row for some of the objects  [20:07]
ormandj: we'll get that out of the way first, then crank up the updater workers, some of these container.dbs are showing an update time from july  [20:07]
ormandj: with lots of asyncs pending for them  [20:08]
timburke: nope -- having it too high just means you're using up some inodes "unnecessarily" -- i'd definitely err on the side of too high rather than too low  [20:08]
ormandj: perfect  [20:08]
ormandj: timburke: updating the worker count, anything else we can do to push these asyncs through?  [20:17]
ormandj: container dbs are on ssds  [20:17]
timburke: ormandj, might check to see if you've got https://review.opendev.org/#/c/741753/ in your swift -- if not, you can kick up your container replicator interval to like 48hrs or something until asyncs settle down  [20:21]
patchbot: patch 741753 - swift (stable/ussuri) - Breakup reclaim into batches (MERGED) - 1 patch set  [20:21]
ormandj: looking  [20:23]
ormandj: timburke: unfortunately, i don't think that's in the ussuri cloud packages we have  [20:25]
ormandj: don't see the other_stuff function in the db.py  [20:25]
timburke: the fix should be in 2.26.0, 2.25.1, and 2.23.2  [20:29]
timburke: again, you can work around it by temporarily prolonging your container-replicator cycle time -- it's just a thing we've seen where the replicator may hold a long lock while reclaiming deleted rows  [20:31]
ormandj: yeah, those releases didn't get built in ubuntu cloud archive  [20:36]
ormandj: 2.25.1 that is  [20:36]
ormandj: just checked, latest is still 2.25.0  [20:36]
ormandj: we'll set the replication interval to 48 hours for the container replicator, update updater_workers to 4, and set reclaim_age to 120 days  [20:37]
ormandj: we're hoping that's enough to munch on these asyncs, we are way behind in this cluster  [20:39]
ormandj: last-modified on the container db is something like july 06, heh, on this one container  [20:39]
ormandj: we stopped that script and it was over 3 million asyncs  [20:40]
timburke: certainly a bunch, but with ssds and a tuned-down container-replicator it should be quite manageable. you've got this!  [20:44]
openstackgerrit: Tim Burke proposed openstack/python-swiftclient master: Allow tempurl times to have units  https://review.opendev.org/758500  [21:10]
klamath_atx: @timburke I upgraded our lab, the only weirdness i'm seeing right now is container-reconciler is having issues connecting to remote memcache servers, is that a known upgrade issue?  [21:26]
timburke: klamath_atx, i've not seen that before :-/  [21:29]
klamath_atx: gotcha, just wanted to check in before i start spinning wheels  [21:33]
mattoliverau: morning  [21:52]
*** rcernin_ has joined #openstack-swift  [22:03]
openstackgerrit: Tim Burke proposed openstack/swift master: Optimize swift-recon-cron a bit  https://review.opendev.org/758505  [22:06]
*** rcernin_ has quit IRC  [22:19]
openstackgerrit: Merged openstack/python-swiftclient master: Close connections created when calling module-level functions  https://review.opendev.org/721051  [22:41]
*** tkajinam has joined #openstack-swift  [22:59]
zaitcev: "Firefox can’t establish a connection to the server at review.opendev.org."  [23:58]
