Monday, 2019-09-09

*** rcernin has quit IRC00:30
*** ccamacho has quit IRC00:51
*** rcernin has joined #openstack-swift01:45
*** baojg has quit IRC02:47
*** baojg has joined #openstack-swift02:48
*** gkadam has joined #openstack-swift03:57
*** e0ne has joined #openstack-swift06:09
*** e0ne has quit IRC06:18
openstackgerritMatthew Oliver proposed openstack/swift master: PDF Documentation Build tox target  https://review.opendev.org/679898  06:18
*** ccamacho has joined #openstack-swift06:42
*** rcernin has quit IRC07:02
*** tesseract has joined #openstack-swift07:13
*** aluria has quit IRC07:33
*** aluria has joined #openstack-swift07:38
*** pcaruana has joined #openstack-swift07:49
*** e0ne has joined #openstack-swift08:19
*** tkajinam has quit IRC08:42
*** rcernin has joined #openstack-swift10:08
*** spsurya has joined #openstack-swift10:21
*** rcernin has quit IRC12:22
*** gkadam has quit IRC12:55
*** BjoernT has joined #openstack-swift12:56
*** e0ne has quit IRC13:19
*** camelCaser has quit IRC13:28
*** BjoernT_ has joined #openstack-swift13:41
*** BjoernT has quit IRC13:44
*** NM has joined #openstack-swift13:48
*** e0ne has joined #openstack-swift13:50
NMHello everyone. Mind if someone can point me in a direction: one of our sharded containers is reporting it has zero objects and its "X-Container-Sharding" header returns False. The container.recon confirms it's sharded ("state": "sharded").13:52
NMOne thing we found out: although we work with 3 replicas, there are 6 DB files: 3 regular ones and 3 handoffs. Are there any tips to "fix" this container?13:52
*** BjoernT_ has quit IRC13:58
*** BjoernT has joined #openstack-swift14:08
*** BjoernT_ has joined #openstack-swift14:12
*** BjoernT has quit IRC14:13
*** zaitcev has joined #openstack-swift14:59
*** ChanServ sets mode: +v zaitcev14:59
*** diablo_rojo__ has joined #openstack-swift15:05
timburkeNM: sounds a lot like https://bugs.launchpad.net/swift/+bug/1839355  15:06
openstackLaunchpad bug 1839355 in OpenStack Object Storage (swift) "container-sharder should keep cleaving when there are no rows" [Undecided,In progress] - Assigned to Matthew Oliver (matt-0)15:06
timburkegood news is, mattoliverau's proposed a fix at https://review.opendev.org/#/c/675820/  15:06
patchbotpatch 675820 - swift - sharder: Keep cleaving on empty shard ranges - 4 patch sets15:06
*** gyee has joined #openstack-swift15:07
*** NM has quit IRC15:52
*** diablo_rojo__ is now known as diablo_rojo16:00
openstackgerritThiago da Silva proposed openstack/swift master: WIP: new versioning mode as separate middleware  https://review.opendev.org/681054  16:00
*** tesseract has quit IRC16:07
*** e0ne has quit IRC16:10
*** spsurya has quit IRC16:27
*** diablo_rojo has quit IRC16:49
*** diablo_rojo has joined #openstack-swift17:02
*** NM has joined #openstack-swift17:09
*** camelCaser has joined #openstack-swift17:19
NMthanks timburke. Meanwhile, is there any workaround? Would it be safe to delete the DBs on the handoff nodes?17:20
*** camelCaser has quit IRC17:28
*** camelCaser has joined #openstack-swift17:28
timburkeNM, most likely yes, it's safe. it also shouldn't really be *harming* anything, though...17:58
claygtdasilva: timburke: should I rebase p 673682 on top of p 681054 - or wait until we can talk more about it?17:58
patchbothttps://review.opendev.org/#/c/673682/ - swift - s3api: Implement versioning status API - 2 patch sets17:58
patchbothttps://review.opendev.org/#/c/681054/ - swift - WIP: new versioning mode as separate middleware - 1 patch set17:58
timburkeNM, why are we worried about it? is it making the account stats flop around or something?17:58
claygmattoliverau: IIRC, you seemed pretty vocal about versioned writes in the meeting last week - would you be available to talk about what our strategy should be ahead of the Wednesday meeting?17:59
timburkeclayg, heh, i've been tinkering with a rebase of that on top of https://review.opendev.org/#/c/678962/ ;-)17:59
patchbotpatch 678962 - swift - WIP: Add another versioning mode with a new naming... - 2 patch sets17:59
claygtimburke: I think the big question is going to be "can we commit to land this BEFORE we go to China" ???17:59
timburke👍18:00
tdasilvare the idea of auto-creating the versioned container, is it out of the question to create it in a .account?18:01
timburketdasilva, i think that'd throw off the account usage reporting a lot...18:02
*** zaitcev has quit IRC18:02
clayg^ 👍18:02
tdasilvaI know there was a concern about accounting, but I was wondering if versions should actually be a separate "count" from objects? that way users would know how many versions they have for a given account18:02
claygI mean if the counts and bytes went.. oh.. i see what you mean...18:03
claygbut the bytes need to be billable18:03
claygoh but still - "bytes of versions" vs "bytes of objects"18:03
timburkeeven if your billing system were to assign bytes-used from .versions-<acct> to <acct>, the client gets no insight into which containers are costing them18:03
clayghrmm...18:03
tdasilvaright, but should that be a separate stat?18:03
tdasilvaright18:03
tdasilvatimburke: they get that from the ?versions list, no?18:04
clayg😬 we need to think very carefully about this 🤔18:04
tdasilvaand the container stat could also have a count of versions? maybe?18:04
claygwe should definitely look at how s3 does it 🤣18:04
tdasilva^^^ !18:05
tdasilvai think they charge even for delete markers18:05
claygyeah - two HEAD requests to fulfill a single client request is fine... if that's all we needed and it works better i think i could get behind that18:05
timburketdasilva, so you'd need to do a ?versions HEAD to each of your... hundreds? thousands? of containers to get an aggregate view? there's definite value in having bytes/objects show up in GET account...18:05
claygyes HEAD on account needs to have all the bytes - but we could have "bytes" and "bytes of versioned" returned by the API - I'm not seeing a good reason to argue that wouldn't work18:06
timburkei guess maybe you could have it do two GETs for the client request, one for the base account, one for the .versions guy... in some way merge the listings...18:06
claygI mean you could even like *annotate* the containers in the bucket list18:07
claygdoes s3 have bucket accounting in list-buckets?18:07
timburkewhat happens when you have a versions container but no base container?18:07
timburkeclayg, nope :P Name and CreationDate as i recall18:08
claygre-vivify?18:08
tdasilvatimburke: should that be allowed to happen?18:08
claygtdasilva: NO18:08
claygbut... eventual consistency 😞18:08
timburke...but eventual consistency will guarantee that it will at some point ;-)18:09
claygwe can consider it an error case tho - and degrade gracefully as long as it's well understood18:09
tdasilvagoing...i need to step out a bit for dinner, be back later18:10
claygas long as it's deterministic and we can explain how the client should proceed depending on what they want - I feel like that'd be workable.18:10
claygso basically re-vivify - every versions container gets an entry in the container listing - if there's not one there you have to bake it up - and the client can either delete all the versions or recreate the container18:11
claygi'm just spitballin18:11
claygI still think we should say "you can't delete this container until you've deleted all the versions" - we'll get it right most of the time18:11
*** e0ne has joined #openstack-swift18:12
NMtimburke: not sure if that is the reason, but I'm getting some HTTP 503s when I try to list the container contents. When I look at my proxy error log I see this message: " ERROR with Container server x.x.x.x:6001/sdb "#012HTTPException: got more than 126 headers"18:15
*** zaitcev has joined #openstack-swift18:15
*** ChanServ sets mode: +v zaitcev18:15
timburkeNM, have you tried curling the container server directly and seeing how many headers are returned? you might try bumping up extra_header_count in swift.conf...18:18
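
A minimal sketch of the swift.conf tweak being suggested here, assuming the [swift-constraints] section; the value shown is arbitrary and only meant to get past the leftover sharding headers until they can be cleaned up:

    [swift-constraints]
    # raise the per-request header cap (the proxy error above was
    # "got more than 126 headers"); pick something comfortably above
    # the number of stray X-Container-Sysmeta-Shard-Context-* headers
    extra_header_count = 2000
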
NMIf I curl directly to the container-server, it returns 1288 headers of X-Container-Sysmeta-Shard-Context-GUIDNUMBER18:19
NM(I was typing that :) )18:19
timburkeO.o18:19
NMAnd they all have the same values: {"max_row": -1, "ranges_todo": 0, "ranges_done": 8, "cleaving_done": true, "last_cleave_to_row": null, "misplaced_done": true, "cursor": "", "cleave_to_row": -1, "ref": "SOME UID IN THE HEADER"}18:20
timburkei expect the primaries are also going to have problems then... we really need to do something to clean up old records :-(18:21
timburkeyou should be able to POST to clear those -- i forget if we support x-remove-container-sysmeta-* like we do for user meta or not though. worst case, curl lets you specify a blank header with something like -H 'X-Container-Sysmeta-Shard-Context-GUIDNUMBER;'18:23
NMtimburke:  So I can assume it's safe to delete then, right? I didn't find anything about this header and I was somewhat curious about it.18:26
timburkeNM, we use it to track sharding progress so we know when it's safe to delete an old DB. if it's not present, we might reshard a container, but that's ok -- everything will continue moving toward the correct state18:28
*** e0ne has quit IRC18:34
timburkeNM, filed https://bugs.launchpad.net/swift/+bug/1843313  18:38
openstackLaunchpad bug 1843313 in OpenStack Object Storage (swift) "Sharding handoffs creates a *ton* of container-server headers" [Undecided,New]18:38
timburkeif you want all the nitty-gritty details, look for "CleavingContext" or "cleaving_context" in https://github.com/openstack/swift/blob/master/swift/container/sharder.py  18:40
timburkethere's a little bit about that metadata in https://docs.openstack.org/swift/latest/overview_container_sharding.html#cleaving-shard-containers -- we should probably include an example of the header, though, to help with search-ability18:42
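
For search-ability, a hedged illustration of one of those headers, assembled from the values NM pasted above; the UUID suffix is a placeholder:

    X-Container-Sysmeta-Shard-Context-<uuid>: {"max_row": -1, "ranges_todo": 0, "ranges_done": 8, "cleaving_done": true, "last_cleave_to_row": null, "misplaced_done": true, "cursor": "", "cleave_to_row": -1, "ref": "<uuid>"}
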
*** NM has quit IRC18:54
*** diablo_rojo has quit IRC18:56
*** diablo_rojo has joined #openstack-swift18:56
*** BjoernT has joined #openstack-swift18:59
*** BjoernT_ has quit IRC19:00
*** NM has joined #openstack-swift19:02
*** e0ne has joined #openstack-swift19:03
*** e0ne has quit IRC19:09
*** e0ne has joined #openstack-swift19:10
*** BjoernT_ has joined #openstack-swift19:12
*** BjoernT has quit IRC19:13
*** henriqueof1 has joined #openstack-swift19:18
*** baojg has quit IRC19:19
*** henriqueof1 has quit IRC19:20
*** henriqueof has joined #openstack-swift19:20
*** NM has quit IRC19:44
*** NM has joined #openstack-swift19:44
claygdude, I have no idea what we're going to do about the listing ordering thing - what a hoser - it needs a column 😢19:54
timburkeclayg, the nice thing about '\x01' is that nothing can squeeze in before it ;-)20:26
timburkemy naming scheme with '\x01\x01' gets even better with https://review.opendev.org/#/c/609843/  20:27
patchbotpatch 609843 - swift - Allow arbitrary UTF-8 strings as delimiters in lis... - 4 patch sets20:27
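
A tiny Python sketch (not Swift code) of the sort-order property being leaned on here: chr(1) sorts below every printable character, so '\x01'-prefixed names always land ahead of anything a client can name. The name layout is hypothetical:

    # hypothetical version-row names: '\x01' + base name + '\x01' + timestamp
    base = 'photo.jpg'
    version_rows = ['\x01%s\x01%016.5f' % (base, ts) for ts in (1.0, 2.0, 3.0)]
    client_names = ['!bang', 'photo.jpg', 'photo.jpg.bak']
    # version rows sort before every client-visible name, because '\x01'
    # (0x01) is smaller than space (0x20) and every other printable character
    print(sorted(version_rows + client_names))
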
claygI mean if it *works* that's *amazing* - I wonder if we could make it an implementation detail to the database somehow?20:28
claygMaybe it's only at the container server or broker just for versioned containers - but the API can still be sane looking?20:28
claygwhy do you need \x01\x01 if \x01 works?20:28
clayga versioned container could potentially have a lot of context - we could know that \x01<timestamp> is like *not* part of the name when returning results20:29
timburkemake it more likely that we'll get the right sort order even if clients include \x01 in names20:29
timburke(provided *they* don't double them up, *too*)20:29
claygso it can still go south if someone puts \x01 or \x01\x01 anywhere *in* the name?  😞20:30
claygI mean if it doesn't *work* - I don't think it's worth it - let's just rewrite the container-server - if it works!?  awesome kludge!  👍20:31
*** notmyname has quit IRC20:32
timburkeidk about getting this down to the broker -- one of the things i really like about how VW works today is that you've got enough visibility as a client to be able to repair things even if there's a bug in how you list/retrieve versions20:32
*** patchbot has quit IRC20:32
timburkefwiw, \x00 would be a kludge that'd *always* work... provided we continue restricting *clients* from creating things with null bytes when we open it up for *internal* requests20:35
NMtimburke: thanks for opening a bug report. I can't see the headers if I send the request to the proxies; I can see them only when I perform a GET to the container-server. I tried to send a POST to the container-server but it complains about a "Missing X-Timestamp header".20:35
claygI like that *operators* have flexibility to "fix" things - I *hate* that versioned writes today makes us worry all the time about "well what if a client goes behind the curtain and messes everything up!" 😞20:35
claygtimburke: yes!  let's do \x00 then?  why doesn't it "always" work - why would you suggest \x01\x01 then!?  😕20:36
timburkeclayg, i'm not sure how deep the "no null bytes in paths" thing goes, though. some unlikely-to-be-used but already valid string seemed "safe enough"20:38
timburke*shrug*20:38
timburkeif clients go and mess things up behind our back, they're on their own; it's undefined behavior. we should do our best to do something "reasonable", but as long as it's not a 500...20:39
timburkeNM, add a -H 'X-Timestamp: 1568061615.94565' or so. it's more or less just a unix timestamp20:41
timburkeuse something like `python -c 'from swift.common.utils import Timestamp; print(Timestamp.now().internal)'` if you *really* want it to be up-to-date ;-)20:41
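
Putting those two pieces together, a hedged example of the direct container-server POST; the device, partition, account, and container in the path are placeholders, and the trailing semicolon inside the quotes is curl's syntax for sending a header with an empty value (which should clear the sysmeta):

    curl -v -X POST \
        -H "X-Timestamp: $(python -c 'import time; print(time.time())')" \
        -H 'X-Container-Sysmeta-Shard-Context-<uuid>;' \
        http://<container-server>:6001/<device>/<partition>/AUTH_<acct>/<container>
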
*** patchbot has joined #openstack-swift20:42
*** notmyname has joined #openstack-swift20:42
*** ChanServ sets mode: +v notmyname20:42
*** mvkr has joined #openstack-swift20:48
NMtimburke: that is what I call up-to-date :)  Anyway, do you see any relation between these headers and the container reporting 0 objects and not listing its own objects?21:04
NM(And also the X-Container-Sharding header being False)21:05
DHEanyone see a problem if I had objects updated every ~5 seconds?21:05
timburkeNM, once the primaries get all those headers, the proxies are always going to get the same 503 from them that you were seeing. presumably there's a container PUT at some point when clients go to upload, which is why we've got a bunch of handoffs and the heap of cleaving metadata. the freshly-created handoff containers would be the only things responsive, so... no objects :-(21:10
timburkei'm guessing that replication's not real happy right now, either...21:10
timburkeDHE, how many objects are we talking?21:13
timburkeand why are we writing them every 5s? are clients going to run into any trouble if they get a stale read?21:15
NMtimburke: should I be worried about this container? At least the containers in .shards_AUTH are responding correctly.21:19
timburkeNM, i think once you get the old cleaving sysmeta out of there, it should sort itself out. you might need to do that periodically for a bit though, at least until https://bugs.launchpad.net/swift/+bug/1843313 is fixed and you can upgrade to a version that includes the fix :-( sorry21:21
openstackLaunchpad bug 1843313 in OpenStack Object Storage (swift) "Sharding handoffs creates a *ton* of container-server headers" [Undecided,New]21:21
timburkeNM, how long has the container been sharded?21:21
timburkeis it on your default storage policy, or some other one?21:22
timburkeif it's non-default, https://bugs.launchpad.net/swift/+bug/1836082 is a worry...21:23
openstackLaunchpad bug 1836082 in OpenStack Object Storage (swift) "Reconciler-enqueuing needs to be shard-aware" [High,Fix released]21:23
timburkewhat version of swift are you on?21:23
timburke(probably should have been my *first* question ;-)21:23
NMtimburke: LOL. I should also have said that earlier. Sharding: It was done 2 or 3 months ago. Replicas: I'm using the default (3 replicas). Version: 2.20.021:24
timburkereplicated vs ec is good to know, but i was thinking more about how many storage policies you've got defined in swift.conf and whether this container's policy matches whichever is flagged as default in that config21:26
timburkethat reconciler bug definitely affects 2.20.021:27
NMWe don't use EC right now. One thing about this container (I don't know if it's relevant): it receives lots of tar.gz files to be unpacked and stored.21:28
DHEtimburke: a lot of objects, but maybe 2000 will be kept busy...21:28
DHEtimburke: I'm okay with the previous version being read, MAYBE 2, but that's about my limit...21:30
DHEonly failure scenario I can think of is a brief window when an object server is down, comes back up, and there's a ~5 second window where a user could fetch an ancient version21:30
DHEwhere 5 minutes is considered ancient21:30
timburkeNM, that might explain where the container PUTs are coming from: https://github.com/openstack/swift/blob/2.20.0/swift/common/middleware/bulk.py#L303-L309  21:32
timburkeNM, couple what you're seeing with https://bugs.launchpad.net/swift/+bug/1833612 -- yeah, that's probably gonna create a heap of handoffs21:32
openstackLaunchpad bug 1833612 in OpenStack Object Storage (swift) "Overloaded container can get erroneously cached as 404" [Undecided,Fix released]21:32
NMtimburke:  Hummm… Do you see any way to fix this? Like, list all shards and use this list to 'feed' the original container?21:38
*** e0ne has quit IRC21:40
timburkeNM, i *think* the shards should be ok. it's definitely worth correcting me if i'm wrong though! if you do a direct POST to clear the unneeded headers to just one of the primaries, it'll propagate to the other replicas, including the handoffs21:42
timburkeand it should fix that replica pretty much immediately. you'll probably have to wait for it to no longer be error-limited, though21:44
timburkeDHE, will the client be smart enough to see that the ancient version is ancient and either retry or bomb out? how big's the cluster? what kind of policy will those objects use? trying to figure out how many objects are likely to land on the same disk and cause contention...21:47
timburkeand what's the read/write ratio? given how often we're writing, i'd expect that to dominate, but just want to confirm21:49
*** BjoernT_ has quit IRC21:54
DHEtimburke: it's been ordered, but hardware is still a few weeks out. looking at a somewhat geographically diverse cluster. it's basically being used for live TV. might be no hits, might be 1000 hits per second. but I plan to put some edge caching in place in front of the proxy servers, even if it's on the same host.22:17
DHEhmm.. maybe I could just set an expiration time of like 30 seconds...22:18
timburkeDHE, and use a naming scheme that will give each write a distinct name. let the client derive the expected name based on current time (or maybe even better, some server-provided time)22:20
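
A rough sketch of that idea in Python, assuming a hypothetical 5-second publishing cadence; the name format and constant are made up for illustration:

    import time

    SEGMENT_SECONDS = 5  # hypothetical write cadence

    def segment_name(now=None):
        # a deterministic name both the writer and readers derive from the clock
        bucket = int(now if now is not None else time.time()) // SEGMENT_SECONDS
        return 'live/segment-%012d' % bucket

    # writer: PUT a new object named segment_name() each cycle
    # reader: GET segment_name(), falling back to the previous bucket
    # if the newest object hasn't landed yet
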
DHEI don't have that luxery22:21
timburkewith client time, you've got to worry about some client that thinks it's in the future. with server time, you've gotta worry about caching22:21
DHEluxury22:21
DHEwhatever22:21
DHEI've been pushing back that swift wasn't optimal for live content, cache or not22:22
timburkeyeah, it's definitely got me a little nervous... likely to see async pendings piling up, but that's not *so* bad. stale reads sound like a much more likely problem, and more customer-facing22:25
timburkehow big's 5s worth of content? i wonder if you could serve it more or less straight from memcached...22:27
timburkeswift is great at durability. our scaling and performance are pretty good, but way less so when you're hammering just a handful of objects. need that nice, broad distribution of load22:29
timburkemeanwhile, this use-case seems to be tossing the durability aspect out the window22:30
timburkedefinitely tee it off to swift so you can use it for on-demand! i'm less sure about the live stream22:31
*** tkajinam has joined #openstack-swift22:55
*** rcernin has joined #openstack-swift22:59
NMtimburke: I've tried curl -v -H "X-Container-Sysmeta-Shard-Context-7ddb3….-547595833ff8;" --data "" -H "X-Timestamp: 1568070332.21750" http://MY_IP:6001/sda/42306/AUTH_643f797035bf416ba8001e95947622c0/components  23:13
NMAnd curl -v -H "X-Remove-Container-Sysmeta-Shard-Context-7ddb3….-547595833ff8: x" --data "" -H "X-Timestamp: 1568070332.21750" http://MY_IP:6001/sda/42306/AUTH_643f797035bf416ba8001e95947622c0/components  23:14
*** threestrands has joined #openstack-swift23:14
NMNeither of them seems to work. At some point I successfully set the header to "{}" but after some time it went back to the data about the sharding process.23:15
timburkethis was for one of the UUIDs that claimed "cleaving_done": true, "misplaced_done": true, yeah? hmm...23:17
NMYeah! {"max_row": -1, "ranges_todo": 0, "ranges_done": 8, "cleaving_done": true, "last_cleave_to_row": null, "misplaced_done": true, "cursor": "", "cleave_to_row": -1, "ref": "7dd…"}23:18
timburkethere's a chance that it just happened to be one of the handoffs out there sharding, i suppose...23:19
NMI see. My last shot was to send the POST to all container servers at once, but it didn't work either.23:21
timburkeat like 6/1000 or so, the odds seem against it, though...23:23
NMThe 3 primary DBs are sharded. Handoffs 4 and 5 are unsharded and handoff 6 is sharding.23:26
NMGoing by the "X-Backend-Sharding-State" header23:27
timburkehave you looked further out in the handoff list? i'm a little worried that replication will poison some of the handoffs so *they'd* start responding with too many headers, too...23:28
NMOne handoff is "poisoned": the one that says it's sharding. The other 2 are not, but they say "X-Backend-Sharding-State: unsharded"23:33
timburkeNM, what about handoffs 7, 8? might want to add a --all to your swift-get-nodes to see more handoffs23:43
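
For reference, a hedged example of that invocation; the ring path is the conventional location and the account/container are the ones from NM's curl above:

    swift-get-nodes --all /etc/swift/container.ring.gz \
        AUTH_643f797035bf416ba8001e95947622c0 components
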
timburkesorry, i gotta head out... the surgery might be a little more involved than i'd originally hoped, sorry NM :-(23:45
NMtimburke: sure! Thanks anyway. Tomorrow I'll get back to this.23:47
NMtimburke: (Do you mean real surgery or are you talking about swift?)23:48
*** NM has quit IRC23:50
