Friday, 2020-01-03

openstackgerritTim Burke proposed openstack/swift master: Have slo tell the object-server that it wants whole manifests  https://review.opendev.org/69773900:09
openstackgerritTim Burke proposed openstack/swift master: symlink: Clean up app iters better  https://review.opendev.org/70095900:10
*** d34dh0r53 has quit IRC00:14
claygthat sucks, I thought we already did something to assert no unread requests in teardown00:19
viks___hi, i'm testing an erasure code policy on my setup.. I notice that it is slower compared to the default policy? i.e. for a 500 MB file upload i see around 6 sec and with EC i see around 9 sec ... is this expected? does anyone have any idea??00:23
openstackgerritTim Burke proposed openstack/swift master: account-server: Add test for leading delimiter  https://review.opendev.org/70096400:27
openstackgerritTim Burke proposed openstack/swift stable/train: account-server: Correctly handle containers starting with delimiter  https://review.opendev.org/70096500:27
timburkeclayg, no *unclosed* requests, which is also good. but i wanted to minimize 499s00:28
claygohhhhhh00:28
claygbrilliant00:28
claygviks___: what's your test setup?00:28
timburkeviks___, what's your EC algorithm? the rs_vand that ships with liberasurecode is mainly meant for demonstration/testing purposes; if it were me, i'd want to be sure to use isa-l00:29
claygviks___: if you're cpu limited (i.e. all nodes/services on the same SAIO) that would be expected.00:29
claygviks___: normally when people compare single stream upload throughput on testing clusters they're *disk* limited - and when going to EC there's more spindles and fewer bytes so it goes a good bit *faster* than replicated00:30
claygbut that assumes plenty of cpu headroom00:30
timburkeor bandwidth-limited, but there's a similar trick wherein you're only sending 1.5x (say) out your cluster-facing NIC instead of 3x, so you actually see a speedup for large enough objects (and 500MB should definitely be large enough)00:32
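The "1.5x instead of 3x" figure above follows from simple arithmetic: an erasure-coded write sends (data + parity) / data times the object size, while replication sends one full copy per replica. A minimal sketch of that calculation; the 4+2 and 3+2 schemes and the function names are illustrative assumptions, not values taken from any particular cluster:

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Bytes written to the cluster per byte of object, for N full replicas."""
    return Fraction(replicas)


def ec_overhead(ndata: int, nparity: int) -> Fraction:
    """Bytes written per byte of object for an ndata+nparity EC scheme."""
    return Fraction(ndata + nparity, ndata)


if __name__ == "__main__":
    object_mb = 500  # object size used in the discussion
    schemes = [
        ("3x replication", replication_overhead(3)),
        ("EC 4+2 (hypothetical)", ec_overhead(4, 2)),
        ("EC 3+2 (hypothetical)", ec_overhead(3, 2)),
    ]
    for label, factor in schemes:
        sent = float(factor * object_mb)
        print(f"{label}: {float(factor):.2f}x, about {sent:.0f} MB sent")
```

This only models bytes on the wire and on disk; it says nothing about the CPU cost of encoding, which is the other half of the trade-off discussed here.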
viks___No.. all nodes are separate...And my cluster is not loaded at all... when i check the same in my vagrant based setup, i see EC is faster..  so i'm confused why they are behaving differently? My erasure code has 3 fragments and 2 parity set00:34
viks___in the vagrant setup i see around 20-25% better speed with EC..00:37
timburkeviks___, what's the ec_type? is it the same on both the test cluster and the vagrant setup?00:38
viks___liberasurecode_rs_vand00:39
viks___yes00:39
viks___both are same.. both have 3 storage nodes..00:40
viks___ok.. i'll check once with isa-l and report back00:42
openstackgerritTim Burke proposed openstack/swift master: Use less responses from handoffs  https://review.opendev.org/70023900:45
viks___should the number of fragments be set based on the no. of storage nodes?00:45
viks___because when i tried with 7 fragments and 3 parity, i was getting the below swift proxy error:00:47
viks___```00:47
viks___Object PUT returning 503, 6/8 required connections00:47
viks___```00:47
claygso writing down "quoted-paths: true" in sysmeta is pretty awful - for one an un-upgraded node can write down an *unquoted* root *after* an upgraded node wrote down "quoted_paths: true"00:47
claygafter replication takes the latest value for each key I've got an unquoted path that says "quoted_paths: true"  🙄00:48
timburkeeh... i'd say it's more about what kind of durability/storage overhead trade-off you want to make. keeping the numbers small simplifies the math a bit and should reduce CPU overhead... but makes it so you can't reliably withstand as many simultaneous drive failures00:48
timburkeviks___, how many disks do you have per node?00:49
timburkeclayg, yeah, that sounds pretty terrible00:49
viks___9 disks00:50
timburke:-/ did the logs tell you much else? it should've tried to connect to every disk...00:51
viks___no other info... i'm yet to try with debug enabled in logs... Also when i tried with 6 fragments and 3 parity, i'm getting the below swift proxy error:00:54
viks___```00:54
viks___Object PUT returning 503, 6/7 required connections00:54
viks___```00:54
viks___So i could not make out how this required connections number is getting calculated...00:54
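One reading that fits both error messages above is that the proxy requires connections for every data fragment plus a minimum number of parity fragments: 7 data + 1 gives 8, and 6 data + 1 gives 7. A minimal sketch under that assumption; the function names and the default of one required parity fragment are inferred from the figures in the log and are not taken from Swift's code:

```python
def required_put_connections(ndata: int, min_parity_needed: int = 1) -> int:
    """Assumed quorum: all data fragments plus a minimum of parity fragments."""
    return ndata + min_parity_needed


def put_would_succeed(connected: int, ndata: int,
                      min_parity_needed: int = 1) -> bool:
    """True when enough backend connections were established for the PUT."""
    return connected >= required_put_connections(ndata, min_parity_needed)


if __name__ == "__main__":
    # The two policies from the discussion, each with 6 established connections.
    for ndata, nparity in [(7, 3), (6, 3)]:
        needed = required_put_connections(ndata)
        ok = put_would_succeed(6, ndata)
        print(f"{ndata}+{nparity}: 6/{needed} required connections, ok={ok}")
```

Under this reading the open question is not the threshold but why only 6 connections were established in both cases.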
viks___timburke: Do i need to modify some worker/concurrency tuning for this? or any other parameter?00:55
claygtimburke: yeah I also managed to get some lost objects into a shard named with the quoted value when the proxy sent down the target-shard quoted and the object forwarded it00:57
claygi guess since the container was in an autocreate account it made a new db - which strangely itself doesn't think it's a shard00:57
claygi wonder if there's a clue in recon or logs about a db in .shards_X account that's not a shard?00:58
*** gyee has quit IRC00:58
claygthere's a lot of this:01:01
claygJan  3 01:00:17 saio container-sharder-6021: Failed to put shard ranges to 127.0.0.1:6041/sdb4: 'utf8' codec can't decode byte 0xbe in position 3: invalid start byte: #012Traceback (most recent call last):#012  File "/vagrant/swift/swift/container/sharder.py", line 596, in _put_container#012    headers=headers, contents=body)#012  File "/vagrant/swift/swift/common/direct_client.py", line 348, in01:01
claygdirect_put_container#012    path = _make_path(account, container)#012  File "/vagrant/swift/swift/common/direct_client.py", line 60, in _make_path#012    for x in components)#012  File "/vagrant/swift/swift/common/direct_client.py", line 60, in <genexpr>#012    for x in components)#012  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode#012    return codecs.utf_8_decode(input, errors,01:01
claygTrue)#012UnicodeDecodeError: 'utf8' codec can't decode byte 0xbe in position 3: invalid start byte01:01
claygi bet that's because of the quoted-paths + (not-quoted)-root01:01
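A mismatch between quoted and unquoted paths can produce exactly this kind of decode error: if a name that contains a literal percent sequence is unquoted one time too many, the result can hold a raw byte that is not valid UTF-8. A minimal Python 3 sketch; the container name `abc%beef` is a hypothetical chosen so that the stray byte 0xbe lands at position 3, as in the traceback:

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical container name containing a literal "%be" sequence.
name = "abc%beef"

# Quoting once escapes the percent sign itself.
quoted = quote(name)

# Unquoting once recovers the original name as bytes.
once = unquote_to_bytes(quoted)

# Unquoting a second time treats "%be" as an escape and yields byte 0xbe.
twice = unquote_to_bytes(once)


def decode_error_position(data: bytes):
    """Return (position, reason) of the UTF-8 decode failure, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, err.reason
    return None


if __name__ == "__main__":
    print(quoted)
    print(once)
    print(twice)
    print(decode_error_position(twice))
```

The same failure appears whenever one side treats a stored path as quoted and the other treats it as unquoted, which is the situation described for the mixed upgraded and un-upgraded nodes.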
claygok, so that was a disaster!  😁01:02
timburkeclayg, that seems like a great thing to add around https://github.com/openstack/swift/blob/2.23.0/swift/container/sharder.py#L776-L777 -- if broker.is_root_container() and broker.account.startswith(self.shards_account_prefix): ...01:02
claygi'll make a sticky to write a bug report tomorrow01:02
claygare you going to re-rev the sharder-quoting patch again?01:03
timburkeha! you went with my all-beef container name, didn't you 😁01:03
claygi'm going to abandon my "alternative approach"01:03
claygwell, now it's like 25% beef 😬01:03
timburkenah, i don't have anything to push up on that guy at the moment01:04
timburkei suppose i could try to get the client-says-it-wants-quoted change in... but i haven't started on it yet01:06
timburkeand i kinda want to actually try a non-default prefix with https://review.opendev.org/#/c/700818/01:09
patchbotpatch 700818 - swift - Deprecate per-service auto_create_account_prefix - 2 patch sets01:09
timburkesince that seems pretty handy for https://review.opendev.org/#/c/700449/01:09
patchbotpatch 700449 - swift - Allow reconciler to handle reserved names - 3 patch sets01:09
*** f0o has quit IRC01:37
*** f0o has joined #openstack-swift01:38
*** spsurya has quit IRC02:03
*** d34dh0r53 has joined #openstack-swift02:56
openstackgerritMerged openstack/swift master: Use less responses from handoffs  https://review.opendev.org/70023903:13
openstackgerritTim Burke proposed openstack/swift master: Deprecate per-service auto_create_account_prefix  https://review.opendev.org/70081805:28
*** evrardjp has quit IRC05:33
*** evrardjp has joined #openstack-swift05:33
*** Fengli1 has joined #openstack-swift05:57
*** Fengli has quit IRC05:59
*** Fengli1 is now known as Fengli05:59
openstackgerritTim Burke proposed openstack/swift master: symlink: Clean up app iters better  https://review.opendev.org/70095906:06
openstackgerritTim Burke proposed openstack/swift master: Middleware that allows a user to have quoted Etags  https://review.opendev.org/70005606:16
zaitcevJan  3 01:54:56 rhev-a24c-01 swift[9927]: - - 03/Jan/2020/06/54/56 GET /v1/.misplaced_objects%3Fformat%3Djson%26marker%3D%26end_marker%3D%26prefix%3D HTTP/1.0 404 - Swift%20Container%20Reconciler - - - - tx7e12fe8b104845d19ce94-005e0ee540 - 0.0358 - - 1578034496.477324247 1578034496.513160467 -06:55
zaitcevJan  3 01:54:56 rhev-a24c-01 container-reconciler[9927]: Reconciler Stats: {} (txn: tx7e12fe8b104845d19ce94-005e0ee540)06:55
zaitcevSame PID, 9927. I'm wondering why the reconciler reports itself as "container-reconciler" at some times and as just "swift" at other times.06:56
*** psachin has joined #openstack-swift08:00
*** tesseract has joined #openstack-swift08:14
*** Fengli1 has joined #openstack-swift08:14
*** Fengli has quit IRC08:16
*** Fengli1 is now known as Fengli08:16
*** pcaruana has joined #openstack-swift08:35
*** rpittau|afk is now known as rpittau08:41
*** psachin has quit IRC09:18
*** Fengli has quit IRC09:53
*** Fengli has joined #openstack-swift10:05
*** Fengli has quit IRC11:26
*** henriqueof has joined #openstack-swift14:57
*** henriqueof has quit IRC15:03
*** henriqueof has joined #openstack-swift15:03
*** renich has joined #openstack-swift15:24
*** renich has quit IRC15:33
*** takamatsu has joined #openstack-swift16:37
*** henriqueof has quit IRC16:44
*** gyee has joined #openstack-swift17:03
*** rpittau is now known as rpittau|afk17:18
*** evrardjp has quit IRC17:33
*** evrardjp has joined #openstack-swift17:33
clayg@zaitcev I bet it's something to do with the internal client config17:44
clayggets its own logger instead of passing it through?  I think all the apps in the pipeline call get_logger 🤔17:45
zaitcevright... I'm sure it comes from internal_client18:00
timburkelol https://github.com/openstack/swift/blob/2.23.0/doc/manpages/object-server.conf.5#L371-L37518:39
timburkewe were so optimistic :P18:40
zaitcevmore like le sigh18:40
openstackgerritTim Burke proposed openstack/swift master: Deprecate per-service auto_create_account_prefix  https://review.opendev.org/70081818:45
*** paladox has quit IRC19:10
*** henriqueof has joined #openstack-swift19:21
*** paladox has joined #openstack-swift19:34
*** paladox has quit IRC19:35
*** paladox has joined #openstack-swift19:35
openstackgerritClay Gerrard proposed openstack/swift master: wip: move to new quoted-path  https://review.opendev.org/70105919:50
claygtimburke: ^ so just having everyone start using quoted path works great!  a couple of shards might think they're roots temporarily but it all works out19:51
claygi'm not really sure why just starting to quote location seemed to work - I had some async pending pile up with container servers rejecting 301 (obviously) then 412 (!?)19:52
claygeventually everyone updated and everything worked with no lost or misplaced updates 🥳19:52
claygI might be happy with something a little more progressive or careful than p 701059 but not significantly so w/o some proven failure mode I can duplicate19:53
patchbothttps://review.opendev.org/#/c/701059/ - swift - wip: move to new quoted-path - 1 patch set19:53
claygthe main advantage of letting the natural failure modes recover on their own when they case is that it proves we're trusting the case when the old path isn't sent19:54
claygi.e. I don't have to think "ok, what happens in rare case when the quoted & unquoted paths are NOT the same; what does that failure look like!?" ...19:55
clayg... because THAT failure mode is the happy path, and if it *works* then we don't need to do anything special for when the unquoted path isn't equivalent - which I think was really the heart of what was bothering me19:56
timburkeclayg, this is what i've got so far, but i need to fix up some tests http://paste.openstack.org/show/788045/20:42
*** takamatsu has quit IRC20:51
zaitcevI added some innocuous imports into my auditor plugin, and it makes the auditor loop hard on start. Must be eventlet, I just know it.22:10
*** tesseract has quit IRC22:54
*** pcaruana has quit IRC22:59
openstackgerritClay Gerrard proposed openstack/swift master: Deprecate per-service auto_create_account_prefix  https://review.opendev.org/70081823:16
openstackgerritClay Gerrard proposed openstack/swift master: Allow reconciler to handle reserved names  https://review.opendev.org/70044923:56
*** gyee has quit IRC23:58
