Monday, 2019-08-12

*** psachin has joined #openstack-swift03:35
*** viks___ has joined #openstack-swift04:55
*** pcaruana has joined #openstack-swift05:29
*** pcaruana has quit IRC05:37
*** pcaruana has joined #openstack-swift05:50
*** e0ne has joined #openstack-swift06:16
*** e0ne has quit IRC06:17
openstackgerrit: Matthew Oliver proposed openstack/swift master: sharder: Keep cleaving on empty shard ranges  https://review.opendev.org/675820  06:34
*** rcernin has quit IRC07:13
*** tesseract has joined #openstack-swift07:14
*** e0ne has joined #openstack-swift07:49
*** onovy has joined #openstack-swift08:36
onovy: hi guys. I'm the current "maintainer" of swauth. I'm not doing my job :). I tried to fix swauth for Stein: https://review.opendev.org/#/c/670891/ but without success. I'm thinking about discontinuing swauth completely. Is anyone interested?  08:38
patchbot: patch 670891 - x/swauth - Fix compatibility with Swift Stein - 4 patch sets  08:38
*** tesseract has quit IRC08:42
*** hogepodge has quit IRC08:42
*** onovy has quit IRC08:42
*** onovy has joined #openstack-swift08:42
*** tesseract has joined #openstack-swift08:43
*** openstackgerrit has quit IRC08:45
*** hogepodge has joined #openstack-swift08:47
*** hogepodge has quit IRC08:47
*** hogepodge has joined #openstack-swift08:47
*** irclogbot_2 has quit IRC08:49
*** irclogbot_2 has joined #openstack-swift08:53
*** mvkr has joined #openstack-swift09:47
*** pcaruana has quit IRC10:43
*** pcaruana has joined #openstack-swift10:43
*** tdasilva has joined #openstack-swift11:24
*** ChanServ sets mode: +v tdasilva11:24
*** henriqueof has joined #openstack-swift11:42
*** baojg has quit IRC12:06
viks___: clayg: I tried stopping the replicator service, and I noticed that the object server CPU usage comes down to almost zero. I think that means the object server process also participates in these replications, right? Does this mean CPU usage should automatically go down after a few days or a week?  12:35
viks___: Also, currently I do not have any of the below in my object-server.conf under `[app:object-server]`:  12:35
viks___: ```  12:35
viks___: # replication_server = false  12:35
viks___: # replication_concurrency_per_device = 1  12:35
viks___: # replication_lock_timeout = 15  12:35
viks___: # replication_failure_threshold = 100  12:35
viks___: # replication_failure_ratio = 1.0  12:35
viks___: ```  12:35
viks___: I have a separate replication network, so do I need to set these? The descriptions of these mention `SSYNC`, so I had left them out as I'm using rsync.  12:35
DHE: rsync/ssync is about getting the bulk files around. swift still signals other object servers that replication has happened so they can upload local indexes which are also used to help judge if a node is out of sync and needs replicating (iirc)  12:44
DHE: the rsync/ssync bulk transfer, and this signaling, runs over the replication network IPs if so configured  12:44
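For reference, a rough sketch of the dedicated-replication-server pattern those options belong to: a second object-server instance bound to the replication network that only answers replication traffic. The option names are the stock ones from object-server.conf-sample, but the file path and bind address below are placeholders, and with rsync it is only the REPLICATE "signaling" DHE describes that hits this instance (the SSYNC-specific knobs can stay at their defaults):
```
# sketch of a replication-only object-server instance, e.g.
# /etc/swift/object-server/2.conf; 10.0.1.11 is a placeholder
# replication-network address
[DEFAULT]
bind_ip = 10.0.1.11
bind_port = 6200

[pipeline:main]
pipeline = object-server

[app:object-server]
use = egg:swift#object
# true = this instance serves *only* replication verbs (REPLICATE/SSYNC);
# the regular client-facing instance would set this to false
replication_server = true
```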
*** openstackgerrit has joined #openstack-swift13:02
openstackgerrit: Thiago da Silva proposed openstack/swift master: Allow bulk delete of big SLO manifests  https://review.opendev.org/540122  13:02
*** frickler has quit IRC13:18
*** zaitcev has joined #openstack-swift13:29
*** ChanServ sets mode: +v zaitcev13:29
*** tdasilva has quit IRC13:30
*** BjoernT has joined #openstack-swift13:59
*** BjoernT_ has joined #openstack-swift14:04
*** BjoernT has quit IRC14:05
*** tdasilva has joined #openstack-swift14:07
*** ChanServ sets mode: +v tdasilva14:07
*** donnyd has joined #openstack-swift14:53
donnyd: How can I accelerate writes in swift using NVMe drives? Is there a mechanism to cache writes on a faster device?  14:54
donnyd: Does this need to be done at a layer below swift?  14:55
donnyd: I guess something like having a hot tier  15:08
tdasilva: donnyd: typically faster drives are used for the account/container layer. I'm not sure I've heard of anyone actually caching writes on a prod. cluster. There was some investigation work done with CAS a few years back, might be worth looking into: https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/accelerating-swift-white-paper.pdf  15:17
tdasilva: donnyd: using it at the object layer just becomes really costly for a typical swift cluster, no?  15:24
donnyd: Well it can, but at my scale (very small) I am trying to get better performance if I can  15:42
*** gyee has joined #openstack-swift15:54
*** e0ne has quit IRC16:01
*** tdasilva has quit IRC16:09
*** tdasilva has joined #openstack-swift16:10
*** ChanServ sets mode: +v tdasilva16:10
donnyd: I was trying to use ZFS to underpin swift because I can accelerate writes and reads from faster media... but that didn't work out so well  16:15
*** zaitcev has quit IRC16:25
*** zaitcev has joined #openstack-swift16:39
*** ChanServ sets mode: +v zaitcev16:39
BjoernT_: How do I go about deleting objects inside the container database?  17:23
*** BjoernT_ is now known as BjoernT  17:23
BjoernT: delete from object where name like '%c92f64f79f0d1ed01e6d5b314f04886c/008k171b%';  17:23
BjoernT: Error: no such function: chexor  17:23
BjoernT: the problem is I again have corrupted object names and can't delete them via the swift API, nor update them, as that is not allowed per trigger  17:24
BjoernT: Error: UPDATE not allowed; DELETE and INSERT  17:24
BjoernT: seems like chexor is a function created in memory (swift/common/db.py) when connecting to the database?  17:26
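That is indeed how it works: chexor is a plain Python function that swift/common/db.py registers on every SQLite connection it opens, which is why a bare sqlite3 session cannot run the DELETE (the bookkeeping triggers call chexor). A rough sketch of running the same statement through Swift's own connection helper, with the DB path as a placeholder; the ContainerBroker route discussed below is the safer option:
```
# sketch: open the container DB the way Swift does, so the chexor()
# SQL function used by the delete trigger is registered
from swift.common.db import get_db_connection

db_path = '/srv/node/sdb1/containers/.../<hash>.db'  # placeholder path
conn = get_db_connection(db_path)
with conn:  # commits on success
    conn.execute(
        "DELETE FROM object WHERE name LIKE ?",
        ('%c92f64f79f0d1ed01e6d5b314f04886c/008k171b%',))
```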
*** klamath has joined #openstack-swift17:33
*** diablo_rojo has joined #openstack-swift17:36
*** tdasilva has quit IRC17:53
*** tdasilva has joined #openstack-swift17:53
*** ChanServ sets mode: +v tdasilva17:53
clayg: BjoernT: correct, the function is needed for bookkeeping - maybe get a ContainerBroker object in a repl and do the sql commands in python?  18:37
clayg: BjoernT: I feel like you almost definitely want the updates to the object table to go through merge_items tho...  18:39
clayg: if you could get the list of names and use ContainerBroker.delete_object that might be a *lot* safer  18:40
clayg: ... than doing the sql/like match  18:40
clayg: mattoliverau: ping p 675451  18:46
patchbot: https://review.opendev.org/#/c/675451/ - swift - Consolidate Container-Update-Override headers - 2 patch sets  18:46
clayg: oh right, I forgot last week I was working on getting symlink versions func tests to "work" with use_symlinks true/false 😞  18:46
DHE: donnyd: I've considered that, and lvmcache or bcachefs is probably your best bet. lvmcache would require setting up LVM on each disk though  18:50
donnyd: DHE: they don't really work quite the same way as ZFS does though. I think it's probably better in this case to worry less about speed and more about reliability  18:53
DHE: ZFS write cache isn't what you probably think it is  18:53
DHE: unless you're dealing with small objects  18:53
BjoernT: clayg: I just updated the filename and didn't delete anything, so that I don't have to deal with all the bookkeeping functions. Do you have an example of ContainerBroker?  19:06
clayg: you *updated* the filename?  the table really shouldn't allow in-place updates... replication won't be able to propagate anything unless the row/timestamp changes 😬  19:21
clayg: >>> from swift.container.backend import ContainerBroker  19:22
clayg: >>> b = ContainerBroker('/srv/node4/sdb4/containers/57/576/e7c419a563cd36341b12e9ef22343576/e7c419a563cd36341b12e9ef22343576.db')  19:22
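Continuing that sketch: ContainerBroker.delete_object just needs the exact row name and a fresh timestamp (the storage policy index defaults to 0), and it writes the tombstone through the normal merge_items path clayg mentions. The name below is a placeholder standing in for one of the corrupted rows:
```
>>> import time
>>> from swift.common.utils import Timestamp
>>> bad_name = 'c92f64f79f0d1ed01e6d5b314f04886c/008k171b-XXXX'  # placeholder
>>> b.delete_object(bad_name, Timestamp(time.time()).internal)
```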
BjoernT: yeah, I removed the trigger and added it back, and placed the db at the primary locations; the customer will delete the container anyway  19:23
clayg: ok, sounds like a plan then!  19:25
*** diablo_rojo has quit IRC19:25
BjoernT: Yes, I saw the structure around ContainerBroker but didn't see methods that help me here at first glance - but put_object is probably it  19:26
clayg: timburke: do you remember what we decided on x-symlink-target-etag and quotes?  current patch seems to still do the strip before the check... if I remove the strip, one test that was expecting a 409 gets a 400 when it tries to verify that sending the quoted slo etag doesn't work - but that seems fine?  19:26
clayg: BjoernT: any idea how the object names got corrupted?  19:28
*** tesseract has quit IRC19:32
timburke: i'm guessing the same way that the timestamps got corrupted in https://bugs.launchpad.net/swift/+bug/1823785 -- some bad bit-flip, potentially causing the name to not even be utf8 any more :-(  19:33
openstack: Launchpad bug 1823785 in OpenStack Object Storage (swift) "Container replicator can propagate corrupt timestamps" [Undecided,New]  19:33
BjoernT: sadly no, this is becoming a headache with a growing list of files  19:35
BjoernT: not sure if the ingesting app causes the problem or swift  19:36
BjoernT: https://bugs.launchpad.net/swift/+bug/1823785 would be the worst case, yes, I hope not  19:36
openstack: Launchpad bug 1823785 in OpenStack Object Storage (swift) "Container replicator can propagate corrupt timestamps" [Undecided,New]  19:36
*** psachin has quit IRC19:37
donnyd: DHE: I am quite familiar with how ZFS works, and for this case I set the swift dataset to sync=always, which forces writes to the much faster nvme drives. Also, commonly accessed objects would be pulled into the ARC (I think the same happens in any linux FS though). Mainly I was trying to improve write speeds. My disks are pretty slow in comparison with the rest of the equipment I have. I am thinking you  19:37
donnyd: are right though, in object storage the name of the game is reliability. Speed doesn't really matter  19:37
donnyd: In a larger scale system, I really wouldn't even notice.. It's just real slow at my microscopic scale for object storage  19:40
*** zaitcev has quit IRC19:43
timburke: clayg, on the quotes thing -- i think as i started to play with it for patchset 20 i saw that test that would break and fixed up my patch to not break the test. up to you to do the strip() or not  19:44
clayg: 🤔  19:45
clayg: timburke: there are a couple of req modifications going on in symlink's _check method currently  19:48
DHE: donnyd: incorrect  19:49
clayg: I can pull out the pop for the etag since that doesn't seem to break anything - but making that method not modify state would require a bit of moving things around and probably wouldn't last... should I rename it?  19:49
donnyd: sure  19:49
donnyd: which part are you thinking is incorrect?  19:50
clayg: timburke: well, maybe I could try to get rid of more  19:50
DHE: the NVMe write cache only takes small writes (default 32k or less) and does not speed anything up. if speed matters you use sync=disabled  19:50
DHE: sync=always provides the ultimate in crash protection even if the app didn't ask for it. nothing more.  19:51
donnyd: DHE: So you are telling me that zfs doesn't take all writes that are synchronous and send them to the zil -> then to the slog if you have one?  19:53
DHE: I'm saying that 1) data doesn't stay on the NVMe disk and 2) ZFS doesn't read back from the ZIL/SLOG in order to write data to the main disks at a later time  19:54
DHE: the ZIL/SLOG is write-only, and only read during crash recovery when mounted  19:54
DHE: so having an nvme disk is better for performance/latency than storing on the main spinning disks, but FORCING data to the nvme disk does not help anything ever  19:55
*** zaitcev has joined #openstack-swift19:57
*** ChanServ sets mode: +v zaitcev19:57
donnyd: ??? LOL, sure  20:03
clayg: timburke: the _check, _validate, and user->sys is a real mess and the content-type mangling is new, I'm really having a hard time seeing the obvious way to organize it  20:05
* clayg on the docstring for the path string type:  20:05
clayg: -    :returns: a tuple, the full versioned path to the object and the value of  20:05
clayg: -              the X-Symlink-Target-Etag header which may be None  20:05
clayg: +    :returns: a tuple, the full versioned WSGI quoted path to the object and  20:05
clayg: +              the value of the X-Symlink-Target-Etag header which may be None  20:05
clayg: ^ ???  20:05
*** ccamacho has quit IRC20:05
timburke: i'd strike "quoted" -- it isn't, is it?  20:10
*** zaitcev has quit IRC20:17
*** e0ne has joined #openstack-swift20:26
*** zaitcev has joined #openstack-swift20:30
*** ChanServ sets mode: +v zaitcev20:31
*** tdasilva has quit IRC20:37
*** zaitcev_ has joined #openstack-swift20:39
*** ChanServ sets mode: +v zaitcev_20:39
donnyd: DHE: So it's possible my testing is completely flawed, but I want to share some data so I can try to understand  20:41
*** zaitcev has quit IRC20:43
donnyd: WRITE: bw=801MiB/s (840MB/s), 200MiB/s-322MiB/s (210MB/s-337MB/s), io=16.0GiB (17.2GB)      sync=always  20:45
donnyd: WRITE: bw=321MiB/s (336MB/s), 80.2MiB/s-135MiB/s (84.1MB/s-142MB/s), io=16.0GiB (17.2GB)   sync=standard  20:45
*** pcaruana has quit IRC20:56
DHE: donnyd: that's strange.  the only possibility that makes sense to me is if you're buffering each TCP packet rather than doing a huge fsync() when it's done...  21:21
DHE: which means that sync=disabled would be even faster, though without the crash safety  21:22
donnyd: Ok, that makes sense. In reality I think I am going to follow the advice I got earlier and just make something as stable as possible and not worry about it  21:22
donnyd: It already blew up with zfs once.  21:23
donnyd: So should I be using any sort of sw raid, or just individual disks?  21:24
DHE: the theory behind swift is that the redundancy is already handled by swift itself, so you're better off getting the better IOPS by allowing each disk to operate independently  21:24
donnyd: So should I put the 11 drives I have in raid(x), or should I just leave them in jbod and use swift?  21:25
DHE: assuming large-ish objects, striping of RAID disks tends to make all disks seek in unison  21:25
*** tdasilva has joined #openstack-swift21:25
*** ChanServ sets mode: +v tdasilva21:25
*** tdasilva has quit IRC21:25
DHE: whereas with swift the request is served by 1 spindle, which is good for multi-object performance but bad for throughput on a single object  21:25
DHE: ZFS is especially bad because RAID-Z has a stripe chunk size of 1 disk sector (4k tops typically)  21:26
donnyd: I will have log files, glance images, and a random assortment of desktopy files  21:26
donnyd: so maybe a few raid0 groups?  21:28
DHE: I still think you're best off just having individual disks, unless you really need the throughput that comes with raid-0 or another striped raid  21:30
donnyd: That makes sense  21:39
donnyd: Do you think it would be worthwhile to maybe do something like external journals for ext4 or xfs?  21:40
DHE: it could be worth it. anything that keeps seeking down on writes, I suppose...  21:48
DHE: personally I want to give lvmcache a spin, but don't have a high-endurance SSD to enable writeback mode  21:48
*** e0ne has quit IRC21:49
*** diablo_rojo has joined #openstack-swift22:03
*** henriqueof has quit IRC22:04
*** BjoernT has quit IRC22:07
clayg: timburke: so POST to hardlink will still 307 despite the etag not validating  22:21
timburke: sounds right  22:22
timburke: or at any rate, expected  22:22
timburke: i mean -- we *could* do the POST, then validate *after* and decide whether to 307 or 409... but i don't think we *must* apply the metadata -- eventual consistency's gonna get weird otherwise  22:25
clayg: don't think we must? (probably a typo, cause yeah ... we have to)  22:27
clayg: so, but I'm not even sure if we know enough to return the 409 ... we could go and *check* 😬  22:27
timburke: yeah, typo -- i confused myself rewriting what was a double-negative  22:28
clayg: timburke: can you think of any prior art on new features announcing themselves in /info?  22:28
clayg: I think it's a great idea - i'm just not sure about calling it "allowed" - would love to look at a diff that exposed /info on a non-configurable feature before?  22:29
timburke: and yeah, my thought was that we could go check -- but that it'd have to be after we sent the POST and had an indication that we'd just POSTed to a hardlink  22:29
timburke: on the /info thing, i don't think we've really got precedent. but it kinda sucks that clients have to know that data-segments were added to SLO in 2.17.0  22:31
clayg: yes, totally agree!  it'd be a great habit to get into.  22:31
*** diablo_rojo has quit IRC22:32
clayg: but allowed/enabled sounds too much like it invites being turned off to me 🤔  22:32
clayg: available?  22:33
timburke: maybe "supports_static_links"? available's OK by me, too, though  22:34
clayg: maybe w/o prior art I'll ask if we can defer it to a follow-up change and maybe also do an audit of other features that deserve similar treatment?  22:34
clayg: would it be ok to defer it?  I could put up a placeholder patch and we could talk about it at the meeting?  22:35
timburke: 👍  22:36
clayg: Maybe we can say the static link 307 on POST makes sense because that verb doesn't support x-if-match semantics and probably couldn't 🤔  22:37
timburke: mainly just a thought -- i feel like there's some window of diminishing returns -- in fact, at a year and a half, the data segments thing is maybe approaching the end of that window  22:37
clayg: that's probably true, the fresher the feature the more clients need to assume their favorite clusters don't have it...  22:39
timburke: i think the 307's pretty fair -- i was just noticing that we tell the client to go try again elsewhere without providing any of the context about it being a hardlink  22:39
*** hoonetorg has quit IRC22:48
*** hoonetorg has joined #openstack-swift22:50
clayg: so it looks like I could throw an `x-symlink-target-etag` in the 307 response?  That might be a little useful?  22:50
*** hoonetorg has quit IRC22:57
*** hoonetorg has joined #openstack-swift23:01
openstackgerrit: Clay Gerrard proposed openstack/swift master: Allow "static symlinks"  https://review.opendev.org/633094  23:03
*** rcernin has joined #openstack-swift23:11

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!