Thursday, 2020-09-10

00:50 *** gyee has quit IRC
02:26 <kota_> timburke: good info, thx for the summit schedule
02:50 *** rcernin has quit IRC
02:59 *** rcernin has joined #openstack-swift
03:11 <openstackgerrit> Merged openstack/swift master: Remove some useless swob.Request attr setting  https://review.opendev.org/750013
03:19 *** josephillips has quit IRC
04:33 *** evrardjp has quit IRC
04:33 *** evrardjp has joined #openstack-swift
05:06 *** m75abrams has joined #openstack-swift
08:35 *** rcernin has quit IRC
09:06 *** manuvakery has joined #openstack-swift
09:36 *** rdejoux has joined #openstack-swift
10:21 *** StevenK has quit IRC
10:39 *** rcernin has joined #openstack-swift
11:19 *** StevenK_ has joined #openstack-swift
11:23 *** StevenK_ is now known as StevenK
12:35 *** yuxin_ has quit IRC
12:36 *** yuxin_ has joined #openstack-swift
12:55 *** m75abrams has quit IRC
13:14 *** sorrison has quit IRC
13:14 *** sorrison has joined #openstack-swift
15:50 *** rcernin has quit IRC
16:03 *** yuxin_ has quit IRC
16:06 *** yuxin_ has joined #openstack-swift
16:35 <cwright> Hi everyone. I've been working with ormandj on our swift cluster, and as he's described here, we have run into some performance issues.
16:35 <cwright> We've started looking at implementing servers_per_port=2. We are using an independent replication network, and have seen that this is an issue for servers_per_port (re: https://bugs.launchpad.net/swift/+bug/1669579 )
16:35 <openstack> Launchpad bug 1669579 in OpenStack Object Storage (swift) "servers_per_port will not bind to replication_port" [Medium,In progress] - Assigned to Romain LE DISEZ (rledisez)
16:36 <cwright> I've read the available docs, but I've been having trouble finding any real examples online of how the ring should look, given the above bug and that we are using a separate replication network.
16:36 <cwright> Here's a gist that shows the object ring for a small test cluster of 4 servers, each with 3 object disks, both how it is today (before implementing servers_per_port) and how I *think* it should look after servers_per_port:
16:36 <cwright> https://gist.github.com/corywright/d89a93bf21de3773ee9ade39d8c324fc
16:36 <cwright> Can anyone confirm that what I'm planning to do looks sane?
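(For readers without the gist: a minimal sketch of the shape of that change for one of the four servers, assuming the stock 6200/6300 ports and hypothetical IPs and device names - the gist above is the authoritative version. With servers_per_port, each disk gets its own proxy-facing port in the ring, while the replication port can stay shared, per timburke's workaround just below:)

    # before: all three disks share the server's object port
    swift-ring-builder object.builder add r1z1-10.0.0.1:6200R10.1.0.1:6300/sdb 100
    swift-ring-builder object.builder add r1z1-10.0.0.1:6200R10.1.0.1:6300/sdc 100
    swift-ring-builder object.builder add r1z1-10.0.0.1:6200R10.1.0.1:6300/sdd 100
    # after: one unique port per disk; replication port left shared at 6300
    swift-ring-builder object.builder add r1z1-10.0.0.1:6200R10.1.0.1:6300/sdb 100
    swift-ring-builder object.builder add r1z1-10.0.0.1:6201R10.1.0.1:6300/sdc 100
    swift-ring-builder object.builder add r1z1-10.0.0.1:6202R10.1.0.1:6300/sdd 100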
16:44 <timburke> cwright, so in light of the bug, i'd recommend keeping all of the replication ports as 6300 -- you then run two instances of the object-server with two different configs: the one serving data back to proxies should use servers_per_port, while the one serving replication traffic continues to use workers
16:45 <timburke> at least, that's how clayg tells me we've been running our clusters; it might be interesting to see if that matches what rledisez does, as another data point
16:46 <timburke> note that you likely *don't* want to set `replication_server = true` on that second instance: while we finally have a fix for https://bugs.launchpad.net/swift/+bug/1446873, it hasn't made it into a tagged release yet
16:46 <openstack> Launchpad bug 1446873 in OpenStack Object Storage (swift) "ssync doesn't work with replication_server = true" [Medium,Fix released]
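(A minimal sketch of the two-instance layout timburke describes - hypothetical file paths, IPs, and worker counts; the options shown are standard object-server settings:)

    # /etc/swift/object-server/1.conf -- proxy-facing instance
    [DEFAULT]
    bind_ip = 10.0.0.1        # cluster-facing network
    bind_port = 6200
    servers_per_port = 2      # listening ports come from the ring for this bind_ip

    [pipeline:main]
    pipeline = object-server

    [app:object-server]
    use = egg:swift#object

    # /etc/swift/object-server/2.conf -- replication-facing instance
    [DEFAULT]
    bind_ip = 10.1.0.1        # replication network
    bind_port = 6300          # matches the shared replication port in the ring
    workers = 4               # plain workers; servers_per_port deliberately unset

    [pipeline:main]
    pipeline = object-server

    [app:object-server]
    use = egg:swift#object
    # per the bug above, leave replication_server unset here unless you run a
    # release carrying the fix (see the follow-up below for the no-EC case)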
16:48 <clayg> I think rledisez carries that patch so he can do "something" with his replication servers... I don't know exactly why that's desirable; but maybe we should try to land https://review.opendev.org/#/c/337861/
16:48 <patchbot> patch 337861 - swift - Permit to bind object-server on replication_port - 7 patch sets
16:49 <ormandj> we're seeing fun stuff like a single drive in a single server in a cluster with hundreds of drives across multiple servers getting busy due to a patrol read, for example, causing the entire cluster to slow down to pretty abysmal rates
16:49 <ormandj> including replication
16:49 <clayg> timburke: yes, we just have all our replication server workers on the same port - having disk isolation in worker processes would be helpful - but I don't know that we want to dedicate that much memory to running a bunch of replication workers
16:50 <ormandj> and when you're onlining new capacity, that's pretty painful
16:50 <clayg> what's a "patrol" read?
16:51 <ormandj> the drive controller kicks off a process that effectively does a sector scan of the drives in the background, transparently to the OS, to look for failing disks/sectors, which makes disk access 'slow'
16:51 <ormandj> no different than hitting it with 150 IOPS or so from the OS; it just slows down the disk
16:51 <ormandj> HPs/Dells/Ciscos/etc do it via their raid controllers
16:52 <clayg> IME a single drive getting busy or slow does not make whole-cluster performance "abysmal" - but adding capacity should be able to make effective use of ALL cluster resources (edging out clients) if the consistency engine is cranked all the way up
16:52 <cwright> thanks timburke.  we already run two object-servers, with the second one only having `replication_server = true`
16:53 <cwright> since that is the one that binds to the ip on the replication network
16:53 <clayg> cwright: yeah, timburke is definitely saying that if you use EC you shouldn't have replication_server = true because of https://bugs.launchpad.net/swift/+bug/1446873, unless you're running master
16:53 <openstack> Launchpad bug 1446873 in OpenStack Object Storage (swift) "ssync doesn't work with replication_server = true" [Medium,Fix released]
16:54 <ormandj> clayg: if we kick off a patrol read on a drive, it absolutely does negatively impact performance for the cluster, significantly, which is why we're looking to implement servers_per_port. we'll see successful request rates drop through the floor, and we start serving 499s/500s more frequently - it's very odd
16:55 <clayg> oic, well - yes - one of the main benefits of servers_per_port was in fact isolating disks to particular wsgi servers, so that a tarpit disk only affects requests to that disk (and not requests to other disks on the same server)
16:56 <ormandj> no EC here btw, just 3x replication
16:56 <ormandj> also we aren't using ssync; i wasn't aware that was usable/reliable yet - is it?
16:57 <clayg> ssync is the only option for EC; for replicated, rsync is still good - but some deployments are having success with ssync on replicated too
16:57 <clayg> since you're just 3R - it makes a little less sense why a single slow disk/server would affect "most" requests
16:58 <clayg> isolating slow disks to a single object server worker helps A LOT with other requests to other disks on the same server - but shouldn't really affect anything on OTHER servers 🤔
16:58 <ormandj> yeah, it's baffling us too, because it causes a pretty significant impact
16:59 <ormandj> well, we think on our small clusters of 3 servers it may cause a problem because of the 3x replication
16:59 <ormandj> we've also had to set fairly high node timeouts due to the impact of a single drive
16:59 <clayg> ah, sure - so with 3 servers a 3R PUT will need to talk to every server 😁
16:59 <ormandj> and i think that 'slows down' everything
17:00 <clayg> I think you're on the right track with servers_per_port to start
17:01 <cwright> so since we aren't using EC/ssync, should it be safe to set `replication_server = true` in the second object-server config?
17:02 <clayg> if a disk becomes nearly unresponsive during a patrol read you might want to look at unmounting it first (507s don't take a whole node timeout, and the proxy will stage writes on handoffs; the object-replicator can then automatically repair things once the disk is back online)
17:03 <clayg> the only problem with unmounting is that the replicator will also try to rebuild parts while it's unmounted - you might need some new code or custom middleware to "temporarily suspend writes" to a disk (I could see lots of uses for something like that)
17:03 *** gyee has joined #openstack-swift
17:04 <ormandj> yeah, we wanted to see how that went, but when we went to implement, we ran into trouble because of the replication network bug. so we just wanted to clarify before we make changes and break the planet ;)
17:04 <clayg> cwright: yes, if you're not using EC/ssync, replication_server = true is fine as far as I know, but leaving it commented out is also ok (the config's bind_ip tells it where to listen, and must match the ring regardless)
17:05 <cwright> clayg: ok great, thanks
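(Concretely, for a replicated-only cluster like cwright's, that exchange amounts to something like the following on the replication-facing instance - a sketch, same hypothetical file as above:)

    [app:object-server]
    use = egg:swift#object
    # ok here only because there's no EC/ssync in this cluster; with EC, leave
    # unset until a release carries the fix for bug 1446873
    replication_server = true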
17:05 <ormandj> clayg: we can't really do the unmounting thing. that's not really how it works - it kicks off on all the drives at varying intervals within the parameters you set.
17:05 <ormandj> i think we'll probably be a lot better off with this change, and we'll see how it goes with that in place
17:06 <ormandj> one question we did have: when mutating the ring by updating the port on the non-replication entry, are we talking a full reshuffle of data everywhere?
17:07 <clayg> ormandj: neat!  we have a "node agent" that runs in the background and does periodic disk checks - if we had something in the object server that would respect a "drive temporarily offline" file, that's where I'd add code to notice a drive is being scanned and mark it as in maintenance mode
17:07 <clayg> ... but a cron would work too
17:07 <clayg> no, just changing a device's replication_ip/port wouldn't require a rebalance of parts
17:07 <ormandj> clayg: yeah, this is hidden from the OS - you can 'see' it occurring through the controller cli tools, so it'd have to be some magic there, which is suboptimal
17:07 <ormandj> clayg: awesome, i hadn't looked at the hashing algo to figure out if ports/etc were used - that's great news
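(set_info rewrites a device's endpoint info in place without reassigning partitions, which is why no rebalance or data movement follows a port change. A sketch, reusing the hypothetical addresses from earlier and assuming a reasonably recent swift-ring-builder - check `swift-ring-builder object.builder set_info` usage for the exact flag spellings on your release:)

    # give the second and third disks their own proxy-facing ports;
    # replication ip/port are left untouched
    swift-ring-builder object.builder set_info --ip 10.0.0.1 --device sdc --change-port 6201
    swift-ring-builder object.builder set_info --ip 10.0.0.1 --device sdd --change-port 6202
    # no rebalance needed: device info changed, partition assignments did not
    swift-ring-builder object.builder write_ring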
17:09 <clayg> ormandj: it sounds like a really neat check that's doing a lot to try and stay transparent - it's unfortunate that it's adding so much latency in io wait queues 😢
17:09 <clayg> hopefully the drive isolation of servers_per_port will do the trick! 🤞
17:09 <ormandj> yeah, for most workloads it's relatively transparent, but random iops? not so much
17:10 <ormandj> most modern servers with raid controllers ship with it enabled by default
17:15 *** tonyb has quit IRC
17:57 *** tonyb has joined #openstack-swift
18:05 *** manuvakery has quit IRC
18:11 *** gmann is now known as gmann_afk
18:52 <openstackgerrit> Merged openstack/swift master: gate: Make rolling upgrade job work with either 60xx or 62xx ports  https://review.opendev.org/750679
18:59 *** mikecmpbll has quit IRC
19:31 *** rdejoux has quit IRC
20:23 <openstackgerrit> Clay Gerrard proposed openstack/swift master: add swift-manage-shard-ranges shrink command  https://review.opendev.org/741721
20:36 *** openstackgerrit has quit IRC
21:04 <cwright> Follow-up question: is it recommended to deploy servers_per_port changes for the account and container servers at the same time we roll it out for the object servers?
21:38 <cwright> actually, just saw a comment in the code that servers_per_port is only for the object-server at the moment, so that answers that
22:49 *** gmann_afk is now known as gmann
22:58 *** rcernin has joined #openstack-swift
22:59 *** rcernin has quit IRC
22:59 *** rcernin has joined #openstack-swift
23:10 *** openstackgerrit has joined #openstack-swift
23:10 <openstackgerrit> Tim Burke proposed openstack/swift master: Authors/ChangeLog for 2.26.0  https://review.opendev.org/750537
23:49 <timburke> rendered release notes preview: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_59e/750537/2/check/build-openstack-releasenotes/59e2503/docs/current.html

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!