Monday, 2019-04-22

*** d34dh0r53 has quit IRC  01:09
*** baojg has joined #openstack-swift  01:28
*** d34dh0r53 has joined #openstack-swift  02:47
*** gkadam has quit IRC  03:00
*** thurloat8 has quit IRC  03:03
*** gkadam has joined #openstack-swift  03:06
*** openstackgerrit has joined #openstack-swift  03:06
openstackgerrit: zhongshengping proposed openstack/swift master: Replace git.openstack.org URLs with opendev.org URLs  https://review.opendev.org/654278  03:06
*** psachin has joined #openstack-swift  03:22
*** tonyb has joined #openstack-swift  04:46
*** tonyb has quit IRC  06:13
*** tonyb has joined #openstack-swift  06:14
*** ccamacho has joined #openstack-swift  06:58
*** baojg has quit IRC  07:07
*** pcaruana has joined #openstack-swift  07:19
*** tkajinam has quit IRC  08:24
*** pcaruana has quit IRC  08:45
*** rcernin has joined #openstack-swift  09:03
*** e0ne has joined #openstack-swift  09:42
*** hoonetorg has quit IRC  10:00
*** hoonetorg has joined #openstack-swift  10:13
*** baojg has joined #openstack-swift  10:53
*** pcaruana has joined #openstack-swift  10:58
*** gkadam has quit IRC  12:49
*** gkadam has joined #openstack-swift  12:52
*** psachin has quit IRC  13:39
*** openstackgerrit has quit IRC  14:28
clayg: cool, yeah everything seemed to "just work" on my end...  15:00
clayg: we have p 654278 which looks fully legit  15:00
patchbot: https://review.openstack.org/#/c/654278/ - swift - Replace git.openstack.org URLs with opendev.org URLs - 1 patch set  15:00
*** gyee has joined #openstack-swift  15:11
notmyname: good morning  15:41
*** e0ne has quit IRC  15:47
*** e0ne has joined #openstack-swift  16:01
*** gkadam has quit IRC  16:30
*** e0ne has quit IRC  16:36
*** ndk_ has quit IRC  17:48
*** ybunker has joined #openstack-swift  17:49
*** sleterrier has quit IRC  17:50
*** sleterrier has joined #openstack-swift  17:50
ybunker: is there any swift command that i can use to find where the partition (for example 1111) is located? i want to find the main partition and the replicas  17:51
notmyname: `swift-get-nodes -p PARTITION`  17:52
notmyname: `swift-get-nodes [-a] <ring.gz> -p partition` (from the usage string, so more complete/correct)  17:53
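For reference, a quick sketch of how that might be invoked; the ring path is the conventional /etc/swift location and partition 1111 is just ybunker's example above, so adjust both for a real deployment. The output lists the primary nodes and the handoffs for the partition.

```shell
# Show which nodes hold partition 1111 in the object ring.
swift-get-nodes /etc/swift/object.ring.gz -p 1111

# -a also lists every handoff node, not just the first few.
swift-get-nodes -a /etc/swift/object.ring.gz -p 1111
```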
*** e0ne has joined #openstack-swift  17:58
ybunker: notmyname: thanks :-)  18:02
*** e0ne has quit IRC  18:08
ybunker: got into a drive at 100% full condition, and after trying many things.. for example setting handoffs_delete and handoffs_first  18:12
ybunker: and already added (3) new nodes to the cluster  18:12
notmyname: for full clusters, the goal is to first add new drives, then get the handoffs moved as quickly as possible. the handoffs_first=true and handoffs_delete set to something like 1 or 2 are the first things to check  18:13
notmyname: then check rsync settings. make sure you've got a lot of available connections on the new drives, make sure rsync is not accepting inbound connections on the servers with full drives  18:13
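A minimal sketch of those replicator settings as they might look in the object server config on the full nodes that need to drain. Note that in Swift's sample object-server.conf the delete knob is spelled handoff_delete; the value below is illustrative, not a recommendation for every cluster.

```ini
# /etc/swift/object-server.conf on the full nodes that need to drain
[object-replicator]
# Push partitions that don't belong on this node (handoffs) before anything else.
handoffs_first = True
# Delete a handoff partition once this many remote replicas have it,
# instead of waiting for all of them; 1 or 2 frees space sooner.
handoff_delete = 2
```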
ybunker: it seems that since the full drives are at 100% used space (10G free), it does not 'remove' the replicated partitions from those drives.. and they're still at 100%  18:13
*** e0ne has joined #openstack-swift  18:14
notmyname: clayg: what are the other things we set in the emergency replication mode?  18:14
ybunker: notmyname: handoffs_* already checked and it's not freeing a byte :(  18:15
clayg: if you're 100% full you need handoffs_delete = 1  18:15
ybunker: already did that on the full nodes  18:16
notmyname: meeting time for me. gotta close irc  18:16
clayg: ok, then you should be good to go  18:16
ybunker: thanks a lot notmyname  18:16
clayg: you can increase object replicator workers & concurrency if you have iops available on the nodes that want to drain  18:16
ybunker: clayg: already did that but the disks stay at 100%  18:17
clayg: you might want to back-of-the-napkin how many streams of io you can pull/push per node  18:17
clayg: check logs for errors - something could be preventing successful transfer (e.g. rsync connection limiting)  18:17
clayg: is this replicated or EC fragments?  18:18
ybunker: clayg: and the problem is that we have a 'maintenance' window for obj-repl that runs 4hs a day.. because when it runs the latency of the cluster goes through the roof.. from 44ms... to 4s.... and it's impossible to operate at those numbers  18:18
ybunker: replicated  18:18
clayg: everybody has an iops budget - you could play with the ionice settings  18:19
clayg: but - i figure you'd want to get things moving first and then worry about making it easier to manage - you should be able to free up some bytes during your four hour window if you go full tilt  18:20
clayg: no?  18:20
ybunker: clayg: yeah i want to get rid of the 100% full condition.. the problem is that during those 4h of replication.. the full drives are still at 100%, i got 8 workers running in that window  18:21
clayg: are you doing rsync connections/modules per disk?  18:21
ybunker: per ACO  18:22
clayg: so with 8 workers your outbound streams from the node is 8x concurrency - what's your object replicator concurrency?  how many (object) disks per node?  18:23
ybunker: found the following in the error log:    object-replicator: [worker 1/8 pid=20511] @ERROR: max connections (8) reached -- try again later  18:23
ybunker: 9 disks per node  18:23
clayg: there you go!  18:23
clayg: going nowhere fast  18:24
ybunker: i just changed max connections in rsyncd to 16  18:24
clayg: that limit is set in the rsyncd.conf  18:24
clayg: is that enough?  what concurrency are you running?  18:24
clayg: 16 per node isn't even 2 connections per disk - I'd think you'd want 4-8 per disk in an emergency  18:25
clayg: of course w/o rsync modules per disk you can't guarantee even distribution...  18:25
ybunker: clayg:  http://paste.openstack.org/show/749609/  18:25
clayg: yeah!  try and move your deployment in this direction at some point -> https://github.com/openstack/swift/blob/master/etc/rsyncd.conf-sample#L25  18:28
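Roughly what the per-disk rsync modules look like; the device names below (sdb1, sdc1) are made-up examples, the exact layout is in the linked rsyncd.conf-sample, and the replicator side is the rsync_module option in object-server.conf.

```ini
# /etc/rsyncd.conf -- one module per object disk, so a single slow or full disk
# can't consume every inbound rsync connection (device names are examples)
[object_sdb1]
max connections = 4
path = /srv/node
read only = false
lock file = /var/run/object_sdb1.lock

[object_sdc1]
max connections = 4
path = /srv/node
read only = false
lock file = /var/run/object_sdc1.lock

# /etc/swift/object-server.conf -- point the replicator at the per-device modules
[object-replicator]
rsync_module = {replication_ip}::object_{device}
```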
clayg: ok, so concurrency is 8 and workers is 8 - so you have about 64 streams coming out of each full node - so you'll want the aggregate capacity of all new incoming hardware ready to receive all that  18:29
clayg: how many standing nodes do you have - and how many are you adding?  18:29
ybunker: i have 9 standing nodes (2 fully 100%) and adding (3) new ones  18:30
clayg: ok, so you should be able to really hammer those new guys - I'd recommend increasing workers to 9 so you have one worker per disk - set concurrency to 4 so you have 324 outgoing streams, then set max_connections at something like 75 or so on those nodes you want to fill up and LET HER RIP!  18:33
clayg: when is your window?  it'll be a replication party!!!  18:34
ybunker: for rsyncd something like this (names are from 4..12) => http://paste.openstack.org/show/749610/  18:34
clayg: oh, you ARE doing rsync module per disk then  18:34
ybunker: nono, i mean the changes that you told me about  18:35
clayg: how can that be tho?  you're not setting it in the object-server.conf  18:35
clayg: oh oh oh - sure that'd be nice to have for the future - makes replication a lot more closely tied to the rare commodity of spinning platters  18:36
clayg: up to you if you want to change that now or after the fire-drill  18:36
clayg: we rebalanced lots of clusters before we had rsync modules per disk ;)  18:36
ybunker: after :)  18:36
clayg: yeah so tweak your workers X concurrency and increase the rsync max_connections on the 3 new nodes like... A LOT  18:37
clayg: you should see them start to get hammered with write io, and shortly after that the nodes pushing data will be able to do some DELETEs  18:37
ybunker: so on all the nodes i'm going to set:   workers = 9, concurrency = 4, replicator_workers = 9 and max_connections on rsyncd to 76  18:37
clayg: that's fine - if it's easy for you to make a subtle change I'd recommend a heterogeneous deployment  18:38
clayg: lots of workers X concurrency on the PUSHING nodes w/ very little room for incoming rsync - then the OPPOSITE on the receiving nodes  18:38
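A sketch of that asymmetric setup, using the numbers clayg mentions above (replicator_workers = 9, concurrency = 4, max connections around 75 on the receivers); the small inbound limit shown for the pushing nodes is an assumed example value, not something from the log.

```ini
# /etc/swift/object-server.conf on the FULL (pushing) nodes
[object-replicator]
# roughly replicator_workers x concurrency outbound rsync streams per node
replicator_workers = 9
concurrency = 4

# /etc/rsyncd.conf on those same pushing nodes: keep inbound rsync small,
# since they should be sending data, not receiving it
[object]
max connections = 8

# /etc/rsyncd.conf on the NEW (receiving) nodes: open up so they can absorb the streams
[object]
max connections = 75
```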
clayg: does that make sense?  I'm not sure it's obvious what these options all really "do" exactly?  18:39
ybunker: so on the full nodes i need to decrease the max_connections? or do i increase it on all the nodes?  18:39
clayg: you can increase it on all nodes - it would be a very small improvement to leave it set to something small on the nodes that should be pushing data (just to avoid busy work)  18:40
ybunker: got it  18:40
clayg: normally "busy" work is fine, it's just... the code doesn't really understand "this is an emergency!!!  FREAK OUT!" so if you give it a bunch of breathing room it might think "oh you want me to GO FAST, can do boss!"  18:41
ybunker: makes sense :)  18:42
ybunker: hopefully with these changes it'll start to decrease a little bit :-), i'll get back tomorrow hopefully with some good news :-), thanks clayg  18:43
clayg: great!  18:43
*** e0ne has quit IRC  19:23
*** dasp has joined #openstack-swift  19:35
*** ybunker has quit IRC  19:53
*** e0ne has joined #openstack-swift  20:05
*** pcaruana has quit IRC  20:46
*** fyx has quit IRC  20:53
*** jungleboyj has quit IRC  20:54
*** clayg has quit IRC  20:56
*** jungleboyj has joined #openstack-swift  20:57
*** e0ne has quit IRC  20:58
*** gmann has quit IRC  20:59
*** beisner has quit IRC  21:00
*** e0ne has joined #openstack-swift  21:02
*** jungleboyj has quit IRC  21:04
*** e0ne has quit IRC  21:06
*** jungleboyj has joined #openstack-swift  21:07
*** fyx has joined #openstack-swift  21:07
*** gmann has joined #openstack-swift  21:07
*** beisner has joined #openstack-swift  21:08
*** clayg has joined #openstack-swift  21:14
*** ChanServ sets mode: +v clayg  21:14
*** irclogbot_0 has quit IRC  21:58
*** tonyb has quit IRC  21:58
*** irclogbot_1 has joined #openstack-swift  22:03
*** jungleboyj has quit IRC  22:07
*** gmann has quit IRC  22:07
*** jungleboyj has joined #openstack-swift  22:07
*** gmann has joined #openstack-swift  22:07
*** sleterrier_ has joined #openstack-swift  22:09
*** gmann has quit IRC  22:11
*** irclogbot_1 has quit IRC  22:11
*** gmann has joined #openstack-swift  22:11
*** kinrui has joined #openstack-swift  22:14
*** fungi has quit IRC  22:16
*** sleterrier has quit IRC  22:16
*** mathiasb has quit IRC  22:16
*** kinrui is now known as fungi  22:19
*** irclogbot_0 has joined #openstack-swift  22:21
*** rcernin has quit IRC  22:45
*** rcernin has joined #openstack-swift  22:45
notmyname: working on my swift project update talk for next week, and I'm looking for some really old info. which means I'm looking through project update talks from back in 2011 (and earlier). wow they were ugly ;-)  22:50
notmyname: eg http://d.not.mn/swift_overview_oscon2011.pdf  22:51
*** tkajinam has joined #openstack-swift  22:57
*** threestrands has joined #openstack-swift  23:04
*** rcernin has quit IRC  23:15
*** rcernin has joined #openstack-swift  23:16
*** csmart has joined #openstack-swift  23:26
mattoliverau: morning  23:48
notmyname: hello mattoliverau  23:48
mattoliverau: wow, that looks a lot different from your current update style. They look about as ugly as any of my slides.  23:52
notmyname: it's funny seeing the progression from the original all-text-and-bullets version to the stylized text to the just-big-pics to my current style of a blend of text and pics  23:54
notmyname: I also need to figure out the right way to say "hello everyone. this is my last project update to give, so you get to listen to what I want to talk about for the next 40 minutes" ;-)  23:57
notmyname: "I've given close to 20 of these, so now imma tell you what's what"  23:57
