Wednesday, 2022-08-31

12:02 <kota> looking at https://wiki.openstack.org/wiki/Meetings/Swift . it's still pointing at 3rd Aug
17:28 <timburke> 😬 sorry -- i really need to update that!
20:59 <kota> good morning
21:09 <timburke> sorry (again) -- my lunch ran long
21:09 <timburke> #startmeeting swift
21:09 <opendevmeet> Meeting started Wed Aug 31 21:09:38 2022 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:09 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:09 <opendevmeet> The meeting name has been set to 'swift'
21:09 <timburke> who's here for the swift meeting?
21:09 <kota> o/
21:10 <mattoliver> o/
21:11 <timburke> just want to go through a few patches this week, first up
21:11 <timburke> #topic get info memcache skipping
21:11 <timburke> mattoliver, thanks again for pushing up some fixes to https://review.opendev.org/c/openstack/swift/+/850954
21:12 <mattoliver> Nps, I like it :)
21:13 <timburke> the idea there is to randomly add a memcache miss to get_account_info and get_container_info calls, similar to what we're already doing for shard ranges
21:14 <timburke> the updates looked good to me, too -- do we want to go ahead and merge it, or wait until we've run it in prod for a bit first?
21:15 <mattoliver> I guess it doesn't hurt to wait a week. But on the other hand it is disabled by default.
21:17 <timburke> we can wait -- i'll make sure there's a ticket for our ops to try it out post-upgrade, and plan on letting y'all know how it goes next week
21:17 <mattoliver> Seeing as we are carrying it in prod from this week, maybe we take advantage of that and see if it works :)
21:17 <mattoliver> Kk
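
A minimal sketch of the mechanism under review (not the actual patch): treat a configurable fraction of get_account_info/get_container_info cache lookups as misses so the cached info gets periodically refreshed from the backend. The skip_rate option name and the cached_info helper here are hypothetical.

    import random

    def cached_info(memcache, key, fetch_from_backend, skip_rate=0.0):
        # With probability skip_rate, pretend the cache lookup missed so the
        # info is re-fetched from the backend servers and re-cached.
        if memcache is not None and random.random() >= skip_rate:
            info = memcache.get(key)
            if info is not None:
                return info
        info = fetch_from_backend()
        if memcache is not None:
            memcache.set(key, info)
        return info
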
21:17 <timburke> next up
21:17 <timburke> #topic object metadata validation
21:19 <timburke> one of our recent hires at nvidia took a look at a bug we were seeing where we had a healthcheck that talked directly to object-servers to verify that we can PUT/GET/DELETE on every disk in the cluster
21:20 <timburke> unfortunately, the healthcheck would write the bare minimum to get a 201, resulting in the reconstructor blowing up if the DELETE didn't go through (or if there was a race)
21:20 <timburke> end result was a patch i'm liking
21:20 <timburke> #link https://review.opendev.org/c/openstack/swift/+/853321
21:21 <timburke> though i kind of want to go a little further and add some sanity checks for replicated policies, too, as well as using the new validation hook in the auditor
21:22 <timburke> #link https://review.opendev.org/c/openstack/swift/+/855296
21:22 <zaitcev> Interesting.
21:23 <timburke> (fwiw, the specific bug we were seeing stemmed from us not including an X-Object-Sysmeta-Ec-Etag header in the PUT part of the healthcheck -- we'd include the frag index, but not the client-facing etag)
21:25 <timburke> just wanted to call attention to them -- i don't think there's much discussion that needs to happen around them (except maybe thinking of further checks we'd like to make)
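
For context, a rough sketch of the kind of direct object-server health-check PUT being discussed, including the client-facing etag the original check omitted. Host, port, device, partition, policy index, and the body are illustrative assumptions, and a real EC fragment PUT carries more sysmeta than this.

    from hashlib import md5
    from http.client import HTTPConnection
    from time import time

    body = b'healthcheck payload'
    conn = HTTPConnection('obj1.example.com', 6200)
    conn.request('PUT', '/sdb1/1234/AUTH_healthcheck/probe/obj', body, {
        'X-Timestamp': '%.5f' % time(),
        'Content-Type': 'application/octet-stream',
        'Content-Length': str(len(body)),
        'X-Backend-Storage-Policy-Index': '1',  # assumed EC policy index
        'X-Object-Sysmeta-Ec-Frag-Index': '0',
        # the piece the buggy healthcheck left out: the client-facing etag
        'X-Object-Sysmeta-Ec-Etag': md5(body).hexdigest(),
    })
    resp = conn.getresponse()
    assert resp.status == 201, resp.status
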
21:26 <timburke> next up
21:26 <timburke> #topic ring v2
21:27 <timburke> #link https://review.opendev.org/c/openstack/swift/+/834261
21:27 <timburke> i've had a chance to play more with the patch -- found a few more rough edges, but the core of it still seems solid
21:29 <timburke> i also started putting together some benchmarking using some pretty big rings from prod (20k+ devices, part power like 20 or something like that -- 5MB or so in size)
21:29 <mattoliver> K, I haven't looked at it for a few weeks, so I'll revisit it today.
21:29 <mattoliver> Oh good idea
21:30 <timburke> long and short of it is that v1 and v2 seem to be within a few percentage points of each other, which doesn't seem too surprising given that the formats are largely related
21:31 <timburke> interestingly, i haven't seen the performance improvement i was expecting in going from v0 to v1 -- i remember that seemed to be the driving force during my format-history research
21:32 <timburke> but then, i also discovered that we didn't specify the pickle protocol version when serializing v0 rings in the new patch, so maybe there were some protocol-level improvements
21:32 <timburke> i'll dig into it a little more
21:33 <mattoliver> Nice work. Yeah, performance testing is kinda what we need on this. How well does it work?
21:33 <timburke> https://bugs.launchpad.net/swift/+bug/1031954 for the old performance bug
21:34 <mattoliver> And memory performance... i.e. we can load only what we want out of the format too.
21:35 <timburke> for sure -- though it becomes more noticeable if/when we merge the ring and builder files
21:36 <mattoliver> Yeah +1
21:36 <mattoliver> But metadata only won't include devices, which will help :)
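
As a rough illustration of the kind of benchmarking described above, timing repeated loads of the same ring serialized in different formats might look something like the sketch below. The file names are made up, and the v2 loader only exists with the patch applied.

    import timeit
    from swift.common.ring import RingData

    for path in ('object-v0.ring.gz', 'object-v1.ring.gz', 'object-v2.ring.gz'):
        # load each ring 10 times and report wall-clock time
        elapsed = timeit.timeit(lambda: RingData.load(path), number=10)
        print('%s: %.3fs for 10 loads' % (path, elapsed))

(RingData.load also has a metadata_only flag, which is where the partial-loading and memory point above comes in.)
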
21:39 <timburke> #topic sync_batches_per_revert
21:39 <timburke> i also did some quasi-benchmarks for this patch -- basically doing an A/B test in my home cluster while rebalancing a few terabytes
21:39 <timburke> #link https://review.opendev.org/c/openstack/swift/+/839649
21:39 <mattoliver> Oh, another good patch... I kinda forgot about it
21:40 <mattoliver> This also comes into its own when wanting to make progress on bad disks, right? Or am I thinking of another patch?
21:41 <timburke> yes -- but i'm definitely seeing it being a good/useful thing in healthy (but fairly full) clusters
21:42 <mattoliver> Oh nice
21:43 <timburke> so i've got three object nodes -- i drained one completely so i could simulate a large expansion. the other two nodes, i ran with handoffs_first -- one tried the old behavior of a single big rsync, while the other tried batches of 20
21:45 <timburke> then i watched the disk usage rate while they both rebalanced to the "new" guy. the single rsync per partition would see long periods where disk usage wouldn't move much, with periodic spikes that would go up to like -200, -400, -600MB/s
21:47 <timburke> the node that broke it up into batches would fairly consistently be going something like -50MB/s, occasionally jumping up to -100MB/s -- which matched the rate of rsync transfers *much* better
21:48 <timburke> i don't have log aggregation set up at home (yet), but i also got the sense that there were a lot more rsync errors with the single-rsync transfers
21:49 <mattoliver> Oh cool, interesting. Less of a list to build and compare at both ends before starting, with things that might change?
21:49 <mattoliver> Per rsync
21:50 <timburke> less that -- i've got a fairly static cluster; now and then some writes, but mostly pretty read heavy. the bigger win was that we didn't need to wait for a whole partition to transfer before we could start deleting
21:51 <mattoliver> Oh yeah, that's huge actually.
21:53 <timburke> so my proposition is that this would help in a cluster-full situation: you might be able to get to 90% full, finally get your expansion in, and not need to wait a day or two for the initially-failed rsyncs to get retried a few times to start bringing down usage on the old disks
21:54 <mattoliver> Yeah, +1, that's pretty great
21:54 <kota> +1
21:55 <timburke> bonus: the batched node finished its rebalance several hours earlier (again, likely due to the lowered likelihood of rsync errors)
21:55 <timburke> all right, sorry -- i took a bunch of time with that. it was kinda fun, though it took a bit :-)
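
A hedged example of the kind of configuration being A/B tested here; the option name is taken from the review topic above, so check the patch itself for the exact name, section, and default.

    [object-replicator]
    handoffs_first = true
    # revert handoff partitions in smaller batches rather than one big rsync
    # per partition, so old copies can start being deleted sooner (the value
    # of 20 matches the test described above)
    sync_batches_per_revert = 20

The design point from the discussion: smaller transfers mean deletions on the handoff node can begin before the whole partition has moved, and long-running rsyncs have fewer chances to fail.
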
21:55 <timburke> #topic open discussion
21:56 <timburke> what else should we talk about this week?
21:56 <zaitcev> Well, it's been almost a full hour.
21:56 <mattoliver> Sounds awesome! Let's get it in
21:58 <timburke> all right, i'll call it
21:58 <timburke> thank you all for coming, and thank you for working on swift!
21:58 <timburke> #endmeeting
21:58 <opendevmeet> Meeting ended Wed Aug 31 21:58:45 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)
21:58 <opendevmeet> Minutes:        https://meetings.opendev.org/meetings/swift/2022/swift.2022-08-31-21.09.html
21:58 <opendevmeet> Minutes (text): https://meetings.opendev.org/meetings/swift/2022/swift.2022-08-31-21.09.txt
21:58 <opendevmeet> Log:            https://meetings.opendev.org/meetings/swift/2022/swift.2022-08-31-21.09.log.html
