21:00:03 <timburke> #startmeeting swift
21:00:04 <openstack> Meeting started Wed Oct 30 21:00:03 2019 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:07 <openstack> The meeting name has been set to 'swift'
21:00:13 <timburke> who's here for the swift meeting?
21:00:19 <kota_> o/
21:00:37 <mattoliverau> o/
21:00:59 <rledisez> o/
21:01:36 <timburke> agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:01:47 <timburke> #topic Shanghai
21:01:59 <timburke> it's almost here!
21:02:14 <kota_> yey
21:02:22 <timburke> in two days, i'll be on the plane! i'm excited (and freaking out a little)
21:02:35 <rledisez> H-36 before my flight :)
21:02:55 <timburke> i've been adding what events i know about to the etherpad
21:02:58 <timburke> #link https://etherpad.openstack.org/p/swift-ptg-shanghai
21:03:02 <kota_> it'll be a long flight, please be safe.
21:03:25 <timburke> in particular, i saw there was a game night, like there have been the last few PTGs
21:03:48 <timburke> they tend to be pretty fun, and a good opportunity to get to know some of the other openstackers better
21:04:08 <timburke> i also saw that cschwede is going to be there!
21:04:08 <kota_> good to know
21:04:16 <kota_> when?
21:04:22 <timburke> (at the ptg that is; i don't know about game night ;-)
21:04:57 <kota_> ah, you were talking about the past ones.
21:05:03 <kota_> got it.
21:05:18 <timburke> game night's thursday, 8:00 PM, City Center Marriott lobby
21:05:30 <kota_> ah ok. thx.
21:06:00 * kota_ should go to the etherpad link
21:07:05 <timburke> oh, there was also a flyer the foundation put together, where'd i put that...
21:07:18 <timburke> #link https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-assets-prod/summits/shanghai/Shanghai-Travel-Tips.pdf
21:07:41 <timburke> with some travel tips
21:07:50 <kota_> travel tips!
21:08:37 <timburke> "Some restrooms do not supply toilet paper.  Suggested to carry some with you." is a little disconcerting...
21:08:46 <timburke> but good to know!
21:08:51 <kota_> oic. "Download ALL apps needed on your phone / desktop (including the Summit Mobile App!)"
21:09:04 <kota_> `including the Summit App!`
21:09:11 <mattoliverau> oh wow so soon
21:09:40 <timburke> yeah, i still need to get my phone situated... laptop's prepped; phone, not yet
21:09:56 <mattoliverau> at least y'all will be close to my timezone soon (assuming you'll have irc access)
21:10:38 <kota_> strictly speaking, my timezone is closer ;-)
21:10:55 <timburke> mattoliverau, i *think* irc will be ok? ptgbot isn't going to be so useful otherwise, anyway ;-)
21:11:04 <mattoliverau> ahh good point :)
21:11:18 <mattoliverau> kota_: lol
21:11:22 <mattoliverau> that is true
21:12:29 <timburke> all right, that's all i've got for summit/ptg -- i can't wait to see kota_ clayg rledisez alecuyer and cschwede there!
21:12:37 <timburke> on to updates!
21:12:46 <timburke> #topic versioning
21:13:46 <timburke> clayg and tdasilva have got the three patches stacked up now, and they've been iterating on them
21:13:50 <mattoliverau> cschwede there too, that's awesome!
21:14:44 <timburke> i haven't been able to follow along quite as closely as before, as i've gotten a bit distracted with summit/ptg/general-travel prep
21:15:58 <mattoliverau> timburke: you do a great job of following as much as you do.
21:16:25 <mattoliverau> I guess they're not here to do an update. maybe we just link the patches and move on then?
21:16:25 <timburke> but i got the impression they've been adding tests and fixing up rough edges, with the idea that clayg will have a pretty solid picture of what's involved so we can talk about it at the ptg
21:16:35 <mattoliverau> cool
21:16:58 <mattoliverau> so what comes first, the null containers?
21:17:09 <mattoliverau> or namespace
21:17:20 <mattoliverau> whatever terminology I'm supposed to use :P
21:17:29 <timburke> yep, null namespace first -- https://review.opendev.org/#/c/682138/
21:17:48 <timburke> then a new versioning api for swift -- https://review.opendev.org/#/c/682382/
21:18:26 <timburke> and finally hooking up s3api to use the new api -- https://review.opendev.org/#/c/682382/
21:18:49 <mattoliverau> sweet
21:18:54 <timburke> i think that ought to cover it
21:19:00 <timburke> #topic lots of small files
21:19:03 <kota_> those look like big changes
21:20:28 <timburke> kota_, fortunately, almost 4k of the 5k lines in that middle patch are just new test files :-)
21:20:39 <kota_> :-)
21:20:54 <rledisez> alecuyer is not here, but I think he told me there is nothing new on losf this week
21:21:29 <timburke> rledisez, is there anything we should be trying to do or look at to be better prepared for the ptg?
21:22:18 <rledisez> I think what we have in mind right now is to stop evolving it for a while until it can be merged (fix bugs, tests, nothing new before merge)
21:22:54 <rledisez> I think alecuyer should be the one answering your question, I honestly can't answer
21:24:10 <timburke> that's ok, no worries. something of a freeze sounds reasonable; and i guess the rest of us ought to be thinking about what needs to happen next for us to feel comfortable merging it to master
21:25:04 <timburke> #topic profiling
21:25:13 <timburke> #link https://etherpad.openstack.org/p/swift-profiling
21:25:33 <timburke> rledisez, take it away :-)
21:25:38 <rledisez> thx :)
21:26:13 <rledisez> so, the full story is in the etherpad, but in short, we are CPU-bound on proxy-servers, and it does not seem right that a decent proxy-server cannot handle more than 3 or 4 Gbps of traffic
21:26:42 <rledisez> so I did some profiling, played with the conf option timburke suggested last week, and put the results there
21:27:11 <rledisez> first of all, I'm interested in whether you see any issues in the bench I did (wrong methodology etc…)
21:27:33 <rledisez> after that, I propose some ideas at the bottom to improve the situation, which i'd like to discuss
21:27:52 <rledisez> basically, object-server is fine. proxy/GET is fine, proxy/PUT is damn slow
21:28:22 <timburke> you said it's got 10Gb NICs -- are there two of them? one client-facing, one cluster-facing?
21:29:05 <rledisez> timburke: in our production yes, but for the benchmark everything was local. in production we are far from 10Gbps on either interface
21:29:18 <rledisez> i mean, it was local, for sure :)
21:30:08 <rledisez> note: I still need to bench with EC policy
21:30:37 <timburke> and are we measuring bandwidth on the client-facing traffic, or cluster-facing?
21:30:47 <timburke> (just to sanity check ;-)
21:30:48 <zaitcev> Is it possible to refuse in-kernel MD5 and try some local libraries?
21:30:58 <zaitcev> Maybe the kernel overhead is too great or something.
21:31:18 <rledisez> timburke: client facing (and I understand that on the cluster-facing side we expect N*bandwidth for a PUT)
21:32:05 <kota_> rledisez: the benchmark ran under py3 or still py2?
21:32:16 <rledisez> zaitcev: are you talking about the splice option?
21:32:41 <rledisez> kota_: I did both for some measurements; I didn't see any major difference, but mostly py2
21:32:54 <kota_> ok
21:34:08 <timburke> so with the 1MB chunk size... the client's seeing 5Gbps, so we must be generating 15Gbps on the cluster interface -- which seems about in line with the upper-bounds you were seeing in the object-server...
21:34:09 <zaitcev> No, I am saying that all of our MD5s are calculated by the kernel nowadays, right?
21:34:22 <zaitcev> Every time you invoke md5 it's a syscall
21:34:32 <zaitcev> Using AF_ALG or whatever it's called
21:34:33 <timburke> zaitcev, nope -- rledisez already pointed out that i had the wrong idea about that ;-)
21:34:59 <zaitcev> ok
21:35:57 <rledisez> timburke: In my bench I had 3 object-servers that could reach about 14Gbps, so in the best case the proxy should handle 15Gbps (because it's all localhost traffic, writing to /dev/shm)
21:36:19 <timburke> when we're writing we just use the normal python hashlib: https://github.com/openstack/swift/blob/2.23.0/swift/obj/diskfile.py#L1669
21:37:03 <rledisez> and it's quite good given that the object-server is "only" 20% slower than a simple md5sum
21:37:25 <rledisez> so I'm not expecting major improvement on object-server
21:37:52 <rledisez> (well +17% bw / -17% cpu is still something :))
21:38:43 <rledisez> just to be clear, i'm not suggesting at all to remove md5 calculation :) I just did it to get the best of proxy-server
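[For context: a rough sketch of the per-chunk hashing the object-server does while writing, per the diskfile link above -- simplified and hypothetical, not the actual Swift code; the point is that hashlib.md5 runs entirely in userspace, one update() per 64KB chunk, with no per-digest syscall:]

    import hashlib

    CHUNK_SIZE = 65536  # the default 64KB chunk size discussed here

    def write_with_etag(wsgi_input, writer, content_length):
        md5 = hashlib.md5()
        remaining = content_length
        while remaining > 0:
            chunk = wsgi_input.read(min(CHUNK_SIZE, remaining))
            if not chunk:
                break
            md5.update(chunk)   # plain userspace hashlib, no kernel crypto
            writer.write(chunk)
            remaining -= len(chunk)
        return md5.hexdigest()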
21:40:11 <timburke> eh, i know notmyname's talked about the idea of using something other than md5 before... that's not *such* a crazy idea...
21:40:46 <timburke> but yeah, i'm not sure how best to further investigate ATM
21:41:09 <mattoliverau> Makes me wonder if a simple daily or weekly perf check in zuul might be useful. Obviously to be taken with a grain of salt because of shared tenants, but it might catch major degradations.
21:41:42 <rledisez> well, with just the no-timeout/no-queue on the proxy we could already get a significant perf improvement
21:42:46 <mattoliverau> I wonder why timeouts cause such a slowdown -- is it their implementation? because I wonder why a watchdog thread works so well
21:43:08 <mattoliverau> I would've thought a timeout would be a timed thread or something somewhat similar
21:43:22 * mattoliverau has never looked under the hood though
21:43:53 <mattoliverau> is it something eventlet monkey patches? (just thinking out loud)
21:43:55 <rledisez> mattoliverau: it is quite good, except we call it for each chunk (so each piece of 64KB), so it is called thousands of times for an upload
21:44:05 <mattoliverau> ahh
21:44:08 <mattoliverau> ok
21:44:18 <rledisez> while a watchdog would be initialized once and then just a variable is updated
21:44:25 <timburke> yeah, eventlet basically schedules an event for later to raise the Timeout in the appropriate thread
21:45:19 <timburke> and i think it's also part of why the increased chunk size captures a lot of the no-timeout gain
21:45:48 <rledisez> and for the queue, well, it's the same * N, and it needs synchronisation each time (so locks etc…)
21:46:06 <rledisez> timburke: right, bigger chunk == less call to Timeout/queue
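[For context: a rough illustrative sketch of the Timeout-per-chunk vs. watchdog distinction being discussed -- the names and structure here are assumptions, not the actual proxy code:]

    import time
    import eventlet
    from eventlet import Timeout

    # current pattern: a Timeout is armed and cancelled for every chunk,
    # so eventlet's timer machinery runs thousands of times per PUT
    def send_chunk_with_timeout(conn, chunk, node_timeout):
        with Timeout(node_timeout):
            conn.send(chunk)

    # watchdog idea: one checker green thread; per chunk we only bump a timestamp
    class Watchdog(object):
        def __init__(self, threshold=10.0, interval=1.0):
            self.threshold = threshold
            self.last_activity = {}
            eventlet.spawn(self._run, interval)

        def ping(self, conn_id):
            # called once per chunk: a dict assignment, no timer scheduling
            self.last_activity[conn_id] = time.time()

        def _run(self, interval):
            while True:
                eventlet.sleep(interval)
                now = time.time()
                for conn_id, last in list(self.last_activity.items()):
                    if now - last > self.threshold:
                        pass  # flag/kill the stalled connection here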
21:46:23 <timburke> rledisez, did you happen to measure RAM consumption differences between the different chunk sizes?
21:47:26 <timburke> i wonder if we should just up the default chunk size...
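[For reference: the sizes in question are the proxy's `object_chunk_size` and `client_chunk_size` settings in the `[app:proxy-server]` section of proxy-server.conf, both defaulting to 65536 (64KB); the 1MB runs in the etherpad presumably correspond to raising those to 1048576.]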
21:47:29 <rledisez> nope, I didn't. but it's quite easy to calculate the worst-case scenario I think. the max queue size is 10 IIRC, there are N replicas, so chunk_size * N * 10 ?
21:47:58 <rledisez> per PUT
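[Plugging in the benchmark numbers -- 1MB chunks, 3 replicas, and the assumed put-queue depth of 10 -- that worst case works out to roughly:]

    chunk_size = 1024 * 1024   # 1MB, as in the benchmark run
    replicas = 3
    queue_depth = 10           # max backend put-queue size, per the IIRC above
    print(chunk_size * replicas * queue_depth // 2**20, 'MiB')  # -> 30 MiB per in-flight PUT

[With the default 64KB chunks the same formula gives under 2 MiB, so bigger chunks cost real but bounded memory per concurrent upload.]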
21:48:33 <timburke> and it was always a single PUT at a time, right?
21:49:39 <rledisez> yeah, I'm planning to do more on concurrency later to see if there is something to optimize there
21:49:47 <rledisez> right now the focus is on single-connection performance
21:51:51 <mattoliverau> rledisez: great job
21:51:52 <timburke> out of curiosity, what kinds of speeds can you get with netcat? testing locally just now, i can get ~24GB/s with dd piping straight to /dev/null, but only 1.3-1.4GB/s (so ~11Gbps) if i send it through a socket that's piping to /dev/null...
21:52:25 <rledisez> timburke: do you have the exact command so I can copy/paste?
21:53:10 <timburke> in one terminal, `nc -l 8081 > /dev/null` -- in another, `dd if=/dev/zero bs=1M count=10000 | nc localhost 8081`
21:53:43 <rledisez> 15.5 GB/s
21:53:52 <timburke> i tried twiddling bs/count to do even larger chunk sizes, but it didn't make much difference
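[The ~24GB/s baseline quoted above was plain dd with no socket in the path, i.e. something like `dd if=/dev/zero bs=1M count=10000 of=/dev/null`; the nc pair then measures the extra cost of pushing the same bytes through a local TCP socket.]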
21:54:08 <timburke> why's my laptop so slow!? boo!
21:54:28 <timburke> good to know though, to keep in mind as an upper bound :-)
21:54:56 <rledisez> I can provide you a server to work on, but you're going to have trouble at China customs ;)
21:55:16 <timburke> all right, well... i guess we'll keep thinking about it. willing to bet we'll talk about this more next week
21:55:24 <kota_> lol
21:55:33 <timburke> got just a few more minutes
21:55:38 <timburke> #topic open discussion
21:55:47 <timburke> anything else anyone would like to bring up?
21:56:29 <mattoliverau> I have a mate who might be able to convince his work to give him some upstream time. They're interested in tiering. So if it goes ahead I might point him at those stalled patches.
21:56:59 <timburke> \o/ i love new contributors!
21:57:05 <mattoliverau> if so, a discussion that should be had (maybe at the ptg) is: is it still the right design?
21:57:27 <kota_> good
21:57:31 <mattoliverau> or maybe should it use the new null namespace and hide tiering containers?
21:58:11 <timburke> excellent question
21:58:15 <mattoliverau> I had a chat with him about some of it already while giving him a Swift intro online the other day.
21:59:01 <timburke> out of curiosity, who's his employer, if you can say?
21:59:13 <mattoliverau> if you can add that to the list of discussions it would be good. It'll depend on whether he can swing it, but maybe as a Friday thing after the other null namespace discussions
21:59:25 <mattoliverau> Can't say yet
21:59:37 <timburke> 👍
21:59:54 <mattoliverau> timburke: but you might know them because they may or may not use swiftstack ;)
22:00:05 <timburke> all right, we're about at time
22:00:18 <timburke> thank you all for coming, and thank you for working on swift!
22:00:22 <timburke> #endmeeting