Wednesday, 2022-11-30

opendevreviewASHWIN A NAIR proposed openstack/swift master: s3api errors for unsupported headers x-delete-at, x-delete-after  https://review.opendev.org/c/openstack/swift/+/86478801:25
opendevreviewMatthew Oliver proposed openstack/swift master: WIP sharder: update own_sr stats explicitly  https://review.opendev.org/c/openstack/swift/+/85228301:42
opendevreviewMatthew Oliver proposed openstack/swift master: Sharding: No stat updates before CLEAVED state  https://review.opendev.org/c/openstack/swift/+/83781101:42
opendevreviewMatthew Oliver proposed openstack/swift master: WIP: sharding: Block moving to CLEAVED if cleaved rows < expected  https://review.opendev.org/c/openstack/swift/+/84397301:42
opendevreviewASHWIN A NAIR proposed openstack/swift master: proxy-server exception logging shows replication_ip/port  https://review.opendev.org/c/openstack/swift/+/86086602:33
opendevreviewAlistair Coles proposed openstack/swift master: swift_proxy: add memcache skip success/error stats for shard range.  https://review.opendev.org/c/openstack/swift/+/85894210:49
opendevreviewAlistair Coles proposed openstack/swift master: trivial: fix flakey account/test_auditor.py assertion  https://review.opendev.org/c/openstack/swift/+/86613811:02
zigotimburke: Thanks a lot for the investigations !11:24
zigotimburke: Should I still apply https://review.opendev.org/c/openstack/swift/+/866051 to my debian/zed branch?11:25
opendevreviewMerged openstack/swift master: trivial: fix flakey account/test_auditor.py assertion  https://review.opendev.org/c/openstack/swift/+/86613814:48
cylopezhello, openstack-swift did "illegal transfer encoding" error in swiftclient ring your bell about a glance image upload from horizone ?16:00
cylopezI got it, glance doesn't provide the size during to swiftclient, so swift backend is replying 501 not implemented16:32
opendevreviewAlistair Coles proposed openstack/swift master: sharder: update own_sr stats explicitly  https://review.opendev.org/c/openstack/swift/+/85228317:05
timburkezigo, yeah, i think that patch should square you on 3.11.017:33
opendevreviewTim Burke proposed openstack/swift master: Tolerate EOFError on input()  https://review.opendev.org/c/openstack/swift/+/86620919:59
timburke#startmeeting swift21:00
opendevmeetMeeting started Wed Nov 30 21:00:15 2022 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.21:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.21:00
opendevmeetThe meeting name has been set to 'swift'21:00
timburkewho's here for the swift meeting?21:00
mattolivero/21:00
kotao/21:01
acoleso/21:01
timburkeas usual, the agenda's on the wiki21:02
timburke#link https://wiki.openstack.org/wiki/Meetings/Swift21:02
timburkefirst up21:02
timburke#topic func test failures21:02
timburkeso we've still got those failing func tests on centos921:02
timburkeand gmann was good enough to pin our dsvm job to focal so it wouldn't start failing, too21:03
timburkewe've got a fix in hand, and it's even got a +2 (thanks, acoles!)21:03
timburke#link https://review.opendev.org/c/openstack/swift/+/86344121:04
timburkeso: are there any objections to merging it?21:04
mattoliverOk cool. I said I'd look at that, sorry, got distracted. Promise to take a look first thing today.21:04
timburkeno worries; thanks mattoliver21:05
mattoliverFrom when I looked at it, it made sense. 21:05
timburkethen separately: do we think it's worth removing our dependence on stdlib's http parsing? i have a gut feeling that it would be a fairly large endeavor21:06
timburkeand it's not clear whether we'd want to have that in-tree or push it into eventlet. i got the impression from https://github.com/eventlet/eventlet/pull/574 that temoto might be receptive to that...21:08
timburkeit doesn't necessarily mean that we'd have to own the parser, either -- something like https://github.com/python-hyper/h11/ looks somewhat appealing21:09
mattoliverThat does sound like a large undertaking. And keeping it maintained. Although we do pull more and more. I dunno, but if eventlet want to maintain it. Maybe it's worth it.21:09
mattoliverAlthough would that mean I tighter coupling with eventlet21:10
timburkeand we'd almost certainly need to revisit https://review.opendev.org/c/openstack/swift/+/427911 | Replace MIME with PUT+POST for EC and Encryption21:11
timburkeanyway, i think we can debate that more later. my biggest worry is that we already have a bunch of large undertakings in progress21:13
mattoliveryeah21:13
mattolivernot a bad idea, maybe add it to ideas or something so we can discuss it more, or have a place to write things21:13
timburkeas long as we can continue to paper over the current trouble, i think we can defer that decision21:13
timburkenext up21:14
mattoliver+121:14
timburke#topic testing under py31021:14
timburkejust a quick update that we're still blocked on requirements21:14
timburke#link https://review.opendev.org/c/openstack/requirements/+/863581/21:14
timburkei gave a nudge just before the meeting, we'll see if we can get this moving21:15
timburke#topic testing under py31121:15
timburkeregardless of what *we're* testing with, our users will continue marching forward! ;-)21:16
timburkeand by doing that, we helped discover a cpythong bug21:16
timburkezigo noticed it first21:16
timburke#link https://meetings.opendev.org/irclogs/%23openstack-swift/%23openstack-swift.2022-11-24.log.html21:16
mattoliveryup, and you found a rather large issue in 3.11 too right?21:17
timburkei eventually ran it down and wrote it up21:17
timburke#link https://github.com/python/cpython/issues/9988621:17
timburkeand it looks like they're working on a fix21:17
timburke#link https://github.com/python/cpython/pull/9990221:17
zigoFYI, I have warned Doko aka Matthias Klose (maintainer of the interpreter in Debian and Ubuntu) about it on #debian-python.21:18
zigoHopefully, we'll have a fix soon for it.21:18
timburkeone question i've still got on my mind: i found a workaround to avoid the segfault -- is it worth us applying that? or should we just wait it out for a 3.11.1?21:18
zigoHowever, it's not clear to me: do I still need a patch on the Swift side for it?21:19
timburke#link https://review.opendev.org/c/openstack/swift/+/86605121:19
zigoOh, thanks.21:19
zigoIMO, not worth merging upstream, but I can take it in downstream Debian until the interpreter is fixed.21:19
timburkezigo, if cpython fixes their issue, i don't think the swift patch is needed (though i also haven't tested with the WIP fix yet)21:19
zigoI'll build and let you know if that's an ok-ish temp fix for me.21:20
timburke👍21:20
zigoThanks again tim !21:20
acolesgreat detective work timburke 21:20
timburkethere was something surprising in my work-around patch that seemed like we might still want...21:21
timburke#link https://review.opendev.org/c/openstack/swift/+/866051/2/swift/common/db.py#a58521:22
timburkethe blanket exception handling in that context manager on master doesn't sit right with me21:22
zigoI just wrote about the above at the Debian bug entry: https://bugs.debian.org/102478421:22
timburkethanks21:23
timburkeall right, we'll plan on holding off on that patch, though i might pull that bit i highlighted out as its own patch21:24
timburke#topic ring v221:24
timburkeso based on some feedback from clayg i started playing around with using zipfiles as rings21:25
timburke#link https://review.opendev.org/c/openstack/swift/+/86506621:25
timburkeit's definitely still rough (not near as polished as the patch we talked about at the PTG), but the core ideas are all there21:26
timburkebut i think my biggest concern is over the name change21:26
timburke1. there are a bunch of places throughout the code base where we assume rings will be named *.ring.gz21:27
timburke2. dealing with automatic ring reloading seems like it'd get a little hairy21:28
mattoliverI plan to as a follow up to the zip patch as it stands, so see what the fallout of a name change would look like. 21:28
mattolivermight play with that today if there isn't a fire or distraction at work21:28
timburkethat said, there were definitely some nice things about being able to use zip/unzip on the thing -- in particular, it seems like a much better way to shove in whatever other extension data you might want to put in a ring21:29
mattoliverI assume ring reloaded, we'd just be looking at the latest mtime. 21:29
mattolivermaybe it wont be so bad21:29
mattoliverbut time and code will tell :P 21:30
timburkemattoliver, the thing that really got in my way was that we hang on to the filename we loaded and keep checking that -- being able to flop between two names got messy21:31
mattoliverhmm, yeah21:31
timburkeanyway, based on that, i pushed up a new swift-ring-packer utility21:31
timburke#link https://review.opendev.org/c/openstack/swift/+/86608221:31
timburkelets you pack/unpack/list v2 rings made with the patch that we talked about in the PTG21:32
timburkethat's all i've got21:33
timburke#topic open discussion21:33
timburkewhat else should we talk about this week?21:33
mattoliverI've made some progress on cleaning up the tracing stuff21:33
mattolivereven moved it into a better namespace inside swift21:34
mattoliverAt the end of the chain I have a docs patch with an overview_trace 21:34
zigotimburke: I can confirm your workaround fixes the Swift Debian package unit test runs (just did a build run...).21:35
timburke\o/21:35
timburkeon both fronts ;-)21:35
mattoliverThe doc is very rough, but plan to put design decisions there, because I did have to make some interesting decisions thanks to eventlet21:35
mattoliverAnd it's kinda important to know when reviewing tracing in swift21:35
mattoliver#link https://review.opendev.org/c/openstack/swift/+/86561221:36
mattoliver^ thats the end of the chain with the doc21:36
mattoliverI'm also trying to revive the shrinking chain21:37
mattoliver#link https://review.opendev.org/c/openstack/swift/+/84397321:37
mattoliver^currenlty ending there21:38
mattoliverI see acoles has worked worked on the front end of the chain again. So will check it out today21:38
mattoliveryup, he double worked :P 21:38
acolesmattoliver: I think it is in good shape, thanks for your help.21:39
mattolivernps21:39
acolesI checked test coverage and added some21:39
acolesI'll try to keep working up the chain (which I just realised I did not rebase), maybe we can get the first merged21:40
mattoliveryeah! 21:40
mattolivergood plan21:40
timburkei'll take another look, too21:41
acolesI've been working on reducing the memcache blob size for cached shard ranges https://review.opendev.org/c/openstack/swift/+/86356221:42
mattoliveroh cool21:42
acolesthere's two ideas in there: 1) inspired by timburke's patch, compact the serialised shard range data by only storing the lower & name21:43
acoles2) use a shallow tree to break up 1000's of shard ranges into leaves of 100's 21:43
timburkenice!21:44
timburkei bet i can help with that first TODO ;-)21:44
acolestimburke: oh, please do 🙏21:44
mattoliverit looks awesome, and thanks for responding to my questions inline. I'll loop back to it soon21:45
acolesone wrinkle I uncovered (well, probe tests uncovered!): the backend may return overlaps in updating shard range lists (ok overlaps, when a parent is sharding and children are created)21:45
acolesand the overlap does not play well with compacting the shard ranges to only (lower, name) :O 21:46
acolesso I ended up needing to filter out the overlaps before serialising to cache21:46
acoleseverything is fine on master because of the way the ranges sort by (upper, state, lower) but once we throw away upper and state the overlaps mess things up21:47
mattoliverahh yeah21:47
acolesso...should we change the backend to filter what it returns?21:47
timburkei remember seeing that on my patch, too -- pushed up https://review.opendev.org/c/openstack/swift/+/852221 fwiw21:47
acolesthat means we have an upgrade dance - backends must be upgraded before proxies21:48
mattoliverI guess having a cached valid path is good anyway. So long as any mistakes can eventually be fixed. 21:48
timburkei suppose the better approach might be to have proxies do the filtering themselves...21:48
acolesmattoliver: yes, all we want in cache is a single contiguous path of namespaces21:49
acolesI need to ponder more what the proxy should do in the face of gaps ... we could have the proxy stretch a neighbor to cover a gap for updates 🤷21:49
mattoliveror put the root in there21:50
timburkeyeah, that was my thought, too21:50
mattoliveror not cache and force it to go back to the container for truth, but that would cause hotspotting (but alert us to the problem quickly) :P 21:50
acolesyes. again, I think we (maybe timburke ) have dabbled with the backend filling ALL gaps with the root, but again that gets us into an upgrade wrangle21:51
timburkethough if we're only tracking (lower, name), we necessarily start stretching shards21:51
acolestimburke: we can put ANY name (including root) in a tuple with a lower to fill a gap21:51
timburketrue21:51
acolesI mean, we could even pick a random other range to spread the pain :)21:52
mattoliveroh thats true21:52
mattoliverand take load off the root21:52
acoles...and muddle the debugging.. :(21:52
mattoliverwell maybe just leave the gap, as tim says it'll get caught by the preceeding range21:53
mattoliverthen the question remains what happens if the gap is at the start21:53
acolesok, I'll work up something to handle gaps21:54
acolesat this point I'd appreciate any feedback on whether the whole endeavour seems worthwhile 21:54
timburkedefinitely seems worthwhile to me -- we've been struggling with those large cache values, for sure21:54
mattoliver+121:55
mattoliverand there only going to go up, unless we do sometihng21:55
acolesand whether the class hierarchy makes sense to others - for a while I've wanted a superclass of ShardRange that just knows about namespace bounds21:55
acolesI recently checked and we have one shard range list at > 4MB!21:56
mattoliverthat is a little nuts.21:56
timburkeone question: now that there are two lookups, what happens if the first succeeds, but the second fails? i'm guessing we just treat it as a miss and go back to the root for a fresh set of everything?21:56
mattoliverhow big is the list once you run it through your code now?21:56
acolesmattoliver: less ;-)21:57
mattoliverlol, great, ship it then :P21:57
acolestimburke: that would be the intent21:57
timburkemattoliver, on my other patch (that just reduced what we track without breaking it into a tree), i saw a 3.1MB cache value shrink to 0.9MB21:58
acolesmattoliver: seriously, each blob is reduced to some configurable max-items, but for our 4MB maybe a 10x reduction?21:58
acolesAnd then we can estimate at least a 2x reduction for simply dropping the uppers21:59
mattolivernice!21:59
acolesok, so what timburke measured is for the compaction alone, then the shallow tree gets us even smaller if needed22:00
timburkeacoles, we're always limited to just the two layers, right? seems like we'd want to target len(shard_ranges)**0.5 shards per blob22:00
timburkeall right, we're at time22:01
timburkethank you all for coming, and thank you for working on swift!22:01
timburke#endmeeting22:01
opendevmeetMeeting ended Wed Nov 30 22:01:23 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)22:01
opendevmeetMinutes:        https://meetings.opendev.org/meetings/swift/2022/swift.2022-11-30-21.00.html22:01
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/swift/2022/swift.2022-11-30-21.00.txt22:01
opendevmeetLog:            https://meetings.opendev.org/meetings/swift/2022/swift.2022-11-30-21.00.log.html22:01
acolestimburke: yeah, so mattoliver asked about deeper trees and i commented on patch that I stopped myself making the tree deeper (by recursion) because i think one level gives us lots of headroom, but I also tried to make it extensible in future22:01
timburkeah, got it22:01
acolestimburke: TBH the compaction alone might be enough (?) in many case22:02
timburkei definitely like the idea of keeping growth sub-linear22:03
acolesI actually started with a recursive arbitrary depth tree, then chanted YAGNI 100 times and deleted that code :P22:03
timburkelol22:03
acolesit hurt22:03
acolesok, good night o/22:04
timburkeo/22:04
opendevreviewTim Burke proposed openstack/swift master: WIP: restructure cached updating shard ranges  https://review.opendev.org/c/openstack/swift/+/86356222:34
opendevreviewTim Burke proposed openstack/swift master: DB locks shouldn't squelch errors  https://review.opendev.org/c/openstack/swift/+/86623923:48

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!