21:00:13 <timburke> #startmeeting swift
21:00:14 <openstack> Meeting started Wed Apr 29 21:00:13 2020 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:17 <openstack> The meeting name has been set to 'swift'
21:00:21 <timburke> who's here for the swift meeting?
21:00:24 <seongsoocho> o/
21:00:28 <kota_> hi
21:00:32 <rledisez> hi o/
21:00:49 <tdasilva> o/
21:00:57 <clayg> o/
21:01:24 <timburke> agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:01:36 <timburke> first up
21:01:42 <timburke> #topic swift-get-nodes
21:02:39 <clayg> so i haven't really looked at this - i don't see how it ever worked with paths with spaces?  Do you just like wrap it all in "quotes or something"
21:02:39 <timburke> so i realized earlier this week -- we never made it so you could find %00versions containers with swift-get-nodes!
21:02:54 <alecuyer> o/ sorry im late
21:03:18 <mattoliverau> o/
21:03:24 <timburke> clayg, yeah, with spaces you'd do something like `swift-get-nodes ring.gz 'acct/cont/obj with spaces'`
21:03:53 <clayg> oic, but that strategy won't work with the null character
21:04:02 <timburke> but you can't really do that with null bytes -- pretty sure things break down because of how args get sent to C programs
21:04:15 <timburke> which is a *bit much* to try to hack around ;-)
21:05:01 <timburke> #link https://launchpad.net/bugs/1875734
21:05:01 <openstack> Launchpad bug 1875734 in OpenStack Object Storage (swift) "swift-get-nodes cannot be used with %00versions containers" [High,In progress]
21:05:04 <clayg> yeah, ok so un-url-encoded sort of kind of worked fro some names on accident
21:05:24 <timburke> yup
21:06:05 <timburke> so i wrote a patch to have it accept percent-encoded paths, like acct/cont/obj%20with%20spaces
21:06:08 <timburke> #link https://review.opendev.org/#/c/724141/
21:06:08 <patchbot> patch 724141 - swift - swift-get-nodes: Allow users to specify either quo... - 1 patch set
21:06:15 <clayg> I think it should try and take quoted names by default - with a `--for-some-reason-i-did-not-quote-this` flag for when you want an object named `%beef` and can't be bothered to type `%25beef`
21:07:00 <clayg> I feel like for most of the cases where it was working it will still work - if you "unquote" "object with spaces" it turns into "object with spaces" doesn't it?  what's the big deal?
21:07:08 <timburke> out of an abundance of caution, i kept the old behavior and have you opt-in to quoted paths with --quoted (or just -Q)
21:07:36 <clayg> then we don't need to "deprecate" anything - we just get better (and leave an escape hatch if somehow someone was scripting this and figured out a way to make it work)
21:08:00 <clayg> timburke: I'd advocate for the opposite instead, that's why I wanted to get other opinions
21:08:02 <timburke> clayg, yeah, i was mainly worried about the %beef sort of case
21:08:33 <timburke> i'd be perfectly happy to just work the one way :D
21:08:48 <timburke> what's everybody else think?
21:09:18 <clayg> perhaps I'm being overly optimistic about most of the time unquote of unquoted names is the identity function - obviously in SOME cases that's not true
21:09:53 <rledisez> It seems reasonable to move fast as it's not an API and I doubt everybody did some scripting around that tool. Just displaying a proper warning after the default changed for sometime should be enough
21:10:02 <clayg> 100%beef is the canonical example of the case where the output would unexpectedly change
21:10:11 <timburke> fwiw, i only emit the warning when there is a difference
21:10:26 <rledisez> but the warning shoud be obvious. I can already tell I wouldn't see it because I'm never reading the first lines of the output...
21:11:09 <clayg> hahah
21:11:21 <timburke> heh, fair enough -- i probably ought to move it to the bottom we continue down this route
21:11:44 <rledisez> yeah, I think it gets more chance to be see by operators
21:12:57 <mattoliverau> we need like a cli equiv of <blink> tag :P
21:13:06 <clayg> @mattoliverau hahaha
21:13:43 <timburke> k -- i'll respin with no --quoted option (because it'll be the default behavior) and the warning moved to the bottom
21:14:15 <timburke> and just for mattoliverau, i'll probably spend more time than i should digging through ANSI escape sequence docs ;-)
21:14:22 <clayg> @timburke but what will the warning say exactly?  If they're doing the right thing and quoting paths that need it .... won't they get a WARNING: I did exactly what you expected
21:14:43 <mattoliverau> lol
21:14:43 <clayg> oh no mattoliverau
21:14:45 <timburke> hrm. good point :-/
21:15:03 <clayg> i think just put it in the release notes
21:15:59 <timburke> oh, it's for-sure going in the release notes. and i know rledisez would see it that way; do we think that's true of most operators?
21:16:03 <mattoliverau> snowman says "warning .." :P
21:16:38 <rledisez> I hope so… :)
21:17:15 <timburke> also, i can tell that i'm going to hate poking at old swift clusters after this ;-)
21:17:28 <rledisez> after that, it's just about reminding the change. so many years of passing crazy args to the tool, can't be erased in a blink ;)
21:18:13 <clayg> i feel like I mostly get these paths from loglines - there they're probably "double quoted" 🙄
21:19:12 <timburke> the only consistency is inconsistency!
21:19:23 <clayg> eventaully
21:19:26 <kota_> lol
21:20:18 <mattoliverau> lol
21:20:32 <timburke> well... i'll try something. at least the change is publicized! i *do* want something like this available in the near term, though, 'cause it's not great that there are containers and objects getting created that we can't easily find
21:20:48 <timburke> #topic S3 MPU deletes
21:21:57 <timburke> earlier this week, we had an availability issue because of request amplifications coming from deleting S3 MPUs
21:22:29 <mattoliverau> oh, opps
21:22:49 <timburke> there were a bunch of confounding factors (old-style versioning being enabled on the segments container, and everything being in an EC policy)
21:23:39 <timburke> but even setting those aside, you can have a single client DELETE kick off like 1000 subrequests to clean up the segments :-(
21:23:54 <kota_> :(
21:24:33 <timburke> if the client has a short-ish timeout (after all, AWS responds to deletes real quick), they'll likely retry and make the problem worse
21:25:25 <timburke> i don't really have any action items for this -- mostly just highlighting the problem so people know about it
21:25:33 <rledisez> i guess it can happen too with bulk delete?
21:26:53 <timburke> yeah, though bulk delete has the advantage of being able to dribble out bytes to keep the connection alive
21:28:02 <rledisez> right, so it cannot happen by accident (but still it can be intentional). ratelimit middleware could be a protection for that maybe. I don't remember exactly how it ratelimit (sleeping or returning a 4xx response?)
21:28:10 <tdasilva> async_delete ?
21:28:32 <timburke> rledisez, yeah, i was just thinking about ratelimiting...
21:28:34 <clayg> yeah, it's gunna have to be something like that
21:28:56 <timburke> tdasilva, the async delete stuff is definitely on my mind :-) i think long-term, we want a
21:29:13 <timburke> new large object type that behaves much more like MPUs
21:29:37 <timburke> and async-deletes will likely be part of that solution
21:30:15 <timburke> #topic PTG
21:30:27 <timburke> one more reminder about the call for topics
21:30:29 <timburke> #link https://etherpad.openstack.org/p/swift-ptg-victoria
21:31:33 <timburke> i think i'm also supposed to sign up for some timeslots or something -- i really need to make sure i figure out the plan for this "virtual PTG" thing this coming week
21:31:38 <timburke> we're only like a month away!
21:31:47 <clayg> wow that's nuts
21:32:09 <timburke> ok, on to ongoing-work!
21:32:18 <timburke> #topic lots of small files
21:32:46 <timburke> alecuyer, i saw a patch for an updated key format
21:33:02 <alecuyer> Yes, I pushed a WIP patch for that here : https://review.opendev.org/#/c/723609/
21:33:03 <patchbot> patch 723609 - swift (feature/losf) - New key format for objects in the index-server - 1 patch set
21:33:30 <alecuyer> So, still need to work on it, but I'm happy that I'm removing more code than adding new code. It makes listdir() functions simpler
21:33:37 <timburke> \o/
21:33:43 <alecuyer> and I guess that's about it unless you have questions :)
21:33:59 <timburke> i'll try to take a look at it soon :)
21:34:35 <timburke> i should also revisit my attempt to fix the losf gate job... never enough time lately
21:35:09 <timburke> #topic swiftclient socket leak
21:35:54 <timburke> so a week or two ago i saw a message on the ML
21:35:56 <timburke> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014221.html
21:36:13 <timburke> about swiftclient 3.9.0 on py2 leaking sockets
21:36:30 <timburke> #link https://bugs.launchpad.net/python-swiftclient/+bug/1873435
21:36:30 <openstack> Launchpad bug 1873435 in python-swiftclient "Established connection is never closed on Python 2.7" [High,In progress] - Assigned to Tim Burke (1-tim-z)
21:36:36 <clayg> i was looking at that bug today - the patch was pretty invasive
21:37:59 <timburke> well, part of it was that much of p 674320 probably should have been done like that from the get-go
21:38:00 <patchbot> https://review.opendev.org/#/c/674320/ - python-swiftclient - Cleanup session on delete (MERGED) - 2 patch sets
21:38:44 <timburke> clayg, would you prefer that p 721051 was more limited, just adding the `if not six.PY2:`?
21:38:44 <patchbot> https://review.opendev.org/#/c/721051/ - python-swiftclient - Only add __del__ to HTTPConnection shim on py3 - 2 patch sets
21:39:59 <clayg> no, i mean not if that won't fix the bug
21:40:44 <timburke> i think that'd probably be enough? not entirely sure, honestly. but it'd get us back to the prior behavior on py2, at least
21:41:17 <timburke> i could have the explcit closing as a follow-up that we separately debate -- that'd be fine
21:41:53 <zaitcev> Did anyone manage to reproduce the leak outside of that weird test environment?
21:42:14 <zaitcev> I mean, the original leak that Schulz patched with __del__.
21:43:07 <timburke> zaitcev, i remember being able to get that ResourceWarning just fine when i was reviewing it
21:43:12 <zaitcev> I see.
21:44:21 <timburke> might only get emitted on some versions of py3... i forget when they added that
21:44:48 <zaitcev> You know
21:45:17 <zaitcev> I was reading keystoneclient a bit in the past week. And it has client.sess=sess and sess.client=client
21:45:33 <zaitcev> Not always, but sometimes, depending what options are supplied.
21:46:24 <zaitcev> Using __del__ in that situation is pure madness. I only reviewed our own stuff, so I saw no loops and no excuse to -1 on that patch.
21:47:52 <timburke> hmm... i thought the gc was pretty good about cycle detection these days, though? i'd just assumed there was some sort of situation that caused it to not get run *shrug*
21:49:22 <timburke> anyway, something to be aware of -- i think we ought to decide on a fix and backport it to ussuri
21:49:57 <timburke> #topic s3api + versioning
21:50:44 <timburke> i've seen client trying to request specific versions of object on buckets that have never had versioning enabled
21:51:10 <kota_> interecting
21:51:16 <kota_> interesting
21:51:17 <timburke> i *think* it came down to an AWS version id tagging along during a migration? not entirely sure
21:51:48 <timburke> but s3api didn't handle it well -- object_versioning responds with a 400, then s3api responds 500
21:51:57 <timburke> but https://review.opendev.org/#/c/722552/ should fix it!
21:51:58 <patchbot> patch 722552 - swift - s3api: Check whether versioning is enabled more - 3 patch sets
21:52:13 <timburke> #topic versioning + container acls
21:52:55 <timburke> i've also got a customer that turned on versioning for a container, then was sad that container acls basically stopped working
21:53:40 <timburke> existing data was fine, but once they wrote new data (new name or an overwrite), reads that previously would have been successful all 403
21:54:15 <timburke> clayg mentioned it a bit in -swift before the meeting, but i think the approach i settled on in https://review.opendev.org/#/c/724393/ is reasonable?
21:54:15 <patchbot> patch 724393 - swift - versioning: Have versioning symlinks make pre-auth... - 1 patch set
21:55:09 <timburke> basically, since the user needed to be authorized to read the symlink object, we should grant the same authorization to following the link to the versions container
21:55:15 <clayg> timburke: is there a significant difference between the wsgi utils preauth request helper and installing a noop lambda auth callback?
21:55:15 <zaitcev> I can't find it now, but I seem to recall that there's some semantic thing that prevents interpreter to invoke __del__ on looped object. Or, rather, it will even try to free anything if one of the objects in the loop has __del__. But I don't remember and I can't find it.
21:55:56 <timburke> zaitcev, oh, interesting. so the __del__ may not be a good idea on py3, either...
21:55:57 <clayg> zaitcev: that sounds familiar to me!  I always treated __del__ as dark magic!
21:56:31 <zaitcev> I think I read about it in Bezeley's book.
21:56:39 <timburke> clayg, i forget -- i'll look into it. i think there might be some other header-scrubbing that goes on, something like that
21:57:01 <zaitcev> Anyway, sorry. We're on ACLs now.
21:57:35 <timburke> sorry, running low on time :-)
21:59:15 <timburke> does that seem reasonable? it's only in the get path, and this still lets us define another way of saying "hey, allow these users to list and read old versions"
21:59:55 <clayg> @timburke what were you saying about "popping" off that stack?
22:00:18 <clayg> @timburke i'm sure it's reasonable but I need to look at it more - it's on my list for tomorrow
22:01:26 <timburke> well, i kinda push some state (the allow-reserve-names header and the always-authed callback), then immediately after the app call i restore the previous state (so if the version was a symlink to some *other* container, the new request still needs to be authorized)
22:02:07 <timburke> i started out just inserting the no-op callback where we used to add the header, but realized that'd be open to abuse
22:02:23 <timburke> anyway, thanks for thinking about it!
22:02:41 <timburke> thank you all for coming, and thank you for working on swift!
22:02:54 <timburke> #endmeeting