21:00:25 <timburke> #startmeeting swift
21:00:26 <openstack> Meeting started Wed Sep 23 21:00:25 2020 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:29 <openstack> The meeting name has been set to 'swift'
21:00:36 <timburke> who's here for the swift meeting?
21:00:51 <kota_> hello
21:02:16 <timburke> maybe it's just you and me
21:02:26 <kota_> oh yeah
21:02:28 <timburke> agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:03:10 <timburke> main thing i wanted to mention was some follow up on the state of our gate
21:03:24 <kota_> okay
21:03:26 <clayg> o/
21:03:40 <timburke> #topic busted gates
21:04:10 <kota_> oh, it looks like a bunch of branches were broken...
21:04:18 <timburke> i *think* swift's ussuri gate is fixed now -- at any rate, i stopped seeing emails about docs failure
21:05:06 <timburke> assuming it's moving, i'll try to get some stable releases out for ussuri and train this week
21:06:10 <timburke> swift client's gate is better now! the fix landed after the deadline to branch for victoria, though, so i might need to reach out to the stable team to sort out how best to fix that one
21:06:39 <timburke> (the fix involved some requirements changes, so i worry a little that a simple backport may not be great)
21:06:45 <kota_> i see
21:07:48 <timburke> i discovered pyeclib's gate was broken after seeing p 744623
21:07:48 <patchbot> https://review.opendev.org/#/c/744623/ - pyeclib - [goal] Migrate testing to ubuntu focal (ABANDONED) - 4 patch sets
21:08:25 <timburke> p 753472 fixed it, but disabled the two jobs we had to test against tip-of-master libec
21:08:26 <patchbot> https://review.opendev.org/#/c/753472/ - pyeclib - Fix gate (MERGED) - 1 patch set
21:08:41 <clayg> focal is gunna be so great - i'm sure I'll try upgrading to it at some point
21:09:22 <kota_> clayg!
21:09:44 <clayg> kota_: i snuck in 😁
21:10:10 <timburke> at some point we should dig into how those fail, but they're both such low-volume repos that i'm fairly certain they still work well together
21:10:55 <timburke> while i was looking at pyeclib, i also pushed in p 753421 to test against py38 on focal and py36 on centos8
21:10:56 <patchbot> https://review.opendev.org/#/c/753421/ - pyeclib - Update gate jobs (MERGED) - 4 patch sets
21:11:12 <kota_> libec-pyeclib-unit said `/bin/bash: line 17: tox: command not found` :(
21:11:28 <kota_> at p 753472
21:11:28 <patchbot> https://review.opendev.org/#/c/753472/ - pyeclib - Fix gate (MERGED) - 1 patch set
21:11:49 <kota_> no, p 744623
21:11:49 <patchbot> https://review.opendev.org/#/c/744623/ - pyeclib - [goal] Migrate testing to ubuntu focal (ABANDONED) - 4 patch sets
21:12:40 <timburke> i love how snappy pyeclib's jobs are -- at 2-4 mins per job, i feel like we can add more target platforms all day long!
21:12:56 <kota_> sounds good
21:14:07 <timburke> but all of this reminded me that i should check on the state of libec's gate; will report back next week
21:14:49 <timburke> that's all i've got for the gate stuff; any questions or comments?
21:15:45 <kota_> nothing so far. thanks for your effort to keep the gate working.
21:16:31 <clayg> timburke: 👏
21:16:37 <timburke> all right, i've just got one other topic on my mind lately
21:16:46 <timburke> #topic hung proxy servers
21:17:32 <timburke> there have been two distinct issues that came up recently that are somewhat related
21:18:23 <timburke> one is https://bugs.launchpad.net/swift/+bug/1895739
21:18:24 <openstack> Launchpad bug 1895739 in OpenStack Object Storage (swift) "Proxy server sometimes deadlocks while logging client disconnect" [Undecided,In progress]
21:20:44 <timburke> the nitty-gritty is in the bug, but the summary is that while we're down in logging, garbage collection may cause us to try to grab the same (non-reentrant) lock twice in the same (green)thread
21:21:09 <timburke> the other is https://github.com/eventlet/eventlet/pull/498
21:21:59 <timburke> where eventlet sees that there's an fd ready to read, but then doesn't wake anyone up to read it
21:23:24 <timburke> good news is that the second one is already merged (and tagged!) following https://github.com/eventlet/eventlet/pull/645 -- thanks for cleaning it up clayg!
21:23:40 <clayg> tight poll loop keeps asking for the same fd, and it says it's ready - but it just keeps polling
21:24:16 <timburke> the first one has a patch at p 752593
21:24:16 <patchbot> https://review.opendev.org/#/c/752593/ - swift - Replace threading._active_limbo_lock with a re-ent... - 3 patch sets
21:24:52 <timburke> i think both of these issues can affect other services, it's just acutely bad on proxies
21:25:56 <timburke> as much as anything, i just wanted to raise awareness in case anyone else sees similar issues, and maybe see if i could get someone to look at the swift patch ;-)
21:28:04 <clayg> does lp bug #1895739 only affect py3?
21:28:05 <openstack> Launchpad bug 1895739 in OpenStack Object Storage (swift) "Proxy server sometimes deadlocks while logging client disconnect" [Undecided,In progress] https://launchpad.net/bugs/1895739
21:28:27 <timburke> i've only *observed it* on py3 -- and i'm not sure why :-(
21:28:55 <timburke> looking at py2's code, it seems like it *could* happen there, too... but again, i've not actually seen it
21:29:10 <timburke> maybe there was some change in GC algo?
21:29:33 <clayg> what kind of lock *is* _active_limbo_lock in cpython?  does eventlet patch it by default?
21:29:41 <timburke> i still haven't found a good way to reliably reproduce the problem, either :-(
21:29:51 <clayg> 😢
21:31:05 <timburke> clayg, so in cpython it's a pretty low-level lock -- uses https://docs.python.org/2/library/thread.html#thread.allocate_lock as i recall
21:32:26 <timburke> eventlet *does* patch it; it gets replaced with a Semaphore
21:32:42 <clayg> neato!
21:33:46 <timburke> which seems like a reasonable replacement given the semantics
21:35:08 <timburke> i tried to go over some of the weirdness that leads to this in the bug -- it's not really clear to me whether we're to blame, eventlet's to blame, or cpython's to blame :-/
21:35:55 <timburke> swapping out for our own reentrant lock seems like the most-reasonable approach, though, especially since it's already getting patched
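A minimal sketch of the failure mode described above, assuming plain `threading` primitives rather than eventlet's patched Semaphore or swift's PipeMutex. `try_reacquire` is a hypothetical helper for illustration, not part of swift or of the patch. It makes the second acquire non-blocking so the demo reports failure instead of hanging. In the real bug the second acquire blocks, and the thread deadlocks on itself.

```python
import threading


def try_reacquire(lock):
    """Take `lock`, then try to take it again from the same thread.

    Hypothetical helper: returns True if the second acquire succeeds.
    The second attempt is non-blocking, so a non-reentrant lock shows
    up as False here rather than as a hang.
    """
    with lock:
        got_it = lock.acquire(blocking=False)
        if got_it:
            # Balance the extra acquire before the `with` block releases.
            lock.release()
        return got_it


# Non-reentrant lock: the holder cannot take it a second time.
print(try_reacquire(threading.Lock()))
# Re-entrant lock: the same thread may acquire it again.
print(try_reacquire(threading.RLock()))
```

With a re-entrant lock, a nested acquire from the same thread (for example, one triggered by garbage collection while logging) returns immediately rather than waiting on itself.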
21:36:51 <timburke> clayg, since you've already put some effort into thinking about eventlet and our PipeMutex, mind taking a look this week?
21:37:33 <clayg> i'm sure it's fine - but without a repro it's hard to say exactly
21:38:22 <timburke> all right, that's all i've got planned
21:38:26 <timburke> #topic open discussion
21:38:34 <timburke> what else should we talk about this week?
21:38:41 <clayg> are we still stalled out on pyeclib?
21:39:43 <timburke> pyeclib's good now, afaik -- maybe you're thinking of p 738959 though?
21:39:44 <patchbot> https://review.opendev.org/#/c/738959/ - liberasurecode - Be willing to write fragments with legacy crc - 2 patch sets
21:41:09 <timburke> i still haven't circled back on it -- i'm coming around to wanting to at least treat set-to-the-empty-string the same as unset, but beyond that i'm not sure
21:44:18 <timburke> i think my main question is: which falsey values should we look for?
21:46:25 <timburke> kota_, clayg any thoughts there? keeping in mind that the check'll have to be written in C
21:47:11 <clayg> i like 0 and 1 for true and false in C
21:47:55 <kota_> clayg: agree. plus empty value seems False.
21:48:32 <clayg> anyone have any idea why making a request that uses ACLs results in the env getting copied?  p 752770
21:48:32 <patchbot> https://review.opendev.org/#/c/752770/ - swift - Log error processing manifest as ServerError - 1 patch set
21:48:38 <timburke> ok, i'll code that up this week
21:48:58 <clayg> we end up losing the storage policy index from the req.environ as well
21:57:03 <timburke> i have no idea. sorry. went looking
21:57:23 <timburke> i'll see about digging into it more on the patch, though
21:58:06 <timburke> all right, i think that'll do it
21:58:18 <timburke> thank you all for coming, and thank you for working on swift!
21:58:27 <timburke> #endmeeting