21:00:11 <timburke> #startmeeting swift
21:00:11 <opendevmeet> Meeting started Wed Apr 26 21:00:11 2023 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:11 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:11 <opendevmeet> The meeting name has been set to 'swift'
21:00:26 <timburke> who's here for the swift team meeting?
21:01:13 <acoles> o/
21:01:19 <kota> o/
21:02:46 <mattoliver> o/
21:03:16 <timburke> sorry that it's been so long since i actually held a meeting
21:03:37 <timburke> there were a couple items from whenever that last time was that i wanted to follow up on
21:03:47 <timburke> #topic keepalive timeouts
21:03:59 <timburke> #link https://review.opendev.org/c/openstack/swift/+/873744
21:04:19 <timburke> i finally got around to reworking that to be a plumbing-only patch!
21:04:33 <timburke> thanks zaitcev for reviewing it
21:04:54 <mattoliver> Oh cool! I'll take a look at it too then
21:06:12 <timburke> i forget whether we (nvidia) have just recently started running with that and latest eventlet, or if that's going out next week, but it's looking good so far
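(For context, a rough sketch of the plumbing being discussed, not the merged patch: it assumes an eventlet new enough that the wsgi server's `keepalive` argument accepts a numeric idle timeout rather than only True/False, and the option name here is illustrative.)

```python
import eventlet
import eventlet.wsgi

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'OK\n']

conf = {'keepalive_timeout': '15'}  # hypothetical config option
# fall back to plain keepalive behavior when no timeout is configured
keepalive = float(conf.get('keepalive_timeout', 0)) or True

sock = eventlet.listen(('127.0.0.1', 8080))
eventlet.wsgi.server(sock, app, keepalive=keepalive)
```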
21:06:30 <timburke> #topic ssync and data with offsets
21:06:41 <timburke> both patches have merged!
21:06:51 <acoles> \o/
21:07:33 <timburke> but... they introduced a flakey probe test. keep an eye out for it in gate results
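(For readers outside the meeting: "data with offsets" refers to Swift timestamps that carry an offset component. A quick illustration using Swift's real Timestamp utility; the exact rendered output is from memory, so treat it as approximate.)

```python
from swift.common.utils import Timestamp

ts = Timestamp(1619827200.0, offset=2)
print(ts.normal)    # '1619827200.00000'
# internal form appends the offset as 16 hex digits
print(ts.internal)  # '1619827200.00000_0000000000000002'
```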
21:07:50 <timburke> speaking of...
21:07:57 <timburke> #topic gate issues
21:08:30 <timburke> we recently had a busted lower-constraints job, after virtualenv dropped support for creating py2 envs
21:09:13 <timburke> that's been fixed (and i've got a follow-up to fix it better) -- but it still impacts stable branches
21:09:50 <timburke> once the follow-up lands, i'll propose some backports with the two patches squashed together
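(The actual fix isn't quoted in the log; one hedged sketch of the general shape, assuming the jobs use tox, would be to pin virtualenv below 20.22.0, the April 2023 release that dropped the ability to create Python 2 environments.)

```ini
[tox]
requires =
    virtualenv < 20.22  # 20.22.0 dropped support for creating py2 envs
```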
21:10:38 <timburke> i've also seen some flakey tests lately
21:10:48 <timburke> #link https://bugs.launchpad.net/swift/+bug/2017024
21:10:57 <mattoliver> Kk, thanks for all that work timburke
21:11:31 <timburke> was easy to reproduce locally, and not too bad to fix once i looked at some of the other tests in the file
21:11:57 <timburke> see https://review.opendev.org/c/openstack/swift/+/881142
21:12:16 <timburke> but the probe test i mentioned...
21:12:29 <timburke> #link https://bugs.launchpad.net/swift/+bug/2017021
21:13:03 <timburke> i still haven't reproduced it locally, and i haven't found any smoking guns in the gate job logs
21:14:12 <timburke> if anyone has some cycles to spare on it, i'd appreciate any insights you can figure out -- this seems to be the leading cause of rechecks the past week or two
21:14:42 <acoles> :/
21:15:06 <timburke> alternatively, we could consider removing the flakey test, but that doesn't seem great
21:15:12 <acoles> subjectively, it does seem to be causing a lot of rechecks
21:16:37 <timburke> lastly, i wanted to draw attention to a recent proxy error we've been dogpiling on
21:16:43 <timburke> #topic ec frag iter errors
21:16:47 <mattoliver> I'll keep an eye out for it, and if it happens to me I'll dig in; in the meantime I'll also add it to the bottom of my todo list and hope to get to it at some point.
21:17:49 <timburke> indianwhocodes was investigating some differences in py2/py3 proxy behaviors, and started pulling at this "generator already executing" error
21:18:02 <timburke> #link https://review.opendev.org/c/openstack/swift/+/880356
21:19:37 <timburke> (note that the error would happen under both py2 and py3, but the tracebacks got much noisier in py3, as it started adding more context about what other errors were in the process of being handled)
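(A minimal illustration of that py3 behavior: implicit exception chaining prints "During handling of the above exception, another exception occurred" and drags the original traceback along with the new one.)

```python
try:
    raise TimeoutError('backend read timed out')   # the first error
except TimeoutError:
    # cleanup hits a second error while the first is being handled;
    # py3 prints both tracebacks, chained together
    raise ValueError('generator already executing')
```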
21:21:08 <timburke> the more we thought about it, the weirder it seemed; eventually, we pieced together that one greenthread was trying to close out a generator that was currently executing (and blocked on IO) in another greenthread
21:22:12 <timburke> this has led to a few different refactorings from clayg and acoles -- i'm really optimistic about where the EC GET code will wind up
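(A minimal standalone reproduction of the race described above, not Swift code: one greenthread closes a generator while it is blocked mid-execution in another greenthread, which raises "generator already executing".)

```python
import eventlet

def frag_iter():
    yield b'frag-0'
    eventlet.sleep(1)   # stand-in for blocking on a backend read
    yield b'frag-1'

it = frag_iter()
next(it)                           # start the generator
reader = eventlet.spawn(next, it)  # resumes it in another greenthread
eventlet.sleep(0)                  # let the reader run until it blocks

try:
    it.close()  # close from here while it's still executing over there
except ValueError as err:
    print(err)  # "generator already executing"

reader.wait()
```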
21:22:25 <timburke> two questions i've got about it, though:
21:22:37 <timburke> 1. do we have an upstream bug about the error already?
21:24:10 <timburke> and 2. do we have a fix yet? i think i heard that we've got a good idea of what needs to happen, but idk whether we've got a patch that could include a "closes-bug"
21:24:48 <mattoliver> 1. not that I've seen, though I haven't looked. maybe indianwhocodes could write one? it'll be a good educational experience.
21:26:10 <timburke> maybe these will be better questions when indianwhocodes and clayg are around ;-)
21:26:28 <kota> +1
21:26:30 <timburke> i can also bring it up out-of-band
21:27:03 <mattoliver> yeah, sorry I haven't been following the work.
21:27:57 <timburke> just wanted to (1) bring it to people's attention in case they wanted to help out or better understand it, and (2) call out the good work in digging deep on a complicated part of the proxy
21:28:11 <mattoliver> +100
21:28:38 <timburke> that's all i've got
21:28:41 <timburke> #topic open discussion
21:28:50 <timburke> anything else we should bring up this week?
21:29:14 <mattoliver> I've spent a bunch of time, and there will be a bunch more, adding unit tests to tracing. You might have seen some activity
21:30:15 <mattoliver> I'm basically adding some tracing asserts, i.e. what spans should be created, to the tests of middlewares that I've actually gone in and instrumented a little (that have extra spans added)
21:30:50 <timburke> whoo! i still need to give tracing a spin
21:31:20 <mattoliver> The scope of how many and what type of unit tests could be never-ending, but I just want to do something to get the code into a more upstream-mergeable state.
21:31:34 <mattoliver> you should give it a whirl! (when you have time)
21:32:52 <mattoliver> just yesterday I was adding tests to tempurl and saw clearly that we run _get_hmac 4 times on HEADs. Then looking in the code, yup, it does (and it's supposed to), but it was obvious in the spans that were created; it was pretty cool to see
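(A generic sketch of the kind of span assertion being described, using OpenTelemetry's in-memory exporter; whether the tracing patch actually uses OpenTelemetry, and the span name here, are assumptions.)

```python
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import \
    InMemorySpanExporter

exporter = InMemorySpanExporter()
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(exporter))
tracer = provider.get_tracer(__name__)

def handle_head():
    # hypothetical instrumented code under test
    with tracer.start_as_current_span('tempurl._get_hmac'):
        pass

for _ in range(4):
    handle_head()

# assert exactly which spans the code path created
names = [s.name for s in exporter.get_finished_spans()]
assert names == ['tempurl._get_hmac'] * 4
```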
21:33:17 <timburke> i remember that it requires py3 -- but can we merge it with the caveat that you should only configure it under py3 (but everything will still run fine under py2 as long as it's *not* configured)?
21:33:44 <mattoliver> yeah, when we do configure it, there will be an impact.
21:34:05 <mattoliver> I'll double check; the concrete tracer implementations were definitely py3.
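(Not the actual patch, but the usual pattern for "merges everywhere, only configurable under py3": import the py3-only dependency lazily and fail only when tracing is enabled. The option name and library are assumptions.)

```python
try:
    from opentelemetry import trace  # assumed py3-only dependency
except ImportError:
    trace = None

def get_tracer(conf):
    if not conf.get('tracing_enabled'):
        return None  # unconfigured: harmless under py2
    if trace is None:
        raise RuntimeError('tracing is configured but requires python 3')
    return trace.get_tracer('swift')
```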
21:35:04 <opendevreview> Shreeya Deshpande proposed openstack/swift master: Error logs changed for ChunkWriteTimeout  https://review.opendev.org/c/openstack/swift/+/881648
21:36:08 <mattoliver> anyway, that's a good test, I can create a python2 venv and give it a whirl
21:37:03 <mattoliver> that's all I have
21:37:12 <timburke> oh, duh! i should just look at the zuul results!
21:37:37 <mattoliver> oh yeah! lol
21:37:44 <timburke> all right, i think i'll call it early then
21:37:54 <timburke> thank you all for coming, and thank you for working on swift!
21:37:59 <timburke> #endmeeting