21:00:02 <timburke> #startmeeting swift
21:00:02 <opendevmeet> Meeting started Wed May  1 21:00:02 2024 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:02 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:02 <opendevmeet> The meeting name has been set to 'swift'
21:00:11 <timburke> who's here for the swift team meeting?
21:00:50 <mattoliver> o/
21:01:13 <timburke> huzzah! i was worried i'd be left talking to myself ;-)
21:01:25 <mattoliver> not this time :)
21:01:41 <timburke> i tried to do a better job of prepping this week
21:01:57 <timburke> so the agenda's pretty full at
21:02:00 <timburke> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:06 <timburke> first up
21:02:14 <timburke> #topic utils refactor
21:02:21 <timburke> #link https://review.opendev.org/c/openstack/swift/+/914029
21:02:22 <patch-bot> patch 914029 - swift - Refactor utils - 20 patch sets
21:02:52 <timburke> clayg, acoles, and i all like where this has landed
21:03:36 <timburke> unfortunately it looks like there was a probe test failure in the gate (test_reconciler_move_object_twice), so it'll need a recheck
21:03:38 <mattoliver> yeah, I love the idea of further refactor, utils is getting big.. but not looking forward to the rebase fallout, esp in tracing :P
21:03:52 <timburke> but it'll be coming in the next day or so
21:04:18 <timburke> and yeah, expect a decent number of merge conflicts to fall out of it (sorry in advance)
21:04:47 <mattoliver> kk
21:05:13 <timburke> i'll try to get a merge down to feature/mpu up asap once it's landed so acoles can have a ready-to-go branch in his morning
21:05:33 <timburke> #topic probe test timeouts
21:05:35 <mattoliver> oh yeah great idea
21:06:02 <timburke> while i was reviewing that patch, i noticed that we get a fair bit of probe test timeouts
21:06:11 <timburke> not a *ton*, but more than i'd like
21:07:10 <timburke> some of them more or less make sense -- a patchset breaks every probe test, then the retry-failed-tests logic kicks in and retries them *all*...
21:07:19 <timburke> yeah, that's reasonably likely to cause a timeout
21:07:45 <timburke> others seem to just hang, though, and that's more worrying
21:08:15 <timburke> #link https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_aeb/913949/3/check/swift-probetests-centos-9-stream/aebbd31/job-output.txt
21:08:55 <timburke> for example, gets 8% of the way through the tests, then hangs until the timeout pops 1h51m later
21:09:21 <mattoliver> wow
21:09:23 <timburke> the test that hangs isn't consistent, fwiw
21:09:26 <timburke> #link https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_67c/909800/7/check/swift-probetests-centos-9-stream/67cfe7b/job-output.txt
21:09:39 <timburke> #link https://9b6014e80e764b848f3d-c29773bdeee4530a738751d9e026e2a7.ssl.cf1.rackcdn.com/874806/23/check/swift-probetests-centos-9-stream/ddc315e/job-output.txt
21:09:53 <mattoliver> been able to reproduce when running probe tests locally?
21:10:08 <timburke> nope -- so honestly i'm not quite sure how to debug it
21:10:32 <timburke> but i figured i'd bring it up in case anyone else had ideas
21:11:01 <timburke> i should probably write up a bug about it, and try to track job failures more closely
21:11:42 <timburke> if anyone else wants to take a look, i found this helpful
21:11:45 <timburke> #link https://zuul.opendev.org/t/openstack/builds?job_name=swift-probetests-centos-9-stream&job_name=swift-probetests-centos-8-stream&project=openstack%2Fswift&result=TIMED_OUT&skip=0&limit=100
21:11:45 <mattoliver> yeah bug might be a good start. I'll run some probe tests locally in the meantime and see what happens
21:12:04 <mattoliver> oh nice
21:12:55 <timburke> it does seem like things got worse around March -- prior to that, it was mostly ~1/month
21:13:28 <timburke> but of course, the older runs don't still have logs attached to verify the hang
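For anyone who wants to vary the filter, the Zuul builds link shared above can be rebuilt from its query parameters. This is a minimal sketch: the helper name `timed_out_builds_url` is made up for illustration, and the parameter names are simply the ones visible in that link.

```python
from urllib.parse import urlencode

# Base of the builds page linked in the discussion above.
ZUUL_BUILDS = "https://zuul.opendev.org/t/openstack/builds"


def timed_out_builds_url(job_names, project="openstack/swift", limit=100):
    """Build a Zuul builds-page URL filtered to TIMED_OUT results.

    Hypothetical helper; the query keys mirror the shared link.
    """
    # job_name may repeat, so use a list of pairs rather than a dict.
    params = [("job_name", name) for name in job_names]
    params += [
        ("project", project),
        ("result", "TIMED_OUT"),
        ("skip", 0),
        ("limit", limit),
    ]
    return ZUUL_BUILDS + "?" + urlencode(params)


url = timed_out_builds_url([
    "swift-probetests-centos-9-stream",
    "swift-probetests-centos-8-stream",
])
print(url)
```

Adding another probe test job to the list, or raising `limit`, widens the view without hand-editing the query string.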
21:13:54 <timburke> next up
21:14:03 <timburke> #topic liberasurecode release
21:14:12 <timburke> it's been like a couple years!
21:14:24 <timburke> so i put together authors/changelog
21:14:31 <timburke> #link https://review.opendev.org/c/openstack/liberasurecode/+/917784
21:14:32 <patch-bot> patch 917784 - liberasurecode - Release 1.6.4 - 1 patch set
21:15:06 <mattoliver> yeah probably due for a release :P
21:15:30 <timburke> there's nothing too major -- there's a bounds-check that callers might appreciate, but otherwise it's mostly code cleanup and build fixes
21:15:47 <mattoliver> kk, will review it today
21:15:49 <timburke> probably half the reason is just to make sure i remember how to do one of these ;-)
21:15:51 <timburke> thanks
21:16:16 <timburke> speaking of ec...
21:16:30 <timburke> #topic manylinux wheels for pyeclib
21:17:03 <timburke> so i've been playing with this for a bit, and created a Dockerfile to help build these a while back
21:17:09 <timburke> #link https://review.opendev.org/c/openstack/pyeclib/+/817498
21:17:09 <patch-bot> patch 817498 - pyeclib - Add Dockerfile to build manylinux wheels - 11 patch sets
21:17:31 <timburke> but i finally got around to trying to get them building in CI!
21:17:37 <timburke> #link https://review.opendev.org/c/openstack/pyeclib/+/917857
21:17:37 <patch-bot> patch 917857 - pyeclib - Add job to build wheels - 5 patch sets
21:17:45 <mattoliver> oh yeah, I remember you playing with this
21:17:57 <mattoliver> nice
21:18:24 <timburke> it even has them showing up as artifacts on the zuul build page: https://zuul.opendev.org/t/openstack/build/a8e195bfe57b4d2c928d1a52a0523e4e/artifacts
21:19:36 <timburke> next up i want to beg some help from someone who knows zuul and the release process better than me to figure out how to actually build & upload that when we tag a release
21:20:14 <mattoliver> you might have to visit infra for that
21:20:35 <timburke> i also realize it might be nice to provide a little more context on manylinux wheels and why i want this
21:20:54 <mattoliver> true
21:22:28 <timburke> so any of us can build a binary wheel already -- setup.py bdist_wheel and away you go
21:23:27 <timburke> but that would create a wheel tied to your specific version of system libraries (including not just glibc but also liberasurecode)
21:24:22 <timburke> meaning that you couldn't just publish it and expect other people to be able to use it. pypi will actually reject such a wheel if you even try
21:26:36 <timburke> manylinux wheels are designed so you *can* distribute them, because they target a really old version of glibc and glibc won't break backwards compat
21:27:15 <mattoliver> oh ok, making a lot more sense now
21:28:31 <timburke> that actually only solves half the problem, though -- great, glibc's OK, and we can probably expect other people to have *some* version of that installed
21:28:45 <timburke> but what about liberasurecode? or isa-l?
21:30:01 <timburke> there's a way to have those baked into the wheel, too! and since *those* will only depend on some widely-installed libraries, now you've got a wheel that can actually be used in a lot of places
21:31:12 <timburke> *and* you don't need a C build chain to install pyeclib
21:31:37 <mattoliver> oh wow, ok. I never considered putting more into a wheel. I guess why not. The point is to save compiling etc.
21:31:52 <timburke> my end goal is to be able to run `pip install swift` on a pretty bare-bones system and have it Just Work
21:32:50 <timburke> at least now you can say `pip isntall https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_a8e/917857/5/check/pyeclib-build-wheels/a8e195b/artifacts/pyeclib-1.6.1-cp35-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl swift` and i *think* that'll work ;-)
21:33:02 <timburke> (until the build results expire)
21:33:31 <timburke> (and assuming you fix my isntall typo :P)
21:33:40 <mattoliver> that would be cool. I actually did just that yesterday (pip install swift) and then needed to get python.py and a compiler installed. So maybe good timing for this discussion :)
21:33:52 <mattoliver> *python.h
21:35:03 <timburke> there's more stuff that could be done (aarch64 wheels, musl wheels) but this seemed like a pretty good starting point
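The wheel filename in the pip command above encodes the compatibility story being described. Below is a minimal, stdlib-only sketch that splits it into its tags; `parse_wheel_name` and `glibc_floor` are hypothetical helpers, not part of pip or pyeclib. It assumes the standard wheel naming layout (name-version[-build]-python-abi-platform) and the `manylinux_X_Y` convention, where X.Y is the oldest glibc the wheel targets. Legacy aliases such as `manylinux1` carry no version in the tag, so the helper returns None for them.

```python
import re

# Filename taken from the pip command in the discussion above.
WHEEL = ("pyeclib-1.6.1-cp35-abi3-manylinux_2_5_x86_64.manylinux1_x86_64."
         "manylinux_2_12_x86_64.manylinux2010_x86_64.whl")


def parse_wheel_name(filename):
    """Split a wheel filename into its name, version and tags."""
    stem = filename[:-len(".whl")]
    parts = stem.split("-")
    # The last three fields are always python tag, abi tag, platform tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        # Several platform tags may be compressed together with dots.
        "platforms": platform_tag.split("."),
    }


def glibc_floor(platform):
    """Return (major, minor) of the glibc a manylinux_X_Y tag targets."""
    match = re.match(r"manylinux_(\d+)_(\d+)_", platform)
    return (int(match.group(1)), int(match.group(2))) if match else None


info = parse_wheel_name(WHEEL)
for platform in info["platforms"]:
    print(platform, glibc_floor(platform))
```

The `abi3` tag is what lets one wheel built against CPython 3.5 serve later CPython 3 releases as well.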
21:35:16 <timburke> next up
21:35:24 <timburke> #topic expirer work
21:35:28 <mattoliver> +1
21:35:42 <timburke> there are a few patches we've been looking at lately
21:36:24 <timburke> one adds some more info to the expirer queue entries -- specifically, the content-length of items that are marked to expire
21:36:28 <timburke> #link https://review.opendev.org/c/openstack/swift/+/912496
21:36:28 <patch-bot> patch 912496 - swift - add bytes of expiring objects to queue entry - 13 patch sets
21:38:12 <timburke> the other body of work is trying to deal with the large number of expirers and large number of queue entries we've got in prod -- every object node is participating, and that can result in a lot of account/container db load when they all restart
21:39:10 <timburke> the fact that we've got a bunch of deferred work in the queue that should be skipped for now just adds to the frustration
21:39:27 <timburke> so clayg has a couple patches
21:39:29 <timburke> #link https://review.opendev.org/c/openstack/swift/+/914713
21:39:30 <patch-bot> patch 914713 - swift - expirer: new options to control task iteration - 14 patch sets
21:39:35 <timburke> #link https://review.opendev.org/c/openstack/swift/+/916026
21:39:35 <patch-bot> patch 916026 - swift - distributed parallel task container iteration - 6 patch sets
21:40:14 <timburke> they were stacked previously, but that second one hasn't been updated in a little bit
21:40:40 <timburke> fwiw, though, i wonder how much we'd need the first one if we had the second one already
21:41:35 <mattoliver> would finally moving to the new task queue (that divides up the queues amongst all the partitions (or whatever)) making it more distributed, be an option?
21:42:21 <mattoliver> I haven't really looked into these patches yet. I'll try and get to that to get a better understanding
21:43:36 <timburke> potentially? p 517389 hasn't seen real activity since 2019, though, and we'll still need to deal with the 1B+ queue entries in the old layout
21:43:36 <patch-bot> https://review.opendev.org/c/openstack/swift/+/517389 - swift - Add object-expirer new mode to execute tasks from ... - 47 patch sets
21:44:50 <timburke> next up...
21:45:02 <mattoliver> oh yeah, just thinking out loud
21:45:07 <timburke> #topic py2/py3 behavior difference in brokers
21:45:30 <timburke> acoles and i noticed a funny thing while reviewing a patch on feature/mpu
21:46:13 <timburke> when we bulk-load all the rows from the pending file into a db, py2 shuffles the rows!
21:46:14 <mattoliver> yeah, I've noticed this. And skipped on py2 tests because the row insert order isn't known between the 2
21:46:22 <timburke> this was a bit of a surprise to both of us
21:47:01 <timburke> oh! which test, do you remember? i want to fix it so py2 behaves like py3
21:47:35 <mattoliver> isn't py2's dict not strictly ordered? maybe it's used as a datatype down in the sqlite module or something
21:48:00 <timburke> it comes down to dict iteration order -- i think we just need to use an OrderedDict around https://github.com/openstack/swift/blob/2.33.0/swift/container/backend.py#L1365
21:48:30 <mattoliver> I'll have to find it.. it was a while ago
21:48:42 <timburke> and maybe https://github.com/openstack/swift/blob/2.33.0/swift/container/backend.py#L341
21:49:21 <mattoliver> it was where I was working on brokers. maybe in the shard-range sync point patch, or maybe something that's landed. I'll have to go digging. I'll ping you when I find it.
21:49:27 <timburke> that'd be great if you can. i might be able to find it on my own, too, now that i know it's somewhere out there
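A minimal sketch of the OrderedDict idea discussed above. It does not reproduce Swift's broker code: `merge_pending`, the table layout, and the sample records are all made up for illustration. It assumes pending rows are de-duplicated by name in a dict before a bulk insert, so the dict's iteration order becomes the row order. A plain dict on py2 iterates in arbitrary order, while an OrderedDict keeps arrival order on both py2 and py3.

```python
import sqlite3
from collections import OrderedDict


def merge_pending(conn, records):
    """Bulk-insert records in arrival order, one row per name.

    Hypothetical helper, not Swift's actual merge logic.
    """
    to_add = OrderedDict()
    for rec in records:
        current = to_add.get(rec["name"])
        # Newest timestamp wins; a replaced key keeps its original position.
        if current is None or rec["created_at"] > current["created_at"]:
            to_add[rec["name"]] = rec
    conn.executemany(
        "INSERT INTO object (name, created_at) VALUES (?, ?)",
        [(r["name"], r["created_at"]) for r in to_add.values()])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE object ("
             "id INTEGER PRIMARY KEY AUTOINCREMENT, "
             "name TEXT, created_at TEXT)")

pending = [
    {"name": "c", "created_at": "0000000001.00000"},
    {"name": "a", "created_at": "0000000002.00000"},
    {"name": "b", "created_at": "0000000003.00000"},
    {"name": "a", "created_at": "0000000004.00000"},
]
merge_pending(conn, pending)

rows = conn.execute(
    "SELECT name, created_at FROM object ORDER BY id").fetchall()
print(rows)
```

With a deterministic insert order, tests that assert on row order would no longer need to be skipped on py2.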
21:49:38 <timburke> last up
21:49:48 <timburke> #topic unreleased swiftclient bug
21:50:29 <timburke> there are a couple bugs caused by a recent-ish swiftclient patch, but Yan's got a fix up for them!
21:50:32 <timburke> #link https://review.opendev.org/c/openstack/python-swiftclient/+/916135
21:50:32 <patch-bot> patch 916135 - python-swiftclient - Fix swiftclient output regression - 5 patch sets
21:50:41 <mattoliver> oh nice
21:51:02 <timburke> we probably want to get that reviewed & merged fairly soon
21:51:19 <mattoliver> kk, I'll put it on my list
21:51:31 <timburke> all right, that's all i've got
21:51:35 <timburke> #topic open discussion
21:51:42 <timburke> anything else we want to bring up?
21:52:22 <mattoliver> We do have some students from a university in Qatar who want to work on swift as a project at Uni, their teacher/lecturer has reached out.
21:52:46 <mattoliver> I was trying to think of some swift related project for them to work on.
21:53:18 <timburke> oh yeah, i think i saw you forwarded something to me... sorry, i'm bad at keeping up with outreach
21:53:19 <mattoliver> So any thoughts would be greatly appreciated. Not sure on the size or complexity though.
21:53:47 <mattoliver> Looking at our old ideas page maybe one of these?
21:54:00 <mattoliver> account quotas for number of files
21:54:12 <mattoliver> #link https://wiki.openstack.org/wiki/Swift/ideas/account-quota-files
21:54:49 <mattoliver> task queue (though maybe too complex)
21:55:04 <mattoliver> probably same with tiering.
21:55:20 <mattoliver> we could try and give them pipeline automation
21:56:09 <mattoliver> I think the reconciler and sharder daemons need better scaling (ie added concurrency with workers etc).
21:56:23 <timburke> oh yeah, i should revisit p 635040 ...
21:56:23 <patch-bot> https://review.opendev.org/c/openstack/swift/+/635040 - swift - Include some pipeline validation during proxy-serv... - 5 patch sets
21:56:40 <mattoliver> Or maybe just something interesting, like an audit-watcher or custom middleware.
21:57:40 <timburke> i'll have a think on it
21:58:08 <mattoliver> Thanks, me too. And jianjian too now that he's joined the room :P
21:58:37 <mattoliver> I think we're basically out of time.. so that'll do from me :)
21:59:53 <timburke> all right. thank you for coming, and thank you for working on swift!
21:59:57 <timburke> #endmeeting