21:00:03 <timburke_> #startmeeting swift
21:00:04 <openstack> Meeting started Wed Feb  3 21:00:03 2021 UTC and is due to finish in 60 minutes.  The chair is timburke_. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:07 <openstack> The meeting name has been set to 'swift'
21:00:14 <timburke_> who's here for the swift meeting?
21:00:19 <mattoliverau> o/
21:00:23 <seongsoocho> o/
21:00:37 <kota_> o/
21:00:41 <rledisez> o/
21:01:04 <clayg> o/
21:01:26 <acoles> o/
21:01:37 <timburke_> as usual, the agenda's at
21:01:39 <timburke_> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:01:51 <timburke_> first up
21:01:59 <timburke_> #topic vagrant swift all in one
21:02:10 <timburke_> #link https://github.com/swiftstack/vagrant-swift-all-in-one
21:02:54 <timburke_> i know clayg and i have been pretty heavy users of this tooling to set up our dev environments
21:03:34 <clayg> 🍴 IT WHILE IT'S 🔥
21:03:45 <timburke_> but in the near future, it's likely to go away (as part of some long-standing cleanup necessary to get rid of the old swiftstack github account)
21:03:47 <rledisez> me and alecuyer (and some other colleagues) used it a lot too. very useful
21:04:43 <clayg> timburke_: update as of today - there's a non-zero chance it will just end up living at github.com/nvidia/vagrant-swift-all-in-one 🤷‍♂️
21:04:54 <clayg> everyone here that uses it should still fork it now just in case
21:05:39 <timburke_> i do wonder if it'd be good for us to have something like it moved in-tree...
21:06:03 <clayg> virtualbox yes, vagrant maybe, chef NO (like HELL NO)
21:06:03 <timburke_> could probably re-use a lot of the ansible playbooks we've already got for setting up probe tests
21:06:11 <clayg> timburke_: SOLD
21:06:44 <rledisez> it's so handy that i'm wondering if there are any swift devs that don't use it :D
21:08:10 <seongsoocho> I really love this tool :-)
21:08:20 <clayg> rledisez: apparently mattoliverau has "better things" that he keeps to himself
21:08:21 <timburke_> it's the sort of thing i'm always a bit torn about -- it's way easier to stand up an environment, but there's definite value in having a range of setups that we each use. vsaio's definitely got some opinions baked in
21:08:41 <mattoliverau> lol, I don't know about better :P
21:09:31 <mattoliverau> I just have a dodgy bash script I wrote back when I needed to deploy a bunch on rackspace cloud that sets up SAIO the way I want it (and am now used to). But it doesn't do all the things vsaio does. (like s3api).
21:09:58 <timburke_> fwiw, i remember notmyname linking https://gist.github.com/notmyname/40b8131963346676dd18817aeb5ef799 a while back if anyone wanted to go virtualbox with no vagrant ;-)
21:10:13 <zaitcev> I always follow our in-tree manual to set up SAIO.
21:10:42 <clayg> zaitcev: you are a hero 😍
21:11:17 <zaitcev> https://knowyourmeme.com/memes/no-take-only-throw
21:11:18 <timburke_> ๐Ÿค” i wonder if we could generate some bits of the manual based on in-tree playbooks....
21:11:47 <zaitcev> "IT dog, how do you automate?"  "No automate" "Only type"
21:12:01 <clayg> timburke_: the super weird thing about vsaio is the configuration - like it's opinionated, but supports *some* options (which in a few cases are like... flip this ONE config option to false 🤨)
21:12:29 <timburke_> py3? yes/no
21:13:06 <clayg> right, then there's other options that are super useful - maybe when we port to in-tree ansible we can square that up to something more sensible
21:13:19 <timburke_> (i *still* wish i had good tooling for working with mixed py2/py3 development)
21:13:29 <clayg> it'd be nice to have a change that requires some crazy config (servers-per-port) and you can include the saio stuff for reviewers to check it out!
21:15:00 <mattoliverau> my dodgy one is: https://github.com/matthewoliver/simple_saio but ignore the ubuntu in the readme. I'm not sure what it really supports. I'm mainly using centos SAIOs atm and Opensuse probably works, well it did a few months ago.
21:15:00 <acoles> IMHO a common dev and CI ansible setup would be great
21:15:15 <rledisez> one thing that always bothered me is that there seems to be stuff in the code that's specific to making it work in SAIO. I'm wondering if the one server/vm is still relevant in a world where docker is everywhere. I can imagine creating a real cluster just with some docker-compose file
21:15:19 <mattoliverau> but yeah, vsaio + ansible would be nicer.
21:15:31 <mattoliverau> I have some chef experience, and I didn't enjoy it :P
21:15:49 <timburke_> anyway, i guess we covered what needed to be said. namely, vsaio is a repo that (might) go away, so if you want to keep using it, it's probably a good idea to fork it sooner rather than later
21:16:16 <clayg> rledisez: i think that's quite reasonable - the trick is porting probetests 🤔
21:16:42 <timburke_> i think it was also (vaguely) what notmyname was trying to do with runway
21:17:09 <clayg> s/might/probably/ go away - s/sooner rather than later/like right now... during this meeting/
21:17:32 <clayg> oh right runway!!!  that one might already be gone 🤔
21:17:48 <timburke_> yeah, that one's already gone
21:18:50 <clayg> https://gitlab.com/nvidia/proxyfs-ci/runway
21:19:05 <timburke_> we can keep thinking about how best to do dev envs and how similar they need to be to CI vs prod vs some other crazy thing, but i think we should probably keep moving
21:19:38 <timburke_> #topic sharding in train
21:19:59 <timburke_> zaitcev, i haven't seen patches yet, how's it going?
21:20:04 <zaitcev> I put together a stack of 18 patches, they pass unit tests.
21:20:16 <zaitcev> The 19th was much too hard, so I gave up.
21:20:16 <timburke_> \o/
21:20:37 <clayg> didn't sinatra do a song about "sharding in the rain"?
21:20:38 <zaitcev> Unfortunately, I must focus on RBAC this week.
21:21:01 <timburke_> no worries, and thank you for taking on the RBAC work!
21:21:14 <timburke_> i just wanted to check in, make sure you weren't blocked
21:21:17 <zaitcev> So, I'm tempted to throw them into Gerrit in one big stack, just so they're not locked in my laptop.
21:21:52 <zaitcev> After we talked about it last week, I wanted to feed them in batches of 4 or 5, to make them easier to review.
21:21:58 <timburke_> that's fine by me. i'll try to get through them quickly once they're up
21:22:07 <zaitcev> I replaced Change-ID at least
21:23:01 <timburke_> if possible, try to include the cherry-picked sha in the commit message; makes it a little easier for me to compare master vs stable
21:23:24 <zaitcev> Yes, I replaced the old Change-Id with Cherry-Picked-From.
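For illustration, a hedged sketch of the kind of commit-message footer being described here; the sha and description are placeholders, and the exact trailer wording is whatever the backport patches actually carry:

    sharding: <short description of the backported change>

    Cherry-Picked-From: <sha of the corresponding master commit>
    Change-Id: I<new id generated for the stable/train patch>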
21:23:52 <zaitcev> That's all
21:24:07 <timburke_> #topic eventlet and ssl
21:24:22 <timburke_> #link http://lists.openstack.org/pipermail/openstack-discuss/2021-January/020100.html
21:25:12 <timburke_> i was catching up on mailing list recently and saw zigo has been having trouble with eventlet and ssl in swift-proxy
21:25:39 <zigo> timburke_: I found out that the issue is dnspython 2.0.
21:25:47 <timburke_> i wanted to check if anyone has a ssl-enabled-keystone handy to try to repro
21:26:12 <timburke_> oh, curious
21:26:13 <zigo> timburke_: The issue is swift-proxy connecting to Keystone to check credentials...
21:26:14 <zaitcev> Ironically I don't
21:26:33 <zaitcev> I created a different region for Keystone and set Swift to talk using that region.
21:26:39 <zigo> So the problem is not a swift-proxy binding over SSL.
21:26:47 <rledisez> timburke_: we do have that, swift talking to a Keystone over SSL
21:26:51 <timburke_> ...i guess maybe dnspython imports ssl before eventlet's monkey-patched it?
21:27:12 <zigo> timburke_: I can't tell, but it's definitely a monkey patching issue between dnspython and eventlet.
21:27:34 <timburke_> i know i've seen similar recursion errors before, and it's been a matter of not monkey-patching early enough
21:27:42 <zigo> The same issue happens when Neutron tries to tell Nova (over the Nova API) that a VM port is up.
21:28:29 <zigo> Well, I'd prefer if there was a strong movement to get out of this madness.
21:28:44 <zigo> Monkey patching is a terrible idea.
21:29:21 <zigo> It has bitten us numerous times, and still bites hard...
21:30:08 <zigo> https://github.com/eventlet/eventlet/issues/619 <--- The issue has been open since the 25th of June ...
21:30:30 <timburke_> https://github.com/rthalley/dnspython/blob/v2.0.0/dns/query.py#L48 i guess? i don't see an ssl import on 1.16.0 (at a quick glance, anyway)
21:31:29 <timburke_> yeah, monkey-patching's... not great. one more reason we ought to look at that PUT+POST(+POST) patch again... it bugs me that we're so tied to eventlet
21:32:58 <zigo> My guess is that we do eventlet monkey patching early, but then dnspython does monkey patching *after*, and then accessing stuff on the SSLContext object breaks hard.
21:33:05 <zigo> (I'm not sure, just guessing)
21:33:56 <timburke_> oh -- it does its own monkey patching or something, is that it? ick
21:34:34 <zigo> Isn't that what the code you've just linked does?
21:34:55 <zigo> (ie: creating an SSLSocket object...)
21:35:17 <timburke_> does anyone have bandwidth to try to repro/fix the issue? having a pin on a two-year-old version of dnspython doesn't seem sustainable
21:35:47 <zigo> It's also currently completely broken in both Fedora and Debian (both have dnspython 2.0.x).
21:36:28 <zigo> I'm trying to push to revert to 1.16.0, but I'm not sure I'll be successful.
21:36:31 <clayg> timburke_: does anyone remember why we depend on dnspython?
21:36:46 <zigo> clayg: eventlet does depend on it ...
21:37:18 <clayg> cname_lookup
21:37:35 <zigo> python-eventlet (master)$ cat setup.py | grep dns
21:37:35 <zigo> 'dnspython >= 1.15.0, < 2.0.0',
21:38:17 <timburke_> so we *do* use it for cname_lookup, but the bigger issue seems to be that if you have it installed for the sake of something else, it'll break ssl in eventlet-ified processes
21:38:54 <zigo> Indirectly, yes.
21:39:07 <zigo> We use keystoneauth, which calls requests, which calls urllib3.
21:39:35 <zigo> urllib3 accesses the SSL socket's SSLContext.options, and when it does ... big crash!
21:40:59 <zigo> I believe the issue is because this:
21:40:59 <zigo> https://github.com/eventlet/eventlet/blob/master/eventlet/green/ssl.py#L449
21:40:59 <zigo> isn't in use because of dnspython overriding the eventlet monkey patching.
21:41:10 <zigo> I may be wrong, but so far, that's where I am...
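As a rough Python sketch of the import-ordering hypothesis above (this only illustrates the theory being discussed, not a confirmed fix; dns.query and requests are simply the modules mentioned in the conversation):

    # Monkey-patch before anything imports ssl-using modules. dnspython 2.0's
    # dns.query imports ssl at module level, so if it gets imported first
    # (directly, or via eventlet's greendns support), parts of the process can
    # end up holding unpatched ssl objects -- one way to hit the RecursionError.
    import eventlet
    eventlet.monkey_patch()  # patch socket/ssl/etc. as early as possible

    import dns.query   # noqa: E402 -- now sees the green ssl module
    import requests    # noqa: E402 -- keystoneauth -> requests -> urllib3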
21:41:56 <timburke_> all right, i'll look into it. the dnspython tip was useful, looks like i might have a repro now!
21:42:20 <zigo> What's the intention behind this:
21:42:20 <zigo> https://github.com/rthalley/dnspython/blob/v2.0.0/dns/query.py#L58
21:42:20 <zigo> ?
21:42:20 <timburke_> #topic orphaned shard ranges
21:42:39 <zigo> Ok, thanks.
21:42:50 <zigo> timburke_: Feel free to ping me anytime and we can discuss this later.
21:43:21 <timburke_> zigo, i think it's just trying to stub out enough to prevent NameErrors and the like in case ssl isn't available
21:43:52 * zigo will try to patch this out, just to see if it continues to work ...
21:45:03 <timburke_> i know we picked up https://review.opendev.org/c/openstack/swift/+/771086 recently to prevent us from running into this orphaned-shard situation...
21:46:13 <timburke_> is https://review.opendev.org/c/openstack/swift/+/770529 still viable for cleaning up any orphans that may already be on disk? or should we abandon that?
21:46:23 <mattoliverau> yup, and that stops them being created.
21:46:42 <mattoliverau> There is work on getting the new shrink code working.
21:47:15 <mattoliverau> acoles: has a chain starting: https://review.opendev.org/c/openstack/swift/+/771885
21:47:19 <timburke_> yeah, maybe i should change the agenda item to cover shrinking generally ;-)
21:47:23 <mattoliverau> yeah
21:48:16 <mattoliverau> the start of the chain will allow the root to provide the final shard acceptor as itself when collapsing. (to keep it root driven)
21:49:09 <mattoliverau> acoles recently did an awesome job of simplifying that with an auditing state
21:49:16 <acoles> I prefer the root driven shrink & delete approach rather than the shard self-determination in https://review.opendev.org/c/openstack/swift/+/770529
21:49:33 <mattoliverau> +1
21:49:59 <mattoliverau> later in that chain is the new swift-manage-shard-ranges compact command
21:50:10 <mattoliverau> which reworks how shrinking works in the sharder.
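A hedged example of the sort of invocation being described; the compact subcommand is still under review in that chain, so the exact name and flags may change, and the db path is just a placeholder:

    swift-manage-shard-ranges /srv/node/sdb1/containers/.../<hash>.db compact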
21:50:30 <acoles> in general the shrinking and overlap repair is coming along nicely, but we uncovered a couple of bugs along the way
21:50:49 <mattoliverau> Once all these pieces are done, orphan shards will be dealt with. Initially manually, but it's moving in the right direction for handling them automatically too.
21:50:58 <acoles> mattoliverau: has fixed one https://review.opendev.org/c/openstack/swift/+/773832
21:51:57 <acoles> I'm about to push another fix; these are both bug fixes in addition to the new features in the swift-manage-shard-ranges commands
21:52:11 <acoles> so it's been an interesting journey :)
21:52:31 <mattoliverau> :)
21:52:33 <timburke_> cool! sounds like things are moving right along. blocked on anything?
21:54:03 <acoles> timburke_: not enough hours in a day ! :)
21:54:04 <mattoliverau> not right now, just testing, reviews, coding.. and confidence we've fixed all the bugs :P
21:54:38 <acoles> there's the other bug fix https://review.opendev.org/c/openstack/swift/+/774002
21:54:52 <timburke_> all right then
21:54:59 <timburke_> #topic relinker
21:55:06 <acoles> note: these bugs would not impact *sharding*, just shrinking of shards
21:55:07 <timburke_> i'm still working on some relinker enhancements, and wanted to call out a couple things
21:55:17 <timburke_> first, my patches are (mostly) in a single chain now, so it's easier to try out all the improvements at once by checking out the end of the chain
21:56:12 <timburke_> second, the first couple patches in the chain make it so you can point the relinker at a config file to read most (but not quite all) of the cli flags
21:57:01 <timburke_> that was mostly because i realized we've got a bunch of options that should already be in the [DEFAULT] section of your object-server.conf (--swift-dir, --devices, --skip-mount-check)
21:57:36 <timburke_> with https://review.opendev.org/c/openstack/swift/+/772419, --user
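Roughly, the point is that an object node already carries these values in the [DEFAULT] section of object-server.conf, something like the snippet below (values are illustrative, and exactly which options the relinker reads is defined by the patch chain):

    [DEFAULT]
    swift_dir = /etc/swift
    devices = /srv/node
    mount_check = true
    user = swift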
21:58:40 <timburke_> i was mainly wondering if the config file seemed like a reasonable idea to everyone; i can't think of any other one-off tools we have in the swift repo that would go read a conf file...
21:59:15 <mattoliverau> the config file is still optional tho right? you can still use cli opts if need be.
21:59:57 <mattoliverau> So I think it makes sense to allow a config, esp when most of the options may already be set. But it's also optional, so it won't break how others may already be using the tool.
21:59:59 <timburke_> yup -- existing cli tooling should all still work
22:00:21 <mattoliverau> great job
22:00:42 <timburke_> with that in mind, should i make sure that all config options get CLI args?
22:00:44 <acoles> timburke_:+1  I think it is very reasonable
22:01:41 <acoles> +1 was for having a conf file. not sure if you *must* expose all the options if they are beyond current cli
22:02:04 <acoles> if the defaults are sensible
22:02:04 <mattoliverau> I think we're at time :(
22:02:20 <timburke_> yeah, i was noticing that, too ;-)
22:02:27 <timburke_> thank you all for coming, and thank you for working on swift!
22:02:30 <timburke_> #endmeeting