21:00:04 #startmeeting swift
21:00:05 Meeting started Wed Sep 11 21:00:04 2019 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:06 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:08 The meeting name has been set to 'swift'
21:00:11 who's here for the swift meeting?
21:00:23 o/
21:01:16 o/
21:01:27 o/
21:02:23 agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:02:50 #topic Shanghai
21:03:19 just a quick reminder that we ought to put things we want to talk about on the etherpad so we won't forget
21:03:22 #link https://etherpad.openstack.org/p/swift-ptg-shanghai
21:03:41 not that i'm concerned about us finding something to talk about ;-)
21:03:59 just want to make sure we remember to cover what we wanted to
21:04:20 on to updates!
21:04:32 #topic py3
21:05:55 i'm inclined to approve the probe tests (https://review.opendev.org/#/c/671333/) based on zaitcev's +2 and mattoliverau's prior generally-approving review, barring objections
21:06:11 after that i need to start looking at making a py3 probe tests job
21:06:28 did you remove the print?
21:06:45 yup :-)
21:06:51 ok, give me a sec ;)
21:06:59 hehe
21:07:28 done
21:07:40 fwiw, I haven't been able to run probetests on my fedora-saio, i still need to find time to debug it
21:08:47 i'm assuming zaitcev is able to though, so i'm not super concerned, it's most likely my env.
21:09:03 the main other thing i'd like people to think about is which func test jobs we can drop. at the moment we've got 5 in-process func test jobs running for each of py2 and py3
21:09:06 i've got this feeling like a lot of that's repeating testing of things where we already have good coverage
21:10:12 it gets particularly annoying when we have spurious failures because of a resource-constrained gate
21:10:20 that is a lot. though we do test different configurations which is good.
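A back-of-the-envelope sketch of the spurious-failure concern above. It assumes every gate job fails spuriously with the same probability and independently of the others; the 2% rate is a made-up illustration, not a measured number, and the function names are hypothetical. The job count of 10 comes from the 5 in-process func test jobs run for each of py2 and py3.

```python
def recheck_probability(jobs, spurious_rate):
    """Chance that at least one of `jobs` independent jobs fails spuriously."""
    return 1.0 - (1.0 - spurious_rate) ** jobs


def expected_runs(jobs, spurious_rate):
    """Expected number of full gate runs until all jobs pass in the same run.

    Each run passes with probability (1 - spurious_rate) ** jobs, so the
    number of runs follows a geometric distribution.
    """
    return 1.0 / (1.0 - spurious_rate) ** jobs


if __name__ == "__main__":
    rate = 0.02  # assumed per-job spurious failure rate (illustrative only)
    for jobs in (10, 6, 4):
        print("%2d jobs: P(recheck) = %.3f, expected runs = %.3f" % (
            jobs, recheck_probability(jobs, rate), expected_runs(jobs, rate)))
```

Under these assumptions the recheck probability grows with every extra job, which is the whole argument for trimming the list; it says nothing about whether the gate itself gets less loaded.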
21:10:54 Dropping some from say py2 and py3 might be good. _OR_ do we want to move some to periodic jobs
21:11:06 but, that situation also seems like it might be improving? idk. i'm not in a *big* hurry... just wanted to put it out there
21:11:24 so they get tested say weekly, to make sure py2 encryption doesn't die, for example
21:11:47 yeah, just brainstorming :)
21:12:24 I always forget why we have a separate s3api? why can't that be enabled by default such that tests are run in the regular func tests
21:12:31 Is there a way to trigger them all, for example after a patch has been reviewed, updated etc.. before merging?
21:12:33 mattoliverau, yeah, i was thinking about something like run ec on py3 and encryption on py2 or similar, relying on unit test coverage to catch ec-py2 and encryption-py3 bugs
21:12:53 maybe move some to experimental?
21:13:12 that sounds a lot like alecuyer's suggestion :-)
21:14:14 or trigger some func tests based on patch contents. i.e. only run py3 encryption if something happens in the encryption middleware
21:14:31 otherwise run py2 encryption each time
21:14:42 i dunno, we may need a matrix :)
21:15:09 ^^^ not a bad idea... we tend to compartmentalize functionality pretty well...
21:15:16 anyway, maybe something to talk about more in the coming weeks (or even shanghai)
21:15:47 #topic versioning
21:16:05 but we also know that swift tests do not put a big load on the gate, so i'm not sure we would see better results, would we?
21:16:58 tdasilva, true...
mainly i don't want to type "recheck" so much :P
21:17:34 +1
21:18:00 assuming a more-or-less constant rate of spurious failures, fewer jobs mean fewer rechecks
21:18:39 (our removing jobs would almost certainly be rounding-error in terms of gate-resource-constraints)
21:19:21 looks like clayg's not around ATM; i can give an update on versioning (unless tdasilva would like to)
21:20:01 timburke: go ahead
21:20:17 so clayg, tdasilva, and i have been talking about what design would get us to S3-style versioning a lot lately
21:21:24 i think what we're landing on is two separate filters so we only have to have a single "mode" loaded in our heads at a time while coding and reviewing
21:21:55 but with a single entrypoint, similar to what we did with encryption wrapping up encrypter and decrypter
21:23:13 2 filters? for different modes?
21:23:47 Do we have an etherpad page or something where the rough idea/plan lives?
21:24:12 can we make one, so I can live vicariously through you guys.. or ask silly questions :P
21:24:46 mattoliverau, yep. keep versioned_writes more or less as it is to deal with x-versions/history-location, add another one that wraps it (or is wrapped by it) for the new data layout
21:25:07 no etherpad yet -- can try to get that going
21:26:04 cool, I know clayg asked if I was free to talk about some of this. Which is cool, but we should try and keep the discussion open. Having said that, thanks guys for all the hard work :)
21:27:00 this still gives us something of an opportunity for a clean break, which means we can have the new mode only use symlinks when writing new versions or restoring versions, rather than having this extra bifurcation
21:27:15 I like that
21:27:59 mattoliverau: it's definitely not set in stone, it's just the latest idea we had and liked, but please do share your ideas too
21:27:59 if we can get symlinks + new layout + backwards compat.
That'll be amazing
21:28:09 tdasilva or i will work on getting https://review.opendev.org/#/c/678962/ and https://review.opendev.org/#/c/681054/ consolidated
21:28:21 mattoliverau, it's that last one that's always tricky ;-)
21:28:42 mattoliverau: what do you mean by backwards compat?
21:28:53 yeah, I was OK with losing the last, so long as we're careful
21:28:55 so... what's your opinion on what "backwards compat" means? yeah, tdasilva gets it ;-)
21:29:56 does it mean "this old mode continues working, but you can do this new thing for new stuff", or do we need a migration path to get people from the old to the new?
21:30:07 sorry, I mean that we can use the same API and things auto migrate over. Not introducing a new versioning API, or middleware that makes old versions incompatible
21:30:30 by auto migrate, I mean from when you start adding new items
21:30:39 not something in the background
21:31:15 so old continues working, but new stuff uses the new thing.. but it is all accessible via the same API
21:31:45 with some enhancements to the API if we want to add some s3api-isms
21:33:38 The biggest thing I want to see is that any awesome new feature is accessible via the swift API not just s3api. I just want our swift api to always be a first class citizen in swift (obviously). even if we have to talk about a new separate versioning API. But if you can make them work together.. I'm all for that.
21:33:39 so, if i had a container using x-history-location (since that was the best we had available) and it's accumulated some versions of objects... what are your expectations if/when you try to switch it to the new mode?
21:34:32 yes, definitely want this to be a swift api first, with s3api shimming into it. the smaller we can make that shim, the better
21:35:38 I dunno, either there is a new API, so you have to keep the old versioned writes middleware if you still want to access them. And start using the new API for 2.0..
or ideally, any new versioning uses symlinks. but if you delete it'll reach back into the old naming and bring them back
21:36:19 though, I obviously haven't really thought it through too deeply
21:36:22 yeah, and that last part is where it gets messy fast :-(
21:37:32 I bet. Which is why I was thinking it might be out of scope.. but if we can.. sweet :)
21:38:05 i'll make sure we get an etherpad up so we can continue to think & talk this through. maybe we could even find a time to get a video chat going ahead of shanghai ;-)
21:38:44 I'll try and look into the patches so I can better grok the probs :)
21:39:01 i'd appreciate it 👍
21:39:35 keeping moving...
21:39:41 #topic lots of small files
21:39:59 so we haven't had review bandwidth to spare for poor alecuyer :-(
21:40:20 no worries I know everyone is busy :)
21:40:36 but meanwhile he's looking to push to prod -- i kinda feel like that's sufficiently good for merging to a feature branch ;-)
21:40:55 anyone have objections there?
21:41:57 it seems like this represents the current state-of-the-art, and i definitely don't want to slow down alecuyer on any of this, or otherwise make life more difficult
21:42:27 sorry, I'm slow, you mean patches are sitting open and we should get them landed to the feature branch?
21:42:35 quicker.
21:43:03 if so, feature branches should be quick to merge.
21:43:40 https://review.opendev.org/#/c/679022/ is the patch on feature/losf to drop grpc for http
21:43:46 One thing I can say about it is that it only changes the way the index-server communicates with the python. But nothing on disk changes. So if changes are required it shouldn't be a problem (no backward compatibility issue)
21:44:10 push a patch. leave it open for x time and land it even if people haven't reviewed it too hard.. the idea of a feature branch is to move fast.. leave it open long enough to give people a chance to know it's there and see it.
21:44:28 but don't feel bad about landing code.
It will have to be reviewed again anyway.
21:44:46 Well that's how Al and I worked on the sharding stuff.
21:45:10 mattoliverau, that's basically the conclusion i reached this morning when i saw kota_ apologizing for not reviewing it ;-)
21:45:54 merging. just wanted to confirm that we're on the same page about it
21:45:56 if the patch breaks something we can fix it. or revert it, so no stress :)
21:46:50 that's why feature branches are awesome. So if it's holding you up, merge it ;)
21:47:06 alecuyer, what are the next steps we need for losf?
21:47:26 first thing after that patch, I think, would be to rebase https://review.opendev.org/#/c/666378/
21:47:35 (and address your latest comments!)
21:48:27 then I need to push a few patches for issues we found (sorry I haven't pushed these earlier)
21:49:12 and one last thing to mention, wrt the CPU usage (that I think you also noticed at swiftstack while testing),
21:49:33 sounds great! i'll try harder to make time to review them, and get them merged even if i can't spare time for a thorough review ;-)
21:49:50 alecuyer: how dare you find issues and fix them.. and then want to share them.. shame on you :P
21:50:15 mattoliverau: yeah I know ;) still I should share them quicker :)
21:50:41 forgot half of my last sentence, sorry - CPU usage: we are considering using the hashes.pkl
21:51:26 Just discussing this, happy to discuss it here or later (i think we're short on time, leave time for other topics)
21:51:51 oh... does it skip that entirely at the moment, and recompute every time? interesting...
21:51:58 it does
21:52:36 i think we also tend to tune our replicators/reconstructors aggressively, which would compound the issue
21:53:19 er, maybe "to be aggressive" would be more clear
21:53:20 yes, so that was probably a bad call.
the good news is it should be easy to reuse (just change the path so we don't have to create one dir per hashes.pkl file, prefix the filename with the partition, for example)
21:53:37 (the bad call -> doing away with the hashes.pkl I mean - not being aggressive while reconstructing)
21:53:45 love it
21:53:51 thanks for keeping us updated
21:53:57 +1
21:53:58 #topic sharding
21:54:25 sorry mattoliverau -- still haven't gotten to reviewing :-(
21:54:31 nps
21:54:51 As I mentioned last week
21:55:10 I've reworked the empty broker patch
21:55:24 also, we got a new bug! and mattoliverau already has the beginnings of a patch :-)
21:55:37 yeah
21:55:54 was about to mention that. I went and worked on a patch to clean up cleave contexts
21:56:22 it's currently attached to the bug. But will add some tests and push it up as an actual patch
21:56:32 I'll try and get that done in the next few days
21:56:44 https://bugs.launchpad.net/swift/+bug/1843313
21:56:45 Launchpad bug 1843313 in OpenStack Object Storage (swift) "Sharding handoffs creates a *ton* of container-server headers" [Undecided,New]
21:56:57 thanks timburke I was looking for that :)
21:57:21 and thanks timburke for the bug report with a good solution idea. Makes life easier.
21:57:24 thanks mattoliverau!
21:58:07 that's all I've got
21:58:29 #topic open discussion
21:58:38 anything else to bring up real quick?
21:58:39 I've been looking into the PDF train goal
21:58:45 oh, right!
21:58:49 thanks for that
21:59:07 making progress, I've pulled in Nick, a suse guy who did the designate PDF stuff
21:59:28 to help point me in the right direction re latex and sphinx debugging
21:59:49 anything you need the rest of us to help with? proofreading, i suppose
22:00:05 I think tables get messed up when we use ``tag``s in the columns
22:00:18 So I might need to remove them to get it to build properly
22:00:33 or hard code table column lengths.. which I'd rather not do.
22:00:40 anyway.
I'll keep playing :)
22:00:52 ...probably fine to give them normal formatting. *shrug*
22:01:12 all right, time to wrap up
22:01:19 yeah, especially if it means they'll work and in a table I assume they'll still stand out.
22:01:22 kk
22:01:25 thank you all for coming, and thank you for working on swift!
22:01:30 #endmeeting
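
The hashes.pkl idea from the lots-of-small-files topic (one shared directory, with the partition as a filename prefix instead of one directory per hashes.pkl) could look roughly like the sketch below. Everything here is an assumption for illustration: the function names, the `<partition>_hashes.pkl` naming, and the write-to-temp-then-rename step are hypothetical and are not taken from the losf patches.

```python
import os
import pickle
import tempfile


def flat_hashes_path(hashes_dir, partition):
    """Path of a partition's hashes file inside one shared directory.

    Hypothetical naming: the partition number is a filename prefix, so no
    per-partition directory has to be created.
    """
    return os.path.join(hashes_dir, "%d_hashes.pkl" % partition)


def write_hashes(hashes_dir, partition, hashes):
    """Persist the suffix hashes for a partition, replacing any old file."""
    os.makedirs(hashes_dir, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the
    # target so readers never see a half-written pickle.
    fd, tmp_path = tempfile.mkstemp(dir=hashes_dir)
    with os.fdopen(fd, "wb") as fp:
        pickle.dump(hashes, fp, protocol=2)
    os.replace(tmp_path, flat_hashes_path(hashes_dir, partition))


def read_hashes(hashes_dir, partition):
    """Return the cached hashes, or None if they must be recomputed."""
    try:
        with open(flat_hashes_path(hashes_dir, partition), "rb") as fp:
            return pickle.load(fp)
    except FileNotFoundError:
        return None
```

A caller would fall back to recomputing (and then call `write_hashes`) whenever `read_hashes` returns None, which is the case being hit on every pass when the cache is skipped entirely.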