21:00:02 #startmeeting swift
21:00:03 Meeting started Wed Oct 9 21:00:02 2019 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:04 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:07 The meeting name has been set to 'swift'
21:00:20 who's here for the swift meeting?
21:00:29 o/
21:00:31 o/
21:00:42 o/
21:01:11 o/
21:01:32 i know clayg's under the weather today, so i'm assuming tdasilva or i will be talking about versioning later
21:01:58 #topic Shanghai
21:02:05 a few different things to bring up
21:02:10 first, as always, there's the etherpad for topics
21:02:15 #link https://etherpad.openstack.org/p/swift-ptg-shanghai
21:02:22 but we've also been getting more details about logistics
21:02:28 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-October/010047.html
21:02:33 notably, looks like
21:02:39 * we've got three days' worth of time dedicated to swift
21:02:44 * no coffee in the meeting rooms
21:02:51 * we *can* move furniture this time ;-)
21:03:00 * and sounds like there's going to be a pretty hard cut-off at 4:30, so we may need to investigate other places to continue conversations before dinner
21:03:24 No coffee!
21:03:29 :/
21:03:40 i know, right? and they expect work to get done... ;-)
21:03:55 perhaps tea exists instead?
21:04:09 mattoliverau: but the discussions can move to a pub from 4:30, you get to choose between coffee or beer ;)
21:04:12 doesn't look like it from the link :/
21:04:29 lol, that's a good point
21:04:38 nope :-(
21:04:40 "Unfortunately, the venue does not allow ANY food or drink in any of the rooms."
21:04:41 *not allow ANY food or drink in any of the rooms.*
21:04:44 :/
21:04:54 So there may be coffee outside
21:04:58 surely
21:04:58 so more hallway track time
21:05:25 and we can maybe organize some breakfast get-togethers or something
21:05:42 and finally, i remember being sad that i didn't think to do this in denver, so i proposed an ops feedback session; looks like we haven't had one of those since Boston
21:05:49 and it was accepted!
21:05:54 #link https://www.openstack.org/summit/shanghai-2019/summit-schedule/events/24413/swift-ops-feedback
21:05:57 oh cool, nice one
21:06:26 nice!
21:06:33 so we'll see who we can meet there :-)
21:07:31 any other comments/questions on Shanghai? any further information i should try to run down for people?
21:08:04 i ought to book the team photo; do we have any preferences on timing? morning, afternoon, ...
21:08:40 I don't mind either time
21:08:48 given the timezone, I think I would prefer during the night, so whatever…
21:09:07 maybe i'll aim for right before or right after lunch
21:09:24 next up
21:09:25 #topic train release
21:09:27 right before, so you don't have to rush back.
21:09:31 just a quick follow-up from last week
21:09:31 i don't have a strict limitation. I'll leave Saturday morning, so I can make myself available until Friday
21:09:36 we have a 2.23.0! \o/
21:09:46 good
21:09:48 \o/
21:09:54 and a stable/train branch
21:10:06 thanks again for all your hard work -- best release yet :-)
21:10:41 on to updates!
21:10:47 Make sure not to wear any signage on your shirt, most especially not Chinese ideograms that you cannot read. Unbeknownst to you, they may refer to Taiwan, Tibet, HK, or Uighurs.
21:11:08 good call
21:11:13 right
21:11:30 #topic versioning
21:11:43 i haven't seen a tdasilva yet...
21:12:17 but we've been investigating this null-namespace idea... and bumping into some complications with C code (perhaps unsurprisingly)
21:13:09 I'm surprised, actually. You can have an empty string "" in C easily, so...
21:13:59 but a string like "foo\0bar"...
21:15:15 anyway, still trying to figure out how best to move forward with it, but i don't think we've been able to rule it out entirely yet. if NUL doesn't work out, we may try to claw back "\x01", similar to how we claimed the leading "." namespace in accounts
21:16:45 still need to wire up versioning to use the reserved namespace, and layer s3api on top of that. we should have something of a solid proof-of-concept in time for discussions in Shanghai, though
21:16:57 just saying, as I heard it from a colleague today: why not use "/", as it is already forbidden in container names? the reason might be the enormous impact on the codebase
21:17:09 I don't know if it was considered
21:18:04 interesting thought. i think we'd have a hard time distinguishing container requests from object requests, though... hm...
21:18:19 yeah, that's what I thought too
21:19:03 add an x-backend-no-really-this-is-my-container-name header? ;-)
21:19:51 timburke: nah, let's fork sqlite instead ;)
21:20:11 anyway, i don't think we're stuck on anything at the moment, just a matter of needing to try a few more things and get a better feel for how we ought to move next
21:20:22 #topic lots of small files
21:21:12 alecuyer, hopefully my changes to https://review.opendev.org/#/c/666378/ were agreeable
21:21:25 Yes, thanks a lot for your help with this one!
21:21:50 happy to! and there's also a fresh merge from master
21:21:52 tests!
21:22:31 next, I'll need to rebase this one, https://review.opendev.org/#/c/659254/ , which is much smaller
21:23:22 oh nice, yeah
21:23:28 and then I'd like to think about how to handle leveldb failure (again, I'd like to stress that we're never seeing leveldb self-destructing, but occasionally fsck will remove a .ldb file and the db goes bad)
21:23:28 and i think i spotted a problem in the quarantine logic: https://review.opendev.org/#/c/686846/
21:23:47 interesting
21:24:00 oh I missed it!
will check it, thanks timburke
21:24:39 currently a leveldb failure is handled "out of band". I wonder if that should happen within swift (rebuild the db if it's corrupt)
21:24:58 We don't expect swift to repair a filesystem, but in this case I'm not sure...
21:25:24 do we have a feeling for roughly how long it takes to rebuild the db?
21:25:56 I think for a 6TB drive it's about 12 hours, if I remember well
21:26:16 For a disk full of small files, yes, that's about right (but that's what we're expecting for LOSF, right?)
21:26:17 * timburke whistles
21:26:53 if there are fewer, bigger files it would be way faster of course
21:27:35 there should be multiple dbs per drive, right? would that be for rebuilding *all* dbs on a drive?
21:27:51 there is one db per drive/policy
21:28:05 oh, i was just mis-remembering
21:28:20 in the example I gave there is only one policy per drive, so one db
21:28:30 makes sense
21:28:59 just to put things in perspective, it's way faster than xfs_repair in the same conditions
21:29:24 (about 70 million "files")
21:29:31 I guess an auditor could fire off an async concurrent job to start a rebuild if detected... but 12 hours means you'd want to be sure.
21:30:06 yeah, my initial thought was that maybe the auditor (or maybe even replicator/reconstructor) could handle this... but that might make for some really lumpy cycle times...
21:30:44 currently we watch it through the index server python parent process
21:30:47 yeah, it might be considered a different type of job
21:31:37 So it could enqueue something, maybe. that'll either alert operators or maybe even fire something off.
21:32:07 yes, one could have the choice of a manual trigger vs fully automated. So something to think about
21:32:11 it's a thought. Self-healing would be cool :)
21:32:42 Add to the ideas list :)
21:33:11 👍 is there anything else we should be looking at/thinking about?
21:33:12 just thinking, if the repair does not start, the data are mostly unreachable, right?
so any good reason not to fire the repair?
21:33:34 just a note on our deployment: we tried to extend our deployment of the pure-golang leveldb, but we had to roll back; some strange errors alecuyer is investigating
21:34:05 timburke: no, I think that's it from me
21:35:28 cool. i'll think more about the db file stuff...
21:35:41 #topic sharding
21:35:54 mattoliverau took a look at the stat-latching patch!
21:35:57 thanks :-D
21:36:04 I haven't really done as much on this as I'd like... kinda been distracted.
21:36:08 oh yeah, I did look at that
21:36:26 I like it. Will test it some more locally in an SAIO
21:36:39 Try and do that today while I have no idea what I'm doing workwise
21:37:01 something something, changing strategic direction, something something?
21:37:45 yeah. Well, I was lucky and wasn't laid off, but was given a new role outside of openstack
21:37:55 in SES, of all places.
21:38:16 But no idea what that really means until I meet with my new manager today sometime.
21:38:54 true. thanks for still showing up here this week! i hope you can keep working on swift, but i know you'll keep doing good work whatever you're working on
21:39:12 I hope to still be given time to work on an upstream project... but it might mean no more travel to openstack events supported by my company.
21:39:32 ta
21:39:39 :(
21:39:58 that's all i had
21:40:04 #topic open discussion
21:40:20 anything else we ought to discuss?
21:43:02 all right, let's let mattoliverau and kota_ get breakfast, and alecuyer get to bed ;-)
21:43:11 :)
21:43:13 thx
21:43:15 thank you all for coming, and thank you for working on swift!
21:43:21 thanks all
21:43:24 #endmeeting
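(Editor's note on the versioning discussion above: the "foo\0bar" exchange is about why a null-namespace name is awkward for C code. A minimal demonstration, assuming a POSIX system where ctypes can load the C library -- the name `b"foo\0bar"` is just an illustrative stand-in, not an actual Swift reserved name:)

```python
# Python handles embedded NUL bytes in a name just fine, but any C code
# that treats the name as a NUL-terminated string silently truncates it.
import ctypes

name = b"foo\0bar"          # hypothetical name in a NUL-reserved namespace

print(len(name))            # Python sees all 7 bytes -> 7

libc = ctypes.CDLL(None)    # load the C library (POSIX)
libc.strlen.restype = ctypes.c_size_t
print(libc.strlen(name))    # C's strlen() stops at the first NUL -> 3
```

This is why "\x01" was floated as a fallback separator: it is equally invalid in user-facing names but does not terminate a C string.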
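(Editor's note on the LOSF discussion above: the idea of an auditor-like process flagging a corrupt leveldb and enqueuing a rebuild, with a choice of manual vs. fully automated trigger, can be sketched as below. Every name here is hypothetical -- this is not the LOSF index-server API, and the 12-hour rebuild itself is stubbed out as a callback:)

```python
# Sketch: decouple corruption *detection* from the long-running *rebuild*,
# so operators can choose between an automatic repair and a manual trigger.
import queue


class RebuildScheduler:
    def __init__(self, rebuild_fn, auto=False):
        self.rebuild_fn = rebuild_fn   # e.g. re-scan volumes, repopulate the db
        self.auto = auto               # operators may prefer a manual trigger
        self.pending = queue.Queue()   # dbs flagged corrupt, awaiting action

    def report_corrupt(self, db_path):
        """Called by an auditor-like process when a db fails a health check."""
        self.pending.put(db_path)
        if self.auto:
            self.run_one()

    def run_one(self):
        """Rebuild one flagged db; a real version would run this in a worker
        process so a ~12-hour rebuild doesn't block the audit cycle."""
        db_path = self.pending.get_nowait()
        self.rebuild_fn(db_path)
```

In manual mode, `report_corrupt()` only enqueues (so monitoring can alert an operator), and nothing is rebuilt until someone calls `run_one()`; this matches the "you'd want to be sure" concern about kicking off a 12-hour job automatically.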