21:00:13 #startmeeting swift
21:00:13 Meeting started Wed Dec 20 21:00:13 2017 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:17 The meeting name has been set to 'swift'
21:00:20 who's here for the swift team meeting?
21:00:36 o/
21:00:50 meeting meeting meeting
21:00:54 hi
21:00:57 tdasilva: !!
21:01:06 clayg!
21:01:20 hello
21:01:20 o/
21:01:41 hello
21:02:24 hi o/
21:02:45 hi
21:02:49 hello :)
21:03:04 rledisez: acoles: torgomatic: joeljwright: !!! everyone is here !!!
21:03:08 * clayg steps out for a bit
21:03:08 welcome, everyone
21:03:11 lol
21:03:19 agenda at
21:03:22 #link https://wiki.openstack.org/wiki/Meetings/Swift
21:03:41 a few things to review and a new topic from m_kazuhiro
21:03:58 first up, some FYI stuff...
21:04:01 #topic general stuff
21:04:23 patch 528582 is making its way through the gate
21:04:24 https://review.openstack.org/#/c/528582/ - swift - Native Zuul v3 tox jobs
21:04:29 and in fact it changes the gate
21:04:42 it's the in-repo definition of check and gate jobs for swift
21:04:50 using the new zuul v3 stuff
21:05:35 it's a great improvement because it allows us to define test jobs in our own repo (AIUI, even with ansible scripts to set up an environment, if needed)
21:06:02 this first patch is the last step in moving from the old centralized job definition to the new zuul v3 way
21:06:11 tdasilva: ^^^ pretty cool huh!?
21:06:24 all hail zuul v3
21:06:25 very! was thinking we could add libec, probe tests
21:06:30 yup
21:06:41 ~ a whole new world ~
21:06:43 +1
21:06:43 :)
21:07:17 speaking of probe tests, we all probably noticed by now that the swiftstack community qa cluster has been throwing errors. I've finally burned through enough on my todo list to get to that item
21:07:45 burn it down!
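For readers following along: patch 528582 moves swift's test job definitions out of the central project-config repo and into swift's own tree. A minimal sketch of what such an in-repo `.zuul.yaml` looks like under zuul v3 (the job name, parent, and tox env below are illustrative, not the actual contents of the patch):

```yaml
# .zuul.yaml (sketch) -- lives in the project's own repo under zuul v3.
# Job and project entries below are illustrative placeholders.
- job:
    name: swift-tox-func
    parent: openstack-tox
    description: Run swift functional tests via tox.
    vars:
      tox_envlist: func

- project:
    check:
      jobs:
        - swift-tox-func
    gate:
      jobs:
        - swift-tox-func
```

Because the job definition lives next to the code, a change to the job and the change it is meant to test can land together, and jobs can carry their own ansible setup playbooks as mentioned above.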
21:07:47 (1) I hope to fix it, or find someone who can (2) some of it may not be needed any more with the zuul v3 definitions
21:08:08 any other general stuff to bring up?
21:08:11 oh... i mean fix it... but you COULD burn it down - that would *also* be fun...
21:08:20 I thought I had something else, but of course I didn't write it down...
21:08:22 i was actually thinking of that wrt the func tests...
21:08:35 i think libec and pyeclib are also in progress for zuul v3
21:08:35 timburke: yeah burn 'em!
21:08:42 I heard something about some conference talks being accepted, acoles cschwede, tdasilva
21:08:42 we should totally get some cross-policy testing happening in the gate though!
21:08:47 notmyname: IIRC the community cluster functional test was set up to cover an EC policy which we now have an in-process test for
21:08:55 acoles: ya
21:09:06 cschwede: !!!
21:09:06 e.g. https://review.openstack.org/#/c/528546
21:09:07 patch 528546 - pyeclib - Convert tox job to native v3 Zuul
21:09:53 notmyname: https://fosdem.org/2018/schedule/track/software_defined_storage/
21:10:18 there are two swift talks missing from there right now, but i expect they will be there soon
21:10:26 nice
21:10:44 tdasilva: and cschwede assure me they will be there too, otherwise I'll have to drink belgian beer on my own :/
21:10:55 and give 3 talks
21:11:28 kota_: oh man! that makes me so happy that i made a liberasurecode-git tox env for pyeclib...
21:11:43 tdasilva: it's swift, same talk 3 times over :)
21:11:56 acoles: tdasilva: cschwede: that's great :-)
21:12:24 ok, let's move on to the agenda topics
21:12:28 #topic symlink status
21:12:35 patch 232162
21:12:36 https://review.openstack.org/#/c/232162/ - swift - Symlink implementation. (MERGED)
21:12:37 whoooo!!!
21:12:39 it's merged!
21:12:42 yey
21:12:52 (what follow-ons are still open?)
21:13:06 awesome!
21:13:14 great!
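For context on what just merged: a Swift symlink is created by PUTting a zero-byte object with an `X-Symlink-Target` header naming `<container>/<object>`; subsequent GETs on the symlink are served from the target. A minimal sketch of composing that request's headers (the helper function name is mine, not from the patch):

```python
from urllib.parse import quote


def symlink_put_headers(target_container, target_object):
    """Headers for the zero-byte PUT that creates a symlink object.

    X-Symlink-Target carries "<container>/<object>", URL-encoded;
    the body is empty, so Content-Length is 0.
    """
    target = '%s/%s' % (quote(target_container), quote(target_object))
    return {
        'X-Symlink-Target': target,
        'Content-Length': '0',
    }
```

A later GET on the symlink returns the target object's data, while a GET with `?symlink=get` returns the symlink itself, which is how clients inspect or overwrite the link rather than the target.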
21:13:33 https://review.openstack.org/#/c/527583/
21:13:33 patch 527583 - swift - functest for symlink + versioned writes
21:14:02 it looks like it got a merge conflict
21:14:03 notmyname: https://review.openstack.org/#/c/527583 - clayg or i need to rebase it
21:14:04 patch 527583 - swift - functest for symlink + versioned writes
21:14:29 the conflict is all my fault, think I fixed a teardown in two patches
21:14:34 other than the rebase, how's it look?
21:15:11 booo `liberasurecode-git tox env`
21:15:30 er... booo `The change could not be rebased due to a conflict during merge` even
21:15:35 copy/paste error
21:15:46 merge all the things!
21:16:04 anything else needed for symlinks follow-on?
21:16:13 clayg: if you don't get there before me i will fix the conflict tomorrow
21:16:15 clayg: didn't you have a pile of them? did they all get squashed or landed?
21:16:42 what symlink follow-on? idk... i'd have to work on a list
21:16:43 there were 5 follow-up patches but 4 of them landed already AFAIK
21:16:44 a bunch got landed
21:16:56 notmyname: see, kota_ is on it - you don't have to worry - i'm not worried
21:16:57 ok
21:17:06 😎
21:17:22 sounds good!
21:17:28 #topic SLO data segments
21:17:35 patch 365371
21:17:35 https://review.openstack.org/#/c/365371/ - swift - Add support for data segments to SLO and Segmented...
21:17:38 oh man, so many. it was great. p 527595 p 528073 p 528072 p 527477 p 527475
21:17:39 https://review.openstack.org/#/c/527595/ - swift - Add more assertions for Symlink + Copy unit tests (MERGED)
21:17:40 https://review.openstack.org/#/c/528073/ - swift - add symlink to probetest for reconciler (MERGED)
21:17:42 https://review.openstack.org/#/c/528072/ - swift - add symlink to container sync default and sample c... (MERGED)
21:17:43 timburke: looks like you just pushed a patch set
21:17:44 https://review.openstack.org/#/c/527477/ - swift - Assert X-Newest and X-Backend headers are propagat...
(MERGED)
21:17:46 https://review.openstack.org/#/c/527475/ - swift - Symlink doc clean up (MERGED)
21:18:04 timburke: don't abuse the patchbot
21:18:13 :D
21:18:23 oh, 6 total and 5 got mreged.
21:18:28 *merged
21:18:33 nice
21:18:34 notmyname: yup. had to fix it up after the name change in 528081
21:18:43 p 528081
21:18:43 https://review.openstack.org/#/c/528081/ - swift - rename utils function less like stdlib (MERGED)
21:18:54 timburke: thanks
21:19:14 joeljwright: wanna give it a quick look to see if it still resembles what you remember? :-)
21:19:27 I've been keeping an eye on it
21:19:27 gate looks happy -- just waiting on the extra long test
21:19:42 but I'll give it another look for patchset 40
21:19:52 timburke: joeljwright: did either of you test the nested SLOs with a recent patch set? when I looked, it looked good, but that was an outstanding question
21:19:59 it's evolved into something really cool thanks to timburke
21:20:14 yeah, it's a pretty cool feature
21:20:24 I suspect it will become a commonly-used feature
21:20:53 notmyname: yeah, there's a func test for that now. look for 'nested-data-manifest'
21:20:59 nice!
21:21:27 sounds good
21:21:28 #topic s3api work status
21:21:29 I haven't tested since patch 38 or 39, happy to take another look tomorrow
21:21:29 https://review.openstack.org/#/c/38/ - openstack-infra/system-config - Added entry for building manuals. (MERGED)
21:21:35 kota_: tdasilva: what's going on here?
21:21:36 ttfb on SLOs got you down? inline the first meg! ;-)
21:21:36 sorry
21:21:53 timburke: first 7k ;-)
21:22:10 TBH i haven't had a chance to take a look at s3api for a while, definitely need to get back to it :(
21:22:12 I haven't looked at the s3api feature branch in a bit
21:22:25 same for you timburke?
21:22:28 last i was working on merging tests
21:22:48 kota_: have you spent any time on it lately?
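Back on the SLO data-segments patch (365371) for a moment: the feature lets a manifest mix ordinary `path` segments with small literal chunks carried inline as base64 under a `data` key. A hedged sketch of building such a manifest (helper name is mine; see the patch for the authoritative format):

```python
import base64
import json


def build_manifest(inline_prefix, segment_paths):
    """Build an SLO-style manifest whose first entry is inline data.

    Inline entries are {"data": <base64-encoded bytes>}; the remaining
    entries still reference real segment objects with {"path": ...}.
    """
    entries = [{'data': base64.b64encode(inline_prefix).decode('ascii')}]
    entries.extend({'path': path} for path in segment_paths)
    return json.dumps(entries)


manifest = build_manifest(b'file-format header', ['/segs/obj/1', '/segs/obj/2'])
```

This is what makes the "inline the first meg" quip above plausible: the opening bytes of a large object can be served straight out of the manifest, without a separate segment GET, cutting time-to-first-byte.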
21:22:51 oh, sorry, I was away for a bit
21:22:54 yup
21:22:59 merge as in merging trees, not landing new tests
21:23:10 got it
21:23:12 I was starting to work on it yesterday
21:23:16 nice!
21:23:20 wait a bit
21:23:27 paste the trello link...
21:23:44 https://trello.com/b/ZloaZ23t/s3api
21:24:07 i took 2 of the tasks from the backlog
21:24:31 https://review.openstack.org/#/c/529238/
21:24:31 patch 529238 - swift (feature/s3api) - Add s3api section into docs
21:24:32 docs
21:24:46 https://review.openstack.org/#/c/529268/
21:24:46 patch 529268 - swift (feature/s3api) - Avoid global LOGGER instance
21:24:47 logger
21:25:05 and some minor things
21:25:06 nice!
21:25:17 that's great. thanks for keeping work going on it
21:25:33 kota_: is on a roll!
21:25:38 roll!
21:25:52 hopefully, i expect timburke helps to land them. in particular the LOGGER one starts to touch a bunch of places
21:26:37 tdasilva: if you have a patch for functests, even if it's intermediate, i may help you to clean it up.
21:26:59 kota_: that sounds great, i'll try to propose it as a WIP
21:27:09 yeah, i'll be sure to help review when i can
21:27:11 yay team swift :-)
21:27:29 that's all from me for s3api
21:27:29 #topic general task queue
21:27:39 m_kazuhiro: this is what you've been working on
21:27:51 #link https://wiki.openstack.org/wiki/Swift/ideas/task-execution
21:27:58 #link https://docs.google.com/document/d/11sBbB6pBvLYNeM9wjTdvvsJIu8Dl8i13UH2NfRVNOqg/edit#heading=h.u5kq7utbivxa
21:27:59 general task queue!!!
21:28:04 https://review.openstack.org/#/c/517389
21:28:04 patch 517389 - swift - WIP: Update object expirer to use general task que...
21:28:24 whoa
21:28:33 This is updating work for the expirer's task queue, and it's the next step toward auto-tiering.
21:28:35 m_kazuhiro: and you've been working with rledisez and mattoliverau(?) on it?
21:28:42 notmyname: yes.
21:28:46 great
21:29:00 how's work going?
21:29:31 rledisez: is all about it
21:29:57 totally :)
21:29:59 I implemented main code for that.
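On the s3api "Avoid global LOGGER instance" patch above: the usual fix for a module-level logger is to give each middleware instance the logger it was configured with. A rough sketch of that pattern (class and config names are illustrative, not s3api's actual code):

```python
import logging


class S3ApiLikeMiddleware(object):
    """Illustrative WSGI middleware that carries its own logger
    instead of sharing a module-global LOGGER."""

    def __init__(self, app, conf, logger=None):
        self.app = app
        # An injected logger (e.g. one built from the paste config) wins;
        # otherwise fall back to a stdlib logger named from the conf.
        self.logger = logger or logging.getLogger(conf.get('log_name', 's3api'))

    def __call__(self, env, start_response):
        self.logger.debug('handling request for %s', env.get('PATH_INFO'))
        return self.app(env, start_response)
```

Injecting the logger also makes the class easier to unit test: tests can pass a fake logger and assert on what got logged, which a module-global makes awkward. That decoupling is likely why the patch "starts to touch a bunch of places."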
But I want to discuss some topics about it.
21:30:09 ok
21:30:15 m_kazuhiro: do you have your tickets booked for PTG?
21:30:45 clayg: I'm going to get tickets now.
21:30:47 * clayg is just asking - he hasn't booked his travel either - but I think other people have started doing so
21:31:01 whoot!
21:31:28 m_kazuhiro: great good doc
21:31:39 google
21:31:59 the great doc is made by mattoliverau !!
21:32:16 * tdasilva high fives mattoliverau
21:32:55 +1 very good summary of the discussion we had in Sydney
21:33:08 is there a particular question that we need to discuss in this meeting?
21:33:32 looks like it will be a very good topic in Dublin
21:34:06 Romain Expirer Algorithm
21:34:16 yay, swift finally has a named algorithm
21:34:17 neat idea to use partitions to break up work...
21:34:34 heh
21:35:33 Today's main discussion topic is at the end of the google doc.
21:35:55 "Known problems"
21:36:46 m_kazuhiro: can you restate that problem - i don't understand it in that context... I read something about a migration? maybe taking the old containers and backfilling into the new per-partition containers?
21:37:14 basically, if every object server fetches a different account for object-expiring tasks, what about the legacy tasks? all object servers will fetch the same account. if you have a lot of object servers, it's a DDoS of the account server
21:37:35 migration could be a difficult path if you have hundreds of millions of tasks
21:38:13 rledisez: i agree - i'd rather not have to translate the rows
21:38:21 shard the migration work with the new queue system
21:38:27 rledisez: Thanks
21:38:29 (half serious :) )
21:39:11 rledisez: how many object-expirers do you currently run? You're worried about a bunch of account GETs... i... I'm not sure I'm scared... once it's in the page cache... I'm not sure
21:39:25 like... what... 10K requests? meh..
21:39:43 could we have some config option that says "continue pulling work from the old location, but write new work down in two places" then later flip it to "pull work from the new place (and only write down new work in one place)"?
21:39:44 10K requests, every minute?
21:39:56 that's a lot, I already killed a cluster that way
21:40:47 timburke: i was about to propose something like object-expirer2
21:40:57 i was thinking of something based on the device id and current day. a process would fetch tasks if day % device_id == 0. not that great, but it would limit the concurrency on the legacy account
21:42:02 well, it's actually a very bad idea, forget it :)
21:43:08 I got an idea about that from kota_... and I think it is great. The idea is that the expirer has a "check_legacy_queue" boolean config option. We can control how many expirers check the legacy queue with that value.
21:44:16 m_kazuhiro: does that answer help you move forward for now?
21:44:38 i'd rather have something the operator does not need to think about. if the operator forgets that new param, the feature is broken. doesn't feel nice. but as we require that the operator run object-expirer on the object-server, it might be OK I guess
21:46:21 notmyname: Yes. I can move forward with the answer. But I want to get opinions about the idea.
21:46:36 rledisez: so you HAVE deployed enough object-expirers that the container servers holding the .expiring_objects account were not happy with the current amount of listings? I don't think i've ever had that problem.
21:47:09 er...
account servers or account/container servers
21:47:33 clayg: it was not object-expirer, but another internal process working pretty much the same way AFAIR
21:47:38 I've seen specific expiring_objects *containers* get busy (but that's because they're eating row updates)
21:48:03 and that part at least has been better since the RAX guys did the thing to make 100 containers for every timeslice, or however dfg wrote it
21:48:09 maybe i should do another test on our biggest cluster to see how it behaves now
21:48:37 i feel like i must be missing something... probably need to read about it more. but the migration path for inherently ephemeral data seems like it shouldn't be a stumbling block. is it that we're changing the cardinality of workers?
21:49:46 does the new design have a bunch of accounts? I thought it was account-per-task so I don't quite follow how the load at the account layer is reduced... and I'm not sure how many GETs on an account would need to happen before I'd be concerned...
21:50:13 clayg: account per $task-$ring-$partition_number
21:50:22 I mean... i think we can write these processes to be pretty nice to swift - and we'll only have as many clients as nodes... and there's probably more external clients than internal nodes...
21:50:35 ORLY!? ok then...
21:51:47 ok, yeah I misunderstood the diagram - the indentation is account -> container -> object - gotcha
21:51:54 ok cool!
21:51:57 as long as we don't wind up with some Kafka-esque rules for how our queue system works, things should be fine
21:52:43 m_kazuhiro: thanks for bringing this topic up. it's a good question
21:53:17 to everyone, where's the best place to keep discussing it? in -swift? or something else like the google doc?
21:53:24 m_kazuhiro: what would you prefer?
21:54:56 I think we should ignore it
21:54:58 I think the remaining topics can be discussed on IRC (not this meeting). So we can finish with the task queue for now.
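To make the two queue layouts under discussion concrete: the legacy expirer keeps every task under the single `.expiring_objects` account, spread over roughly 100 containers per timeslice (the RAX change mentioned above), while the proposed general task queue shards by account, one per `$task-$ring-$partition_number`. A hedged sketch of both naming schemes (the exact formulas, hash, and leading-dot convention here are illustrative, not lifted from the patches):

```python
from hashlib import md5


def legacy_task_container(x_delete_at, account, container, obj, divisor=86400):
    """Roughly the legacy scheme: one container name per timeslice,
    offset by a hash of the object path so each timeslice spreads over
    ~100 containers (all under the single .expiring_objects account)."""
    timeslice = int(x_delete_at) // divisor * divisor
    path = '/%s/%s/%s' % (account, container, obj)
    offset = int(md5(path.encode('utf-8')).hexdigest(), 16) % 100
    return str(timeslice - offset)


def task_queue_account(task, ring, partition):
    """Proposed scheme: a hidden account per task/ring/partition, so
    each object server only lists accounts for partitions it holds."""
    return '.%s-%s-%d' % (task, ring, partition)
```

The per-partition layout is what defuses the "DDoS of the account server" worry for new tasks: listings fan out across many small accounts instead of hammering one. The open question in the log is only about how to drain the one legacy account.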
21:55:00 this problem at least
21:55:13 ok
21:55:26 We could come up with all kinds of arbitrary ways to limit the number of deployed nodes which handle the legacy data
21:55:32 that's brought us pretty close to full time
21:55:37 #topic open discussion
21:55:39 "also_do_legacy_work = false"
21:55:47 I remembered the other thing I wanted to bring up at the beginning
21:55:49 next meeting time
21:55:55 yup
21:56:01 "legacy_processes = 4" "legacy_process = 1"
21:56:14 I suggest we skip the next 2 weeks
21:56:21 and next meeting is january 10
21:56:37 notmyname: can I still get into openstack-swift and tell everyone what santa brings me?
21:56:41 sounds good
21:56:42 on the 27th
21:56:46 any objections? (objecting is the same as volunteering to run a meeting)
21:56:53 no objections
21:56:56 :-)
21:56:56 notmyname: no objections
21:57:03 * joeljwright stays silent
21:57:04 ugh, I lost
21:57:09 ok for me, i'll be off starting friday
21:57:10 ok, next meeting is january 10
21:57:17 rledisez: have a good holiday!
21:57:28 thx clayg :)
21:57:35 rledisez: enjoy!
21:57:36 if you are taking a holiday, I hope you have a good time with friends and family
21:57:40 rledisez: bonnes vacances
21:58:01 thx all, i'm sure i'll enjoy it!
21:58:15 thanks, everyone, for all your work on swift :-)
21:58:24 thanks for a good swift-in-2017
21:58:30 #endmeeting