21:00:04 #startmeeting swift
21:00:04 Meeting started Wed Jan 31 21:00:04 2018 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:05 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:07 The meeting name has been set to 'swift'
21:00:19 who's here for the swift team meeting?
21:00:22 o/
21:00:25 o/
21:00:46 hello
21:01:05 hi
21:01:23 could be a nice small one (in terms of people).
21:01:30 seems like it
21:01:34 but the important people are here ;)
21:02:20 there are only important people around swift ;)
21:02:29 ok, not a whole lot on the agenda this week
21:02:36 #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:44 #topic releases
21:02:48 first up, releases
21:03:02 we've done a swiftclient release (it took a long time in the gate, but it's done now)
21:03:16 and swift itself is getting ready for a 2.17.0 release
21:03:21 \o/
21:03:45 I know I may have given the impression of doing the 2.17.0 release sooner, but I was waiting on a patch and I was traveling :-)
21:04:02 next swift client release will need to support different tempurl hashes, cause that just landed in swift
21:04:05 and we aren't under time pressure (yet) for it. we were on a schedule for the client release, so i did that one first
21:04:29 mattoliverau: good point
21:04:39 yeah, so a few last-minute things landing in swift for 2.17.0
21:04:47 the multiple hashes for tempurls is cool
21:04:52 yeah, in no hurry.. we might be able to land some more things so notmyname has to update the changelog again :P
21:04:54 and the data segments in SLOs is really nice :-)
21:05:17 yeah, I'll get the changelog updated today or tomorrow and do the tag when it lands
21:05:30 thanks again mattoliverau and cschwede, for looking at the tempurl stuff! i suppose i ought to start on a similar patch for formpost...
21:05:40 looking ahead to the official queens release...
21:06:00 I do not know yet if we'll do another 2.18 or 2.17.1 release before the queens deadline
21:06:02 timburke: multiple hashes in the client first, so people can easily use it
21:06:08 to some extent, it depends on what lands
21:06:21 if we do anything, I suspect it would only be a 2.17.1
21:06:34 IIRC the deadline for that release is only a couple of weeks away anyway
21:06:53 I added a basic part diff tool.. though maybe it isn't useful to people. It was fun to write tho :P
21:06:56 so maybe we'll just call this one an early queens release :-)
21:07:23 mattoliverau: what's that?
21:07:37 i pushed it last night, let me find it
21:07:51 i saw something about that... need to find time to take a look...
21:08:15 #link https://review.openstack.org/#/c/539466/
21:08:16 patch 539466 - swift - Add a basic partition diffing tool
21:08:31 oh, cool
21:08:53 Just from a discussion on channel the other week. so you can compare builders and rings to see what's different in the replica2part2dev tables
21:09:06 yeah, that could be quite useful
21:09:21 tho I don't know if the verbose option is useful.. but it's fun :P
21:09:36 when `swift-recon --md5` gives an error, this tool could tell you how bad it is
21:10:01 might need to add a device diff to the tool too. but you can just use ring builder for at least that list
21:10:17 any questions on releases, current or upcoming?
21:10:38 then you realize that you had a part power of 24. then you cry...
21:10:45 ;-)
21:11:03 yup.. but it looks pretty
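For context on what such a partition diff boils down to, here is a rough, hypothetical sketch (not the tool from patch 539466) of comparing the replica2part2dev tables of two rings; it assumes swift's RingData.load() and the internal _replica2part2dev_id attribute, so treat it as illustration only:

```python
# Hypothetical illustration only -- not the tool from patch 539466.
# Assumes swift's RingData.load() and its internal _replica2part2dev_id table.
from swift.common.ring import RingData


def diff_rings(path_a, path_b):
    """Report partition assignments that differ between two ring files."""
    table_a = RingData.load(path_a)._replica2part2dev_id
    table_b = RingData.load(path_b)._replica2part2dev_id
    changed = 0
    for replica, (row_a, row_b) in enumerate(zip(table_a, table_b)):
        for part, (dev_a, dev_b) in enumerate(zip(row_a, row_b)):
            if dev_a != dev_b:
                changed += 1
                print('replica %d, part %d: dev %d -> dev %d'
                      % (replica, part, dev_a, dev_b))
    print('%d assignments differ' % changed)


if __name__ == '__main__':
    import sys
    diff_rings(sys.argv[1], sys.argv[2])
```

At part power 24 that is 2**24 (about 16.7 million) partitions per replica to walk, which is why verbose per-partition output gets unwieldy, as noted above.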
21:11:21 could strip -v out
21:11:35 between now and the queens cutoff date, if you see something that's a big bug, that should take priority over new features
21:11:51 let's move on to the PTG
21:11:51 +100
21:11:57 #topic PTG planning
21:12:08 I made a thing
21:12:10 #link https://etherpad.openstack.org/p/Dublin_PTG_Swift
21:12:18 notmyname: is the checkpoint question still alive?
21:12:21 an etherpad for gathering topics
21:12:51 acoles: IMO yes it is, but I'd like to have torgomatic around. well, more people actually. and I think it's something we should talk about at the PTG
21:13:10 rephrased, I do NOT think it's something we should agree to or decide quickly
21:13:11 notmyname: ok, makes sense
21:13:30 add it to the etherpad! :-)
21:13:50 done
21:14:12 acoles: I'm looking forward to being able to talk sharding at the PTG. Sorry I've been a bit in and out regarding reviewing and working on it upstream over the last few weeks.
21:14:48 mattoliverau: NP, I'm also looking forward to a face-to-face session on it!
21:16:13 the PTG is in Dublin, Ireland, starting on monday february 26
21:16:33 we'll have a room for wed-fri, but I'm sure we'll find a way to talk on monday and tuesday, too
21:16:52 and it's likely we may also try to find some time for some off-site social activities too
21:17:19 And I found pretty direct flights, so it'll only be 26 hours of transit (instead of 30+) so I'm happy :)
21:17:32 if you find someone who's going to the PTG, please encourage them to update the etherpad
21:18:11 here's all the links to etherpads for the whole event
21:18:12 #link https://wiki.openstack.org/wiki/PTG/Queens/Etherpads
21:18:44 we're using a different naming pattern.
21:19:07 that's because i created the etherpad before I added a link to it on the wiki page
21:19:26 oh it's fine, just my OCD kicking in
21:19:44 https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads you mean, right?
21:19:47 hang on. no we aren't
21:20:03 oh yeah
21:20:09 r comes after q
21:20:16 #undo
21:20:17 Removing item from minutes: #link https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads
21:20:20 #link https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads
21:20:23 yeah, https://wiki.openstack.org/wiki/PTG/Rocky/Etherpads (i've had that open so didn't follow the irc link)
21:21:07 timburke: thanks
21:21:18 any other questions about the PTG? everyone ok for now?
21:21:26 * tdasilva snicks in...
21:22:02 WELCOME tdasilva!!!!!!!
21:22:09 any questions tdasilva? now that you're here :P
21:22:11 (you can't "snick" in ;-) )
21:23:03 #topic task queue: upgrade impact
21:23:12 m_kazuhiro: this is a topic you added to the agenda
21:23:14 #link https://etherpad.openstack.org/p/swift_general_task_queue
21:23:22 m_kazuhiro: the floor is yours
21:23:30 notmyname: thank you
21:23:49 First, I'll explain the background.
21:25:00 I'm implementing a patch to update the object-expirer now. https://review.openstack.org/#/c/517389/
21:25:01 patch 517389 - swift - Update object-expirer to use general task queue sy...
21:25:29 This patch makes the expirer use the general task queue.
21:25:46 With the general task queue feature, many object-expirers run on a swift cluster. An object-expirer runs on every object-server.
21:26:12 The expiration tasks are assigned to object-expirers according to the partition numbers their object-servers have.
21:26:30 To get the partition numbers of the local object-server, the object-expirer requires the ip and port information of the object-server. Therefore, in the patch, the object-expirer's config is moved from the special conf file 'object-expirer.conf' to a section in object-server.conf.
21:26:57 The hidden task account / container names are changed from the original task account / container (hereinafter, we call that "legacy" style). In legacy style, there is only one task account / container for a swift cluster. With the general task queue feature, there will be many task accounts / containers.
21:27:19 To make expirers compatible with legacy style tasks, we can set the "execute_legacy_task" flag in the object-expirer section in object-server.conf. If the value is True, the object-expirer will execute legacy style tasks. So we can choose which object-expirers run legacy style tasks on the object-servers.
21:27:49 After the patch, no legacy style expiration tasks are created. Expiration tasks are created only in the general task queue.
21:28:38 About this patch, the discussion points for now are...
21:29:33 1: If swift operators were running object-expirers (for legacy style tasks) NOT on object-servers before the patch, they need to redesign where object-expirers for legacy style tasks should run. This is an impact for operators.
21:29:49 2: If swift operators forget to update the object-server's [object-expirer] section, there will be an error message 'Unable to find object-expirer config section in object-server.conf'. Then, no object-expirers will run. Can we accept that behavior?
21:31:07 I want to discuss the above points.
21:31:16 thanks m_kazuhiro
21:31:31 yes, thank you
21:31:38 So in the etherpad you talk about 2 choices.
21:31:54 so it's about what happens in a cluster that is using expiring objects after they upgrade to a release that has a task queue
21:32:22 make people migrate to the new expirer configuration, or run 2 sets: legacy and task queue expirers, the former looking at the old config.
21:32:39 morning
21:32:47 sorry slept too much
21:32:53 kota_: no worries
21:33:20 Having 2 different expirers seems more confusing than just fixing it up on upgrade to me.
21:33:54 but on the other hand "run this legacy one until the queue is gone" does sound simple
21:34:00 I mean what do we call them, and if people are using automated swift-init or their own systemd init scripts then they may all need to be renamed
21:34:06 yeah
21:34:11 I suppose the queue may never be gone, if someone has a 7 year expiry or something, though
21:34:28 So it really depends on where you're running expirers
21:35:08 it's pretty smooth if it's already on object servers. But obviously more effort for people running them elsewhere (like rledisez I believe)
21:35:37 I put this in the etherpad, while brainstorming the steps involved (I might have missed some?)
21:35:57 - Move or create an '[object-expirer]' section in the object-server configuration (on all object server nodes)
21:35:57 - drop in a '/etc/swift/internal-client.conf' if one doesn't exist, or define an internal client location with 'internal_client_conf_path'.
21:35:57 - If you need legacy expirers (not green field and want to clean up old legacy locations):
21:35:57 - if old expirers were on some of the object servers:
21:35:57 - add 'execute_legacy_task = true' to only those, so they will still work and take advantage of the old existing 'processes' and 'process' settings.
21:35:58 - else:
21:36:00 - pick the same number of expirers to do legacy work. Then you can use the same 'processes' and 'process' settings. Or just pick 1 to do legacy work and don't define 'processes' and 'process'.
21:36:02 - Optionally, if you wish to choose a different task queue account prefix or expirer queue name prefix, do so now:
21:36:04 task_account_prefix = ?  (default 'task')
21:36:06 expirer_task_container_prefix = ? (default 'expirer').
21:36:29 is it ok for more than one new style expirer to have execute_legacy_task=true
21:36:37 ok that didn't paste the best.. go look in the etherpad :P
21:37:03 yup, but if you do, you need to use the old legacy processes and process
21:37:04 I mean, per server
21:37:13 acoles: it should be, I guess it's then needed to set the processes and process
21:37:24 otherwise we could DoS ourselves (all hitting the legacy namespace at once)
21:38:27 tho the good news is, even if you decided to only have 1 in legacy mode, we aren't adding to the legacy queue so it'll eventually get through it all
21:38:54 "eventually"
21:39:04 lol
21:39:06 yup
21:39:38 which is why processes and process haven't disappeared.. they're just legacy-only options now.
21:40:02 could we decide to have a fixed number of legacy processes (eg: 32)? if there are not 32 object servers, then the values for process/processes would adapt automatically
21:40:35 it would avoid overloading the legacy account/containers, while having some parallelization
21:41:06 maybe legacy processes could process *all* queue entries... if we haven't hit the expiration yet, add an entry to the *new* queue and pop from the old...
21:41:21 anyone who ran an expirer before is going to want a legacy mode, correct? and potentially forever?
21:41:34 yes
21:41:41 i hate that "potentially forever" part...
21:42:00 timburke: right, I wondered if legacy tasks could be migrated somehow
21:42:37 but it seems like having legacy mode automatically enabled would be nice
21:43:31 but controlling some election of nodes to handle legacy starts getting complicated. Because we don't want too many to hit the same legacy namespace.
21:44:01 we could say replica 0 of each.. but at 24 part power that's still a lot of servers hitting
21:44:14 so how should we move forward with this? keep discussing in here? discuss in irc in -swift? keep it in the etherpad?
21:44:37 well firstly..
21:45:13 if legacy might be around "potentially" forever, then in the case of this discussion, I think we want option 2.
21:45:24 only 1 set of expirers (the new ones).
21:45:30 that needs to handle legacy
21:45:58 otherwise we have 2 different expirer daemons and 2 sets of configs
21:46:17 too bad :'(
21:46:30 I'd rather 1, and either configure them the way people used to for legacy or some automatic way.
21:47:48 It sounds like that's where the discussion was going.. so we might be able to progress the question at hand
21:48:15 but happy to take it offline, into the etherpad, and definitely into discussions at the PTG :)
21:48:32 option 1 is my choice, and we will always have code related to legacy (either legacy expirer or conversion code, because we will never know if everything is converted in all clusters over the world)
21:48:55 enter the checkpoint release conversation... :-)
21:49:00 :)
21:49:11 lol
21:49:34 lol
21:49:36 ok, so mattoliverau and rledisez both say option 1 for now, based in part on the concerns from timburke and acoles. so that sounds like a good plan for now
21:49:37 well, offline/etherpad it is ;)
21:50:01 and we can readdress it at the PTG? (and of course in the etherpad or IRC before then)
21:51:01 mattoliverau: Thank you for leading the discussion.
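To make the upgrade steps pasted above concrete, here is a hypothetical example of what the [object-expirer] section in object-server.conf could look like, using only the option names mentioned in the discussion (execute_legacy_task, internal_client_conf_path, the legacy-only processes/process settings, and the two prefix options). The patch (https://review.openstack.org/#/c/517389/) is still under review, so names, defaults, and the sample values here are assumptions, not the final interface:

```ini
# Hypothetical sketch based on the options discussed in the meeting;
# the general task queue patch is still under review, so option names,
# defaults, and these sample values may change.
[object-expirer]
# point at an internal client config, or drop one in at
# /etc/swift/internal-client.conf
internal_client_conf_path = /etc/swift/internal-client.conf

# only set this on the nodes chosen to keep draining the old,
# single legacy task account/container
execute_legacy_task = true
# legacy-only settings carried over from the old object-expirer.conf,
# so every node doesn't hit the legacy namespace at once
processes = 4
process = 0

# optional: change the task queue naming
# task_account_prefix = task
# expirer_task_container_prefix = expirer
```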
21:51:03 I need to digest this some more, and we should definitely give it time at the PTG if not before
21:51:12 I agree
21:51:28 acoles: +1
21:51:37 #topic open discussion
21:51:49 is there more to bring up this week in the meeting? anyone else have something?
21:51:56 * acoles gets nervous about upgrades that break something that worked before
21:52:38 acoles: but it might work much better
21:53:09 mattoliverau: yes, it's the getting to 'better' that worries me :)
21:53:18 :)
21:53:28 ok, nobody's jumping in with anything, so I think the meeting is done :-)
21:53:34 thanks for coming, everyone
21:53:36 only my random lunchtime hack part diff tool. but we briefly talked about it before
21:53:43 thank you for your contributions to swift
21:53:49 #endmeeting
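As a footnote to the release discussion earlier in the log: the "multiple hashes for tempurls" feature means temp URL signatures no longer have to be HMAC-SHA1. A minimal sketch of computing a SHA-256 signature by hand, assuming the standard method/expires/path HMAC body (client-side convenience for this is what the upcoming swiftclient release mentioned above is expected to add; the key and path below are placeholders):

```python
# Minimal sketch: a temp URL signature using SHA-256 instead of SHA-1.
# The HMAC body (method, expiry, path) is the standard tempurl scheme;
# the cluster must accept sha256 signatures (Swift 2.17.0+).
import hmac
import time
from hashlib import sha256

key = b'mykey'                      # the account's X-Account-Meta-Temp-URL-Key
method = 'GET'
path = '/v1/AUTH_test/container/object'
expires = int(time.time()) + 3600   # link valid for one hour

hmac_body = '%s\n%s\n%s' % (method, expires, path)
sig = hmac.new(key, hmac_body.encode('utf8'), sha256).hexdigest()

print('%s?temp_url_sig=%s&temp_url_expires=%s' % (path, sig, expires))
```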