19:00:48 #startmeeting swift
19:00:49 Meeting started Wed Jul 24 19:00:48 2013 UTC. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:50 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:53 The meeting name has been set to 'swift'
19:00:58 hi
19:01:03 welcome
19:01:06 glad you can make it
19:01:18 (and I hope there are more here than the 4 that responded)
19:01:31 notmyname, hi
19:01:33 topics to discuss this week: https://wiki.openstack.org/wiki/Meetings/Swift
19:01:47 (not in order)
19:01:56 so first up, a swift hackathon
19:02:15 * swifterdarrell is here
19:02:32 since so many of the swift contributors will not be able to make it to hong kong, we're going to have a swift hackathon in Austin in October
19:02:47 how's the humidity in Austin in Oct?
19:02:48 the purpose is for coding on swift, not for presentations
19:02:57 peluse: worse than phoenix :-)
19:03:09 well, I guess we'll make it anyway :)
19:03:17 here
19:03:28 it will be pretty small (~30 people total)
19:03:38 October is usually pretty nice in Texas
19:04:00 * portante scrounges around to find a big texas hat
19:04:05 I'll have a public link and invite for the next meeting. I wanted to let people know it's coming though :-)
19:04:15 * creiht doesn't have a big texas hat
19:04:36 I'm also hoping that since it's in austin, several (all?) of the rax contributors will be able to make it
19:04:45 * portante found my wife's 40th birthday pink texas hat, puts it back right away ...
19:04:48 since they don't like going to openstack summits ;-)
19:04:54 portante: yikes
19:05:41 swiftstack will be sponsoring it. we'll provide a place, some tables, whiteboards, wifi, and some food.
19:05:43 * creiht can make no promises ;)
19:05:43 will you have specific coding topics?
19:05:48 to be fair, the annoyance of travel is super-linear in the duration of the journey
19:06:00 portante: "make swift better" :-)
19:06:32 ok, moving on
19:06:41 notmyname: portante: presumably tasks which require/benefit from collaboration would be ideal
19:06:41 I've not been to one of these things, what are the basic logistics (duration, agenda, etc) or you can just cover that in the invite if you're ready to move on to the next topic
19:07:01 peluse: it'll be covered in an invite
19:07:03 #topic enable versioned writes
19:07:08 Enable versioned writes by default? https://review.openstack.org/#/c/36979/
19:07:11 no
19:07:12 :)
19:07:26 this patch was submitted to allow versioned writes to be turned on by default
19:07:46 so my question was why not configure devstack to enable it, if they want to test it?
19:07:48 I don't care about the devstack question in the patch comments
19:07:50 brief recap of pros vs cons?
19:08:00 +1 on not enabling it by default; deployers can turn it on if users want it, right?
19:08:11 the larger question was raised about when if ever should a feature be changed to be enabled or disabled
19:08:18 swifterdarrell: agreed
19:08:19 I like the idea; I'm not a fan of software that has a bunch of options you have to set in order to get the goodies
19:08:20 pros: people tend to use defaults
19:08:33 +1 not on by default
19:08:41 cons: it's not currently on, so by doing nothing deployers who upgrade will get a new feature
19:08:51 cons: people tend to not read release notes
19:08:55 the defaults of the codebase don't (shouldn't) exist to enforce some kind of policy on deployers; (I guess you could say deployers NOT wanting it could opt out... but I think it's better for existing deployments to be opt-in for changes)
19:09:19 swifterdarrell: which is exactly why it was disabled to start with
19:09:26 yes, and for this feature, if it ends up misused it could use up a lot of storage, right?
19:09:35 portante: ya
19:09:40 "could"
19:09:46 could, agreed
19:10:03 I'm not versed on the feature but what swifterdarrell says sure makes sense
19:10:08 Presumably owners of large clusters are more careful than that, but having to adjust configs for updates is unpleasant.
19:10:23 zaitcev: one would think so ;-)
19:10:32 zaitcev: maybe creiht doesn't read release notes ;-)
19:10:52 but "well they should have read the notes" isn't a good reason
19:11:00 * creiht is just observing human nature
19:11:12 If defaults aren't meant to be used, then don't have them--require every setting to have a specified value. If defaults ARE meant to be used, don't change them on deployers
19:11:14 * notmyname is just teasing creiht
19:11:15 How hard is that?
19:11:16 I think it would make sense to keep this one disabled by default - whoever enables it should know what he is doing
19:11:18 really
19:11:28 but then if we decide to do that, we would have to go through and first define what a "default" cluster should be
19:11:40 * portante still fascinated by a pink texas hat ...
19:11:43 :)
19:11:53 davidhadas: exactly!
19:11:54 Default cluster is RAX
19:11:56 creiht: don't you mean your "evil chuck" face?
19:12:03 Or even "Historic CF"
19:12:03 zaitcev: I hope not
19:12:44 dfg: redbo: any opinion on enabling versioned writes by default?
19:13:59 No real opinion. We run with it on already.
19:14:16 ok
19:14:17 thanks
19:14:21 um- i think we run it. why would that feature be on and none of the others? except dlobjects i guess
19:14:32 i think both of those would be better as middleware though
19:14:42 but no strong opinions
19:15:01 is there a reason for it?
19:15:33 to be on by default?
19:15:35 dfg: that people use defaults, and turning it on means that more people will use it. the question was raised in a patch
19:15:50 does anyone know if it's on in the sample configs?
19:16:14 torgomatic: the samples reflect the defaults
19:16:15 if they want to start using it wouldn't they just turn it on?
19:16:23 the sample configs should all have the defaults as examples
19:16:28 #link https://review.openstack.org/#/c/36979/1
19:17:22 I think it is a lot safer for a deployer to find out they don't have a feature deployed, but they can just enable it
19:17:29 can we vote as to whether obj versioning and dynamic large object should be yanked out of the proxy server and made middleware? the proxy server code is really complicated and that would help clean it up some
19:17:42 s/some/a lot
19:17:44 rather than for a deployer that knows they don't have it enabled, suddenly have it enabled
19:17:54 dfg: I think that vote comes as the review on the patch to do it :-)
19:17:58 creiht: ++
19:17:58 dfg: but I'd support that
19:18:09 notmyname: more people will be facing capacity problems by this without understanding why putting same objects again and again eats up capacity.
19:18:13 creiht: ya, which is why I originally marked it as -1.
19:18:27 not really- it would inform this decision. making obj versioning on by default makes splitting it out a bigger change
19:19:01 dfg: middleware isn't a feature flag, though (although it can kinda be used for that)
19:19:32 dfg: but it seems that the question goes away since most of us are leaning to not enabling it
19:19:44 ok
19:20:01 ok, thanks for that
19:20:15 changing config defaults shouldn't happen, except for a very good reason
19:20:17 #topic json
19:20:18 Get rid of simplejson and use stdlib json? Apparently the APIs are slightly different; see https://gist.github.com/smerritt/6066977 and https://review.openstack.org/38014 for details.
19:20:21 torgomatic: ^
19:20:23 no
19:20:24 :)
19:20:43 short story is: the built in json never got the c speedups that are in simplejson
19:20:52 simplejson doesn't work with PyPy
19:20:55 creiht: we want pypy
19:21:22 Can't we support both? I thought we had been all this time?
19:21:23 so I don't think the answer is to make cpython use the slower library
19:21:30 is the json stuff on the common path?
19:21:34 do we really need them?
19:21:35 gholt: kind of?
19:21:46 all the tests run against simplejson all the time
19:21:53 seems like there should be a better way to fix it
19:22:03 ya- i like keeping simplejson too
19:22:05 but what about the majority of requests that swift handles?
19:22:20 also, I don't like simplejson's api
19:22:44 torgomatic: wait, what? json *is* an older version of simplejson
19:22:44 seems like a lot of cost for a small gain
19:22:45 sometimes you get str, sometimes you get unicode... hope your code is okay with both types
19:22:55 oh that part
19:22:55 creiht: they diverged, unfortunately
19:23:11 if we need simplejson for performance, worth the discussion, but if we keep pypy out because of simplejson, we are dropping potential performance improvements on the floor
19:23:13 at least stdlib json gives you the same type all the time
19:23:33 otherwise you get bugs where something 500s when a param is non-ASCII UTF-8
19:23:37 or we could fix simplejson if it is an issue?
19:23:56 creiht: By recompiling the C code? ;)
19:24:06 I think we should use ujson
19:24:10 lol
19:24:39 * portante bites, googles ujson
19:25:16 would an API adapter in swift which presents a common API w/either stdlib or simplejson under the hood be too crazy?
19:25:26 does ujson work with pypy?
19:25:48 lets do what we always do- write our own json parser !!
19:25:49 no idea, I just wanted to be different
19:25:50 swifterdarrell: it might work, but only if that adapter doesn't make things slower than just using stdlib json
19:25:55 :)
19:26:03 swifterdarrell: that seems like the way to go
19:26:45 torgomatic: you would avoid, for instance, looking at all the data and making sure it's Unicode for consistency
19:26:46 swifterdarrell: +1 i'd think it wouldn't be too hard to do right?
19:27:05 torgomatic: that's API ugliness, but I'd make that trade-off for speed (or, rather, keep making it)
19:27:08 seems like extra work for what kind of performance gain for cpython?
19:27:31 swifterdarrell: isn't there something like that in oslo? :)
19:27:37 lololol
19:28:10 or how about we help patch simplejson to fix api issues
19:28:18 If the performance sucks that bad for us, I guess we can maintain a fork. But I totally do NOT want to do that.
19:28:26 then we can run with either simplejson or json as is
19:28:54 creiht: you think the simplejson guys will take API-changing patches? seems like that could hose their users pretty effectively
19:29:05 but then, I don't know if they're sticklers for that sort of thing
19:29:10 portante: Good question... but who's going to spend the time collecting solid numbers? the folks who want to keep what we have or the folks who want to make it some unknown amount slower?
19:29:14 torgomatic: if it makes it the same as json? not sure how that would be bad
19:29:18 it doesn't hurt to try
19:29:27 and seems like the least sucky if you can
19:29:34 creiht: ask the guy getting errors when running w/stdlib json under pypy :)
19:29:41 Look at Alex' patch, Chuck. It's not something you can patch. The problem is that they have gone to the approved Python 3 model of handling strings. Once there nobody's going back and they'll continue returning unicode strings.
19:29:44 has anybody shown that pypy would really even be that much faster?
19:29:51 torgomatic: not my problem, everything works for me :)
19:30:20 seems like a bad attitude to have
19:30:22 I think running pypy with swift is neat, it doesn't warrant bending over backwards (at the cost of the normal implementation) to fix
19:30:42 creiht: +1
19:30:49 zaitcev: ahh that's a completely different issue then
19:31:02 like- don't we have bigger fish to fry?
19:31:03 also, we've got code in utils that imports simplejson, then falls back to json
19:31:15 if we're not going to work with stdlib json, let's rip that bit out
19:31:31 are we concerned with json parsing or generating speed?
19:31:43 "uses A or B except that it crashes with B" is crappy
19:32:03 torgomatic: +1 that's bad
19:32:08 torgomatic: that was done a long time ago (likely when the json api and simplejson api probably didn't diverge)
19:32:25 creiht: so the question is do we rip out json?
19:32:27 creiht: I'
19:32:39 m sure it was fine when it was done, but it rotted
19:32:53 I'm just providing context
19:33:12 sure
19:33:13 portante: from what I see the issue is that it breaks with json right now
19:33:36 but we can fix that
19:33:52 so, if we go with stdlib json, it makes things easier in N years' time when Swift ends up supporting python 3, as I don't think simplejson works w/py3
19:33:54 if someone has that itch, let them scratch it :)
19:34:02 for now, I can't see how we can ditch simplejson
19:34:33 torgomatic: there are going to be *so* many more difficult issues when it comes to supporting python3
19:34:46 creiht: absolutely true
19:34:57 but I'd rather not pile on any others if we can help it
19:35:30 so the options are either patching simplejson or writing an adaptor layer, right? did I miss another one?
19:35:32 that's fine with me as long as it doesn't come at a cost for the current implementation
19:35:42 we'll need to eventually make every string in swift into unicode instead of utf-8
19:36:07 well, except where it's dealing with actual file data
19:37:47 That's the Python 3 model
19:37:51 alright, I'll hack up an adapter layer thing and do some synthetic benchmarks
19:38:00 ok
19:38:02 It breaks down in WSGI so bad, you wouldn't believe
19:38:12 torgomatic: thanks
19:38:25 moving on..
19:38:37 torgomatic: can you also engage Alex_Gaynor to check against pypy speed?
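Note on the json topic: the "import simplejson, then fall back to json" pattern and the proposed adapter layer with synthetic benchmarks might look roughly like the sketch below. This is only an illustration, not the code in Swift's utils. The names `dumps`, `loads` and `benchmark` are hypothetical, and the sketch is written for Python 3, where the str/unicode divergence raised in the meeting does not show up.

```python
import timeit

try:
    # Optional third-party library with C speedups (may not be installed).
    import simplejson as _json_impl
except ImportError:
    # Standard library fallback, always available.
    import json as _json_impl


def dumps(obj):
    """Serialize obj to compact JSON text with either backend."""
    return _json_impl.dumps(obj, separators=(",", ":"), sort_keys=True)


def loads(data):
    """Parse JSON from text or UTF-8 bytes with either backend."""
    if isinstance(data, bytes):
        data = data.decode("utf-8")
    return _json_impl.loads(data)


def benchmark(payload, number=1000):
    """Time encode and decode of one payload (a synthetic benchmark)."""
    encoded = dumps(payload)
    return {
        "impl": _json_impl.__name__,
        "dumps_s": timeit.timeit(lambda: dumps(payload), number=number),
        "loads_s": timeit.timeit(lambda: loads(encoded), number=number),
    }
```

Callers only ever touch `dumps` and `loads`, so the backend can change without touching call sites; whether the wrapper costs more than it saves is exactly what the benchmark would need to show.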
19:38:45 #topic erasure codes questions
19:38:56 it's not a bad python 2 model either. the wsgi part isn't that bad, the problem is just how much code we'll have to verify works okay when unicodes start appearing where there used to be strs.
19:39:05 portante: for json stuff? sure, once I have the benchmarks written
19:39:12 great
19:39:15 thx
19:39:44 since our last meeting, we've talked more about erasure codes. are there questions that need to be discussed here (rather than just "normal" IRC questions)?
19:39:53 notmyname, can we merge this patch (Forklift the DiskFile interface into it's own module) into the EC branch?
19:40:21 yuanz: ya, I can do that
19:40:35 Why not. As I understand EC does not place unusual demands on DiskFile.
19:40:44 yuanz: I'll do it as soon as the meeting is done
19:41:01 zaitcev: the hope is that master will get merged into the ec branch frequently so they don't diverge
19:41:10 ah, ok
19:41:17 notmyname: I was going to ask about "frequency", weekly?
19:42:02 peluse: it's a matter of me pressing a button. I haven't set up a schedule for it. IMO it should be at least weekly, if not more often
19:42:20 notmyname: Can I ask about our plans to reduce cross region replication traffic?
19:42:35 vvechkanov: ya, just a minute
19:42:42 anything else on ec?
19:42:45 EC is fascinating, but as I understand Joe Arnold has already arrayed significant forces against it, so I don't feel compelled to help along. It's bound to happen in due time...
19:43:03 zaitcev: against?
19:43:11 who is joe arnold and why should we care?
19:43:11 zaitcev: we're pushing it and writing it :-)
19:43:16 Yea, I mean like attacking the problem
19:43:48 #topic open discussion
19:43:50 vvechkanov: go ahead
19:44:10 Hello all. I have a question about swift replication. I think it is a good idea to reduce traffic between regions. For that we can modify replication to prefer replicating in the local region rather than in a foreign one.
19:44:27 vvechkanov: like the read and write affinity?
19:44:36 Yes.
19:44:40 * portante sheepishly crawls back under his pink texas hat
19:45:18 seems like it makes sense to me
19:45:31 vvechkanov: it would depend on the particular patch, I'd think
19:45:47 vvechkanov: but the point of the replication is to provide the high durability
19:46:00 hmm...I withdraw my previous statement about it making sense
19:46:47 if you are looking at it from a "replicate to the copies in the same region instead of using the WAN link" that makes sense to me
19:46:53 seems to me that if you're going to replicate data, you should replicate it to where the proxy will look for it, not to some other place
19:46:56 keeping all replicas in a region doesn't
19:47:30 torgomatic: ya, a 2 region, 3 replica system maybe should prefer to keep the 2 copies on the one region in sync with each other?
19:47:43 for the data move (not checks)
19:47:47 shouldn't this be a subject for a different time?
19:47:56 like when i'm not around :)
19:48:03 We are planning to replicate every step in one region and on some steps, for example every 10th, replicate to other regions.
19:48:32 ./kick dfg problem solved ;)
19:48:32 heh, there's already enough confusion about what happens to PUTs to one region with write affinity enabled... I can't imagine how bad it'd be with cross-region replication further interfered with..
19:48:34 * swifterdarrell shrugs
19:48:55 vvechkanov: I don't understand what that means
19:49:31 vvechkanov: since replication is push-based and doesn't share state between the nodes, I'm not sure how it would work
19:49:37 oh- i missed the open-discussion tag. sorry :)
19:50:03 I mean replication in region will be more often than cross-region replication.
19:50:39 What is Havana schedule?
19:50:57 zaitcev: october-ish?
19:51:04 notmyname: filter out nodes from foreign regions in replicator for every run except every 10th
19:51:10 I really need to know how aggressive I need to be with DB broker, because if LFS misses Havana, then I'm fired
19:51:14 ogelbukh: ah ok
19:51:30 basically that's the simplest(?) approach we could imagine
19:51:54 Peter seems good with DiskFile, he's gonna make it by October I'm sure
19:52:14 or may be just easiest
19:52:28 as in easy vs simple
19:52:58 ogelbukh: sounds like it would work. I'm not sure if it's a good idea or not. I don't think I'd be too comfortable running that way. others may be
19:53:03 Simple, but seems asking for tweaking the ratio, which is an extra knob.
19:53:50 ogelbukh: vvechkanov: couldn't you achieve similar results with network-level QoS/shaping on a separate cross-region replication network, possibly with a higher object-replicator concurrency level?
19:54:17 ogelbukh: vvechkanov: I thought the point of the sep. replication network was to allow control of that traffic flow outside of Swift
19:55:28 swifterdarrell: it was and it still is
19:56:03 sweet! no patch needed
19:56:22 )
19:56:33 anything else in the last 4 minutes?
19:56:37 I'll +2 almost any zero-line patch
19:57:49 if you're looking for reviews, the account-acls one could use some eyes
19:57:55 there's a ton of others too
19:57:59 That one is complex.
19:57:59 thanks for your time
19:58:22 ya
19:58:25 #endmeeting
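Note on the open discussion: the 19:51:04 suggestion (skip nodes in foreign regions on every replicator run except every 10th) could be sketched as below. This is a hypothetical helper, not Swift's replicator code. The function name `nodes_for_pass`, the `remote_every` knob, and the assumption that each node is a dict with a "region" key are all illustrative.

```python
def nodes_for_pass(nodes, local_region, pass_number, remote_every=10):
    """Return the nodes one replication pass should push to.

    nodes: list of dicts, each assumed to carry a "region" key.
    pass_number: 1-based counter of replication passes on this node.
    remote_every: every Nth pass also includes foreign regions.
    """
    if remote_every < 1:
        raise ValueError("remote_every must be >= 1")
    if pass_number % remote_every == 0:
        # Full pass: local and foreign regions alike.
        return list(nodes)
    # Ordinary pass: stay inside the local region.
    return [n for n in nodes if n["region"] == local_region]
```

With the default ratio, passes 1 through 9 return only local-region nodes and pass 10 returns everything. The ratio is the "extra knob" mentioned at 19:53:03, and the sketch does nothing to address the durability concern raised earlier in the discussion.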