21:00:35 #startmeeting swift
21:00:36 Meeting started Wed Aug 19 21:00:35 2015 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:37 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:40 The meeting name has been set to 'swift'
21:00:54 who's here for the swift meeting?
21:00:56 hello
21:00:57 o/
21:00:57 o/
21:00:59 o/
21:01:00 hi
21:01:00 hi
21:01:01 o/
21:01:01 Here by phone
21:01:11 ^
21:01:13 So this is what it feels like to have a meeting at a normal hour :P
21:01:14 hi
21:01:33 <--
21:01:37 mattoliverau: heh
21:01:39 o/o/
21:01:43 hi
21:01:44 mahatic: are you awake at some crazy hour?
21:01:45 mattoliverau: so this is what it feels like at a very odd hour :P
21:01:47 here
21:01:58 mahatic: lol
21:02:01 mahatic: 2:30am for you?
21:02:17 hello
21:02:22 notmyname: yes! weekly meeting cancelled for 6 am tomorrow! so I thought I'll pull off a couple of more hours :P
21:02:30 wow
21:02:37 ok, then let's get started
21:02:41 #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:54 #topic hackathon summary
21:03:06 last week we had a great week (IMO) in austin at the hackathon
21:03:20 I'd like to again thank jrichli for hosting and coordinating everything on the ground
21:03:20 +100
21:03:26 +100
21:03:29 also for the cookies ;-)
21:03:33 +100
21:03:37 yup jrichli cookies were great
21:03:48 great job jrichli
21:03:50 You all are welcome! it was fun
21:04:07 Thanks for the cookies.
21:04:15 nom nom nom
21:04:22 I'm currently writing up a summary of the week in a blog post. should be published later this week (that's the plan)
21:04:22 :-)
21:04:29 but for a summary of the summary...
21:04:42 heh
21:04:44 :D
21:04:51 landing patches faster: need better tools and communication
21:05:04 for tools, there's a couple of things to look in to
21:05:04 it was great, jrichli!
21:05:18 glad you liked it redbo!
21:05:19 I've been told that gerrit doesn't have sticky comments
21:05:41 and I'm still working on a "group prioritization" thing to find who's starred what patches
21:05:56 but the basic thing I took away there was to communicate more on patches
21:06:15 if there's already a +2, don't push a patch set over it unless necessary
21:06:23 if you rebase, make a note saying that's all you did
21:06:41 if you push a patch set up, give a quick sentence explaining the difference
21:06:59 that will help us all use the tool better (until such time as we have better tools)
21:07:18 also, lots of stuff on EC issues and getting it to "done"
21:07:32 several things identified and being worked through now
21:07:50 it means there are some good patches up to review now, and take a look at the bugs tagged "ec"
21:08:07 I think that we'll be able to clear up these known issues and have something reasonable by tokyo
21:08:18 I have to admit I started skipping EC recently.
21:08:34 that's like the opposite of the right thing! ;-)
21:08:39 lol
21:08:43 I know we have Clay, Kota, Sam et al on the job, so it's easy to get lazy.
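For anyone who would rather script the "don't push a new patch set over an existing +2" check than eyeball it, a minimal sketch against the Gerrit REST API might look like the following. The host, query string, and result limit are illustrative assumptions, not an agreed tool.

    # Rough sketch (assumed host and query): list open swift changes that
    # already carry a Code-Review +2, so a rebase or follow-up gets a note
    # instead of silently replacing the reviewed patch set.
    import json
    import requests

    GERRIT = 'https://review.openstack.org'
    QUERY = 'project:openstack/swift status:open label:Code-Review=2'

    resp = requests.get('%s/changes/' % GERRIT, params={'q': QUERY, 'n': 50})
    resp.raise_for_status()
    # Gerrit prefixes JSON responses with ")]}'" to prevent XSSI; strip it.
    changes = json.loads(resp.text.split('\n', 1)[1])
    for change in changes:
        print('%s %s' % (change['_number'], change['subject']))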
21:08:46 zaitcev: we missed you last week
21:08:55 zaitcev: oh, I can sympathize with that :-)
21:09:33 we also talked about encryption and what's going on there
21:10:00 jrichli has organized https://trello.com/b/63l5zQhq/swift-encryption so that the big outstanding stuff is clear
21:10:11 in the "to do for reduced scope" column
21:10:42 I think we'll be able to make some great progress on this before tokyo, but I'd be surprised if we get it fully done by then
21:11:02 (unless someone tells me "oh yeah it's simple and we're totally already done")
21:11:15 :-)
21:11:18 we'll get the PUT path done :)
21:11:50 also at the hackathon we talked about global cluster improvements. not a ton of stuff there except maybe some handoff affinity. most of the rest is deployment management
21:12:22 * cschwede is curious about handoff affinity
21:12:28 notmyname: is there a bug on handoff affinity? I still don't think I understand exactly how the sorting is going wrong for us?
21:12:53 are we talking write affinity or handoff affinity
21:13:01 cschwede: right?! like I think we're supposed to already do this in the proxy, I think something is wrong - but I don't really understand what's breaking down
21:13:03 handoff, I thought
21:13:40 choose a handoff in the same region as the failed primary
21:13:42 mattoliverau: it must be the affinity of a write when write affinity is turned on and some amount of primaries is down
21:13:47 maybe I misremembered
21:14:31 clayg: yup ok, I've also got a patch up there to add handoff affinity to the replicators
21:14:35 yeah, we need to actually have a bug written down
21:14:41 well, even if primaries are not down but use write affinity - you might need handoffs_first? but maybe needs to be discussed outside the meeting. i'm really curious!
21:14:44 so we're talking from the proxy side, got it
21:15:21 sometimes proxy is not enough? ie prevent new data written to a node, but also move data asap somewhere else?
21:15:27 handoff affinity in the replicators? like when a primary gets a 507 for a partner in the same region? I don't think I have that starred
21:15:56 we need to open a launchpad tag for global-cluster and make some issues
21:16:00 clayg: yeah, it was a request from an OP during the ops session in Vancouver
21:16:01 ok, so it seems like there's something to do with handoffs. and they might (or might not) have affinity.
21:16:04 clayg: good idea
21:16:16 lol
21:16:21 right, moving on :P
21:16:25 heh, ok
21:16:26 the full clusters discussion was good, I thought. got a list of stuff to write down in some "what to do when swift asplodes" doc. also some ideas as to new tools to write to help in cluster full scenarios
21:16:33 mattoliverau: everyone keeps saying someone in vancouver said something about affinity - but I'm not sure that anyone knows who - or wrote down what they actually wanted
21:16:54 yeah, it's on the priority reviews wiki
21:17:03 mattoliverau: not in any detail, though
21:17:10 that's true
21:17:25 mattoliverau led a good discussion on container sharding
21:17:29 mattoliverau: "handoff affinity (handoffs in the same region)" <= i don't know what that means or what's broken
21:17:35 Well I have something up, and hopefully the op in question will review :P
21:17:39 one thing I liked about it is that there are a lot more people looking/thinking about it
21:17:56 yay container sharding! redbo will know what to do!
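As a rough sketch of the handoff-affinity idea discussed above (prefer handoffs in the same region, and then zone, as the failed primary), the sort below shows the intent in plain Python. The node dicts mirror the region/zone fields a ring device entry carries, but this is illustrative only, not the proxy's or replicator's actual sorting code.

    # Illustrative only: given a failed primary and a list of handoff nodes
    # (dicts with 'region'/'zone' keys, like ring device entries), prefer
    # handoffs in the same region, then the same zone, before the rest.
    def sort_handoffs_for(failed_primary, handoffs):
        def key(node):
            same_region = node['region'] == failed_primary['region']
            same_zone = same_region and node['zone'] == failed_primary['zone']
            # False sorts before True, so negate to put matches first
            return (not same_region, not same_zone)
        return sorted(handoffs, key=key)

    failed = {'region': 1, 'zone': 2, 'ip': '10.0.0.2'}
    handoffs = [{'region': 2, 'zone': 1, 'ip': '10.1.0.5'},
                {'region': 1, 'zone': 3, 'ip': '10.0.1.7'},
                {'region': 1, 'zone': 2, 'ip': '10.0.0.9'}]
    print(sort_handoffs_for(failed, handoffs))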
21:17:59 yeah, I'll write up a new version of the SPEC to include what was discussed (maybe on the plane ride home)
21:18:04 and it looks like there might be a reasonable reduced-scope plan of what to do
21:18:05 ^ sharding
21:19:15 redbo and dfg and nadeem and allen told us about hummingbird. I like it a lot and they did a great job. I have a much better idea of what the goal actually is and the plans to solve the problem
21:19:25 +1
21:19:39 * clayg wonders if he'll ever get back to his hummingbird patch
21:19:44 I don't have a hard requirement for c sharding, but I'm very enthusiastic for Matt actually doing it. Swift needed it for years, but we got by with application-controlled split-up for containers.
21:19:45 for those not there, the problem being solved is summed up as "object replication is slow in big clusters" (somewhat simplified)
21:20:12 so the golang object server + object replication processes help with speeding up replication
21:20:18 and that right there is a really good thing
21:20:51 so the initial numbers they presented were pretty encouraging, and they shared some tools on how we can all duplicate the tests in our own environment
21:21:28 can you give a raw estimation? like 10% faster, or more like 10x?
21:21:37 * cschwede needs faster object replication
21:21:49 notmyname: oh yeah! I want to try that with keep_cache_private = true
21:21:51 but also, there was the common understanding that the hummingbird branch is intentionally simplified (eg no policy support) for now, and it's a long-term thing
21:22:49 cschwede: I think it was more like "can rebalance as fast as we're adding capacity" vs. "can not rebalance as fast as we're adding capacity"
21:22:54 #10PB+problems
21:23:03 cschwede: the numbers for now are more about asyncs and .... yeah, what clayg said
21:23:32 right. how many partitions can get rebalanced when adding significant capacity
21:23:33 cschwede: put 1.5X, get 4.6X, delete 1.4X in my memo.
21:23:43 ho: thanks!
21:23:51 redbo: is that right? ^
21:24:03 cschwede: oh, I thought you were talking about replication not the object server
21:24:14 cschwede : same as ho in my notes, ho looked it up faster!
21:24:25 I thought reads from the page cache were like 6x faster
21:24:44 acoles: hehe
21:24:54 and I thought the os threads for blocking syscalls got them like near 2x on writes & deletes
21:24:58 clayg: ho: notmyname: acoles thx! sounds great. yeah, i had replication in mind primarily, but put/get as well
21:25:19 * cschwede needs to have a closer look at hummingbird
21:25:38 Sorry, wasn't following too close. Replication speed is one of the big goals.
21:26:16 clayg: yeah i noted 2x on write/delete for 1k objs, numbers above were 1M
21:26:22 I really do wanna do server per port with ~4 workers per port and keep_cache_private = true
21:26:38 Policies are one simplification. I don't think it'll be too difficult to add replicated policies, we just don't use them.
21:27:07 redbo: right. wasn't saying that as a ding. just that the goals right now are more around replication times in your clusters than "full replacement"
21:27:17 which is totally reasonable IMO
21:28:08 so aside from the specific things going on, the larger thing is just that there is a *ton* of stuff going on in swift right now
21:28:22 we had 27 people at the hackathon. and over 40 things listed as topics to talk about
21:28:26 redbo: and POST!
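For anyone who wants to reproduce the kind of PUT/GET comparison quoted above in their own environment, a minimal timing sketch with python-swiftclient could look like the following. The auth URL, credentials, object size, and count are placeholders, and the test tools shared at the hackathon are far more thorough than this.

    # Minimal sketch: time a batch of small PUTs and GETs against a cluster.
    # Run once per object-server implementation and compare the rates.
    import time
    from swiftclient import client

    conn = client.Connection(authurl='http://127.0.0.1:8080/auth/v1.0',
                             user='test:tester', key='testing')
    conn.put_container('bench')

    payload = b'x' * 1024  # 1 KB objects, arbitrary
    count = 1000

    start = time.time()
    for i in range(count):
        conn.put_object('bench', 'obj-%06d' % i, contents=payload)
    print('PUT: %.1f obj/s' % (count / (time.time() - start)))

    start = time.time()
    for i in range(count):
        conn.get_object('bench', 'obj-%06d' % i)
    print('GET: %.1f obj/s' % (count / (time.time() - start)))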
but feature parity may be a distraction; enumeration of differences is a) more useful and b) required to quantify parity
21:28:41 so if you feel like it's hard to keep track of all the things and a lot is happening, you're right!
21:28:44 notmyname: did we get to everything when it was all said and done?
21:28:58 clayg: we did not do anything about metadata searching
21:29:12 but yeah, we got almost everything
21:29:15 sigh, or part power adjustment - sorry cschwede
21:29:23 well - that's pretty good i 'spose
21:29:34 notmyname: I feel like it's hard to keep track of all the things and a lot is happening
21:29:43 clayg: you're right! ;-)
21:29:58 good that we meet again in ~ 9 weeks :)
21:30:17 omg - i can't believe it
21:30:26 so with all the stuff going on, it's pretty important that we prioritize and realize what is reasonable to do before tokyo, what we can make progress on in the short-term, and what we defer to "conversations over beer"
21:31:22 so unless there's other things to say about the hackathon, that's a good segue to talking about releases
21:31:41 yay release!
21:32:14 probably good to release code sometimes :)
21:32:15 #topic releases
21:32:18 yeah
21:32:30 so I want to do a release next week
21:32:35 okay
21:32:39 although at this point I want to get in peluse's etag_buckets fix for EC overwrite and probably other stuff too
21:32:39 I'll be working on release notes, etc
21:32:46 it will be 2.4
21:33:08 and that should give us time to have one more release before tokyo (albeit with a fast turnaround)
21:34:01 so here's a "rule" I'd like to try to follow: if there is any open bug that is marked as critical, it blocks a release
21:34:17 (rule in quotes because I don't know a better word off the top of my head)
21:34:45 guideline ;)
21:34:59 so if you're looking at a swift bug in launchpad.....rather, "when" you're looking at swift bugs in launchpad, if there's something that must be in the next release, it should be marked critical
21:35:03 mahatic: thanks :-)
21:35:16 does that sound ok with everyone?
21:35:19 https://bugs.launchpad.net/swift shows 5 critical ones.
21:35:23 yup, great plan
21:35:48 zaitcev: nice! so... are they in progress?
21:35:54 ya
21:36:17 the 2 open ones are the 2 we aren't supposed to talk about publicly yet ;-)
21:36:17 3 are fix committed
21:36:25 shhhh
21:36:38 ninja bugs
21:36:50 phthththth - dorks
21:37:03 anything else about releases?
21:37:29 notmyname: sounds ok
21:37:41 #topic keystone related patches
21:37:46 #link http://paste.openstack.org/show/421542/
21:37:50 ho: this is your topic
21:37:56 ho: drop some knowledge
21:37:59 yeah, I would like to introduce keystone related patches in swift to get attention from you :-)
21:38:07 one of these years, I should figure out how to spin up a Keystone instance
21:38:19 I summarized the current status in the paste url.
21:38:37 If you have time, please read it and feel free to ask me.
21:38:51 that's what I want to say here. :)
21:38:56 ho: thanks for the paste, that's useful :)
21:38:59 torgomatic: you can't - your business hours overlap with the North American timezone - for some reason it's impossible to know anything about swift+keystone if you work in North America - no one knows why
21:39:25 clayg: i'm moving to north america then :P
21:39:27 * clayg goes to read the paste
21:39:28 lol
21:39:42 acoles: great idea!
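A minimal sketch of the "any open critical bug blocks the release" check discussed above, using launchpadlib against the swift project on Launchpad; the consumer name and the particular set of statuses treated as "open" are assumptions, not an agreed tool.

    # Rough sketch: list open swift bugs marked Critical before cutting a
    # release. Anonymous Launchpad access; the "open" status list is assumed.
    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_anonymously('swift-release-check', 'production')
    swift = lp.projects['swift']
    open_statuses = ['New', 'Confirmed', 'Triaged', 'In Progress']
    blockers = list(swift.searchTasks(importance='Critical',
                                      status=open_statuses))
    for task in blockers:
        print(task.title)
    print('release blocked' if blockers else 'no open critical bugs')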
lol@
21:39:56 ho: thanks for working on this
21:40:18 ho: oh I think i misunderstood - you don't want more attention to swift+keystone *in general* - you just have some keystone patches that need more immediate attention
21:40:18 I just had someone today mention something about swift+keystone v3 and wanting something better
21:40:31 torgomatic: it's very easy. just edit and run this - http://www.zaitcev.us/things/keystone-setup.sh
21:40:51 zaitcev: that's great!
21:41:01 so ho has done a lot of work on a comprehensive test suite for checking all role based access scenarios. they were taking a looong time to execute but we got that done some in austin
21:41:02 zaitcev: nice! saved
21:41:24 s/done/down/
21:41:26 clayg: yeah
21:42:08 zaitcev: I don't think `systemctl start openstack-keystone.service`
21:42:14 ^ will work on my system
21:42:45 i'm also trying to make vagrant-swift-all-in-one work with keystone
21:43:00 zaitcev: that's like "given a pre-existing keystone deployment" here's 100 lines of bash that can "set it up"
21:43:45 I will push it to the repository but i have to learn more about chef :-)
21:43:48 i think mattoliverau's ansible scripts set up keystone, right?
21:43:50 ho: thanks for bringing these up
21:44:01 clayg: My point is, you don't need to bring up the whole of OpenStack, just do what the script does and it makes Keystone with stable IDs and SQLite backend. Fairly painless when compared with the OOO-based installer...
21:44:27 tdasilva: yeah, I have something, I'll get that up one of these days
21:44:47 hugokuo: has keystone setup scripts too!
21:45:12 sounds like there's a few things. ok, everyone go race to see who can write docs and scripts the fastest! go!
21:45:33 #topic open discussion
21:45:40 what else is on your mind?
21:46:12 well, everyone has scripts! I don't think that really solves "a group is just a role that assigns a user to project under the domain, what's the problem!"
21:46:35 eh, the scripts still help (me)
21:46:54 last time I tried was by spinning up a full devstack, and it took two hours and didn't work
21:46:55 yeah, i suppose you're right
21:47:10 sounds familiar
21:47:18 I'm at the openstack ops meetup right now. there's love for swift here. "it just works." "it's very stable" "I don't have to worry about it". etc
21:47:22 torgomatic: heh, i seem to be able to get my devstack turned on most of the time
21:47:25 notmyname: I have that testr patch currently marked as a WIP. If anybody cares to look and give feedback on it.
21:47:26 be proud of what you've done with swift :-)
21:47:26 I think if we had more time at the hackathon, we might have gotten to a discussion regarding small file optimizations for EC.
21:47:40 hurricanerix: right! that's a good one
21:47:48 hurricanerix: I'm planning on looking at that asap
21:47:56 torgomatic: but using keystone client is a real PITA - there's some discrepancy between what the api says to do and what you can do with the keystone client? I think the openstack client is better supported but less documented or something?
21:47:58 (which might be a day or so)
21:48:22 Specifically, whether we'd want to go forward with a dual-policy approach for EC, for small vs. large objects.
21:48:32 notmyname: cool, thanks.
21:48:38 (replication and EC).
21:48:44 On the other hand, I met an IT guy from New York who said he attended some of our sessions in Vancouver, and found "Swift people" "disorganized".
21:48:53 minwoob_: yeah, that's a big deal. definitely needs to happen (small file optimization).
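Since several people above mention struggling with keystone v3, the sketch below shows one way to point python-swiftclient at a v3 keystone. Every endpoint, account name, and domain is a placeholder for whatever a setup script like zaitcev's creates; it is only meant to illustrate the os_options needed for v3 auth.

    # Minimal sketch: talk to swift through keystone v3 with python-swiftclient.
    # All URLs, names, and domains below are placeholders.
    from swiftclient import client

    conn = client.Connection(
        authurl='http://127.0.0.1:5000/v3',
        user='swiftop',
        key='secret',
        auth_version='3',
        os_options={
            'project_name': 'demo',
            'user_domain_name': 'Default',
            'project_domain_name': 'Default',
        })
    headers, containers = conn.get_account()
    print(headers.get('x-account-bytes-used'),
          [c['name'] for c in containers])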
post tokyo, though
21:49:05 in the few times I opened my laptop, I saw a ton of pep8 patches (and zaitcev saying stuff on twitter). what's going on?
21:49:18 torgomatic: clayg keystoneclient CLI doesn't support v3 so openstack cli is the way
21:49:18 zaitcev: we *are* disorganized - there's too much going on :'(
21:49:42 acoles: I think zaitcev's script gets by with 'keystone'
21:49:51 zaitcev: and we didn't have monitors in the meeting rooms so that made showing things harder
21:49:53 notmyname: The gentleman with a strange nickname was trying to reduce the Hxxx/Fxxx exclusion list. One patch per exclusion.
21:50:09 the exclusion list is not a TODO list
21:50:17 oh
21:50:26 notmyname: heh
21:50:26 notmyname: Right. Do you think it's something good for a spec?
21:50:28 notmyname: yes, and we allowed the ones that made sense and rejected the ones that didn't
21:50:36 torgomatic: ok, great
21:51:10 minwoob_: it would help to have more words - what you're thinking and how you're thinking about doing it - a spec is a place you can write words - so it might work
21:51:18 I spoke with someone here, and it seems like it would be possible to adjust some dependencies so that we can get back to a whitelist of style changes. it's something I'll look in to at some point
21:51:32 clayg: minwoob_: we had some good ideas back at the boston hackathon
21:51:43 clayg: yeah but i don't see it adding all the test accounts we now have - there's one in a non-default domain
21:51:44 not sure what (if anything) was written, though
21:52:03 I'll try to coordinate with someone on that, then.
21:52:19 the *boston* hackathon? about directing to ec/replicated "policy" based on size?
21:52:51 It probably goes deeper than just "if an object in the EC policy is < a certain size, store it via replication".
21:52:56 clayg: something about just storing the whole data for small objects, but in every fragment (ie 14x replication).
21:53:04 minwoob_: yeah, probably
21:53:15 But a good start.
21:53:19 minwoob_: redbo commented that basically the way the ec policies were exposed makes them very difficult to deploy for service providers because... well because users make unexpected choices :\
21:53:33 minwoob_: so, yes, it would be good to work on and it's definitely needed. one reason is because of what clayg just said
21:53:55 minwoob_: so I'd love to see something there, but don't expect it to get much attention for a few months at least
21:54:08 Yeah, definitely something post-Liberty.
21:54:25 depending on what we get done in the next few weeks, it might be a decent session in tokyo
21:54:27 if we can do a small file optimization for EC policies, could we do a large file EC optimization for replicated policies? :)
21:54:32 ho: anyway - if you *could* keep working on keystone support for vagrant-swift-all-in-one that'd be great
21:54:38 redbo: one policy for everything!
21:55:05 anything else to bring up from anyone this week?
21:55:09 clayg: thanks! i will brush it up more
21:55:39 Just one more step and it'll turn out that policies were unnecessary to begin with. Except for e.g. geographic policies.
21:55:55 zaitcev: just one policy: do-the-right-thing
21:56:06 lol
21:56:09 ok, I'm calling it
21:56:17 I'm really not sure if it's quite as simple as a single storage policy that can do both kinds of data :\ It feels more like per-policy constraints or automagic re-routing of requests + symlinks
21:56:17 thanks for coming today. and thanks for working on swift
21:56:20 what are you calling it?
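To make the small-vs-large discussion above concrete: today a storage policy is fixed per container (via X-Storage-Policy at container creation), so the closest an application can get without in-Swift support is to pick the container by object size. The sketch below shows that workaround; the policy names and the 1 MB cutoff are made up, and a real small-file optimization inside Swift would be transparent to clients.

    # Illustrative only: route small objects to a replicated-policy container
    # and large ones to an EC-policy container. Policy names and the cutoff
    # are made up for the example.
    from swiftclient import client

    SMALL_CUTOFF = 1024 * 1024  # 1 MB, arbitrary

    conn = client.Connection(authurl='http://127.0.0.1:8080/auth/v1.0',
                             user='test:tester', key='testing')
    conn.put_container('data-replicated',
                       headers={'X-Storage-Policy': 'gold'})
    conn.put_container('data-ec',
                       headers={'X-Storage-Policy': 'ec42'})

    def store(name, body):
        # Small objects go to the replicated container, big ones to EC.
        container = 'data-replicated' if len(body) < SMALL_CUTOFF else 'data-ec'
        conn.put_object(container, name, contents=body)
        return container

    print(store('tiny.txt', b'hello'))
    print(store('big.bin', b'x' * (4 * 1024 * 1024)))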
21:56:27 Bob
21:56:31 Good choice
21:56:32 torgomatic: done (the meeting)
21:56:32 Steve
21:56:34 lol
21:56:37 the other thing was if a customer sets EC policy, we could reject uploads < some size
21:56:47 it's always "steve" for me
21:56:53 lol
21:56:56 redbo: ya. 409 conflict
21:57:03 redbo: or 412. whatever
21:57:15 redbo: and no chunked encoding
21:57:29 or copies. or *LOs
21:57:34 simplifies a lot
21:58:01 :-)
21:58:03 #endmeeting
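To sketch the idea floated at the end (reject tiny or chunked uploads bound for an EC policy with a 412), a bare-bones WSGI filter might look like the following. How a request is identified as targeting an EC container is left as a stubbed callable, the status code and size cutoff are arbitrary, and nothing like this exists in Swift today.

    # Illustrative only: refuse object PUTs below a minimum size (or with
    # chunked transfer-encoding) when the request targets an EC container.
    # The EC-container check is a stub supplied by the caller.
    MIN_EC_OBJECT_SIZE = 1024 * 1024  # arbitrary


    class MinSizeFilter(object):
        def __init__(self, app, is_ec_container):
            self.app = app
            self.is_ec_container = is_ec_container  # callable(env) -> bool

        def __call__(self, env, start_response):
            if env.get('REQUEST_METHOD') == 'PUT' and self.is_ec_container(env):
                length = env.get('CONTENT_LENGTH')
                chunked = 'chunked' in env.get('HTTP_TRANSFER_ENCODING',
                                               '').lower()
                if chunked or (length and int(length) < MIN_EC_OBJECT_SIZE):
                    start_response('412 Precondition Failed',
                                   [('Content-Type', 'text/plain')])
                    return [b'object too small for an EC policy\n']
            return self.app(env, start_response)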