Tuesday, 2015-03-31

acolesho: there must be something more important than 169035 ;)00:01
acolesnotmyname: ^^ ?00:01
*** vinsh has quit IRC00:01
hoacoles: yeah, 169035 is not a big one.00:02
notmynameacoles: ho: I defer to peluse and clayg about https://review.openstack.org/#/c/131872/00:02
*** annegentle has quit IRC00:02
notmynameho: this one is easy, and I want it to be in kilo (for non-technical reasons--for DefCore reasons) https://review.openstack.org/#/c/167828/00:02
notmynameho: and torgomatic just pushed up a new version of multi-range for EC. that would be great to get in (but since [my] this morning I've been assuming it won't)00:04
honotmyname: acoles: thanks! I will review 167828 first then try 13187200:04
notmynameho: thanks!00:04
*** zhill has quit IRC00:06
claygwhat'd i do?00:07
notmynameclayg: I was deferring to you (and peluse) on what to do next with the reconstructor patch00:08
claygnotmyname: oh, nothing really much to review - i'm still working on the reconstructor tests00:09
yuantorgomatic, for 168254, there are some other more complicated cases, like mismatched versions on the first K nodes, if we could read from (k+m) nodes then we have more chance to decode the object00:10
torgomaticyuan: true, although the code we have on feature/ec today just gives up when it finds mismatched fragment archives00:10
yuantorgomatic, yeah I made some updates on that part of code, it tries to get all the responses from k+m nodes and check if there's enough data to decode for one version00:12
torgomaticyuan: ok, I'll take another look00:12
yuanthanks, the general idea was to count the etags found and check if there's sum(one etag) >= ecpolicy.ec_ndata00:14
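The check yuan describes can be sketched roughly as below. This is only an illustration, not the code on feature/ec: the helper name `decodable_etag` and the idea of passing the responses in as a plain list of etag strings are assumptions; `ec_ndata` is the name used in the message above.

```python
from collections import Counter


def decodable_etag(etags, ec_ndata):
    """Return an etag reported by at least ec_ndata nodes, or None.

    etags: one etag string per fragment archive response (hypothetical input).
    ec_ndata: number of data fragments the policy needs in order to decode.
    """
    counts = Counter(etags)
    for etag, count in counts.most_common():
        if count >= ec_ndata:
            return etag
    return None


# Hypothetical 10+4 policy: 11 nodes answer with the new version, 3 with an old one.
responses = ["new-etag"] * 11 + ["old-etag"] * 3
print(decodable_etag(responses, ec_ndata=10))
```

Reading from all k+m nodes matters here because a mismatch among the first k responses no longer forces a failure as long as some single version still has enough fragments.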
openstackgerritAlistair Coles proposed openstack/swift: Erasure Code Reconstructor  https://review.openstack.org/13187200:17
openstackgerritAlistair Coles proposed openstack/swift: Fix ssync sender cleanup of reverted fragment files  https://review.openstack.org/16905200:17
*** rdaly2 has joined #openstack-swift00:17
acolesho: ^^ patch 169052 is ready for review too :)00:19
patchbotacoles: https://review.openstack.org/#/c/169052/00:19
acolespeluse: clayg: ^^ i think thats heading the right way but i'm heading to bed!00:19
*** tsg has quit IRC00:20
*** acoles is now known as acoles_away00:21
*** rdaly2 has quit IRC00:21
*** Tahmina has quit IRC00:27
*** dmorita has joined #openstack-swift00:31
*** lcurtis has quit IRC00:34
*** kota_ has joined #openstack-swift00:36
kota_morning, everyone :)00:37
hokota_: morning!00:40
openstackgerritSamuel Merritt proposed openstack/swift: Add some debug output to the ring builder  https://review.openstack.org/14694500:47
openstackgerritJohn Dickinson proposed openstack/swift: Check if REST API version is valid  https://review.openstack.org/16850900:48
notmynameok, I'll be back later tonight00:52
*** Nadeem_ has quit IRC00:54
*** annegentle has joined #openstack-swift00:58
mattoliveraukota_: morning00:58
mattoliveraunotmyname: o/00:59
kota_ho, mattoliverau: o/00:59
openstackgerritpaul luse proposed openstack/swift: Erasure Code Reconstructor  https://review.openstack.org/13187201:07
*** annegentle has quit IRC01:12
*** annegentle has joined #openstack-swift01:13
*** haigang has joined #openstack-swift01:13
*** mitz has quit IRC01:26
*** vinsh has joined #openstack-swift01:29
*** mitz has joined #openstack-swift01:36
*** annegentle has quit IRC01:42
*** tsg has joined #openstack-swift01:49
*** panbalag has joined #openstack-swift01:59
*** jrichli has quit IRC02:05
*** jrichli has joined #openstack-swift02:05
openstackgerritJanie Richling proposed openstack/swift: WIP - Provides a simple skeleton of middleware for encryption feature.  https://review.openstack.org/15790702:06
*** panbalag has joined #openstack-swift02:12
*** annegentle has joined #openstack-swift02:15
*** vinsh has quit IRC02:16
*** haomaiwang has joined #openstack-swift02:17
*** lcurtis has joined #openstack-swift02:22
*** Gues_____ has quit IRC02:25
openstackgerritHisashi Osanai proposed openstack/swift: Remove sudo from resetswift command  https://review.openstack.org/16914202:25
*** jrichli has quit IRC02:32
*** Gues_____ has joined #openstack-swift02:35
*** lcurtis has quit IRC02:37
*** bkopilov has quit IRC02:38
*** haigang has quit IRC02:57
*** erlon has quit IRC03:01
*** tsg has quit IRC03:19
*** Gues_____ has quit IRC03:20
*** haigang has joined #openstack-swift03:22
*** kei_yama has joined #openstack-swift03:25
*** vinsh has joined #openstack-swift03:26
*** haigang has quit IRC03:27
*** vinsh has quit IRC03:28
*** vinsh has joined #openstack-swift03:29
*** vinsh has quit IRC03:31
*** panbalag has quit IRC03:43
*** haigang has joined #openstack-swift03:55
*** dmorita has quit IRC04:00
*** bkopilov has joined #openstack-swift04:08
*** jasondotstar has quit IRC04:12
*** jasondotstar has joined #openstack-swift04:14
*** bkopilov has quit IRC04:20
*** zaitcev has quit IRC04:21
*** annegentle has quit IRC04:40
*** ppai has joined #openstack-swift04:41
*** goodes has quit IRC04:41
*** goodes has joined #openstack-swift04:43
*** annegentle has joined #openstack-swift04:46
*** annegentle has quit IRC04:49
claygno more sudo!04:54
*** SkyRocknRoll has joined #openstack-swift04:55
hoclayg: sorry. it's my mis-understanding...04:56
claygho: no i'm sure it's fine - iwas just catching up in channel04:56
claygi've been ignoring it writing reconstructor unittests04:56
hoclayg: can i ask the patch num for your unittest?04:57
claygi haven't submitted yet - i'm going to drop them right ontop of patch 13187204:58
patchbotclayg: https://review.openstack.org/#/c/131872/04:58
claygi guess I could kick up what I have04:58
claygmeh, i need to write some more04:58
*** silor has joined #openstack-swift05:00
hoclayg: thanks for the info. this patch is my next target. :)05:05
hos/target/target of review05:06
*** kota_ has quit IRC05:07
*** kota_ has joined #openstack-swift05:11
*** dmorita has joined #openstack-swift05:12
claygho: a'igth - i guess i'll push up what I've got05:13
*** dmorita has quit IRC05:21
claygho: whoa - bit of a mess with the rebasin'05:24
hoclayg: haha. conflict with your previous patch? (i forgot the num)05:26
openstackgerritClay Gerrard proposed openstack/swift: wip: ec reconstructor probe test  https://review.openstack.org/16429105:26
openstackgerritClay Gerrard proposed openstack/swift: Erasure Code Reconstructor  https://review.openstack.org/13187205:26
claygi hope i didn't lose anyone's fixes there05:28
*** dmorita has joined #openstack-swift05:31
openstackgerritClay Gerrard proposed openstack/swift: Fix ssync sender cleanup of reverted fragment files  https://review.openstack.org/16905205:31
mattoliverauclayg: maybe you need to come start a swiftstack office in Oz, cause you seem to be living the timezone :P05:38
claygmattoliverau: maybe if I did it'd help me get back on normal time?05:44
*** reed has quit IRC05:47
mattoliveraulol, maybe05:49
*** nshaikh has joined #openstack-swift05:52
*** annegentle has joined #openstack-swift05:58
*** annegentle has quit IRC06:02
cschwedeGood Morning!06:03
hocschwede: good morning!06:04
mattoliveraucschwede: morning! What are you doing up so early!06:05
cschwedemattoliverau: it’s 8am over here, normal time for me06:06
mattoliverauoh, cool, as we have a little git of cross over (until I need to change my clocks in like a week or something)06:07
mattoliveraus/git/bit/06:07
cschwedewe just changed clocks the weekend, normally i would start one hour later06:07
mattoliveraucschwede: ahh, so that's why.. otherwise I'd just assume you used to ignore me first thing in the morning :P06:14
cschwedemattoliverau: no worries, i’m not ignoring you! sometimes i’m just a silent observer06:16
mattoliveraulol06:16
*** haigang has quit IRC06:27
*** haigang has joined #openstack-swift06:27
*** haigang has quit IRC06:32
notmynamecschwede: I found the error in https://review.openstack.org/#/c/16850906:32
notmynameI unintentionally found it06:32
cschwedenotmyname: oh, great - i’m just debugging it. what’s the reason?06:32
notmynameyou didn't clean up constraints after setting it. I'm about to push a new version06:33
cschwedeah, i think it’s a re-used swift.conf?06:33
cschwede:) just recognized that too. thx for debugging!06:33
openstackgerritJohn Dickinson proposed openstack/swift: Check if REST API version is valid  https://review.openstack.org/16850906:33
cschwedebtw, the new APIVersionError is a good idea06:34
notmynamecschwede: also check my revert to test_chunked_put_bad_version and test_chunked_put_bad_path. they pass now (with the version on master)06:35
notmynamecschwede: I'm guessing you had changed those to make some tests pass?06:35
notmynamecschwede: are you on the openstack operators list?06:37
notmynamecschwede: didn't you say you had a customer in Istanbul? http://lists.openstack.org/pipermail/openstack-operators/2015-March/006675.html06:37
cschwedenotmyname: yes, i modified test_chunked_put_bad_version to make the tests pass. but looking at it now my change is not necessary06:38
notmynamecschwede: ah. feel free to push over it :-)06:38
claygnotmyname: you're up late!06:38
claygnotmyname: did you ever hear back from infra on the review-ec branch?06:38
cschwedei read the operators list, yes; but that is not my customer06:38
notmynameclayg: not yet. I expect ttx to be up in about an hour (he's one hour later than cschwede if my geography is correct)06:40
*** silor has quit IRC06:40
cschwedenotmyname: France? Same time, 8:40 am now06:40
notmynameah ok06:41
claygi'm seeing this random eventlet.switch bug on some of the feature/ec changes -> http://logs.openstack.org/72/131872/48/check/gate-swift-python27/e998cc0/nose_results.html06:41
cschwedethey also switched to DST on the weekend06:41
cschwedenotmyname: clayg: do you have a new stronger coffee brand in the office? looks like you guys are no longer sleeping06:41
notmynameif I haven't heard from ttx by the time I get up tomorrow, I'll find other people to DOITNOW!06:42
claygit's like power naps06:42
*** jamielennox is now known as jamielennox|away06:49
claygit's like eventlet.sleep actually06:49
openstackgerritClay Gerrard proposed openstack/swift: wip: ec reconstructor probe test  https://review.openstack.org/16429106:56
openstackgerritClay Gerrard proposed openstack/swift: Erasure Code Reconstructor  https://review.openstack.org/13187206:56
claygtest ERROR: ERROR __call__ error with PUT /sda1/p/a/c/o : Timeout (1s) <- that's the same thing - in switch06:57
claygsomethings borked :\06:57
claygportante: was always really good at tracking this crap down06:57
claygi wonder if it's just on feature/ec06:57
*** annegentle has joined #openstack-swift06:58
*** annegentle has quit IRC07:04
notmynameI just saw an email from ttx to a mailing list, so I know he's up07:04
notmynameclayg: basically, you'll know when the new feature/ec_review branch is there when you `git fetch gerrit` and see it download07:05
notmynameso I'm hoping he'll get to it soon07:05
notmynameok, I'm going to bed. talk to you in a few hours07:06
*** haigang has joined #openstack-swift07:12
*** silor has joined #openstack-swift07:17
*** someonespace has joined #openstack-swift07:22
*** bkopilov has joined #openstack-swift07:28
*** geaaru has joined #openstack-swift07:42
*** jistr has joined #openstack-swift07:43
*** annegentle has joined #openstack-swift07:59
*** annegentle has quit IRC08:05
cschwedehmm, looks like a dependency is missing on the gate, saw that error on at least two different jobs08:28
cschwedehttps://bugs.launchpad.net/openstack-ci/+bug/133455008:28
openstackLaunchpad bug 1334550 in OpenStack-Gate "Could not find any downloads that satisfy the requirement X" [Low,Fix released]08:28
*** haigang has quit IRC08:38
*** jordanP has joined #openstack-swift08:46
*** acoles_away is now known as acoles08:47
acolesmorning/evening08:49
-openstackstatus- NOTICE: CI Check/Gate pipelines currently stuck due to a bad dependency creeping in the system. No need to recheck your patches at the moment.08:55
*** ChanServ changes topic to "CI Check/Gate pipelines currently stuck due to a bad dependency creeping in the system. No need to recheck your patches at the moment."08:55
*** annegentle has joined #openstack-swift09:00
*** annegentle has quit IRC09:06
hoacoles: morning!09:06
openstackgerritChristian Schwede proposed openstack/swift: Check if device name is valid when adding to the ring  https://review.openstack.org/16923109:20
*** ho has quit IRC09:34
*** jamielennox|away is now known as jamielennox09:41
*** silor has quit IRC09:43
*** jamielennox is now known as jamielennox|away09:47
*** annegentle has joined #openstack-swift10:01
*** annegentle has quit IRC10:06
openstackgerritAlistair Coles proposed openstack/swift: Fix ssync sender cleanup of reverted fragment files  https://review.openstack.org/16905210:09
*** dmorita has quit IRC10:34
*** haomaiwang has quit IRC10:36
*** haomaiwang has joined #openstack-swift10:54
*** silor has joined #openstack-swift11:02
*** annegentle has joined #openstack-swift11:02
*** annegentle has quit IRC11:07
*** nshaikh has quit IRC11:20
*** nshaikh has joined #openstack-swift11:31
*** jistr is now known as jistr|english11:32
*** jistr|english is now known as jistr|class11:33
*** kei_yama has quit IRC11:48
*** ChanServ changes topic to "Review Dashboard: http://goo.gl/uRzLBX | Overview Dashboard: http://goo.gl/2By1qv | EC status: https://gist.github.com/notmyname/fd006c061ccb28e8ecfc | Logs: http://eavesdrop.openstack.org/irclogs/%23openstack-swift/"11:50
-openstackstatus- NOTICE: Check/Gate unstuck, feel free to recheck your abusively-failed changes.11:50
straycatOkay, I created two identical ring builder files11:53
straycata.builder and b.builder11:53
*** kota_ has quit IRC11:53
straycatdiff a.builder and b.builder shows they're the same11:53
*** ujjain has quit IRC11:54
straycatbut after a rebalance with the same seed diff says they differ?11:54
straycatthat wasn't exactly what i was expecting11:54
cschwedestraycat: iirc, there is a timestamp in the ring file, thus the difference12:01
straycatI was hoping it'd be something like that12:02
*** km has quit IRC12:02
*** annegentle has joined #openstack-swift12:03
*** annegentle has quit IRC12:08
*** erlon has joined #openstack-swift12:09
*** panbalag has joined #openstack-swift12:10
*** ppai has quit IRC12:34
*** mahatic has joined #openstack-swift12:38
*** bkopilov has quit IRC12:41
cschwedestraycat: use the following to print your rings, and compare your output using diff: https://gist.github.com/cschwede/4c60f89a86bd238d309a12:42
cschwedeif i use a seed the ringdata is identical except for the _last_part_moves_epoch12:42
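A comparison that ignores the epoch field can be sketched like this. The dictionaries are made-up stand-ins for dumped builder state, not the real builder file layout, and `diff_builder_state` is a hypothetical helper; only `_last_part_moves_epoch` is taken from the message above.

```python
def diff_builder_state(a, b, ignore=("_last_part_moves_epoch",)):
    """Return the sorted keys whose values differ, skipping ignored keys."""
    keys = (set(a) | set(b)) - set(ignore)
    return sorted(k for k in keys if a.get(k) != b.get(k))


# Hypothetical dumps of two builders rebalanced with the same seed.
builder_a = {"parts": 1024, "replicas": 3, "_last_part_moves_epoch": 1427802840}
builder_b = {"parts": 1024, "replicas": 3, "_last_part_moves_epoch": 1427802855}

print(diff_builder_state(builder_a, builder_b))  # nothing but the epoch differs
```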
*** tongli has joined #openstack-swift12:49
*** jistr|class is now known as jistr12:58
*** annegentle has joined #openstack-swift13:02
*** petertr7 has joined #openstack-swift13:10
peluseclayg, that 1s timeout thing on feature/ec was introduced between set 44 and 45....13:15
*** logan2 has quit IRC13:16
*** ujjain has joined #openstack-swift13:17
peluseclayg, and someone mentioned the test ZBF thing, just a simple fix.  Looking into the job cardinality comment you mentioned, intent was as you mentioned and I think I see where things strayed13:17
peluseman this is the crappiest code I've ever written in my life... I've got to find someone to blame for this!13:19
*** logan2 has joined #openstack-swift13:19
*** nshaikh has quit IRC13:21
*** trex has joined #openstack-swift13:40
*** tsg_ has joined #openstack-swift13:46
*** logan2 has quit IRC13:53
*** rdaly2 has joined #openstack-swift13:56
*** logan2 has joined #openstack-swift13:56
straycatcschwede, Okay, given that's the offset all the last_part_moves are based on that feels slightly unsafe, though I guess on a timescale of hours it's not going to be a problem13:57
*** rdaly2 has quit IRC14:01
*** jrichli has joined #openstack-swift14:17
*** bkopilov has joined #openstack-swift14:26
*** tsg_ has quit IRC14:27
*** blankspace has joined #openstack-swift14:37
*** emptyspace has joined #openstack-swift14:39
*** someonespace has quit IRC14:40
*** blankspace has quit IRC14:43
*** blankspace has joined #openstack-swift14:52
*** emptyspace has quit IRC14:56
*** blankspace has quit IRC15:01
*** SkyRocknRoll has quit IRC15:02
*** annegentle has quit IRC15:08
*** annegentle has joined #openstack-swift15:13
straycatcschwede, Ahh sorry I might be being a stupid here, that value is only going to be used by the rebalance process, and that's only ever performed by running swift-ring-builder foo.builder rebalance ?15:13
*** reed has joined #openstack-swift15:13
straycatIf that's the case then I don't need to worry about the differing value for _last_part_moves_epoch15:14
cschwedestraycat: yes, exactly. even if you do a rebalance the assignment of the partitions should be identical, given that you apply the same changes everywhere, use the same seed and a min_part_hour of less than the time difference to your last rebalance15:20
*** zigo__ is now known as zigo15:23
*** mahatic has quit IRC15:24
straycatcschwede, "and a min_part_hour of less than the time difference to your last rebalance" thereby forcing all partitions to be rebalanced regardless of _last_move_parts ?15:27
*** zaitcev has joined #openstack-swift15:28
*** ChanServ sets mode: +v zaitcev15:28
cschwedestraycat: min_part_hours is the amount of hours that needs to pass before a partition is allowed to move again. thus if you have different values, some partitions on builderfile A might move, but not on B (if you rebalance). But if it is 0 for example, the same rules apply everywhere, no matter the time that has passed15:30
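The rule cschwede describes can be modeled in a few lines. This is a simplified sketch under stated assumptions, not the ring builder's actual logic: `may_move` is a hypothetical helper and the hour values are invented for illustration.

```python
def may_move(hours_since_last_move, min_part_hours):
    """A partition may move again once min_part_hours have elapsed."""
    return hours_since_last_move >= min_part_hours


# Same partition, but the two builder files recorded its last move at different times.
elapsed_on_a = 30  # hours since last move according to builder A (made up)
elapsed_on_b = 10  # hours since last move according to builder B (made up)

for min_part_hours in (24, 0):
    print(min_part_hours,
          may_move(elapsed_on_a, min_part_hours),
          may_move(elapsed_on_b, min_part_hours))
```

With a non-zero value the two builders can disagree about whether the partition is movable; with 0 the elapsed time drops out and both apply the same rule.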
claygintroduced between set 44 and set 45 ?15:32
straycatcschwede, makes sense thanks15:33
cschwedestraycat: you’re welcome :)15:33
claygcschwede: i've been wondering if balancing ec rings with like 10-16 "replicas" is going to have strange behaviors because of min_part_hours15:34
cschwedeclayg: depends on the total amount of disks?15:35
claygi just mean if only one replica of a part is going to move every min_part_hours and you have 10-16 of them - that's like two weeks to move everything if you're... idk migrating zones or something15:36
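The "two weeks" estimate follows from simple arithmetic, sketched below. The value min_part_hours = 24 is an assumption (it is not stated in the conversation), as is the rule that exactly one replica of a partition moves per window; `days_to_move_all` is a hypothetical helper.

```python
def days_to_move_all(replicas, min_part_hours):
    """Days needed if one replica of a partition moves per min_part_hours window."""
    return replicas * min_part_hours / 24.0


# Assumed min_part_hours of 24 for the range of replica counts mentioned above.
for replicas in (10, 12, 14, 16):
    print(replicas, days_to_move_all(replicas, min_part_hours=24))
```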
claygacoles: cschwede: peluse: so you guys say we know something about the random Timeout (1s) failures?15:38
claygis it happening on master reviews too?15:40
cschwedeclayg: hmm, so you mean we need to think about the rebalancing for EC? Allowing to move more than one replica of a partition at a time?15:43
cschwedeclayg: sorry, no, i don’t know the reason for the timeout :(15:43
claygcschwede: well we shouldn't do anything yet - i just haven't really spent anytime with real world rebalancing of 12 replica rings :P15:46
claygcschwede: I see 'em on my laptop sometimes honestly - I think something is going on15:46
cschwedeclayg: agreed - thinking alone about juggling 12 instead of 1 replicas at a time makes me dizzy15:47
acolesclayg: i've seen some of those timeout test failures but not investigated15:53
*** bobby2_ is now known as bobby215:54
claygacoles: :'(15:59
claygi'm worried its going to take someone smart looking at it15:59
acolesrules me out then15:59
claygsee how quick you were with that16:00
claygtdasilva: how'd you do that etherpad thing - can I get everyone to start helping capture the tests and line numbers that are breaking?16:01
tdasilvaclayg: one sec, let me create one16:02
acolesclayg: after my experiment last night i have concluded that my body can't take coding til early hours of morning ;)16:02
claygacoles: you never know until you try16:02
*** tsg_ has joined #openstack-swift16:03
tdasilvaclayg: https://etherpad.openstack.org/p/swift_timeout_test_failure16:03
claygacoles: I've found injecting coffee into the bloodstream helps recovery in the morning16:03
acolesacoles: i'm just gettin too darn old16:05
glangeacoles: your are talking to yourself16:05
glangeyou16:05
acoless/acoles/clayg/ see what i mean!16:05
glange:)16:05
*** silor has quit IRC16:05
acolesglange: its the only way to get agreement ;)16:06
notmynamegood morning16:07
*** jistr has quit IRC16:09
clayglol!16:12
peluseclayg, i thought the 1s timeout was only on the ecrecon branch, no?16:12
claygoh is that what's going on?  peluse, i have no idea16:14
claygi'm just going to keep running tests in a loop until i get a clue16:14
peluseOK, I'll look at curent feature/ec16:14
peluseon ECrecon it started at patch set 45 though16:15
claygoh that was easy16:15
notmynameclayg: feature/ec_review is available16:15
notmynamejust heard from ttx that summit room assignments should be finished by the end of next week. today's cross-project meeting has that as an agenda item16:15
claygnotmyname: ok - do you want to start reviewing with or without the bugs?16:16
notmynameI just got in and haven't caught up from what happened during the night16:16
notmynamewhere are we right now?16:17
claygnotmyname: peluse and I are going to make the reconstructor awesome by the power of unittests!16:17
notmynameyay16:18
claygnotmyname: and no one is getting any work done because of this stupid recurring failure with Timeout (1s) on feature/ec (or anything based on the reconstructor)16:18
notmynameis that what's blocking https://review.openstack.org/#/c/169035/16:18
peluseclayg, just so we know we're talking about the same thing, exactly which test(s) are you seeing hit by this?16:19
claygpeluse: i'm trying to take notes as I go https://etherpad.openstack.org/p/swift_timeout_test_failure16:20
pelusenotmyname, that one looks like it needs a recheck.  I'm not aware of it being associated with the timeout thing but could be wrong16:20
peluseclay, OK16:20
notmynamepeluse: ok, just triggered the recheck16:20
peluseclayg, seems like the 1s TO thing is only on the ECrecon branch and only when you include the ECrecon tests (and again started with patch set 45)  if you want to continue adding more unit tests I'll go hunt down exactly what is causing it16:26
claygi think that was just the gate thing the status messages were talking about "Could not install requirement XStatic-Angular-Irdragndrop"16:27
claygpeluse: *DEAL*16:27
claygacoles: what are you up to?  just slackin?16:28
acoles!16:28
acolesfeet up16:28
acoleskickin back16:28
claygi swear like three times in the last two weeks i was just like "I ... can't ... write ... another ... line ... of ... code ..."16:28
acolesclayg: i am nearly done writing the mother of all ssync tests16:29
claygnice!  is it an integration test - do you spin up servers!?16:29
acolesclayg: its gonna test end to end, on disk files to on disk16:30
acolesclayg: i was going to add it to this patch https://review.openstack.org/#/c/169052/16:31
notmynameneat16:31
claygacoles: that's flipping *great*16:32
claygi can't wait to see it16:32
claygpush it now16:32
acolesclayg: peluse : i think we need the fixes in 16905216:32
claygi don't care if it fails16:32
claygthe 1s TO fixes?16:32
acolesclayg: now i have oversold it i'm worried16:32
claygpeluse: says anything that depends on the reconstructor breaks16:33
claygacoles: SHUT UP AND TAKE MY MONEY16:33
acolesclayg no 169052 is ssync/clean up reverted FI's fixes16:33
acolesclayg: gotta check my code carefully for farts ;)16:34
clayglol!16:34
* acoles will never enjoy beethoven again16:35
claygwell - it's just when you hear it you'll get gassy16:35
acolesyou're making it worse!16:35
peluseclayg, narrowed down:  its in the hokey setup of the old set of unit tests.  maybe we should clean those out sooner than later but first let me see if I can get a bit closer to root cause to make sure it really is something we dont give a shit about16:38
claygpeluse: weird - thanks16:38
*** annegentle has quit IRC16:40
claygman so like if you're doing a 10+2 scheme and you need to rebuild a fragment, you can only have *one* other node down or else you can't rebuild16:45
claygI guess that's what the 2 is for :\16:45
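The arithmetic behind that observation, as a minimal sketch: rebuilding one missing fragment needs ndata of the remaining fragments, so a 10+2 scheme tolerates one more unavailable node while a rebuild is pending. `can_rebuild` is a hypothetical helper, not a function from the reconstructor.

```python
def can_rebuild(ndata, nparity, other_nodes_down):
    """True if enough fragments remain to rebuild one missing fragment."""
    # One fragment is already missing (the one being rebuilt).
    available = ndata + nparity - 1 - other_nodes_down
    return available >= ndata


for down in (0, 1, 2):
    print(down, can_rebuild(ndata=10, nparity=2, other_nodes_down=down))
```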
*** jordanP has quit IRC16:45
*** vinsh has joined #openstack-swift16:48
zaitcevGuys, remember  that we have eventlet>=0.16.1,!=0.17.0? Apparently 0.17.1 satisfies it.16:58
zaitcevYay16:58
notmynamezaitcev: that makes sense, right?16:58
notmyname"anything bigger than 16.1 except for 17.0"16:59
zaitcevRight, as long as EC works with it16:59
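The requirement `eventlet>=0.16.1,!=0.17.0` quoted above can be checked by hand with a small sketch. It compares purely numeric dotted versions as tuples, which is a simplification of pip's real version-matching rules; `parse` and `satisfies` are hypothetical helpers.

```python
def parse(version):
    """Turn a purely numeric dotted version such as '0.17.1' into a tuple."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version):
    """Check a version against >=0.16.1,!=0.17.0 (simplified comparison)."""
    v = parse(version)
    return v >= parse("0.16.1") and v != parse("0.17.0")


for candidate in ("0.16.0", "0.16.1", "0.17.0", "0.17.1"):
    print(candidate, satisfies(candidate))
```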
notmynameI'm frankly amazed that we haven't (AFAIK) seen two openstack projects with incompatible dependency versions (eg proj A requires dep v1, but proj B requires dep v2. of course proj A breaks with dep v2)17:00
*** welldannit has quit IRC17:00
notmynamezaitcev: the gate is running whatever the highest version available that meets the version requirements, so all the tests have used 0.17.1 since it's been available. so I'm not too worried about EC breaking with 0.17.117:01
zaitcevThank that implicit keyword arguments in Python. They allow for easy compatibility in most cases without burdening developers with things like namespaces.17:02
notmyname(which is also one reason I try to keep my saio running with the earliest supported versions of dependencies)17:03
claygnotmyname: you're doing gods work man17:19
claygi implemented the ts method on like *one* test case now where there's not a timestamp just waiting for me I'm all put out17:21
peluseclayg, just updated the etherpad wrt the 1s TO thing.  Have to run to a mandatory meeting for 90 min or so17:23
claygacoles: https://etherpad.openstack.org/p/swift_timeout_test_failure <- peluse says your durable files are too durable17:26
acolesclayg: looking...17:35
*** silor has joined #openstack-swift17:35
*** zhill has joined #openstack-swift17:40
*** annegentle has joined #openstack-swift17:41
*** geaaru has quit IRC17:44
openstackgerritAlistair Coles proposed openstack/swift: Fix ssync sender cleanup of reverted fragment files  https://review.openstack.org/16905217:44
acolesclayg: there's the ssync tests ^^, still got a failure so wip17:44
*** annegentle has quit IRC17:46
*** bkopilov has quit IRC17:56
*** bkopilov has joined #openstack-swift17:56
claygacoles: that's cool tho18:01
openstackgerritChristian Schwede proposed openstack/swift: Check if device name is valid when adding to the ring  https://review.openstack.org/16923118:04
*** jamielennox|away is now known as jamielennox18:04
*** dencaval has joined #openstack-swift18:15
openstackgerritJanie Richling proposed openstack/swift: WIP - Provides a simple skeleton of middleware for encryption feature.  https://review.openstack.org/15790718:30
*** zhill_ has joined #openstack-swift18:32
werI have one (of 8) servers that isn't really clearing its async work.  "async_pending": 98258.  They are not clearing on this one host.18:37
*** silor has quit IRC18:39
openstackgerritOpenStack Proposal Bot proposed openstack/swift: Updated from global requirements  https://review.openstack.org/8873618:39
claygwer: you got a container server offline atm?18:51
*** annegentle has joined #openstack-swift18:52
claygweird that it's just one server - network routes?  anything change on the machine recently?  old ring?18:52
*** fbo has quit IRC18:52
werno clayg.  Everyone is online.  All these machines are identical and on the same 10 gig switch with no known connectivity issues.18:52
werrings have not changed in months.18:53
werI run a pretty steady state.... but the other day I noticed one object server on another machine that was cpu'ing more than the others.....  And then this one server that isn't moving it's async stuff.18:54
werI reloaded the object server... but was tempted to actually restart it and see if things move.  I'm also suspicious that I might have a container hotspotting or something but I have not identified anything yet.18:56
weror maybe a fragmented sqlite.  It's strange behavior.  It's kinda stuck :)18:57
werI actually had everything running really well for 6 months.  But I  needed to shard a container.  Cause I had like 3million objects and did a lot of deletes each night.  So I sharded that container.  I do feel like I see more timeouts than on the old cluster with a single larger container.... but the timeouts are still very low.19:00
clayghmm... that is interesting19:00
claygok, well when you don't know what the problem is TO THE LOGS!19:01
claygobject-updater is responsible for that - you could stop him, and then run swift-init object-updater once -nv from the command line to get the logs on the console19:01
werlol  the only thing I've identified are some timeouts.  And I hate them.  Maybe I don't have enough object servers listening or something.19:02
werobject-updater does the async stuff?19:02
*** fbo has joined #openstack-swift19:04
werhe's not running ?! :P19:05
werwtf19:05
openstackgerritClay Gerrard proposed openstack/swift: wip: ec reconstructor probe test  https://review.openstack.org/16429119:10
openstackgerritClay Gerrard proposed openstack/swift: Erasure Code Reconstructor  https://review.openstack.org/13187219:10
*** silor has joined #openstack-swift19:11
*** zhill_ has quit IRC19:14
werIt's working on it clayg.  Looks like the object updater is getting some timeouts on one of my mounts.  hrm.  it swept 4 disks fine and is timing out now on three more.  I've got something to dig at I guess.19:15
claygwer: nice - keep us posted - good luck!19:16
werk thanks!  I see a few failures.... but I can only assume this is because it's probably been dead for so long19:16
werd133 completed: 240.01s, 4096 successes, 3 failures19:16
werI've got plenty of ram. workers = 24 max_clients=512 on the object server.  The really messed up thing is that I get about 10million requests a day for objects that are going to be 404.  And I think that uses up my connections.  But I didn't have this issue previously.  I might bump max clients if I can identify it as being hit.  I really don't know.19:20
*** reed has quit IRC19:36
*** reed has joined #openstack-swift19:36
werok, damn.  I was missing two container-replicator and an account-replicator process... I'll put some checks in place to trend if these die, or if I had some one time occurrence that I missed.  serves me right for turning my back on things I guess.19:37
acolesclayg: so about these timeouts, am i missing something re the commit(), i see random tests fail due to time rounding without running the recon tests19:38
acoless/recon/??/19:38
claygrecoder19:38
acolesclayg: i wrote stuff on the etherpad19:39
claygbecause it RE-EC-EnCodes them!19:39
acolesoh no i lit that fuse again :P19:39
claygacoles: so but i mostly only see this specific Timeout 1s errors when I run the reconstructor tests19:39
claygmaybe there's a test here or there that has some timing issues rounding stuff - but the Timeout (1s) failures are pretty consistent for me19:40
acolesclayg: maybe the ts iter pattern is flaky? are you using that a lot in the recodifier tests19:40
claygacoles: *maybe* but I started that back in the reconciler tests19:41
clayganyway - i'm not seeing like assertion failures or some timestamp doesn't equal another - i'm seeing the hub.swift raising timeouts - peluse says it's because stuff is slow19:41
acolesclayg: ok gotcha19:42
acolesclayg: i'll go dog some more19:42
acolesargh dig!!19:42
claygheh19:42
claygi'm going to eat something and head into the office for awhile - catch up with notmyname - unless - is he out today?19:42
acolestaxes? ;)19:43
*** annegentle has quit IRC19:51
peluseacoles, just got back from meeting-ville and read your thing on the etherpad20:02
acolespeluse: maybe i was barking up wrong tree?20:03
peluseacoles, not sure I "get it" just yet though :)20:03
acolespeluse: well i wrote it quick so bit of a brian dump20:03
acoleswhoever brian is20:04
peluseacoles, if I don't use the commit() method the things work fine, it seems to be when its used as part of the class setup() somehow??  How does that relate to what you saw wrt timestamps (or maybe it doesnt)20:04
acolespeluse: not sure it does, can you point me to example of where its used as part of a class setup()?20:04
pelusebut honestly I think we're scrapping that whole set of tests including the setup and nowhere else (that I can find) do we create a bunch of files in setup()20:05
pelusesure, look in... (copying)20:05
pelusesetup() of TestGlobalSetupObjectReconstructor()20:05
acolespeluse: k, will do20:05
pelusecrap, another mtg, will be semi-online here for a few20:06
acolespeluse: unrelated, i have a fix for test_removes_zbf in the test_reconstructor, what shall i do with it? paste it for you or clayg, or push over the recon patch (it's like 5 lines)20:07
peluseI have one ready too :)20:07
peluseone liner20:07
pelusebut paste and lets compare20:07
peluse        list(self.reconstructor.collect_parts())20:08
peluse        self.assertFalse(os.path.exists(pol_1_part_1_path))20:08
acolespeluse: http://paste.openstack.org/show/197736/20:08
acolespeluse: yeah that works too :D20:09
peluseOK, have to talk on this other call.. back in a few min20:09
*** dencaval has quit IRC20:09
acolesk i'm heading home will touch base later20:09
peluseOK20:09
*** annegentle has joined #openstack-swift20:10
*** acoles is now known as acoles_away20:11
*** silor has quit IRC20:12
pelusenotmyname, anyone - is there any plan (timeframe wise) for py 3.0?20:19
pelusefor swift of course... we have a team doing python optimizations and it doesn't look like there will be a 2.8 to contribute them to...20:20
*** jogriffin has joined #openstack-swift20:24
*** annegentle has quit IRC20:30
openstackgerritMerged openstack/swift: Even more cleanup to EC on-disk file cleanup  https://review.openstack.org/16903520:30
*** annegentle has joined #openstack-swift20:34
*** david-lyle has quit IRC20:40
notmynamepeluse: not as far as I know. nobody is really working on it20:40
peluseyeah, I thought it had some steam last year and seemed to die off20:43
notmynamepeluse: only client side20:43
peluseahh20:44
notmynamepeluse: or something called ec got everyone busy20:44
peluseyay :)20:44
claygnotmyname: i think everyone is still waiting on eventlet to support py320:50
claygnotmyname: but I think temoto is making progress20:50
notmynameya.  I've heard good things20:51
claygi lost the ether pad - does someone have the link?20:51
claygpeluse: acoles_away: unless you have the fix already somehow?20:51
peluseon the phone, almost done20:52
claygpeluse: i don't know if this works in your office - but around here there's like two magical letters that get you out of just about anything20:53
peluseFO?20:54
claygheh20:54
claygok, i see why that collect jobs test was broken - but what does that have to do with the Timeout (1s) thing?20:54
pelusenada20:56
claygsrly, i lost the etherpad link20:56
claygnm, lastlog had it20:56
claygpeluse: the other failures that acoles pointed out make sense enough I guess - but again nothing to do with Timeout (1s)20:59
pelusecoming20:59
peluseagree20:59
pelusehttps://etherpad.openstack.org/p/swift_timeout_test_failure20:59
clayghave you tried just commenting out the GlobalTestReconstructor20:59
pelusethere it is :)20:59
*** thumpba has joined #openstack-swift20:59
peluseclayg, yes, that works like a champ20:59
claygpeluse: thanks20:59
claygpeluse: oh20:59
claygok, well at least we have one option20:59
pelusethat's why I think maybe this isn't worth chasing down but wanted to dig just a little deeper to make sure it wasn't something real that's just exposed by my goofy test setup21:00
claygpeluse: ... right21:00
clayg:\21:00
claygpeluse: well at least we know we have a way to cut and run21:00
claygpeluse: can you keep on it and I'll get acoles fixes and look into those other tests?21:00
pelusebut I've been on the damned phone since acoles went to lunch - off now so will poke a bit more and I think maybe just remove all the global tests and port the ones that make sense into the class you added (the piss ant ones for coverage of things like check_rings)21:01
peluseyup21:01
*** annegentle has quit IRC21:01
*** annegentle has joined #openstack-swift21:01
claygpeluse: yeah i'm on board with that plan too21:05
claygpeluse: but I looked and could not see what on earth that setup could be doing that leaked into any other tests!?21:06
claygacoles_away: I can't get test_object_delete_at_async_update to fail in a tight loop - what you said makes sense - but I can't get it to do it21:06
*** annegentle has quit IRC21:07
claygacoles_away: from your description the problem would be easily fixed by ts = (utils.Timestamp(t) for t in itertools.count(int(time() + 1)))21:07
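A minimal sketch of the iterator pattern clayg proposes above, assuming plain floats stand in for swift's utils.Timestamp so the snippet runs without Swift installed. The make_ts_iter helper and its start parameter are hypothetical names added for illustration; the failure acoles described lives on the etherpad and is not reproduced here.

```python
import itertools
import time


def make_ts_iter(start=None):
    """Return a generator of strictly increasing whole-second timestamps.

    Hypothetical helper: floats stand in for swift.common.utils.Timestamp.
    """
    if start is None:
        # Same starting point as the one-liner in the chat: the next whole
        # second after "now", so every value is later than the current time.
        start = int(time.time() + 1)
    return (float(t) for t in itertools.count(start))


ts = make_ts_iter()
older = next(ts)
newer = next(ts)
# Consecutive values are one whole second apart, so a test that needs
# "an older and a newer timestamp" never receives two equal values.
print(newer - older)
```

Because each value is a whole second, rounding or truncating a timestamp to lower precision cannot make two consecutive values compare equal, which seems to be the property the suggestion relies on.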
*** Nadeem has joined #openstack-swift21:11
NadeemHello folks, I was wondering how could I propose a skip on Tempest tests for https://review.openstack.org/#/c/150149/21:13
NadeemAs per https://github.com/openstack/tempest/blob/master/HACKING.rst#2-bug-fix-on-core-project-needing-tempest-changes I need to propose a skip on Tempest tests21:14
claygNadeem: how about we *not* change the API?21:16
Nadeem@clayg Well currently we are not following the RFC. As per RFC 2616 section 10.3.5 & section 4.3, 304 Not Modified should not include entity headers like Content-Length & Content-Type.21:19
NadeemThis change allow us to be compliant with the RFC.21:20
claygwe're compliant21:20
claygit said should not21:20
claygacoles_away: yay it failed!21:23
*** annegentle has joined #openstack-swift21:23
werso my async problem is gone.  I'm getting occasional timeouts and it looks like the object-server is giving an 499 on occasion.  The client talking with the proxy-server is returned a 408 under these conditions.  Any pointers where to poke at these 499's?21:27
peluseclayg, interesting finding on the 1s timeout thing - if you change the order of tests fed into nosetests it goes away (see etherpad).  have a 30 min phone call now....21:30
claygwer: all of those say that the client talking to the proxy stopped sending data - probably on a PUT - maybe they have a timeout and got bored?21:32
weryeah I wanted to blame the client.... but I wasn't sure21:32
weris that what that says to you? :)21:32
claygpeluse: yeah I think i observed that too21:32
peluseand if you add --processes=4 (or something) you can run them with the reconstructor first and it works21:33
claygwer: well it may not be their *fault* if the service is being slow21:33
clayg--processes=4 !?  where do you come up with this stuff21:33
peluseshotgun troubleshooting baby :)21:33
claygwer: but I'm not sure how much more you can get from the 408's - the code paths that hit that mean that the proxy thinks that there was a timeout reading from the client21:34
peluseI'm thinking we can kill the global tests at this point, this seems like a nosetest thing that doesn't like how much time we're spending in class setup()21:34
*** bkopilov has quit IRC21:34
claygwer: client_timeout setting in the proxy server config21:34
claygwer: you might go digging into the specific transaction id's - and see if you can engage the client making the request - it's possible they may be seeing something different on their end21:35
claygwer: another thing that can cause a timeout to pop is something starving the reactor - like a piece of middleware doing a blocking operation21:35
clayger... "hub" in eventlet parlance21:36
claygpeluse: what?  too much time in setUp?  blaming nosetests?  not sure I buy that21:36
claygacoles_away: well the test delete-at bugs only seem to pop on the feature/ec branch21:37
claygpeluse: maybe those tests are leaking their background processes somehow and causing a bunch of background noise in the eventlet hub?21:37
peluseand only when you run the tests with reconstructor first (from my testing)21:37
claygwer: ^ speaking of starving the hub!  :D21:37
werclayg: I totally have some middleware doing the sharding....21:38
peluseclayg, OK, I'll look some more21:38
claygwer: is it doing *blocking* requests - or all green (like calling into the app)21:38
*** bkopilov has joined #openstack-swift21:39
*** acoles_away is now known as acoles21:39
werug.  clayg I could barely speak wsgi... I wrote it.  I don't think it should be blocking and just sits in the pipeline :/21:40
*** jrichli has quit IRC21:40
mattoliverauMorning, phew reading scroll back took a while.. So fun test errors huh21:40
*** thumpba_ has joined #openstack-swift21:40
claygwer: well maybe it's fine!21:40
*** thumpba has quit IRC21:40
peluseclayg, so with this option it still fails...21:41
claygwer: is it publicly available - you could probably trick someone into looking it over - cschwede and mattoliverau are into that crazy stuff21:41
peluse--process-restartworker21:41
peluseIf set, will restart each worker process once their tests are done, this helps control memory leaks from killing the system. [NOSE_PROCESS_RESTARTWORKER]21:41
claygpeluse: well if there's only one worker?21:41
*** thumpba has joined #openstack-swift21:41
claygpeluse: you might be able to limit it to a specific global reconstructor test21:42
peluseya21:42
*** thumpba_ has quit IRC21:45
claygpeluse: maybe the rings are leaking - and the object server tests are actually trying to connect to localhost or something?21:47
acolesclayg: peluse: i'm not here for long but wondering what do you want me to focus on tomorrow? dig into the timeout issue more, keep going on ssync tests?21:50
pelusemy vote would be ssync tests21:51
*** erlon has quit IRC21:51
claygpeluse: I have something that fixed the issue for me locally21:51
pelusereally? do share21:52
claygpeluse: https://gist.github.com/clayg/24e882c4ee9f786e531221:52
acolespeluse: ok21:53
claygi'm going to let that run for awhile and then push up the fix21:53
claygacoles: yeah it'd be great if you could get those ssync tests working!21:53
peluseOK, so that bit of hackery you replaced was solving a problem that I was getting when running tox or nosetests at the obj directory level - I mentioned it to you at the hackathon I think21:54
acolesi was just looking at that global testdir setup21:54
acolesclayg: i think the test failures were just the ones in the underlying recon patch21:54
acolesclayg: but there's more scenario coverage i can add21:55
claygtox or nosetests at the obj directory level?21:56
claygpeluse: i must have missed that21:56
claygacoles: oh ok - well that sounds promising21:57
acolesclayg: peluse : do you want me to keep patch 169052 dependent or squash into the reconstructor patch??21:57
patchbotacoles: https://review.openstack.org/#/c/169052/21:57
peluseprobably standalone21:58
claygacoles: oh i see what you mean - ummm...21:58
pelusewhat am I missing?21:59
pelusebrb21:59
claygwell he's cleaning up code that we're adding in patch 131872 - and adding tests22:00
patchbotclayg: https://review.openstack.org/#/c/131872/22:00
claygin the long run it probably doesn't matter22:00
claygall of that ssync reconstructor stuff will probably be in a single change when reviewing on master22:00
acolesclayg: so according to jenkins the ssync tests in 169052 are good its just the reconstructor test failures showing up22:01
claygtorgomatic: I think I ran off poor Nadeem - why are you encouraging patch 15014922:01
patchbotclayg: https://review.openstack.org/#/c/150149/22:01
torgomaticclayg: just wondering what was going on with it; either it should be abandoned or it should make progress22:01
acolesclayg: so its more manageable separate just as long as folks reviewing 131872 realise there's some fixes up the chain22:02
torgomaticof course, the fact that his response was basically "the directions say do X; what does X mean?"22:02
torgomatic...doesn't fill me with confidence that it will make progress22:02
claygtorgomatic: seems obviously a v1.1 api thing - "backwards incompatible with some clients (e.g. Tempest)" seems like one of those "valid reasons in particular circumstances" that rfc 2119 was talking about when it described *should* not22:02
claygacoles: idk, i'll rebase it when I push up some of these other reconstructor fixes (the timeout 1s thing for sure) - i guess I'll keep it separate for now22:04
acolesclayg: thats fine by me.22:05
notmynameclayg: /cc torgomatic you've convinced me. after looking at it again and thinking some, I'll -2 that patch22:05
claygnotmyname: i'm *always* down on changing the public api22:05
acolesclayg: i gotta try and sneak in the ssync protocol change somehow too22:06
torgomaticsounds good; whatever gets it out of a state that's waiting on feedback22:06
claygnotmyname: If we don't do a good job keeping these warts around we'll never have a good reason to work on the recrapifier middleware!22:06
notmyname:-)22:06
torgomaticretroencrapulator22:06
Nadeem@clayg I didn't run off :) I got distracted in another chat...well the RFC 2616 in Sec 4.3 says that 304 shouldn't have any Message body. Hence it should not have any Content-Length/Content-Type.22:06
claygacoles: whoa did I say that out loud - i was totally thinking you sneak that in somehow!22:06
notmynameclayg: ya, my initial look was more agreeing with the concept. I hadn't considered the implications. now, -222:07
acolesclayg: yup i'm just trying to think how to do it without churning all of the existing tests :/22:07
claygacoles: stupid tests - you won't even notice it when it comes in with all the other changes22:08
notmynameok, whew. out of meetings, I hope, for the day22:08
claygpeluse: OH NO!  it failed again!  it's less likely apparently - but still happens :'(22:14
acolesok i'm calling it a day22:14
pelusebah22:15
pelusesame failure mode?22:15
peluseeither way what you posted is simpler than what was there, wanna push it and I'll work on it from that point?22:15
claygpeluse: it's the same failure mode22:16
peluseacoles, have a good one man22:17
claygpeluse: idk, i'm trying to see if i can isolate to a specific set of tests - i don't really think it's the setUp now?22:17
peluseOK, its still isolated to the global tests though right?  I can go through them one at a time22:18
openstackgerritMerged openstack/python-swiftclient: Include unsupported url scheme with ClientException  https://review.openstack.org/15824822:18
*** acoles is now known as acoles_away22:19
*** thumpba has quit IRC22:20
pelusewell, that's what I'll do since wherever it is in there I created it - ugh22:20
claygpeluse: maybe it's the heartbeater stuff?22:22
peluseI'll get there... just adding them in one at a time til it breaks :)22:22
pelusethere aren't that many22:22
claygpeluse: well I think when you get to porting the ones that are causing the issues they'll - hehe - yeah that22:22
claygpeluse: ok I think i have it isolated to those 4 test process_job_all_* tests22:26
claygtest_process_job_all_timeout i think - obviously it's *all* timeout!22:27
peluseI'm almost to that point, will let you know if its the same for me.22:27
*** Nadeem has left #openstack-swift22:28
*** tacotuesday has joined #openstack-swift22:30
peluseheh, yeah.  just that one which is of course the last one I added back in...22:31
*** trex has quit IRC22:31
*** jogriffin has quit IRC22:33
peluseso increasing the mocked timeout seems to work - could it simply be that the large amount of crap in setup is right on the 1s mark?22:35
*** tacotuesday has quit IRC22:35
pelusebecause remember it works if I don't use the .commit() method as well (which is a shitpile less code)22:35
claygpeluse: idk, i think there was just some junk timeouts getting stuck in the hub or something22:37
claygok i'm going to submit with the skiptest in there22:37
*** annegentle has quit IRC22:38
pelusesounds good.  weird though, 5 sec still fails after a few runs (just that single test in the class) but 10 sec runs over and over22:38
openstackgerritClay Gerrard proposed openstack/swift: wip: ec reconstructor probe test  https://review.openstack.org/16429122:39
openstackgerritClay Gerrard proposed openstack/swift: Erasure Code Reconstructor  https://review.openstack.org/13187222:39
peluserunning now...22:40
*** annegentle has joined #openstack-swift22:41
peluseI'll also rebase acoles' dependent patch22:42
claygpeluse: oh shit - i was already doing that22:44
claygpeluse: now I don't want to type yes because you may beat me to it and the world could end?!22:45
peluseit likely would end :)22:45
pelusemy tox job is almost done22:45
openstackgerritClay Gerrard proposed openstack/swift: Fix ssync sender cleanup of reverted fragment files  https://review.openstack.org/16905222:45
clayghahah!22:45
claygpeluse: the trick is to not run tox ;)22:46
peluseyou bastard!22:46
peluseso I had a failure in tox though....22:47
claygnotmyname: torgomatic: why can't we have https://review.openstack.org/#/c/143791/ on feature/ec - it's a good and useful change!22:47
claygpeluse: unrelated22:47
claygpeluse: maybe?22:47
claygpeluse: :P22:47
peluselikely... test_version_manifest_utf8_container_utf_object22:47
claygso you may have the last push after all :D22:47
notmynameclayg: we can. same story as the multi-range get22:48
claygnotmyname: mattoliverau: yeah why can't we have multi-range get on feature/ec - it's a good and useful change!22:49
clayg$ git diff master | wc -l 2118122:50
claygpsshshhsththt - this is going to be *easy*22:50
*** devlaps has joined #openstack-swift22:50
notmynameclayg: smaller than storage policies ;-)22:50
*** devlaps has quit IRC22:54
peluseclayg, that failure above is on the latest ecrecon patch and can be hit with just running ./.unittests23:06
peluse(and not on latest feature/ec)23:07
claygpeluse: what failure now?23:10
claygpeluse: can you fix it?23:10
claygpeluse: it doesn't look like jenkins has chimed in on patch 131872 yet23:10
patchbotclayg: https://review.openstack.org/#/c/131872/23:10
peluse    test_version_manifest_utf8_container                         ERROR  1.3623:11
peluse    test_version_manifest_utf8_container_utf_object              ERROR  0.0123:11
claygso with all of the ECObjectController refactoring - do we still need all of the proxy specific ec methods on the storage policy?23:11
claygtorgomatic: ^23:11
peluseyeah, I'll look.  would have thunk it was me but passes on feature/ec23:12
claygpeluse: ok well it's probably something I added I guess?  I'll look into if it jenkins kicks it back before you find a fix23:12
peluseyou'll likely have dinner, a few shots of scotch and breakfast before I find a fix :)23:13
* peluse is not having a good week23:13
*** annegentle has quit IRC23:13
clayglol23:14
claygpeluse: ok well then maybe ignore it and try to fill in some of the reconstructor tests - decide what if anything we're to do with the all_timeout test - and think about what you want to do build_jobs?23:14
claygpeluse: or tell me what to do - because i'm just piddlin' around with getting some changes ready on review/ec23:15
claygerr... feature/ec_review23:15
claygwhich *does* kind of sound like a feature!23:15
peluseclayg, looks like it came in with patch set 45 (same as the crazy timeout)23:17
peluselet me see if it runs w/o the global tests23:17
*** ChanServ changes topic to "EC Merge plan: https://etherpad.openstack.org/p/ec_merge_plan | Review Dashboard: http://goo.gl/uRzLBX | Overview Dashboard: http://goo.gl/2By1qv | Logs: http://eavesdrop.openstack.org/irclogs/%23openstack-swift/"23:18
notmynamehttps://etherpad.openstack.org/p/ec_merge_plan23:18
notmynamethere's the plan to merge ec to master23:18
notmynameto be discussed more at tomorrow's meeting23:19
claygpeluse: if stupid jenkins would get off it's can and give me a traceback I might be able to look at it!23:20
*** petertr7 has quit IRC23:21
peluseclayg, OK wait I lied about the patch set 45 thing - different failures (expected).... will keep searching for where it went off the rails23:22
peluseclayg, jenkins says "it still stucks" -- https://jenkins01.openstack.org/job/gate-swift-python27/4349/console (same error I see locally)23:27
*** kei_yama has joined #openstack-swift23:27
claygpeluse: why isn't that posted on the patch yet?23:28
peluseI went to zuul and clicked in on it - I don't think its totally done yet23:29
peluseso its passed for me twice with 49 and failed twice with 5023:30
*** tongli has quit IRC23:30
claygbah this is so tedious23:31
peluseand makes so little sense23:32
mattoliverauSorry been in a meeting23:35
mattoliverauclayg: multi-range gets would be awesome in EC.. so long as they work.. in my testing they were cutting off the end of the final boundary (miscalculate content-length maybe?) making incomplete multi-part. torgomatic has uploaded a new patchset so I should go test that I guess.23:38
*** km has joined #openstack-swift23:38
* peluse feels like a blind squirrel trying to find a nut23:41
claygpeluse: you said you can reproduce it locally right - have you found a series of tests that cause the issue - beyond.... all of them?23:42
claygpeluse: you said adding raise SkipTest() in the GlobalReconstructor tests didn't make it stop for you?23:42
pelusenot yet, I ruled out the crappy global tests and am now just looking at unsuspecting changes in patch set 5023:42
pelusecorrect23:43
claygtest coupling is the worst thing ever invented23:43
*** ho has joined #openstack-swift23:52
claygpeluse: while true; do nosetests swift/test/unit/obj/test_reconstructor.py swift/test/unit/proxy/; if [ $? -ne 0 ]; then break ; fi; done seems to fail pretty reliably for me23:53
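For a bounded version of the loop above, here is a small Python sketch that uses only the standard library. run_until_failure and max_runs are hypothetical names introduced for illustration; the nosetests paths in the comment are copied from clayg's command.

```python
import subprocess


def run_until_failure(cmd, max_runs=50):
    """Run cmd repeatedly.

    Returns the 1-based number of the first run that exits non-zero,
    or None if all max_runs runs pass.
    """
    for attempt in range(1, max_runs + 1):
        # subprocess.call returns the child's exit status.
        if subprocess.call(cmd) != 0:
            return attempt
    return None


# Example invocation (not executed here), using the paths from the chat:
# run_until_failure(["nosetests",
#                    "swift/test/unit/obj/test_reconstructor.py",
#                    "swift/test/unit/proxy/"])
```

The cap on runs is the only behavioural difference from the shell one-liner: an intermittent failure that never shows up ends the loop instead of spinning forever.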
pelusethis is bananas - looks like its one of the tests in the other class - the new ones23:54
*** zhill has quit IRC23:55
claygmaybe it's worth understanding why that *_all_timeout test was so bad?23:55
torgomaticone thing that's gotten me before is a proxy server object (proxy.server.Application) being shared between tests and deciding to error-limit my fake nodes because I had fake errors23:56
torgomaticand then that leaks to the next test case and the proxy gets all snobby about who it'll talk to23:57
mattoliverauSo on the latest EC reconstructor patch, I get the timeout error on test_reconstructor_skips_bogus_partition_dirs, but only when I run all tests on TestGlobalSetupObjectReconstructor (and probably when running the whole suite). Running just the test seems to be fine.. so must be resource cleaning somewhere.23:57
mattoliverauI know you've discussed it to death, but hey I was sleeping :P I'm going to take a quick debugging look in case fresh eyes help23:58
pelusemattoliverau, are you running all unittests?23:58
pelusebecause what you mention is a new one that at least I haven't seen23:58
mattoliveraupeluse: I wanted to recreate the issue without waiting as long, and seems to trigger if I just run all the tests in TestGlobalSetupObjectReconstructor class.23:59
claygpeluse: https://github.com/simplegeo/eventlet/blob/master/eventlet/timeout.py#L7623:59
pelusemattoliverau, so whatever is behind this series of strange issues it seems to depend on how the test is run23:59
