Monday, 2015-04-13

*** jamielennox|away is now known as jamielennox00:06
*** pberis has joined #openstack-swift00:18
*** dmorita has joined #openstack-swift00:34
*** kota_ has joined #openstack-swift00:39
*** km has joined #openstack-swift00:41
*** geaaru has quit IRC01:05
*** fanyaohong has quit IRC01:09
*** thumpba has quit IRC01:11
*** pberis has quit IRC01:18
*** kota_ has quit IRC02:03
openstackgerritMerged openstack/swift: Set connection timeout in container sync  https://review.openstack.org/15694303:16
*** thumpba has joined #openstack-swift03:25
*** geaaru has joined #openstack-swift03:31
*** thumpba has quit IRC03:44
*** thumpba has joined #openstack-swift03:53
*** km_ has joined #openstack-swift04:01
*** km has quit IRC04:03
*** thumpba has quit IRC04:19
*** kota_ has joined #openstack-swift04:27
*** ppai has joined #openstack-swift04:50
notmynameI'd love to see the ec_review patches get finished up in the next 24 hours.04:54
notmynamethat will give a chance to get the other pending-on-master patches to land and use to make an RC on tuesday04:55
*** haomaiwang has joined #openstack-swift05:09
kota_notmyname: ok, I'm going to hurry me to review the patches on the ec_review branch.05:11
*** geaaru has quit IRC05:17
openstackgerritPratik Mallya proposed openstack/python-swiftclient: Accept token and tenant_id auth  https://review.openstack.org/17279105:45
*** km_ has quit IRC05:48
*** kota_ has quit IRC05:59
*** km has joined #openstack-swift06:01
cschwedeGood Morning!06:40
mattoliveraucschwede: Guten Morgen, have a good weened?06:52
mattoliverau*weekend06:52
cschwedemattoliverau: Good Morning Matthew! Yes, thanks - finally spring is arriving over here thus enjoying the sun :D How about you?06:52
mattoliverauIts Autumn, so cooling down.. but like you means its wonderful weather and enjoying actually being out in the sun :)06:53
mattoliverau(without burning)06:54
*** nshaikh has joined #openstack-swift07:01
mattoliverauBut yeah, had a good weekend :)07:03
*** jamielennox is now known as jamielennox|away07:11
cschwedei’m wondering about the test errors in the reconstructor (https://review.openstack.org/#/c/170339/) and asking myself if this is something we need to worry about. the tests pass locally on my VM though07:22
*** jistr has joined #openstack-swift07:24
*** chlong has quit IRC07:25
*** mmcardle has joined #openstack-swift07:34
*** geaaru has joined #openstack-swift07:46
*** krykowski has joined #openstack-swift07:49
*** jordanP has joined #openstack-swift08:00
*** ujjain has joined #openstack-swift08:28
*** acoles_away is now known as acoles08:28
*** ujjain has quit IRC08:28
acolesmorning08:28
openstackgerritLorcan Browne proposed openstack/swift: Add lowest option to swift-recon disk usage output  https://review.openstack.org/16723608:29
*** joeljwright has joined #openstack-swift08:32
*** tanee has quit IRC08:40
*** tanee has joined #openstack-swift08:41
acolescschwede: i just took a look at the test_reconstructor errors, they all appear to be due to comparing values derived from lists that may not always have same order e.g. dict keys08:46
*** haigang has joined #openstack-swift08:46
acolescschwede: so i think the problem is that the tests should sort before comparing, and not a fundamental problem with the unit under test08:46
cschwedeacoles: Morning! Well, if it is from a dict we should see these errors with a 50/50 chance, or not? because the dicts have only two entries08:47
acolescschwede: yes. it will need to be fixed.08:51
acolescschwede: or do you never see it locally?08:52
cschwedeacoles: no, not on my three tests, thus i’m wondering08:52
acolescschwede: i am just looping the tests to try to reproduce08:54
*** km has quit IRC08:56
*** jamielennox|away is now known as jamielennox09:02
acolescschwede: hmmm, i can't reproduce either. but the failures are due to misordering when comparing suffix lists having 'abc' and '123'09:07
cschwedeacoles: yes, i think sometimes there is a set() missing - while some tests already use it09:13
*** haigang has quit IRC09:13
*** tanee has quit IRC09:14
*** tanee has joined #openstack-swift09:14
*** haigang has joined #openstack-swift09:15
cschwedeoh wait, that are lists, not dicts.09:15
acolescschwede: the failed unit test report for the assertion at line 1500 test_reconstructor.py.test_build_jobs_handoff shows the expected value stub_hashes.keys() to be ['abc', '123'], but...09:21
cschwedeacoles: do we still rely on 2.6 for testing? i don’t think so, right? we could use https://docs.python.org/2/library/unittest.html#unittest.TestCase.assertItemsEqual then09:21
cschwedewhich is basically assertEqual(sorted(expected), sorted(actual))09:22
acolescschwede: locally if i construct same stub_hashes dict the key order is reversed, so the test passes09:22
acolescschwede: re py26 i'm not sure we have officially abandoned it - the swiftstack CI runs tox -e py2609:22
cschwedeacoles: ah yeah, you’re right. thus simply throwing in a few sorted() should do it then09:23
*** aix has joined #openstack-swift09:23
cschwedewell, a few more...09:24
cschwedeacoles: i create a diff09:24
acolescschwede: yes, it's weird though because the value being tested must also be a dict with same keys so on same machine similar dicts are sorting differently?? one list is based on dictA.items(), the other on dictB.keys(), both have same keys but we get different ordering. not sure if that should surprise me or not, i know dict ordering is arbitrary.09:25
acolescschwede: thanks for doing the diff09:26
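[Editor's sketch of the order-insensitive comparison discussed above. The dict contents and the names `same_items` and `SuffixOrderTest` are hypothetical stand-ins, not code from test_reconstructor.py. `assertCountEqual` is the Python 3 name for the Python 2.7 `assertItemsEqual` mentioned in the channel.]

```python
import unittest


def same_items(expected, actual):
    """Order-insensitive comparison that still notices duplicates."""
    return sorted(expected) == sorted(actual)


class SuffixOrderTest(unittest.TestCase):
    def test_keys_ignore_order(self):
        # Hypothetical stand-in for the stub_hashes dict in the real test.
        stub_hashes = {'123': {}, 'abc': {}}
        expected = ['abc', '123']
        # Fragile form, depends on the dict's iteration order:
        #   self.assertEqual(expected, list(stub_hashes.keys()))
        # Robust form: sort both sides before comparing.
        self.assertEqual(sorted(expected), sorted(stub_hashes.keys()))
        # Python 3 spelling of Python 2.7's assertItemsEqual.
        self.assertCountEqual(expected, stub_hashes.keys())
```

[Saved to a file, this can be run with `python -m unittest <file>`; both assertions pass regardless of the order in which the keys come back.]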
*** kei_yama has joined #openstack-swift09:26
*** haigang has quit IRC09:27
*** haigang has joined #openstack-swift09:28
*** ppai_ has joined #openstack-swift09:29
*** ppai has quit IRC09:33
acolesclayg: ^^ fyi scrollback09:35
*** Kirgahn has joined #openstack-swift09:38
KirgahnHello everyone!09:39
*** theanalyst has quit IRC09:42
KirgahnThere's a problem I'm having with a single node swift deployment I made within a v.m. -  I can create containers and upload/download objects just fine but, when I try to delete something, it refuses with an http 400 Object DELETE failed: Invalid path: /device0/3674/AUTH_8f63721ec8734d29adb22ce73cc09:43
Kirgahngot any advice?09:43
*** theanalyst has joined #openstack-swift09:44
ppai_Kirgahn, could you share the command you used to issue a delete request09:44
cschwedeacoles: clayg: diff that wraps some dicts in sorted(): http://paste.openstack.org/show/203488/09:45
Kirgahnvia bash "swift delete SwiftContainer", via horizon i just select the object and delete it - same result09:46
Kirgahnthnx09:46
acolescschwede: thanks! did you post that link on the gerrit review?09:46
cschwedeacoles: yes09:47
acolesgreat09:47
ppai_Kirgahn, I hope the container is empty09:48
Kirgahnthe container is not empty - as I stated, when I try to delete the single object it contains I get the very same error09:49
KirgahnI'm aware of the fact that you can't delete containers that are not empty09:49
ppai_hmmm.interesting, did u take a look at the logs ?09:50
*** yuan has quit IRC09:52
*** yuan has joined #openstack-swift09:53
Kirgahnyes, this is what I get: "[root@swift ~]# swift delete test 880568-sophie-howard--AhaWallpaper.com.jpg Object DELETE failed: http://192.168.124.133:8080/v1/AUTH_8f63721ec8734d29adb22ce73ccd0ac5/test/880568-sophie-howard--AhaWallpaper.com.jpg 400 Bad Request  [first 60 chars of response] Invalid path: /device0/3154/AUTH_8f63721ec8734d29adb22ce73cc"09:53
KirgahnI can easily download that object09:53
*** haigang has quit IRC09:54
*** haigang has joined #openstack-swift09:55
Kirgahn"[root@swift ~]# swift download test 880568-sophie-howard--AhaWallpaper.com.jpg 880568-sophie-howard--AhaWallpaper.com.jpg [auth 0.408s, headers 0.648s, total 0.658s, 1.556 MB/s]"09:56
ppai_If it's a saio vm, you'll be having access to swift logs09:57
Kirgahnit's not, i actually manually deployed and connected it to an existing openstack deployment09:58
KirgahnI've already tried to increase verbosity with "log_name = swift log_facility = LOG_LOCAL0 log_level = DEBUG log_headers = false log_address = /dev/log" in each server conf file09:59
ppai_looking from the code, that message "Invalid path: ****" is thrown by split_path() which means there's something wrong with the path09:59
*** haigang has quit IRC10:00
Kirgahncan i do anything else to increase verbosity? the weird thing is that I can download the object easily but I can't delete it, as if the path would change10:05
acolesKirgahn: someone asked here with a similar problem recently and IIRC there was a misconfigured account or container or object server port in the configs which was causing maybe a container server to be listening on port that should be an object server10:18
Kirgahnmmm, thanks for the pointer I'll triple check the config10:18
acolesKirgahn: double check your config file and ring file port numbers10:19
acolesKirgahn: its just that error (invalid path) is symptomatic of a request being sent to the wrong server type10:20
acolesKirgahn: but tbh i'm not sure how you would have uploaded the object in that case10:21
*** ppai_ has quit IRC10:26
KirgahnThanks acoles! I had a weird double entry in the object.builder10:31
Kirgahnswift-ring-builder  /etc/swift/object.builder  /etc/swift//object.builder, build version 3 4096 partitions, 1.000000 replicas, 1 regions, 1 zones, 2 devices, 0.00 balance The minimum number of hours before a partition can be reassigned is 1 Devices:    id  region  zone      ip address  port  replication ip  replication port      name weight partitions balance meta              0       0     0       127.0.0.1  6002       127.10:32
Kirgahni removed the wrong entry, rebalanced and voilà!10:32
Kirgahnstill weird though10:32
Kirgahnone entry was serving upload and download requests, the other one was handling deletes10:33
acolesKirgahn: ok, glad you found it.10:36
*** ppai_ has joined #openstack-swift10:38
*** aix has quit IRC10:45
tab___Which database is best to use/preferred one to use with Swift? Should I go with SQLite or MySQL/MariaDB?10:56
*** ppai_ has quit IRC11:19
portantetab___: what do you mean by "best to use/preferred"?  Are you asking if there is an option to tell which DB to use?  Or are you asking what to use in some larger system working with Swift?11:26
*** tab___ has quit IRC11:27
*** ppai_ has joined #openstack-swift11:33
*** ujjain has joined #openstack-swift11:34
*** kei_yama has quit IRC11:43
*** jamielennox is now known as jamielennox|away12:12
*** EmilienM|afk is now known as EmilienM12:19
*** ppai_ has quit IRC12:33
*** PurpleJack has joined #openstack-swift12:38
*** jroll has quit IRC12:50
*** jroll has joined #openstack-swift12:50
*** Kirgahn has quit IRC12:55
*** openstackgerrit has quit IRC13:00
*** openstackgerrit has joined #openstack-swift13:03
*** krtaylor has quit IRC13:03
*** erlon has joined #openstack-swift13:15
*** ozialien has joined #openstack-swift13:16
*** proteusguy has quit IRC13:19
*** petertr7 has joined #openstack-swift13:20
*** aix has joined #openstack-swift13:24
*** dmorita has quit IRC13:28
*** proteusguy has joined #openstack-swift13:31
*** annegentle has joined #openstack-swift13:44
*** Trixboxer has joined #openstack-swift14:04
*** lpabon has joined #openstack-swift14:07
*** nshaikh has quit IRC14:15
*** tellesnobrega has quit IRC14:17
*** vinsh has quit IRC14:18
*** tellesnobrega has joined #openstack-swift14:19
*** jistr is now known as jistr|mtg14:29
*** vinsh has joined #openstack-swift14:33
notmynamegood morning14:49
acolesnotmyname: good morning14:51
notmynamebig day today (I hope) :-)14:51
acolesnotmyname: what's your plan B? :P14:52
notmynamedo it the next day? ;-)14:53
*** proteusguy has quit IRC14:54
notmynameacoles: cschwede: I'm glad you were looking at the reconstructor error14:54
notmynamelooks like 2 ec_review patches need a 2nd +2. and 4 need 2 +2s14:56
notmynamehow up to date with issues is https://etherpad.openstack.org/p/swift_ec_triage14:57
*** welldannit has joined #openstack-swift14:57
acolesnotmyname: oh i thought we had more double +2's14:58
acolesnotmyname: 9 reviews in total right?14:58
notmynamesome might have been lost with a recent push14:58
notmynameya, I'm looking at https://review.openstack.org/#/q/status:open+project:openstack/swift+branch:feature/ec_review+topic:bp/swift-ec,n,z14:59
acolesnotmyname: the -2 on patch 169985 is obscuring all the +2's there15:01
patchbotacoles: https://review.openstack.org/#/c/169985/15:01
notmynameoh, yeah. but that one is a special case. yes. that one too is ready to go, it seems15:02
notmynameok, I added a +A there to help with tracking15:03
notmynameok, so 4 have +A and 5 don't15:04
acolesnotmyname: i'm close to +2 on patch 169989 but need some reassurance on a query there15:04
patchbotacoles: https://review.openstack.org/#/c/169989/15:04
acolesor a slap round the head for being stupid15:04
acolesand i am reviewing the reconstructor15:05
acolesnotmyname: it feels like it has been less painful than SP was in terms of stuff changing 'underneath' patches up the chain15:06
acoles...so far15:06
notmyname:-)15:06
acolesnice video (blog) btw, reminded me how nice the view is from your offices15:07
notmynameone of the biggest differences, IMO, between SP and EC is that EC has been more of a whole-community effort from the beginning. whereas SP was more of a few people doing it and then the merge was everyone else coming up to speed15:07
*** jistr|mtg is now known as jistr15:07
acolesyup that ^^ was certainly the case for me15:07
notmynameand, yes, this seems less painful than SP was15:07
notmynameso THANK YOU! :-)15:07
notmynamewell, that's an interesting email post this morning: "eventlet 0.17.3 is now fully Python 3 compatible"15:08
notmynametdasilva: I know you aren't around, but congrats on the new baby!15:11
acolesnotmyname: oh wow, did he/she arrive early?15:12
*** annegentle has quit IRC15:12
notmynameyes he did. tdasilva sent me an email yesterday.15:12
notmyname"...our baby Lucas arrived last night. It was a little earlier than expected but baby and mom are doing well."15:12
*** annegentle has joined #openstack-swift15:13
acolesexcellent. so tdasilva may make vancouver after all :P :P15:14
notmynameheh15:15
*** GlennS has left #openstack-swift15:20
*** annegentle has quit IRC15:23
*** zaitcev has joined #openstack-swift15:27
*** ChanServ sets mode: +v zaitcev15:27
notmynameok, the starred patches on the dashboard are: (1) ec_review patches, (2) stuff for master that already has 2 +2s (so I can remember to land them as soon as ec_Review lands), and (3) a couple of small nice-to-haves15:35
notmynameeg, it's the 11th hour for https://review.openstack.org/#/c/166576/15:36
*** gyee has joined #openstack-swift15:41
*** jistr has quit IRC15:45
*** baffle has joined #openstack-swift15:49
*** vinsh has quit IRC15:57
*** annegentle has joined #openstack-swift16:00
notmynamecschwede: "big thinks..."16:01
cschwedenotmyname: thanks a lot, the review process works well ;)16:04
* cschwede thinks more about other things now16:04
notmyname;-)16:04
*** Fin1te has joined #openstack-swift16:19
*** ozialien has quit IRC16:22
openstackgerritJohn Dickinson proposed openstack/swift: 2.3.0 authors and changelog updates  https://review.openstack.org/17257316:22
notmynameI included some EC notes in this new version ^^16:23
notmynameany comments and improvements are welcome16:23
*** jordanP has quit IRC16:24
*** aerwin has joined #openstack-swift16:27
notmynameFYI, I'll be going down to santa clara in a couple of hours and be in various states of "online" this afternoon. I'll be fully back online once I'm home, and I expect to be up late finishing EC and the rest of it for tomorrow16:30
notmynameI'm definitely available for anything that comes up (and nearly all of you have my cell if it's really important)16:31
acolesnotmyname: remind me, we need all ec reviews +2 by end of today, correct?16:32
notmynameyes, that's the goal. obviously, good quality trumps "today", but I'd like to have everything submitted to the gate (to land on feature/ec_review) by the time I go to bed tonight16:33
notmynamethen tomorrow morning I'll propose and land the ec_review->master merge commit16:33
notmynamethen the other stuff that's pending on master with 2 +2s16:33
acolesright so ideally clay pulls the corking -2 later today16:33
notmynameright :-)16:33
*** sandywalsh has quit IRC16:34
notmynamethen once all that's done, we've got a SHA for the RC and I'll send that on to ttx.16:34
notmynameso that's my schedule for the next ~30-36 hours16:34
notmynameif I'm lucky, I'll get a good night's sleep too! ;-)16:35
*** sandywalsh has joined #openstack-swift16:35
*** krykowski has quit IRC16:36
*** ujjain has quit IRC16:51
*** vinsh has joined #openstack-swift16:54
*** haomaiw__ has joined #openstack-swift17:02
*** haomaiwang has quit IRC17:03
*** annegentle has quit IRC17:10
*** ozialien has joined #openstack-swift17:12
*** geaaru has quit IRC17:16
*** Fin1te has quit IRC17:18
*** mmcardle has quit IRC17:19
claygmorning17:27
claygsounds like there's a few diffs already to apply - I'm inclined to take them now?17:29
acolesclayg: morning. i think i have posted all diffs i have for the moment - i *think* all i have left to review is docs17:30
*** rdaly2 has joined #openstack-swift17:30
*** ozialien has quit IRC17:31
claygacoles: well some of the diffs are like really good right - fixing mis-spelled variable names and the ilk?17:31
acolesclayg: there's one on per-policy-diskfile review for cleaning up old non-durable data that you should probably *review* before applying17:32
*** aix has quit IRC17:32
claygnon-durable data - waits reclaim age right?17:32
claygthe only reason I didn't grab it on friday night was laziness on some tests that were coupled with the replicated behavior17:32
acolesclayg: yeah. at least thats the intent :P17:32
claygyeah that one would be nice to verify functionally - i'm not entirely sure how much I tested .durable repair via reconstructor17:33
claygthere was some notion that it was slow/wasteful - like it would rebuild the whole damn FA just to get the .durable to the remote - but as long as it works it's probably fine17:34
*** zhill has joined #openstack-swift17:34
acolesclayg: well this one i didn't write it til sunday :) https://gist.github.com/alistairncoles/a118e563495d9bd0903e17:34
claygso really it's more of a sanity check of the missing check to make sure the receiver really does ask for the hash/timestamp of the non-durable data17:34
acolesclayg: so which diff are you referring to?17:35
claygacoles: as far as the race in purge - *I* think pushing the cleanup of the non-fi-indexed-durable out until the next pass should essentially make it a non-race - in that once things have settled down for a whole replication pass it's even less likely that suddenly some writes are going to start showing up again into a much smaller window - but maybe i'm being overly optimistic17:37
*** aix has joined #openstack-swift17:37
claygI thought cschwede had a diff for me too - i was just reading email - i need to go look at the reviews on the patch sets17:38
notmynamehi clayg17:38
clayg... but I'm inclined to apply whatever fixes we have written17:38
claygnotmyname: good morning!17:38
notmynameclayg: party day!17:38
acolesyeah cschwede fixed up some tests on ec-recon, he's left a diff linked to the review17:38
notmyname(and ec_review party)17:38
notmyname*an17:38
claygacoles: yeah those!  let's get 'em17:39
notmynamemaybe that's why I don't get asked to throw many parties17:39
claygnotmyname: you and peluse and torgomatic and I can figuratively reapply acoles and cschwede's +2's later this morning right?17:39
notmynamewhat do you mean?17:40
acoleshe means can you proxy vote for me :)17:40
claygnotmyname: looks like mattoliverau might have a few nits as well17:40
claygnotmyname: well for all the +2's and "this is good enough" - I think there's still a couple of opportunities to improve some17:40
claygI'm more inclined to make the fixes and approve them than to not make the fixes just to avoid any changes17:41
claygI *do* want to get stuff merged today tho because I really want to shift my focus to scale testing in the lab17:41
notmynameoh, yeah. the way this whole thing is going, it seems like we're all in it together. so if acoles and cschwede go to bed and you and me and torgomatic and mattoliverau end up +2/+A stuff, then it's fine17:42
notmynameI mean, I'm fine with acoles doing stuff when I'm asleep. I'm assuming that's transitive17:42
claygnotmyname: cool - that's what I was thinking17:42
*** zhill has quit IRC17:44
claygacoles: so that gist with the .data cleanup - is .data in the reclaim rules or not?17:44
claygacoles: yeah i'm totally confused why test_hash_cleanup_listdir_keep_single_old_data didn't fail with that diff17:46
claygacoles: and what about the quorum size stuff - torgomatic have you seen weekend comments on https://review.openstack.org/#/c/169989/17:48
acolesclayg: ok gimme a few mins to catch up - yeah, the quorum size is the one issue i'm not sure on hence no +2 yet on that patch17:48
*** aix has quit IRC17:49
claygcschwede: some of the sorted(set( changes in http://paste.openstack.org/show/203488/ don't make sense to me - the set equality was meant to address the ordering - maybe one side forgot to get wrapped in a set?17:51
claygcschwede: the dict.keys == [] obviously needed to be sorted - thanks for those17:51
*** ozialien has joined #openstack-swift17:52
*** rdaly2 has quit IRC17:53
*** jkugel has joined #openstack-swift17:53
torgomaticclayg: yeah, I was looking at that quorummy stuff on the train17:54
torgomaticand it confuses me17:54
*** krtaylor has joined #openstack-swift17:54
claygtorgomatic: well all the multi-phase PUT stuff confuses me17:56
claygtorgomatic: so I'm sure I'm more confused than you are17:57
claygtorgomatic: probably only acoles knows what to do - and he's claiming ignorance17:57
acolesi could trump you both on confusion :)17:57
torgomaticI think it's mostly a question of how much is a quorum17:58
notmynameI'm driving to santa clara now. I'll be online as much as possible this afternoon17:58
claygnotmyname: STOP GOING PLACES!?17:58
notmynamelol17:58
claygnotmyname: you can't have two number one priorities17:58
notmynameclayg: talk to dana/manzoor ;-)17:58
claygI'll fucking break some heads - me getting fired won't help you17:58
notmynamelol17:59
notmynamenow I know who to call17:59
claygtorgomatic: so was DELETE POST only need "most" of the replicas like on purpose?17:59
*** annegentle has joined #openstack-swift17:59
claygit makes sense to me - you only need one tombstone for eventual consistency to win out - the deleted-ness of an object is not erasure-coded it's replicated to high hell18:00
torgomaticclayg: I guess so; really as long as you can write out a pair of tombstones, you're okay18:00
torgomaticthis is more on the PUT part though; like, how many still-working PUTs do you need at each step of the way?18:00
torgomaticthe code is not particularly clear on the matter18:00
claygbah :'(18:01
claygit's probably all my fault18:01
torgomaticyou are in the Co-Authored-By line ;)18:01
cschwedeclayg: most of the sorted() wraps was due to the errors reported here: http://logs.openstack.org/39/170339/6/check/gate-swift-python27/78c17d4/18:02
cschwedeand i added a few more because i thought they might break as well18:02
cschwedeclayg: i can limit the diff to the reported errors if you think that’s more safe and helps18:03
claygcschwede: no I think adding all the sorted(dict.keys()) == sorted([]) is good - anywhere we're doing sorted(set()) == sorted(set()) is not needed as sets are already un-ordered18:05
claygi do the audit when I apply the diff18:05
claygcschwede: thanks!18:06
cschwedeclayg: ah, ok, got it. yes, makes no sense to sort a set18:06
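[Editor's sketch of the point just made: set equality already ignores order, so wrapping both sides in sorted() adds nothing. The suffix values are illustrative only. One caveat that is the editor's addition, not something raised in the channel: sets also drop duplicates, which sorted lists do not.]

```python
expected = ['abc', '123']
actual = ['123', 'abc']

# Lists compare element by element, so order matters.
lists_equal = expected == actual

# Sets compare by membership only, so order is irrelevant.
sets_equal = set(expected) == set(actual)

# Sorting the sets first gives the same answer; the sorted() is redundant.
sorted_sets_equal = sorted(set(expected)) == sorted(set(actual))

# Caveat: set() hides duplicates, sorted() on the raw lists does not.
dupes_hidden_by_set = set(['abc', 'abc']) == set(['abc'])
dupes_seen_by_sorted = sorted(['abc', 'abc']) != sorted(['abc'])

print(lists_equal, sets_equal, sorted_sets_equal)
```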
cschwedeclayg: you’re welcome, glad i could help a bit18:06
clayg^ huge understatement!18:06
claygthanks a ton!18:06
claygok i'm finished looking things over - i'm going to start applying diffs18:07
acolesclayg: let me stew on the hash_cleanup_listdir diff a little longer. you're right that .data should be in reclaim_rules but i think i see a better way18:08
clayghopefully torgomatic will find his way out of his commute and show me how to beat all of the stupid out of the my craptacular paraphrasing of his proxy work18:09
claygacoles: well.... OHHHHHHhhhhkay18:09
acolesclayg: just need to stoke up on caffeine...18:09
*** Fin1te has joined #openstack-swift18:18
*** rdaly2 has joined #openstack-swift18:22
*** rdaly2 has quit IRC18:24
*** annegentle has quit IRC18:33
*** ozialien has quit IRC18:33
*** Fin1te has quit IRC18:35
*** joeljwright has quit IRC19:03
*** joeljwright has joined #openstack-swift19:03
claygacoles: ok, i'm skipping over the diskfile patch you posted anticipating even more awesomeness coming shortly19:08
acolesclayg: k. so test_hash_cleanup_listdir_keep_single_old_data didn't fail with my diff because i'd only find fragments_without_durable if there was more than one (because of the wacky len(files) condition)19:10
acolesbut if i uncomment your reclaim_rule for .data then a bunch of stuff fails :( and i'm still working through those19:11
claygacoles: interesting19:12
claygacoles: fwiw, i'm pretty sure the only reason the len(files) == 1 check is there is because like with EC, in replication, the tombstone reclamation was added at the very end and the very special "only one file" condition was an easy way to ensure other code paths relating to suffix hashing wouldn't be affected19:13
acolesclayg: some of it is just where tests create files at time 42 and the .data gets instantly reclaimed after its written :) because HCL is called after the .data put before the .durable is written (thats due to not touching legacy code and is on our trello todo to fix)19:14
claygbut we already know that's all quite janky because of the object-server not plumbing in reclaim age through get_hashes19:14
acolesclayg: yeah i'm sure its so19:14
claygacoles: oh yes, I see that would be a problem19:15
claygsimple enough to make those tests use self.ts() instead of 42?19:15
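[Editor's sketch of why a test file stamped with time 42 disappears as soon as cleanup runs. `is_reclaimable` is a hypothetical helper, not Swift's hash_cleanup_listdir, and the one-week reclaim age is assumed here purely for illustration.]

```python
import time

ONE_WEEK = 7 * 24 * 60 * 60  # assumed reclaim age, in seconds


def is_reclaimable(file_timestamp, now=None, reclaim_age=ONE_WEEK):
    """Return True if the file is older than the reclaim age."""
    if now is None:
        now = time.time()
    return (now - file_timestamp) > reclaim_age


# A file "written" at Unix time 42 is decades old, so it is reclaimed at once.
print(is_reclaimable(42))           # True
# A file stamped with the current time survives the cleanup pass.
print(is_reclaimable(time.time()))  # False
```

[This is why switching the tests to a current timestamp source, rather than a small constant, stops the freshly written .data from vanishing.]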
acolesclayg: yeah, for a minute i was wtf where did my diskfile go???19:15
claygacoles: not a great experience even in tests i'm sure :\19:16
acolesclayg: yep done that just got some reconstructor tests failing :/19:16
acolesclayg: oh yeah, so i have self.ts() !19:17
*** thumpba has joined #openstack-swift19:17
claygare the reconstructor tests the dict keys order thing cschwede found?19:17
acolesclayg: no i have some left over status from the fake conn19:19
* acoles goes digging...19:19
notmyname.19:20
acoles..19:20
clayg../..19:20
notmynamepermission denied19:21
claygcschwede: the literal path to the sample internal client config I think is sorta problematic19:21
claygI'm thinking I'll have to build it up from test.__file__19:24
*** silor has joined #openstack-swift19:27
pelusewow, that's some serious scrollback19:27
*** silor has quit IRC19:36
*** rdaly2 has joined #openstack-swift19:38
peluseacoles, you still there?19:42
acolespeluse: i am19:43
pelusewas just looking at your quorum size comment in get_put_responses() sure seems like a bug to me19:43
pelusedid you do anything further with it or should I post a gist fix?19:43
acolesyeah it felt like a _quorum_size() override method had gone AWOL19:43
claygpeluse: acoles: wasn't that call for the .durable?19:44
acolespeluse: no i have not taken it any further, i wasn't confident enough that i understood19:44
pelusethe .durable uses the minimum_responses thing19:44
claygpeluse: acoles: feels like the standard/crazy "about half your nodes" rules apply19:44
*** Fin1te has joined #openstack-swift19:44
peluseso we need regular quorum before we issue the .durable and then minimum .durable responses to be done (and that min is 2 right now)19:45
claygpeluse: so the PUT was only requiring half instead of the ec policies rule - and tests were passing because why?  Does our fake test ec policy just happen to have a quorum that is equal to the replication calculation?19:45
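[Editor's sketch of the concern in this exchange. It assumes a simple-majority rule for the "about half your nodes" calculation and uses the general erasure-coding property that at least ndata fragments are needed to rebuild an object. Both functions are hypothetical illustrations, not Swift's quorum code, and the policy sizes are made up.]

```python
def majority_quorum(n_nodes):
    """Simple majority: just over half the nodes (assumed replication rule)."""
    return n_nodes // 2 + 1


def fragments_needed(ec_ndata):
    """An erasure-coded object needs at least ec_ndata fragments to decode."""
    return ec_ndata


# (ndata, nparity) pairs chosen for illustration only.
for ndata, nparity in [(10, 4), (2, 1)]:
    n_nodes = ndata + nparity
    majority = majority_quorum(n_nodes)
    needed = fragments_needed(ndata)
    print('%d+%d: majority=%d, fragments needed=%d, majority sufficient=%s'
          % (ndata, nparity, majority, needed, majority >= needed))
```

[With a 10+4 scheme a majority of 8 successful writes is fewer than the 10 fragments needed to decode, while with a tiny 2+1 scheme the two numbers coincide. A small test policy could therefore hide the difference, which is the possibility raised above; whether that is what happened in these tests is not established in the log.]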
pelusedon't know that (yet)19:46
claygtorgomatic: are you also looking at this?19:46
peluseI'll post something in the patch shortly19:47
acolesclayg: peluse see my comment on patchset 6, need_quorum is opposite to final_phase so self.have_quorum is called for intermediate (.data) phase19:48
peluseOK19:49
acolesclayg: ok reconstructor tests passing. hardcoded time to create files :/19:54
acolesclayg: so shall we go the whole way and remove the len(files)=1 condition ???19:55
claygacoles: I thought the only reason we had not is because of some overlap of tests in the base mixin?19:55
clayglike we'd have to get rid of the requirement and fix the replicated case in order to have common tests pass on both diskfiles?19:55
acolesclayg: yes tests and "remaining consistent with legacy"19:56
acolesclayg: actually lets not19:56
acolesclayg: its make work19:56
acolescan fix it later and do other stuff now19:57
acolesclayg: the important thing is that old .datas are now cleaned up19:57
claygyeah that's nice to have I suppose19:57
acolesclayg: lets put perfectionism on hold19:57
* acoles is 12 hours into the day's shift19:57
* acoles so feeling lazy19:58
*** annegentle has joined #openstack-swift19:59
*** annegentle has quit IRC20:10
*** PurpleJack has quit IRC20:11
claygso i like X-Backend-Ec-Archive-Index20:13
peluseit likes you too20:13
claygacoles: I think there's a notion maybe that if we pick something more like "Node-Index" that we could reuse it on other diskfiles - just that we're maybe not gaining much by making it ec specific - but i'm not really sure how long the concept will directly map to the ring ordering of nodes20:14
acolesclayg: ok well maybe its node-index, at least thats 'generic', but backend-frag-index is both ec specific and inconsistent20:16
clayglol20:17
*** Guest_ has joined #openstack-swift20:20
*** rdaly2 has quit IRC20:22
claygAnyway, i'm not sure how much sense it would make to try and change x-object-sysmeta-ec-archive-index - so we're really just talking about what header the receiver should use to get the value that it hands over to get_diskfile_from_hash20:23
*** rdaly2_ has joined #openstack-swift20:24
claygI think x-backend-fragment-index may be close to ideal, i'm not sure exactly what the schema would be for a backend diskfile that could potentially return the wrong data for a given hash unless it is (like ec) chopping the data into fragments somehow - at least fragment-index would match the diskfile kwarg20:26
claygoh except there we used frag_index :'(20:27
claygpeluse: hates variable names longer than four characters, if it were up to him and yuan it'd be all fi's and ic's everywhere20:27
acolesclayg: heh. is that how we got a variable named 't' in test_reconstructor20:28
clayglol, so as not to confuse it with a ts20:28
claygi'm sure i've named timestamps t1 and t220:28
pelusewhat wrong with "t"?20:28
acolesclayg: yup. so self.ts() *does* return me a tombstone file right? :P20:29
claygi think the only thing wrong with that t was how the methods that close over that scope were defined before the variable was initialized - i meant to fix that :\20:29
acolespeluse: try searching for where its declared :)20:29
acolespeluse: sorry no offence meant i'm sure i use i and x all over the place20:31
claygobviously ->  /\<t\> =20:32
claygacoles: ok, so you have a diff for me?20:32
acolesalmost there running tests20:33
claygi'm still torn on x-backend-node-index20:33
acolesi can hardly type 'diff' !20:33
*** Guest_ has quit IRC20:33
*** HenryG has quit IRC20:33
*** aerwin has quit IRC20:33
claygacoles: what was bad about x-backend-fragment-index?  or data-fragment-index or object-fragment-index or diskfile-fragment-index or something like this?20:33
acolesclayg: delegate up to notmyname for a PTL decision :P20:33
clayghe's AWOL20:34
*** rdaly2_ has quit IRC20:34
*** joeljwright has quit IRC20:34
notmynameI'm here20:34
acoles.20:35
claygsweet!20:35
claygwe have one of those hard computer science problems20:35
notmynamenaming things?20:35
*** welldannit has quit IRC20:36
claygnotmyname: ssync needs to send the receiver a hint as to what value it should send into the diskfile manager's frag_index kwarg20:36
peluseand nobody likes the name I used...20:36
* peluse thinks same story different day20:36
claygit's a common code path for replication and ec - but obviously the replicated diskfile ignores the kwarg and ssync receiver defaults to sending in None20:36
claygbut in the EC case the current named header is "X-Backend-Ssync-Frag-Index"20:37
clayg... which I sorta think is not so terrible looking at it again20:37
acolesso i'm being all OCD but fragment != archive (the sysmeta header) so just thought it should be consistent with the sysmeta thats all20:37
*** HenryG has joined #openstack-swift20:37
acolesits no big deal lets leave and move on...20:37
clayganyway on the PUT path the proxy sends this value to the object node as X-Object-Sysmeta-EC-Archive-Index20:38
peluseclayg acoles, so I posted a change to patch 169989 to address the quorum size but it probably sucks.20:38
patchbotpeluse: https://review.openstack.org/#/c/169989/20:38
claygtorgomatic: ^20:38
acolespeluse: ok i'll take a look in a mo just getting this diff wrapped for clayg20:38
claygpeluse: well but how come tests weren't failing?20:39
notmynameclayg: ya, seems odd that PUT and ssync send different headers for the same thing20:40
claygnotmyname: but the X-Object-Sysmeta prefix seems to make sense for the EC specific diskfile, the object server doesn't really look at it - it just passes sysmeta down to the diskfile and it does its thing20:40
notmynameya, sysmeta is for storing. x-backend- is for inter-cluster communication (timestamps, etc)20:41
claygnotmyname: things are complicated by the fact that there's a coupling of fragment index and the node's offset in the ring (fragment index mostly == primary_node['index'] from the ring)20:41
claygand then there's the fact that diskfile has decided to call the kwarg frag_index - which maybe mostly doesn't matter - it's only the ec specific diskfile that cares about it and like I said the object server doesn't *currently* parse X-Object-Sysmeta-EC-Archive-Index; but it probably will at some point, then *all* diskfiles will receive *some* value for this kwarg, even if it's just a default "None" or something like that - a20:44
peluseclayg, I'm looking at that next but suspect with 4+2 EC the quorum for repl being used would be 4 so we'd always get enough.  have to go pick up daughter from school in a few, be back later20:44
claygpeluse: ok, well I'd be inclined to fix the janky test/fake policy so that it's actually a good test for ec20:45
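[Editor's sketch] A minimal illustration of the hand-off being discussed: the receiver reads a backend header and turns it into the value for the frag_index kwarg, defaulting to None for the replicated case. The helper name and the dict-based request headers are assumptions for illustration only; this is not Swift's actual ssync receiver code.

```python
def frag_index_from_headers(headers,
                            header_name='X-Backend-Ssync-Frag-Index'):
    """Return the fragment index hinted by the sender, or None.

    Hypothetical helper: header names are matched case-insensitively,
    and a missing or empty header means "no hint" (replicated policy).
    """
    wanted = header_name.lower()
    value = None
    for key, val in headers.items():
        if key.lower() == wanted:
            value = val
    if value is None or value == '':
        return None
    return int(value)
```

The returned value would then be passed as frag_index to whatever diskfile lookup the receiver performs; a replicated diskfile can ignore it.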
acolesclayg: ok here it is https://gist.github.com/alistairncoles/0b8bddd06474f023ec2120:47
acolesclayg: thats a diff against patch 170339 the ec recon :/ because i had to fix up those tests20:48
patchbotacoles: https://review.openstack.org/#/c/170339/20:48
acolesclayg: so idk maybe you can split across that review and the per-policy-diskfile20:49
acolesclayg: sorry, i just realised i was working on top of chain for probe testing etc so that diff won't apply to patch 16998720:49
patchbotacoles: https://review.openstack.org/#/c/169987/20:49
acolesclayg: i undid your reclaim_age thing i'm afraid - the single .data should not be set in results['.data']20:50
acoleswhen its not being reclaimed20:50
*** rdaly2 has joined #openstack-swift20:56
*** rdaly2 has quit IRC21:02
*** ozialien has joined #openstack-swift21:04
*** annegentle has joined #openstack-swift21:09
claygacoles: I don't get it21:14
acolesclayg: which bit21:14
acoles?21:14
claygacoles: so if I apply that on the end of the chain - which bits do i have to fix up to get all the diskfile changes in one review?21:15
claygoh, you got rid of "reclaim_rules"21:16
claygthat's fine by me I think21:16
acolesoh yeah sorry, i didn't get rid of reclaim_age ! oops21:16
claygacoles: I'm guessing you have noticed I hate variables named timestamp that are actually strings?21:17
acolesclayg: i think its just the test_reconstructor changes that need to go at end of chain and the rest would go on the per-policy diskfile21:17
acolesclayg: awww, but i did avoid a dangling elif just for you :P21:18
claygjust the other day I was writing some code and the dangling elif seemed like the most obvious correct way to write it - i was so proud of myself - and then I looked at it and realized it was clearer to break it into if; if else21:19
notmynameacoles: you know, I was writing some code this weekend and had a dangling elif and removed it because I thought "what would clayg do"21:19
notmynameclayg: you've trained us all :-)21:19
* clayg is glad to be rubbing off on folks21:19
claygacoles: oh i see the elif case statement code21:20
claygsee - that's where I reach for a map!21:20
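[Editor's sketch] What "reaching for a map" instead of an elif chain can look like. The extensions and labels below are assumptions chosen for illustration and are not the code from acoles's gist; both functions are hypothetical.

```python
def describe_with_elif(ext):
    # The "case statement" style: one branch per file extension.
    if ext == '.data':
        return 'object data'
    elif ext == '.ts':
        return 'tombstone'
    elif ext == '.meta':
        return 'metadata'
    elif ext == '.durable':
        return 'durable marker'
    else:
        return 'unknown'


# The same decision expressed as data: a dict lookup with a default.
EXT_DESCRIPTIONS = {
    '.data': 'object data',
    '.ts': 'tombstone',
    '.meta': 'metadata',
    '.durable': 'durable marker',
}


def describe_with_map(ext):
    return EXT_DESCRIPTIONS.get(ext, 'unknown')
```

The map version keeps the branching in one table, so adding a case is a one-line change and the fallback is explicit.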
acolesclayg: do you want me to split that diff into two? for each review?21:21
*** proteusguy has joined #openstack-swift21:21
*** aix has joined #openstack-swift21:21
claygno i can split it up - i'm still trying to decide if the increasingly long justification for why that code stinks is really the best that we want to strive for21:21
*** haomaiw__ has quit IRC21:23
acolesclayg: i know, it sucks. let me see how much grief i get just going the whole way.21:23
*** gyee has quit IRC21:32
*** G________ has joined #openstack-swift21:34
*** G________ has quit IRC21:43
*** lpabon has quit IRC21:44
*** MVenesio has joined #openstack-swift21:51
*** Fin1te has quit IRC21:51
*** welldannit has joined #openstack-swift21:51
*** jkugel has quit IRC21:52
*** PurpleJack has joined #openstack-swift21:56
claygnotmyname: so you were supposed to tell me what to name the x-backend-ssync-frag-index21:57
notmynameheh21:57
notmynameI thought you and acoles figured it out :-)21:57
notmynameclayg: I like x-backend-* for ssync because it's a control falg for the process21:58
notmyname*flag21:58
*** jkremer has joined #openstack-swift21:58
notmynamerather than sysmeta because that's what's stored21:58
notmynamedoes that make sense or answer the question?21:58
notmynameseemed to me that the question was about x-backend- vs x-sysmeta-21:59
*** ozialien has quit IRC21:59
*** PurpleJack has quit IRC22:00
peluseclayg, build and process_jobs().... wow22:02
acolespeluse just left a comment on the review re your gist22:07
peluseacoles, it is, I just used the policy property because it read better I thought22:10
acolespeluse: k22:11
acolespeluse: and i guess the controller doesn't have a 'policy' attribute? cos of the problems with COPY etc22:11
pelusenope22:13
pelusethe quorum stuff got a little twisted after the introduction of the EC obj controller22:14
peluseclayg, it doesn't look like the ec recon is propagating a .durable if the .data exists (i.e. node A has both, node B only has .data; run EC recon and node B is unchanged, however if node B starts with no data it ends up with all the right stuff)22:15
peluseclayg, but that was a manual test, I can write a probe for it to make sure.  This is key to our use of "minimum of 2 durables written"22:17
*** petertr7 has quit IRC22:21
*** itlinux_ has joined #openstack-swift22:28
itlinux_hi all22:29
*** annegentle has quit IRC22:29
*** cschwede has quit IRC22:30
*** MVenesio has quit IRC22:30
itlinux_quick question I want to ask. The hash on the ring, which is calculated at the creation of the ring, shows a size. If we are trying to push a new object which may be bigger than that size, does the object fall into two locations then?22:30
mattoliverauMorning, wow that was a lot of scroll back to read.. Y'all have been busy22:30
mattoliverauitlinux_: what do you mean by size, part power?22:32
*** cschwede has joined #openstack-swift22:33
acolesmattoliverau: good morning!22:34
mattoliverauhey acoles, you're still up! Good evening to you sir.22:35
acolesmattoliverau: leave it 25 mins and you can say good morning ;)22:35
*** gyee has joined #openstack-swift22:36
mattoliverauacoles: Right so getting late, thanks for your dedication! clayg and yourself are machines :p22:38
acolesclayg: ok done it, fart removed! https://gist.github.com/alistairncoles/f54eff6d179af735c414  same as before it applies to patch 170339, hopefully everything but test_reconstructor changes can be applied to per-policy diskfiles patch22:43
patchbotacoles: https://review.openstack.org/#/c/170339/22:44
peluseclayg, here's a propagate durable probe test that fails currently. https://gist.github.com/peluse/a38f8fb516425ef2a90522:44
claygwhoot whot!22:44
acolesclayg: its a bit of a hack - some of the tests could probably do with per-policy mixins but lets not go churning those22:44
acoleswhoa, did i say mixin ? :/22:45
claygpeluse: wait - so why does it fail?  :'(22:45
peluseclayg, I dunno yet, wanted to make sure I wasn't seeing things first.  If you want to take a quick look at the probe test I can dig in but not for another two hours (but then the rest of the night)22:46
*** jkremer has quit IRC22:46
claygpeluse: yeah idk, i wouldn't expect that test to pass if removing the .durable makes the object node 404?22:48
peluseclayg, but running the Ec recon before the next GET should push the .durable over so the next get doesn't 40422:50
peluseright?22:50
claygoh yes of course22:50
acolespeluse: does it fail with just a single missing durable?22:51
claygacoles: i think all of the test scenarios devolve to only a single missing durable in all cases22:51
peluseyeah, that's odd22:51
peluseoh, its a direct get22:52
peluseone sec...22:52
peluseOK, so the direct get was the reason for the 404, need to change the test to look for a .durable on the node where it was deleted before the run of the EC recon, not to do a get.22:54
claygpeluse: are you sure - you sorta convinced me it was correct22:56
pelusewell, if we do a direct get and there's no .durable then the obj server should never give us back the data however a proxy get should because we're only killing one .durable so it can still decode22:57
claygbut the direct_get is after the reconstructor?22:58
acolesyeah test looks good to me22:58
pelusewell, yeah OK22:58
peluseman, I'm just in a fog these daze22:58
claygpeluse: acoles: so fwiw, torgomatic updated the fake ec policy/ring on the proxy PUT path tests and after cleaning up a bunch of unrelated test churn that was being too opinionated about the size and number of things - it seemed like the proxy was already doing the right thing :\23:00
torgomaticyeah, this is confusing as hell23:00
* torgomatic is running at about 5 WTF/min23:00
claygtorgomatic: well even if it turns out to be correct we probably want to make it less confusing :'(23:00
torgomaticclayg: yeah, I'm trying that now23:00
*** zhill has joined #openstack-swift23:01
claygpeluse: you don't invalidate the hashes :\23:01
peluseahhh23:02
peluseyes, the test I stole from was killing the pkl file.  Note that I did this manually first and I was doing that.  One sec...23:02
torgomatichalf the confusingness is probably because we're half-done refactoring BaseObjectController, so we've got this ReplicatedObjectController that's got like 2% of the code for replicated objects, then BaseObjectController has the other 98% but ECObjectController overrides the methods to hide it23:04
peluseOK, removing the hashes.pkl as well and it still fails.  Will update the test on the gist23:04
peluseOK, updated.  I wonder if the .durable isn't being reflected properly in the hashes23:06
acolestorgomatic: so according to my distilled piece of _get_put_responses http://paste.openstack.org/show/203778/ we *always* call _have_adequate_successes(statuses, min_responses) and i'm wondering if that turns out to be just as strict or stricter than the quorum test23:07
torgomaticacoles: right, so that call to _have_adequate_successes uses min_responses, which has the correct quorum value (policy.quorum)23:09
claygpeluse: ok, so I'm thinking the get_suffix_delta is only comparing the fragment indexes and not the None's23:10
claygi'm guessing if the suffixes made their way into ssync we'd get the right behavior23:10
peluseOK, looks like that's OK.  I think we're missing it in get_suffix_delta because we're not passing in None23:10
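[Editor's sketch] One way to picture the suspected gap: if each suffix's hashes are keyed by fragment index plus a None key, and the comparison only looks at the fragment-index key, a difference under None is never noticed. The data layout (in particular that the None entry is what reflects .durable files) and the function below are assumptions made for illustration; this is not the real get_suffix_delta.

```python
def suffixes_needing_sync(local, remote, frag_index, include_none_key):
    """Return suffixes whose hashes differ for the compared keys.

    Hypothetical stand-in: `local` and `remote` map
    suffix -> {frag_index_or_None: hash}.
    """
    keys = [frag_index]
    if include_none_key:
        keys.append(None)
    out = []
    for suffix, local_hashes in sorted(local.items()):
        remote_hashes = remote.get(suffix, {})
        if any(local_hashes.get(k) != remote_hashes.get(k) for k in keys):
            out.append(suffix)
    return out


# Node A has .data and .durable; node B has the same .data, no .durable.
node_a = {'abc': {2: 'hash-of-data', None: 'hash-with-durable'}}
node_b = {'abc': {2: 'hash-of-data', None: 'hash-without-durable'}}
```

With include_none_key=False the suffix looks in sync and would never reach ssync; with True it is reported, which matches the behaviour clayg and peluse expect.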
torgomaticand then I guess the subsequent call to self.have_quorum will want a majority, but we've already got one because policy.quorum is bigger than a majority23:10
claygyay policy.quorum!23:10
peluseyeah, that :)23:10
peluseBTW, I just confirmed that (clayg)23:10
torgomaticunless you do something asshatted like run a 10+20 scheme, and then we have trouble23:10
claygpeluse: you're one step ahead as usual!23:11
pelusesheeeeit23:11
peluseusually 3 steps behind is more like it!23:11
claygtorgomatic: what's wrong with a 10+20 schema!  it's like even better than replication!23:11
peluseOK, I have to do kid errands.  I can post a gist fix when I get back if it's not already done (if I don't see any more chatter I'll do it)23:11
claygpeluse: so you have a diff you want me to look at?23:11
peluseclayg, BTW the EC recon job changes are really fantastic!23:12
torgomaticclayg: well, at a minimum, our code will happily write down only 11 fragment archives, then blow up horribly when it can't write 16 .durable files ;)23:12
claygpeluse: thanks for saying23:12
peluseclayg, not yet, I just hacked in some prints to confirm what was going on.  don't have a solution yet23:12
claygpeluse: perfect, no problem23:12
torgomaticHOWEVER23:12
claygpeluse: i think i'll start with a unittest - thanks for writing up that probetest so quick - ttyl23:13
torgomaticif your schema is not ridiculously over-parity-bitted, then our stuff works23:13
acolestorgomatic: hmm, but _have_adequate_successes = True just breaks out of waiting for responses early, could still be <quorum good responses23:13
torgomaticacoles: yes, but with sane numbers of things, you need M+1 things written to pass _have_adequate_successes(), and that's more than majority(M+K)23:14
torgomaticit only falls over when K>M23:14
*** chlong has joined #openstack-swift23:14
torgomaticpossibly K > M + 223:14
torgomaticso your 10+4 scheme is just fine; your 4+10 is broken23:15
torgomatic(occasionally and only if massive-but-not-*too*-massive failure occurs)23:15
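[Editor's sketch] The arithmetic behind torgomatic's point, under two assumptions that are inferred from the chat rather than quoted from Swift: the EC quorum is M + 1 fragments, and the replication-style majority is n // 2 + 1 (consistent with the 10+20 example above: 11 fragment archives versus 16 .durable files). All function names are hypothetical.

```python
def ec_quorum(ndata):
    # Assumption: policy.quorum for an M+K scheme is M + 1.
    return ndata + 1


def majority(n):
    # Assumption: replication-style quorum is a simple majority.
    return n // 2 + 1


def ec_quorum_covers_majority(ndata, nparity):
    """True if meeting the EC quorum also satisfies the majority check."""
    return ec_quorum(ndata) >= majority(ndata + nparity)


for m, k in [(10, 4), (4, 10), (10, 20)]:
    print('%d+%d: ec_quorum=%d majority=%d covered=%s' % (
        m, k, ec_quorum(m), majority(m + k),
        ec_quorum_covers_majority(m, k)))
```

Under these assumptions a 10+4 scheme is covered, while 4+10 and 10+20 are not; the check first fails when K reaches M + 2, which is in the range torgomatic estimates.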
acolestorgomatic: ok i'm sure your brain is operating better than mine right now23:17
claygpoor acoles :'(23:17
acolestorgomatic: if not always in fact23:17
acolesclayg: so where are we at? i'm wondering if i can go sleep? did you see the fart-removing-diff back up^^?23:18
claygacoles: yes i love it!  you tap out man - it's all going to be beautiful in the morning!23:19
notmynamethanks acoles!23:19
acolesok well good luck guys, i may catch you later i guess depending on how it goes :p23:21
*** fanyaohong has joined #openstack-swift23:21
*** jamielennox|away is now known as jamielennox23:22
torgomaticif the fart-removing diff is delayed, is it flatu-late?23:22
acolestorgomatic: lol23:22
torgomaticI can make fart jokes again now that it's after Easter23:23
* torgomatic had given them up for flatu-Lent23:23
clayg^ acoles this is what happens if you feed it23:24
acolestorgomatic: is that like when you loan someone a bad tire - flat-u-lent23:24
mattoliveraulol23:24
torgomaticacoles: exactly!23:24
acolesdid is pell tire right?23:25
acolesspell23:25
torgomaticacoles: somewhere23:25
claygthe ring has tiers - and it's round23:25
acoleslike a wedding cake23:25
*** acoles is now known as acoles_away23:28
claygpeluse: ok i have a failing unittest - we got this one23:29
* notmyname is about to drive home. will be on full time after23:30
*** itlinux_ has quit IRC23:35
*** kota_ has joined #openstack-swift23:42
kota_morning23:42
*** PurpleJack has joined #openstack-swift23:44
*** zhill has quit IRC23:49
*** PurpleJack has quit IRC23:49
claygkota_: good morning!23:50
*** ho has joined #openstack-swift23:52
mattoliveraukota_: morning23:54
hogood morning!23:54
mattoliverauho: morning23:56
homattoliverau: morning!23:57
kota_clayg, mattoliverau: good morning :)23:59
*** km has joined #openstack-swift23:59
