Wednesday, 2016-10-19

kota_good morning00:49
mattoliveraukota_: morning00:54
kota_mattoliverau: morning00:54
*** tqtran has quit IRC01:00
charzkota_: mattoliverau morning01:06
kota_charz: o/01:06
zhengyingood morning01:06
mattoliveraucharz, zhengyin: o/01:10
clayghehe - good morning everyone!01:11
openstackgerritClay Gerrard proposed openstack/swift: WIP: Make ECDiskFileReader check fragment metadata  https://review.openstack.org/38765501:12
*** ntata_ has joined #openstack-swift01:13
claygi think overall the test failures are trending down - managed to get an implementation for the object server quarantine that i'm satisfied with01:14
claygbut I'm sort of worrying/wondering if for backports we should *just* have the full read quarantine for the audtior - but we'll see what happens when we start to cherry pick it i 'spose01:14
*** ntata_ has quit IRC01:21
*** blair has joined #openstack-swift01:28
kota_clayg: thanks for updating that, will look at01:28
*** clu_ has quit IRC01:39
openstackgerritKazuhiro MIYAHARA proposed openstack/swift: Remove 'X-Static-Large-Object' from .meta files  https://review.openstack.org/38541202:12
*** chlong has joined #openstack-swift02:16
openstackgerritKota Tsuyuzaki proposed openstack/liberasurecode: Fix liberasurecode skipping a bunch of invalid_args tests  https://review.openstack.org/38787902:23
*** lcurtis has quit IRC02:53
*** klrmn has quit IRC03:10
*** kei_yama has quit IRC03:16
*** rjaiswal has quit IRC03:41
*** klrmn has joined #openstack-swift03:42
*** tqtran has joined #openstack-swift03:45
*** links has joined #openstack-swift03:46
*** cshastri has joined #openstack-swift03:50
*** tqtran has quit IRC03:50
*** Guest29440 has quit IRC03:53
*** klrmn has quit IRC04:01
*** trananhkma has joined #openstack-swift04:19
*** ppai has joined #openstack-swift04:34
openstackgerritTuan Luong-Anh proposed openstack/swift: Add prefix "$" for command examples  https://review.openstack.org/38835504:36
*** cshastri has quit IRC04:51
*** klrmn has joined #openstack-swift04:54
*** sure has joined #openstack-swift04:56
*** sure is now known as Guest8966804:56
Guest89668hi all, I am doing "container synchronization" in the same cluster; for that I created my "container-sync-realms.conf" file like this http://paste.openstack.org/show/586313/04:58
Guest89668I created two containers and uploaded objects to one, but those objects are not copied to the other container04:59
Guest89668please, someone help04:59
*** klrmn has quit IRC05:08
*** itlinux has quit IRC05:09
*** raginbaj- has joined #openstack-swift05:11
openstackgerritBryan Keller proposed openstack/swift: WIP: Add notification policy and transport middleware  https://review.openstack.org/38839305:12
mattoliverauGuest89668: so your container sync realms file is in /etc/swift/05:16
*** SkyRocknRoll has joined #openstack-swift05:16
Guest89668mattoliverau: yes05:17
mattoliverauGuest89668: also you can remove the clustername2 line, you only need to define each cluster once (and you are only using 1 cluster) but that shouldn't be stopping anything05:17
Guest89668mattoliverau: here is error log http://paste.openstack.org/show/586314/05:18
mattoliverauhmm, so it's timing out and then on the retry it's saying method not allowed. And it's a DELETE05:25
mattoliverauGuest89668: you have the same secret key on both containers in the sync?05:27
Guest89668mattoliverau: yes05:27
Guest89668here is my http://paste.openstack.org/show/586315/05:28
Guest89668container stats05:28
mattoliverauand just to make sure, your proxy or loadbalancer (or whatever your ip in your realms config is pointing at) is listening on port 80?05:28
mattoliveraucause thats what your realms config says05:29
Guest89668yes it is listening at port 8005:29
*** qwertyco has joined #openstack-swift05:34
mattoliverauGuest89668: is the endpoint to your cluster (that's listening on port 80) a swift proxy? a load balancer? Just trying to figure out why the request is 405'ed05:43
mattoliverauand where is container_sync on the proxy pipeline?05:43
Guest89668my swift endpoint is " http://192.168.2.187:8080/v1/AUTH_%(tenant_id)s"05:44
mattoliverauoh, so they're listening on port 8080, not port 80? or do you have a load balancer listening on 80?05:45
*** ChubYann has quit IRC05:45
Guest89668mattoliverau: no05:45
mattoliverauGuest89668: if not, try changing your endpoints in the realm to: http://192.168.2.187:8080/v1/05:45
mattoliverauGuest89668: looks like the container sync daemon is trying to update whatever is listening on port 80, maybe a webserver05:46
Guest89668mattoliverau: just now i changed and tried again05:46
Guest89668but now also same result but in log that ERROR was gone05:47
mattoliveraualso, as I mentioned before, you only need to specify a single cluster if you have a single cluster, so if you remove the second you'll have to update container metadata that points to cluster2 to point to cluster105:47
mattoliverausame result as in no objects?05:47
mattoliverauhave you waited or reran the container-sync?05:48
Guest89668yes i reran container-sync05:49
mattoliverauif you're on the container server in question (a container server that serves as a primary for the container in question) you can stop container-sync and force it to run manually with: swift-init container-sync once05:49
Guest89668here is my new relam file http://paste.openstack.org/show/586317/05:49
mattoliverauif you're using swift-init05:50
mattoliverauGuest89668: that looks right (the :8080)05:51
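[Editor's note: the paste links above have expired, so for reference a single-cluster container-sync-realms.conf along the lines being discussed might look like the sketch below; the realm and cluster names and the key are placeholders, the IP and port are the ones from the log. The two containers would then reference each other with X-Container-Sync-To values of the form //realm1/clustername1/AUTH_<account>/<container> plus a matching X-Container-Sync-Key.]

    [realm1]
    key = <shared-secret>
    cluster_clustername1 = http://192.168.2.187:8080/v1/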
Guest89668and my proxy-server.conf http://paste.openstack.org/show/586316/05:51
mattoliverauGuest89668: cool, container sync is before auth05:52
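[Editor's note: the detail being checked here is the pipeline ordering: container_sync has to appear before the auth middleware. A hedged sketch of such a proxy-server.conf pipeline; the exact set of other middlewares depends on the deployment.]

    [pipeline:main]
    pipeline = catch_errors proxy-logging cache container_sync tempurl authtoken keystoneauth proxy-logging proxy-server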
mattoliverauGuest89668: is the container-sync logging anything? it should log something, even if its just saying its running or warning about internal client using default05:53
Guest89668here is that log  http://paste.openstack.org/show/586318/05:54
*** klrmn has joined #openstack-swift05:54
*** klrmn has quit IRC05:56
mattoliverauhmm, yeah ok, thats a normal message, but means container sync is running.05:57
mattoliverauGuest89668: now that we have the ports right, how about you put another object in a container... just in case container sync thinks it's up to date05:58
mattoliveraucause it isn't erroring05:58
Guest89668mattoliverau: i deleted both the containers and created again but still same result05:59
mattoliverauGuest89668: what's your container sync interval? have you set one in the config? if not, by default it's 300 seconds05:59
mattoliverauor 5 mins06:00
mattoliverauGuest89668: and can your container server access your proxy servers (via the IP you specified)? because that's where container sync is running from06:01
Guest89668mattoliverau: i am using single node swift (proxy+storage in same node)06:03
mattoliverauoh ok06:03
Guest89668and how to set container sync intervel06:03
mattoliverauGuest89668: in your container-server config(s) there should be a section for container-sync. Under that heading you can specify an interval by adding:06:04
mattoliverauinterval = <number>06:05
mattoliverauwhile you're in there, you can turn up the logging verbosity for just the container sync daemon by adding to the same container-sync section:06:05
mattoliveraulog_level = DEBUG06:05
mattoliverauthen restart the container sync daemon06:06
mattoliverauand hopefully it'll log more and it might tell us what's going on06:06
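[Editor's note: the container-server.conf change being described would look roughly like this; 300 seconds is the default interval.]

    [container-sync]
    interval = 300
    log_level = DEBUG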
*** trananhkma has quit IRC06:07
*** trananhkma has joined #openstack-swift06:07
*** trananhkma has quit IRC06:08
*** trananhkma has joined #openstack-swift06:08
*** rcernin has joined #openstack-swift06:11
Guest89668mattoliverau: i added both parameters but same result06:12
Guest89668i didnt find any extra log06:13
mattoliverauGuest89668: it seems that container-sync isn't finding objects to sync.06:18
mattoliverauhmm weird, what could we be missing06:22
claygbusy busy06:31
Guest89668mattoliverau: then how to debug this06:33
mattoliverauGuest89668: any logs matching the time the container sync ran in the proxy server logs (the other side of the container sync transaction)>06:35
mattoliverau?06:35
Guest89668mattoliverau: just now i uploaded one object to the container1 here is the log http://paste.openstack.org/show/586321/06:38
*** eranrom has joined #openstack-swift06:40
mattoliverauGuest89668: line 12 says you're getting a container-sync log error and a 404 not found06:40
mattoliverauso double check your container sync paths, and make sure you can access the proxy at the ip you specify in the realms config06:41
Guest89668the file "openstack" what i deleted06:41
*** winggundamth has quit IRC06:42
Guest89668when I first created the containers I uploaded an object called openstack06:42
mattoliverauit doesn't seem the log level change has taken effect, because you should see a lot more.06:42
Guest89668mattoliverau: after that I deleted the two containers and created them again06:43
Guest89668i have given log_level = DEBUG06:44
Guest89668it is correct06:44
Guest89668?06:44
mattoliverauyeah, and did you restart container-sync? Also I don't see your proxy log as a part of that.06:45
mattoliverauclayg: you're still up!06:45
*** winggundamth has joined #openstack-swift06:45
Guest89668mattoliverau: yes i restarted06:50
*** silor has joined #openstack-swift06:51
*** hseipp has joined #openstack-swift06:52
onovyclayg: no. i shut it down, it spiked up. after power on, it spiked down (and some time for sync of missing data). it was off for ~1 hour07:00
onovy"down" = value before shutdown, but still higher than before upgrade07:02
*** tesseract has joined #openstack-swift07:08
*** tesseract is now known as Guest9021107:09
*** qwertyco has quit IRC07:11
clayghandoffs first?07:12
claygi think there's a warning emitted if you have it turned on - but the behavior changed at some point07:13
claygonovy: 01410129dac6903ce7f486997a48e36072fa0401 first appeared in 2.7 tag07:14
*** silor has quit IRC07:18
*** rledisez has joined #openstack-swift07:24
*** joeljwright has joined #openstack-swift07:34
*** ChanServ sets mode: +v joeljwright07:34
*** trananhkma has quit IRC07:37
*** _JZ_ has quit IRC07:40
*** geaaru has joined #openstack-swift07:46
*** tqtran has joined #openstack-swift07:49
*** amoralej|off is now known as amoralej07:52
onovyclayg: # handoffs_first = False07:53
onovyso commented out default07:53
*** tqtran has quit IRC07:53
onovyclayg: btw: s/and some time for sync/after some time for sync/07:56
onovydon't understand why the rsync metric jumped up after one node shutdown. no reason to sync anything, because handoffs are used only when there is a disk failure/umount, not a whole server failure, right?07:57
claygonovy: incoming writes/deletes will go to handoff while node is down - and i think handoffs_first would spin while waiting for the node to come back up - so it could have explained the change - oh well08:06
claygonovy: I still don't understand what sort of magic you're applying to make that recon drop show up in a graph like that - that metric, and all of the rsync metrics, are dropped at the end of a cycle and overwritten by the next cycle08:07
claygIME the cycle while there is real part movement going on (rsyncs) is *much* longer than the cycle of a few suffix rehashes08:08
claygi kept losing my interesting numbers because I didn't want to spin in a tight loop collecting the same number over and over just to find an interesting edge08:09
claygnot to mention that the number only got spit out *after* the fact - so it gave me no insight into what was going on *right* now08:09
onovyclayg: but i don't have handoffs_first enabled, so i don't think it explains it08:10
claygso I only track the statsd stuff from the replicator and the finish_time08:10
onovyon the graph: every 5 minutes i GET all stores and process the json replies08:10
onovyso i don't see "edges" when the number is reset, only the values sampled every 5 minutes08:10
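[Editor's note: a rough Python sketch of the 5-minute recon polling onovy describes; the hostnames are placeholders, port 6000 is the object-server port from the log, and the JSON field names are assumptions based on what the object-replicator's recon dump usually contains (recon must be enabled in the object-server pipeline).]

    import json
    import urllib.request

    NODES = ['sdn-swift-store1.test', 'sdn-swift-store2.test']  # placeholder hosts

    for host in NODES:
        url = 'http://%s:6000/recon/replication/object' % host
        data = json.load(urllib.request.urlopen(url))
        stats = data.get('replication_stats') or {}
        # the counters are reset/overwritten each replication cycle,
        # which is why sampling like this only ever sees the latest values
        print(host,
              'rsync:', stats.get('rsync'),
              'failure:', stats.get('failure'),
              'replication_time:', data.get('replication_time'))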
claygand doesn't that give you the same number lots of times?  like even in a stable cluster with enough nodes i report my cycle time as ~20 mins for the whole cluster (it could be smaller if a reporter gets bad timing: the cycle takes 15, he reports at 14, 5 mins later it says it finished in 20, etc)08:12
claygwhen i have a rebalance going and real weight needs to move the cycle time is ... much longer ;)08:12
onovyyep, it does08:12
onovyit's not perfect metric, i know :)08:12
claygi can't really see that come through in the graphs you're sending?  is it just too scaled out?08:12
onovygraph4?08:13
claygof the spike?08:13
onovyjop, that's scaled out08:13
onovymmnt08:13
onovyjop=yes :)08:13
onovyhttps://s15.postimg.org/559djmgxn/graph5.png zoom in08:14
onovy"max" zoom https://s17.postimg.org/5e0m1ji7j/graph6.png08:15
*** rcernin has quit IRC08:16
*** rcernin has joined #openstack-swift08:16
*** rcernin has quit IRC08:18
*** rcernin has joined #openstack-swift08:19
*** rcernin has quit IRC08:19
*** rcernin has joined #openstack-swift08:19
claygok, so maybe it's not an imperfect proxy - i use a statsd metric suffix.syncs which happens around the same time as rsync getting incremented but only for primary partitions in update08:21
claygit'd be great if those rsync's were broken out by primary sync with peer or handoff sync to delete08:22
claygonovy: it doesn't make much sense to me that it would climb like that - even if a bunch of suffixes were invalid *and* also out of sync - why wouldn't one pass *fix* most of them?08:23
claygany chance some of the rsyncs are failing?  max connections limit or something?  I think a "failure" number comes out right next to rsyncs?08:23
onovyyep, many failures08:26
onovyi have connection limits in rsync to prevent overload drivers08:27
onovyhttps://s12.postimg.org/floivqq59/graph_failure.png08:27
onovybtw: we are going to change this to statsd, but i just "joined" our really old monitoring with swift using a few lines of python code :)08:28
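[Editor's note: the statsd alternative mentioned here is enabled per daemon in the [DEFAULT] section of the server configs, roughly as below; the host, port and prefix are placeholders.]

    [DEFAULT]
    log_statsd_host = 127.0.0.1
    log_statsd_port = 8125
    log_statsd_default_sample_rate = 1.0
    log_statsd_metric_prefix = swift-store1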
onovyrsyncd.conf: max connections = 64 for objects08:32
onovyfor 24 disks per server08:32
onovyand "concurrency: 4" for object-replicator08:33
*** x1fhh9zh has joined #openstack-swift08:33
onovys/drivers/of drives/08:33
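[Editor's note: the connection limit onovy describes lives in rsyncd.conf; a minimal sketch of the object module (account/container modules omitted), with paths and lock file as placeholders and the max connections value taken from the log.]

    uid = swift
    gid = swift
    log file = /var/log/rsyncd.log
    pid file = /var/run/rsyncd.pid

    [object]
    path = /srv/node
    read only = false
    max connections = 64
    lock file = /var/lock/object.lock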
*** x1fhh9zh has quit IRC08:48
*** x1fhh9zh has joined #openstack-swift08:50
Guest89668mattoliverau: my problem resolved and it is syncing objects without any issue08:51
mattoliverauGreat! What did we miss? Sorry, I was called away09:02
Guest89668mattoliverau: the actual issue was with the swift endpoint I mentioned in the realms file; after your suggestion I changed it and restarted the services, but I hadn't checked the other container to see whether the objects were copied or not09:04
Guest89668now it is working fine and syncing the objects at the time interval I set in container-server.conf09:05
mattoliverau\o/ nice work!09:06
Guest89668mattoliverau: you helped a lot to debug this issue09:06
Guest89668thanks again...!!09:07
*** links has quit IRC09:09
claygok, well at least that explains how it's able to cycle so fast09:15
claygonovy: i'm really lovin' the rsync module per disk configuration - my rsyncd.conf has a few more lines in it - but the per drive rsync connection limits are really nice09:17
claygso do that, and the statsd, and 10 million other things, and ... ;)09:17
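[Editor's note: the per-disk rsync module setup clayg is praising works by templating the module name per device in the replicator config and defining one rsyncd module per drive, so each drive gets its own connection limit. A sketch under those assumptions, with hd1/hd2 as placeholder device names and the per-drive limit picked arbitrarily:]

    # object-server.conf, [object-replicator] section
    rsync_module = {replication_ip}::object_{device}

    # rsyncd.conf
    [object_hd1]
    path = /srv/node
    read only = false
    max connections = 4
    lock file = /var/lock/object_hd1.lock

    [object_hd2]
    path = /srv/node
    read only = false
    max connections = 4
    lock file = /var/lock/object_hd2.lock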
claygwtg mattoliverau and Guest89668 !!!!09:18
claygwooooo!!!09:18
onovyclayg: we are using salt for deploy, so i can generate rsyncd.conf automagically09:23
onovybut need to fix this first use and finish upgrade first :]09:24
patchbotError: Spurious "]".  You may want to quote your arguments with double quotes in order to prevent extra brackets from being evaluated as nested commands.09:24
onovy*first issue09:26
onovyclayg: do you think it's safe to downgrade that one swift store node?09:29
claygi'd have to look over the change log 2.5 -> 2.709:35
claygnotmyname tries to highlight stuff that can't be backed out of - and we try to avoid stuff that can't be backed out of09:36
clayg... but it's still not something folks do very much - I don't personally have a lot of experience with it09:37
claygmaybe we ~broke something with suffix hashing between versions09:38
claygi'm not sure that would explain the reboot tho - unless that many suffixes really got invalidated09:39
claygthe rsyncs could be firing and doing nothing - not really sending data (some probably still do) but - maybe - the majority of the delta is rsyncs that are finding the directories already have the same files09:40
claygi sorta remember something with fast post because of ssync and ctype timestamps - we had to fix *something* in suffix hashing09:41
claygbut I thought we decided it was backwards compatible09:41
claygonovy: do you use fast-post?  do you have .meta files in your cluster?09:42
onovy# object_post_as_copy = true10:03
onovyand no meta files10:04
onovybtw: do you have 2.7.0 in production already?10:04
*** mvk has quit IRC10:14
openstackgerritStefan Majewsky proposed openstack/swift: swift-recon-cron: do not get confused by files in /srv/node  https://review.openstack.org/38802910:14
*** zhengyin has quit IRC10:39
*** mvk has joined #openstack-swift10:43
*** Guest89668 has quit IRC10:56
onovyclayg: https://github.com/openstack/swift/commit/2d55960a221c9934680053873bf1355c4690bb19 this is that patch about 'ssync' vs. suffix hashing?11:00
onovycite: in most11:00
onovy'normal' situations the result of the hashing is the same11:00
onovyas before this patch. That avoids a storm of hash mismatches11:00
onovywhen this patch is deployed in an existing cluster.11:00
*** hseipp has quit IRC11:02
*** ppai has quit IRC11:04
*** x1fhh9zh has quit IRC11:06
onovy+ https://github.com/openstack/swift/commit/9db7391e55e069d82f780c4372ffa32ef4e79c35 this patch makes downgrades harder11:07
*** cdelatte has joined #openstack-swift11:23
*** x1fhh9zh has joined #openstack-swift11:48
*** tqtran has joined #openstack-swift11:50
*** tqtran has quit IRC11:55
*** klamath has joined #openstack-swift12:02
openstackgerritKota Tsuyuzaki proposed openstack/swift: Items to consider for ECObjectAuditor  https://review.openstack.org/38864812:03
*** links has joined #openstack-swift12:20
*** SkyRocknRoll has quit IRC12:25
openstackgerritShashi proposed openstack/python-swiftclient: Enable code coverage report in console output  https://review.openstack.org/38866912:30
kota_acoles: I updated my thought to patch 387655.12:51
patchbothttps://review.openstack.org/#/c/387655/ - swift - WIP: Make ECDiskFileReader check fragment metadata12:51
kota_and clayg:^^12:51
kota_basically the way we are going with patch 387655 seems ok.12:51
patchbothttps://review.openstack.org/#/c/387655/ - swift - WIP: Make ECDiskFileReader check fragment metadata12:51
*** amoralej is now known as amoralej|lunch12:52
kota_that one works to detect all frag archives given from admin6 as corrupted (this is awesome!)12:52
kota_but i found some corner cases we cannot detect, or where we might quarantine a good frag archive.12:53
acoleskota_: ack12:53
kota_acoles, clayg: hopefully i'm just being a worrier, but i think it can happen, so I'd like to hear your opinions on that.12:53
kota_acoles:!!12:54
acoleskota_: worriers make good reviewers !12:54
kota_sorry, i have to leave my office asap12:54
kota_the office is going to be closed.12:54
acoleskota_: just looking at yours and clayg changes12:54
admin6kota_: that sounds good :-)12:54
acoleskota_: ok have a good night leave it with me12:54
kota_acoles: thanks man, and if you make comments (either gerrit, irc, etc...), i will take a look whenever.12:55
* acoles worries kota may be locked in office all night12:55
onovyclayg: upgraded second node12:56
*** links has quit IRC12:59
*** Jeffrey4l has quit IRC12:59
*** remix_tj has quit IRC13:01
*** remix_tj has joined #openstack-swift13:01
*** jordanP has joined #openstack-swift13:04
*** StevenK has quit IRC13:09
*** StevenK has joined #openstack-swift13:15
*** mvk has quit IRC13:50
*** mvk has joined #openstack-swift13:53
*** amoralej|lunch is now known as amoralej13:57
*** vinsh has quit IRC14:07
*** silor has joined #openstack-swift14:11
onovyclayg: so new info: after the second node upgrade, rsync metrics bumped up again. and i did a test in our test env. If I have 1/2 of the nodes on 2.5.0 and 1/2 on 2.7.0, the rsync metric is higher than if i have all nodes on the same version14:11
onovyso i think there is a hash-comparison incompatibility between 2.5.0 and 2.7.014:12
onovyand it's not only about rsync metrics, the rsync cmd really is called much more14:12
onovyhttps://s14.postimg.org/tboppflq9/graph_2_nodes.png // rsync metrics graph14:14
*** jordanP has quit IRC14:15
*** x1fhh9zh has quit IRC14:18
*** hseipp has joined #openstack-swift14:24
*** sgundur has joined #openstack-swift14:28
*** jistr is now known as jistr|call14:28
*** silor1 has joined #openstack-swift14:31
*** silor has quit IRC14:32
*** silor1 is now known as silor14:32
tdasilvarledisez, acoles, onovy: what's the best practice for your clouds re object-expirer? do you typically run it on storage nodes or proxy nodes? doesn't seem like there's good consensus, so I proposed patch 38818514:37
patchbothttps://review.openstack.org/#/c/388185/ - swift - added expirer service to list14:37
*** sgundur has quit IRC14:39
tdasilvaahale: ^^^14:39
*** sgundur has joined #openstack-swift14:43
*** vinsh has joined #openstack-swift14:43
rlediseztdasilva: for now, we run it on the proxy nodes because we don't have real scaling issues with the expirer. In the rare situations where we had a problem, we just increased concurrency and it was enough for us. but i guess it depends on how many objects you have to expire. we expire between 1M and 1.5M objects every day and have had no negative feedback14:51
rledisezwould be nice to have some metrics about how many expired objects are waiting to be effectively expired14:51
rledisezquerying the containers of the special account .expired-objects (or whatever is its name)14:52
rlediseztdasilva: what are you calling storage node on your patch? object or account/container?14:53
rledisezi'm afraid that if it runs on the object servers there will be too many requests on the container servers, taking down the entire cluster (it already happened to us with a homemade process that was querying containers from the object servers)14:54
rledisezmemcache would be a requirement then14:55
*** vinsh has quit IRC14:55
*** vinsh_ has joined #openstack-swift14:55
*** vinsh has joined #openstack-swift14:56
*** klrmn has joined #openstack-swift14:58
*** vinsh_ has quit IRC15:00
*** sgundur has quit IRC15:00
*** hseipp has quit IRC15:00
hurricanerixtdasilva I am going to try and get this updated over the ocata cycle: https://review.openstack.org/#/c/252085/15:10
patchbotpatch 252085 - swift - Refactoring the expiring objects feature15:10
*** jistr|call is now known as jistr15:12
*** Guest90211 has quit IRC15:13
*** pcaruana has quit IRC15:14
tdasilvarledisez: honestly i was calling anything but a proxy a storage node. typically we don't separate aco nodes, but i understand if you guys do15:15
tdasilvahurricanerix: cool, are you planning to do that on the golang code?15:16
*** rcernin has quit IRC15:16
hurricanerixtdasilva not sure yet,  since there is already a POC mostly done, i was just going to rebase it to get it up to master and verify that it does not break anything.15:17
tdasilvahurricanerix: got it15:19
hurricanerixtdasilva i think it also needs some more documentation, like a deployment/rollback strategy, since this will likely need to be done in phases.15:19
glangetdasilva: the object expirer stuff as written can cause problems with heavy usage15:19
glangetdasilva: besides getting behind, it can result in a large number of asyncs15:19
tdasilvaglange: yeah, i remember dfg talking about that in tokyo15:19
glangetdasilva: for really heavy usage, we need a rewrite either like the one alan did or something similar15:20
tdasilvaglange: do you guys also currently run on the proxy nodes?15:20
glangetdasilva: we are only keeping up in some of our clusters because we run a hacked up version of the code15:20
glangeeach of our clusters have a few extra systems that are used for various things15:21
glangewe run the expirer there15:21
tdasilvaglange: oh, interesting, neat15:21
glangethese extra boxes do log processing and some other stuff15:21
glangewe have a few customers that heavily use that feature :/15:22
glangeit doesn't scale very well as written :)15:22
glangeand we give the developer who wrote that feature (he sits nearby) crap about it from time to time :)15:23
tdasilvaglange: hehehe15:26
acolesclayg: fyi I am working on fixing the ssync tests in patch 38765515:28
patchbothttps://review.openstack.org/#/c/387655/ - swift - WIP: Make ECDiskFileReader check fragment metadata15:28
acolesclayg: back later15:29
*** hoonetorg has quit IRC15:29
*** acoles is now known as acoles_15:29
*** sgundur has joined #openstack-swift15:36
*** jistr is now known as jistr|biab15:39
onovytdasilva: hi. we run expirer on 1-4 nodes in every region15:39
onovyi mean 1. - 4. storage nodes15:39
onovyand every dones 1/4 of expiring15:40
onovy*does15:40
onovyso: processes=4, process=0 on first storage node, =1 on second, etc.15:41
onovysame in both region. so if one region if off, we still expire objects15:42
onovyin the first version we had the expirer on all nodes with processes=0, but there were many errors in the log: an expirer was trying to delete an object which had just been deleted a few seconds before by another expirer15:42
onovy*one region is off15:43
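[Editor's note: the processes/process split onovy describes maps onto the [object-expirer] section of object-expirer.conf roughly as below (other sections of that file omitted); each of the four nodes gets the same processes value and its own process number, 0 through 3.]

    [object-expirer]
    interval = 300
    processes = 4
    process = 0    # 1 on the second node, 2 on the third, 3 on the fourth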
*** hseipp has joined #openstack-swift15:43
onovytdasilva: we have aco on same servers => storage nodes. p is separated15:43
onovyand a+c is on SSD, o on rotational disk15:43
notmynamegood morning15:45
onovy+ we have ~ x0-x00 expirations per second and x000 asyncs in the queue :)15:45
*** links has joined #openstack-swift15:46
rlediseztdasilva: fyi, we used to do pac / o, we are now moving to p / ac / o15:48
onovyrledisez: hi. what's your reason for separating ac from o pls?15:49
*** tqtran has joined #openstack-swift15:51
rledisezperformance. o are slow rotational devices while ac are fast SSD. and also number15:52
rledisezonovy: ^15:52
onovyah. we have 1 SSD per storage node and 23 rotational disks15:53
onovyac are on the one SSD, o are on the 23 rotational disks15:53
rledisezonovy: makes sense, but it would cost too much for us. we have thousands of object servers, we only need 100 or 200 SSD for ac servers15:54
onovyah, right. we have ~16 stores per region now :)15:54
notmynamerledisez: onovy: I'd definitely appreciate it if you can help update https://etherpad.openstack.org/p/BCN-ops-swift for next week15:56
onovynotmyname: but i'm not op :)15:56
onovyi will forward it to our ops15:56
*** tqtran has quit IRC15:56
rlediseznotmyname: thx for the reminder, i wrote down some topics I had in mind, will try to think more :)16:00
*** jistr|biab is now known as jistr16:05
*** admin6_ has joined #openstack-swift16:07
notmynamethanks :-)16:08
*** klrmn has quit IRC16:08
onovynotmyname: is there any deadline for that etherpad?16:09
*** admin6 has quit IRC16:10
*** admin6_ is now known as admin616:10
notmynameonovy: i put a link to the agenda item in there. that's the deadline. when the session starts16:10
*** ChubYann has joined #openstack-swift16:11
notmynamecschwede: around?16:12
openstackgerritJohn Dickinson proposed openstack/swift: use the new upper constraints infra features  https://review.openstack.org/35429116:19
*** sgundur has quit IRC16:30
*** rledisez has quit IRC16:31
*** links has quit IRC16:31
*** sgundur has joined #openstack-swift16:36
onovynotmyname: ok, thanks, forwarded :]16:37
patchbotError: Spurious "]".  You may want to quote your arguments with double quotes in order to prevent extra brackets from being evaluated as nested commands.16:37
onovyclayg: https://bugs.launchpad.net/swift/+bug/163496716:41
openstackLaunchpad bug 1634967 in OpenStack Object Storage (swift) "2.5.0 -> 2.7.0 upgrade problem with object-replicator" [Undecided,New]16:41
*** pcaruana has joined #openstack-swift16:51
claygonovy: sigh (on fast-post suffix hashing change) - i'm running out of ideas!16:51
onovyi think it must be related to suffix hashing change16:52
onovyi read whole git log 2.5.0..2.7.0 and only this seems related16:52
onovygood news is i can reproduce it in lab16:53
onovyand i think everybody can :)16:53
claygonovy: but - i'm still not sure that the spike isn't just because of 2.5 <=> 2.716:54
onovyi'm sure it's a problem with the "version hybrid" cluster16:55
claygit's not like all our >= 2.7 clusters got a 10x increase in rsync traffic and no one noticed16:55
onovyif the whole cluster has the same version (2.7 or 2.5) the problem disappears16:55
clayg*maybe* we saw the same bumps *while* upgrading but didn't notice16:55
onovylook to bug :)16:55
onovyif i have 2x 2.5.0 + 2x 2.7.0 in lab, i have big spike16:56
claygok, so ... it probably was something in suffix hashing between 2.5 and 2.716:56
onovywhen i downgrade or upgrade the whole cluster to the same version, the spike disappears16:56
onovy(after few tens of minutes)16:56
onovyyep16:56
onovyi think so16:56
onovymaybe it's just "feature", but we should document it than16:56
onovyand maybe recommend to shutdown replicator during upgrade process16:57
onovybecause it can overload cluster imho16:57
*** Jeffrey4l has joined #openstack-swift16:58
claygI'm looking @ https://review.openstack.org/#/c/267788/ - but i made a note in the review that when I had it all loaded in my head I thought the hashes would always be the same16:59
patchbotpatch 267788 - swift - Fix inconsistent suffix hashes after ssync of tomb... (MERGED)16:59
claygmaybe you could poke at the REPLICATE api with curl or do some debug logging to find out if one of your parts on 2.7 code has a different result in hashes.pkl than a 2.5 node for the same part?17:00
onovycan you try in your lab (with your config) reproduce it?17:01
onovyjust install few 2.5.0 nodes and upgrade few of them to 2.7.017:02
onovywe can confirm it's not "my setup" problem17:02
claygonovy: not this week I can't!  ;)17:02
onovy:]17:02
patchbotError: Spurious "]".  You may want to quote your arguments with double quotes in order to prevent extra brackets from being evaluated as nested commands.17:02
claygtrying to get ready for barca and fix some bugs :)17:02
onovyclayg: do you have 2.7.0 in production already btw?17:02
claygonovy: this is our latest tag -> https://github.com/swiftstack/swift/tree/ss-release-2.9.0.217:03
claygwe have lots of folks that have upgraded to 2.9, some are still on ... much older releases17:03
*** amoralej is now known as amoralej|off17:04
onovyok17:04
onovyclayg: what about: https://review.openstack.org/#/c/387591/ ?17:04
patchbotpatch 387591 - swift - Set owner of drive-audit recon cache to swift user17:04
*** klrmn has joined #openstack-swift17:05
onovyzaitcev: torgomatic: ^ can you look too pls?17:06
*** tqtran has joined #openstack-swift17:09
onovyclayg: thanks17:09
*** joeljwright has quit IRC17:10
claygonovy: do you still have a mixed environment in play - or is everything upgraded to 2.7 now?17:12
onovyclayg: in dev i have anything. in production i have 2 nodes on 2.7.0, and the others on 2.5.017:13
claygonovy: well, would you confirm/deny my suspicion about mismatched suffix hashing?  https://bugs.launchpad.net/swift/+bug/155056317:13
openstackLaunchpad bug 1550563 in OpenStack Object Storage (swift) "need a devops tool for inspecting object server hashes" [Wishlist,New]17:13
zaitcevWhat about "patch add(s)" :-)17:14
claygzaitcev: fix it17:15
onovyclayg: so i should run this? https://gist.github.com/clayg/035dc3b722b7f89cce66520dde285c9a17:15
onovyon 2.7.0 or 2.5.0 node?17:15
claygit uses the ring to talk to primary nodes about parts - so ideally you would find a partition that is on a 2.5 and 2.7 node17:16
clayghopefully you could identify such a part from the logs on the node with the high volume rsync's17:16
openstackgerritPete Zaitcev proposed openstack/swift: Set owner of drive-audit recon cache to swift user  https://review.openstack.org/38759117:16
zaitcevyour wish is my command17:17
onovyclayg: i have 4 nodes, 3 replicas and 2 nodes on 2.5.0 and 2 nodes on 2.7.017:17
onovyevery partition is on 2.5.0 and 2.7.0 node17:17
onovyclayg: really looong output17:18
onovysdn-swift-store1.test 6000 hd7-500G17:18
onovy{'9fd': '282d14b6c9f3ccc447ac1f387d9c9c60', '9fe': 'bcf1431d13d69ba1123d7504216787bb',17:18
onovysomething like this17:18
onovysdn-swift-store3.test 6000 hd3-500G '9fd': '282d14b6c9f3ccc447ac1f387d9c9c60'17:20
onovysdn-swift-store1.test 6000 hd7-500G '9fd': '282d14b6c9f3ccc447ac1f387d9c9c60'17:20
onovyso same hash... :/17:20
*** acoles_ is now known as acoles17:22
acolesclayg: onovy IDK if its relevant or helpful but we do have a direct client method to get hashes from an object server https://github.com/openstack/swift/blob/0d41b2326009c470f41f365c508e473ebdacb11c/swift/common/direct_client.py#L484-L48417:30
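[Editor's note: a rough sketch of the kind of suffix-hash comparison being put together here, using the raw REPLICATE verb against two object servers that hold the same partition rather than clayg's gist or the direct_client helper. Hosts, devices and the partition number are modelled on the log output above and are placeholders; a non-default storage policy would also need an X-Backend-Storage-Policy-Index header.]

    import pickle
    from http.client import HTTPConnection

    def get_suffix_hashes(host, port, device, partition):
        # REPLICATE returns a pickled dict mapping suffix -> md5 of that suffix dir
        conn = HTTPConnection(host, port)
        conn.request('REPLICATE', '/%s/%s' % (device, partition))
        resp = conn.getresponse()
        body = resp.read()
        conn.close()
        return pickle.loads(body)

    # placeholder primaries for the same partition: one on 2.5.0, one on 2.7.0
    old = get_suffix_hashes('sdn-swift-store1.test', 6000, 'hd7-500G', 1234)
    new = get_suffix_hashes('sdn-swift-store3.test', 6000, 'hd3-500G', 1234)

    for suffix in sorted(set(old) | set(new)):
        if old.get(suffix) != new.get(suffix):
            print('MISMATCH %s: %s != %s' % (suffix, old.get(suffix), new.get(suffix)))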
*** mvk has quit IRC17:31
onovyi'm trying to edit clay's script to compare hashes across servers, almost done17:31
onovyrunning over all partitions now...17:32
acolesk, i was just scan-reading backlog, ignore me ;)17:32
onovy:)17:33
*** sgundur has quit IRC17:39
*** sgundur has joined #openstack-swift17:40
onovyzaitcev: thanks for review and fix17:41
*** mvk has joined #openstack-swift18:02
*** klrmn1 has joined #openstack-swift18:07
*** klrmn has quit IRC18:07
openstackgerritOndřej Nový proposed openstack/swift: Fixed rysnc -> rsync typo  https://review.openstack.org/38884318:17
*** sgundur has quit IRC18:18
*** geaaru has quit IRC18:19
onovytdasilva: we are from Czech republic, not Canada :P18:35
tdasilvaonovy: i knew that, did i mis-spell something? :(18:36
tdasilvaoops, seznam.ca sorry18:36
onovy:)))18:36
tdasilvai meant cz18:36
onovyclayg: thanks for pointing to rsync_module18:40
onovyonovy@jupiter~/tmp/salt-state (rsync_module) $ git show | wc -l18:40
onovy     12418:40
onovyi love salt => ready to deploy :)18:40
*** vinsh has quit IRC19:00
*** charz has quit IRC19:08
*** mlanner has quit IRC19:09
*** hugokuo has quit IRC19:09
*** sgundur has joined #openstack-swift19:09
*** alpha_ori has quit IRC19:09
*** treyd has quit IRC19:10
*** ctennis has quit IRC19:11
*** zackmdavis has quit IRC19:12
*** charz has joined #openstack-swift19:12
*** acorwin has quit IRC19:12
*** swifterdarrell has quit IRC19:12
*** bobby2_ has quit IRC19:12
*** hugokuo has joined #openstack-swift19:12
*** timburke has quit IRC19:14
*** sgundur has quit IRC19:15
*** balajir has quit IRC19:16
*** charz has quit IRC19:17
*** treyd has joined #openstack-swift19:17
*** mlanner has joined #openstack-swift19:18
*** bobby2 has joined #openstack-swift19:18
*** swifterdarrell has joined #openstack-swift19:19
*** ChanServ sets mode: +v swifterdarrell19:19
acolesnotmyname: are we meeting today?19:19
*** balajir has joined #openstack-swift19:19
*** alpha_ori has joined #openstack-swift19:20
*** acorwin has joined #openstack-swift19:20
*** zackmdavis has joined #openstack-swift19:21
*** ctennis has joined #openstack-swift19:22
*** timburke has joined #openstack-swift19:22
*** ChanServ sets mode: +v timburke19:22
*** charz has joined #openstack-swift19:23
*** sgundur has joined #openstack-swift19:25
notmynameacoles: yes. need to go over backports and big bugs and any questions about the summit. I should have the work sessions scheduled by then19:27
acolesnotmyname: k, thanks19:28
*** joeljwright has joined #openstack-swift19:31
*** ChanServ sets mode: +v joeljwright19:32
openstackgerritAlistair Coles proposed openstack/swift: WIP: Make ECDiskFileReader check fragment metadata  https://review.openstack.org/38765519:35
*** hseipp has quit IRC19:35
acolesclayg: ^^ kota_ fixed failing ssync tests, proxy tests still to do plus kota's suggestion in dependent patch19:35
acolesback for meeting19:36
*** joeljwright has quit IRC19:37
*** acoles is now known as acoles_19:39
claygyay!19:48
*** pcaruana has quit IRC19:50
claygi *think* i understand the ssync test fixes sort of?19:52
*** joeljwright has joined #openstack-swift19:59
*** ChanServ sets mode: +v joeljwright19:59
*** silor has quit IRC20:04
*** nikivi has joined #openstack-swift20:04
*** sn0v has joined #openstack-swift20:06
*** sn0v has left #openstack-swift20:06
*** joeljwright has quit IRC20:11
*** joeljwright has joined #openstack-swift20:11
*** ChanServ sets mode: +v joeljwright20:11
*** hoonetorg has joined #openstack-swift20:19
*** nikivi has quit IRC20:21
*** chsc has joined #openstack-swift20:33
*** chsc has joined #openstack-swift20:33
openstackgerritShashirekha Gundur proposed openstack/swift: Invalidate cached tokens api  https://review.openstack.org/37031920:34
mattoliverauMorning20:36
joeljwrightmorning20:37
joeljwright:)20:37
*** sgundur has quit IRC20:52
kota_good morning20:55
kota_acoles: thanks for working on that. I had another thought overnight about a part of my concerns, will update my comment.20:59
*** acoles_ is now known as acoles20:59
notmynamemeeting time in #openstack-meeting20:59
*** sgundur has joined #openstack-swift20:59
*** mmotiani_ has joined #openstack-swift20:59
acoleskota_: we definitely need to change the exceptions as you suggested, I just didn't get time to do that21:00
*** vint_bra has joined #openstack-swift21:12
*** vint_bra has left #openstack-swift21:12
*** m_kazuhiro has joined #openstack-swift21:21
*** Jeffrey4l has quit IRC21:34
acolestdasilva: thanks for +2 on the reconstructor patch!21:42
*** m_kazuhiro has quit IRC21:44
*** mmotiani_ has quit IRC21:52
*** nikivi has joined #openstack-swift21:54
*** sgundur has quit IRC21:54
*** acoles is now known as acoles_21:57
*** nikivi has quit IRC22:14
*** klamath has quit IRC22:32
*** _JZ_ has joined #openstack-swift22:37
*** jmunsch has joined #openstack-swift22:40
*** vint_bra has joined #openstack-swift22:47
*** joeljwright has quit IRC22:48
*** vint_bra has left #openstack-swift22:48
jmunschanyone able to verify my previous messages exist?22:50
jmunschhello. How do I view the X-Delete-After and X-Delete-At metadata for an object, or where in the code should i look more specifically? i have been looking through http://git.openstack.org/cgit/openstack/deb-swift/tree/swift/obj/server.py trying to figure out how the .expiring_objects gets set, and looking to see how it gets read for GET responses.   I have looked at these related links:22:55
jmunschhttp://docs.openstack.org/developer/swift/overview_expiring_objects.html http://developer.openstack.org/api-ref/object-storage/ http://www.gossamer-threads.com/lists/openstack/dev/31872 https://blog.rackspace.com/rackspace-cloud-files-how-to-use-expiring-objects-api-functionality http://git.openstack.org/cgit/openstack/deb-swift/tree/api-ref/source/storage-object-services.inc http://git.openstack.org/cgit/openstack/deb-swif22:55
notmynamejmunsch: you're wanting to view the data on an existing object?22:56
jmunschnotmyname: the meta data22:56
jmunschFor example I have done something like this:22:57
*** vint_bra has joined #openstack-swift22:57
jmunschobject_headers.update({'X-Delete-After': '2592000'}) # 2592000 seconds = 30 days22:57
*** gyee has joined #openstack-swift22:57
jmunschOn a PUT22:58
notmynameok22:59
zaitcevguys guys guys. Where is PyECLib's upstream nowadays, https://github.com/openstack/pyeclib/ ?23:00
mattoliverauzaitcev: yup, it's a part of the OpenStack namespace now23:00
zaitcevmattoliverau: that explains why I could not find 1.3.123:01
notmynamejmunsch: ok, so what do you want to find now that you've done the PUT?23:01
jmunschnotmyname: a key value response with a GET or `swift stat|list` indicating that the created object has had the expiry set23:09
notmynamejmunsch: ok, so `swift stat <container> <object>` will show that, as will a direct HEAD or GET request to the object23:10
notmynamex-delete-after gets translated into x-delete-at as an absolute time23:11
notmynamejmunsch: eg https://gist.github.com/notmyname/3aa5f7f6d6b6e6c76e4499061df7fcc023:12
mattoliveraujmunsch: the updating of the expiring objects container is done in the proxy on a put or post. As notmyname mentioned this is also where x-delete-after is translated into x-delete-at to be stored as metadata in the object server23:16
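[Editor's note: to make the above concrete, a hedged shell example; the container and object names are placeholders and the absolute X-Delete-At value shown is only illustrative.]

    $ swift upload mycontainer report.log --header "X-Delete-After: 2592000"
    $ swift stat mycontainer report.log
          ...
          X-Delete-At: 1479510000
          ...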
mathiasbnotmyname: sorry I fell asleep and missed the meeting :/23:29
mathiasbany chance of moving the topics from working session 5 from friday to thursday, since neither me nor kota_ will be around on friday?23:29
notmynamemathiasb: yeah, I do need to adjust that. will do it over the next 24 hours23:30
mathiasb..just going over the meeting logs and saw that the issue was raised there already23:30
mathiasbthanks!23:30
*** Jeffrey4l has joined #openstack-swift23:31
mathiasbdo you know anything more about the meeting room facilities, e.g., if they have projectors to show slides?23:32
notmynamemathiasb: I don't, for sure. but I expect them to have something like that. we have had it in the past23:34
jmunschnotmyname mattoliverau : thanks so much for the help23:39

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!