Saturday, 2017-09-09

*** baojg has quit IRC00:03
*** baojg has joined #openstack-swift00:06
*** geaaru has quit IRC00:08
notmynamemattoliverau: \o/00:43
mattoliveraunotmyname: only thing that would be awesome for me at the PTG would be a US power cable so I don't have to use this converter ;) one with a IEC C5 Female (or mickey-mouse) or whatever people call it so I can plug it into my dell power adaptor ;) So if anyone happens to have one laying around an office IIEC C5 Female can steal, that would be nice, but it isn't really required :)00:55
mattoliverau* i can steal, somehow a paste got in there.. I'd blame the travel, but haven't left yet :P00:57
*** aagrawal has joined #openstack-swift01:06
*** abhinavtechie has quit IRC01:10
notmynamemattoliverau: the one with the 3 barrels thing? https://ae01.alicdn.com/kf/HTB1Re6EHVXXXXbKXFXXq6xXFXXXU/-font-b-IEC-b-font-320-font-b-C14-b-font-Male-font-b-C5.jpg01:33
notmynameoh like the top one here http://sc02.alicdn.com/kf/HTB1mAmgMVXXXXclXpXXq6xXFXXXq/IEC-FEMALE-C13-TO-C6-CLOVER-MALE.jpg01:33
notmynameI'm not sure what I'm looking at, but I'm pretty sure I don't have it https://www.google.com/search?q=IEC+C5+Female&client=safari&rls=en&source=lnms&tbm=isch&sa=X&ved=0ahUKEwiyjc3V-pbWAhUY8WMKHRrYClwQ_AUICygC&biw=1386&bih=762#imgrc=eeYadaU41T4SlM:01:35
mattoliverauYeah that 3 barrel one, like in the last link01:36
*** itlinux has quit IRC01:37
*** mat128 has joined #openstack-swift01:38
*** itlinux has joined #openstack-swift02:06
*** mat128 has quit IRC02:11
*** mat128 has joined #openstack-swift03:09
openstackgerritPete Zaitcev proposed openstack/swift master: PUT+POST and its development test  https://review.openstack.org/42791103:13
*** mat128 has quit IRC03:42
*** baojg has quit IRC03:48
*** baojg has joined #openstack-swift03:50
*** aagrawal has quit IRC03:50
*** klrmn has quit IRC04:35
*** mat128 has joined #openstack-swift04:41
*** mat128 has quit IRC05:13
*** baojg has quit IRC05:47
*** mat128 has joined #openstack-swift06:11
*** silor has joined #openstack-swift06:40
*** mat128 has quit IRC06:43
*** rcernin has joined #openstack-swift06:48
*** mat128 has joined #openstack-swift07:41
*** baojg has joined #openstack-swift07:51
*** psachin has joined #openstack-swift08:07
*** mat128 has quit IRC08:13
*** baojg has quit IRC08:50
*** baojg has joined #openstack-swift08:54
*** mat128 has joined #openstack-swift09:10
*** baojg has quit IRC09:21
*** geaaru has joined #openstack-swift09:39
*** mat128 has quit IRC09:44
*** mat128 has joined #openstack-swift10:45
*** silor has quit IRC11:11
*** mat128 has quit IRC11:13
*** psachin has quit IRC12:12
*** mat128 has joined #openstack-swift12:12
*** mat128 has quit IRC12:44
*** rcernin has quit IRC12:45
*** baojg has joined #openstack-swift12:58
*** baojg has quit IRC13:01
*** baojg has joined #openstack-swift13:05
*** silor has joined #openstack-swift14:04
*** mat128 has joined #openstack-swift14:07
*** silor has quit IRC14:13
openstackgerritzhangyangyang proposed openstack/swift master: Replace deprecated .assertRaisesRegexp()  https://review.openstack.org/50224314:16
*** mat128 has quit IRC14:37
*** mat128 has joined #openstack-swift14:39
*** mat128 has quit IRC14:39
*** baojg has quit IRC14:44
*** baojg has joined #openstack-swift14:45
*** baojg has quit IRC14:45
*** catintheroof has joined #openstack-swift15:25
*** catintheroof has quit IRC15:29
*** mat128 has joined #openstack-swift15:35
*** catintheroof has joined #openstack-swift15:55
*** catintheroof has quit IRC15:57
*** catintheroof has joined #openstack-swift15:58
*** baojg has joined #openstack-swift16:20
*** baojg has quit IRC16:25
*** itlinux has quit IRC16:59
*** silor has joined #openstack-swift17:16
*** baojg has joined #openstack-swift17:23
*** baojg has quit IRC17:28
*** itlinux has joined #openstack-swift17:49
*** psachin has joined #openstack-swift17:54
*** psachin has quit IRC18:01
*** catintheroof has quit IRC18:12
*** d0ugal has quit IRC18:21
*** d0ugal has joined #openstack-swift18:24
*** silor has quit IRC18:34
*** mat128 has quit IRC18:44
*** klrmn has joined #openstack-swift18:58
*** vint_bra has quit IRC19:14
*** itlinux has quit IRC20:25
*** catintheroof has joined #openstack-swift20:41
*** catintheroof has quit IRC20:45
*** eckesicle has joined #openstack-swift21:40
eckesicleim experiencing some degraded swift performance that I cant figure out the cause of21:41
eckesicleone of our keystone servers went down, and after coming back up, we're seeing a strange issue21:41
eckesiclehttps://www.irccloud.com/pastebin/puGHNeQB/21:42
eckesiclethat error appears over and over again21:42
eckesiclebut!21:42
eckesicleonly sometimes21:42
eckesicleif i try to upload/download the file with the same credentials a couple of times21:42
eckesicleit'll work like 3/5 times21:42
eckesicleit's also only occurring for some accounts21:43
eckesicle85% of the accounts are okay.21:43
eckesicleif we restart keystone21:43
eckesiclewe see the same behaviour21:43
eckesicleexcept21:43
eckesiclethe 15% of broken accounts, have now changed to be another 15% (seemingly random accounts)21:44
eckesicleany ideas?21:44
eckesicleah wrong paste21:46
eckesiclethis is the correct error message21:46
eckesiclehttps://www.irccloud.com/pastebin/D4E7m96Q/21:46
notmynameeckesicle: (just guessing here) sounds like it might be a cache issue. do you have multiple proxy servers? are they configured to have the same common memcache pool? (they should)21:53
eckesicleyeah they are21:54
notmynamebut you might have it configured that the proxies are using different memcache pools or even only using localhost. which means some cached tokens are now invalid21:54
eckesiclei could restart memcached?21:54
eckesiclefor debug purposes i have disabled all but one swift-proxy21:54
eckesicleand the error is still occurring21:54
notmynamerestarting memcache will be a cache flush. I'm not sure what the impact for your cluster will be. could be minimal.21:55
eckesicleit'll be fine, usage is minimal at 2300 on a saturday21:55
eckesiclethat didnt solve it :(21:56
notmynamecan we isolate the issue any further? you said you tried to redownload with the same credentials. is that like a client request that's doing the auth dance again? or do you mean you're using curl and already have the token and the token sometimes works and sometimes doesn't?21:57
eckesiclewell i can simulate a client request21:57
eckesicleby just reloading the /health endpoint (that will connect to swift, upload a heartbeat file, and download the same file)21:57
eckesicleso if i go to a client that is currently down21:57
eckesicleand reload that page over and over21:58
eckesicleit'll work like 20% of the time21:58
eckesiclebut then after a minute or so21:58
eckesicleit'll work 100% of the time21:58
notmynamewhat client are you using?21:58
eckesicle(but another client will go down)21:58
eckesiclefog21:58
eckesicleand this is what it looks like in the logs when it fails:21:59
eckesiclehttps://www.irccloud.com/pastebin/r9LbR6pe/21:59
notmynamewhat's between the swift cluster and the client? load balancer? any cache? CDN? TLS terminator?22:00
eckesiclenothing at all22:00
eckesiclewe use a custom dns to route properly but that's it22:01
notmynameclient box talks directly to the swift proxy server process?22:01
eckesicle[pipeline:main]22:02
eckesiclepipeline = catch_errors gatekeeper healthcheck proxy-logging cache container_sync bulk tempurl ratelimit authtoken keystoneauth container-quotas account-quotas slo dlo proxy-logging proxy-server22:02
eckesiclethat's the pipeline22:02
notmynamenext step I'd do is use the swift CLI to get an auth token. eg `swift auth -v`22:02
notmynamemake a note of the auth token returned and then use curl to do requests to the given storage URL22:03
notmynameor maybe to the specific IP of the proxy server you're testing22:03
notmynamesee if that works or fails22:03
eckesicleok22:04
notmynameeg curl command: `curl -i -H "x-auth-token: yourtoken" http://1.2.3.4/v1/youraccount/`22:05
notmynamewhat are you using for TLS termination?22:06
eckesiclegot it!22:07
eckesiclethis cluster runs completely on an internal network22:08
notmynamewhat was it?22:08
eckesicleso there is no TLS22:08
notmynameok22:08
eckesicleexport OS_STORAGE_URL=http://swift-proxy.service.binet:8080/v1/AUTH_fe48f5d760a14d1a9c398d95275235bf22:08
eckesicleexport OS_AUTH_TOKEN=xxx22:08
notmynameyou got it working now?22:09
eckesiclethat works22:09
notmynameok, good22:09
notmynameso now that you have the token itself, use curl to make requests to the proxy server22:10
notmyname(the reason I like using curl is because it doesn't hide anything or do extra requests for you. makes debugging much easier)22:10
notmyname`curl -i -H "x-auth-token: xxx" http://swift-proxy.service.binet:8080/v1/AUTH_fe48f5d760a14d1a9c398d95275235bf` in your case22:12
notmynamethat does a GET on the account and returns a list of the containers in the account22:12
eckesicleyeah22:13
notmynameif you have a bunch of containers, then `curl -I -XHEAD ...` will just do the HEAD request and return the account metadata in the response headers22:13
eckesicleit does get me a list of the containers22:13
notmynamecool22:13
notmynamedoes it work every time?22:13
eckesiclebut i actually cant see22:13
eckesicleoh right22:13
eckesicleno i cant see the container i expect to see22:15
notmynameTBH I don't care if you can see what you expect or not (yet). I care that you don't get 401s22:15
eckesicleok22:15
eckesiclewell im getting 401s now22:15
eckesicleand now it works again22:16
eckesicle(running that same command)22:16
notmynamebig picture, here's where my thinking is going: I want to see if the issue is between swift and keystone with validation of tokens or if the issue is a client to keystone issue where maybe keystone is giving tokens that don't get validated22:16
notmynamehmm22:16
notmynameok22:16
notmynameso repeatedly using the same token you are sometimes getting 200 and sometimes getting 40122:16
notmynameand it flops back and forth22:16
notmynameright?22:17
eckesicleyes22:17
notmynamethat would imply the problem is more likely to be between swift and keystone22:18
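[Editor's sketch] The check described above (reuse one token, repeat the same request, see whether the status flips between 200 and 401) can be scripted instead of rerunning curl by hand. This is a minimal sketch using only the Python standard library. The function names `fetch_status` and `tally_statuses` are hypothetical, and the URL and token in the usage comment are placeholders, not values from this cluster.

```python
from collections import Counter
from typing import Callable
import urllib.error
import urllib.request


def fetch_status(url: str, token: str, timeout: float = 5.0) -> int:
    """Send one HEAD request with the given token and return the status code."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"X-Auth-Token": token}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is what we want to count.
        return err.code


def tally_statuses(fetch: Callable[[], int], attempts: int) -> Counter:
    """Call fetch() `attempts` times and count how often each status appears."""
    return Counter(fetch() for _ in range(attempts))


# Usage (placeholders, adjust to your own storage URL and token):
#   url = "http://1.2.3.4/v1/youraccount/"
#   print(tally_statuses(lambda: fetch_status(url, "yourtoken"), 20))
```

A tally with both success codes and 401s for the same token points away from the client's auth handshake, since the token never changes between attempts.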
notmynameI don't know a whole lot about keystone internals. are you using uuid tokens?22:18
notmynameor fernet tokens?22:18
eckesiclei dont know the answer to that22:19
eckesiclethat's in the keystone config?22:19
notmynameyes, I'd guess22:19
notmynamefernet tokens are long IIRC. uuid tokens are 32 bytes22:19
eckesicleuuid then22:20
notmynameok22:20
eckesicle32 chars hex22:20
notmynameyeah22:20
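[Editor's sketch] The rule of thumb used here (a 32-character hex string is a uuid token, a much longer string is probably fernet) can be written as a small heuristic. The function name `guess_token_format` is hypothetical. The `gAAAAA` prefix check is an assumption based on how Fernet tokens are base64-encoded, and the authoritative answer is still the token provider set in the keystone config.

```python
import re

# A uuid token as discussed above: exactly 32 hexadecimal characters.
_UUID_TOKEN = re.compile(r"[0-9a-f]{32}", re.IGNORECASE)


def guess_token_format(token: str) -> str:
    """Guess the token format from its shape alone (heuristic, not proof)."""
    if _UUID_TOKEN.fullmatch(token):
        return "uuid"
    # Assumption: Fernet payloads start with a version byte followed by zero
    # bytes of the timestamp, which base64-encodes to a leading "gAAAAA".
    if len(token) > 32 and token.startswith("gAAAAA"):
        return "fernet"
    return "unknown"
```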
notmynameso with the curl requests, when you get a 401, the log messages are the same as the ones you pasted above?22:20
eckesicleHTTP/1.1 401 Unauthorized22:21
eckesicleContent-Length: 13122:21
eckesicleContent-Type: text/html; charset=UTF-822:21
eckesicleWww-Authenticate: Swift realm="AUTH_fe48f5d760a14d1a9c398d95275235bf"22:21
eckesicleWWW-Authenticate: Keystone uri='http://keystone.service.binet:5000'22:21
eckesicleX-Trans-Id: tx3ee4698a23314989aa548-0059b4684422:21
eckesicleDate: Sat, 09 Sep 2017 22:16:36 GMT22:21
eckesicle<html><h1>Unauthorized</h1><p>This server could not verify that you are authorized to access the document you requested.</p></html>root@sld-stor2:/etc/swift/proxy-server# curl -i -H "x-auth-token:22:21
eckesicleah sorry i should've pastebinned that22:21
notmynameno worries22:21
notmynamewhat about the log lines. the "Auth token not in the request header" is the one from earlier that looks weird to me22:22
notmynameare you still seeing that?22:22
eckesicleyes, but i think i know what that is22:22
notmynameoh, and if you grep all of your logs for the x-trans-id, you'll see everything in the swift cluster for that request (tx3ee4698a23314989aa548-0059b46844 in this example)22:22
eckesicleI think that's the health check, just making a http request22:22
notmynameoh22:22
eckesiclehttps://www.irccloud.com/pastebin/gEmVZsu9/22:23
eckesiclethere's a pastebin of the grep22:23
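[Editor's sketch] Grepping for the transaction ID amounts to keeping every log line that contains the `X-Trans-Id` value from the failing response. A minimal Python equivalent is below. The function name `lines_for_transaction` is hypothetical, and the log path in the usage comment is an assumption, since where Swift logs end up depends on the syslog setup.

```python
from typing import Iterable, Iterator


def lines_for_transaction(lines: Iterable[str], trans_id: str) -> Iterator[str]:
    """Yield every log line that mentions the given transaction ID."""
    for line in lines:
        if trans_id in line:
            yield line.rstrip("\n")


# Usage (the log path is an assumption, adjust to your syslog configuration):
#   with open("/var/log/syslog") as log:
#       for line in lines_for_transaction(log, "tx3ee4698a23314989aa548-0059b46844"):
#           print(line)
```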
eckesiclewho validates the token?22:24
eckesicleis it the proxy or the object/container/account server?22:24
notmynameyou implied you had multiple proxy servers. can you turn on another without adding it to the dns/load balancer (ie to be able to change the config without end users using it)22:25
notmynameauth happens in the proxy server22:25
notmynameso if the auth fails, requests don't even go to the storage nodes (account, container, object)22:25
eckesiclei cant easily turn it on now unfortunately22:26
notmynameno worries22:26
notmynameI was hoping you could set a proxy server to use debug-level logging22:26
eckesicleoh i can just restart this one22:26
eckesiclein debug22:26
notmynamethere's a bunch of logs that the keystone integration spits out that might be handy22:26
notmynameok, put it in debug then, redo the curl request until you get a 401, then grep for the transaction id22:27
eckesicledo you know the command by heart?22:27
eckesiclein the config i mean22:27
notmynamewhich command?22:27
notmynameoh, to set it to debug?22:27
notmynamelog_level=DEBUG IIRC22:27
eckesicleyeah thats it22:28
notmynamehttps://github.com/openstack/swift/blob/master/etc/proxy-server.conf-sample#L4422:28
eckesiclehttps://www.irccloud.com/pastebin/NF6aqZVD/22:29
eckesicleokay22:29
eckesicleso i set it to debug22:29
eckesicleran a few curls until it 40122:30
eckesiclethen grepped with context for that txn22:30
eckesicle'cached token marked as unauthorized' looks interesting22:31
notmynameyeah22:31
notmynamelooking22:31
notmynameeckesicle: this started when you upgraded or otherwise changed keystone, right?22:34
eckesicleserver crashed22:34
eckesicleand we booted it up again22:34
notmynameyikes22:35
notmynamein the proxy server config, under the [filter:authtoken] section, do you have cache set to swift.cache? https://github.com/openstack/swift/blob/master/etc/proxy-server.conf-sample#L38422:37
eckesicleno22:37
notmynameif not, I think you should22:37
eckesicledelay_autth_decision = true22:37
eckesicledelay_auth_decision = true22:37
notmynamegood22:38
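[Editor's sketch] Three proxy-server.conf settings have come up so far: `log_level` in `[DEFAULT]`, plus `cache` and `delay_auth_decision` in `[filter:authtoken]`. Assuming the file parses as plain INI, they can be read back with the standard-library `configparser` to confirm what is actually on disk. The function name `summarize_proxy_conf` is hypothetical. This only reports values and makes no claim about which ones are correct for a given cluster.

```python
import configparser


def summarize_proxy_conf(text: str) -> dict:
    """Report the settings discussed above from proxy-server.conf contents."""
    # Interpolation is disabled so a literal '%' in a value cannot break parsing.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    return {
        # None means the option is unset and the service default applies.
        "log_level": parser.defaults().get("log_level"),
        "authtoken_cache": parser.get(
            "filter:authtoken", "cache", fallback=None
        ),
        "delay_auth_decision": parser.get(
            "filter:authtoken", "delay_auth_decision", fallback=None
        ),
    }


# Usage (path is an assumption):
#   with open("/etc/swift/proxy-server.conf") as conf:
#       print(summarize_proxy_conf(conf.read()))
```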
eckesicleokay22:39
eckesicleim setting that22:39
eckesiclesame problem22:39
notmynamehmm22:39
notmynamethat was from https://github.com/openstack/keystonemiddleware/blob/master/keystonemiddleware/auth_token/__init__.py#L203-L21122:40
eckesiclei wonder if someone made a breaking change to the config22:41
eckesiclesaved it but didnt reload keystone22:41
eckesicleand then when the reboot happened it loaded up an invalid config22:42
notmynameI don't think you need to reload keystone if you just update the swift proxy config22:42
notmynameok, so far it seems like the proxy server gets a token from the end user, the keystone middleware looks for the token in its cache, gets an invalid response from the cache, and then returns 401 to the user22:42
eckesicleit's cached in memcached right?22:44
notmynamehmm... maybe setting cache to swift.cache could hide/fix the problem, but it makes me wonder why it was working before22:45
notmynameIIRC the swift.cache was the best to use a long time ago, but that was related to some scale concerns in keystone's memcache client (which IIRC have been resolved)22:45
notmynameso, taking a quick glance through unfamiliar code (the keystone middleware project), it seems that keystone is looking up the token in memcache22:46
eckesicleso i did set cache = swift.cache22:46
eckesicleand restarted every swift service22:46
notmynameand so perhaps something with the keystone memcache pool or settings was changed with the keystone upgrade22:46
eckesiclebut it didnt fix it22:46
notmynameyeah. that may be a red herring22:47
notmynameor, alternatively, it might be some other keystone config related to memcache that is a common problem there22:47
eckesicle[filter:authtoken]22:48
eckesiclepaste.filter_factory = keystonemiddleware.auth_token:filter_factory22:48
eckesicleidentity_uri = http://keystone-admin.service.binet:3535722:48
eckesicleauth_uri = http://keystone.service.binet:500022:48
eckesiclewhat does paste.filter_factory do here?22:48
notmynamethere's not any chance you've got pre-upgrade versions of the config file to compare against, do you? a company I used to work for had all the config files checked into a version control system for that22:48
notmynameoh, the filter_factory identifies the code module that will be used22:48
notmynamebasically, it's how the plugin system works22:49
eckesiclei have all configs in ansible22:49
eckesiclei was just browsing that22:49
eckesiclebut i checked the config files, they have not been touched since 2015-12-1222:50
notmynameI'm looking at https://github.com/openstack/keystonemiddleware/blob/master/keystonemiddleware/auth_token/__init__.py#L948-L95722:50
eckesiclei wonder if maybe the auto update has broken something22:50
notmynameagain, it's an unfamiliar codebase, and I'm not sure where, if anywhere, keystone has sample config files22:51
eckesiclesome package that we depend on couldve been updated22:51
notmynamebut those lines make it look like there's some settable options22:51
eckesiclewe did something wrong ... now the number of affected clients have gone from 15% to 60%22:51
notmynameyikes!22:52
notmynameok, unset the cache thing (that's the only thing we changed)22:52
notmynameoh, debug logging, but that shouldn't affect things22:53
eckesicleyeah22:53
eckesicleill see if it fixes things22:53
eckesiclei wonder if it's a load thing?22:53
eckesiclelike the cache lookup times out every now and then or something22:53
notmynamemaybe22:53
eckesiclethe load is quite high on the server22:54
eckesicle3 on a 6 core22:54
notmynameon the keystone server or the proxy server?22:54
notmynameoh, are they the same box?22:54
eckesicleyeah22:54
notmynameso the other proxy servers were talking to this box too?22:55
eckesicleive taken those down22:55
eckesicleright now (for debugging)22:55
notmynameright22:55
eckesiclethere is only one keystone and one swift proxy22:55
notmynameya22:55
eckesiclebut a lot of storage nodes22:56
eckesicleno one uses this service out of office horus22:56
eckesicleno one uses this service out of office hours22:56
eckesicleso i dont have to worry about bringing things down22:56
notmynamebut the load is high right now?22:56
eckesiclewe're back down to 15% now22:56
notmynameok22:56
eckesicleyeah the load is high22:56
notmynamehow many requests per second are you handling right now? roughly22:57
eckesicle100 or so22:57
notmynameok22:57
eckesiclei just paused all the health checks22:57
eckesicleokay, we're down to zero requests per second :)22:58
eckesiclelet's see if the load goes down22:58
eckesicleand if the curl still fails22:58
eckesiclewell the load is down to <122:59
eckesicleand the 401s are still showing22:59
notmynamehmm22:59
notmynameI wonder if there's something in the memcache config? did that get updated?23:00
notmynameare there knobs there that can increase capacity?23:00
notmynamealso, I'm getting to the end of my knowledge at this point23:02
eckesicle2015-10-3023:02
eckesicleyeah23:02
eckesiclethanks so much for your help23:02
notmynamemight be worth it to see if there's anyone in the #openstack-keystone channel23:02
eckesicle(if i knew your name and address id send you a bottle of gin or something)23:02
notmynamedo you have other openstack services using this keystone instance?23:02
notmynamelol23:02
eckesicleor a gift card :)23:03
eckesiclepriv me your amazon username and ill send you a little thanks23:04
eckesicleno, it's only swift23:05
*** alenavolk has joined #openstack-swift23:38
