Wednesday, 2018-02-21

usr2033i have a problem with the policy.v3cloudsample.json file. Can anyone help?07:54
openstackgerritMerged openstack/keystone master: Remove v2.0 policies
lbragstadusr2033 are you still having some issues with policy.v3cloudsample?14:17
lbragstadcmurphy i know i asked you this already, but you have an idea of what you want to go over for the cross-project application credentials session, yeah?16:19
cmurphylbragstad: um not really actually16:20
cmurphyi guess i imagined a bit of q&a and a bit of "what would you like to see"16:21
cmurphyfine-grained access control is the obvious thing16:21
cmurphyit's not that critical of a session now that the base feature is there16:22
lbragstadcmurphy ok - so we can just stage it for a short q&a type thing and if it evolves into something else that's fine16:23
lbragstadattempting to flesh it out here -
knikollayeah, i think we should spend some time talking about fine-grained access. maybe it could be tied to the discussion about rbac and default roles.16:26
lbragstadthey are closely related16:28
lbragstadfor some reason, our discussions always seem to come full circle lol16:29
knikollamore like spiral, cause after every revolution we get a little closer.16:29
lbragstadpsh - no wonder i'm so dizzy all the time16:30
mnaseri'm trying to troubleshoot an issue that comes up from time to time in puppet-openstack ci .. "This is not a recognized Fernet token"16:30
mnasertempest tests all run with no problems... but from time to time, a request will be accepted by nova for a new server, then it will try to contact neutron to get list of security groups, but neutron responds with a 40116:31
mnaserand in the neutron logs, keystone says the token ain't good, and keystone logs show the 40116:31
mnaser"TokenNotFound: This is not a recognized Fernet token gAAAAABajZJB2pixOrz1RPc_RATriy4CLp1abIDZMI8i9tYNCHmibVCOQIWjGv9r71lFNI2auP1qhb5pDn9ZrUP8f9BpoayI1l6hVO3avfNTQEWnS4xrpDgRjUQFZRmJtTMppawUzkEdYfapFJHlrtKlTgLHSSsHRwS-ca9Ofg8M5WEPdqBx8m0=" .. any idea what could be causing this?16:31
lbragstadmnaser we raise that exception in one place16:31
lbragstadwhich is handling an InvalidToken exception from the library that actually does the encryption/decryption bits for us16:32
mnaserlbragstad: ok i see it here indeed
lbragstadwhich means, if that exception is getting thrown, it could be that a key is missing or the token was tampered with in such a way that the cryptography library can't make sense of it16:33
mnaserit's interesting you say this16:33
mnaseri saw something in the syslogs16:33
lbragstadyep - that's the spot16:33
mnaser(in around the same time frame-ish)16:33
mnaserah it might be unrelated16:33
mnaser"UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 in position 33: ordinal not in range(128)"16:33
mnaserthrown by nova processes inside oslo_log16:34
lbragstadah - ok16:34
lbragstadis disk utilization fine on the host?16:34
mnaser11% USED16:34
lbragstadyou're not running out of disk space in the middle of a rotation, then16:34
mnaserOH HOLD ON16:35
mnaseroops caps16:35
mnaser5 6 716:35
mnaseri wonder if maybe something is rotating keys...16:35
mnaserthe job takes less than an hour to run16:35
lbragstadmnaser how long has this host been up?16:35
mnaserand we have a 40 minute token expiration in the gate16:35
lbragstadhow often is the key rotation happening?16:35
mnaser2018-02-21 15:30:02 +0000 /Stage[main]/Keystone::Cron::Fernet_rotate/Cron[keystone-manage fernet_rotate]/ensure (notice): created16:36
mnaserlet me see16:36
mnaserevery 5 minutes.16:36
lbragstadhow many keystone hosts are there?16:36
mnaserlbragstad: only 116:36
mnaserbut we have memcache in there16:36
mnaserso im gonna guess nova keeps the token cached for 40 minutes16:37
lbragstadso - tokens are valid for 40 minutes?16:37
mnaserbut rotating every 5 minutes means all tokens are invalid at 15 minutes16:37
lbragstadbut encryption keys are being rotated every 5 minutes16:37
mnaserour tempest runs last 15 minutes (barely)16:37
lbragstadyep - exactly16:37
mnaserwhich explains why we hit it sometimes and sometimes we didn't lol16:37
lbragstadyou should bump your max_active_keys setting16:37
mnaserlbragstad: max_active_keys = 5 plus rotating every 10 minutes should be good for a 40 minute token, right?16:38
mnaserso if our rotations mess up, it'll be caught in the tempest runs i guess16:39
lbragstadyeah - if rotation happens every 10 minutes, 5 keys should cover you16:39
lbragstadbut ideally, you'll want to factor in your token expiration time16:39
mnaserlbragstad: thank you so much!  this was quite a hassle in the gate with the intermittent timeouts16:40
lbragstadi think it would be the token expiration (in minutes) / the intervals of key rotation (in minutes)16:40
mnaser#thanks lbragstad for helping troubleshoot an intermittent fernet token validation failure in puppet gates16:40
openstackstatusmnaser: Added your thanks to Thanks page (
lbragstadanytime mnaser :)16:41
mnaserlbragstad: agreed but i would +1 on interval too16:41
lbragstadyeah - the extra buffer can't hurt16:41
mnaserbecause cronjobs dont really run at the start of token allocation16:41
mnaserso if you're running at :10 and :20 you might have your token expire if you got it exactly a minute before cronjob or so16:41
lbragstadand keystone-manage fernet_rotate will obviously keep the disk clean once you reach the max_active_keys limit16:41
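
[Editor's sketch] The key-count arithmetic discussed above (token expiration divided by rotation interval, plus a buffer) can be written out as below. The function name and the default buffer of two keys are assumptions for illustration. The reasoning behind the buffer: of N keys in the repository, one is the staged key and one is the current primary, so a token issued just before a rotation only stays decryptable for roughly (N - 2) rotation intervals.

```python
import math


def min_active_keys(token_lifetime_min, rotation_interval_min, buffer_keys=2):
    """Estimate how many Fernet keys to keep so tokens outlive key purging.

    Hypothetical helper, not part of keystone. buffer_keys=2 is an
    assumption covering the staged key and the current primary key.
    """
    if token_lifetime_min <= 0 or rotation_interval_min <= 0:
        raise ValueError("both durations must be positive")
    # Number of rotations that can happen while one token is still valid.
    rotations_per_lifetime = math.ceil(token_lifetime_min / rotation_interval_min)
    return rotations_per_lifetime + buffer_keys


print(min_active_keys(40, 10))  # 40 min tokens, rotate every 10 min -> 6
print(min_active_keys(40, 5))   # 40 min tokens, rotate every 5 min -> 10
```

Under these assumptions the 10-minute schedule comes out at 6 keys, one more than the 5 floated in the conversation, which matches the point that extra buffer can't hurt.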
* mnaser wonders why clients dont retry once at least when they get a 40116:42
mnaserif token_not_valid: grab_new_token -> retry else -> fail .. should fix all of those weird caching issues etc16:42
mnaserbut oh well16:42
mnaserlbragstad: its always funny when you see old things you've said still make sense now16:46
mnaserlbragstad: i find this happens a lot when i go over old code .. thinking about how to do it in some better way and finding out that's how i did it in the first place after going over it lol16:46
kmalloc.... man my hands hurt. stupid sudden cold with high humidity.17:14
kmallocmnaser: clients SHOULD retry ;)17:16
kmallocbut many people do the naive implementation.17:16
kmallocand explode on failure.17:16
kmalloceven reasonable/valid failure modes that justify a retry17:17
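
[Editor's sketch] The retry-once-on-401 idea described above (if the token is rejected, grab a new token and retry, otherwise fail) might look roughly like this. Every name here (`Unauthorized`, `call_with_reauth`, and the two fake helpers) is hypothetical and stands in for whatever a real client uses; no actual OpenStack client API is shown.

```python
class Unauthorized(Exception):
    """Stand-in for an HTTP 401 from a service (hypothetical class)."""


def call_with_reauth(request, get_token, max_retries=1):
    """Call request(token); on a 401, fetch a fresh token and retry."""
    token = get_token(force_new=False)  # may come from a local cache
    for attempt in range(max_retries + 1):
        try:
            return request(token)
        except Unauthorized:
            if attempt == max_retries:
                raise  # still rejected with a fresh token: give up
            token = get_token(force_new=True)  # bypass the cache


def fake_get_token(force_new):
    # Simulates a cache that holds a token keystone no longer recognizes.
    return "fresh-token" if force_new else "stale-token"


def fake_request(token):
    # Simulates a service that rejects the stale token with a 401.
    if token == "stale-token":
        raise Unauthorized("401: token not recognized")
    return "200 OK using " + token


print(call_with_reauth(fake_request, fake_get_token))
```

Limiting the retry count keeps a genuinely unauthorized caller from looping forever; only the stale-cache case is papered over.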
dikonoorcmurphy: Hi . Would you be able to take a look at when you get a chance ?17:27
openstackLaunchpad bug 1750843 in OpenStack Global Requirements "pysaml2 version in global requirements must be updated to 4.5.0" [Undecided,New]17:27
eEbxHey guys, I would like to ask you if anyone of you has relevant keystone benchmark tests. I would like to know if 200ms to get/validate token is ok.17:28
dikonoorcmurphy: Defects need inputs from someone who knows Keystone Federation17:28
dikonoorand how it uses the pysaml2 apis17:28
dikonoorlbragstad: If you or cmurphy could take a look, that would be great..
openstackLaunchpad bug 1750843 in OpenStack Global Requirements "pysaml2 version in global requirements must be updated to 4.5.0" [Undecided,New]17:29
openstackgerritLance Bragstad proposed openstack/keystone master: Update 3.10 versioning to reflect system scope changes
lbragstad^ another thing we'll probably have to backport18:01
lbragstadi swear that was noted in the implementation but apparently not18:02
lbragstadknikolla you all do k2k right?18:11
lbragstadeEbx the answer depends on how you have keystone configured18:12
lbragstadit can vary depending on how things are setup18:12
knikollalbragstad: yes, but not on production yet.18:13
lbragstadknikolla ok - so do you have keystone setup as an idp somewhere?18:13
lbragstadand you authenticate to it for saml assertions that you give to the service provider keystone?18:14
knikollaThere’s an api call for keystone that gives back signed saml18:14
eEbxlbragstad: two keystone servers with nginx load balancer, db is a 5 node galera cluster18:14
knikollaYou send that to sp keystone’s shibboleth18:14
lbragstadknikolla the saml assertion is only generated from information in keystone, right? there isn't a way for someone to authenticate for a saml assert and provide some extra XML to inject into the assertion is there?18:15
lbragstadeEbx do you have caching configured?18:15
lbragstador memcache servers that are configured to work with keystone?18:15
knikollaNo, it’s a get call with no params18:15
lbragstadknikolla ack - thank you18:15
lbragstadknikolla i'm going to paraphrase you in
openstackLaunchpad bug 1750843 in OpenStack Global Requirements "pysaml2 version in global requirements must be updated to 4.5.0" [Undecided,New]18:16
eEbxyes I have memcache servers configured18:16
knikollaWhat info are you looking to put in there?18:16
lbragstadi'm not, but there appears to be a security issue with pysaml218:16
lbragstadspecifically when a user has the ability to pass data to the thing that generates the assertions18:16
lbragstadwhich doesn't sound like it affects us18:16
kmallocoh, fun18:16
kmalloclet me take a look at that18:16
knikollaWe don’t parse xml, shibboleth/mellon does that for us18:17
knikollaWe merely generate and sign it18:18
kmallocthat shouldn't ever affect us18:18
kmallocfor the sake of forward-looking safety18:18
kmallocwe should update18:18
kmalloci can't believe people use assert for anything outside of testing/non-critical errors18:19
kmallocexpect that an assert won't fire before using it.18:19
kmallocwe probably should evaluate assert usages in keystone (we might have some lingering ones that are similar)18:19
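
[Editor's sketch] The concern about `assert` above comes from the fact that Python removes assert statements when run with `-O`, so a check written as an assert can silently vanish. A minimal illustration, with hypothetical function and exception names that are not taken from pysaml2 or keystone:

```python
class SignatureError(Exception):
    """Hypothetical error type for a failed validation."""


def check_with_assert(is_valid):
    # Under `python -O` this line is compiled away, so an invalid
    # input would fall through and be accepted.
    assert is_valid, "signature check failed"
    return "accepted"


def check_explicit(is_valid):
    # An explicit raise survives every optimization level.
    if not is_valid:
        raise SignatureError("signature check failed")
    return "accepted"


print(check_explicit(True))
try:
    check_explicit(False)
except SignatureError as exc:
    print("rejected:", exc)
```

Auditing for this pattern means looking for asserts that guard anything other than test expectations or internal invariants.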
lbragstadok - updated with a comment18:21
lbragstadeEbx do you know if you're caching tokens?18:22
kmalloclbragstad: haha i just commented too on it :P18:22
kmalloc200ms seems a little slow (eEbx) but i haven't done recent testing.18:22
lbragstadeEbx 200 ms is on par if you're generating the token (without caching) on every request18:23
lbragstadwithout knowing what hardware you're running on, i would expect utilization of memcache to drastically improve that18:23
kmallocit also depends on the load of the DB (is it used for other applications? if so, what is the general io latency on it for lookups)18:25
kmallocalso, what is the concurrency of token issuance / validation18:26
cmurphyglad to know i wasn't wildly off base :)18:36
lbragstadfantastic response time18:40
* lbragstad steps away for lunch18:40
lbragstadcc gagehugo ^20:12
gagehugolbragstad I like the experimental note20:13
lbragstadfigured we should add the limit stuff in there, too20:13
lbragstadi'll propose a backport20:13
lbragstaddone -
lbragstadin case folks haven't seen it yet -
lbragstadit looks like the PTG feedback session is going to be at the same time we were planning on having our retrospective20:41
lbragstadgame night is also on thursday20:41
lbragstadfyi - i was thinking about bringing some games - would anyone be interested?20:41
cmurphyi would game with y'all21:06
lbragstadthere's gonna be so much to do on thursday21:11
lbragstadbut i can bring the resistance, dutch blitz, and exploding kittens21:13
* lbragstad double checks the game cabinet21:20
lbragstadyeah - those are the travel friendly games i have21:21
gagehugoI think I have the oregon trail card game as well21:25
lbragstadoh - that one is fun21:26
lbragstadnostalgia in a deck of cards21:26
mnaserlbragstad: anyone say nostalgia? :P21:40
mnaserbut also i've been digging in my emails/launchpad to find the bug regarding admin-ness with v3 domains.. anyone know where that one is tracked or has a link around?21:41
lbragstadmnaser hah - like the v3.samplepolicy bug?21:45
mnaserlbragstad: yes, i think it ended up being marked as a dup of another one21:45
lbragstadi think i know which one you're talking about21:46
lbragstadkmalloc do you want to kick this through ?22:05
lbragstadmnaser it's not this one, is it?22:06
mnaserlbragstad: are you talking about the changes above ^ ?22:08
lbragstadmnaser sorry - forgot to paste22:08
openstackLaunchpad bug 1630434 in OpenStack Identity (keystone) "policy.v3cloudsample.json doesn't allow domain admin list role assignments on project" [Medium,Triaged]22:08
mnaserlbragstad: oh yeah something similar, the one i had reported had a whole bunch of discussion if i remember22:09
mnaseri cant find it.. i have no idea why22:09
openstackLaunchpad bug 968696 in OpenStack Identity (keystone) "duplicate for #1684320 "admin"-ness not properly scoped" [High,In progress] - Assigned to Adam Young (ayoung)22:10
mnaserahh yes combined with
openstackLaunchpad bug 968696 in OpenStack Identity (keystone) ""admin"-ness not properly scoped" [High,In progress] - Assigned to Adam Young (ayoung)22:10
lbragstadmnaser this is the one you reported -
openstackLaunchpad bug 968696 in OpenStack Identity (keystone) "duplicate for #1684320 "admin"-ness not properly scoped" [High,In progress] - Assigned to Adam Young (ayoung)22:12
mnaserah yes22:13
lbragstadaha - yep - just missed that link22:13
mnaseri guess when it's marked as duplicate it disappears22:13
lbragstadsearch queries in lp have a toggle for it22:13
mnaserlbragstad: maybe it would be nice as a ptg topic to follow up on this (perhaps an openstack-wide goal..)22:14
lbragstadand finally -
lbragstadmnaser we have a session dedicated to it on tuesday morning22:16
mnaserlbragstad: oh cool i'll try to be there22:18
gagehugo confuses me a bit23:08
openstackLaunchpad bug 1735250 in OpenStack Identity (keystone) queens "Password column limit (128 char) in the Password table exceeded when using passwords exceeding 2000 characters" [High,Confirmed]23:08
lbragstadgagehugo that's because we hash the passwords23:14
lbragstadso when you pass keystone a password over 2k, the hash will exceed the limit of the password hash table23:14
gagehugolbragstad I don't understand why password.expression would ever have the non-hashed version23:14
gagehugobut that is likely sqla wizardry that I don't completely understand23:14
lbragstadit does have something to do with how hybrid_property works23:16
lbragstadkmalloc and i were discussing that in irc one day23:16
kmallocbtw, that is a documented limitation (iirc) in the password system23:17
kmallocbecause of issues with how the password column works23:17
kmalloclet me read the bug.23:17
kmallocbut... it's wonky23:17
lbragstadit's been a while since i've dug into that23:17
lbragstadgotta run to an appt quick, i'll be on later though23:17
kmallocoh wait. i think there WAS a bug on this23:18
kmallocand we fixed it.23:18
kmallocah the silly password.Password23:19
kmallocoh we didn't fix this.23:19
kmalloci think the solution is just deleting the @password.expression23:20
kmalloci can roll up some code for that... today or tomorrow23:21
kmallocbut like i said, i think the fix is just dropping @password.expression def.23:22
