18:01:35 <heckj> #startmeeting
18:01:36 <openstack> Meeting started Tue Jul 24 18:01:35 2012 UTC.  The chair is heckj. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:37 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:01:53 <heckj> Ola all!
18:02:28 <heckj> The big topic for today is the PKI code review work/descriptions/etc from ayoung
18:02:37 <heckj> Before we get there, is there any immediate issues folks have?
18:02:47 <heckj> #topic anything immediate?
18:03:37 <heckj> there's a new security bug that was alerted in today - received it, but haven't tested/verified against it.
18:03:38 <liemmn> ok
18:04:05 <ayoung> heckj, should we talking about that in a public chat room?
18:04:25 <ayoung> If it is OK to do so, please post the link
18:04:57 <heckj> I don't think it's too butal of an exploit - https://bugs.launchpad.net/keystone/+bug/1028563
18:05:18 <heckj> since it's a security vuln, you need to be explicitly subscribed to see it. I've added ayoung, termie, and dolph to it
18:05:46 <heckj> content: "Identity authentication does not check if user is enabled"
18:06:21 <ayoung> heckj, ah, ok.  yeah,  LDAP enabled is going to need a little massaging, as that field doesn't exist in the base schema.  The others are checked in the authenticate call...we should hit some of that in todays walk through
18:06:34 <heckj> cool
18:06:37 <ayoung> I can chime in on the ticket
18:06:48 <heckj> have at at your time
18:07:03 <heckj> Let's get into the main topic then
18:07:09 <heckj> #topic PKI code review
18:08:12 <ayoung> OK,  lets start with authenticate then...
18:08:28 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/service.py
18:08:57 <ayoung> line 269
18:09:00 <heckj> #link https://review.openstack.org/#/c/7754/9/keystone/service.py
18:09:21 <ayoung> This is the call for either creating a new token, or validating an existing one
18:09:41 <ayoung> on line 290,  you see the start of the big "if" that splits these two use cases
18:10:08 <ayoung> bascially, if you pass in the password credential,  any existing token is ignored.  This logic is maintained
18:10:51 <ayoung> that majority of the heavy lifting is done by self.identity_api.get_user_by_name
18:11:34 <ayoung> hmm...
18:12:05 <ayoung> no,  the heavy lifting isin the common lines that are hidden, 1 sec
18:12:23 <ayoung> anyone know how to expand those?
18:12:44 <ayoung> the real heavy lifting is done by
18:12:47 <ayoung> auth_info = self.identity_api.authenticate(context=context,
18:12:47 <ayoung> user_id=user_id,
18:12:47 <ayoung> password=password,
18:12:47 <ayoung> tenant_id=tenant_id)
18:13:00 <ayoung> line 277 prior to the patch
18:13:02 <heckj> yeah, not sure - was just mucking with the interface to see that
18:13:07 <ayoung> https://github.com/openstack/keystone/blob/master/keystone/service.py#L277
18:13:29 <ayoung> as a note,  this call needs to be refactored,
18:13:51 <ayoung> identity api builds up the data that will be sent back as the authenticate response
18:13:56 <heckj> click on "preferences", then change context to "whole file" and you'll see the whole thing side by side
18:14:40 <ayoung> heckj, thanks.  You guys with me?
18:15:03 <heckj> with you
18:15:21 <ayoung> so line 315
18:15:40 <ayoung> that is what A:  checks that uid and password is correct and builds up the data for the token
18:15:50 <ayoung> there is a little postprocessing,  but let skip ahead to
18:16:08 <ayoung> after the if statement
18:16:19 <ayoung> line 426
18:16:39 <heckj> 426?
18:16:44 <ayoung> heckj yes
18:16:53 <ayoung> I am skipping the token path for now
18:16:58 <ayoung> er
18:17:11 <ayoung> the path where an existing token is passed in
18:17:23 <ayoung> and instead followng the userid and password are passed in
18:17:35 <heckj> gimme a sec, catching up
18:17:39 <ayoung> we'll go back to the else block in a second
18:18:04 <heckj> okay - good
18:18:16 <ayoung> to sum up:  line 315 creates the data for the token , line 426 signs it
18:18:43 <heckj> what's "cms" stand for?
18:18:50 <ayoung> crypto message syntax
18:18:56 <heckj> got it
18:18:56 <ayoung> it is the format of the signed document
18:19:04 <ayoung> it is what is used for SMIME among other things
18:19:15 <ayoung> and it maps to the following command line call:
18:19:41 <ayoung> openssl cms  -sign -in auth_token.json  -nosmimecap  -signer cert.pem -inkey key.pem -outform DER -nodetach -nocerts  -noattr -out auth_token.signed
18:20:06 <ayoung> that call, and the corresponding call to verify are in
18:20:25 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/common/cms.py
18:20:55 <ayoung> note that they are done using popen.  THis is the best supported parallelisation mechanism in Eventlet (AFAICT)
18:21:15 <ayoung> so a new process is forked off,  then execs openssl ...
18:21:31 <ayoung> the output is read back into the parent process
18:21:49 <heckj> yep, with ya
18:21:55 <ayoung> so cms.cms_sign_text(json.dumps(token_data),
18:22:09 <ayoung> gets signed. on line 54
18:22:32 <ayoung> cms_to_token (called line 73) does a little postprocessing
18:22:54 <ayoung> strips off the header, footer, and replaces  / with -
18:22:58 <liemmn> ayoung, so line 375 (checking token's length) basically allows the old-fashioned token support if it's not a CMS token, correct?
18:23:01 <heckj> slaps it all together into one big string
18:23:07 <liemmn> (service.py), sorry
18:23:07 <ayoung> chops line returns to
18:23:09 <ayoung> heckj, yes
18:23:18 <ayoung> liemmn, that is correct
18:23:29 <liemmn> thx
18:23:32 <ayoung> liemmn, as does the check to see if it is disabled
18:23:35 <ayoung> line...
18:23:43 <ayoung> 422
18:23:52 <liemmn> got it
18:23:53 <ayoung> and for now default is to disable
18:24:21 <heckj> ayoung: nice, looks good
18:24:35 <ayoung> OK,  heckj so the big thing I would change, and will do so in the near future...
18:24:45 <ayoung> lets jump back to service.py
18:24:52 <ayoung> and look in that else block
18:25:21 <ayoung> line 340
18:25:34 <ayoung> we read the old token out of the Header
18:25:48 <ayoung> somewhere earlier...
18:26:12 <ayoung> and here we validate
18:26:49 <ayoung> first by checking to see if it is in the datastore....there are pros and cons to doing this,  but this is the least change approach
18:27:12 <ayoung> if it is not in the backend, we can assume "disabled" or "invalid token"
18:27:17 <ayoung> note that this is in Keystone
18:27:28 <ayoung> a remote service does not go through this code path
18:27:56 <ayoung> up to line 374 is fairly close to the old logic
18:28:24 <ayoung> and then we hit 375 which liemmn pointed out before...
18:28:39 <ayoung> UUID tokens are shorter
18:29:14 <ayoung> there is also some interspersed logic here for expiry, which we maintain
18:30:00 <ayoung> the common lines,  like the block at 394
18:30:11 <ayoung> is for building up the response to the verify call
18:30:31 <ayoung> and should be refactored into the identity api
18:30:48 <ayoung> thus, this code should be much simpler ideally
18:31:11 <liemmn> +1 for shorter methods :)
18:31:12 <ayoung> line 385 on is all really common code
18:31:35 <ayoung> OK..jump ahead to 430
18:31:56 <ayoung> regardless of password or token, if we issue a new token, once we sign it, we need to persist it
18:32:07 <ayoung> the big change is that the ID is no longer a uuid
18:32:26 <ayoung> 431: token_ref = self.token_api.create_token(
18:32:58 <ayoung> so this code is per-backend.  Lets look at the SQL one
18:33:15 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/token/backends/sql.py
18:33:30 <ayoung> line 31
18:33:36 <ayoung> we have dropped the id column
18:33:42 <ayoung> because it is confusing
18:33:52 <ayoung> well, not dropped
18:33:58 <ayoung> but it is no longer the primary key
18:34:05 <ayoung> the id *is* the signed document
18:34:16 <ayoung> way too long to be indexable by MySQL etc
18:34:26 <ayoung> buy SQL Alchemy insists on a primary key
18:34:29 <ayoung> so that is
18:34:32 <ayoung> id_hash
18:34:45 <ayoung> camn anyone guess what that is?
18:34:54 <ayoung> hint on line 65
18:35:10 <liemmn> nice
18:35:11 <heckj> quick md5 of the signed token, eh?
18:35:16 <ayoung> yep
18:35:22 <ayoung> rkukura gets props for the idea
18:35:42 <ayoung> It is short enough and unique enough for our purposes
18:35:59 <s34n> unique enough?
18:36:10 <s34n> what is the collission rate?
18:36:27 <ayoung> s34n, s34n on md5?  QUite small
18:36:30 <ayoung> and for these
18:36:39 <heckj> s34n: unique for a primary key in SQLAlchemy - it has the standard md5 hashing characteristics
18:36:40 <ayoung> because the docs are so similar,  even smaller
18:37:44 <ayoung> MD5 is sensitive to small changes in the document.  2 dos that are similar are more likely to have different MD5s than wildly differen docs.
18:37:55 <s34n> something is tickling my brain on that. Something I recently read on md5 collisions. Let me research. I'm probably wrong.
18:38:09 <ayoung> thatis, of course, a laymans understanding, and would proabably make most people that know stats annoyed
18:39:31 <ayoung> s34n, we should also be flushing the tokens after they are expired somehow.  We are not doing that now,  but once we do,  collisions should be statistically sufficiently ignorable
18:39:48 <ayoung> s34n, note that UUIDs have the same problem
18:40:15 <ayoung> OK,  before we move on to the auth_)token middlewar
18:40:16 <ayoung> e
18:40:24 <ayoung> I'd like to talk a bit about SQL migration
18:40:35 <heckj> kk
18:40:37 <ayoung> note that I had to change https://review.openstack.org/#/c/7754/9/keystone/common/sql/migrate_repo/versions/001_add_initial_tables.py
18:41:05 <ayoung> that is because if you automate the table creation for tokens,  they will be defined according to the new schema,  with the id_hash column
18:41:22 <ayoung> and 001 needs to define them the way that they are today,  with id as the pkey
18:41:41 <ayoung> hence line 38 defining the table explicitly
18:41:57 <ayoung> and dropping the import of the token into the migrate code
18:42:02 <ayoung> on line 26
18:42:24 <ayoung> then,  because we are altering a table,  it is really hard to do it right in sqlalchemy...maybe impossible
18:42:36 <ayoung> so upgrade and downgrade is done using goo-ole-SQL
18:42:54 <heckj> got it
18:43:04 <ayoung> I have different files fdor mysdql and sqlite.  I've been told that Postgres follows the sqlite
18:43:13 <ayoung> lets assume you are doign an on-the-fly upgrade
18:43:28 <ayoung> the old uuid token goes into the id and the id_hash columns...no harm there
18:43:49 <ayoung> and the old authenticate code kicks in (same thing that liemmn noted above)
18:43:59 <ayoung> only new tokens are signed and hashed for realz
18:44:09 <ayoung> on downgrade, we just dump all data
18:44:24 <heckj> ayoung: sounds good
18:44:30 <ayoung> heckj, thanks
18:44:54 <ayoung> that is why for mysql we can get away with an altertable command.  it changes the column name, anddrops the pkey, but maintains the data
18:45:01 <ayoung> for sqlite etc we do it more explicitly
18:45:34 <ayoung> OK,  brief aside on config before auth_token
18:45:47 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/config.py
18:46:05 <ayoung> line 128 is all we need
18:46:21 <ayoung> as the majoprity of the values we use we accepted in an earlier patch
18:46:49 <ayoung> opnce PKI is beat on somewhat, I'll submite a patch that flips line 129 to False
18:47:00 <ayoung> OK, any questions so far?
18:47:17 <heckj> ayoung: lookin' good so far!
18:47:18 <liemmn> looks good...  Is the default token validity 3650 days?
18:47:29 <ayoung> https://review.openstack.org/#/c/7754/9/keystone/middleware/auth_token.py
18:47:35 <ayoung> liemmn, um...let me see
18:47:42 * ayoung just closed that tab
18:47:43 <liemmn> 137
18:48:07 <heckj> liemmn: yep
18:48:13 <liemmn> That's a long time :)
18:48:26 <ayoung> yeah...that should be 1
18:48:35 <ayoung> that might be for the cert...let me check
18:48:51 <ayoung> that is not for token time out
18:48:57 <ayoung> that mechanism has not changed
18:49:01 <liemmn> oh, ok... makes sense
18:49:35 <ayoung> OK auth_token middleware, line 392
18:49:52 <ayoung> again, we gate on length
18:50:18 <ayoung> verify UUID token is the old path...on line validation
18:50:50 <ayoung> basically the red lines from line 350 to 392
18:51:06 <ayoung> moved to line 571
18:51:07 <liemmn> still wondering if there is value in caching these bigger cms tokens, since we are not incurring network cost anymore
18:51:20 <ayoung> liemmn, yes there is
18:51:32 <ayoung> if it is cached you don't have to do the fork/exec of openssl
18:52:05 <heckj> ayoung: so as long as your local cache is good, you only take the decrypt hit once
18:52:11 <ayoung> heckj, that is right
18:52:15 <liemmn> ok
18:52:32 <ayoung> line 613 is where we verify signed tokens
18:52:57 <ayoung> again, using the code in keystone/common/cms.py
18:53:26 <ayoung> bascially adds back in the header,  - to / and line breaks
18:53:33 <ayoung> then run through the openssl code
18:53:55 <ayoung> for now,  I am just assuming one ca and one signing cert
18:54:00 <ayoung> they are fetched on demand
18:54:27 <liemmn> ayoung, you have a typo error on line 628 and 631...  "this" -> "self"
18:54:33 <ayoung> see the exception blocks in line 627
18:54:42 <heckj> ayoung: so a deployment expectation is that you'd likely drop in a ca and cert on every machine running the auth_token middleware, restricted directory, and they use it direclty as needed
18:54:53 <ayoung> heckj, yes
18:55:01 <ayoung> heckj, however
18:55:07 <ayoung> you might want to keep the fetch
18:55:12 <ayoung> especially for the signing cert
18:55:30 <ayoung> as that might expire. or you might have to deal with a security breach
18:55:33 <ayoung> breech
18:55:37 <ayoung> break in
18:55:40 <heckj> heh
18:55:43 <heckj> yeah, got it
18:56:23 <ayoung> heckj, in the future, I want auth_token to allow a list of keystone servers.  We prime the pump with one
18:56:30 <ayoung> and the rest are fetched from the service catalog
18:56:48 <ayoung> then, in the signed token, it indicates "I was signed by foo"
18:56:53 <heckj> ayoung: reasonable for the larger installations
18:56:54 <ayoung> and we fetch the cert for foo
18:57:16 <ayoung> heckj, it will also allow for federation, etc
18:57:33 <ayoung> we can specify that a given signing cert can only sign for a specific domain....
18:57:40 <notmyname> heckj: just to get in under the wire, the keystone middleware was merged into swift. should be released next week in the next swift release
18:57:52 <heckj> notmyname: thxusir
18:58:10 <ayoung> OK... the rest of the patch is commentary:  tests and so forth
18:58:30 <termie> ;win 1
18:58:44 <ayoung> there is one thing I've found that makes me self-nack, but I did't want to speak up until after this walk through
18:58:56 <ayoung> I will buy a beer at the meetup if anyone can guess what it is
18:59:05 <ayoung> I will provide one hint:  it is not in any of the files in this patch
18:59:35 <ayoung> any guesses?
18:59:42 <ayoung> going once....
18:59:53 <ayoung> going twice....
18:59:53 <heckj> lack of docs
18:59:57 <ayoung> nope
19:00:00 <heckj> :-)
19:00:14 <Daisy> Hi
19:00:16 <ayoung> heckj, I would be willing to checking withou tdocs
19:00:22 <ayoung> as the default behavior hasn;'t changed
19:00:39 <ayoung> no the missig feature is ec2 and s3 tokens in contrib
19:00:41 <heckj> ayoung: yeah, I'm fine with it - but we'll want to describe how to use the new features very quickly
19:00:50 <ayoung> I think those will work as is by default
19:00:55 <ayoung> but not with PKI tokens
19:01:08 <ayoung> they only generate UUID tokens...which actually might be fine
19:01:21 <Daisy> Is this CI weekly meeting?
19:01:23 <ayoung> but they should be using common code for token generation
19:01:28 <ayoung> Daisy, not yet
19:01:30 <ayoung> still keystone
19:01:35 <ayoung> I'm waxing poetic
19:01:56 <ayoung> heckj, anyway...now that you know, I'll let you decide whether to nack on that...I think it is ok to do in a separate patch
19:02:31 <liemmn> very cool, ayoung... thanks for the walkthru!
19:02:38 <ayoung> My pleasure
19:02:44 <ayoung> now go forth and review!
19:02:48 <liemmn> :)
19:02:51 <liemmn> of course
19:02:53 * mtaylor taps foot patiently...
19:03:14 <heckj> #endmeeting