16:01:23 <lbragstad> #startmeeting keystone
16:01:24 <openstack> Meeting started Tue Feb 19 16:01:23 2019 UTC and is due to finish in 60 minutes.  The chair is lbragstad. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:25 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:27 <openstack> The meeting name has been set to 'keystone'
16:01:33 <lbragstad> #link https://etherpad.openstack.org/p/keystone-weekly-meeting
16:01:36 <lbragstad> agenda ^
16:01:37 <lbragstad> o/
16:01:42 <cmurphy> o/
16:02:02 <kmalloc> zzzzzz
16:02:05 <kmalloc> i mean o/
16:02:18 <gagehugo> o/
16:02:50 <vishakha> o/
16:03:50 <knikolla> o/
16:04:09 <ayoung> \m/ dOvOb  \m/
16:04:26 <lbragstad> alright - let's get started
16:04:35 <lbragstad> #topic previous action items
16:04:49 <lbragstad> i think i was the only one that had something for last week
16:05:20 <lbragstad> which was to send a note to the mailing list about key loading performance and my struggles attempting to recreate issues with it - or failing to realize actual performance gain
16:05:30 <lbragstad> #link http://lists.openstack.org/pipermail/openstack-discuss/2019-February/002674.html
16:05:41 <lbragstad> ^ which is now available for your reading pleasure
16:05:52 <lbragstad> wxy-xiyuan and i did follow up with some downstream teams
16:06:18 <lbragstad> and the reported issue was with an older token provider, but it's likely still an issue with fernet/jwt
16:06:39 <lbragstad> the problem was that the keystone process would lock the files in order to read them
16:07:21 <lbragstad> so if you had a whole bunch of keystone processes on the same host, using the same key repository, you could theoretically see performance bottlenecks if the processes start queuing up while they read keys from disk
16:07:37 <lbragstad> this gives me a bit more to go on
16:07:37 <kmalloc> the best bet is likely to run a test with a small disk cache (low RAM available) and the disk under I/O stress.
16:08:02 <lbragstad> i haven't taken a stab at recreating it - but i'm going to be setting aside time to do that this week
16:08:05 <kmalloc> NFS might also cause similar odd issues.
16:08:10 <lbragstad> yeah - that's a good point
16:08:45 <lbragstad> if this piques anyone's interest, i encourage you to try it out
16:08:49 <kmalloc> or any other network capable filesystem (iscsi based, gluster, RADOS/RBD)
16:09:40 <lbragstad> obviously, the initial jwt implementation is susceptible to this issue, as is the fernet token provider
16:10:13 <lbragstad> i could open a bug to track this specific issue though, and we could continue with the jwt implementation if people are ok with this being something we fix after the fact
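For anyone who wants to try recreating the reported contention, here is a minimal, hypothetical harness (not keystone's actual code path): several worker processes each take an exclusive flock on every file in a shared key repository before reading it, so any queuing shows up as per-pass latency. The directory, worker count, and locking strategy are illustrative assumptions only.

    # Hypothetical reproduction harness, not keystone code: N worker processes
    # repeatedly lock and read every key file in a shared repository.
    import fcntl
    import multiprocessing
    import os
    import time

    KEY_REPO = "/tmp/key-repo"    # stand-in for a shared key repository
    WORKERS = 16                  # "a whole bunch of keystone processes"
    PASSES = 500

    def read_keys_with_lock():
        total = 0.0
        for _ in range(PASSES):
            start = time.monotonic()
            for name in os.listdir(KEY_REPO):
                with open(os.path.join(KEY_REPO, name), "rb") as f:
                    fcntl.flock(f, fcntl.LOCK_EX)   # the suspected bottleneck
                    f.read()
                    fcntl.flock(f, fcntl.LOCK_UN)
            total += time.monotonic() - start
        print("pid %d: %.3f ms per pass" % (os.getpid(), total / PASSES * 1000))

    if __name__ == "__main__":
        os.makedirs(KEY_REPO, exist_ok=True)
        for i in range(3):          # a few fake keys, like a small rotation window
            with open(os.path.join(KEY_REPO, str(i)), "wb") as f:
                f.write(os.urandom(32))
        procs = [multiprocessing.Process(target=read_keys_with_lock) for _ in range(WORKERS)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()

Running it once with WORKERS = 1 and then with a larger value (ideally on the NFS or other network filesystem mentioned above) should show whether the lock is actually what queues the processes up.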
16:10:38 <ayoung> If only there were a way to avoid having to go back to Keystone to validate tokens at all...
16:10:48 <ayoung> like, maybe, I don't know, sharing memcache
16:11:37 <ayoung> why don't we stick the tokens into memcache in the first place, and then use the same memcache region between keystone and the other services?
16:11:59 <ayoung> It would be especially cool if Keystone were the only service that could write to it
16:12:04 <lbragstad> as i noted in my email, this is only a factor when you disable a whole bunch of caching options
16:12:21 <ayoung> yeah, but we never provided a way to do ^^
16:12:43 <ayoung> If we are dependent on caching, we should use it in a smart way
16:13:25 <lbragstad> our caching implementation and usage has come a long way since ~mitaka/newton
16:13:47 <lbragstad> fwiw - we still need to supply a definitive caching guide for keystone
16:14:05 <lbragstad> that describes the complexity we have in our caching implementation and how to use it effectively
16:15:12 <kmalloc> lbragstad: *since havana
16:15:22 <kmalloc> lbragstad: (yes, caching was initially implemented in havana)
16:15:24 <ayoung> Is it possible to inject a new token into the cache upon creation without adding any new code?
16:15:48 <lbragstad> kmalloc right - we had to overhaul a bunch of it shortly after we implemented fernet
16:16:04 <lbragstad> ayoung yeah - that feature already exists
16:16:27 <kmalloc> ayoung: no. we would need to know where to inject the token into the cache; the memcache clusters are not the same in all cases for distributed or centralized keystone / other services
16:16:29 <ayoung> can we share that same segment with the validation code in middleware on another server?
16:16:33 <lbragstad> though it was keyed off an unnecessary configuration option
16:16:48 <kmalloc> and we would need to implement a cache that knows how the KSM keys are constructed
16:16:59 <kmalloc> we lean on a different cache key in keystone.
16:17:24 <ayoung> could we unify them?
16:17:34 <kmalloc> possibly
16:17:36 <lbragstad> that'd be a good idea
16:17:39 <kmalloc> but it's a chunk of code
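While the definitive caching guide is still pending, a minimal sketch of the options being discussed here looks roughly like this; the values are placeholders and the current keystone/oslo.cache docs should be treated as authoritative for names and defaults.

    # keystone.conf (illustrative fragment)
    [cache]
    enabled = true
    backend = dogpile.cache.memcached
    memcache_servers = 192.0.2.10:11211

    [token]
    caching = true          # cache token data in keystone itself
    cache_on_issue = true   # pre-cache tokens at issuance so the first validate is warm

    # service-side cache used by keystonemiddleware, e.g. in nova.conf (illustrative)
    [keystone_authtoken]
    memcached_servers = 192.0.2.10:11211
    token_cache_time = 300

Disabling these is what exposes the key-loading cost discussed earlier, since every request then falls through to a full validation.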
16:19:02 <lbragstad> looks like we might have a couple actionable things here
16:19:13 <lbragstad> for investigative purposes
16:19:14 <kmalloc> i would want to add this after the pymemcache migration
16:19:32 <kmalloc> python-memcache is actively causing issues for folks now (TripleO, notably)
16:20:55 <kmalloc> "only keystone being able to write" means we need to either sign the data in memcache (not likely to work easily) or use bmemcached for SASL auth
16:20:58 <ayoung> sounds right to me
16:21:11 <kmalloc> basically anyone who can read from memcache can write
16:21:16 <kmalloc> the protocol is very ... limited.
16:21:24 <ayoung> using it can be a security hardening measure.  I think hrybacki is looking into SASL auth already
16:21:41 <kmalloc> all KSMs need SASL auth, and ability to write to the same keys
16:21:54 <ayoung> read
16:21:59 <kmalloc> no
16:22:02 <lbragstad> as far as the time we have left for Stein - is it feasible to get any additional caching work done outside of writing a guide?
16:22:04 <kmalloc> KSM also must write.
16:22:30 <ayoung> Nah.  If you validate, it ends up back in cache on the Keystone server side
16:22:34 <kmalloc> if the element is LRU'd out, unless keystone is pushing a cache to every cluster on every validate, the KSM needs to write
16:22:37 <ayoung> for this it would be read only
16:22:49 <ayoung> Ah
16:23:04 <ayoung> probably no real benefit then, huh?
16:23:08 <kmalloc> yeah
16:23:22 <kmalloc> keystone can push on issue, but it shouldn't be push everywhere on every validate
16:23:35 <kmalloc> also cache-on-issue for remote sites might cause severe performance issues
16:23:43 <kmalloc> leaning on the KSM to cache would be better in those cases
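For concreteness, SASL-authenticated access with the bmemcached library mentioned above looks roughly like this; a minimal sketch with placeholder server address and credentials, not how keystone or keystonemiddleware actually wire up their caches.

    # Minimal sketch using python-binary-memcached (imported as bmemcached);
    # the endpoint and credentials below are placeholders.
    import bmemcached

    client = bmemcached.Client(
        ("192.0.2.10:11211",),   # SASL-enabled memcached endpoint (assumed)
        username="keystone",
        password="super-secret",
    )

    # SASL authenticates the connection; it does not give per-key ACLs, which is
    # why "only keystone can write" is hard to enforce at this layer: anyone who
    # can read with these credentials can also write.
    client.set("token-cache-example", "payload", time=300)
    print(client.get("token-cache-example"))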
16:23:50 <ayoung> Would be cool to have a 2-level setup, where keystone can push to a service, but they can't push to each other
16:24:11 <kmalloc> this gets into where memcache is not the best tool
16:24:24 <ayoung> istio?
16:24:26 <kmalloc> using something like REDIS (don't ever use redis due to licensing) is better.
16:24:33 <hrybacki> (in training but will review meeting notes this evening)
16:24:35 <kmalloc> if we're using the current cache infrastructure
16:25:12 <kmalloc> if we totally re-implement caching [tokens] to something totally different like istio, etcd, or something else, it's more doable
16:25:22 <kmalloc> that is a monumental rewrite
16:25:35 <ayoung> yeah,  probably tight to get that into Stein
16:25:42 <lbragstad> ++
16:25:54 <kmalloc> might be hard to land that into Train, it's going to probably be a 2-cycle initiative
16:26:03 <kmalloc> 1 cycle for scaffolding, one for implementation/adoption
16:26:22 <ayoung> Seriously, though, this is starting to get into service mesh land.  We should investigate something for that in a future release
16:26:31 <kmalloc> def. not opposed to it
16:26:39 <kmalloc> just opposed to trying to wedge it into Stein :)
16:26:41 <ayoung> There are numerous options, some less tied to K8s than others
16:26:56 <lbragstad> does anyone want to take a shot at a spike and writing up their findings?
16:27:09 <kmalloc> unfortunately, i don't have the bandwidth at the moment.
16:27:21 <kmalloc> this would be a lot of fun to work on though.
16:27:36 <ayoung> Outreachy?
16:27:38 <lbragstad> yeah - it is interesting
16:27:44 <kmalloc> probably more than intern level of work
16:27:48 <lbragstad> if not - we can table it and come back to it later
16:27:55 <kmalloc> this is likely in the flask-rewrite class of effort
16:28:06 <lbragstad> i would certainly like to get the guide done this release though
16:28:10 <kmalloc> this touches a lot of very delicate parts.
16:28:40 <kmalloc> caching is hard. very hard to do right.
16:28:56 <kmalloc> every one of us here has messed it up at least once in keystone ;)
16:29:05 <kmalloc> (and in my case, a lot more often)
16:29:22 <lbragstad> anything else on this?
16:29:30 <lbragstad> we kinda went off on a tangent
16:30:13 <lbragstad> #topic blueprint cleanup
16:30:18 <lbragstad> #link https://blueprints.launchpad.net/keystone
16:30:23 <lbragstad> #link https://etherpad.openstack.org/p/keystone-blueprint-cleanup
16:30:41 <lbragstad> #link http://lists.openstack.org/pipermail/openstack-discuss/2019-February/002672.html
16:30:45 <lbragstad> quick status update on this
16:31:09 <lbragstad> all blueprints that weren't targeted to stein have been either marked as obsolete or superseded
16:31:55 <lbragstad> all blueprints that were describing applicable ideas or work have been ported to bug reports https://bugs.launchpad.net/keystone/+bugs?field.tag=rfe
16:32:30 <lbragstad> we have patches accepted to the docs that describe this now
16:32:32 <lbragstad> #link https://review.openstack.org/#/c/637311/
16:32:36 <lbragstad> #link https://review.openstack.org/#/c/625282
16:33:02 <lbragstad> #link https://review.openstack.org/#/c/637567/ is a follow up to clean up some mistakes i made in the initial write-up
16:33:17 <lbragstad> does anyone have questions about what was done?
16:34:03 <lbragstad> moving on
16:34:10 <lbragstad> #topic reviews
16:34:16 <lbragstad> does anyone have reviews that need attention?
16:34:20 <kmalloc> JWS
16:34:25 <kmalloc> lots of JWS.
16:34:46 <kmalloc> we should be landing the last bits for that asap.
16:35:10 <gagehugo> I'll take a look at them
16:35:13 <lbragstad> #link https://review.openstack.org/#/q/topic:bp/json-web-tokens+status:open
16:35:35 <lbragstad> i still have a bunch of patches for scope checking and default roles
16:35:40 <lbragstad> ~50 or so
16:35:43 <cmurphy> easy one https://review.openstack.org/637425
16:36:56 <vishakha> lbragstad: Should I change this patch https://review.openstack.org/#/c/609210/ according to https://review.openstack.org/#/c/636825/
16:37:30 <vishakha> lbragstad: and abandon this one https://review.openstack.org/#/c/636825/
16:38:01 <lbragstad> vishakha you could - if you're ok with those changes, i was just looking at ways to try and test all the permutations for role assignments
16:38:29 <lbragstad> i still need to work on a couple of those bits - i'd like to write some more test cases for other types of users
16:39:09 <vishakha> yes I will add some more test cases for group assignment as per the comments
16:39:28 <lbragstad> cool - i'll set aside some time today to get back to that
16:39:46 <vishakha> sure. Thanks for the time on it
16:39:59 <lbragstad> vishakha no problem - thanks for picking that up
16:40:07 <lbragstad> knikolla ildikov asked about https://review.openstack.org/#/c/484121/ today in the edge meeting
16:40:39 <lbragstad> she was curious if you plan to revisit it or if you need help?
16:41:21 <ildikov> as we said, we would do the max coverage with the K2K federation scenario; I'm trying to put the pieces together and see how we can proceed
16:41:25 <knikolla> lbragstad: i hadn't checked on the comments, will do so later on today.
16:41:37 <knikolla> will revise accordingly.
16:41:44 <ildikov> knikolla: thanks!
16:41:50 <lbragstad> thanks knikolla
16:41:51 <knikolla> np :)
16:42:05 <lbragstad> anyone else have anything for reviews?
16:43:04 <lbragstad> #topic open discussion
16:43:14 <cmurphy> o/
16:43:14 <lbragstad> #link https://releases.openstack.org/stein/schedule.html
16:43:26 <cmurphy> i un-wip'd most of the app cred capabilities patches
16:43:36 * lbragstad just saw that
16:43:42 <lbragstad> ready for some reviews?!
16:43:42 <cmurphy> there are a few things i'd like to discuss about it
16:43:51 <cmurphy> but i don't want to take up meeting time and can't stay after the meeting
16:44:29 <cmurphy> maybe thursday we could chat about it?
16:44:43 <lbragstad> works for me - unless you want to do something async (ML?)
16:44:53 <cmurphy> i can do that too, might be easier
16:45:03 <lbragstad> i don't want to hold you up
16:45:12 <lbragstad> due to time zones
16:45:47 <cmurphy> i'll post something to the mailing list and following that will probably propose some small changes to the spec to align with the implementation
16:45:48 <kmalloc> cmurphy: cool i'll review more in depth
16:46:01 <lbragstad> sounds good
16:46:10 <cmurphy> but the thing is huge enough that i think it might be worth holding off pushing forward until train-1
16:46:21 <ayoung> cmurphy, nice
16:46:22 <lbragstad> i'll start taking a look at those reviews either today or tomorrow
16:46:28 <cmurphy> so that we have sufficient time for discussion and digestion
16:46:37 <gagehugo> ok
16:46:38 <lbragstad> ++
16:46:53 <lbragstad> speaking of release schedules
16:47:04 <lbragstad> we are in R-7
16:47:04 <kmalloc> cmurphy: maybe.
16:47:14 <ayoung> Train is ok for me, as that aligns with the RH long-term release, but if it goes any longer, it might as well not happen.
16:47:34 <kmalloc> cmurphy: i think we can possibly land a chunk of the code for support earlier and keep the API bits for T1
16:47:37 <lbragstad> PTL self nomination starts in two weeks - if you're planning on running please don't hesitate to reach out if you have questions
16:47:52 <kmalloc> so we don't allow anyone to set the values, but all supporting code is in.
16:47:56 <cmurphy> ayoung: at least at this point we have a complete implementation proposed which can now be tweaked, unlike last cycle where we had nothing at this time
16:48:20 <cmurphy> i'm confident it will happen i just don't want to rush it and regret things
16:48:37 <cmurphy> i got started on it a little late in the cycle unfortunately
16:48:56 <lbragstad> things happen - i'm just happy to see an implementation up for review
16:50:47 <kmalloc> and i'm more than happy to land the code so it's easier to polish and unlock for the users in T1.
16:50:52 <kmalloc> saves rebase-hell
16:51:14 <lbragstad> that's always an option, too
16:51:18 <cmurphy> kmalloc: we might be able to do that
16:51:29 <cmurphy> in any case expect an email from me in the next couple of days
16:52:11 <lbragstad> anything else for open discussion?
16:52:47 <kmalloc> don't forget to book hotels and get tickets to denver before it's sold-out/more-costly/etc
16:53:04 <kmalloc> we are rapidly hitting the 2-month-out window.
16:53:11 <lbragstad> i think the summit schedule goes live tomorrow?
16:53:18 <kmalloc> something like that
16:53:39 <kmalloc> this was just an early reminder so folks don't do what I've done in the past and go "oh crap...and it's way more pricy"
16:55:06 <vishakha> thanks for the reminder kmalloc
16:55:14 <lbragstad> looks like we can wrap up a few minutes early
16:55:22 <lbragstad> thanks for the time, everyone
16:55:32 <lbragstad> reminder office hours starts in a few minutes for those available
16:55:39 <lbragstad> #endmeeting