16:01:23 #startmeeting keystone
16:01:24 Meeting started Tue Feb 19 16:01:23 2019 UTC and is due to finish in 60 minutes. The chair is lbragstad. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:25 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:27 The meeting name has been set to 'keystone'
16:01:33 #link https://etherpad.openstack.org/p/keystone-weekly-meeting
16:01:36 agenda ^
16:01:37 o/
16:01:42 o/
16:02:02 zzzzzz
16:02:05 i mean o/
16:02:18 o/
16:02:50 o/
16:03:50 o/
16:04:09 \m/ dOvOb \m/
16:04:26 alright - let's get started
16:04:35 #topic previous action items
16:04:49 i think i was the only one that had something for last week
16:05:20 which was to send a note to the mailing list about key loading performance and my struggles attempting to recreate issues with it - or failing to observe an actual performance gain
16:05:30 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-February/002674.html
16:05:41 ^ which is now available for your reading pleasure
16:05:52 wxy-xiyuan and i did follow up with some downstream teams
16:06:18 and the reported issue was using an older token provider, but it is likely still an issue with fernet/jwt
16:06:39 the problem was that the keystone process would lock the files in order to read them
16:07:21 so if you had a whole bunch of keystone processes on the same host, using the same key repository, you could theoretically see performance bottlenecks if the processes start queuing up while they read keys from disk
16:07:37 this gives me a bit more to go on
16:07:37 it is likely the best bet is to run a test with a low disk cache (low RAM available) and the disk under I/O stress.
16:08:02 i haven't taken a stab at recreating it - but i'm going to be setting aside time to do that this week
16:08:05 NFS might also cause similar odd issues.
16:08:10 yeah - that's a good point
16:08:45 if this piques anyone's interest, i encourage you to try it out
16:08:49 or any other network-capable filesystem (iSCSI-based, Gluster, RADOS/RBD)
16:09:40 obviously, the initial jwt implementation is susceptible to this issue, as is the fernet token provider
16:10:13 i could open a bug to track this specific issue though, and we could continue with the jwt implementation if people are ok with this being something we fix after the fact
16:10:38 If only there were a way to avoid having to go back to Keystone to validate tokens at all...
16:10:48 like, maybe, I don't know, sharing memcache
16:11:37 why don't we stick the tokens into memcache in the first place, and then use the same memcache region between keystone and the other services?
16:11:59 It would be especially cool if Keystone were the only service that could write to it
16:12:04 as i noted in my email, this is only a factor when you disable a whole bunch of caching options
16:12:21 yeah, but we never provided a way to do ^^
16:12:43 If we are dependent on caching, we should use it in a smart way
16:13:25 our caching implementation and usage has come a long way since ~mitaka/newton
16:13:47 fwiw - we still need to supply a definitive caching guide for keystone
16:14:05 that describes the complexity we have in our caching implementation and how to use it effectively
16:15:12 lbragstad: *since havana
16:15:22 lbragstad: (yes, caching was initially implemented in havana)
16:15:24 Is it possible to inject a new token into the cache upon creation without adding any new code?
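The lock-contention scenario described above is straightforward to simulate. Below is a minimal, hypothetical reproducer sketch: several worker processes repeatedly take an exclusive flock on every file in a shared key repository, mimicking many keystone processes reading the same keys from disk. The repository contents, worker count, and iteration count are illustrative assumptions, not keystone's actual key-loading code.

```python
# Hypothetical reproducer for key-repository lock contention: several
# worker processes read every "key" file under an exclusive flock. With
# enough workers (or slow/remote storage), processes queue on the locks.
import fcntl
import os
import tempfile
import time
from multiprocessing import Process

KEY_COUNT = 3       # fernet repositories typically hold a handful of keys
ITERATIONS = 2000   # illustrative workload size
WORKERS = 8         # illustrative process count

def read_keys(repo):
    start = time.monotonic()
    for _ in range(ITERATIONS):
        for name in sorted(os.listdir(repo)):
            with open(os.path.join(repo, name), 'rb') as f:
                fcntl.flock(f, fcntl.LOCK_EX)  # exclusive lock: readers serialize
                f.read()
                fcntl.flock(f, fcntl.LOCK_UN)
    print('pid %d took %.2fs' % (os.getpid(), time.monotonic() - start))

if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as repo:
        # Populate a throwaway "key repository" with dummy key material.
        for i in range(KEY_COUNT):
            with open(os.path.join(repo, str(i)), 'wb') as f:
                f.write(os.urandom(32))
        procs = [Process(target=read_keys, args=(repo,)) for _ in range(WORKERS)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
```

Comparing these timings against a run with LOCK_SH (or no locking at all), ideally under the low-page-cache and I/O-stress conditions suggested above, would show whether the exclusive lock is the actual bottleneck.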
16:15:48 kmalloc right - we had to overhaul a bunch of it shortly after we implemented fernet
16:16:04 ayoung yeah - that feature already exists
16:16:27 ayoung: no. we would need to know where to inject the token cache into the memcache; the memcache clusters are not the same in all cases for distributed or centralized keystone / other services
16:16:29 can we share that same segment with the validation code in middleware on another server?
16:16:33 though it keyed off an unnecessary configuration option
16:16:48 and we would need to implement a cache that knows how the KSM keys are constructed
16:16:59 we lean on a different cache key in keystone.
16:17:24 could we unify them?
16:17:34 possibly
16:17:36 that'd be a good idea
16:17:39 but it's a chunk of code
16:19:02 looks like we might have a couple actionable things here
16:19:13 for investigative purposes
16:19:14 i would want to add this post-pymemcache
16:19:32 python-memcache is actively causing issues for folks now (TripleO notably)
16:20:55 "only keystone being able to write" means we need to either sign the data in memcache (not likely to work easily) or use bmemcached for SASL auth
16:20:58 sounds right to me
16:21:11 basically anyone who can read from memcache can write
16:21:16 the protocol is very ... limited.
16:21:24 using it can be a security-hardening measure. I think hrybacki is looking into SASL auth already
16:21:41 all KSMs need SASL auth, and the ability to write to the same keys
16:21:54 read
16:21:59 no
16:22:02 as far as the time we have left for Stein - is it feasible to get any additional caching work done outside of writing a guide?
16:22:04 KSM also must write.
16:22:30 Nah. If you validate, it ends up back in cache on the Keystone server side
16:22:34 if the element is LRU'd out, unless keystone is pushing a cache to every cluster on every validate, the KSM needs to write
16:22:37 for this it would be read-only
16:22:49 Ah
16:23:04 probably no real benefit then, huh?
16:23:08 yeah
16:23:22 keystone can push on issue, but it shouldn't push everywhere on every validate
16:23:35 also cache-on-issue for remote sites might cause severe performance issues
16:23:43 leaning on the KSM to cache would be better in those cases
16:23:50 Would be cool to have a 2-level setup, where keystone can push to a service, but they can't push to each other
16:24:11 this gets into where memcache is not the best tool
16:24:24 istio?
16:24:26 using something like Redis (don't ever use Redis due to licensing) is better.
16:24:33 (in training but will review meeting notes this evening)
16:24:35 if we're using the current cache infrastructure
16:25:12 if we totally re-implement caching [tokens] with something totally different like istio, etcd, or something else, it's more doable
16:25:22 that is a monumental rewrite
16:25:35 yeah, probably tight to get that into Stein
16:25:42 ++
16:25:54 might be hard to land that in Train; it's probably going to be a 2-cycle initiative
16:26:03 1 cycle for scaffolding, one for implementation/adoption
16:26:22 Seriously, though, this is starting to get into service mesh land. We should investigate something for that in a future release
16:26:31 def. not opposed to it
16:26:39 just opposed to trying to wedge it into Stein :)
16:26:41 There are numerous options, some less tied to k8s than others
16:26:56 does anyone want to take a shot at a spike and writing up their findings?
16:27:09 unfortunately, i don't have the bandwidth at the moment.
16:27:21 this would be a lot of fun to work on though.
16:27:36 Outreachy?
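To make the "shared validation cache" idea being debated above concrete, here is a minimal sketch using pymemcache (the client mentioned above as the likely replacement for python-memcache). The key scheme, TTL, and function names are hypothetical; keystone and keystonemiddleware really construct their cache keys differently, which is exactly the unification problem raised in the discussion.

```python
# Sketch of a shared token-validation cache, assuming pymemcache and a
# made-up key scheme. Keystone would write on issue/validate; KSM would
# read, and (as noted above) must also write when entries are LRU-evicted.
import hashlib
import json
from pymemcache.client.base import Client

client = Client(('127.0.0.1', 11211))  # assumed shared memcache endpoint

def _key(token_id):
    # Hypothetical unified key: hash the token ID so raw token material
    # never appears as a memcache key.
    return 'token-valid/' + hashlib.sha256(token_id.encode()).hexdigest()

def cache_validation(token_id, token_data, ttl=300):
    # Called by keystone after issuing or validating a token.
    client.set(_key(token_id), json.dumps(token_data).encode(), expire=ttl)

def lookup_validation(token_id):
    # Called by keystonemiddleware before falling back to keystone.
    raw = client.get(_key(token_id))
    return json.loads(raw) if raw else None
```

Note that nothing in plain memcache enforces the "only keystone can write" property discussed above; that is why the conversation turns to signing the cached data or SASL auth.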
16:27:38 yeah - it is interesting
16:27:44 probably more than an intern level of work
16:27:48 if not - we can table it and come back to it later
16:27:55 this is likely in the flask-rewrite class of effort
16:28:06 i would certainly like to get the guide done this release though
16:28:10 this touches a lot of very delicate parts.
16:28:40 caching is hard. very hard to do right.
16:28:56 every one of us here has messed it up at least once in keystone ;)
16:29:05 (and in my case, a lot more often)
16:29:22 anything else on this?
16:29:30 we kinda went off on a tangent
16:30:13 #topic blueprint cleanup
16:30:18 #link https://blueprints.launchpad.net/keystone
16:30:23 #link https://etherpad.openstack.org/p/keystone-blueprint-cleanup
16:30:41 #link http://lists.openstack.org/pipermail/openstack-discuss/2019-February/002672.html
16:30:45 quick status update on this
16:31:09 all blueprints that weren't targeted to stein have been either marked as obsolete or superseded
16:31:55 all blueprints that were describing applicable ideas or work have been ported to bug reports https://bugs.launchpad.net/keystone/+bugs?field.tag=rfe
16:32:30 we have patches accepted to the docs that describe this now
16:32:32 #link https://review.openstack.org/#/c/637311/
16:32:36 #link https://review.openstack.org/#/c/625282
16:33:02 #link https://review.openstack.org/#/c/637567/ is a follow-up to clean up some mistakes i made in the initial write-up
16:33:17 does anyone have questions about what was done?
16:34:03 moving on
16:34:10 #topic reviews
16:34:16 does anyone have reviews that need attention?
16:34:20 JWS
16:34:25 lots of JWS.
16:34:46 we should be landing the last bits for that asap.
16:35:10 I'll take a look at them
16:35:13 #link https://review.openstack.org/#/q/topic:bp/json-web-tokens+status:open
16:35:35 i still have a bunch of patches for scope checking and default roles
16:35:40 ~50ish or so
16:35:43 easy one https://review.openstack.org/637425
16:36:56 lbragstad: Should I change this patch https://review.openstack.org/#/c/609210/ according to https://review.openstack.org/#/c/636825/
16:37:30 lbragstad: and abandon this one https://review.openstack.org/#/c/636825/
16:38:01 vishakha you could - if you're ok with those changes; i was just looking at ways to try and test all the permutations for role assignments
16:38:29 i still need to work on a couple of those bits - i'd like to write some more test cases for other types of users
16:39:09 yes I will add some more test cases for group assignment as per the comments
16:39:28 cool - i'll set aside some time today to get back to that
16:39:46 sure. Thanks for the time on it
16:39:59 vishakha no problem - thanks for picking that up
16:40:07 knikolla ildikov asked about https://review.openstack.org/#/c/484121/ today in the edge meeting
16:40:39 she was curious if you plan to revisit it or if you need help?
16:41:21 as we said we would do the max coverage with the K2K federation scenario, I'm trying to put the pieces together and see how we can proceed
16:41:25 lbragstad: i hadn't checked on the comments, will do so later on today.
16:41:37 will revise accordingly.
16:41:44 knikolla: thanks!
16:41:50 thanks knikolla
16:41:51 np :)
16:42:05 anyone else have anything for reviews?
16:43:04 #topic open discussion
16:43:14 o/
16:43:14 #link https://releases.openstack.org/stein/schedule.html
16:43:26 i un-wip'd most of the app cred capabilities patches
16:43:36 * lbragstad just saw that
16:43:42 ready for some reviews?!
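For readers unfamiliar with the JWS work under review above, here is a generic sketch of what an ES256-signed JWS token looks like, using PyJWT and cryptography. This is a standalone illustration of the token format, not keystone's actual provider code; the claims and identifiers are made up.

```python
# Generic ES256 JWS illustration (not keystone's provider implementation).
import datetime
import jwt  # PyJWT
from cryptography.hazmat.primitives.asymmetric import ec

# Asymmetric keys: the issuer signs with the private key and validators
# only need the public key -- the property that makes JWS attractive.
private_key = ec.generate_private_key(ec.SECP256R1())
public_key = private_key.public_key()

claims = {
    'sub': 'hypothetical-user-id',
    'exp': datetime.datetime.now(datetime.timezone.utc)
           + datetime.timedelta(hours=1),
}
token = jwt.encode(claims, private_key, algorithm='ES256')
decoded = jwt.decode(token, public_key, algorithms=['ES256'])
print(decoded['sub'])
```

Unlike fernet, validating a JWS token does not require sharing symmetric key material with the validator, though the signing keys still live on disk, which is why the key-loading concern from earlier in the meeting applies to the initial jwt implementation as well.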
16:43:42 there are a few things i'd like to discuss about it
16:43:51 but i don't want to take up meeting time and can't stay after the meeting
16:44:29 maybe thursday we could chat about it?
16:44:43 works for me - unless you want to do something async (ML?)
16:44:53 i can do that too, might be easier
16:45:03 i don't want to hold you up
16:45:12 due to time zones
16:45:47 i'll post something to the mailing list and following that will probably propose some small changes to the spec to align with the implementation
16:45:48 cmurphy: cool i'll review more in depth
16:46:01 sounds good
16:46:10 but the thing is huge enough that i think it might be worth holding off pushing forward until train-1
16:46:21 cmurphy, nice
16:46:22 i'll start taking a look at those reviews either today or tomorrow
16:46:28 so that we have sufficient time for discussion and digestion
16:46:37 ok
16:46:38 ++
16:46:53 speaking of release schedules
16:47:04 we are in R-7
16:47:04 cmurphy: maybe.
16:47:14 Train is ok for me, as that aligns with the RH long-term release, but if it goes any longer, it might as well not happen.
16:47:34 cmurphy: i think we can possibly land a chunk of the code for support earlier and keep the API bits for T1
16:47:37 PTL self-nomination starts in two weeks - if you're planning on running, please don't hesitate to reach out if you have questions
16:47:52 so we don't allow anyone to set the values, but all supporting code is in.
16:47:56 ayoung: at least at this point we have a complete implementation proposed which can now be tweaked, unlike last cycle where we had nothing at this time
16:48:20 i'm confident it will happen, i just don't want to rush it and regret things
16:48:37 i got started on it a little late in the cycle unfortunately
16:48:56 things happen - i'm just happy to see an implementation up for review
16:50:47 and i'm more than happy to land the code so it's easier to polish and unlock for the users in T1.
16:50:52 saves rebase-hell
16:51:14 that's always an option, too
16:51:18 kmalloc: we might be able to do that
16:51:29 in any case expect an email from me in the next couple of days
16:52:11 anything else for open discussion?
16:52:47 don't forget to book hotels and get tickets to denver before it's sold-out/more-costly/etc
16:53:04 we are rapidly hitting the 2mo-out window.
16:53:11 i think the summit schedule goes live tomorrow?
16:53:18 something like that
16:53:39 this was just an early reminder so folks don't do what I've done in the past and go "oh crap... and it's way more pricey"
16:55:06 thanks for the reminder kmalloc
16:55:14 looks like we can wrap up a few minutes early
16:55:22 thanks for the time, everyone
16:55:32 reminder: office hours start in a few minutes for those available
16:55:39 #endmeeting