21:03:36 #startmeeting crossproject
21:03:36 jeblair, welcome to ping-as-a-service fun facts?
21:03:37 Meeting started Tue Feb 3 21:03:36 2015 UTC and is due to finish in 60 minutes. The chair is ttx. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:03:39 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:03:41 The meeting name has been set to 'crossproject'
21:03:42 lol
21:03:47 Our agenda for today:
21:03:51 #link http://wiki.openstack.org/Meetings/CrossProjectMeeting
21:04:04 #topic Horizon reviews for project (e.g. Sahara, Trove) panels (SergeyLukjanov)
21:04:09 o/
21:04:11 SergeyLukjanov: awake ?
21:04:18 ttx, yup
21:04:21 SergeyLukjanov: you really shouldn't be, but you are
21:04:38 So... Some changes in specific panels in Horizon seem to linger a bit
21:04:44 I think this is due to a priority mismatch, with Horizon core focusing on the.. well.. core
21:04:53 And because it's difficult for them to review the change functionally
21:04:56 so, the question is very simple - how to manage change requests from the different projects that have their own panels in horizon
21:05:03 Not sure how to best solve that mismatch...
21:05:18 it won't get any better as we add more projects that may want Horizon panels
21:05:21 yeah, unfortunately no good proposals on improving it
21:05:35 o/
21:05:36 There are a couple of reasons for the lag generally
21:05:50 but it's going to be an issue when new functionality can't be supported on the dashboard side
21:06:03 first, they aren't targeted to any milestone, so we aren't watching for them
21:06:34 two, the reason above: it's harder to verify the nuanced changes of other projects without subject matter experts
21:06:45 and as a bonus item
21:06:55 we've grown, a lot
21:07:11 as the community continues to grow
21:07:13 sounds a bit like tempest reviews
21:07:18 the first one seems solvable, but the second one sounds like a much bigger problem to solve
21:07:21 yeah
21:07:28 there is the option of having each project be responsible for their panel, but quality would be lower (and it's generally not the same people writing core code and Horizon panel code)
21:08:01 ttx: would that be outside of the main horizon tree?
21:08:11 dhellmann: a bit like tuskar-ui
21:08:24 ok
21:08:24 can you get the same functionality out of tree?
21:08:38 it seems like that's a reasonable approach, but I don't know if the technical details work out
21:08:39 sahara-dashboard was merged into horizon during the prev. cycle :)
21:08:46 horizon has a fairly mature plugin mechanism
21:09:18 so it would be feasible to have UI plugins for services sit in a separate repo and be installed onto Horizon nodes
21:09:25 also there is benefit from having ui experts reviewing your code
21:09:26 dhellmann: I think horizon-folk would still have to advise on how to best do one
21:09:38 issues are UX consistency, translation, quality
21:09:44 even if we have the project-specific stuff out of tree - will horizon folks be able to review changes to help keep some code quality?
21:09:50 ttx: not sure how much has changed with horizon since it has been a while, but if each project's panel is a django app, can't these just live in the tree of the projects? Like we plan to do with tempest?
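For context on the plugin mechanism david-lyle mentions, a minimal out-of-tree panel could look roughly like the sketch below. All names here (foo_dashboard, FooPanel, the _50_foo.py enabled file) are illustrative assumptions rather than an existing project, and the exact settings keys should be checked against the Horizon pluggable-panel documentation for the release in question.

    # foo_dashboard/panel.py -- an ordinary Horizon panel class, living in its
    # own "foo-dashboard" repo instead of the Horizon tree.
    from django.utils.translation import ugettext_lazy as _

    import horizon


    class FooPanel(horizon.Panel):
        name = _("Foo")
        slug = "foo"


    # openstack_dashboard/enabled/_50_foo.py -- a small file dropped onto the
    # Horizon node so the pluggable-settings machinery registers the panel.
    PANEL = 'foo'                                # slug of the panel being added
    PANEL_DASHBOARD = 'project'                  # dashboard to attach it to
    ADD_PANEL = 'foo_dashboard.panel.FooPanel'   # dotted path to the Panel class
    ADD_INSTALLED_APPS = ['foo_dashboard']       # Django app with views/templates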
21:09:51 but maybe they could directly write some (say for Nova and other core-compute stuff), and support the work of others
21:09:55 ttx: maybe I'm crazy
21:09:56 david-lyle, ++
21:10:37 david-lyle: we have a number of horizontal projects which are moving towards a "directly handle fewer projects, but provide tools and advice for all" model
21:10:50 to me it's a similar problem to the docs reviews -- and we manage to keep an eye on quite a few repos so far
21:11:00 so yes, what ttx said
21:11:16 david-lyle: you would decide who you handle directly
21:11:16 I worry UX is a different beast
21:11:22 at least having a "foo-dashboard" repo facilitates overlap between foo and horizon teams
21:11:29 david-lyle: and provide advice for all the others rather than write them all
21:11:52 I would also point out that there are panels in horizon today that have to touch multiple projects. The launch instance flow for instances can touch nova, neutron, and cinder. So I'm not sure how you would classify what is in core and what is in a project-specific panel.
21:11:59 ttx, honestly most are written externally now and merged into tree
21:12:14 david-lyle: so the problem is more.. maintenance ?
21:12:22 yes
21:12:23 cburgess: yes, i think those are good candidates for a basic layer supported in the horizon tree
21:12:25 cburgess: good point
21:12:36 many projects are providing the maintenance as well
21:12:55 core review load hasn't scaled to support the diversity
21:13:05 ttx, maintaining and improving
21:13:06 So to pick on cinder.. I could see a "volume" panel as being project specific, but the launch instance flow stuff is core. So there is definitely overlap.
21:13:28 additionally, when we get rather mature code merges, the UX can be inconsistent and harder to maintain
21:13:37 cburgess: what about creating a volume from an image :)
21:13:46 Right that too.
21:13:51 It gets complicated.. quickly.
21:14:02 but by that point it can be too far in the process
21:14:12 or backup a volume to an object store
21:14:20 david-lyle: it's a complex issue. I don't think we'll find the solution today, but I think that's the larger challenge for Horizon today
21:14:26 largest*
21:14:30 scratch the backup example
21:14:35 I agree
21:14:53 the big-tent scares me in that regard
21:14:56 find its role and place in the larger tent
21:15:28 either we look like we have 17 UI teams that throw something together, or we figure out a better way to scale
21:15:36 david-lyle: like I said, other horizontal projects have decided to move more into a tools+mentoring mode, but I see how that may not be applicable to Horizon
21:15:56 I think the latter is the correct approach, but details are pesky
21:16:39 I think it's a discussion you should have with your team and see how to reorganize to support the future
21:16:51 already have started
21:16:55 not sure there is much more we can do today
21:17:02 SergeyLukjanov: ?
21:17:12 yeah
21:17:24 it's a good point and not lost on me, been grappling with it for a while
21:17:29 anything else on that topic ?
21:17:38 david-lyle, could we follow up with blueprints and issues?
21:17:46 sure
21:17:52 david-lyle, I mean to ensure that they aren't missed on the milestones?
21:17:57 yes, the targeting coordination sounds solvable short-term
21:17:59 that would be ideal
21:18:26 ok, moving on
21:18:27 #topic openstack-specs discussion
21:18:28 david-lyle, okay, I'll take a look at how it could be done and contact you
21:18:33 ttx, david-lyle thx
21:18:36 * Add TRACE definition to log guidelines (https://review.openstack.org/#/c/145245/)
21:18:37 thanks SergeyLukjanov
21:18:41 SergeyLukjanov: thx for staying up
21:18:55 SergeyLukjanov: ^
21:19:04 sdague: around?
21:19:07 ttx: yes
21:19:31 so this is the first add after the major log guidelines, which is basically to take back the TRACE keyword
21:19:41 which we've been randomly jamming into stack traces
21:19:50 No opposition so far, so I raise it today to see if we can move on to TC rubberstamping next week
21:19:56 and use it for trace level logging (i.e. < DEBUG)
21:20:02 or if this needs a few more cycles
21:20:12 sdague: +10^9
21:20:24 my impression is that this doesn't have enough PTL +1s yet
21:20:42 i was personally surprised the first time i saw a project logging stack traces/tracebacks at "trace" log level
21:21:02 fungi: well, it kind of isn't. It's just jammed into the oslo log format string
21:21:10 slightly back-doored
21:21:14 yeah
21:21:24 but yeah, not what that's for
21:21:42 sdague: if you do another revision for any reason, i would add the trace definition to the bullet list
21:21:51 anyway, please express opinions. This will clearly take a longer amount of time because we'll need an oslo change
21:21:55 but not worth losing the current votes over
21:22:09 sdague: did you have a chance to look at the implementation comment I left?
21:22:10 but I think it's the right long term direction
21:22:24 dhellmann: I have not yet, I will loop back around on it
21:22:46 sdague: nothing critical, but we have to be careful about assuming that everything can use oslo.log because of the dependency cycles
21:22:53 ok
21:22:59 I read the current silence on this as consent, but we actually need +1s for consent :)
21:23:34 It would be really good to get a couple of seasoned operators' opinions from the operators list on this one. I can try to get some more attention...
21:23:51 Rockyg: this should not impact ops
21:23:51 Rockyg: that would be great
21:24:04 the expected use of this is at a level way below what ops should set at
21:24:06 * dhellmann hopes operators are not running with that much output
21:24:17 yeah, not sure this is an operator thing
21:24:17 sdague: Well several of us find stack traces very useful. As long as we can still get them without having to always run at sub-debug level.
21:24:26 Most ops are running at debug
21:24:27 cburgess: this isn't about stack traces, though
21:24:36 OK
21:24:38 Haven't read the spec.
21:24:40 cburgess: I think the point is, not using TRACE for stack traces at all
21:24:43 and yes... we run at debug level.
21:24:50 OK that's cool.
21:24:54 cburgess: we want TRACE back for actual tracing
21:25:01 yeah, stack traces should be at >= error
21:25:03 ++
21:25:08 dansmith, ++
21:25:13 Yeah so I agree with that then. Move stack traces to debug and let operators choose debug if they want.
21:25:14 Free TRACE!
21:25:20 jeblair: agreed
21:25:24 or error
21:25:36 cburgess: no, stack traces should be errors, I think that's in the first log spec
21:25:44 because that should never be an ok thing
21:25:45 That's fine with me.
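As a rough illustration of what "taking back" TRACE means, here is a minimal sketch using Python's standard logging module with TRACE registered below DEBUG. The level number and the trace() helper are assumptions made for illustration; the actual change would land in oslo.log rather than in application code.

    import logging

    TRACE = 5  # DEBUG is 10, so TRACE messages only appear at sub-debug verbosity
    logging.addLevelName(TRACE, 'TRACE')


    def trace(self, msg, *args, **kwargs):
        """Log a message at TRACE level, mirroring how Logger.debug() works."""
        if self.isEnabledFor(TRACE):
            self._log(TRACE, msg, args, **kwargs)

    logging.Logger.trace = trace

    # Operators would normally run at INFO or DEBUG; only an explicit sub-debug
    # configuration makes these messages visible.
    logging.basicConfig(level=TRACE)
    LOG = logging.getLogger(__name__)
    LOG.trace("entering quota check for project %s", "demo")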
21:25:51 cburgess: yes, certainly if you are running at debug, you should see stack traces
21:26:00 cburgess: http://git.openstack.org/cgit/openstack/openstack-specs/tree/specs/log-guidelines.rst
21:26:08 OK I'm on the same page.
21:26:13 I don't see how this is an operator issue then.
21:26:32 cburgess: http://git.openstack.org/cgit/openstack/openstack-specs/tree/specs/log-guidelines.rst#n246 more specifically
21:26:35 stack traces still happen, debug is debug, and trace will be a new thing. Seems fine to me.
21:26:54 More on this topic ?
21:27:01 just ... go vote on it
21:27:08 yeah
21:27:59 #topic rootwrap overhead - should we stay on this plan (sdague)
21:28:03 just a warning that code that's not tested (e.g., if trace isn't enabled) often winds up broken
21:28:13 For those who missed the previous episodes, rootwrap was built for privilege separation, not really for performance
21:28:22 Some services make heavy use of rootwrapped system calls, so the low performance impacts them
21:28:40 and apparently as a result, some operators patch it out (although that's the first time I hear of that, I could use some details)
21:28:45 are those projects adopting the daemon mode work you did?
21:28:49 This basically means they are running the rootwrap-using nodes as root, which IMHO is not a great idea
21:28:58 dhellmann: no
21:29:01 The problem was *already* addressed on the oslo.rootwrap side in Juno, with a high-performance daemon mode which is actually even faster than shelling out to sudo.
21:29:10 But that change was not picked up by Neutron or Nova yet
21:29:25 Neutron has a Kilo spec for it
21:29:26 I'll fall on the grenade here.
21:29:31 sdague: but you had other suggestions ?
21:29:41 ttx: 2 things
21:29:46 ttx: or cinder https://review.openstack.org/#/c/149677/
21:29:52 So we did extensive performance testing of nova-network and nova-compute in the G cycle with sudo and rootwrap.
21:29:54 ttx: For Neutron, we plan to get rootwrap daemon mode in by Kilo-3.
21:29:56 It wasn't even close.
21:30:08 So we patched back in the ability to select sudo.
21:30:09 cburgess: not the daemon mode, I suspect
21:30:17 Nope not daemon mode. Wasn't an option yet.
21:30:35 So I will freely admit that our current practice is based on old performance data.
21:30:42 cburgess: no question the "normal" mode sucks. It was designed for a couple of calls in nova
21:30:56 Neutron calls it a few hundred times when creating a network
21:31:09 The daemon mode sounds interesting and is something I would like to play with. But until nova and neutron support it (the two heaviest users of rootwrap) I don't know that it should be the only option.
21:31:23 See perf impact at https://review.openstack.org/#/c/81798/
21:31:28 yeh, nova is going to land a patch to make sudo selectable
21:31:30 nova is holding off supporting daemon mode until it lands in neutron
21:31:47 the other issue is the rootwrap policies are full of some pretty big holes
21:31:55 * dhellmann hopes mestery doesn't say neutron is waiting for nova
21:32:02 sdague: it's landed
21:32:06 Right so the patch to allow sudo again was added (by me) because a recent setuptools change undid the hack in PBR to make rootwrap performant. This resulted in gate failures due to timeouts.
21:32:12 lol
21:32:14 No, we plan to land that in kilo-3
21:32:27 The work slipped from kilo-2, but I have high confidence it will land in kilo-3
21:32:31 you realize "running with sudo" is the same as running as root security-wise ?
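For reference, the daemon mode discussed here replaces one sudo + rootwrap process spawn per privileged call with a single long-lived root helper that the service talks to over a socket. The sketch below shows roughly how a service might drive it; the import path and the Client/execute signatures are recalled from the kilo-era oslo.rootwrap library and should be verified against its documentation, so treat this as an assumption-laden sketch rather than a reference implementation.

    # Sketch only: assumes oslo.rootwrap's daemon client API circa Kilo.
    from oslo_rootwrap import client

    # Start (or connect to) the privileged helper once, instead of shelling out
    # through sudo + rootwrap for every single command.
    ROOTWRAP_DAEMON = client.Client(
        ["sudo", "neutron-rootwrap-daemon", "/etc/neutron/rootwrap.conf"])


    def execute_as_root(cmd):
        """Run one filtered command via the long-lived rootwrap daemon."""
        returncode, stdout, stderr = ROOTWRAP_DAEMON.execute(cmd)
        if returncode != 0:
            raise RuntimeError("command %s failed: %s" % (cmd, stderr))
        return stdout

    # The kind of call Neutron issues hundreds of times while wiring a network:
    # execute_as_root(["ip", "netns", "list"])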
21:32:33 mestery: early enough that nova is likely to be able to land it, too?
21:32:34 nova-compute's policies actually make 0 sense because there are at least half a dozen own-the-box holes in it
21:32:47 since you grant the user the ability to run anything under sudo
21:33:04 dhellmann: I'm not sure about that. The services split caused the author a slight headache I think, but we have a plan now on it.
21:33:05 sdague: is that an issue with rootwrap, or with the policy definitions?
21:33:10 ttx: Yup I'm aware.
21:33:17 with policy definitions
21:33:24 sdague: like, do you see it as fundamental?
21:33:34 dhellmann: it's an issue with the assumption that some of these services can be priv separated
21:33:37 almost everything uses CommandFilter which sucks
21:33:43 * Rockyg I thought ttx was going to say running with sudo was like running with scissors
21:33:45 But as sdague pointed out you can do that anyways with the current policies. It's a risk we take on and have to be sure we are aware of and audit.
21:34:31 cburgess: not for all the nodes though. I agree that the network and compute nodes have policies that basically let you do anything, because nobody takes the time to fix them
21:34:39 sdague: because they need to do so many things as root?
21:34:42 but an api or a metadata node is pretty secure
21:34:43 dhellmann: yes
21:34:43 ttx: Agreed and we only patch it back in for nova.
21:35:10 ttx: We don't use neutron yet, but I suspect we would for that as well, though I will need to benchmark neutron and the new daemon mode in kilo.
21:35:10 I wish that instead of patching it out people would spend more time fixing the policies or adopting the daemon mode
21:35:24 ok, well, yeah, if there are real cases of that we should figure out if we can just run a root-level service with a small application-specific api of some sort
21:35:39 cburgess: IIRC, you can't use it with nova yet unless you make nova's rootwrap support daemon mode
21:35:47 dansmith: Right
21:35:57 sdague: but when I suggested neutron do that instead of building this general purpose thing into rootwrap they said that would be too hard
21:36:08 the surface area may be different for nova's needs
21:36:11 fixing the policy definitions and supporting daemon mode are one thing,
21:36:14 So seems like we should get nova using daemon mode then do a 3-way performance analysis. If daemon mode is as good as it looks remove the sudo option again.
21:36:29 but I think that precluding support for sudo in the tree is not a necessary policy
21:36:35 dansmith: ++
21:37:04 dansmith: daemon mode and policy fixes are separate things.
21:37:06 on the other hand, then we have to ask which mode we'd test with
21:37:17 Also this feels like a nova and maybe neutron issue and not something that has to impact all projects.
21:37:21 cburgess: the two are required to demonstrate any advantage to using rootwrap, was my point
21:37:35 dansmith: Agreed
21:37:46 dansmith: it was done at a time when performance was not so much an issue, and we hoped people would just write better filter definitions
21:37:47 dhellmann: we can test with rootwrap if we want, that's fine.. I don't think there is much risk there for the more relaxed case breaking
21:37:55 ttx: yeah, I understand
21:38:09 the sad truth is, people don't care that much about privilege separation
21:38:28 neutron (which can still use sudo) is most often run with it
21:38:43 it's also worth considering the risk profile of letting these services do things locally as root. in situations where they're the only thing running on the machine, it may not actually be buying you much to prevent giving them control over all the other things that system isn't actually doing
21:38:44 ttx, people may not care but that is because there hasn't been a reason to care (yet, thankfully)
21:38:45 well, and there are people that don't want to solve the problem that way anyway
21:39:00 we spend (too much) time getting our SELinux policies super tight
21:39:02 well it only helps if you *really* know you've got policy that doesn't let you get out
21:39:17 dansmith, sdague: so the patch is not patching out, it's just allowing the use of sudo again ?
21:39:18 fungi: exactly
21:39:23 but with the isolation fungi just described (many people run that way) it is less important to have the isolation
21:39:26 ttx: correct
21:39:33 ttx: yes
21:39:34 sdague: i'm fine with that
21:39:35 ttx: Yes it's an option, uses rootwrap by default.
21:39:44 matches what Neutron has
21:39:48 what rootwrap is mostly buying is safety from someone coercing a service into doing things it wasn't intended to do, but preventing root access is a poor proxy for that sort of protection
21:40:07 this is a nice middle-ground fix: allow sudo where people want to use sudo
21:40:09 sdague, ttx, dansmith : should we push that up into the oslo.rootwrap code itself, so applications only have to deal with one api?
21:40:13 fungi: right, agreed. which is why I wanted to raise it as an issue
21:40:22 rootwrap doesn't enforce being the only option...
21:40:30 dhellmann: that would be a library option to ... not do anything? :)
21:40:34 ttx but nova not supporting anything else does.
21:40:37 so Nova can definitely support both options
21:40:43 ttx, exactly
21:40:47 My work here is done.
21:40:51 it's nova-side code anyway
21:40:52 dansmith: a library option to use a function that calls sudo instead of our wrapper, so nova doesn't have to care at all
21:41:00 because it seems like we're doing a lot of work here, to provide options, but as no one is really auditing the policies effectively, it seems.... not effective
21:41:12 sdague: ++
21:41:29 dansmith: and also instead of nova and neutron both having their own version of that option, and wrapper code
21:41:30 it's like belt and suspenders, but no pants
21:41:39 dhellmann: well, my point is, it seems weird to have the library have a short-circuit option, because if I want to use sudo I could just snip that import line from nova and be good, instead of having to install the library for no reason, right?
21:41:42 It's also worth noting that in many cases, the root call is not even necessary
21:41:50 it's just convenient
21:42:04 sdague: I once brought this up with one of the 'security' groups in OpenStack and they just shrugged and didn't want to audit
21:42:05 dansmith: it makes rootwrap become "run this as root somehow" instead of "use your wrapper to run this"
21:42:17 I'd really like to be able to say "use sudo" and not even have the library imported as native support in the tree
21:42:25 things like selinux and apparmor (and even just careful application of extended attributes and cgroups-imposed namespacing) are probably closer to where the real safety net should be, but that's a lot more complicated and varying cross-platform
21:42:30 dansmith: why make every application author implement that?
21:42:42 dhellmann: I suppose, just seems overkill, but meh
21:42:43 jogo, "containers or isolation of concerns mostly mitigates the issue" i mean - to be fair if someone compromised neutron they could still do all neutron could do w/ rootwrap.
21:42:44 sdague: you mentioned capabilities too
21:42:52 which isn't a small amount of breaking things.
21:42:54 fungi: bingo, that seems like where the effort should be spent to me, honestly
21:42:58 * dhellmann takes off his Killing Code Duplication hat
21:42:58 fungi: yep
21:43:10 sdague: I think it's not completely crazy to grant the neutron user rights over networky stuff
21:43:27 and shave a few thousand rootwrap calls
21:43:31 ttx: I was told neutron was moving to a model like that at the nova midcycle, maybe mestery can speak up
21:43:36 to me it's a complentary approach
21:43:40 except that networky stuff can often be parlayed into full root access to the system anyway
21:43:43 complementary*
21:43:54 there's at least a review in progress to get selinux in triploe - https://review.openstack.org/#/c/108168/
21:43:57 tripleo
21:44:01 fungi: you stole my bullet :)
21:44:08 sdague: To be honest, we're hoping to get rootwrap in and I'm hoping we can discuss these aspects at the summit with the broader community
21:44:09 dansmith: meh. Work on more than one project for a while.
21:44:18 As someone said, we shouldn't solve this one way in neutron and another in nova
21:44:23 definitely one of those classic security arguments where the perfect can be the enemy of the good, but not sure we've really got candidates for either perfect or good
21:44:42 fungi: ++
21:45:13 fungi, +∞
21:45:39 the main thing i see rootwrap doing is letting some deployers feel like they have control over what a service can or can't do, but it's the "false sense of" variety of security
21:46:08 I would pursue all options. Allow nova to run sudo directly for the less security-conscious. Enable rootwrap daemon mode for those who want a decent tradeoff. Add capabilities to shave a thousand rootwrap calls. Improve the rootwrap filter definitions so that they actually filter
21:46:18 fungi: yeh, that's a big concern for me
21:46:20 and i have doubts that many deployers/operators are fine-tuning the default supplied rootwrap rules to begin with
21:46:20 I mean - https://github.com/openstack/nova/blob/master/etc/nova/rootwrap.d/compute.filters#L39
21:46:24 ttx: +1
21:47:11 sdague: I've removed that one at least once in the past
21:47:19 it just keeps on reappearing
21:47:35 Dear god that line scares me
21:47:35 needs a hacking check
21:47:44 well there is also dd, cp in there
21:47:45 cburgess, haha
21:47:50 and chown
21:47:58 Yeah
21:48:04 Wow I just really started looking at that file.
21:48:05 yeh, like I said, it's kind of pointless
21:48:07 there seems to be a reviewer education problem
21:48:10 I'm going to go back to my safe pretend world now.
21:48:28 the nova-compute node is running as root basically
21:48:33 ttx: right
21:48:37 ttx, shhhh
21:48:41 ttx, you'll scare someone
21:48:45 sdague: the qemu-nbd one is probably the worst in there
21:48:47 rootwrap just gives you a framework to fix that, it's not a work-free bullet
21:48:49 :P
21:48:55 sdague: export /dev/sda as an nbd device.. game over :)
21:49:08 cburgess, what has been seen cannot be unseen. though yelling lalalalala helps sometimes.
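To make the CommandFilter complaint concrete: a rootwrap filters file maps a name to a filter class, an executable, and the user to run it as. A bare CommandFilter places no constraint on arguments, so an entry like the "broad" one below effectively hands the calling service root for that binary; classes such as RegExpFilter can at least pin down the argument list. Both entries are invented examples for illustration, not lines taken from nova's actual compute.filters.

    [Filters]
    # Broad: lets the service run chown as root with *any* arguments, which by
    # itself is enough to take over the box (chown the filter file, sudoers, ...).
    chown: CommandFilter, chown, root

    # Tighter (illustrative pattern only): same binary, but every argument must
    # match a regular expression, confining it to nova's own state directory.
    chown_nova_state: RegExpFilter, chown, root, chown, nova:nova, /var/lib/nova/.*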
21:49:08 yeh, there are so many game overs
21:49:26 which is why I'm not sure this approach works
21:49:56 sdague: buggy filters make it just as insecure as the option of using sudo directly
21:50:13 ttx, might make it worse. because it provides a false sense of security
21:50:32 morganfainberg: true
21:50:49 but most people think "running under a non-root user" makes them safe too
21:51:24 The main issue to me is.. even when we fix them (I think I did it twice over the years) people keep on re-breaking them
21:51:24 did i already miss any discussion on bug 967832 ?
21:51:27 difficult to gate on it
21:51:29 anyway, it seems like on the compute nodes we should just admit it's basically game over, not sugar-coat it at this level. Do it at the selinux/apparmor level
21:51:32 mriedem: no
21:51:50 sdague: or fix the filters
21:51:59 honestly,
21:52:05 there are two valuable things on a compute node
21:52:11 the images with other people's data, and the mq credentials
21:52:21 if you get the latter, you can do anything you want to the entire deployment
21:52:28 both of those things are readable by nova's user anyway
21:52:57 sdague, i have opinions on selinux and apparmor, but putting the enforcement at that level seems waaaaay more sane [regardless of which] than rootwrap
21:52:57 yep
21:53:03 if you're running your company's imap server alongside nova-compute, you're in for more trouble anyway
21:53:12 !
21:53:28 from a vmt perspective, having a handle on this and publicizing that it "does not do the things you think" for securing a deployment would help reduce what could otherwise become needless vulnerability reports
21:53:28 We have one more topic to cover
21:53:32 dansmith, the mq-creds should be guarded with signed messages..
21:53:36 So I shouldn't be running our payroll processing and credit card processor on the same server as my openstack cloud?
21:53:37 dansmith, but that's another story
21:53:41 morganfainberg: should be but they're not
21:53:52 dansmith, let's plan to circle back on that at the L summit?
21:53:56 dansmith, i want to see that gap closed.
21:54:00 cburgess: you should run them in containers on the same machine
21:54:06 morganfainberg +1
21:54:07 dansmith, will bug you on that front a bit later on this cycle?
21:54:10 morganfainberg: it's been on the list for a long time, but.. yeah :)
21:54:12 cburgess, you too.
21:54:18 ttx: Yeah I was joking and we are moving to containerizing everything.
21:54:37 because containers never have exploits... :)
21:54:41 cburgess: because that is safe (joking too)
21:54:42 sdague, ever.
21:54:48 sdague, EVAR
21:54:55 OK, let's cover the last topic quick
21:54:59 #topic Bug 967832 (mriedem)
21:55:00 ttx: I'm terrible at sarcasm. morganfainberg can verify this
21:55:08 and I'm French
21:55:11 #link https://bugs.launchpad.net/nova/+bug/967832
21:55:14 #link http://lists.openstack.org/pipermail/openstack-dev/2015-February/055801.html
21:55:15 ttx, he's worse at it :P
21:55:20 mriedem: you have 5 min :)
21:55:20 LOL
21:55:30 so morganfainberg already commented on this in the ML,
21:55:38 basically this is a bug across projects since grizzly,
21:55:52 looks like there was an approved spec in juno in neutron that didn't land code and wasn't re-proposed for kilo,
21:56:11 i actually like the concept of putting this into middleware that can receive registered "do X on event Y"
21:56:16 wondering if everyone is in agreement about fixing this thing and then looking for best solutions since it's going to be probably a similar impl in the various projects
21:56:22 i would be 100% behind that.
21:56:41 mriedem: From an operator perspective I can tell you that this bug sucks, badly.
21:56:51 so nova tells middleware to call nova delete on an instance when a project that instance is running under is gone?
21:56:53 it might make tempest devs a lot happier too
21:56:56 mriedem, yep.
21:57:05 mriedem: yeah, callbacks
21:57:06 mriedem or spool it for a periodic
21:57:14 fungi: yes it would, it makes the cleanup story much easier
21:57:20 but...does nova need the tenant to exist for the delete to happen, or does middleware process the callback before deleting the tenant?
21:57:24 mriedem up to nova to figure out the best way to not crater itself
21:57:24 mriedem: middleware in what project? nova or keystone?
21:57:34 I think you need a bit of both
21:57:34 notmyname, i'd put it as a package in keystonemiddleware
21:57:35 notmyname: keystone
21:57:37 you want events for latency
21:57:38 but not in auth_token
21:57:56 notmyname, s/package/module
21:57:59 thanks
21:58:04 I don't think you would need the tenant to exist.
21:58:09 and you want either a lossless history or a diff-style check to deal with servers that were down etc
21:58:11 did anyone look at the WIPs from neutron in juno? https://review.openstack.org/#/q/status:abandoned+project:openstack/neutron+branch:master+topic:bp/tenant-delete,n,z
21:58:13 Each project would only clean up its own local resources.
21:58:16 mestery: do you remember those? ^
21:58:20 cburgess: agreed, making sure the tenant exists makes this a lot more complex
21:58:24 The problem we have is cross-project relationships. Like volumes attached to VMs.
21:58:37 But maybe we don't care.
21:58:49 middleware sounds risky since (1) it requires coordination for things that can't be coordinated (eg what happens if an instance fails to delete?) and (2) it will _really_ slow down the delete call to keystone
21:58:51 Maybe we just take destructive do-it-anyway actions because it's all being nuked.
21:59:12 notmyname, it's a callback - i assume nova would spool it for a delete job that is periodic
21:59:16 are there any other examples where a server needs to keep track of state in another server?
21:59:16 does multi-tenancy muck this up at all
21:59:17 not a 'do it right now'
21:59:17 ?
21:59:24 mriedem, not really.
21:59:25 can i have volumes attached to my instance from another project?
21:59:33 an alternate solution would be a background keystone scrubber that looks at what's been deleted and sends the related commands to different services (projects)
21:59:35 mriedem, god i hope not
21:59:54 notmyname: ++
21:59:55 notmyname: yeah, but how long does that keep casting?
21:59:56 mriedem: I don't think so, that sounds like a recipe for disaster
21:59:57 notmyname, sure - i'd just say whatever would be a listener that takes a callback
22:00:14 morganfainberg: I'm still not clear on what you mean by "callback"
22:00:22 notmyname, so we publish the framework, you register a function to do something
22:00:25 notmyname: or the projects could do the checking
22:00:28 and I'm considering what happens when someone is deleted from keystone and has a PB of data in swift
22:00:34 notmyname: you register a function for the keystone janitor to call upon tenant deletion
22:00:38 notmyname: making keystone not have to know what's dependent on it
22:00:57 notmyname: which would be better I think
22:00:57 keystone_middleware.on_project_delete(function_to_call)
22:00:58 Out of time -- Looks like that discussion can continue on the thread? We can cover it at next week's meeting if necessary ?
22:01:08 ttx ++
22:01:10 sure
22:01:11 lifeless: but keystone does know. that's the whole point of keystone. it knows everything
22:01:26 Feel free to continue to argue on #openstack-dev though
22:01:27 notmyname, no keystone doesn't know what depends on it.
22:01:30 anyway
22:01:32 we just need to vacate the channel
22:01:34 notmyname: it doesn't know how much data a user has in swift.
22:01:39 No time for open discussion
22:01:44 I'll just paste the link to the 1:1 syncs we had today, focused on the kilo-2 tag
22:01:50 #link http://eavesdrop.openstack.org/meetings/ptl_sync/2015/ptl_sync.2015-02-03-09.06.html
22:02:03 #endmeeting
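The keystone_middleware.on_project_delete(function_to_call) line pasted above is the shape of a proposed API, not something that existed at the time of the meeting. A hedged sketch of the idea, with every module and function name hypothetical, might look like the following: a registration hook kept by a listener, fed by keystone's identity.project.deleted notifications, with each service spooling its own cleanup rather than acting inline.

    # Hypothetical sketch of the "register a callback for project deletion" idea
    # floated in the meeting; none of these names are a shipped API.
    _project_delete_callbacks = []


    def on_project_delete(callback):
        """Register a function to be called with the deleted project's ID."""
        _project_delete_callbacks.append(callback)


    def handle_identity_event(event_type, payload):
        """Called by a notification listener when keystone emits an event."""
        if event_type == 'identity.project.deleted':
            for callback in _project_delete_callbacks:
                # Spool rather than act inline, so a slow consumer (e.g. nova
                # deleting instances) cannot block or fail the keystone call.
                callback(payload.get('resource_info'))


    # A service such as nova would register its own janitor:
    def queue_instance_cleanup(project_id):
        print("queueing cleanup of instances owned by project %s" % project_id)


    on_project_delete(queue_instance_cleanup)
    # Simulated event, for illustration only:
    handle_identity_event('identity.project.deleted', {'resource_info': 'abc123'})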