15:00:30 #startmeeting heat
15:00:32 Meeting started Wed Jun 22 15:00:30 2016 UTC and is due to finish in 60 minutes. The chair is therve. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:36 The meeting name has been set to 'heat'
15:00:41 #topic Roll call
15:00:46 o/
15:00:46 o/
15:01:11 o/
15:01:21 hi
15:01:58 hola
15:02:38 o/
15:03:11 #topic Adding items to agenda
15:03:21 #link https://wiki.openstack.org/wiki/Meetings/HeatAgenda#Agenda_.282016-06-22_1500_UTC.29
15:03:30 hi
15:04:29 #topic DB errors blocking gate
15:04:41 #link https://bugs.launchpad.net/heat/+bug/1546431
15:04:42 Launchpad bug 1546431 in heat "Tests fails with SQL error 'Command Out of Sync'" [High,Triaged]
15:04:51 #link https://bugs.launchpad.net/heat/+bug/1483670
15:04:52 Launchpad bug 1483670 in heat "ResourceClosedError from DB in functional tests" [High,Triaged]
15:04:59 zaneb, You added this one, no?
15:05:07 I think so, yes
15:05:22 we are hitting this *all* the time
15:05:45 Yeah, the rate has been pretty bad recently
15:06:02 Presumably the 2 new builders don't improve things
15:06:11 yes, that's not helping
15:06:25 but we need to fix it
15:06:30 I looked at this issue; AFAICT it's that we behave badly when we kill greenlets
15:06:31 cwolferh: about?
15:06:41 yes
15:06:54 The oslo.db work that stevebaker is working on may improve things
15:07:03 But that's as much as I know
15:07:37 what we do to kill greenlets is kinda uncool, but we should be able to do it without breaking the db
15:08:06 my theory is that every transaction should be using a with ...: block
15:08:22 so that if *any* exception occurs then it gets rolled back
15:08:45 zaneb, https://review.openstack.org/#/c/330800/ ought to do that
15:08:52 but that there must be places where we are not, and so we switch threads in the middle of a transaction
15:08:55 Using a decorator, but that's the same solution
15:09:55 that's good enough for me :)
15:10:11 do we think that merging that will fix the problem then?
15:10:36 I hope so, but not sure
15:10:40 that looks like a good next thing to try to me :-)
15:10:55 FWIW https://bugs.launchpad.net/heat/+bug/1499669 has a good reproducer for this kind of issue
15:10:55 Launchpad bug 1499669 in heat "Heat stucks in DELETE_IN_PROGRESS for some input data" [Medium,Triaged]
15:11:06 ok, then I'm a happy camper :)
15:11:22 zaneb, You mean reviewer? :)
15:11:44 There is a 10+ patch series, so we need to get onto that
15:11:50 therve: oh *that's* the bug I was looking for the other day
15:11:56 I knew I commented on it :D
15:12:14 If you need to search Launchpad, talk to me first :)
15:12:37 any reason we can't accelerate it to the top of the patch queue? I guess that's a question for stevebaker
15:12:57 therve: aha, you have a secret weapon
15:13:15 he just blasted out a bunch of patches yesterday, they should be at the top
15:13:27 Another stuck DELETE_IN_PROGRESS reproducer specific to convergence is https://review.openstack.org/#/c/329460/
15:13:45 https://bugs.launchpad.net/heat/+bug/1592374
15:13:45 Launchpad bug 1592374 in heat "deleting in_progress stack with nested stacks fails with convergence enabled" [High,Confirmed]
15:14:03 If anyone has time to look into that it'd be good - the functional test appears to reproduce the issue
15:14:05 jdob: I can never remember what order the 'related changes' are in the new gerrit
15:14:16 dude, glad i'm not alone there
15:14:50 BRING BACK THE OLD GERRIT!
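(To make the transaction-scoping idea from the gate discussion concrete, here is a minimal sketch assuming plain SQLAlchemy sessions. The helper name writer_session is illustrative only, not Heat's or oslo.db's actual API; the real fix under review is the decorator in https://review.openstack.org/#/c/330800/.)

```python
# Minimal sketch (not Heat's actual code) of the "every transaction in a
# with block" idea: commit on success, roll back on *any* exception, so a
# killed greenlet can't leave a half-open transaction behind.
import contextlib

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite://")          # stand-in for the real DB URL
Session = sessionmaker(bind=engine)


@contextlib.contextmanager
def writer_session():
    """Hypothetical helper: yield a session whose transaction is always
    closed cleanly, whatever happens inside the block."""
    session = Session()
    try:
        yield session
        session.commit()
    except BaseException:
        # BaseException, not Exception, so the GreenletExit raised when a
        # greenthread is killed also triggers the rollback.
        session.rollback()
        raise
    finally:
        session.close()

# Usage: wrap every DB write, e.g.
#   with writer_session() as session:
#       session.execute(...)
# A decorator, as in https://review.openstack.org/#/c/330800/, is the same
# idea expressed per-function.
```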
15:14:54 ahem, sorry.
15:14:57 :)
15:15:11 It's definitely at the bottom of the queue for a reason
15:15:22 Needs some API change to make it work correctly
15:15:37 dang
15:16:00 shardy, Yeah, the convergence part is definitely something different from what we were talking about
15:16:22 shardy, Wondering if https://review.openstack.org/#/c/279520/ is related
15:16:25 Is the ResourceClosedError 100% reproducible with the steps in bug #1499669?
15:16:25 bug 1499669 in heat "Heat stucks in DELETE_IN_PROGRESS for some input data" [Medium,Triaged] https://launchpad.net/bugs/1499669
15:16:26 therve: ack, I just wanted to mention it because I've not got time to work on fixing it myself this week
15:16:46 * shardy travelling/meetings
15:16:57 ramishra, I had to increase the resource count for my env, but yeah, after that
15:17:13 shardy: speaking of no time, I put my name on the env changes spec and started on it
15:17:18 We could probably turn that into a (skipped) test
15:17:23 jdob: yup, I saw that, thanks! :)
15:17:43 shardy: I had a look at it today, seems like there is no way to cancel a CREATE_IN_PROGRESS to acquire/steal the resource lock to DELETE.
15:18:18 ramishra: Ok, thanks for the analysis - that sounds, erm, not good ;)
15:18:32 ramishra, https://review.openstack.org/#/c/301483/ isn't this patch queue about that?
15:18:36 ramishra: it should time out eventually though, right?
15:18:37 * shardy wonders how many times we'll fix cancelling in-progress things
15:18:41 and then the delete will run?
15:18:47 zaneb: yes
15:18:51 shardy: fixing it again right now :)
15:19:14 and by 'again' I mean 'for the first time', since it hasn't worked since Kilo
15:19:14 zaneb: On my tripleo test env it times out after 4 hours ;)
15:19:15 * jdob working on a new cancel bug in reaction to zaneb's fix
15:19:48 * zaneb shakes fist at jdob
15:20:21 I think the merge rate has been slightly better this week than last
15:20:33 We probably depend on the state of the hardware a bit somehow
15:20:48 a ton of my stuff landed last night, i was super happy coming into work this morning
15:20:49 Still, let's try to focus on that, and avoid blank rechecks :)
15:20:59 #action stevebaker merge https://review.openstack.org/#/c/330800/ asap
15:21:23 Anything else on that topic?
15:21:30 nope, let's move on
15:21:45 #topic Restoring bp ironic resources implementation
15:22:22 cat ----> pigeons
15:22:25 OK. I have nothing against that, let's do it.
15:22:39 Unless ironic people really yell at us again
15:22:45 +1 on this, although I'm not planning on reviving the patches I started myself
15:23:35 A possible alternative though is a mistral workflow
15:23:37 https://review.openstack.org/#/c/313048/
15:23:45 I started that, although I've not yet got it working
15:24:01 if we can make that work, there's no need for a bespoke heat plugin
15:24:05 I question the sanity of anyone using this, but I (still) see no harm, especially now that we can block admin-only resource types from general view
15:24:17 prazumovsky added that but isn't around
15:24:40 er s/anyone/anything/
15:25:20 #action prazumovsky restore Ironic resources
15:25:24 zaneb: I think just a plugin representing the Ironic node isn't enough - there's a workflow around networking we need so that it emulates what Nova does
15:25:41 that could be in a heat plugin, but it's kinda workflow-ish
15:25:55 shardy, Shouldn't you use nova in that case?
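(A rough sketch of the cancellation gap ramishra describes above: without a cooperative cancel check, an in-progress create only gives up its resource lock when the stack times out. Everything below is illustrative, not Heat's actual resource code; check_create_complete and cancel_event are hypothetical names.)

```python
# Illustrative only: a long-running create that polls for completion but also
# watches a cancel flag, so a DELETE can take over the resource lock promptly
# instead of waiting hours for the stack timeout.
import threading
import time


class CreateCancelled(Exception):
    """Raised when the create gives up so the lock can be released."""


def create_with_cancel(check_create_complete, cancel_event, poll_interval=1.0):
    # check_create_complete: callable returning True once the resource is ready
    # cancel_event: threading.Event set by whoever wants to delete the resource
    while not check_create_complete():
        if cancel_event.is_set():
            raise CreateCancelled("create cancelled, releasing resource lock")
        time.sleep(poll_interval)


# Usage sketch:
#   cancel_event = threading.Event()
#   ... the delete handler calls cancel_event.set() ...
# and the create loop exits on its next poll instead of at the stack timeout.
```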
15:26:29 therve: Nova is huge overkill when you don't care about multiple tenants, flavors or scheduling
15:26:31 shardy: yeah, that's one reason I question why anything would want to use this
15:27:28 #topic Legacy to convergence migration
15:27:41 #link https://review.openstack.org/#/c/232277/
15:27:51 ochuprykov, You added this one?
15:27:55 yep
15:28:09 i think it can be useful now
15:28:19 after we switch to convergence
15:28:34 So that's a new heat-manage command
15:28:41 yes
15:28:55 implementation here https://review.openstack.org/#/c/280836/
15:29:10 but i've hit one problem
15:29:45 in order to do this migration online (and it was intended to be online) i need to lock the stacks i want to migrate
15:30:10 but such a lock can be successfully stolen by a working engine
15:30:27 Yep
15:30:43 That's a tough one
15:30:53 oh, because it's done by the heat-manage command and not an engine :/
15:31:04 i know this :)
15:31:08 zaneb, Why does it make a difference?
15:31:18 I can steal from another engine, no?
15:31:25 therve: the engine id is stored in the lock
15:31:36 no, if this engine is live
15:31:40 alive*
15:31:43 so if the engine holding the lock is running, nothing will steal it
15:31:57 Ah
15:32:02 ochuprykov, Use the RPC API then
15:32:05 what ochuprykov said
15:32:57 but probably we can choose option 2 from the alternatives section
15:33:08 We could automatically convert stacks on the next stack-update.
15:33:37 which is better?
15:33:43 ochuprykov: then we can never deprecate the legacy path
15:33:52 +1 for using the RPC API
15:34:09 ok, i will try this variant
15:34:22 so, i think i don't need to change the spec
15:34:32 zaneb, Operators could trigger a stack update?
15:34:52 You need to be in the tenant though...
15:34:55 therve: mmmmmhmmm I guess
15:35:15 still +1 on the RPC :D
15:35:21 Yeah, I would try that :)
15:35:34 ochuprykov, Makes sense?
15:35:56 therve: yep, i will try to do it via rpc
15:35:58 ochuprykov: I +2'd your spec though!
15:36:02 Cool
15:36:05 so there's that
15:36:08 #topic Open discussion
15:36:40 prazumovsky, http://eavesdrop.openstack.org/meetings/heat/2016/heat.2016-06-22-15.00.log.txt we talked about the ironic resources
15:36:46 I'd appreciate wider feedback re https://review.openstack.org/#/c/327149/
15:36:50 hi, sorry for being late, I read the logs and now I understand that before discussing ironic resources I need a deeper understanding of the problem description.
15:37:01 two specs I'd appreciate eyes on: https://review.openstack.org/328822 and https://review.openstack.org/330414
15:37:22 basically -f yaml in oscplugin doesn't output yaml - jdob provided feedback, I'd like to get some consensus
15:38:54 shardy, I think I'm with you on that one
15:39:00 shardy: i'll look again, I kinda forget what I was arguing
15:39:06 No point if -f yaml doesn't return yaml
15:39:28 agree
15:39:34 prazumovsky, Cool, let us know. Ironic is tricky :)
15:41:22 Some confusion about why implementing resource plugins is not enough, but I'll keep investigating the issue.
15:43:46 OK, anything else?
15:43:56 3
15:44:03 2
15:44:11 1
15:44:15 #endmeeting heat
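(For reference, a toy model of the stack-lock behaviour discussed in the migration topic: the engine id is stored with the lock, and a steal only succeeds when the holding engine is no longer alive, which is why an external heat-manage process has to go through the engines' RPC API instead. Class and function names below are illustrative, not Heat's real ones.)

```python
# Toy model (not Heat's actual StackLock) of the locking rule described above:
# the lock records the holder's engine id, and stealing is only allowed when
# that engine is dead.


class LockConflict(Exception):
    pass


class ToyStackLock:
    def __init__(self, engine_alive):
        # engine_alive(engine_id) -> bool; in Heat this kind of liveness check
        # happens over RPC, here it is just an injected callable.
        self._engine_alive = engine_alive
        self._holder = None  # engine id currently holding the lock

    def acquire(self, engine_id):
        if self._holder is None:
            self._holder = engine_id
            return
        if self._engine_alive(self._holder):
            # A live engine holds the lock: an external process such as
            # heat-manage cannot steal it.
            raise LockConflict("stack locked by live engine %s" % self._holder)
        # Holder is dead: steal the lock.
        self._holder = engine_id

    def release(self, engine_id):
        if self._holder == engine_id:
            self._holder = None


# lock = ToyStackLock(engine_alive=lambda eid: True)
# lock.acquire("engine-1")    # succeeds
# lock.acquire("manage-cli")  # raises LockConflict: the holder is still alive
```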