16:00:58 #startmeeting Mistral
16:00:59 Meeting started Mon Aug 1 16:00:58 2016 UTC and is due to finish in 60 minutes. The chair is rakhmerov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:01 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:03 The meeting name has been set to 'mistral'
16:01:22 o/
16:01:42 o/
16:01:45 hi
16:02:43 hi
16:03:00 sorry, I actually forgot to send out an email with agenda
16:04:01 I actually want to review action items that we had 2 weeks ago
16:04:08 #topic Review Action Items
16:04:58 1. rakhmerov: file BPs for individual work items of Custom Actions API
16:04:59 done
16:05:03 o/
16:05:16 2. rakhmerov: assign mistral-lib work to Ryan Brady
16:05:18 done
16:05:30 3. hparekh_: create a BP for YAQL functions API
16:05:34 hparekh: you here?
16:06:28 #action rakhmerov: find out with hparekh if "hparekh_: create a BP for YAQL functions API" is done
16:06:39 4. rbrady, jpeeler: start with initial proposal on security module for Actions API
16:06:44 o/
16:06:50 hi
16:06:56 rbrady, jpeeler: any updates for this?
16:07:23 not from me...
16:07:32 rakhmerov: yes. I'm currently working on it right now. the security and utils are connected
16:07:43 ok
16:07:46 good
16:07:55 #topic Current status
16:08:31 my status: I'm now fully dedicated to solving Mistral performance issues, made a series of patches during last week
16:08:39 some of them are merged, some are on review
16:09:07 the biggest challenge: I want to get rid of the pessimistic locking TX model altogether
16:09:10 mistral-lib: I'm tracking mistral-lib work here: https://etherpad.openstack.org/p/mistral-lib
16:09:18 I made sure that it doesn't work well after doing profiling
16:09:33 oh, ok
16:09:36 cool
16:09:56 my status: fixed some bugs regarding new RPC layer. Still one waiting to be fixed. Then I want to add some tempest testing for this new feature.
16:10:37 mistral-lib: status, porting over utils is helping me to flesh out security requirements. working on whether or not to port entire mistral.context over, or just create a security context that will become an attribute of mistral.context
16:11:22 ddeja: which one is waiting to be fixed?
16:12:28 rbrady: why do you have this dilemma? Can you tell about pros and cons for both approaches?
16:12:42 rakhmerov: I'm not sure if this is filed as a bug, but just from looking into the code this TODO must be addressed before the release or Newton mistral wouldn't work with anything but RabbitMQ https://github.com/openstack/mistral/blob/master/mistral/utils/rpc_utils.py#L31
16:14:09 ddeja: ooh, I see. Is it also true in case of using oslo messaging?
16:14:15 only rabbit for now?
16:14:39 yeah, rpc_utils are used also for oslo
16:14:58 rakhmerov: I'm evaluating pro/cons at the moment.
16:16:53 ok
16:16:59 that's fine
16:17:12 jpeeler: any updates from you?
16:17:30 rakhmerov: I can email the list later once I understand more about the auth hook and auth functions near the bottom of the mistral.context module
16:17:48 i have on my todo list to finish proposed changes for client caching to use cachetools, but it's not happening in the near future unfortunately
16:17:57 jpeeler: as far as your patch with caching, I'm not sure if you saw my comment made recently. You may wanna look at what I did with LRUCache in parser.py
16:18:07 it works perfectly, I tested it
16:18:22 yeah i commented. i think TTLCache is best to handle automatic expiration
16:18:24 jpeeler: ok
16:18:39 jpeeler: it depends on the use case I think
16:19:02 the good thing is that it can be changed any time, they all have the same interface
16:19:35 yep, pretty good find
16:19:54 jpeeler: ok, if you feel that you don't have time to finish it in Aug please let us know, we'll help
16:20:15 oh i can do it by then. just have a tripleo item working on, and then going to be out of town this week.
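The point made in the caching exchange above, that an LRU cache and a TTL cache can be swapped freely because they share one interface, can be sketched roughly as follows. This is a stdlib-only illustration, not cachetools and not the code in parser.py: the `LRUCache` and `TTLCache` classes here are simplified stand-ins, and `get_client` and `factory` are hypothetical names invented for the example.

```python
import time
from collections import OrderedDict


class LRUCache:
    """Simplified stand-in: evicts the least recently used entry."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._data = OrderedDict()

    def __getitem__(self, key):
        value = self._data[key]          # raises KeyError on a miss
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)  # drop the oldest entry


class TTLCache:
    """Simplified stand-in: entries expire ttl seconds after being set."""

    def __init__(self, ttl, timer=time.monotonic):
        self.ttl = ttl
        self._timer = timer
        self._data = {}

    def __getitem__(self, key):
        value, expires_at = self._data[key]  # raises KeyError on a miss
        if self._timer() >= expires_at:
            del self._data[key]
            raise KeyError(key)              # treat expired as a miss
        return value

    def __setitem__(self, key, value):
        self._data[key] = (value, self._timer() + self.ttl)


def get_client(cache, name, factory):
    """Hypothetical caller: works with either cache, unchanged."""
    try:
        return cache[name]
    except KeyError:
        client = factory(name)
        cache[name] = client
        return client
```

Because `get_client` only relies on item lookup and assignment, choosing between size-based eviction and automatic expiration becomes a one-line change where the cache is constructed.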
16:20:33 that's not a problem
16:20:34 sure
16:21:14 ok, I just want to have a topic for Actions API to discuss something quickly
16:21:20 #topic Actions API progress
16:21:44 rbrady: so, do you think we have some roadblocks at this point?
16:21:52 some serious issues?
16:22:02 or it is going smoothly?
16:22:32 rakhmerov: I think it is going smooth enough so far.
16:23:09 rakhmerov: I'm trying to be a bit cautious about what gets moved over or is created, thinking about its impact to the primary mistral repo
16:23:39 ok, I see
16:23:55 then there's nothing to discuss for now? :)
16:24:30 rakhmerov: the most pressing issue for me right now is the utils/security. I think I will have more questions when I get to the execution parts
16:24:31 rbrady: please use ML and IRC if something occurs
16:24:42 yeah, I see
16:25:01 yeah, I mean please don't wait for a next weekly meeting to discuss something
16:25:11 rakhmerov: ack, will use ML
16:25:12 we can do it in other ways too
16:25:17 thanks
16:25:31 ok, I'm glad it is progressing
16:26:07 then if there's nothing from your side on this let's move to Open Discussion
16:26:16 rakhmerov: wrt to executing actions within mistral vs creating an instance of an action class and calling run() to get the results, what is different?
16:26:43 pardon me?
16:27:15 not sure I understood your question
16:28:04 rakhmerov: in the custom-actions-api spec in the mistral.actions.api.utils section, it talks about "Return type for these actions though must be rather a wrapper that doesn't just call Action.run() method but instead uses Mistral action execution machinery to actually call action just like as if it was called as part of workflow "
16:28:08 what do you mean by "executing actions within mistral"?
16:28:36 ooh, I see
16:29:33 hm.. let me think
16:29:44 rakhmerov: in tripleo right now I have an action that calls super(DeployStackAction, self).run() to get some data and I'm just curious if there is another approach I should be taking
16:29:50 by saying that I meant that just calling Action.run() may not be enough
16:30:04 yeah, so
16:30:07 I was thinking this might be helpful for me to understand before I get to the execution module
16:30:17 sure, ok
16:30:20 thinking..
16:31:02 so basically, my thought was that when we execute actions normally we do some stuff before actually calling Action.run()
16:31:04 right?
16:31:23 like routing an action to a correct executor
16:31:39 using "target" task property, for example
16:31:45 prepare context
16:32:14 that's why I assumed that it might be a little more on top of just calling Action.run()
16:32:28 but maybe it's not really true
16:32:33 hm..
16:32:53 basically, security context will be calculated in a different way now
16:33:06 w/o "context" param being prepared in advance
16:33:17 so minus one concern..
16:33:53 as far as routing, maybe we don't really need it because we assume that we are already on some right executor
16:34:01 rbrady: what do you think about it?
16:34:27 rbrady: and do you see my initial point?
16:34:37 I'm not saying now it is correct though..
16:35:03 rakhmerov: not sure I'm following yet...reading again
16:35:13 ok
16:36:41 rakhmerov: are you saying that a workflow might split tasks across more than one executor and the information needed in the security context might not be valid across executors?
16:37:13 yeah, sort of
16:37:36 rbrady: but you know, I'm now not really seeing reasons for this
16:38:03 anyway, please think about it more. Maybe I overcomplicated it in the spec actually :)
16:38:22 (my passion to make things too abstract maybe :) )
16:38:24 rakhmerov: do we have an HA CI job or plans to have one in the future?
16:38:39 absolutely yes
16:38:51 I would say, even in the near future
16:39:29 my plan is to fix the most apparent performance issues that I identified and then move to creating an HA gate and writing more Rally tests
16:39:49 so that we make sure we're not regressing moving forward
16:39:51 rakhmerov: ack
16:39:52 with performance
16:40:18 just for now it's more important to actually fix those issues
16:41:13 btw, I'm not sure if you all are aware of it so I'll repeat: Mistral is now integrated with osprofiler tool
16:41:30 which allows profiling of all needed Mistral components
16:41:47 I started using it actively and was able to see some huge issues immediately
16:42:12 ddeja: you may want to use it, for example, when testing your RPC impl
16:42:23 it is a very handy tool
16:42:49 rakhmerov: ACK ;)
16:42:49 the only issue is that it's not yet documented in our docs how to actually use it
16:43:00 it took me a couple of hours to understand :)
16:43:08 but you can ask me :)
16:43:10 ok
16:43:18 #topic Open Discussion
16:43:39 I don't have anything else from my side
16:43:59 I'm fully focused on performance tasks..
16:44:16 making needed preparation steps for getting rid of pessimistic locks
16:44:37 I was talking today with jtomasek and he had problems with using action API
16:44:51 what kind of problems?
16:45:02 jtomasek: I'm not sure if you are here and can elaborate a little...
16:45:07 yeah
16:45:13 great :)
16:45:27 so I am dealing with the problem of error handling the action-executions api calls
16:45:29 I saw some messages in our IRC channel but didn't have time to read them all
16:45:57 jtomasek: example?
16:46:12 regardless of whether the action fails or not, it always returns http 200 and a response which looks like this:
16:46:18 http://paste.openstack.org/show/545045/
16:46:43 what action?
16:47:13 rakhmerov: this one specifically is tripleo.update_capabilities
16:47:18 rakhmerov: from here:
16:47:48 rakhmerov: https://review.openstack.org/#/c/348537/4/tripleo_common/actions/heat_capabilities.py
16:48:27 ok
16:49:39 rakhmerov: it would be nice if when the action fails, it would return an error state http code so the response would get identified as an error
16:49:40 hm.. I wonder why state is null
16:50:12 jtomasek: well, this seems to be possible now
16:50:22 but the action itself should take care of this
16:50:26 give me a sec..
16:50:33 I'll find an example
16:50:40 rakhmerov: cool, just tell me why so I can make sure we update that in the action
16:50:43 :)
16:51:30 s/why/how
16:51:51 yeah, found it
16:51:54 look at this: https://github.com/openstack/mistral/blob/master/mistral/actions/std_actions.py#L214
16:52:16 this is how our standard std.http action takes care of different HTTP status codes
16:52:55 so the idea is that even if according to the logic of your action it gets some error result, this result could be structured and returned back to the workflow
16:53:14 so that the workflow could make some decisions on variations of this error result
16:53:34 is this what you're looking for?
16:53:45 rakhmerov, jtomasek: IIUC, the run-action will still return 200 and the calling code will still need to inspect the error
16:54:01 rakhmerov: so you say, that by specifying error param on Result, it would return another http code than 200?
16:54:15 rbrady: yeah, that is the problem
16:54:19 jtomasek: exactly
16:54:35 rakhmerov: hm, it seems not to be like that
16:54:36 right
16:55:14 rakhmerov: what is the problem: they want to quickly find out if action ended with success or error
16:55:22 but both return HTTP 200
16:55:31 action itself needs to understand: "Aha, I got an error somehow and now it's my responsibility to build Result object in a certain way so that in workflow we could analyze that"
16:55:32 and as you saw, state is 'null'
16:55:33 rakhmerov: is there a place in the executor where it sets the response code?
16:56:00 I'm wondering if state should be SUCCESS or ERROR
16:56:06 "error" in Result(error=..) will switch the action into ERROR state
16:56:38 it should, but as jtomasek showed, it is not... http://paste.openstack.org/show/545045/
16:56:47 rakhmerov: but that happens only in case when action-execution gets persisted by using the special parameter to POST action-execution api call
16:56:47 do we have a bug?
16:56:56 then it might be a bug
16:57:13 rbrady: no, I'm not talking about the response code in general case. Like I said, this response code is a part of action logic
16:57:31 for some actions it may not make sense to have any status codes
16:57:40 jtomasek: please file a bug with as much information as possible
16:57:40 rakhmerov: yeah, setting the action state should be enough
16:57:54 ddeja: ok, will do it tomorrow morning
16:57:59 thanks!
16:58:00 rakhmerov: this case is specific to run-action
16:58:01 jtomasek: I can take a look at it
16:58:03 for executor only ERROR or SUCCESS is important
16:58:15 jtomasek: what time zona are you?
16:58:39 ok, we're running out of time guys
16:58:41 s/zona/zone
16:58:48 ddeja: CEST
16:58:50 jtomasek: file a bug pls and we'll look at it
16:58:56 everything should work ;)
16:59:00 jtomasek: cool, same :)
16:59:08 nice
16:59:26 ok, let's end the meeting
16:59:32 bye, thanks for joining guys
16:59:34 bye!
16:59:38 have a great week!
16:59:44 bye, thanks all
16:59:49 #endmeeting
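The idea discussed under Open Discussion, that an action should report failure by building its result with an error payload so the caller can tell ERROR from SUCCESS, might look roughly like the sketch below. Everything here is an assumption made for illustration: `Result` is a minimal stand-in and not Mistral's own class, `update_capabilities` is a toy function unrelated to the real tripleo action, and `state_of` is a hypothetical helper. It does not model the HTTP layer, which per the discussion still returns 200 for run-action.

```python
class Result:
    """Hypothetical stand-in for an action result (not Mistral's class)."""

    def __init__(self, data=None, error=None):
        self.data = data
        self.error = error

    def is_error(self):
        # A result counts as failed when an error payload was supplied.
        return self.error is not None


def update_capabilities(store, name, value):
    """Toy action: returns a structured error instead of raising."""
    if name not in store:
        return Result(error={"message": "unknown capability", "name": name})
    store[name] = value
    return Result(data=dict(store))


def state_of(result):
    """Hypothetical mapping from a result to an execution state."""
    return "ERROR" if result.is_error() else "SUCCESS"
```

Returning a structured error rather than a bare exception keeps the failure details available to whatever inspects the result, which is the behaviour described for std.http in the discussion.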