#openstack-meeting log

16:00:48 <tnurlygayanov> #startmeeting Mistral meeting
16:00:49 <openstack> Meeting started Mon Jun  2 16:00:48 2014 UTC and is due to finish in 60 minutes.  The chair is tnurlygayanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:52 <openstack> The meeting name has been set to 'mistral_meeting'
16:01:01 <tnurlygayanov> Hi there!
16:01:07 <dzimine> Hi there!
16:01:08 <NikolayM416> hi !
16:01:11 <bhavenst> Hi
16:01:13 <enykeev_> hey!
16:01:16 <dzimine> thanks for starting on time.
16:01:21 <tnurlygayanov> Welcome to Mistral weekly meeting :)
16:02:04 <tnurlygayanov> ok, so, let's review our progress from the previous week
16:02:50 <dzimine> no progress on my side or stackstorm side all together. We were finishing some internal stuff, now from June we will be able to pick work items.
16:03:28 <tnurlygayanov> we have many fixes in CI jobs, Angus, thank you!
16:04:13 <tnurlygayanov> dzimine, ok, we will wait  and plan our roadmap based on this.
16:04:45 <tnurlygayanov> so, from my side - we reviewed all automated tests and found several issues with Mistral workflows
16:05:25 <tnurlygayanov> Renat worked on it, and now Renat is on holiday
16:06:01 <tnurlygayanov> so, enykeev, bhavenst, NikolayM416, what about your progress over the previous week?
16:06:04 <NikolayM416> I fixed some bugs in REST API (error handling), helped with devstack tests, tried to implement OAuth in Mistral and move test launching to testr
16:06:27 <bhavenst> I am interested in contributing to Mistral and we have posted a couple of blueprints.  I'm working on getting up to speed and understanding the component and code so that I can start.
16:06:34 <tnurlygayanov> NikolayM416, cool
16:06:46 <bhavenst> Will try to pick a bug and work on it as that seems the best way to get going.
16:06:48 <enykeev_> tnurlygayanov, nothing from my side. As dzimine stated, we are busy with internal stuff atm.
16:06:49 <NikolayM416> so we discussed with Angus we won't add oauth right now
16:07:05 <tnurlygayanov> bhavenst, you are welcome!
16:07:10 <bhavenst> thanks
16:07:45 <tnurlygayanov> ok
16:08:06 <tnurlygayanov> #info NikolayM416 fixed some bugs in REST API (error handling), helped with devstack tests, tried to implement OAuth in Mistral and move test launching to testr
16:08:41 <tnurlygayanov> #info dzimine, enykeev_ : no progress on my side or stackstorm side all together. We were finishing some internal stuff, now from June we will be able to pick work items.
16:09:33 <tnurlygayanov> #info bhavenst: I am interested in contributing to Mistral and we have posted a couple of blueprints.  I'm working on getting up to speed and understanding the component and code so that I can start
16:09:53 <tnurlygayanov> ok, so, it was a good week. Do we have plans for the next week?
16:10:19 <tnurlygayanov> I know that Renat will work on incubation request after his holidays
16:10:51 <bhavenst> Sounds good
16:11:30 <tnurlygayanov> we should fix some issues, and we are working on CI right now; we plan to write new tests for workflow executions, and Sergey Murashov is working on it
16:11:43 <NikolayM416> cool
16:12:13 <NikolayM416> I see the tests are working now, dsvm passes
16:12:30 <bhavenst> One question, any place to find info on how to integrate Mistral w/ Horizon?
16:12:55 <tnurlygayanov> #action Sergey Murashov & Timur Nurlygayanov: create more automated tests for Mistral devstack gates
16:12:56 <bhavenst> (not sure if such a question is appropriate for the meeting or not, so sorry if not. :)
16:13:17 <dzimine> guys I have a request. Can you share more details on the blueprints? E.g., moving to testr: there is a blueprint with no info and the review with no explanations on why. No email, no context of why.
16:13:35 <dzimine> It will be good for community if you have some trail of these decisions somewhere open to community.
16:13:49 <dzimine> I am sure it’s a good change, but please be more transparent.
16:14:17 <tnurlygayanov> bhavenst, yes, dzimine can describe how we can install horizon dashboard for Mistral
16:15:06 <tnurlygayanov> dzimine, sure, we will update the blueprints that are assigned to the next milestone
16:15:17 <tnurlygayanov> so, I plan to do this this week
16:15:40 <NikolayM416> bhavenst, have you already seen https://github.com/stackforge/python-mistralclient/blob/master/horizon_dashboard/README.rst ?
16:15:55 <tnurlygayanov> #action Timur Nurlygayanov & NikolayM416: update the Mistral blueprints that are targeted to release 0.1
16:16:26 <dzimine> note that this is not TRUE integration with the existing Horizon dashboard, we will be working on this soon.
16:16:35 <bhavenst> No, I haven't..  I assumed there might be instructions somewhere
16:16:48 <bhavenst> I will give it a shot
16:16:56 <bhavenst> Thanks
16:16:56 <dzimine> Yes, the instructions are in the readme.
16:17:02 <m4dcoder> on config clean up, as we agreed in the ML, i'm removing the keystone section and using the default keystone_authtoken config options.  i can't find unit tests in Mistral for testing the keystone AuthProtocol middleware.  i'll have to add that so it's taking more time than I like.  once this patch is completed, i will regenerate the sample config file and finish up the config clean up.
16:17:16 <tnurlygayanov> good news: today I saw the patch sets to Solum, which allow integrating Mistral and Solum
16:17:50 <tnurlygayanov> https://review.openstack.org/#/q/status:open+project:stackforge/solum+branch:master+topic:new-api,n,z
16:18:02 <NikolayM416> dzimine, yes, we just discussed this with Renat (about testr) and I am going to update the blueprint
16:18:51 <m4dcoder> i'm also assigned https://blueprints.launchpad.net/mistral/+spec/mistral-engine-executor-protocol, so I'll be looking at notes to figure out what the details are and come up with proposal.
16:19:32 <dzimine> m4dcoder: good. Reach out to enykeev if you have any questions on his notes.
16:19:46 <m4dcoder> ok
16:22:30 <tnurlygayanov> hm, ok, looks like we finished with action items for the next week
16:24:21 <tnurlygayanov> so, let's start the open discussion
16:24:32 <tnurlygayanov> #topic Open Discussion
16:25:06 <tnurlygayanov> I have several questions about Mistral workflows :)
16:25:51 <tnurlygayanov> we have tasks and we can update status of tasks manually - and statuses can be changed during the workflow execution
16:26:19 <tnurlygayanov> So, what if we have an already finished task and a user tries to change the status of this task?
16:26:34 <tnurlygayanov> should we allow this or should we deny it?
16:26:46 <tnurlygayanov> what about the workflows with periodic tasks?
16:27:06 <NikolayM416> We should deny this, I think
16:27:16 <dzimine> the task status can’t be changed after the task is complete.
16:27:34 <tnurlygayanov> so, and what about the periodic tasks?
16:27:38 <NikolayM416> We can change the state of a task to SUCCESS only 1 time, right?
16:27:38 <dzimine> IDLE->RUNNING->[ERROR | SUCCESS]
16:27:54 <tnurlygayanov> for example, this task is already completed, but we should run it again
16:28:24 <tnurlygayanov> or it will be Running all the time?
16:29:24 <dzimine> it will be another task instance. In the audit log it will show this task executed twice (or N times).
16:29:49 <NikolayM416> yes, correct :)
16:30:13 <tnurlygayanov> dzimine, ok, now it is clear
16:30:17 <dzimine> If it’s REPEAT, then the task is considered to be running for the duration of repetitions, until it succeeds or fails N times.
16:31:04 <tnurlygayanov> today we worked with workflows, and it looks like it is unclear to new users that we cannot change statuses after the execution
16:31:09 <dzimine> In case of periodic task: if you mean “cron”, each time it triggers, it creates new workflow execution.
16:31:16 <tnurlygayanov> but we can change the status during the execution
16:31:42 <enykeev_> dzimine, each iteration of repeat should produce new instance, correct?
16:32:06 <dzimine> no, it’s the same instance of task (although it is a different invocation of Action)
16:32:22 <dzimine> Sorry, RETRY, not repeat.
16:32:26 <dzimine> Repeat is not implemented yet.
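The RETRY behaviour described above (one task instance, several Action invocations, RUNNING until it succeeds or fails N times) can be illustrated with a minimal sketch. The function name `run_with_retry`, the dict it returns, and the `max_attempts` parameter are hypothetical and are not taken from the Mistral code base.

```python
def run_with_retry(action, max_attempts):
    """Invoke `action` up to `max_attempts` times within one task instance.

    The task is treated as RUNNING for the whole loop; only the final
    outcome (SUCCESS or ERROR) is recorded on the task.
    """
    attempts = 0
    last_error = None
    while attempts < max_attempts:
        attempts += 1  # each pass is a new invocation of the Action
        try:
            result = action()
        except Exception as exc:  # any failure counts against the retry budget
            last_error = exc
            continue
        return {"state": "SUCCESS", "attempts": attempts, "result": result}
    return {"state": "ERROR", "attempts": attempts, "error": last_error}
```

Each call to `action()` is a separate invocation, but the caller sees a single task record with one final state, which matches the "same instance of task" point made above.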
16:32:28 <tnurlygayanov> so, I want to see how it will look in the UI with logs for each execution of workflows
16:32:36 <enykeev_> ok, got it
16:33:28 <NikolayM416> I guess Timur is asking us about async tasks only, when we can update the state of a task via REST
16:33:42 <dzimine> Aha, now I see.
16:33:54 <tnurlygayanov> yes
16:34:04 <NikolayM416> so what if we change the state to SUCCESS, and then to RUNNING?
16:34:44 <dzimine> For Async tasks, the 3rd party server assumes that it runs the action, and is responsible for posting the results back to the engine.
16:35:03 <dzimine> Once results posted, it’s DONE.
16:35:10 <tnurlygayanov> because we are now working on automated tests, and this is interesting: what is the expected behaviour in the different cases when we have some state and want to change it to another
16:35:18 <enykeev_> is there a reason to allow changing state other than from RUNNING to SUCCESS or ERROR?
16:35:51 <dzimine> need to look at what we should do when you call ‘convey-results’ on an action that has already completed. We may have a bug there.
16:36:10 <tnurlygayanov> and what if the task can't pass the results to the engine? will we set a timeout and move the task to ERROR?
16:36:13 <dzimine> +1 to enykeev
16:36:50 <tnurlygayanov> so, during the execution we want to STOP task execution
16:37:04 <NikolayM416> oh, we should throw an error on [ERROR, SUCCESS] -> RUNNING transition
16:37:14 <enykeev_> tnurlygayanov, the point of async execution is to run the task beyond the scope of timeout
16:37:56 <enykeev_> so, no, i think we should not make automatic transition from RUNNING to ERROR
16:38:01 <tnurlygayanov> NikolayM416, yes, without a 500 response code :)
16:38:14 <tnurlygayanov> hm
16:38:32 <dzimine> we don’t support cancelling tasks yet.
16:38:37 <tnurlygayanov> enykeev_, so what does 'automatic' mean? How should we do this?
16:39:04 <dzimine> And it’s irrelevant for ASYNC tasks: the 3rd party server is running Action anyways.
16:39:12 <tnurlygayanov> dzimine, yes, but we plan to support it
16:39:27 <dzimine> The engine only need to know what is the result of the tasks, so it can compute the next patch and pass the data.
16:40:11 <tnurlygayanov> dzimine, ok
16:41:00 <m4dcoder> what if 3rd party task never returns?
16:41:59 <tnurlygayanov> m4dcoder, looks like engine should try to do this again
16:42:24 <enykeev_> tnurlygayanov, by issuing the timeout to async task we then would need a way to prolong this timeout when needed
16:42:28 <m4dcoder> how many times before quitting?
16:42:37 <tnurlygayanov> and if this task is not idempotent, it will fail.
16:43:06 <tnurlygayanov> enykeev_ hm....
16:43:08 <NikolayM416> m4dcoder, what do you mean? executing on 3rd party service can last a week or more
16:44:04 <enykeev_> I recall there were some talks about external service that should control such things, watch dog of some sort. Anyway, this is a great question and we should spend some time investigating it.
16:45:20 <tnurlygayanov> ok, cool, I will write this to an action item
16:45:26 <m4dcoder> i was trying to understand what's stated here.  say 3rd party task exceeded timeout (regardless how long), how will engine manage state of that task?
16:45:30 <dzimine> The obvious bug here is that API allows to arbitrarily update the task status. Instead, it shall only be called for RUNNING tasks, and accept SUCCESS or FAILURE + data.
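The rule stated above (results may only be conveyed for RUNNING tasks, and only a terminal state is accepted) can be sketched as a small validation function. This is an illustration under assumptions: the names `TaskState`, `InvalidTransition`, and `convey_result` are hypothetical, and the terminal states follow the IDLE->RUNNING->[ERROR | SUCCESS] chain quoted earlier in the discussion rather than Mistral's actual code.

```python
from enum import Enum


class TaskState(Enum):
    IDLE = "IDLE"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"
    ERROR = "ERROR"


# Allowed transitions, following IDLE->RUNNING->[ERROR | SUCCESS].
ALLOWED = {
    TaskState.IDLE: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.SUCCESS, TaskState.ERROR},
    TaskState.SUCCESS: set(),  # terminal
    TaskState.ERROR: set(),    # terminal
}


class InvalidTransition(ValueError):
    """Hypothetical error that an API layer could map to a 4xx response."""


def convey_result(current, new, result=None):
    """Validate a result update for a task and return (new_state, result).

    Only RUNNING tasks accept results, and only SUCCESS or ERROR is accepted.
    """
    if current is not TaskState.RUNNING:
        raise InvalidTransition(
            "task is %s; results are only accepted for RUNNING tasks" % current.value
        )
    if new not in ALLOWED[current]:
        raise InvalidTransition(
            "cannot move from %s to %s" % (current.value, new.value)
        )
    return new, result
```

Raising a dedicated exception keeps a rejected update such as SUCCESS -> RUNNING distinguishable from an internal failure, which is the point behind the "without a 500 response code" remark.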
16:45:56 <dzimine> Right now we don’t have a timeout on tasks.
16:46:03 <tnurlygayanov> #action <enykeev_> I recall there were some talks about external service that should control such things, watch dog of some sort. Anyway, this is a great question and we should spend some time investigating it.
16:46:19 <dzimine> When we do, after timeout the task moves to ERROR and workflow continues running
16:46:44 <tnurlygayanov> #action rakmerov review the meeting minutes and create new blueprint if it's needed.
16:47:17 <tnurlygayanov> dzimine, how will the workflow continue to run if one task is in the ERROR state?
16:47:53 <dzimine> it’s designed to have multiple ERRORS,
16:48:02 <tnurlygayanov> I think that an ERROR in any task will cause an error of the whole execution / workflow - if we have no 'if' statements
16:48:08 <tnurlygayanov> ok
16:48:08 <dzimine> e.g., the task will have on-error: do-something-else
16:48:23 <tnurlygayanov> so, what if one task depends on another?
16:48:30 <dzimine> or, on-finish - which means that next task will run regardless of exit code.
16:48:34 <tnurlygayanov> and the first task goes to ERROR
16:49:05 <dzimine> it may depend with on-finish, then the next task will execute regardless.
16:49:22 <tnurlygayanov> ok, yes, now it is clear
16:49:27 <dzimine> or it may be on-error: do-error-handling.
16:49:32 <m4dcoder> should timeout be more specific than just being generalized as error?  maybe on-timeout: retry-or-do-something-else?
16:49:35 <tnurlygayanov> so, we will have just on-error
16:49:42 <dzimine> but if it’s on-success: do-something-good, this task won’t be run.
16:50:34 <dzimine> Manas had some thinking about result policies, essentially he stated that all categories fall into success | error | finish (aka everything).
16:50:36 <tnurlygayanov> on-timeout - it is interesting, but in most use cases that I know of, a timeout is equivalent to an error
16:50:40 <dzimine> timeout is a type of error.
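The on-success / on-error / on-finish behaviour discussed above can be sketched as a lookup of follow-up tasks. The dict layout used for `task_spec` and the function name `next_tasks` are hypothetical and only mirror the keys mentioned in the discussion, not the real Mistral DSL. A timeout is assumed to be reported as ERROR, as stated above.

```python
def next_tasks(task_spec, state):
    """Return the names of follow-up tasks for a task that has finished.

    `state` must be "SUCCESS" or "ERROR"; a timeout is assumed to arrive
    here as "ERROR". Tasks listed under "on-finish" run in both cases.
    """
    if state not in ("SUCCESS", "ERROR"):
        raise ValueError("task has not finished: %s" % state)
    key = "on-success" if state == "SUCCESS" else "on-error"
    return list(task_spec.get(key, [])) + list(task_spec.get("on-finish", []))


# Example spec using the task names mentioned in the discussion,
# plus a made-up "cleanup" task for on-finish.
spec = {
    "on-success": ["do-something-good"],
    "on-error": ["do-error-handling"],
    "on-finish": ["cleanup"],
}
```

With this spec, an ERROR result selects `do-error-handling` and `cleanup` and skips `do-something-good`, so the workflow keeps running after a failed task as long as an on-error or on-finish branch exists.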
16:51:10 <tnurlygayanov> if we have a separate error type 'on-timeout', we will need to support user-defined states :)
16:51:20 <m4dcoder> ok. just putting that out for discussion.
16:51:37 <dzimine> let’s file a bug on task API and review the rest when Renat will be re-factoring engine.
16:51:58 <tnurlygayanov> #action: <dzimine> let’s file a bug on task API and review the rest when Renat will be re-factoring engine.
16:52:16 <tnurlygayanov> dzimine, which bug do you mean?
16:54:10 <tnurlygayanov> so... ok, we can discuss it later.
16:55:36 <tnurlygayanov> Thanks everyone for meeting!
16:55:43 <tnurlygayanov> #endmeeting