#openstack-mistral log

08:06:19 <rakhmerov> #startmeeting Mistral
08:06:20 <openstack> Meeting started Wed Sep  4 08:06:19 2019 UTC and is due to finish in 60 minutes.  The chair is rakhmerov. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:06:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
08:06:23 <rakhmerov> vgvoleg: ok
08:06:24 <openstack> The meeting name has been set to 'mistral'
08:06:25 <rakhmerov> go ahead
08:06:57 <rakhmerov> eyalb, abdelal, apetrich: ^
08:07:00 <vgvoleg> First of all, this one https://blueprints.launchpad.net/mistral/+spec/mistral-task-skipping-feature
08:07:09 <apetrich> o/
08:07:20 <eyalb> o/
08:07:40 <vgvoleg> It's already done in our fork, but I didn't see any reaction here :(
08:07:48 <abdelal> o/
08:07:54 <rakhmerov> vgvoleg: ok, let me read (again)
08:08:19 <vgvoleg> I've done changes in cloudflow too to support this
08:09:07 <vgvoleg> Just tell me that it would be useful and I'll push it :)
08:10:29 <vgvoleg> https://sun9-12.userapi.com/c854220/v854220580/df9c0/j5PU_qZKAi8.jpg
08:10:32 <vgvoleg> something like this
08:11:41 <abdelal> a question regarding that , if t2 published x1 , how will you pass it to t3 since t2 is skipped ?
08:12:22 <vgvoleg> skipped task also has publish section
08:12:30 <vgvoleg> it works the same way
08:12:39 <abdelal> it will publish even if it was skipped ?
08:12:47 <vgvoleg> oh wait
08:12:56 <vgvoleg> Maybe I understand you wrong
08:13:18 <vgvoleg> there is 'publish-on-skip' section
08:13:28 <rakhmerov> no-no, I think we can't publish anything using the regular "publish" if the state is not SUCCESS
08:13:51 <vgvoleg> where you can, for example, mock data
08:13:58 <rakhmerov> vgvoleg: are you aware of different syntax that you can use to publish vars?
08:14:13 <vgvoleg> It is OK with publish-on-error
08:14:15 <rakhmerov> you can have publish under "on-success", "on-error", etc.
08:14:33 <vgvoleg> I don't see any differences with publish-on-skip
08:14:50 <rakhmerov> vgvoleg: I think we shouldn't mix these things
08:15:05 <vgvoleg> we didn't mix them
08:15:14 <vgvoleg> it is another publish section
08:15:15 <rakhmerov> different states => different language key words
08:15:30 <vgvoleg> if task is skipped, no 'publish' will be published
08:15:31 <vgvoleg> :)
08:15:45 <rakhmerov> vgvoleg: that's right
08:16:06 <rakhmerov> vgvoleg: regular "publish" and "publish-on-error" will be deprecated I think soon
08:16:26 <rakhmerov> we'll be encouraging people to use "publish" under "on-..."
08:16:47 <rakhmerov> where you can define scopes (currently "branch" and "global")
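For reference, the "publish under on-..." form rakhmerov describes looks roughly like the sketch below. The workflow name, task names, and variable names are made up for illustration; the `publish` / `branch` / `global` / `next` layout under `on-success` follows the Mistral v2 workflow language as referred to in the discussion.

```yaml
---
version: '2.0'

example_wf:                 # hypothetical workflow name
  tasks:
    task1:
      action: std.echo output="hello"
      on-success:
        publish:
          branch:
            # visible only to tasks downstream in this branch
            greeting: <% task().result %>
          global:
            # visible to the whole workflow execution
            last_greeting: <% task().result %>
        next:
          - task2

    task2:
      # reads the branch-scoped variable published by task1
      action: std.echo output=<% $.greeting %>
```

An "on-skip" clause shaped the same way is what the later part of the discussion converges on; at this point it is only a proposal from the blueprint.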
08:16:49 <vgvoleg> the only thing I'm not sure about is what we should do if a task doesn't have an 'on-skip' branch
08:17:04 <rakhmerov> I'm thinking may be we shouldn't even introduce this "publish-on-skip"
08:17:50 <rakhmerov> vgvoleg: so, it's still a bit confusing to me..
08:17:52 <vgvoleg> In our fork, if we skip a task with no 'on-skip' section, we use 'on-success'
08:18:11 <rakhmerov> vgvoleg: no, I disagree with this approach
08:18:21 <rakhmerov> it should be a totally separate things
08:18:25 <rakhmerov> thing
08:18:57 <rakhmerov> vgvoleg: can you express with one phrase why this functionality is needed? :)
08:18:59 <rakhmerov> this whole thing
08:19:15 <rakhmerov> I'm still struggling with the idea I guess
08:19:32 <rakhmerov> apetrich, eyalb, abdelal: may be you have some thoughts
08:19:50 <rakhmerov> vgvoleg: so we do skipping if what?
08:20:08 <rakhmerov> on an external failure?
08:20:25 <vgvoleg> If the flow execution is too long, it will be great to have an opportunity to skip a failed task in the tail of the flow
08:20:29 <apetrich> sorry, I'm trying to understand it too.
08:20:44 <vgvoleg> if there is an external failure
08:20:55 <vgvoleg> so retry will not help
08:21:20 <vgvoleg> So it is all about a manual control of the flow
08:21:31 <vgvoleg> like rerun, cancel, pause
08:22:13 <openstackgerrit> Eyal proposed openstack/mistral master: Add a cookiecutter template to generate custom stuff  https://review.opendev.org/679782
08:22:27 <vgvoleg> If we see that result of the task is not so important right now, we can skip this task and continue the execution of the flow
08:22:50 <rakhmerov> ok
08:23:36 <akovi> hi All!
08:23:46 <rakhmerov> akovi: hi Andras
08:23:59 <rakhmerov> we're discussing https://blueprints.launchpad.net/mistral/+spec/mistral-task-skipping-feature
08:23:59 <vgvoleg> and we should be sure that the execution will not break
08:24:28 <vgvoleg> so we provide an opportunity to 'mock' any data in 'publish-on-skip'
08:24:34 <rakhmerov> vgvoleg: what's the problem if it breaks?
08:24:50 <rakhmerov> it will just have status "ERROR" but it will do all the work
08:25:04 <vgvoleg> in case we need something from 'publish' later on
08:25:17 <rakhmerov> maybe you just need some mechanisms (in your system) to interpret the results of a failed workflow correctly?
08:25:47 <vgvoleg> We can't anticipate every situation
08:25:57 <rakhmerov> vgvoleg: again, as far as "publish-on-skip", I'm not sure we need this at all
08:26:53 <rakhmerov> but may be for the sake of symmetry, we need to do both options: "publish-on-skip" and the totally new clause "on-skip" that can have "publish" inside just like for other "on-xxx" things
08:27:22 <vgvoleg> I'm not sure if I understand you right
08:27:29 <rakhmerov> ok
08:27:34 <vgvoleg> you just say about global problems
08:27:40 <rakhmerov> no
08:27:43 <vgvoleg> not about problems of this feature?
08:27:50 <rakhmerov> I'm talking about the new syntax for "publish"
08:28:00 <rakhmerov> no-no, it's related
08:28:04 <rakhmerov> that's why I touched it
08:28:06 <akovi> so, to clarify: the basic idea is to let a task be skipped before rerunning it. Right?
08:28:11 <vgvoleg> Ok I got it
08:28:22 <rakhmerov> akovi: even after running it
08:28:27 <rakhmerov> vgvoleg: is that right?
08:28:47 <vgvoleg> akovi: yes, we skip the task and rerun the workflow
08:29:00 <akovi> the execution fails, we skip the task and rerun
08:29:05 <akovi> ok
08:29:05 <rakhmerov> yep
08:29:10 <vgvoleg> Consider the situation: in the flow there is a task that requests any data from a third-party service while it is dead. Retries will not help in this situation, the task will go to the ERROR terminal state and the workflow will finish its work. Starting the workflow from the very beginning can be very expensive - it could have been several days
08:29:11 <vgvoleg> before the fall. For such cases in the mistral there is a rerun mechanism - a certain decision maker determines whether circumstances have changed (whether third-party service has come to life), and if so, the workflow will continue its work from the fallen task.
08:29:24 <rakhmerov> because "retry" wouldn't make sense in many cases
08:29:32 <vgvoleg> In fact, the environment cannot always stabilize, and it can be very expensive to adapt the workflow to work with a new environment. Also it is not always possible to automatically assess the nature of the error that led to the fall. It can be something fatal, and maybe something insignificant, which in general does not affect the execution of the
08:29:32 <vgvoleg> whole workflow. The decision maker can assess how important the results of the current task are and continue the execution of the workflow if not important.
08:29:35 <vgvoleg> yes
08:29:42 <vgvoleg> it's from the blueprint :D
08:29:49 <rakhmerov> I figured )
08:29:50 <vgvoleg> all the arguments are there
08:30:03 <akovi> this is probably useful when wfs are executed under human surveillance
08:30:10 <vgvoleg> yes
08:30:21 <rakhmerov> vgvoleg: too many arguments, usually if we can't express an idea with 1 phrase then it's a bad idea )
08:30:37 <rakhmerov> akovi: well, yes, it's exactly about that
08:31:05 <rakhmerov> akovi: basically, that way we provide more ways for humans to influence workflow executions
08:32:13 <akovi> so my general stance on wfs is that if needed, they should be implemented in an idempotent way
08:32:15 <rakhmerov> so I guess, I'm not against it if 1) It's 100% backwards compatible (shouldn't be a problem here) 2) if it's a totally separate feature that doesn't interfere with existing stuff
08:32:23 <akovi> however, it's freaking hard in many cases
08:32:34 <rakhmerov> by #2 I mean that it doesn't reuse "on-success" etc.
08:32:45 <rakhmerov> akovi: yeah..
08:33:29 <akovi> I think this feature is a useful one.
08:33:57 <vgvoleg> It would be very comfortable to reuse 'on-success' if 'on-skip' is missed
08:34:23 <akovi> Unfortunately it will work in many cases only if the publish-on-skip closure is defined in the wf spec
08:34:36 <rakhmerov> vgvoleg: no, let's please not do this
08:34:45 <vgvoleg> In many cases we want to continue 'on-success' execution
08:34:48 <rakhmerov> well, on the other hand...
08:34:55 <vgvoleg> So we will have duplicates
08:34:57 <vgvoleg> in every task
08:35:09 <rakhmerov> I know you want, but I'm not sure at all if other people will want )
08:35:27 <rakhmerov> I want to make sure we have the common sense here
08:35:31 <vgvoleg> akovi: yes, that's the main idea: if you want to use this feature, be sure that your flow is ready for this
08:36:01 <rakhmerov> vgvoleg: then you can have "on-skip" where needed
08:36:17 <akovi> what if we omit the publish-on-skip and substitute it with a noop task that just publishes the same values?
08:36:17 <rakhmerov> vgvoleg: I somewhat don't like the idea to reuse "on-success"
08:36:18 <vgvoleg> rakhmerov: it's not a problem if we have 'on-skip: t1, on-success: t1'
08:36:41 <rakhmerov> other people can say "we want to consider such tasks failed but w/o moving them to ERROR status"
08:36:56 <vgvoleg> but if we have a long array with next tasks, duplicating them would be ugly
08:37:12 <rakhmerov> vgvoleg: why?
08:37:16 <rakhmerov> why ugly?
08:37:21 <rakhmerov> it's just about your case
08:37:45 <rakhmerov> but like I said, we're considering a completely different event that may happen in a workflow
08:38:02 <rakhmerov> and different people may treat it differently
08:38:25 <rakhmerov> that's why I want to make it more generic and not let it interfere with the existing mechanisms
08:38:36 <vgvoleg> well, I can write some docs... :D
08:38:46 <rakhmerov> docs for what?
08:38:52 <vgvoleg> For this feature
08:38:59 <akovi> wf:
08:38:59 <akovi>   task1:
08:38:59 <akovi>     action: some_custom_action
08:38:59 <akovi>     publish:
08:38:59 <akovi>       var1: <% task().result.var1 %>
08:39:00 <akovi>       var2: <% task().result.var2 %>
08:39:00 <akovi>       var3: <% task().result.var3 %>
08:39:01 <akovi>     on-success: task2
08:39:01 <akovi>     on-skip: task2
08:39:02 <akovi>   skipped-task1:
08:39:02 <akovi>     action: std.noop
08:39:03 <akovi>     publish:
08:39:03 <akovi>       var1: "var1"
08:39:04 <akovi>       var2: "var2"
08:39:16 <vgvoleg> If something is described in docs, it is legal
08:39:35 <rakhmerov> vgvoleg: let me put it this way: you may want to have lots of repeating tasks in "on-success" and in "on-error". But we don't say "it's ugly to repeat them"
08:39:42 <rakhmerov> because those are totally different cases
08:40:09 <rakhmerov> for that we actually have "on-complete" where we can move repeating stuff
08:40:44 <rakhmerov> vgvoleg: no-no, I can't accept that approach ("If something is described in docs, it is legal"), sorry
08:41:32 <rakhmerov> docs must not aim to explain why we came up with a bad design
08:41:46 <vgvoleg> akovi: how do you want to describe them?
08:42:17 <vgvoleg> if we run this flow the skipped task will be executed with task1 in the parallel way :D
08:42:31 <akovi> no
08:42:35 <akovi> I messed it up
08:42:55 <akovi> task1.on-skip = skipped-task1
08:43:15 <vgvoleg> oh
08:43:21 <akovi> this way there's no need for alternative publish methods
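With akovi's correction applied (`task1.on-skip = skipped-task1`), the pasted example could be sketched as below. This is only an illustration under assumptions: `on-skip` is the key proposed in the blueprint rather than an established part of the workflow language here, `some_custom_action` is akovi's placeholder, and the `tasks:` level, the `version` header, the `var3` default, and the body of `task2` were not in the paste and are filled in for completeness.

```yaml
---
version: '2.0'

wf:
  tasks:
    task1:
      action: some_custom_action      # placeholder action from the paste
      publish:
        var1: <% task().result.var1 %>
        var2: <% task().result.var2 %>
        var3: <% task().result.var3 %>
      on-success:
        - task2
      on-skip:                        # proposed clause, assumed syntax
        - skipped-task1

    skipped-task1:
      # runs only when task1 is skipped and publishes stand-in values
      action: std.noop
      publish:
        var1: "var1"
        var2: "var2"
        var3: "var3"                  # assumed; the paste stopped at var2
      on-success:
        - task2

    task2:
      # sees var1..var3 either from task1 or from skipped-task1
      action: std.echo output=<% $.var1 %>
```

The trade-off vgvoleg raises next is visible here: the stand-in values need an extra `std.noop` task instead of living in a publish section of `task1` itself.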
08:44:04 <vgvoleg> I can't understand why 'publish-on-error' is OK and 'publish-on-skip' is not OK
08:44:08 <akovi> where do we usually share copy-pastes? I forgot the name of the service
08:44:18 <eyalb> pastebin
08:44:26 <vgvoleg> I think that creating redundant instances for publish is not OK
08:44:49 <rakhmerov> redundant instances?
08:44:52 <rakhmerov> what's that?
08:45:04 <vgvoleg> noop task just for publish
08:45:12 <rakhmerov> guys, please let's be more accurate with terms
08:45:58 <rakhmerov> noop for publish... I fail to understand this
08:46:14 <vgvoleg> look at Andras' example
08:46:33 <rakhmerov> ok, yes...
08:47:02 <rakhmerov> well, IMO it's not a problem to make a separate "publish-on-skip" thing
08:47:22 <rakhmerov> and "publish" under "on-skip"
08:47:32 <rakhmerov> it's easy
08:48:00 <rakhmerov> I just don't like the idea to reuse either "on-success" or something else that already exists to handle skipping
08:48:22 <rakhmerov> duplicates, in my opinion, are not a problem that we need to solve now
08:48:44 <vgvoleg> ok
08:48:45 <abdelal> the plan is to eventually remove publish and publish-on-error right?, i dont think its wise to add publish-on-skip too
08:48:58 <rakhmerov> abdelal: yes!
08:49:10 <rakhmerov> abdelal: yes, right
08:49:32 <rakhmerov> I just thought that maybe we still need to add it, but just for the sake of symmetry with other clauses
08:49:40 <abdelal> so i think we should follow the current synyax we want,just have publish under on-skip if anything
08:49:53 <abdelal> syntax*
08:49:56 <rakhmerov> that's for sure, yes
08:50:02 <vgvoleg> guys I'm OK with these changes but please don't mix them into the feature discussion
08:50:05 <eyalb> I agree
08:50:33 <rakhmerov> I'm just saying that the language should always be symmetric around similarities that we're adding
08:50:51 <rakhmerov> vgvoleg: :)
08:51:29 <rakhmerov> so, if there's no serious objections then I'd say "go ahead and push a patch"
08:51:31 <abdelal> i think this feature is useful overall, but also, as renat said, it should be as generic as possible and be aligned with the current syntax
08:51:46 <rakhmerov> these technical nuances are something that we'll polish a bit later
08:51:47 <rakhmerov> not now
08:51:57 <rakhmerov> yes
08:51:58 <akovi> ok, great
08:52:12 <rakhmerov> vgvoleg: sounds OK?
08:52:23 <vgvoleg> sure :)
08:52:24 <akovi> so on-skip.publish? or publish.on-skip?
08:52:33 <rakhmerov> once we see your patch we may notice something else to discuss
08:53:15 <rakhmerov> akovi: I'm for adding "publish-on-skip" and "on-skip" (that may contain "publish") just the same way as for other clauses
08:53:26 <rakhmerov> then it'll be 100% symmetric
08:53:44 <vgvoleg> guys, in the blueprint everything is symmetric
08:53:47 <vgvoleg> 102%
08:53:55 <akovi> hmm
08:54:12 <akovi> the removal of publish-on* was new info for me
08:54:38 <vgvoleg> It's because they mix two different topics
08:54:42 <akovi> and if this is being removed then I'm not sure why we would provide this syntax for a new feature
08:54:45 <akovi> but I'm ok
08:54:47 <vgvoleg> relax :)
08:55:38 <rakhmerov> vgvoleg: yes, sorry for that
08:55:44 <rakhmerov> but it's kind of related
08:55:56 <rakhmerov> if we are to discuss particular syntax
08:56:03 <vgvoleg> so the second blueprint I'd like to discuss in the next meeting
08:56:11 <rakhmerov> vgvoleg: yes, please
08:56:15 <vgvoleg> thank you!
08:56:16 <rakhmerov> we don't have enough time today
08:56:25 <rakhmerov> thanks guys, I have to wrap up )
08:56:33 <rakhmerov> thanks everyone for joining
08:56:43 <rakhmerov> I'd encourage you to do it more often
08:56:45 <eyalb> bye
08:56:48 <rakhmerov> bye
08:56:49 <vgvoleg> bye!
08:56:51 <akovi> great, I'm looking forward to seeing this working
08:56:58 <rakhmerov> #endmeeting