08:01:50 <d0ugal> #startmeeting mistral
08:01:51 <openstack> Meeting started Fri May  4 08:01:50 2018 UTC and is due to finish in 60 minutes.  The chair is d0ugal. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:01:53 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
08:01:55 <openstack> The meeting name has been set to 'mistral'
08:02:12 <d0ugal> rakhmerov, apetrich, bobh, mcdoker1818: ping! ^
08:02:16 <d0ugal> Happy Friday everyone :)
08:02:22 <d0ugal> I just need to get a coffee, back in a few mins.
08:02:30 <rakhmerov> d0ugal: hey
08:02:30 <apetrich> o/
08:02:31 <rakhmerov> here
08:02:53 <d0ugal> I don't have anything particular for the agenda, I'd like to go through and tidy up some of the blueprints maybe
08:02:59 <d0ugal> Otherwise happy to discuss anything people have!
08:03:12 <d0ugal> but first, coffee :)
08:09:25 <d0ugal> https://blueprints.launchpad.net/mistral/
08:09:34 <d0ugal> So we have 140 blueprints, that is more than the number of bugs :)
08:09:54 <d0ugal> I guess we should triage blueprints like we do with bugs. Lots of them are "New" and "Not Started"
08:18:02 <rakhmerov> :)
08:18:05 <rakhmerov> ok
08:21:14 <rakhmerov> d0ugal: I'm here
08:21:25 <d0ugal> rakhmerov: Great
08:21:35 <d0ugal> I'm just reading some blueprints, I have not seen many of these before :)
08:22:10 <d0ugal> I am also reading https://help.launchpad.net/Blueprint - just so I can learn how blueprints should be used :)
08:22:23 <d0ugal> It might be that we don't really need to do anything with them... I am not sure
08:22:27 <rakhmerov> ok
08:22:57 <mcdoker1818> Hi, all. I created a new bug ticket yesterday https://bugs.launchpad.net/mistral/+bug/1769012 . We can discuss it after triaging the blueprints.
08:22:58 <openstack> Launchpad bug 1769012 in Mistral "Workflow pause with task retry policy" [Undecided,New]
08:23:10 <rakhmerov> well, in my understanding we just need to go over them and assign them to cycles and milestones according to our dev plans
08:24:33 <d0ugal> rakhmerov: sure, but I don't think they should stay in "New" for design. Like with a bug, that means they have not been triaged?
08:24:44 <rakhmerov> yes
08:24:45 <d0ugal> they should either be approved or rejected I guess?
08:24:53 <rakhmerov> btw, did we clean up BPs for R-2?
08:25:11 <d0ugal> rakhmerov: no, not yet.
08:25:12 <rakhmerov> d0ugal: I think they can be rejected, yes. If they are nonsense )
08:25:16 <d0ugal> :)
08:25:31 <d0ugal> or Discussion looks like a useful status if they need to be talked about more
08:26:14 <rakhmerov> yes
08:26:21 <d0ugal> It is just a task I want to do a little bit at a time - I think the bugs are much tidier now, so I'd like to do something similar with blueprints.
08:26:26 <rakhmerov> I'm not sure how to reject them properly though..
08:26:34 <d0ugal> Good question :)
08:26:34 <rakhmerov> maybe just mark them as "obsolete"
08:27:14 <d0ugal> https://wiki.openstack.org/wiki/Blueprints
08:27:18 <d0ugal> That also looks useful.
08:27:37 <d0ugal> First I think we should discuss mcdoker1818's bug.
08:27:46 <rakhmerov> ok
08:27:54 <rakhmerov> as you wish, commander )
08:28:13 <d0ugal> haha, I think it is something he wants to work on now. The blueprints can wait longer :)
08:28:21 <rakhmerov> yes
08:28:25 <rakhmerov> reading it..
08:28:30 <d0ugal> #link https://bugs.launchpad.net/mistral/+bug/1769012
08:28:34 <openstack> Launchpad bug 1769012 in Mistral "Workflow pause with task retry policy" [Undecided,New]
08:28:34 <d0ugal> I am too :)
08:29:05 <rakhmerov> woow, so many details..
08:29:37 <rakhmerov> btw, just something I noticed immediately: the title of the bug doesn't describe the bug
08:29:39 <rakhmerov> :)
08:30:31 <rakhmerov> there's a good BLUF (bottom line up front) principle that I personally try to always use
08:31:20 <rakhmerov> mcdoker1818: Vitalii, can you please put a line right at the beginning of the description that summarizes the essence of the bug
08:31:31 <d0ugal> (or change the bug title)
08:32:26 <mcdoker1818> Updated :)
08:32:57 <rakhmerov> ok, thanks
08:34:00 <d0ugal> I wouldn't delete the details - just make sure there is an easy to understand summary at the top
08:34:50 <rakhmerov> I also slightly updated it
08:34:57 <rakhmerov> yep
08:35:09 <rakhmerov> you can add details after a short summary
08:35:17 <rakhmerov> but ok, I understand it now
08:35:24 <mcdoker1818> Thanks!
08:35:47 <rakhmerov> I guess the bug should be pretty easy to fix
08:36:08 <rakhmerov> my assumption is that retry policy doesn't take PAUSED state into account for some reason
08:36:10 <d0ugal> Yeah, so in summary - pausing a workflow breaks the retry policy?
08:36:20 <d0ugal> right
08:36:42 <rakhmerov> maybe "Retry policy keeps iterating if the workflow is paused" :)
08:36:55 <d0ugal> +1
08:37:13 <mcdoker1818> https://github.com/openstack/mistral/blob/master/mistral/engine/tasks.py#L232
08:37:46 <d0ugal> I changed the bug to "Confirmed" - how important is this bug for you? :)
08:38:05 <rakhmerov> mcdoker1818: yes, but DELAYED is not the same as PAUSED
08:38:30 <d0ugal> rakhmerov: do we have a good description of the states somewhere? I get confused with them sometimes.
08:38:30 <mcdoker1818> Yep, I mean the retry executes before checking the paused state
08:38:54 <rakhmerov> just to clarify: DELAYED is mostly an internal state needed to tell Mistral "this task is running but it's delayed due to some internal implementation reasons, like policy or something else"
08:39:08 <mcdoker1818> d0ugal: I plan to fix it soon
08:39:16 <rakhmerov> PAUSED means that a user stopped it temporarily on purpose
08:39:31 <d0ugal> mcdoker1818: thanks, I added it to rocky-2
08:39:37 <rakhmerov> d0ugal: probably we don't, let me check
08:39:43 <rakhmerov> ok
08:40:57 <d0ugal> rakhmerov: https://github.com/openstack/mistral/blob/master/mistral/workflow/states.py#L18 :)
08:41:11 <mcdoker1818> :DD
08:41:15 <rakhmerov> d0ugal: so we have some info about the states in the spec but that's definitely not full
08:41:45 <rakhmerov> d0ugal: haha :))
08:42:06 <mcdoker1818> rakhmerov: I guess the main problem is how to resume task iterations after resuming the execution
08:43:30 <rakhmerov> mcdoker1818: it shouldn't be a problem
08:43:54 <rakhmerov> we save info about retry iteration in the task 'runtime_context' field
08:44:04 <rakhmerov> under the key 'retry' or something like that
08:44:17 <rakhmerov> so we always know what the current iteration is
08:44:40 <mcdoker1818> rakhmerov: As far as I know, we resume tasks which have the IDLE state
08:45:19 <rakhmerov> nope
08:45:21 <rakhmerov> PAUSED
08:45:21 <hardikjasani> arthur100
08:45:33 <rakhmerov> IDLE is for a different purpose
08:46:49 <mcdoker1818> We don't change the task state to PAUSED
08:47:59 <rakhmerov> why not?
08:49:26 <rakhmerov> so the states change in this case as follows: RUNNING or RUNNING_DELAYED [we pause wf] -> PAUSED [we resume wf] ->  RUNNING or RUNNING_DELAYED
08:50:11 <rakhmerov> as for RUNNING_DELAYED, you can perceive it as a sub-state of RUNNING
08:50:17 <mcdoker1818> let me check
08:50:26 <rakhmerov> so it's a flavor of RUNNING state
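[Editor's sketch] The transitions described above (RUNNING or RUNNING_DELAYED -> PAUSED -> back to the previous state), together with a retry counter kept in 'runtime_context', might be modelled roughly as follows. This is a toy illustration only, not Mistral's code: the Task class, the 'state_before_pause' key, the exact shape of the 'retry' entry, and all function names are assumptions made for the example.

```python
from dataclasses import dataclass, field

# State names taken from the discussion; everything else is hypothetical.
RUNNING, RUNNING_DELAYED, PAUSED = "RUNNING", "RUNNING_DELAYED", "PAUSED"


@dataclass
class Task:
    state: str = RUNNING
    # Stand-in for the task 'runtime_context' field mentioned in the log.
    runtime_context: dict = field(default_factory=dict)


def pause(task):
    """Move a running task to PAUSED, remembering which RUNNING flavor it had."""
    if task.state in (RUNNING, RUNNING_DELAYED):
        task.runtime_context["state_before_pause"] = task.state
        task.state = PAUSED


def resume(task):
    """Restore the state the task had before it was paused."""
    if task.state == PAUSED:
        task.state = task.runtime_context.pop("state_before_pause", RUNNING)


def next_retry(task, max_retries):
    """Schedule one more retry unless the task is paused or out of retries."""
    if task.state == PAUSED:
        return False  # the check the bug report says is missing
    retry = task.runtime_context.setdefault("retry", {"iteration": 0})
    if retry["iteration"] >= max_retries:
        return False
    retry["iteration"] += 1
    task.state = RUNNING_DELAYED
    return True
```

Because the iteration count lives in the context rather than in the state, resuming picks up at the same iteration where the pause happened.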
08:50:46 <rakhmerov> hardikjasani: hey
08:50:56 <rakhmerov> what is arthur100? :)
08:51:21 <hardikjasani> typed it in wrong window :D
08:51:31 <hardikjasani> Nothing critical of course
08:52:31 <rakhmerov> ok )
08:52:53 <rakhmerov> hardikjasani: btw, how is it going? Progressing with your tasks?
08:55:26 <mcdoker1818> rakhmerov: Ok, thank you for clarifying!
08:55:32 <rakhmerov> np
08:58:15 <hardikjasani> rakhmerov: going great!
08:58:22 <rakhmerov> )
08:58:57 <mcdoker1818> d0ugal, rakhmerov: What about this ticket https://bugs.launchpad.net/mistral/+bug/1767830 ? Should I create a blueprint for a new API?
08:58:58 <openstack> Launchpad bug 1767830 in Mistral "Execution and task specification can be out of date" [Medium,Confirmed]
08:59:21 <mcdoker1818> Or is it a bug?
09:00:03 <d0ugal> Sorry, I got distracted for a moment there.
09:00:16 <d0ugal> mcdoker1818: Looking.
09:00:34 <mcdoker1818> http://localhost:8989/v2/execution/%ex_id%/spec
09:00:37 <mcdoker1818> for example
09:00:44 <d0ugal> Yeah, I remember now.
09:00:57 <d0ugal> mcdoker1818: I am happy for you to treat it like a bug
09:00:59 <rakhmerov> hm..
09:01:12 <rakhmerov> well, wait a second
09:01:23 <rakhmerov> so my concern is the following
09:02:02 <rakhmerov> the 'spec' field is an internal thing and it may look much different from the initial YAML text
09:02:06 <rakhmerov> that's the thing..
09:02:19 <rakhmerov> and that is why it was not exposed in the first place
09:03:02 <rakhmerov> as a solution, we could keep a snapshot of the initial *YAML* chunk but that's extra space
09:03:08 <d0ugal> yeah
09:03:18 <rakhmerov> in many cases we have many megabytes of data there
09:03:26 <d0ugal> I guess the best solution is to keep old versions of a workflow while there is a related execution.
09:03:41 <rakhmerov> and imagine if that is stored for every execution
09:04:03 <d0ugal> That is why I think it would be better to version the execution
09:04:09 <mcdoker1818> > the 'spec' field is an internal thing and it may look much different from the initial YAML text
09:04:17 <rakhmerov> d0ugal: yeah, the most decent solution that I can think of is workflow versioning
09:04:21 <mcdoker1818> Why do you think that it is a problem?
09:04:36 <d0ugal> so we just have one unique copy of every workflow - then each execution could have a reference to the workflow and a version id (the sha of the yaml contents?)
09:04:56 <rakhmerov> so that workflow definitions keep versions and the reference should look like "WF id = 1-2-3-4, ver 5"
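[Editor's sketch] The versioning idea floated here (one stored copy per unique workflow text, with each execution referencing a workflow plus a version id derived from a hash of the YAML) could look roughly like this. It is a minimal in-memory illustration under stated assumptions: WorkflowStore, start_execution, and the dictionary keys are invented for the example and are not part of Mistral's API, and SHA-256 is just one possible choice of hash.

```python
import hashlib


class WorkflowStore:
    """Hypothetical store keeping one copy per (name, content hash)."""

    def __init__(self):
        self._definitions = {}  # (name, version) -> YAML text

    def save(self, name, yaml_text):
        # The version id is the SHA-256 of the YAML contents, so saving
        # identical text twice does not create a second copy.
        version = hashlib.sha256(yaml_text.encode("utf-8")).hexdigest()
        self._definitions.setdefault((name, version), yaml_text)
        return version

    def get(self, name, version):
        return self._definitions[(name, version)]


def start_execution(store, name, yaml_text):
    """Record which exact workflow version an execution was started from."""
    version = store.save(name, yaml_text)
    return {"workflow_name": name, "workflow_version": version}
```

With this shape, updating a workflow adds a new version instead of overwriting the old one, so an execution can always look up the text it was started from.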
09:05:05 <d0ugal> mcdoker1818: if we expose an internal data structure to users then we need to support and document it.
09:05:24 <rakhmerov> yep
09:05:40 <rakhmerov> I'd really hold on with this for now, honestly
09:05:41 <mcdoker1818> Nope, we will not. We create an ExecutionSpec resource
09:05:58 <mcdoker1818> And we will expose it
09:06:14 <rakhmerov> mcdoker1818: from the user's perspective it doesn't make much sense
09:06:15 <d0ugal> I don't follow.
09:06:20 <rakhmerov> we already have workflows
09:06:26 <rakhmerov> (i.e. workflow definitions)
09:06:46 <rakhmerov> why would a user need to deal with one more thing that's called Spec something..
09:06:47 <rakhmerov> ?
09:07:08 <rakhmerov> from the user perspective they just deal with different versions of workflows
09:07:16 <rakhmerov> but we don't support it properly now
09:08:04 <rakhmerov> it can be done on the user side though: just introduce a policy of never updating workflows and use special naming that includes a version
09:08:05 <rakhmerov> that's it
09:08:07 <mcdoker1818> of you right, yes
09:08:19 <mcdoker1818> all of you are right, yes
09:08:31 <rakhmerov> you can even forbid the corresponding operation in the policy.json file
09:08:33 <rakhmerov> yeah
09:09:11 <d0ugal> That is a decent work-around for now.
09:09:52 <d0ugal> Okay - so I guess we consider this bug to be invalid?
09:09:58 <mcdoker1818> I'm sorry, I missed it: what is the wa?
09:10:27 <d0ugal> mcdoker1818: What rakhmerov said. Don't let users update workflows via policy.json and then you can always get the original workflow.
09:10:37 <mcdoker1818> hahahaah
09:10:47 <d0ugal> Then instead of updating workflow you always create new workflows (and delete old ones later)
09:10:50 <mcdoker1818> nope, it's not a wa :)
09:11:06 <d0ugal> Why not?
09:11:41 <rakhmerov> sorry, what is "wa"?
09:11:41 <mcdoker1818> We reference a workflow by name when starting a sub-workflow (from a task)
09:11:43 <rakhmerov> not following..
09:11:51 <mcdoker1818> work-around
09:11:58 <rakhmerov> ooh
09:12:06 <d0ugal> Good point. That makes it harder.
09:12:21 <rakhmerov> sub-workflow names can be dynamic :)
09:12:25 <rakhmerov> but I understand, ok
09:12:34 <d0ugal> but that would get ugly :)
09:13:46 <d0ugal> mcdoker1818: Why do you need the original spec?
09:13:55 <mcdoker1818> UI
09:13:55 <d0ugal> What do you want to do with it?
09:13:59 <d0ugal> I see
09:14:11 <d0ugal> Would it fit in the workflow execution description? :)
09:14:59 <d0ugal> no, I guess not.
09:15:03 <mcdoker1818> Sorry, I don't understand you :)
09:15:36 <d0ugal> Don't worry - the idea won't work.
09:16:25 <rakhmerov> the reason I'm arguing this much is that I don't want workarounds
09:16:28 <rakhmerov> I hate them )
09:16:46 <rakhmerov> they have a tendency to stick around for a long time
09:17:04 <rakhmerov> if this is so important then why not implement it in the right way?
09:17:04 <mcdoker1818> :D
09:17:14 <mcdoker1818> I like the idea with the version
09:17:37 <rakhmerov> d0ugal: maybe we just need to write a spec, make sure it's backwards compatible, and implement it decently?
09:17:45 <rakhmerov> rather than ending up with workarounds
09:18:02 <d0ugal> rakhmerov: Right, that is why I am trying to think of a workaround from the user's perspective
09:18:04 <d0ugal> I have never used namespaces - but could a new, unique namespace be used each time? do sub-workflow executions look within a namespace?
09:18:45 <d0ugal> rakhmerov: sure, I just wasn't sure how easy/realistic that was
09:18:55 <d0ugal> but I am happy for workflow versioning to be added.
09:19:44 <rakhmerov> ok
09:20:03 <rakhmerov> so yes, my suggestion is to at least write a spec and see how hard it would be
09:20:12 <d0ugal> mcdoker1818: is this something you can work on?
09:20:30 <rakhmerov> let's do some research within a reasonable amount of time and then make a decision, how does that sound?
09:20:55 <d0ugal> Sure, sounds good. I don't have the time for this, but somebody should go fot it :)
09:20:58 <d0ugal> for it*
09:21:29 <mcdoker1818> It's not a blocker for me right now, but I think it will be :) Yeap, I can do it a little later
09:22:18 <d0ugal> Okay, sounds good
09:22:28 <d0ugal> I'll mark the bug as invalid for now and summarise this.
09:22:39 <mcdoker1818> Ok
09:25:03 <d0ugal> Thanks!
09:25:06 <d0ugal> I am going to end the meeting bot before I forget, but I will be around for the rest of the day :) it is only 10:30am here.
09:25:08 <d0ugal> #endmeeting