20:05:29 <mikeyp> #startmeeting Orchestration
20:05:30 <openstack> Meeting started Thu Nov 17 20:05:29 2011 UTC.  The chair is mikeyp. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:05:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic.
20:05:54 <maoy> #topic workflow engines
20:06:17 <mikeyp> #topic workflow engines
20:06:24 <maoy> :)
20:07:00 <maoy> i was wondering about the error handling in the mailing list
20:07:03 <mikeyp> we don't have a full agenda - I think it's workflow engines, eventlet/zookeeper, and anything else
20:07:20 <mikeyp> ok
20:07:35 <maoy> i'm more interested in how they handle runtime errors
20:08:14 <maoy> and also whether they deal with failure issues, such as a node crashing
20:08:26 <mikeyp> main thing I noticed was that exceptions are just raised; there didn't appear to be any concept of exception handling specific to the workflow.
20:08:55 <maoy> is the exception raised in another node (or another Python interpreter)?
20:09:27 <mikeyp> it's single threaded, no concept of concurrency or parallelism.
20:09:37 <maoy> got it
20:09:51 <maoy> but we need something that can handle those
20:11:19 <mikeyp> definitely, but I didn't find any cloud-grade (tm) workflow libraries
20:12:04 <mikeyp> it raises the larger point of how this will all work together - I think we need Sandy for that.
20:12:26 <maoy> ok. i'll put some thoughts on that too.
20:12:30 <maoy> i'll try to convert my powerpoint proposal to a wiki page before next meeting.
20:12:57 <mikeyp> the strawman I have in my head is 'orchestration' is a reliable service, that calls into other openstack services.
20:13:01 <maoy> now that i've read through the nova code i have a better idea of how to fit it into the code..
20:13:14 <maoy> yes
20:13:43 <mikeyp> I'm not sure what the granularity would be, either in initial or later releases.
20:14:24 <maoy> i think combining that, with more orchestration cooperation logic inside the compute/network nodes, we have something there.
20:14:28 <mikeyp> it seems like TROPIC could support fine grained control.
20:14:55 <maoy> the "orchestrator" might actually nicely fit with the scheduler
20:15:47 <mikeyp> agreed - I see changes there.
20:16:20 <maoy> mike, can you elaborate on "fine grained control"?
20:16:47 <mikeyp> just the general level of steps.
20:16:54 <maoy> ok
20:17:33 <mikeyp> so today, the operations are pretty high level.  Schedule calls create, and a large number of things happen.
20:18:15 <mikeyp> should those individual operations be coordinated by orchestration ?
20:18:43 <maoy> i think if they are non trivial, e.g. take a while to finish
20:18:53 <maoy> they should report their status
20:19:10 <maoy> so that the orchestrator could 1) know what's going on
20:19:17 <maoy> 2) if it's stuck/dead/crashed
20:19:29 <maoy> 3) abort, or restart if necessary
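
Editor's note: the three points above (know what is going on, detect a stuck or crashed step, then abort or restart) can be sketched as a small status tracker. This is only an illustrative sketch, not code from nova or TROPIC. The names StatusTracker, TaskRecord and Action, the timeout value, and the restart limit are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    RESTART = "restart"
    ABORT = "abort"


@dataclass
class TaskRecord:
    status: str        # last status string the worker reported
    last_seen: float   # time of the most recent report (or restart)
    restarts: int = 0  # how many times the orchestrator restarted it


class StatusTracker:
    """Hypothetical orchestrator-side view of long-running steps."""

    def __init__(self, timeout=30.0, max_restarts=2):
        self.timeout = timeout            # silence longer than this means "stuck"
        self.max_restarts = max_restarts  # restarts allowed before giving up
        self.tasks = {}

    def report(self, task_id, status, now):
        # 1) Workers report status so the orchestrator knows what is going on.
        rec = self.tasks.get(task_id)
        if rec is None:
            self.tasks[task_id] = TaskRecord(status, now)
        else:
            rec.status = status
            rec.last_seen = now

    def check(self, task_id, now):
        rec = self.tasks[task_id]
        # Finished or already aborted tasks need no further action.
        if rec.status in ("done", "aborted"):
            return Action.NONE
        # 2) A task that has reported recently is considered alive.
        if now - rec.last_seen <= self.timeout:
            return Action.NONE
        # 3) Silent for too long: restart while allowed, otherwise abort.
        if rec.restarts < self.max_restarts:
            rec.restarts += 1
            rec.last_seen = now
            rec.status = "restarting"
            return Action.RESTART
        rec.status = "aborted"
        return Action.ABORT
```

The clock is passed in as `now` rather than read inside the class, which keeps the sketch deterministic. A real service would persist these records (for example in the database or in zookeeper, as discussed below) instead of holding them in memory.
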
20:20:37 <mikeyp> #action get Sandy's input on granularity of orchestration
20:21:00 <maoy> the state of the workflow progress should be available
20:21:14 <maoy> it could be either in database, or in zookeeper
20:21:48 <maoy> right now, the task_state column is kind of like that
20:22:12 <maoy> but can definitely be improved
20:22:14 <mikeyp> yes - when I've done this in the past, workflow runs independently of other operations, and can be interrogated
20:24:05 <mikeyp> in your TROPIC work, were there multiple workflow servers ?
20:24:12 <maoy> i also like to use the analogy of the OS process
20:24:45 <maoy> we essentially need to build mechanisms to track the distributed processes as a coherent workflow
20:24:54 <maoy> restart, or abort it if necessary
20:26:07 <maoy> if you look at business workflow management software, it is solving a different problem
20:26:53 <maoy> yes. in TROPIC we call them controllers
20:27:01 <maoy> there are multiple of them
20:27:04 <mikeyp> business workflow tends to focus on process control, rather than process execution.
20:27:22 <maoy> but only one is elected as leader to make decisions
20:27:49 <mikeyp> so one is active, the others are 'standby' or failover ?
20:28:08 <maoy> yes
20:28:36 <mikeyp> got it - that's what I thought the paper said.
20:28:41 <maoy> it's hard to make distributed decision. :)
20:29:07 <maoy> although possible, we ran the numbers and it seems one active controller is fast enough
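
Editor's note: the "one active leader, others on standby" arrangement described above can be sketched in memory as follows. This is not TROPIC's implementation. It is loosely modeled on the common ZooKeeper leader-election recipe, where each participant gets a sequence number and the lowest one leads. The Election class and its methods are hypothetical names, and nothing here talks to a real ZooKeeper.

```python
import itertools


class Election:
    """Hypothetical in-memory stand-in for a coordination service."""

    def __init__(self):
        self._seq = itertools.count()  # ever-increasing sequence numbers
        self._members = {}             # controller name -> sequence number

    def join(self, name):
        # Each controller that joins receives the next sequence number.
        self._members[name] = next(self._seq)

    def leave(self, name):
        # Models a controller crashing or shutting down.
        self._members.pop(name, None)

    def leader(self):
        # The controller with the lowest sequence number is the leader;
        # every other member is a standby.
        if not self._members:
            return None
        return min(self._members, key=self._members.get)

    def is_leader(self, name):
        return self.leader() == name
```

When the leader leaves, the next-lowest member takes over without any vote among the standbys, which matches the failover behaviour discussed above: only one controller makes decisions at a time.
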
20:29:40 <maoy> i also looked at the other proposal mentioned in last meeting
20:29:44 <maoy> from dragon
20:30:00 <maoy> i felt it's very similar to the ppt file I sent
20:30:03 <mikeyp> I#topic pacemaker
20:30:12 <mikeyp> #topic pacemaker
20:30:47 <mikeyp> I haven't really reviewed it, was mostly looking at libraries.
20:31:03 <mikeyp> what are the main differences ?
20:31:27 <maoy> between dragon's and mine?
20:31:52 <mikeyp> yes
20:32:42 <maoy> mine also proposes to keep logs so that we can automatically roll back
20:33:42 <maoy> hold on
20:33:53 <maoy> i need to refresh my memory. :)
20:34:31 <maoy> #action maoy gives dragon proposal feedback
20:34:42 <maoy> i'll do this in an email after the meeting
20:34:53 <mikeyp> #action maoy gives dragon proposal feedback
20:35:15 <mikeyp> ok, I will also review it.
20:35:32 <maoy> i don't know much about pacemaker
20:35:42 <mikeyp> #action mikeyp to review dragon's proposal
20:36:00 <maoy> the picture of pacemaker seems to suggest that corosync is a dependency, which i also know nothing about
20:37:02 <maoy> i got zookeeper working with eventlet
20:37:11 <maoy> so that's not a concern.
20:37:29 <mikeyp> #link https://lists.launchpad.net/openstack/msg03767.html  dragondm's proposal
20:37:44 <mikeyp> #topic zookeeper / eventlet
20:38:00 <mikeyp> yes, I saw that, good progress.
20:38:37 <mikeyp> #topic vm-state transitions
20:38:53 <mikeyp> The proposed vm state transitions are in review
20:39:06 <mikeyp> #link https://review.openstack.org/#change,1695
20:39:34 <mikeyp> They seem to be held up, but I'm reviewing the changes anyways.
20:40:07 <mikeyp> I should have said state transition management
20:41:06 <maoy> somehow i felt that the solution they proposed is a little too complicated
20:41:41 <maoy> i remember i saw a big state transition table in the summit
20:42:01 <maoy> hopefully it can be simplified, otherwise it's hard to debug
20:42:53 <mikeyp> hopefully, orchestration can remove some of the complications.
20:43:00 <maoy> exactly
20:43:20 <mikeyp> so, what else do we have ?
20:43:37 <mikeyp> #topic wrap up
20:43:56 <maoy> not much
20:44:34 <maoy> next week is thanksgiving
20:44:36 <mikeyp> OK, then let's wrap up till Sandy can review - I know he was out of pocket travelling today.
20:45:20 <maoy> cool
20:45:28 <mikeyp> #action mikeyp to send email re: next week schedule
20:45:34 <mikeyp> ok, ttyl.
20:45:48 <mikeyp> #endmeeting