13:00:43 <Qiming> #startmeeting senlin
13:00:45 <openstack> Meeting started Tue Feb 16 13:00:43 2016 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:46 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:48 <openstack> The meeting name has been set to 'senlin'
13:00:54 <Qiming> hello
13:00:59 <haiwei> hi
13:00:59 <yanyanhu> hi
13:01:03 <elynn> hi
13:01:04 <lixinhui_> hi
13:01:29 <Qiming> hi, pls check meeting agenda and see if you have things to add: https://wiki.openstack.org/wiki/Meetings/SenlinAgenda
13:01:58 <Qiming> welcome back, guys, hope you all have enjoyed a pleasant vacation, :)
13:02:17 <yanyanhu> thanks, you too :)
13:02:36 <Qiming> #topic Mitaka work items
13:02:45 <Qiming> #link https://etherpad.openstack.org/p/senlin-mitaka-workitems
13:02:46 <haiwei> yes, you four
13:03:11 <Qiming> API revision
13:03:36 <Qiming> don't think we have made progress on it, or that we can get them into the M release
13:04:13 <Qiming> so ... maybe we should move them back to the TODO.rst
13:04:34 <Qiming> Heat resource type
13:04:36 <Qiming> ethan?
13:04:42 <elynn> in progress
13:04:49 <elynn> code is submitted
13:05:01 <elynn> waiting for review
13:05:12 <Qiming> I have some problems regarding the logic you used for testing action completion
13:05:37 <elynn> Can't access gerrit now...
13:05:38 <Qiming> there might be some cleaner ways to do that
13:05:49 <elynn> I will check that later.
13:05:49 <Qiming> good luck, :)
13:06:14 <Qiming> okay
13:06:20 <elynn> You mean move them to senlin plugin?
13:06:39 <Qiming> maybe you can paste the links to the related patches into the etherpad
13:06:52 <Qiming> yes, it could be a routine on senlin plugin
13:07:02 <Qiming> check something (action) has completed
13:07:03 <elynn> Will do after my network is back to normal...
13:07:11 <Qiming> great
13:07:36 <Qiming> also, some checkings are better offloaded to some threads
13:07:37 <elynn> I replied to you in that patch, it's not that easy... I will have a look at whether I can move them to the senlin plugin.
13:08:11 <Qiming> I'm not quite sure I have a good understanding on doing things asynchronously in Heat
13:08:28 <Qiming> There might be some ways ... need some homework on that
13:08:55 <elynn> I think some of them are asynchronous in my code.
13:09:06 <Qiming> great
13:09:06 <elynn> Can you be more specific?
13:09:25 <Qiming> have to look at the code to be more specific
13:09:31 <Qiming> :)
13:09:35 <elynn> ok :)
13:09:49 <Qiming> testing
13:10:01 <Qiming> cannot recall why we have oslo.messaging scalability there
13:10:32 <Qiming> someone has got a higher memory capacity?
13:10:33 <yanyanhu> also have no idea
13:10:47 <Qiming> okay, two overflowed
13:10:57 <elynn> Not sure about that...
13:11:02 <Qiming> 3
13:11:13 <haiwei> neither
13:11:19 <lixinhui_> seems xujun mentioned about scalability test
13:11:20 <Qiming> It could be from Junwei Liu's talk during our meetup
13:11:20 <yanyanhu> oh, maybe about a discussion in midcycle meeting
13:11:24 <yanyanhu> yep
13:11:39 <lixinhui_> but not sure what we can do on this
13:11:45 <Qiming> right ...
13:11:56 <yanyanhu> about the API service no longer responding to requests in some cases
13:12:05 <Qiming> maybe need to check with xujun and see if there are resources from their side working on this
13:12:25 <lixinhui_> good idea
13:12:30 <Qiming> yanyanhu, you mean senlin API?
13:12:36 <yanyanhu> yes, I think this job is about best practice
13:12:41 <lixinhui_> if anything need more testing, Bran_W can help too
13:12:43 <yanyanhu> Qiming, yes
13:12:44 <elynn> Seems you told him to increase workers to see if that can be solved?
13:12:59 <yanyanhu> e.g. when creating a cluster consisting of 500 nodes
13:13:05 <Qiming> lixinhui_, please do
13:13:17 <lixinhui_> okay, Qiming
13:13:23 <Qiming> we need to know the limits
13:13:44 <Qiming> with both cloud backends, the usual one, and the faked one
13:13:48 <yanyanhu> elynn, yes, increasing worker is most direct solution
13:14:00 <yanyanhu> but still need more evaluation
13:14:08 <lixinhui_> Bran_w has some data already, will send all the results for review and suggestion.
13:14:20 <Qiming> yanyanhu just added some more items?
13:14:26 <Qiming> COOOL
13:14:29 <yanyanhu> Qiming, yes. about test
13:14:48 <yanyanhu> I'm now working on completing functional test for cluster/node
13:14:59 <Qiming> maybe those are the same thing bran has already started?
13:15:04 <yanyanhu> to add test cases for cluster add/del node, cluster update
13:15:19 <yanyanhu> oh, you mean stress test?
13:15:21 <Qiming> right, those are yours
13:15:27 <Qiming> yes, stress tests
13:15:37 <lixinhui_> Yes, Bran_w focus on Stress tests
13:15:47 <yanyanhu> actually my plan is not just doing stress test :)
13:16:08 <Qiming> I'm putting xinhui's name under that item
13:16:13 <yanyanhu> I'm thinking whether we need to add a gate job for it :)
13:16:31 <Qiming> gate job for stress test?
13:16:38 <yanyanhu> to test the scalability for some important changes
13:16:42 <yanyanhu> yes
13:16:56 <elynn> I remember there's a project in the community that can help us do stress tests?
13:16:58 <Qiming> emm ... don't think that's a good idea, :)
13:17:00 <elynn> rally?
13:17:06 <yanyanhu> I have got a clear idea of how to do this
13:17:23 <lixinhui_> okay, Qiming, yanyanhu, let me know if anywhere needs labor work
13:17:26 <yanyanhu> yea, so we need more discussion here:)
13:17:44 <lixinhui_> :)
13:17:45 <Qiming> great if you have already got a good idea
13:17:45 <yanyanhu> xinhui, will talk more with you about this issue
13:17:51 <yanyanhu> let's think about it together
13:18:05 <Qiming> my concern is resources at gate are very precious ones
13:18:11 <lixinhui_> okay :)
13:18:19 <yanyanhu> Qiming, yes, understand
13:18:46 <yanyanhu> so maybe a manually triggered one. Or just a test tool in the senlin source tree
13:19:10 <Qiming> also, need a clear design to ensure that we are stressing the correct target
13:19:22 <yanyanhu> absolutely
13:19:35 <Qiming> stress testing a cluster of 1000 m1.huge nova servers sounds crazy
13:19:41 <yanyanhu> haha
13:19:45 <lixinhui_> :)
13:19:46 <yanyanhu> sure
13:19:53 <Qiming> it is not gonna yield any useful insights
13:20:26 <Qiming> how about we bring bran into this
13:20:39 <lixinhui_> Yes, Qiming
13:20:41 <lixinhui_> please
13:20:55 <yanyanhu> great
13:21:05 <haiwei> who is bran?
13:21:08 <Qiming> I believe VMware has a lot of interest in seeing the scalability report
13:21:19 <Liuqing> cool
13:21:32 <haiwei> VMware guy?
13:21:39 <Qiming> bran is lixinhui_ 's assistant, you can think of him that way
13:21:45 <lixinhui_> Yes, Haiwei
13:21:53 <haiwei> niubility
13:21:56 <yanyanhu> I believe built stress test tool and test results are very useful for user who want to manage large scale cluster using Senlin
13:22:01 <Qiming> technical assistant
13:22:07 <yanyanhu> s/built/builtin
13:22:13 <Qiming> absolutely
13:22:16 <lixinhui_> just trying to get more resource to help senlin
13:22:17 <Liuqing> haha  Qiming:)
13:22:25 <yanyanhu> although maybe it's just prototype or an example
13:22:34 <lixinhui_> and bring senlin into VMware market engagement
13:23:06 <Qiming> move on?
13:23:15 <yanyanhu> ok
13:23:18 <lixinhui_> ok
13:23:22 <haiwei> it's better so
13:23:24 <Qiming> oh, functional test for failure scenarios
13:23:37 <Qiming> I think that sounds like a tempest job
13:23:41 <yanyanhu> not sure how to do this
13:24:03 <yanyanhu> yea, maybe not use existing functional test
13:24:04 <Qiming> currently, we are only testing the 'correct' execution paths
13:24:43 <Qiming> need to test all weird inputs and make sure api and engine always behave as designed
13:25:08 <Qiming> that will be a huge effort
13:25:18 <yanyanhu> yes, and I think we need a design for 'fault injection'
13:25:52 <yanyanhu> actually, the functional test for lock breaker also faces similar problem
13:25:54 <Qiming> em ... that would be cool
13:26:02 <yanyanhu> yep :)
13:26:20 <Qiming> but before that, I really hope we have a good idea about some typical cases
13:26:34 <yanyanhu> agree
13:26:53 <Qiming> say, creating a cluster of 100 nodes, but only first 80 were successful
13:27:19 <Qiming> scaling-in by 10 nodes, but we only managed to remove 3 nodes ...
13:27:35 <yanyanhu> yep
13:28:02 <Qiming> these are more realistic challenges
13:28:27 <lixinhui_> very important for industry level quality
13:28:31 <Qiming> maybe add a TODO item for this?
13:28:45 <lixinhui_> good idea
13:28:46 <yanyanhu> yes, I think so
13:29:05 <elynn> yes, that would be good!
13:29:08 <Qiming> hopefully, we can get something done before M release
13:29:29 <yanyanhu> maybe not only Senlin need this kind of tool/design :)
13:29:29 <lixinhui_> is there sad path testing already in other projects?
13:29:34 <Qiming> document the rest as 'known issues', haha
13:29:35 <lixinhui_> like HEAT
13:30:08 <lixinhui_> okay
13:30:12 <Qiming> don't know if any project is doing defensive programming
13:30:34 <Qiming> off topic now
13:30:42 <yanyanhu> :)
13:30:46 <lixinhui_> okay
13:31:00 <Qiming> #action Qiming to add TODO item about testing typical failure scenarios
13:31:02 <elynn> Heat has an internal resource that can fake its status. you can set it to failed or to success.
13:31:36 <Qiming> good idea
13:31:46 <elynn> in that way heat can test failure cases.
13:31:48 <Qiming> in senlin we can have some fake actions
13:31:57 <Qiming> it may or may not succeed
13:32:13 <yanyanhu> do we need to expose API/rpc interface for it?
13:32:21 <Qiming> oh, no
13:32:34 <Qiming> it is a testing support
13:32:47 <yanyanhu> so we need to consider how to trigger it
13:32:59 <Qiming> yep
13:33:23 <Qiming> let's think of it offline
13:33:31 <yanyanhu> ok
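The "fake action" idea discussed above could be sketched roughly as follows. This is only an illustration, not Senlin code: `FakeAction`, `fail_after`, and `run_batch` are hypothetical names. It models an operation over many nodes where only the first few succeed, similar to the "100 nodes, but only first 80 were successful" case mentioned earlier.

```python
class FakeAction:
    """Hypothetical test double: succeeds until a configured failure point."""

    def __init__(self, fail_after=None):
        # fail_after=None means the action never fails.
        self.fail_after = fail_after
        self.calls = 0

    def execute(self, node_name):
        self.calls += 1
        if self.fail_after is not None and self.calls > self.fail_after:
            return "FAILED"
        return "SUCCEEDED"


def run_batch(action, node_names):
    """Apply the action to each node and split the names by outcome."""
    succeeded, failed = [], []
    for name in node_names:
        if action.execute(name) == "SUCCEEDED":
            succeeded.append(name)
        else:
            failed.append(name)
    return succeeded, failed


# Simulate creating 100 nodes where only the first 80 succeed.
nodes = ["node-%d" % i for i in range(100)]
ok, bad = run_batch(FakeAction(fail_after=80), nodes)
print(len(ok), len(bad))
```

Because the failure point is deterministic, a test can assert on exactly which nodes end up in each list; a randomized variant would be harder to assert against.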
13:33:49 <Qiming> btw, should we create a senlin-specs project?
13:34:06 <yanyanhu> not that necessary I think
13:34:20 <yanyanhu> in this cycle
13:34:34 <Qiming> or we can have a specs dir in tree
13:34:37 <elynn> maybe a folder in senlin repo is enough?
13:34:59 <Qiming> doc/specs ?
13:35:22 <Qiming> there we can discuss designs like this asynchronously
13:35:23 <haiwei> yes, should have that kind of thing
13:35:36 <yanyanhu> sounds good
13:35:47 <haiwei> though we didn't use it at all
13:36:13 <lixinhui_> I just wrote spec for nova
13:36:26 <lixinhui_> and think we will need that when allow others to extend senlin
13:36:41 <lixinhui_> according to my limited experience
13:36:45 <Qiming> let's practice a small scale specs discussion
13:36:55 <haiwei> honestly that is the formal way
13:36:59 <Qiming> and migrate to a dedicated project when we feel necessary
13:37:35 <lixinhui_> sounds like a great starting point
13:37:35 <Qiming> so you prefer a senlin-specs project ?
13:37:53 <lixinhui_> not really a project at this time
13:38:02 <Qiming> okay
13:38:02 <lixinhui_> but at least something others can follow
13:38:17 <lixinhui_> spec dir would be good and doc seems not enough
13:38:57 <Qiming> if we discuss things in tree
13:39:14 <Qiming> we can always easily migrate the outputs into design docs
13:39:30 <Qiming> anyway
13:39:38 <lixinhui_> yes
13:39:38 <Qiming> moving on
13:39:45 <haiwei> you mean using mailing?
13:39:50 <Qiming> progress on health mgmt?
13:39:51 <haiwei> ok, go on
13:40:17 <lixinhui_> the senlinclient change is blocked by sdk change
13:40:26 <Qiming> yep, sadly, that's true
13:40:34 <lixinhui_> but I am still working on senlin and sdk side
13:40:50 <Qiming> yes
13:41:03 <lixinhui_> hope to commit them first, then extend the client once the sdk problem is resolved
13:41:20 <lixinhui_> for the semi-automation part
13:41:40 <lixinhui_> I will commit a patch first for comments this week
13:41:40 <Qiming> okay, hopefully everything will get sorted out by the end of this week on the sdk side
13:41:51 <Qiming> cool
13:41:54 <yanyanhu> and will add functional test for node check/recover
13:42:01 <Qiming> documentation side
13:42:06 <yanyanhu> and for cluster as well after xinhui's job is done
13:42:11 <Qiming> did a little revision to the senlin wiki
13:42:28 <Qiming> need to work more on that
13:42:37 <lixinhui_> okay, yanyanhu and Qiming
13:42:49 <lixinhui_> Yes, Qiming
13:42:49 <Qiming> container support
13:43:09 <Qiming> emm ... I have tried something 'last year'
13:43:32 <haiwei> I also did something in the last few days
13:43:33 <Qiming> I think I'm stuck by a scheduling problem
13:43:39 <Qiming> cool
13:43:43 <lixinhui_> you mean last days on horse year
13:44:04 <Qiming> when you want to create a container, you need to specify on which VM/host to start it
13:44:12 <yanyanhu> yes
13:44:24 <Qiming> that becomes a huge scheduling problem
13:44:42 <Qiming> though we can do some very naive 'placement' today
13:44:56 <yanyanhu> that's because there is no underlying service helping to make a default scheduling decision
13:45:02 <Qiming> this could be a spec for discussion
13:45:18 <Qiming> yanyanhu, exactly
13:45:32 <elynn> k8s/mesos/swarm can do schedule task?
13:45:34 <haiwei> I think if you want to create a container cluster by Senlin, you will have to do something similar to k8s and swarm
13:45:36 <elynn> can't
13:45:43 <Qiming> I know Haiwei has proposed a topic for the coming summit on this
13:45:58 <lixinhui_> cool, haiwei
13:45:58 <yanyanhu> elynn, they can
13:46:25 <haiwei> an idea from Qiming
13:46:34 <lixinhui_> :)
13:46:39 <haiwei> and also some magnum guys
13:46:39 <Qiming> we need to help him get something solid before the summit, no matter whether the talk is accepted or not
13:47:05 <Qiming> #action Haiwei to start a spec discussion about container support
13:47:14 <Qiming> haiwei, okay?
13:47:23 <haiwei> ok
13:47:34 <Qiming> Receiver functional test, done?
13:47:42 <yanyanhu> yep
13:47:48 <yanyanhu> we can remove this item
13:48:02 <Qiming> omg, finally found something to remove
13:48:09 <lixinhui_> haha
13:48:11 <yanyanhu> :)
13:48:16 <elynn> :D
13:48:18 <Qiming> NODE_CREATE/DELETE
13:48:26 <Qiming> no progress I know of
13:48:36 <yanyanhu> another difficult piece of work
13:48:36 <Qiming> engine status list
13:48:39 <Qiming> done?
13:48:45 <elynn> seems so.
13:48:48 <yanyanhu> still minor issue there
13:49:05 <yanyanhu> function is ok now
13:49:07 <elynn> yes, still have a bug open.
13:49:15 <Qiming> don't feel shy to file a bug
13:49:27 <yanyanhu> yes, elynn has done that
13:49:45 <Qiming> okay, last item
13:49:45 <Qiming> sdk change
13:49:47 <elynn> It's already filed :) I will look into it tomorrow :)
13:49:58 <yanyanhu> elynn, no hurry :)
13:50:05 <Qiming> it was a series of disruptive changes
13:50:21 <Qiming> almost all resources have their property name reinvented ...
13:50:38 <Qiming> so we have to stick to 0.7.4 for senlin testing/functioning
13:50:45 <yanyanhu> em, need to rework senlin driver modules
13:51:02 <Qiming> then see if we can get a chance to catch up with 0.8.0 if that can be released next week
13:51:10 <Qiming> time window is very short
13:51:16 <elynn> Seems sdk is always changing... Hope it can be stable soon.
13:51:16 <Qiming> http://releases.openstack.org/mitaka/schedule.html#m-final-lib
13:51:33 <yanyanhu> will work on it if 0.8.0 can be released soon
13:51:33 <Qiming> not this kind of changes ... sigh
13:51:41 <Qiming> great
13:51:51 <yanyanhu> may need one or two days, mainly for test
13:51:53 <Qiming> that is something just want to raise awareness in the team
13:52:05 <Qiming> don't try to clone the latest sdk code for senlin development
13:52:54 <yanyanhu> got it
13:52:56 <Qiming> #topic sdk issues
13:53:00 <Qiming> done
13:53:14 <Qiming> #topic DB concurrency problem
13:53:24 <yanyanhu> it happened again?
13:53:26 <Qiming> I think chuck has reported the db concurrency problem again
13:53:32 <Qiming> sadly, yes
13:53:33 <yanyanhu> ...
13:53:40 <elynn> Is there any bug open for it?
13:54:27 <Qiming> this time, I'm gonna monkey patch the oslo.db
13:54:42 <Qiming> just like what mistral has todo
13:55:09 <yanyanhu> from engine level?
13:55:12 <Qiming> s/todo/to do
13:55:12 <Qiming> will propose a fix tomorrow
13:55:12 <yanyanhu> or session level
13:55:26 <Qiming> engine level
13:55:33 <yanyanhu> ok
13:55:53 <Qiming> when doing get_engine, we set isolation_level to READ_COMMITTED
13:56:20 <yanyanhu> ok
13:56:22 <Qiming> don't see other way out
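The engine-level approach described above could look something like the sketch below. It is only an illustration under assumptions: `with_isolation_level` and `stub_create_engine` are hypothetical names, and the stub stands in for whatever engine factory actually gets patched. SQLAlchemy's `create_engine` does accept an `isolation_level` argument such as "READ COMMITTED", but how oslo.db exposes it is not shown here.

```python
import functools


def with_isolation_level(factory, level="READ COMMITTED"):
    """Wrap an engine factory so every engine gets a default isolation level."""
    @functools.wraps(factory)
    def wrapper(*args, **kwargs):
        # Callers that pass isolation_level explicitly keep their choice.
        kwargs.setdefault("isolation_level", level)
        return factory(*args, **kwargs)
    return wrapper


def stub_create_engine(url, **kwargs):
    """Hypothetical stand-in for a real engine factory; just echoes its input."""
    return {"url": url, **kwargs}


# The connection URL below is a made-up example, not a real deployment value.
patched = with_isolation_level(stub_create_engine)
engine = patched("mysql://senlin@localhost/senlin")
print(engine["isolation_level"])
```

Using `setdefault` rather than overwriting keeps the patch from silently overriding a caller that deliberately asks for a different level.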
13:56:22 <Qiming> okay, 4 minutes for open
13:56:24 <Qiming> #topic open discussion
13:56:56 <yanyanhu> no other topic from me :)
13:57:13 <lixinhui_> no from me
13:57:16 <elynn> Nope from me :)
13:57:41 <haiwei> what do you think of k8s and docker swarm
13:57:50 <haiwei> ok, no from me
13:58:09 <yanyanhu> haiwei, you mean use them as backend service to manage container?
13:58:15 <Qiming> haiwei, I am all ears to any entry level talks on k8s or docker swarm
13:58:24 <haiwei> I mean replace them :)
13:58:31 <lixinhui_> :)
13:58:44 <haiwei> sounds impossible
13:58:45 <Qiming> haiwei, don't think that should be the goal
13:58:47 <yanyanhu> wow, haven't thought that before :)
13:59:04 <Qiming> the goal should be 'openstack native way to manage container clusters'
13:59:09 <yanyanhu> just got some experience about Mesos and glad to share it with you guys
13:59:15 <haiwei> ok, I will think about that more
13:59:22 <Qiming> or 'make containers first class citizens on openstack'
13:59:23 <yanyanhu> will join the discussion about container cluster
13:59:39 <haiwei> thanks
13:59:43 <Qiming> haiwei, be sure to start a spec or two on that
13:59:45 <lixinhui_> interesting, yanyanhu
13:59:50 <Qiming> time is up
13:59:56 <haiwei> no problem, Qiming
13:59:58 <yanyanhu> lixinhui_, yes, let's talk about this offline
14:00:01 <Qiming> thanks for joining everyone
14:00:01 <Qiming> good night
14:00:04 <Qiming> #endmeeting