13:00:35 #startmeeting senlin
13:00:36 Meeting started Tue Dec 22 13:00:35 2015 UTC and is due to finish in 60 minutes. The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:37 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:39 The meeting name has been set to 'senlin'
13:01:07 good evening
13:01:15 hi
13:01:20 Hi
13:01:24 hi
13:01:53 pls check meeting agenda and see if there are things you want to add
13:01:57 #link https://wiki.openstack.org/wiki/Meetings/SenlinAgenda
13:03:03 let's start with the etherpad
13:03:51 heat resource type
13:04:05 I'm still stuck in cluster update
13:04:18 don't know how to check the status
13:04:49 elynn, you mean the status of senlin cluster as heat resource?
13:04:51 well, okay
13:05:06 yanyanhu: yes
13:05:19 after the api revision, we are returning 202 as status code for create, update and delete operations
13:05:22 the status changes from 'active' -> 'updating' -> 'active'
13:05:52 we can never promise you those operations are performed synchronously
13:06:08 in functional test, I checked action status to get the cluster update/resize/scale result
13:06:14 when returning the 202 status code, we are also returning some hints in the header
13:06:35 in the body, we also return the action id
13:06:46 header only contains location
13:07:23 elynn, then probably check the body?
13:07:40 Qiming: haven't found a way to do it.
13:08:07 Client will translate the response data to cluster class...
13:08:32 ah, that's a problem
13:08:42 even for 202 return code?
13:08:57 I can check how we do in functional tests.
13:09:16 elynn, in functional test, we didn't use senlinclient
13:09:25 direct API access is used for testing
13:09:25 http://git.openstack.org/cgit/openstack/senlin/tree/senlin/tests/functional/api.py#n64
13:09:32 https://github.com/openstack/python-senlinclient/blob/master/senlinclient/v1/client.py#L85-L86
13:10:32 Is that ok directly using api request?
13:10:44 elynn, I guess we need to figure out how to get action information when using senlinclient to talk with service
13:11:29 yanyanhu, yes
13:11:32 elynn, yes, if you have access to the raw body, yes
13:12:04 That would be a little weird...
13:12:24 oh, cluster_update returns the cluster in the body
13:12:33 need to figure out a clean way on that
13:12:47 And I also encountered another problem
13:13:10 Qiming, any way to make sdk return action id to enduser as well as the resource itself?
13:13:32 the response comes from keystoneauth1 now
13:13:48 we have to have senlin server do that
13:13:57 After I do cluster-resize, cluster status will change to updating for a while, but before it's finished
13:14:21 elynn, you need to check the action, not the targeted object
13:14:42 we cannot guarantee you that a cluster is updated immediately
13:14:53 it might be locked for a resize at the moment
13:14:57 Qiming: en, I think I need to think of another way to check status.
13:15:04 yes
13:15:10 just like you said, check action's status
13:15:28 okay, we can add action to the response body if not included
13:16:10 Qiming, and maybe also need to revise senlinclient implementation
13:16:31 some functions like this https://github.com/openstack/python-senlinclient/blob/master/senlinclient/v1/client.py#L85
13:16:55 just not sure how to do it in a graceful way
13:16:58 okay, need some investigation there
13:17:10 whether non cluster property attributes will be filtered out
13:17:20 yes
13:17:46 yanyanhu, can you help elynn out there?
13:18:02 sure
13:18:07 thx
13:18:09 will think about this issue
13:18:11 thanks!
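[Editor's note] The advice in this topic is to track the action returned with a 202 response rather than the cluster's own status. A minimal sketch of that polling pattern follows. It is not the Senlin or senlinclient API: `get_action` is a hypothetical callable that takes an action id and returns a dict with a 'status' key, and the set of terminal status names is an assumption that should be checked against the Senlin API documentation.

```python
import time

# Terminal action states assumed for illustration only; consult the
# Senlin API documentation for the authoritative list.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_action(get_action, action_id, timeout=60.0, interval=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll an action until it reaches a terminal status or the timeout expires.

    get_action is a hypothetical callable: action_id -> dict with a 'status' key.
    sleep and clock are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout
    while True:
        action = get_action(action_id)
        if action["status"] in TERMINAL_STATUSES:
            return action
        if clock() >= deadline:
            raise TimeoutError(
                "action %s still %s" % (action_id, action["status"]))
        sleep(interval)
```

A caller such as a Heat resource plugin would pass in whatever function it has for fetching an action and treat a returned status other than the success state as a failed update.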
13:18:24 client unit test
13:18:27 you're welcome :)
13:18:31 almost done I think
13:18:38 yea
13:18:45 still last mile to complete the shell test
13:18:51 I saw the 4th patch for client test from haiwei
13:19:03 at the same time, I'm fleshing out some useless code from client
13:19:32 when doing that, I have found something that needs improvement at sdk side, will work on that also
13:19:41 Qiming: is your revision compatible with the old one?
13:20:13 so far there will be no impact on software using senlinclient.v1.client
13:20:36 you are not calling resize, scale_in, attach_policy stuff yet
13:20:50 will fix them before you start using them
13:20:52 https://review.openstack.org/#/c/260329/1/senlinclient/v1/client.pyL45 get_profile
13:21:15 it could accept id before, can it still accept id now?
13:21:28 yes
13:21:28 I notice you removed dict(id=value)
13:21:57 that was because we have been using our own self.get function
13:22:17 we have been reinventing a lot of wheels at senlinclient
13:22:31 because sdk was never ready
13:22:56 ok, now I'm clear
13:23:11 now with our last resource type 'receiver' under review (https://review.openstack.org/260310), we are throwing away our local stuff
13:23:27 we were doing that because we don't want our progress blocked by sdk
13:23:53 ideally, to be honest, heat should call openstacksdk, instead of senlinclient
13:24:09 senlin-dashboard should call openstacksdk, instead of senlinclient
13:24:22 then we can drop senlinclient.v1.client completely
13:24:37 got what I mean?
13:24:52 Qiming: yes
13:25:08 some day in 2016, we will drop python-senlinclient project completely
13:25:18 migrating to openstackclient
13:25:52 this has to be done step by step
13:26:02 We should push openstacksdk to get into big tent
13:26:19 API response modification, do we have a release note on that already?
13:26:34 nope, I think
13:27:05 fix-api-return-202-b9d31250c4d7c624.yaml
13:27:11 we have this in tree
13:27:25 oh
13:27:33 just can't access that link
13:27:49 sigh
13:28:03 http://git.openstack.org/cgit/openstack/senlin/tree/releasenotes/notes/fix-api-return-202-b9d31250c4d7c624.yaml
13:28:20 health policy
13:28:24 ok, if so I think this work has been done
13:28:25 lixinhui_ ?
13:29:01 we are experimenting with a message listener inside health manager
13:29:28 so far it is telling us that a nova server has been created/deleted/....
13:29:44 not a complex code
13:30:03 the difficulty lies in linking these events to actions
13:30:16 what do you mean by health manager? the one Liuhang added at the beginning?
13:30:36 need to think about how to do that, RPC?
13:30:36 Yes, Qiming
13:31:02 yes, health manager starts a thread that listens to the message queue, about 'notifications'
13:31:12 http://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/health_manager.py
13:31:14 we filter out irrelevant messages
13:31:58 lixinhui_ has been experimenting with that in the hope that we are not doing resource-wasting polling
13:32:17 Qiming, I think rpc is ok
13:32:53 still need to finish that framework before debugging the code
13:33:04 profile update
13:33:21 name and flavor update proposed, great, just reviewed, both look good
13:33:21 finally started working on this again
13:33:29 thanks
13:33:39 receiver
13:33:54 db -> engine -> rpc -> api -> client
13:33:56 done
13:34:16 next step is client support?
13:34:31 on the reverse direction removing webhook, client -> api -> rpc -> middleware -> engine -> module -> db
13:34:36 a patch has been merged
13:34:51 client support done and tested manually
13:34:57 ok
13:34:58 nice
13:35:10 and I will start working on revising the functional test of webhook
13:35:13 didn't get time to work on the middleware revision
13:35:19 since related API has been removed
13:35:27 due to a full afternoon 'patent' review
13:35:38 :P
13:35:48 we still have the webhook trigger api left
13:35:56 but that endpoint is not discoverable
13:36:06 you cannot do list, or get on it
13:36:09 :)
13:36:14 yes, but the creating/deleting part needs revision
13:36:18 yea
13:36:19 i'm evil
13:36:36 creation and deletion is done via receiver api now
13:36:36 you are robot
13:36:53 Qiming, you can work 7X24
13:37:02 engine, locks
13:37:12 temporarily commented out
13:37:17 lixinhui_, that depends on coffee and food
13:37:20 can be added back now
13:37:35 :)
13:37:37 need revise?
13:37:44 functional test on this would be difficult
13:38:00 yes, elynn, don't call action.set_status()
13:38:28 call dbapi.action_update({'status': 'ERROR'}) instead
13:38:29 What should I do?
13:38:37 ok
13:38:41 and ....
13:38:54 I'm thinking whether this lock breaker should be moved somewhere else
13:39:10 it is now sitting on the critical path on lock acquire
13:39:14 yes, maybe we just steal lock in some special cases
13:39:18 that is hurting
13:39:26 like, cluster/node deleting
13:39:50 before adding it back, please revisit the problem in addition to the current implementation
13:40:16 ok
13:40:28 functional test broken ...
13:40:32 I think we are good now
13:40:38 a whole week on it
13:41:13 still not sure it is 100% fixed
13:41:22 though we are passing gate now, :P
13:41:34 high priority bug, gone!
13:41:45 great job!
13:42:03 is it better to enable it at jenkins but non-voting?
13:42:05 blueprint review
13:42:19 elynn, it slows everything down
13:42:39 I'd suggest we keep it optional
13:42:46 currently, just need to check it before + workflow I think
13:43:07 we have a lot of work to do at the moment
13:43:11 ok got your point
13:43:15 cannot wait for the gate to say yes
13:43:53 when we are touching some functional logic, we need to remind ourselves, do a 'check experimental'
13:44:11 or when our progress is slowing down, things are more stable
13:44:14 we turn it on
13:44:14 yea
13:44:35 blueprints review <-- done?
13:45:03 https://blueprints.launchpad.net/senlin
13:45:04 I think so
13:45:08 no backlog now
13:45:59 I submitted a BP
13:46:01 https://blueprints.launchpad.net/senlin/+spec/support-health-management
13:46:01 maybe we will need a senlin-spec sub-project some time in future
13:46:22 so far we are good with bps
13:46:28 lixinhui_, approved
13:46:47 Qiming, Thanks for your help on this
13:46:57 thanks for the hard work
13:47:06 #topic mid-cycle meetup planning
13:47:11 hope I can add use case and implements after listener trials
13:47:14 #link https://etherpad.openstack.org/p/senlin-mitaka-midcycle
13:47:24 great, lixinhui_
13:47:37 currently we have a two-day schedule
13:47:42 lixinhui_, looking forward to your patch :)
13:47:54 I was hoping haiwei can join us
13:47:56 Thanks, yanyanhu :)
13:48:19 please review
13:48:46 ok
13:49:11 note that this is not merely a developer meetup
13:49:30 we are inviting some people to share their use case, their painpoints
13:49:52 how about coming to VMware :)
13:50:10 we'd like to solve their problems, rather than just developing a toy
13:50:26 lixinhui_, that is an option too
13:50:36 Qiming :)
13:51:25 I think we can decide the detail by next week so everybody has enough time to make preparation
13:51:27 lixinhui_, we'll strive to keep your coffee cup filled
13:51:44 haha
13:51:45 yes, please think about it
13:52:23 I'm about to send out an email to the os-dev mailinglist
13:52:42 ok, let's discuss it in mailing list
13:52:52 okay
13:53:08 if you have suggestions, you can input into the etherpad directly
13:53:53 #action Everyone to think about topics to share during meetup
13:54:02 #topic open discussion
13:54:57 We are trying to do some scalability testing these days
13:55:05 okay
13:55:11 about this bug https://bugs.launchpad.net/senlin/+bug/1528525
13:55:11 Launchpad bug 1528525 in senlin "senlin engine will be blocked when action chain is too long" [Undecided,New]
13:55:11 hope to get some data soon for everyone's review
13:55:58 In some cases, senlin engine will not respond to some operations.
13:55:59 elynn, 999 nodes in a cluster?
13:56:07 elynn, interesting
13:56:10 just a large number
13:56:10 will check it
13:56:32 elynn, I also noticed this issue.
13:56:46 elynn
13:56:48 okay, that sounds familiar
13:57:01 * Qiming sighs
13:57:05 KVM you are using?
13:57:09 when I create 5 clusters with 1000 nodes, service just stops responding to request like cluster list
13:57:11 yes
13:57:28 we are not limiting the number of workers I think
13:57:46 sounds like engine is overloaded
13:57:46 yanyanhu, in my case, api for list is still working
13:57:48 Qiming, the pool size is default 1000
13:57:55 I think so
13:57:59 but create or resize doesn't respond
13:58:03 pool works?
13:58:13 greenthread pool size
13:58:33 yanyanhu, didn't notice where we did a check on that
13:58:35 log shows engine is busy deleting nodes.
13:59:10 elynn, you may need more than one engine, :)
13:59:27 a good test for concurrency, again
13:59:36 Qiming, if you mean the check about concurrent action number, there is no check for it I think
13:59:39 well, time's up
13:59:39 Qiming: ......
13:59:47 elynn, that's true :)
13:59:50 back to #senlin pls
14:00:07 thanks for joining, guys
14:00:10 #endmeeting
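[Editor's note] The open discussion mentions that the engine does not check the number of concurrent actions and relies on a greenthread pool with a default size of 1000. The sketch below illustrates the general idea of capping in-flight work with a fixed-size pool. It uses the standard library's ThreadPoolExecutor rather than the greenthread pool the engine uses, and `run_actions` and `max_concurrent` are hypothetical names introduced for illustration, not Senlin code.

```python
from concurrent.futures import ThreadPoolExecutor


def run_actions(actions, max_concurrent=4):
    """Run zero-argument callables with at most max_concurrent in flight.

    Results are returned in the same order as the input callables.
    This is an illustration with OS threads, not the engine's scheduler.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # map() submits every callable but only max_concurrent run at once;
        # the rest wait in the executor's queue.
        return list(pool.map(lambda action: action(), actions))
```

With a cap like this, a burst of node deletions queues up behind the limit instead of occupying every worker at once; whether that alone would keep other requests responsive in the case reported above is an open question that the log does not answer.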