13:00:36 <Qiming> #startmeeting senlin
13:00:37 <openstack> Meeting started Tue Aug  2 13:00:36 2016 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:38 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:41 <openstack> The meeting name has been set to 'senlin'
13:00:59 <Qiming> hello
13:01:06 <lixinhui_> hi
13:01:13 <haiwei_> hi
13:01:31 <Qiming> evening, xinhui, haiwei
13:01:35 <yanyanhu> hi
13:01:39 <yanyanhu> sorry I'm late
13:01:41 <Qiming> hi, yanyan
13:01:51 <Qiming> np
13:02:10 <Qiming> pls review agenda and see if you got items to add
13:02:12 <Qiming> #link https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Weekly_Senlin_.28Clustering.29_meeting
13:02:40 <Qiming> first item, newton work items
13:02:48 <Qiming> #topic newton work items
13:02:56 <Qiming> #link https://etherpad.openstack.org/p/senlin-newton-workitems
13:03:37 <Qiming> updates ?
13:03:58 <yanyanhu> rally plugin
13:04:23 <yanyanhu> the patch for rally side is still in progress, will check and fix latest issues tomorrow morning
13:04:32 <yanyanhu> hope can finish it soon
13:04:43 <Qiming> okay, just got some comments from Roman ...
13:04:48 <yanyanhu> for senlin repo, plugin for cluster scaling has been proposed
13:04:50 <yanyanhu> Qiming, yes
13:05:00 <yanyanhu> need quick fix and also some explanation
13:05:08 <Qiming> okay
13:05:29 <yanyanhu> https://review.openstack.org/346656
13:05:32 <Qiming> interesting ... we are still exposing 'parent' to client?
13:05:32 <yanyanhu> the one for cluster scaling
13:05:56 <yanyanhu> I guess there is still some out of date msg in doc?
13:06:07 <Qiming> okay, it has been hanging there for some days
13:06:07 <yanyanhu> will check it and reply to roman
13:06:15 <Qiming> sounds great
13:06:24 <yanyanhu> yes, hope can complete it soon
13:06:38 <Qiming> maybe we can ask for some help from cmcc
13:06:51 <Qiming> don't know if eldon can offer a hand
13:07:01 <yanyanhu> Qiming, yes, I think we can ask them for some use case reference
13:07:20 <yanyanhu> for coding, it's ok for me since
13:07:28 <Qiming> okay
13:07:34 <yanyanhu> there is no critical issue left I feel
13:07:45 <yanyanhu> anyway, will keep working on it
13:07:53 <Qiming> anything else in this space?
13:08:02 <yanyanhu> nope I think
13:08:06 <Qiming> moving on
13:08:18 <lixinhui_> About Fencing
13:08:22 <Qiming> health management, no progress from my side last week
13:08:50 <lixinhui_> I add some points on Qiming's HA etherpad
13:09:13 <lixinhui_> First step is to target fencing nova compute service
13:09:35 <lixinhui_> Second step is fencing of vm
13:09:49 <lixinhui_> for compute service fencing
13:10:01 <lixinhui_> which should happen when some host failure happens
13:10:11 <Qiming> that is actually about fencing a nova compute node, correct?
13:10:19 <Qiming> yep
13:10:23 <lixinhui_> Yes, Qiming
13:10:31 <Qiming> I don't have a multi-node setup at hand
13:10:48 <Qiming> cannot produce a compute node failure to observe the host failure events
13:10:53 <lixinhui_> So I wonder if proper to add this into healthmonitor
13:11:06 <lixinhui_> Qiming
13:11:11 <Qiming> have you got any hints on that? either by digging into the source or doc or thru experimentation?
13:11:37 <lixinhui_> compute node failure can only be known by polling service status
13:11:45 <Qiming> observing host failure could only be done thru events
13:12:07 <lixinhui_> Actually
13:12:13 <Qiming> I'm a little reluctant to poll nova compute services
13:12:28 <lixinhui_> Nova today uses heartbeat to know if host is alive or not
13:12:37 <Qiming> that is their internals
13:12:55 <Qiming> we are not supposed to peek into that
13:13:01 <lixinhui_> There are only two types of event for nova to notice
13:13:08 <lixinhui_> one is the node.update
13:13:16 <lixinhui_> the other is service.update
13:13:17 <Qiming> IIRC, nova has event reports when a host fails
13:14:01 <Qiming> okay, then we can listen to those events
13:14:07 <lixinhui_> Service.update can be sent only when the change happen on the nova services by nova service*
13:14:28 <Qiming> don't understand
13:15:08 <lixinhui_> you can read the code of nova/objects/service
13:15:16 <lixinhui_> .py
13:15:36 <Qiming> can you pls just explain your last sentence?
13:15:38 <lixinhui_> my experiments prove this
13:16:28 <lixinhui_> that means the up or down state of nova services will be changed based on heartbeat without notice
13:16:50 <lixinhui_> but service.update will be sent when I enable or disable some service
13:16:57 <Qiming> okay, those are the nova internal state maintenance, we cannot check it from outside
13:17:15 <lixinhui_> so
13:17:24 <Qiming> if nova-compute is down, no event notification is sent?
13:17:31 <lixinhui_> no
13:17:36 <lixinhui_> after two cycle
13:17:44 <lixinhui_> the service becomes down
13:17:47 <lixinhui_> that is all
13:18:20 <lixinhui_> after detecting that, we can fence the compute
13:18:21 <Qiming> good/bad to know ...
13:18:49 <Qiming> sounds like the only way for failure detection is polling?
13:18:59 <Qiming> need to double check that
13:19:04 <Qiming> that was my understanding
13:19:06 <lixinhui_> or read the status of nova service
13:19:28 <Qiming> but last time in a mailinglist discussion, I raised this question
13:19:30 <lixinhui_> that would be good if you can double check
13:19:51 <Qiming> someone told me that nova is already capable of sending out notifications when a compute service is down
13:20:04 <Qiming> I hope I have ten heads
13:20:18 <Qiming> need to dig that email
13:20:25 <Qiming> or the source code
13:21:00 <Qiming> #action Qiming to double check nova's capability of notifying host down
13:21:10 <lixinhui_> they indeed add some notices
13:21:10 <Qiming> moving on
13:21:13 <lixinhui_> nova/nova/objects/service.py
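The polling approach discussed above can be sketched roughly as follows. This is only an illustration: it assumes the compute service records have already been fetched somehow and arrive as dictionaries with `binary`, `host`, `status` and `state` keys. Those field names, the function name and the sample records are assumptions made for the sketch, not something confirmed in the meeting.

```python
def find_down_compute_hosts(services):
    """Return hosts whose nova-compute service is enabled but reported down.

    `services` is an iterable of dicts; the keys used here are assumed.
    """
    down = []
    for svc in services:
        if svc.get("binary") != "nova-compute":
            continue
        # A disabled service was taken out on purpose, so it is not
        # treated as a host failure in this sketch.
        if svc.get("status") == "enabled" and svc.get("state") == "down":
            down.append(svc["host"])
    return sorted(down)


# Hypothetical sample records, not real output from any deployment.
sample = [
    {"binary": "nova-compute", "host": "node-1", "status": "enabled", "state": "up"},
    {"binary": "nova-compute", "host": "node-2", "status": "enabled", "state": "down"},
    {"binary": "nova-compute", "host": "node-3", "status": "disabled", "state": "down"},
    {"binary": "nova-scheduler", "host": "ctrl-1", "status": "enabled", "state": "down"},
]
print(find_down_compute_hosts(sample))
```

A health monitor would call something like this on a timer and hand the resulting hosts to the fencing step. Whether notifications can replace the polling is the open question recorded in the action item.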
13:21:29 <Qiming> documentation side
13:21:42 <Qiming> added some user references docs last week
13:21:57 <Qiming> mainly a reorg around auto-scaling, receivers ... etc
13:22:44 <Qiming> I was thinking of adding a tutorial about auto-scaling, but later I realized that is a huge topic, not suitable for a tutorial, which is supposed to be pretty short
13:23:22 <Qiming> I have also moved the heat based autoscaling under a scenarios subdirectory
13:23:35 <Qiming> where in future we can add more scenarios for references
13:24:03 <Qiming> will check if tutorial doc can be left there ...
13:24:15 <Qiming> next ...
13:24:34 <Qiming> yanyan just started adding version control to profile and policy specs
13:24:47 <Qiming> this is necessary, pls help review
13:24:58 <yanyanhu> yes, just proposed the first patch https://review.openstack.org/348709
13:25:06 <Qiming> thanks
13:25:06 <yanyanhu> to add version support to schema and spec
13:25:28 <Qiming> in parallel, I'm looking into oslo.versionedobjects for a more holistic solution
13:25:32 <Qiming> will update later
13:25:43 <Qiming> moving on ...
13:25:43 <yanyanhu> my pleasure. Really need some discussion about this topic
13:25:51 <Qiming> container profile support
13:26:24 <Qiming> haiwei just pushed a commit: https://review.openstack.org/#/c/349906
13:27:11 <haiwei_> yes, Qiming
13:27:22 <Qiming> I haven't got time to review
13:27:28 <haiwei_> I only tested it partly
13:27:28 <Qiming> just a quick glance
13:27:59 <Qiming> team, please take a look at it and help polish it when you got cycles
13:28:19 <yanyanhu> sure, will check it
13:28:28 <Qiming> thx
13:28:30 <Qiming> moving on
13:28:32 <haiwei_> I think the point for that patch is where should we store 'host_node' uuid? in that patch I stored it in the metadata of profile
13:29:00 <Qiming> maybe node.data ?
13:29:33 <Qiming> if you check other policy decisions such as zone placement, region placement ...
13:29:54 <haiwei_> ok, I will think about it
13:29:59 <Qiming> we are injecting data into the 'data' field of the node (abstract one)
13:30:23 <Qiming> then when we are about to create the physical resource, we extract those policy decisions
13:30:36 <Qiming> profile metadata was designed for users to use
13:30:49 <haiwei_> in service layer we got host_node, but it is not the server's id, so we need to pass server's id to profile
13:31:08 <Qiming> e.g. {'author': 'haiwei', 'last-updated': '2016-08-02', ... } etc
13:31:35 <Qiming> we can pass those information in node.data field
13:31:50 <haiwei_> I will check it later
13:31:52 <Qiming> the node.data field was designed to carry those data around
13:31:55 <Qiming> great
13:32:33 <Qiming> pls also think if we can move the policy decision out into a policy type
13:33:04 <Qiming> 1. that will make the engine code cleaner; 2. we could later improve/replace that policy type implementation easily
13:33:19 <haiwei_> what policy decision?
13:33:33 <haiwei_> currently I am thinking about node_create
13:33:37 <Qiming> by "policy decision" I mean the selection of node in a hosting cluster
13:34:26 <Qiming> just something to keep in mind, I'm not sure how feasible it is without digging into the code
13:34:39 <haiwei_> ok
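A minimal sketch of the idea of carrying a placement decision in the node's `data` field while leaving user metadata alone. The `Node` class, the `placement` key and both helper functions are hypothetical stand-ins for illustration, not Senlin's actual classes or its real selection logic.

```python
class Node:
    """Hypothetical stand-in for the abstract node discussed above."""

    def __init__(self, name, user_metadata=None):
        self.name = name
        # Owned by the user, e.g. {'author': 'haiwei'}; never written here.
        self.user_metadata = dict(user_metadata or {})
        # Owned by the engine; carries policy decisions around.
        self.data = {}


def place_container(node, container_counts):
    """Pick the host with the fewest containers and record it in node.data.

    `container_counts` maps a server id to its current container count.
    """
    host_id = min(container_counts, key=lambda h: (container_counts[h], h))
    node.data["placement"] = {"host_node": host_id}
    return host_id


def build_create_params(node):
    """Extract the recorded decision when creating the physical resource."""
    placement = node.data.get("placement", {})
    return {"name": node.name, "host_node": placement.get("host_node")}


node = Node("container-1", {"author": "haiwei"})
place_container(node, {"server-b": 2, "server-a": 1})
print(build_create_params(node))
```

Keeping the selection in its own function is what would make it easy to move into a policy type later, as suggested above.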
13:34:45 <Qiming> great, moving on
13:34:52 <Qiming> zaqar based receiver
13:35:13 <Qiming> yanyan has been busy working on that ...
13:35:31 <yanyanhu> yes, have confirmed with zaqar team again about the usage of "project_id" and "client_id" today
13:35:53 <yanyanhu> just as you said, we should expose them out for invoker of sdk proxy call
13:36:20 <yanyanhu> have posted the latest results in the following patches: * https://review.openstack.org/349369
13:36:21 <yanyanhu> * https://review.openstack.org/338041
13:36:37 <Qiming> okay, the thing to bear in mind is ...
13:37:03 <Qiming> if you put 'client_id = Header('Client-ID')' in that Message class
13:37:21 <Qiming> the header still won't appear in the final request ...
13:37:41 <Qiming> that is something I missed when reviewing your last patch
13:38:23 <yanyanhu> overriding resource calls will make it take effect I think?
13:38:23 <Qiming> so ... your way of overriding those methods is still valid, though there is room for improvement
13:38:29 <Qiming> yes
13:38:30 <yanyanhu> like the latest patch does
13:38:36 <yanyanhu> yes
13:38:49 <yanyanhu> really not graceful way
13:38:51 <Qiming> it is ugly, but ... you know, people need time to understand the issue we are facing
13:38:58 <yanyanhu> yea
13:39:17 <Qiming> we should allow custom headers in all those create, get, list calls
13:39:18 <yanyanhu> hope brian can figure it out when building resource2
13:39:24 <yanyanhu> using better way
13:39:29 <yanyanhu> yes
13:39:30 <Qiming> resource2 is already there ...
13:39:36 <yanyanhu> if so, that will be much better
13:39:49 <yanyanhu> Qiming, it still needs some improvement I think
13:39:50 <Qiming> but he doesn't seem to buy into the idea of adding more parameters
13:40:00 <yanyanhu> for those "corner" use cases
13:40:12 <Qiming> thanks for keeping the balls rolling
13:40:24 <Qiming> will review your new patchset tomorrow
13:40:32 <yanyanhu> thanks a lot
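The point about allowing custom headers on each call can be sketched without any SDK at all. The helpers below are hypothetical and only build a request description. The `Client-ID` header name comes from the discussion above, while the `X-Project-Id` name, the URL layout and the message body are assumptions for the example.

```python
def build_request(method, url, body=None, headers=None):
    """Assemble a request description, merging caller-supplied headers."""
    merged = {"Accept": "application/json"}
    if body is not None:
        merged["Content-Type"] = "application/json"
    # Caller-supplied headers are applied last so they reach the request.
    merged.update(headers or {})
    return {"method": method, "url": url, "headers": merged, "json": body}


def post_message(queue, body, client_id, project_id=None):
    """Build a message-post request that carries the per-call headers."""
    headers = {"Client-ID": client_id}
    if project_id is not None:
        headers["X-Project-Id"] = project_id  # assumed header name
    url = "/v2/queues/%s/messages" % queue  # assumed URL layout
    return build_request("POST", url, body, headers)


req = post_message("senlin-q", {"messages": []}, "client-123", "proj-456")
print(req["headers"])
```

With an explicit `headers` argument on create, get and list calls, the overriding of resource methods mentioned above would not be needed.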
13:40:38 <Qiming> moving on
13:40:47 <Qiming> events/notifications
13:40:55 <Qiming> no update from me in this space
13:41:17 <Qiming> actually I was trapped in a more general issue .... versioning of things
13:41:43 <Qiming> okay, next topic
13:41:49 <Qiming> #topic newton deliverables
13:42:29 <Qiming> though I've been digging into the issue of versioning of things, I don't think we can get it done by this cycle
13:43:04 <Qiming> on the other hand, the new features about cluster-collect and cluster-do will have to base on micro-version support
13:43:22 <Qiming> which is also blocked here: https://review.openstack.org/#/c/343992/
13:43:56 <Qiming> still need time to convince brian that the current patch is already okay
13:44:07 <yanyanhu> this part is really complicated...
13:44:37 <Qiming> the overall design and impl is good, there are some trivial coding style things for communication
13:44:49 <Qiming> health policy implementation ...
13:45:08 <Qiming> I do hope we can deliver a basic, working version by this cycle
13:45:17 <yanyanhu> sure
13:45:18 <Qiming> as for container cluster
13:45:33 <yanyanhu> really need to achieve that goal I feel
13:45:35 <Qiming> it would be GREAT if we can have a basic, working version
13:45:51 <Qiming> yes, people are asking questions on that
13:46:01 <haiwei_> yes
13:46:18 <Qiming> let's keep working hard on this
13:46:24 <Qiming> s/this/these
13:46:39 <Qiming> next topic I added is about versioned objects
13:46:56 <Qiming> when adding new properties to policy (e.g. the lb policy revision lately)
13:47:03 <Qiming> we need to bump policy version
13:47:19 <Qiming> so ... we have a lot of things to be versioned
13:47:35 <Qiming> 1. API micro-version
13:47:43 <Qiming> 2. API request body version
13:47:51 <Qiming> 3. API response version
13:47:57 <Qiming> 4. RPC version
13:48:07 <Qiming> 5. DB object version
13:48:14 <Qiming> 6. Event/Notification version
13:48:21 <Qiming> 7. Policy type version
13:48:26 <Qiming> 8. Profile type version
13:48:39 <Qiming> 9. Receiver version
13:49:05 <Qiming> without proper versioning infra at hand, we will quickly lose control of things
13:49:16 <yanyanhu> Qiming, yes, almost every element that could vary over time
13:49:17 <Qiming> and things will break in a thousand ways
13:49:44 <Qiming> so... I'm investigating oslo.versionedobjects, every single line of code there
13:49:54 <Qiming> and also jsonschema doc/implementation
13:50:21 <Qiming> I think I have got a rough idea on how to unify all object versioning into the same framework
13:50:42 <Qiming> but that warrants a lot of experimentation and code churn
13:51:14 <Qiming> will leave that as a long term work, maybe by end of Ocata we will have this framework completely landed
13:51:41 <Qiming> ideally, after that, when you want to add a new property to an existing resource
13:52:09 <Qiming> you won't need to modify a few hundred lines of code while still worrying about breaking existing users
13:52:22 <yanyanhu> great, we can add version support for different elements gradually I think
13:52:42 <Qiming> some preliminary code has proved the feasibility of this
13:53:05 <Qiming> we can even make the api-ref documentation generated out of these objects
13:53:10 <yanyanhu> start from most basic part and keep it in mind when making changes on those "unversioned" stuff
13:53:24 <Qiming> yep
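One way to picture adding a property without breaking existing users is to tag each property with the spec version that introduced it. The sketch below is a toy under stated assumptions: the schema contents, version numbers and function names are invented for illustration, and it uses neither oslo.versionedobjects nor Senlin's real schema code.

```python
def parse_version(text):
    """Turn a 'major.minor' string into a comparable tuple of ints."""
    major, minor = text.split(".")
    return int(major), int(minor)


# Hypothetical schema: property name -> version that introduced it.
EXAMPLE_POLICY_SCHEMA = {
    "pool": "1.0",
    "vip": "1.0",
    "health_monitor": "1.1",  # pretend this was added in a later revision
}


def validate_spec(spec, schema):
    """Reject properties that are unknown or newer than the spec version."""
    version = parse_version(spec["version"])
    for key in spec["properties"]:
        if key not in schema:
            raise ValueError("unknown property: %s" % key)
        if parse_version(schema[key]) > version:
            raise ValueError(
                "property %s requires version %s" % (key, schema[key]))
    return True


old_spec = {"version": "1.0", "properties": {"pool": {}, "vip": {}}}
new_spec = {"version": "1.1", "properties": {"pool": {}, "health_monitor": {}}}
print(validate_spec(old_spec, EXAMPLE_POLICY_SCHEMA))
print(validate_spec(new_spec, EXAMPLE_POLICY_SCHEMA))
```

Here a version 1.0 spec keeps validating after `health_monitor` is added, and only a spec that declares 1.1 may use the new property.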
13:53:56 <Qiming> so that is my update in this thread
13:54:15 <Qiming> I didn't leave any time for questions/comments
13:54:25 <Qiming> #topic open discussion
13:54:31 <yanyanhu> no problem, will check the code :)
13:54:44 <yanyanhu> voting is open now :)
13:54:50 <yanyanhu> good luck for senlin's topic
13:54:59 <yanyanhu> topic(s)
13:55:07 <Qiming> yup
13:55:09 <Qiming> blessing
13:55:33 <Qiming> this is a very strange document to read ... http://json-schema.org/latest/json-schema-validation.html
13:55:59 <yanyanhu> yes...
13:56:16 <Qiming> and that is their most comprehensive one I guess .... :)
13:56:23 <namnh> Hi
13:56:23 <yanyanhu> really hope we are native eng speaker :)
13:56:38 <Qiming> hi, namnh
13:57:46 <Qiming> anything else?
13:57:55 <yanyanhu> nope from me
13:58:42 <Qiming> seems lixinhui has dropped
13:58:50 <yanyanhu> yea
13:58:57 <yanyanhu> will over time soon
13:58:59 <Qiming> but anyway I will look into the nova code
13:59:15 <Qiming> thanks, guys
13:59:21 <Qiming> let's meet next week
13:59:23 <yanyanhu> thanks, have a good night
13:59:28 <Qiming> #endmeeting