13:00:23 <Qiming> #startmeeting senlin
13:00:23 <openstack> Meeting started Tue Jul 26 13:00:23 2016 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:27 <openstack> The meeting name has been set to 'senlin'
13:00:33 <zzxwill> Hello.
13:00:36 <yanyanhu> hi
13:00:38 <elynn> o/
13:00:38 <Qiming> #topic roll call
13:01:07 <Qiming> hello
13:01:30 <yanyanhu> o/
13:01:47 <elynn> Evening!
13:01:51 <Qiming> xinhui or haiwei online?
13:01:59 <lixinhui_> Yes
13:02:06 <lixinhui_> Just jumped in
13:02:24 <Qiming> you jumped beautifully
13:02:30 <lixinhui_> :)
13:02:36 <Qiming> #topic newton work items
13:02:47 <Qiming> #link https://etherpad.openstack.org/p/senlin-newton-workitems
13:03:13 <Qiming> any progress on stress testing last week?
13:03:57 <Qiming> I saw yanyan's rally work blocked by the release-cut window
13:04:07 <yanyanhu> Qiming, yes
13:04:27 <yanyanhu> still waiting; I guess we still need to wait for a while
13:04:42 <Qiming> is that patch the last one we will "beg" rally to merge in?
13:05:03 <yanyanhu> :)
13:05:11 <yanyanhu> Qiming, it is not necessary to add all of them into the rally repo
13:05:22 <Qiming> yep
13:05:25 <yanyanhu> but letting them stay on the rally side is better than keeping them inside senlin
13:05:35 <Qiming> benefit?
13:05:44 <yanyanhu> so will first add plugins into our repo and migrate them into rally gradually
13:05:54 <yanyanhu> we don't need to hold them by ourselves
13:06:12 <Qiming> but it will still be senlin team to maintain it
13:06:20 <yanyanhu> sure
13:06:33 <Qiming> then what's the benefit?
13:07:02 <yanyanhu> just that once there is some structural refactoring inside rally, we will know at once if it breaks the senlin plugin, I think
13:07:03 <Qiming> hopefully, we won't forget adding/modifying rally jobs when we change things ...
13:07:11 <yanyanhu> sure
13:07:21 <Qiming> that testing can be done at senlin gate as well
13:08:09 <Qiming> a little bit upset by the slow reviews there
13:08:20 <yanyanhu> Qiming, yes, me too...
13:08:47 <yanyanhu> looks like the team doesn't have enough bandwidth for all these reviews...
13:09:15 <yanyanhu> but we do get lots of important comments :)
13:09:33 <yanyanhu> to help improve my patch and give me a better understanding of rally
13:09:39 <Qiming> anyway, I agree we should do this in the senlin repo first, then migrate to rally step by step
13:09:51 <yanyanhu> Qiming, yes
13:09:53 <yanyanhu> this is my plan
13:10:00 <Qiming> great
13:10:23 <Qiming> any other updates about benchmarking/performance testing?
13:10:56 <Qiming> guess no
13:10:57 <Qiming> moving on
13:10:59 <yanyanhu> no other progress I think
13:11:02 <Qiming> health management
13:11:40 <Qiming> I spent some time reading oslo.messaging code
13:11:47 <Qiming> 2 findings
13:12:27 <Qiming> 1. the transport used for listeners is supposed to be different from the one used for RPC; that one has been fixed, although we somehow still got a working listener there before
13:13:21 <Qiming> 2. when invoking 'get_notification_listener', we have an opportunity to specify the 'executor'
13:13:28 <Qiming> which defaults to 'blocking' today
13:13:44 <Qiming> other choices are 'threading', 'eventlet'
13:14:13 <lixinhui_> oh?
13:14:14 <Qiming> I tried them both but had to revert to 'blocking' for the listeners to work properly
13:14:48 <lixinhui_> what do the other two mean?
13:14:58 <Qiming> the only pitfall is that we will get a warning from oslo.messaging saying that our listener may hang forever listening to events
13:15:04 <Qiming> that is acceptable
13:15:23 <Qiming> they are imported from package 'futurist' directly
13:15:35 <Qiming> that package provides options to execute tasks in different flavors
13:15:47 <Qiming> I don't have a lot of bandwidth to dig into that
13:15:56 <lixinhui_> ok
13:16:07 <Qiming> if anyone is interested in this, here is the doc: http://docs.openstack.org/developer/futurist/api.html#executors
13:16:36 <Qiming> that is how oslo.messaging dispatches events
13:16:52 <lixinhui_> ok
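(A minimal sketch, not senlin's actual health-manager code, of the listener setup described above; the topic, exchange and endpoint names are illustrative assumptions:)

    # oslo.messaging notification listener sketch (Python)
    import oslo_messaging
    from oslo_config import cfg

    class EventEndpoint(object):
        """Endpoint whose methods oslo.messaging dispatches notifications to."""
        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            print('got %s from %s' % (event_type, publisher_id))

    # finding 1: use a notification transport, separate from the RPC transport
    transport = oslo_messaging.get_notification_transport(cfg.CONF)
    targets = [oslo_messaging.Target(topic='notifications', exchange='nova')]

    # finding 2: 'executor' defaults to 'blocking'; 'threading' and 'eventlet'
    # are the other choices, backed by executors from the 'futurist' package
    listener = oslo_messaging.get_notification_listener(
        transport, targets, [EventEndpoint()], executor='blocking')
    listener.start()
    listener.wait()  # wait() with no prior stop() triggers the "may hang forever"
                     # warning mentioned above; intentional for a dedicated thread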
13:16:54 <Qiming> LB bug fix, any news there?
13:17:26 <lixinhui_> Two of three patches have been accepted
13:17:35 <lixinhui_> still this one https://review.openstack.org/325624
13:17:43 <Qiming> btw, someone stopped by on senlin channel asking for a working version of health policy
13:17:56 <Qiming> he said he watched our presentation on austin summit
13:18:15 <lixinhui_> oh
13:18:16 <lixinhui_> I can provide one
13:18:18 <Qiming> that is ringing a loud alarm to me
13:18:25 <lixinhui_> if he or she needs
13:18:46 <Qiming> we should be very very very careful when delivering presentation/demos
13:18:47 <lixinhui_> Adam has some concerns
13:19:09 <Qiming> unless we can ensure users can reproduce the demo easily using the public code base
13:19:42 <lixinhui_> I think you have revised the health policy from WIP
13:19:44 <lixinhui_> right?
13:20:03 <Qiming> or else, we will have difficulties attracting them to come back
13:20:17 <Qiming> that health policy is still not working
13:20:31 <Qiming> the loop is not closed
13:20:37 <Qiming> and fencing is not there yet
13:21:13 <Qiming> people will git clone and try it and see that it doesn't work
13:21:16 <Qiming> then they leave
13:22:05 <Qiming> so ... for the coming barcelona presentations, no matter which one(s) are accepted
13:22:16 <Qiming> the demos used in those talks must work
13:22:49 <Qiming> the code/profile/policy has to show up in main tree
13:23:39 <Qiming> I'll spend time on health management this week
13:23:51 <Qiming> try to close the loop asap
13:24:07 <Qiming> let's move on?
13:24:17 <yanyanhu> one question
13:24:24 <Qiming> shoot
13:24:28 <yanyanhu> does https://review.openstack.org/345916 fix the issue xinhui mentioned?
13:24:43 <yanyanhu> about the wait after the listener is started
13:24:43 <yanyanhu> https://review.openstack.org/346390
13:24:45 <yanyanhu> this one
13:25:06 <Qiming> pls check the bug report
13:25:17 <Qiming> https://launchpad.net/bugs/1605869
13:25:17 <openstack> Launchpad bug 1605869 in senlin "hang: wait is waiting for stop to complete" [Undecided,In progress] - Assigned to Cindia-blue (miaoxinhuili)
13:25:36 <Qiming> it is not an error reported by oslo.messaging
13:25:50 <Qiming> oslo.messaging is too smart in this respect
13:26:14 <Qiming> when it detects we didn't set a timer when calling wait()
13:26:29 <Qiming> it will warn us that the listener may listen forever
13:26:30 <yanyanhu> I see
13:26:33 <Qiming> thus a 'hang'
13:26:49 <Qiming> actually, that is what we wanted in a listener thread
13:27:05 <Qiming> a dedicated listener thread
13:27:23 <Qiming> okay, moving on
13:27:26 <yanyanhu> just need to ensure stop is explicitly called before stopping the health manager
13:27:39 <Qiming> yep, that will be desirable
13:27:58 <Qiming> however, in a multi-engine setup, we don't have a way to gracefully shut down all threads
13:28:24 <Qiming> if we start a single engine, we can see that all threads are gracefully killed
13:28:33 <Qiming> that is a broader problem to solve
13:28:43 <yanyanhu> yes, it is
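(A hedged sketch of the shutdown ordering agreed above, assuming a listener object like the one sketched earlier:)

    def shutdown_listener(listener):
        """Illustrative helper: stop first, then wait, so wait() is not left
        waiting forever for a stop() that never happens."""
        listener.stop()  # stop polling for new notifications
        listener.wait()  # let in-flight messages finish processing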
13:28:56 <Qiming> moving on, documentation
13:29:24 <Qiming> I'm working on tutorial documentation for autoscaling today
13:29:44 <Qiming> to make auto-scaling work, I am using ceilometer + aodh + senlin
13:29:59 <Qiming> many interesting/annoying findings
13:30:16 <Qiming> but finally, I got auto-scaling with cpu_util working now
13:30:26 <Qiming> though I knew in theory it should work
13:30:35 <joehuang> exit
13:30:40 <Qiming> share some findings with you:
13:31:19 <Qiming> 1. aodh alarm-update cannot process --query parameters properly, so we have to get --query specified correctly when doing 'aodh alarm create'
13:32:04 <Qiming> 2. recent modifications to python-openstacksdk is breaking server details retrieval
13:32:05 <yanyanhu> sounds like a bug?
13:32:24 <Qiming> we cannot get the 'image' and 'flavor' properties if we are using the latest master
13:32:33 <yanyanhu> means senlin node-show -D will break as well?
13:32:56 <Qiming> yes, that one was broken as well
13:33:06 <yanyanhu> I see...
13:33:10 <Qiming> I have rebased senlin resources to resource2/proxy2
13:33:37 <yanyanhu> great
13:33:38 <Qiming> https://review.openstack.org/#/c/344662/
13:34:12 <Qiming> to make that work, I have spent a lot of time discussing with the sdk team the 'to_dict()' method that was removed from resource2.Resource
13:34:23 <Qiming> it will break all senlinclient resource show commands
13:34:36 <yanyanhu> yes, think so
13:34:46 <yanyanhu> we use [''] now
13:35:01 <Qiming> if you are interested in this, you can check the review history: https://review.openstack.org/#/c/331518/
13:35:12 <yanyanhu> not a backward compatible change
13:35:23 <Qiming> it took about one month to get that accepted
13:36:11 <Qiming> back to the auto-scaling experiment
13:36:23 <yanyanhu> yes, noticed the discussion between you and brian
13:36:27 <yanyanhu> will check it :)
13:36:52 <Qiming> this is how I created an alarm:
13:36:53 <Qiming> aodh alarm create -t threshold --name c1a1 -m cpu_util --threshold 50 --comparison-operator gt --evaluation-periods 1 --period 60 --alarm-action http://node1:8778/v1/webhooks/518fc9b7-01e8-410a-ac34-59fb33cb398f/trigger?V=1 --repeat-actions True --query metadata.user_metadata.cluster=113707a0-8fdc-434f-b824-98fd706a5e0d
13:37:25 <Qiming> the tricky part is the --query parameter; it is not well documented, and it uses 'pyparsing'
13:37:36 <Qiming> the docs say that '==' can be used, but it won't work
13:38:00 <Qiming> no one tells you that you should use 'metadata.user_metadata.cluster' for filtering
13:38:08 <yanyanhu> well, inconsistency in document again...
13:38:18 <Qiming> had to read the source code to get it to work
13:38:38 <yanyanhu> sure, I did that two and a half years ago
13:38:47 <yanyanhu> when I tried filtering in ceilometer for the first time
13:38:48 <Qiming> after this step, you won't get an alarm
13:38:55 <yanyanhu> still happening :)
13:39:27 <Qiming> because in all the cpu_util samples, you won't see the nova metadata included
13:39:48 <Qiming> then ceilometer cannot evaluate the samples and aodh cannot fire an alarm
13:40:10 <yanyanhu> looks weird
13:40:12 <Qiming> after reading the source code, I figured that I have to add one line into ceilometer.conf file:
13:40:23 <Qiming> reserved_metadata_keys = cluster
13:40:39 <yanyanhu> what does that mean?
13:40:40 <Qiming> after that, restart ceilometer compute agent
13:41:17 <Qiming> the ceilometer compute pollster will now know that the 'cluster' value in the nova metadata should be reserved
13:41:49 <Qiming> or else ceilometer drops all metadata key-values, unless the keys are prefixed with 'metering.'
13:41:58 <Qiming> I don't think this is documented anywhere
13:42:06 <yanyanhu> I see
13:42:21 <yanyanhu> I recall I met a similar problem before
13:42:27 <yanyanhu> at the end of 2014
13:42:46 <Qiming> I'll document the process into the tutorial doc, so users will know how to make the whole thing work
13:42:57 <yanyanhu> needed to do some hack to address it
13:43:29 <yanyanhu> since this condition was not always satisfied
13:43:36 <Qiming> yep
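(A sketch of how the ceilometer.conf change described above might look; the [DEFAULT] section placement is an assumption worth double-checking against your ceilometer version:)

    [DEFAULT]
    # keep the 'cluster' key from the nova instance metadata in samples;
    # without this, only keys prefixed with 'metering.' survive
    reserved_metadata_keys = cluster

Then restart the ceilometer compute agent, as noted above.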
13:44:09 <Qiming> since haiwei is not online and no one is working on container support, we can skip the container profile item
13:44:17 <yanyanhu> not a pleasant experience :)
13:44:25 <Qiming> engine, NODE_CREATE, NODE_DELETE
13:44:35 <Qiming> I think the problem is solved now
13:44:48 <yanyanhu> yes
13:44:54 <yanyanhu> saw those patches
13:44:59 <Qiming> I was thinking of deriving cluster actions from node actions so that policies would be respected
13:45:09 <Qiming> but it turned out to be too complicated
13:45:24 <yanyanhu> current solution is good I think
13:45:34 <Qiming> I did a workaround, making policy aware of NODE_xxx actions
13:45:47 <Qiming> that is making things much clearer
13:45:54 <yanyanhu> yes, and it differentiates node actions derived from different sources
13:46:00 <Qiming> so .. deleting that work item
13:46:26 <Qiming> yep, we had that design/impl in place; these patches were just leveraging it
13:46:40 <yanyanhu> yea
13:46:48 <Qiming> em ... need to add some release notes about this
13:46:56 <yanyanhu> right :)
13:47:07 <Qiming> zaqar receiver thing
13:47:12 <Qiming> where are we?
13:47:15 <yanyanhu> no progress this week...
13:47:22 <yanyanhu> still pending for sdk support
13:47:30 <yanyanhu> and also document updating
13:47:50 <yanyanhu> I have made some local test on 'message' resource
13:47:52 <Qiming> if sdk support is in, we will get a working version soon?
13:47:56 <yanyanhu> but still some problems need to fix
13:47:59 <yanyanhu> to figure out
13:48:08 <Qiming> then grab wangfl
13:48:11 <yanyanhu> nope, it is just for queue
13:48:16 <yanyanhu> yea
13:48:23 <yanyanhu> he is working on that I think
13:48:27 <yanyanhu> saw his patch
13:48:29 <Qiming> okay
13:48:45 <Qiming> then continue grabbing him when necessary, :)
13:49:03 <Qiming> no update about event/notification from last week
13:49:08 <yanyanhu> sure :) owe him a beer
13:49:17 <Qiming> ok
13:49:30 <Qiming> #topic newton deliverables
13:49:49 <Qiming> guys, if you take a look at the newton release schedule
13:49:51 <Qiming> #link http://releases.openstack.org/newton/schedule.html
13:50:04 <Qiming> you will see that we are at week R-10
13:50:36 <Qiming> that means we still have 10 weeks before the final 2.0.0 release
13:50:37 <yanyanhu> a month left
13:51:04 <Qiming> if we consider newton-3 milestone, we only have 1 month
13:51:21 <yanyanhu> yes, for feature freeze
13:51:33 <Qiming> hopefully, we can deliver what we planned at the beginning of this cycle
13:51:59 <Qiming> e.g. profile-validate, policy-validate, cluster-collect, cluster-do, health policy, notification, container profile
13:52:20 <yanyanhu> also message type of receiver
13:52:20 <elynn> I might get some spare time next week; hope we can finish that.
13:52:35 <yanyanhu> elynn, great :)
13:52:53 <Qiming> yep, time to step up and claim the items that interest you most
13:52:56 <yanyanhu> know you are really trapped in some annoying stuff :)
13:53:07 <Qiming> that is life
13:53:13 <elynn> :)
13:53:17 <yanyanhu> yea
13:53:22 <yanyanhu> always :)
13:53:26 <Qiming> never meant to be an easy one for anybody
13:53:37 <lixinhui_> :)
13:54:07 <Qiming> glad we can still get things moving forward and even accomplish something we feel good about
13:54:34 <Qiming> let's see what we can complete during the coming month
13:54:40 <Qiming> #topic open discussions
13:54:49 <yanyanhu> oh, BTW, about the mascot
13:55:01 <Qiming> right, I replied their email
13:55:01 <yanyanhu> I guess forest?
13:55:04 <yanyanhu> :P
13:55:07 <Qiming> maybe just forest
13:55:13 <yanyanhu> it's an obvious choice for us
13:55:22 <Qiming> that is what senlin means
13:55:29 <lixinhui_> agree
13:55:32 <Qiming> we still have choices
13:55:41 <Qiming> if you have some favorite animal
13:55:47 <elynn> yes! that's what senlin is :)
13:56:02 <yanyanhu> forest is straightforward :)
13:56:20 <yanyanhu> easy to understand, I think the picture we always use in the slides is ok
13:56:40 <Qiming> email from Heidi:
13:56:40 <Qiming> Thank you so much for the reply! Of course I won't mock you. Actually, I'm thrilled to know you already have a great mascot that works with this project. Senlin will have the first right of refusal on a forest since that's already your logo. You might want to discuss with your team whether you intend the trees in your forest to look deciduous, evergreen, or a specific variety (stands of Aspen, for example). That can help guide our illustrator
13:56:41 <Qiming> to make a forest that reflects what you like.
13:56:41 <Qiming> Cheers,
13:56:41 <yanyanhu> hope no conflict with other projects :P
13:56:42 <Qiming> Heidi Joy
13:57:08 <yanyanhu> haha
13:57:17 <Qiming> deciduous, evergreen, or ...
13:57:48 <yanyanhu> evergreen sounds good, haha
13:57:50 <yanyanhu> for HA
13:57:57 <Qiming> good point
13:58:50 <Qiming> 2 minutes left
13:59:35 <Qiming> thanks for joining boys and girls
13:59:37 <yanyanhu> no other topic from me
13:59:42 <yanyanhu> thanks
13:59:42 <Qiming> wish you all a happy night
13:59:45 <Qiming> pleasant one
13:59:51 <yanyanhu> take good care of your baby :)
13:59:54 <elynn> :)
13:59:54 <Qiming> #endmeeting