Friday, 2017-01-20

*** zhurong has joined #senlin00:48
*** hanwei has joined #senlin01:11
*** Drago2 has quit IRC01:32
*** yanyanhu has joined #senlin01:44
*** elynn has joined #senlin01:46
*** zhurong has quit IRC01:49
*** zhurong has joined #senlin01:51
*** openstackgerrit has joined #senlin01:55
*** ChanServ sets mode: +v openstackgerrit01:55
openstackgerritOpenStack Proposal Bot proposed openstack/senlin-dashboard: Updated from global requirements
openstackgerritMerged openstack/senlin: Fix various problems in doc tree
yanyanhuhi, Qiming, around?02:08
yanyanhuI have got the titles and abstracts of those two NFV-related proposals for the Boston summit from Xinhui and Haiwei. Will add them to the proposal review list of IBM02:08
yanyanhuand there is the etherpad to track them02:09
yanyanhuhi, xuhaiwei_ , around?02:10
openstackgerritXueFeng Liu proposed openstack/senlin: Revise create action when do cluster check
Qimingruijie_, a failed action is a failed action02:21
QimingI don't think we should focus on actions02:22
Qimingthey are senlin internals02:22
Qiminginstead, we should let users do what they want to do on their clusters and nodes02:22
ruijie_Yes Qiming, since in that case, node1:error, node2:active02:22
ruijie_After we trigger cluster_recover, node1 will be recreated by default02:23
ruijie_and the desired_capacity:002:23
ruijie_that doesn't make sense ..02:23
Qimingright, desired capacity was the user's intent02:23
Qimingnode1 is error, then user can delete it02:23
ruijie_Then it will be hard for the user to recover the cluster to the condition the user expected.02:25
Qimingthere seems to be something we can improve02:26
Qimingbut never touch/operate the actions from API directly02:26
ruijie_Since the action is an internal obj, add an action_id parameter to cluster_recover may not be a good way?02:27
Qimingsuppose you have a failed action, then you try to resume/retry that action, this 'resume' or 'retry' can then become a new action that may fail ...02:27
Qimingplease think from a different angle if possible02:28
Qimingwe are treating actions as read-only from the user's perspective02:28
Qimingonce that door is opened, we will face a lot of conditions leading the system out of control ...02:29
ruijie_Agreed, Qiming, that's why I said we should mark the cluster with a special STATUS since that may need manual intervention02:29
openstackgerritMerged openstack/python-senlinclient: Add deprecation of cluster-run cli
Qimingsay you have a cluster A, desired_capacity is 0, but it still has two nodes, one is in ERROR, the other is in ACTIVE state02:30
Qimingyou are the user, what do you want to do next?02:30
ruijie_I hope to do the scale_in again to delete these 2 nodes ..02:31
Qimingthen do it02:31
Qimingwhat is the blocker?02:31
ruijie_there will be a mess. I can do cluster_recover to recover the cluster if scale_out failed since it will recreate the node by default, and I can mark the cluster as ERROR or something after cluster_recover failed.02:35
ruijie_but anyway, this could be a workaround to handle failure.02:37
ruijie_it's just that a scale_in failure will not be easy to handle02:37
Qimingit is not a mess02:38
Qimingthe current design is much clearer than before02:38
Qimingwhat is causing the problem is the cluster_recover operation, we can discuss that in a separate thread02:38
Qimingthe semantics of cluster_recover is still ambiguous, I agree02:39
Qimingwe can ignore that operation for the moment02:39
Qimingback to the situation you have02:39
Qimingone cluster A, desired_capacity is 0, but it still has two nodes, one ERROR, one ACTIVE02:40
Qimingyou can still do cluster-scale-in -c 2, right?02:40
Qimingif I'm remembering this correctly, senlin will check if you have 2 nodes in the cluster, and do another scale-in operation02:41
Qimingit will sort of "ignore" your previous desired_capacity02:41
ruijie_right Qiming02:42
Qimingduring the lifetime of a cluster, you will meet a lot of situations where "desired_capacity" doesn't match the actual capacity, failures may happen anywhere, any time02:42
Qimingwhat senlin guarantees you is this: it will validate the operation against the actuality, not the desired state02:43
ruijie_scale itself is fine and clear, just I think we could find a way to handle some basic failures02:43
Qimingeven if senlin cannot successfully complete an operation, it will record your previous attempt into the "desired_capacity" after the operation02:44
Qimingnext time, when you want to do an operation, senlin is always providing you: 1. the fact; 2. your previous intention02:44
Qimingthat is enough for you to make a decision02:44
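A minimal sketch of the behaviour Qiming describes above: the operation is validated against the nodes that actually exist, and the attempt is recorded in desired_capacity even when it only partly succeeds. The Cluster class, the scale_in function and its deleted parameter are hypothetical illustrations, not Senlin's API, and size limits such as a minimum size are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical stand-in for a cluster record (not Senlin's data model)."""
    desired_capacity: int  # the user's last stated intent
    node_count: int        # nodes that actually exist, in any status


def scale_in(cluster: Cluster, count: int, deleted: int) -> Cluster:
    """Shrink `cluster` by `count` nodes.

    `deleted` is how many deletions really succeeded (a hypothetical input
    standing in for the outcome of the backend calls).
    """
    if count <= 0:
        raise ValueError("count must be positive")
    # Validate against the fact (actual node count), not the old desired_capacity.
    removable = min(count, cluster.node_count)
    if not 0 <= deleted <= removable:
        raise ValueError("deleted must be between 0 and the removable count")
    # Record the new intent even when some deletions failed.
    return Cluster(
        desired_capacity=cluster.node_count - removable,
        node_count=cluster.node_count - deleted,
    )
```

With Cluster(desired_capacity=0, node_count=2), calling scale_in(cluster, 2, deleted=1) is accepted even though desired_capacity is already 0, and the result keeps both the fact (1 node left) and the intent (0 nodes wanted).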
Qimingif this is clear enough, we can discuss the cluster_recover operation02:45
ruijie_yes Qiming02:45
Qimingcurrently, cluster_recover is very easy to be mis-interpreted as doing all and everything to recover a cluster to a healthy status02:46
Qimingthe problem lies in the definition of "healthy"02:46
Qimingback to your situation: one cluster A, desired_capacity is 0, but it still has two nodes, one ERROR, one ACTIVE02:46
Qimingwhen you type cluster-recover, what is in your mind?02:46
Qimingsenlin doesn't know ...02:47
ruijie_do scale_in again or check the action triggered02:47
Qimingthe status of the cluster is: two nodes, one ERROR, one ACTIVE02:47
ruijie_but not a good idea since action is visible to user02:47
Qimingwhen senlin gets the cluster_recover request, it can do one of two things:02:47
Qiming1. delete the two nodes, so that the cluster has 0 nodes;02:48
Qiming2. recover the two nodes, so that the cluster has 2 nodes02:48
Qiminghow would senlin make such a decision?02:48
Qimingthis is something we can improve02:48
ruijie_so, the desired_capacity is a point we can use?02:49
ruijie_since it represents the user's expected condition02:49
Qimingput it differently, we may want senlin cluster-recover to "recover" a cluster to its current "desired_capacity"02:50
Qimingas a different example, if a cluster has desired_capacity set to 5, but it only has 2 nodes currently, one ERROR, one ACTIVE02:51
Qimingcurrent behavior of cluster-recover only attempts to recover the ERROR node02:51
ruijie_I see, Qiming. Then we need to make sure the desired_capacity is handled correctly in all other actions02:51
Qiminghowever, user's intent may be: 1) recover the error node if needed, 2) delete active node if needed, 3) create new nodes if needed02:52
Qimingright, that is the dilemma facing us02:52
Qimingthis is worth a spec discussion in my opinion02:53
Qimingit is not a simple bug fix, as I see it02:53
ruijie_Agreed Qiming.02:54
Qimingbe cautious though02:55
Qimingthis work, once initiated, is an overlap of the cluster-scale-in or cluster-scale-out logic02:56
Qimingall scaling operation can be done this way: 1) cluster-resize --desired <new_value> <cluster> 2) cluster-recover if necessary02:57
Qimingthat means, you first set the new 'desired_capacity', then you use cluster-recover to enforce it, ...02:57
Qimingcluster-recover == recover-bad-nodes + delete-extra-nodes + create-new-nodes02:58
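A minimal sketch of the decomposition proposed above (recover-bad-nodes + delete-extra-nodes + create-new-nodes), driven only by desired_capacity and the current node statuses. The function plan_recover and its return shape are hypothetical, and deleting ERROR nodes before ACTIVE ones when shrinking is an assumption, not something decided in this discussion.

```python
def plan_recover(desired_capacity, node_statuses):
    """Plan how to bring a cluster to `desired_capacity`.

    `node_statuses` maps node name -> "ACTIVE" or "ERROR".
    Returns the nodes to recover, the nodes to delete, and how many to create.
    """
    if desired_capacity < 0:
        raise ValueError("desired_capacity must not be negative")
    # Assumption: ERROR nodes sort first, so shrinking removes broken
    # nodes before healthy ones.
    ordered = sorted(
        node_statuses,
        key=lambda name: (node_statuses[name] != "ERROR", name),
    )
    extra = max(len(ordered) - desired_capacity, 0)
    to_delete = ordered[:extra]
    kept = ordered[extra:]
    # Only ERROR nodes that survive the shrink are worth recovering.
    to_recover = [name for name in kept if node_statuses[name] == "ERROR"]
    to_create = max(desired_capacity - len(ordered), 0)
    return {"recover": to_recover, "delete": to_delete, "create": to_create}
```

For the two situations in the discussion, with nodes {"node1": "ERROR", "node2": "ACTIVE"}: a desired_capacity of 0 plans the deletion of both nodes, while a desired_capacity of 5 plans recovering node1 and creating 3 new nodes.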
Qiminganyway, we need to spend more time on this for clarification02:59
ruijie_yes Qiming, maybe we need to have a clear definition of "desired_capacity"03:02
Qimingdesired capacity defined by its name: desired capacity03:03
Qimingwhenever you read the value from a cluster, you know after all previous operations, what size the user wants the cluster to have03:04
Qimingdesired capacity is used to determine cluster status03:04
Qimingyou know, cluster status is not determined by the operations performed on it03:04
Qimingusers and we should be aware of a simple truth: desired capacity can be the reality, but in a cloud environment, it is very common for the values to differ03:05
ruijie_yes Qiming, will think about it03:07
openstackgerritQiming Teng proposed openstack/senlin: Remove LB_STATUS_POLLING from health detection type
*** zhurong has quit IRC03:39
openstackgerritMerged openstack/senlin-dashboard: Imported Translations from Zanata
openstackgerritMerged openstack/senlin-dashboard: Updated from global requirements
*** shu-mutou-AWAY is now known as shu-mutou04:03
openstackgerritXueFeng Liu proposed openstack/senlin: Remove LB_STATUS_POLLING from health detection type
*** catinthe_ has quit IRC04:54
*** catintheroof has joined #senlin05:02
*** catintheroof has quit IRC05:14
yanyanhuhi, xuhaiwei_, are you around?05:16
*** eandersson_ has joined #senlin05:30
*** eandersson__ has quit IRC05:33
*** catintheroof has joined #senlin05:46
*** zhurong has joined #senlin05:49
openstackgerritMerged openstack/senlin: Remove LB_STATUS_POLLING from health detection type
openstackgerritMerged openstack/python-senlinclient: Updated from global requirements
openstackgerritmiaohb proposed openstack/senlin: Add db purge in senlin manage
*** XueFeng has quit IRC06:50
*** zhurong has quit IRC07:03
*** Jeffrey4l_ has quit IRC07:10
*** XueFeng has joined #senlin07:19
xuhaiwei_yanyanhu: sorry , just saw it now07:48
yanyanhuhi, xuhaiwei_ no problem07:50
yanyanhujust want to discuss with you about the proposal title07:50
yanyanhuthis one07:50
yanyanhuI feel the current name is not that sparking :)07:51
yanyanhuand also it doesn't mention Tacker, which is an important foundation of that topic07:52
yanyanhumaybe we can consider changing it to something like "integrating Senlin and Tacker for scalable and highly available VDU management"?07:53
yanyanhuNFV is a word with too huge a scope...07:53
yanyanhuI guess being more specific could be helpful to attract audience :)07:55
xuhaiwei_yanyanhu: yes, the title is not good IMO :)07:57
yanyanhuyep, if "Tacker" doesn't appear in the title, it is not appropriate, since this is expected to be a joint proposal from Senlin and Tacker :)07:58
xuhaiwei_yanyanhu: My manager also mentioned this point, he thought the title should not contain project names in it07:59
yanyanhuxuhaiwei_, that is also ok I think07:59
yanyanhuif we mention neither tacker nor senlin08:00
yanyanhuif so, the title can be more specific for the issue we want to address08:00
yanyanhuit is hard to attract people if the name is too common :)08:01
yanyanhuNFV has been talked about for years08:01
xuhaiwei_another problem is that we should stress why senlin is used for the vnf scaling, that means how senlin meets nfv's use case, it seems vnf's scaling is not that easy, there should be many specific configurations08:02
yanyanhuso we just focus on the specific issue we are about to address or the features we provide08:02
yanyanhuxuhaiwei_, yes absolutely08:02
yanyanhuthat should be the most important part of the presentation08:03
yanyanhuwhy we use senlin to support VDU scaling(pool management)08:03
xuhaiwei_so I think we should figure out how to start a VNF first, to see what configurations are needed there08:03
yanyanhuwhat is the benefit and what is the challenge08:03
yanyanhuxuhaiwei_, yes. Just can put the implementation detail aside for a while08:04
QimingVDU is only for NFV guys ... but it may be fine08:04
yanyanhuand figure it out in higher design level08:04
Qimingmentioning "senlin" or "tacker" in the title is not gonna help attracting more audience08:05
yanyanhuQiming, thanks. please just correct me if I use it incorrectly :)08:05
xuhaiwei_yanyanhu: another question about senlin, does senlin support a policy that can be triggered by time? for example 18:00 every day, scale out a vm?08:05
yanyanhuactually I'm not very clear about the difference between those terms :)08:05
Qimingand we have already passed the stage to raise the community's awareness of this project or others08:05
*** eandersson_ has quit IRC08:06
yanyanhuxuhaiwei_, we can if it is needed08:06
*** eandersson_ has joined #senlin08:06
Qimingxuhaiwei_, you can create a cron tab entry easily for those requirements08:06
yanyanhuactually you can implement any policy if it matches the interface senlin defines for policy plugin08:06
Qimingit is out of senlin's current scope I think08:06
yanyanhuthat's why I said we can put the implementation detail aside in current stage08:07
Qimingthe sky is your only limit for how to trigger a senlin operation08:07
xuhaiwei_just saw some articles which said some vnfs need to be scaled out due to time changes08:07
Qimingadding a new cron policy to trigger certain actions at specific point in time08:08
Qimingthat is doable too08:08
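A minimal sketch of the time-based trigger discussed above (for example, scale out at 18:00 every day), showing only the scheduling arithmetic an external trigger would need. The function next_daily_trigger is a hypothetical helper; the actual scale-out call is left out, and nothing here is part of Senlin's policy interface.

```python
from datetime import datetime, time, timedelta


def next_daily_trigger(now, at=time(18, 0)):
    """Return the next datetime strictly after `now` that falls at time `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now): use tomorrow's.
        candidate += timedelta(days=1)
    return candidate
```

A scheduler loop, or a plain cron entry as suggested above, would then invoke whatever operation is wanted once that moment arrives.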
xuhaiwei_got it08:08
xuhaiwei_currently if we want to install  some applications into vms, we have to use software_config or just user_data?08:10
xuhaiwei_I mean make the whole story automatic08:10
ruijie_Hi all, just drafted a topic about our use case, hope to listen to your opinions :)08:16
Qimingxuhaiwei_, you can do both ways, depending on which profile you are using08:17
Qimingthere is no single process that fits all usage scenarios I think08:17
xuhaiwei_Qiming, got it, thanks08:17
Qimingbe cautious when using software config08:18
Qimingit is very difficult to debug ...08:18
Qimingyou will need a vm image that has all the required agents installed08:19
xuhaiwei_the required agents are?08:20
Qimingthose agents will write data here and there, /var/log, /var/lib, /var/run ...08:20
Qimingos-collect-config, os-refresh-config, heat-config, heat-script-config, ... etc08:21
xuhaiwei_ok, where can I have it?08:21
xuhaiwei_you made it by diskimage-builder?08:21
Qimingall those things have to be installed in order to have software config and software deployment work08:23
Qimingmost of the software are now migrated into a new repo here:
xuhaiwei_I used to create this kind of image once08:23
QimingI was a fan of software config08:24
Qimingafter playing it for a while08:24
QimingI gave up08:24
Qimingdifficult to debug08:24
Qimingand ... useless08:25
Qimingthere are many twisted parameters to tune08:25
Qimingsoftware_config_transport for example08:25
xuhaiwei_I think it can make the heat template clean at least :)08:26
Qimingokay, if you think that way08:26
xuhaiwei_honestly I am not using it too much08:26
Qimingwhat if you want to change a parameter to your software config?08:26
Qimingand try making a mistake by typing ${SOMETHING} as $(SOMETHING) ...08:27
Qimingit will take you a week to find where things go wrong08:28
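In Bash, ${SOMETHING} expands a variable while $(SOMETHING) runs SOMETHING as a command, so the typo mentioned above changes the meaning of the script without being a syntax error. A minimal sketch of a check for it, assuming that an all-uppercase name inside $( ) is more likely a mistyped variable than a real command; the function find_suspect_substitutions is a hypothetical helper, not part of any Heat or os-*-config tooling.

```python
import re

# Matches $(NAME) where NAME looks like an environment-variable name.
SUSPECT = re.compile(r"\$\(([A-Z_][A-Z0-9_]*)\)")


def find_suspect_substitutions(script):
    """Return (line number, name) pairs for likely ${NAME} typos in `script`."""
    hits = []
    for lineno, line in enumerate(script.splitlines(), start=1):
        for match in SUSPECT.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```

Ordinary command substitutions such as $(date) are not reported, because the heuristic only looks at uppercase names.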
xuhaiwei_I'd better not try08:28
Qimingyou have to understand the whole workflow ... read all the agents source code to figure out where debug output is written08:29
xuhaiwei_yes, that's not good08:29
Qimingbookmark this link:
Qimingyou will need it quickly08:30
Qimingalso this:
Qimingif you are doing things using Bash scripts08:32
xuhaiwei_got it08:33
Qimingthis is the code you will check if software deployment is not found in the VM:
xuhaiwei_I used to debug os-xxx-config a little08:35
xuhaiwei_in fact currently I am doing something terrible enough, that is installing an openstack environment with ironic on a NEC server and NEC switch in VLAN network08:37
xuhaiwei_now I know why hardware vendors want to contribute to openstack08:39
Qiminggood experiences08:48
xuhaiwei_by the way, happy Chinese xiaonian to all, let's go home early today!08:51
ruijie_haha, my stage name is xiaonian :)08:52
xuhaiwei_what is stage name? ruijie_08:54
ruijie_oh, the company asks everyone to have a stage name (nickname)08:55
ruijie_then we call each other's nickname :)08:55
xuhaiwei_got it :)08:56
xuhaiwei_today is special to you08:57
ruijie_haha :)08:58
*** openstackgerrit has quit IRC09:02
yanyanhuxuhaiwei_, thanks, you too :)09:03
yanyanhuremember to eat dumplings09:03
yanyanhuruijie_, nice name, haha09:04
xuhaiwei_today my wife and I will eat malatang :)09:04
xuhaiwei_ma la tang09:05
xuhaiwei_leaving now, see u09:05
yanyanhusee u09:05
ruijie_I used dictionary to search malatang :)09:09
*** yanyanhu has quit IRC09:19
*** openstackgerrit has joined #senlin09:22
*** ChanServ sets mode: +v openstackgerrit09:22
openstackgerritRUIJIE YUAN proposed openstack/senlin: revise tempest api for cluster 4
openstackgerritRUIJIE YUAN proposed openstack/senlin: revise tempest api test for cluster 4
openstackgerritRUIJIE YUAN proposed openstack/senlin: revise tempest api for cluster 4
openstackgerritQiming Teng proposed openstack/senlin: Add developer doc for health policy
QimingXueFeng, online?10:37
openstackgerritQiming Teng proposed openstack/senlin: Add developer doc for health policy
*** hanwei has quit IRC10:39
*** Jeffrey4l has joined #senlin10:40
*** lixinhui has quit IRC10:43
openstackgerritKenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies
openstackgerritKenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies
openstackgerritKenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies
*** hanwei has joined #senlin11:49
*** hanwei_ has joined #senlin11:56
*** hanwei has quit IRC11:58
*** hanwei has joined #senlin11:59
*** hanwei_ has quit IRC12:03
*** wllabs has quit IRC12:09
*** catinthe_ has joined #senlin12:32
*** catintheroof has quit IRC12:33
*** fabian4 has quit IRC12:37
*** Jeffrey4l_ has joined #senlin13:00
*** Jeffrey4l has quit IRC13:04
*** catintheroof has joined #senlin14:33
*** catinthe_ has quit IRC14:36
*** elynn has quit IRC14:54
*** Drago1 has joined #senlin16:18
*** Drago1 has quit IRC18:32
*** Drago1 has joined #senlin18:43
*** catinthe_ has joined #senlin19:50
*** catintheroof has quit IRC19:53
*** catinthe_ has quit IRC21:02
*** catintheroof has joined #senlin21:03
*** catintheroof has quit IRC21:07
*** Jeffrey4l__ has joined #senlin21:35
*** Jeffrey4l_ has quit IRC21:35
