Friday, 2017-01-20

*** zhurong has joined #senlin		00:48
*** hanwei has joined #senlin		01:11
*** Drago2 has quit IRC		01:32
*** yanyanhu has joined #senlin		01:44
*** elynn has joined #senlin		01:46
*** zhurong has quit IRC		01:49
*** zhurong has joined #senlin		01:51
*** openstackgerrit has joined #senlin		01:55
*** ChanServ sets mode: +v openstackgerrit		01:55
openstackgerrit	OpenStack Proposal Bot proposed openstack/senlin-dashboard: Updated from global requirements https://review.openstack.org/422971	01:55
openstackgerrit	Merged openstack/senlin: Fix various problems in doc tree https://review.openstack.org/422386	02:06
yanyanhu	hi, Qiming, around?	02:08
yanyanhu	I have got the titles and abstracts of the two NFV-related proposals for the Boston summit from Xinhui and Haiwei. Will add them to IBM's proposal review list	02:08
yanyanhu	and here is the etherpad to track them	02:09
yanyanhu	https://etherpad.openstack.org/p/senlin-boston-summit-proposal	02:09
yanyanhu	hi, xuhaiwei_ , around?	02:10
openstackgerrit	XueFeng Liu proposed openstack/senlin: Revise create action when do cluster check https://review.openstack.org/421615	02:15
Qiming	ruijie_, a failed action is a failed action	02:21
Qiming	I don't think we should focus on actions	02:22
Qiming	they are senlin internals	02:22
Qiming	instead, we should let users do what they want to do on their clusters and nodes	02:22
ruijie_	Yes Qiming, since in that case, node1:error, node2:active	02:22
ruijie_	After we trigger cluster_recover, node1 will be recreated by default	02:23
ruijie_	and the desired_capacity:0	02:23
ruijie_	that doesn't make sense ..	02:23
Qiming	right, desired capacity was the user's intent	02:23
Qiming	node1 is error, then user can delete it	02:23
ruijie_	Then it will be hard for the user to recover the cluster to the condition the user expected.	02:25
Qiming	okay	02:26
Qiming	there seems to be something we can improve	02:26
Qiming	but never touch/operate the actions from API directly	02:26
ruijie_	Since the action is an internal obj, adding an action_id parameter to cluster_recover may not be a good approach?	02:27
Qiming	suppose you have a failed action, then you try to resume/retry that action, this 'resume' or 'retry' can then become a new action that may fail ...	02:27
Qiming	please think from a different angle if possible	02:28
Qiming	we are treating actions as read-only from the user's perspective	02:28
Qiming	once that door is opened, we will face a lot of conditions leading the system out of control ...	02:29
ruijie_	Agreed, Qiming, that's why I said we should mark the cluster a special STATUS since that may need manual intervention	02:29
openstackgerrit	Merged openstack/python-senlinclient: Add deprecation of cluster-run cli https://review.openstack.org/422498	02:29
Qiming	say you have a cluster A, desired_capacity is 0, but it still has two nodes, one is in ERROR, the other is in ACTIVE state	02:30
Qiming	you are the user, what do you want to do next?	02:30
ruijie_	I hope to do the scale_in again to delete these 2 nodes ..	02:31
Qiming	then do it	02:31
Qiming	what is the blocker?	02:31
ruijie_	there will be a mess. I can do cluster_recover to recover the cluster if scale_out failed since it will recreate the node by default, and I can mark the cluster as ERROR or something after cluster_recover failed.	02:35
ruijie_	but anyway, this could be a workaround to handle failure.	02:37
ruijie_	it's just that a scale_in failure will not be easy to handle	02:37
Qiming	it is not a mess	02:38
Qiming	the current design is much clearer than before	02:38
Qiming	what is causing the problem is the cluster_recover operation, we can discuss that in a separate thread	02:38
Qiming	the semantics of cluster_recover is still ambiguous, I agree	02:39
Qiming	we can ignore that operation for the moment	02:39
Qiming	back to the situation you have	02:39
Qiming	one cluster A, desired_capacity is 0, but it still has two nodes, one ERROR, one ACTIVE	02:40
Qiming	you can still do cluster-scale-in -c 2, right?	02:40
Qiming	if I'm remembering this correctly, senlin will check if you have 2 nodes in the cluster, and do another scale-in operation	02:41
Qiming	it will sort of "ignore" your previous desired_capacity	02:41
ruijie_	right Qiming	02:42
Qiming	during the lifetime of a cluster, you will meet a lot of situations where "desired_capacity" doesn't match the actual capacity, failures may happen anywhere, any time	02:42
Qiming	what senlin guarantees you is this: it will validate the operation against the actuality, not the desired state	02:43
ruijie_	scale itself is fine and clear, just I think we could find a way to handle some basic failures	02:43
Qiming	even if senlin cannot successfully complete an operation, it will record your previous attempt into the "desired_capacity" after the operation	02:44
Qiming	next time, when you want to do an operation, senlin is always providing you: 1. the fact; 2. your previous intention	02:44
Qiming	that is enough for you to make a decision	02:44
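The rule Qiming describes above (validate a request against the actual node count, then record the user's intent in desired_capacity) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Senlin's implementation: the Cluster class, the scale_in function and the min_size field are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    # Hypothetical stand-in for a cluster record, not Senlin's real model.
    desired_capacity: int
    node_statuses: list = field(default_factory=list)
    min_size: int = 0  # assumed lower bound, for illustration only


def scale_in(cluster, count):
    """Validate a scale-in against the actual size, then record the intent."""
    actual = len(cluster.node_statuses)  # the fact, not the previous desire
    target = actual - count
    if target < cluster.min_size:
        raise ValueError("scale-in would go below min_size")
    # The new intent is recorded whether or not node deletion later succeeds.
    cluster.desired_capacity = target
    return target


# Cluster A from the discussion: desired_capacity 0, one ERROR and one ACTIVE node.
cluster_a = Cluster(desired_capacity=0, node_statuses=["ERROR", "ACTIVE"])
print(scale_in(cluster_a, 2))  # computed from the 2 actual nodes
```

The stale desired_capacity of 0 does not block the request, because the check only looks at how many nodes actually exist.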
Qiming	if this is clear enough, we can discuss the cluster_recover operation	02:45
ruijie_	yes Qiming	02:45
Qiming	currently, cluster_recover is very easily misinterpreted as doing all and everything to recover a cluster to a healthy status	02:46
Qiming	the problem lies in the definition of "healthy"	02:46
Qiming	back to your situation: one cluster A, desired_capacity is 0, but it still has two nodes, one ERROR, one ACTIVE	02:46
Qiming	when you type cluster-recover, what is in your mind?	02:46
Qiming	senlin doesn't know ...	02:47
ruijie_	do scale_in again or check the action triggered	02:47
Qiming	the status of the cluster is: two nodes, one ERROR, one ACTIVE	02:47
ruijie_	but not a good idea since action is visible to user	02:47
ruijie_	invisible	02:47
Qiming	when senlin gets the cluster_recover request, it can do one of two things:	02:47
Qiming	1. delete the two nodes, so that the cluster has 0 nodes;	02:48
Qiming	2. recover the two nodes, so that the cluster has 2 nodes	02:48
Qiming	how would senlin make such a decision?	02:48
Qiming	this is something we can improve	02:48
ruijie_	so, the desired_capacity is a point we can use?	02:49
Qiming	right	02:49
Qiming	exactly	02:49
ruijie_	since it represents the user's expected condition	02:49
Qiming	put it differently, we may want senlin cluster-recover to "recover" a cluster to its current "desired_capacity"	02:50
Qiming	as a different example, if a cluster has desired_capacity set to 5, but it only has 2 nodes currently, one ERROR, one ACTIVE	02:51
Qiming	current behavior of cluster-recover only attempts to recover the ERROR node	02:51
ruijie_	I see, Qiming. Then we need to make sure the desired_capacity is handled correctly in all other actions	02:51
Qiming	however, user's intent may be: 1) recover the error node if needed, 2) delete active node if needed, 3) create new nodes if needed	02:52
Qiming	right, that is the dilemma facing us	02:52
Qiming	this is worth a spec discussion in my opinion	02:53
Qiming	it is not a simple bug fix as I feel it	02:53
ruijie_	Agreed Qiming.	02:54
Qiming	be cautious though	02:55
Qiming	this work, once initiated, is an overlap of the cluster-scale-in or cluster-scale-out logic	02:56
Qiming	all scaling operations can be done this way: 1) cluster-resize --desired <new_value> <cluster> 2) cluster-recover if necessary	02:57
Qiming	that means, you first set the new 'desired_capacity', then you use cluster-recover to enforce it, ...	02:57
Qiming	cluster-recover == recover-bad-nodes + delete-extra-nodes + create-new-nodes	02:58
Qiming	anyway, we need to spend more time on this for clarification	02:59
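The decomposition "cluster-recover == recover-bad-nodes + delete-extra-nodes + create-new-nodes" can be read as a small planning step that compares desired_capacity with the nodes that actually exist. The sketch below is one possible reading, not the agreed design (the discussion defers that to a spec). The function name plan_recover and the choice to delete surplus ERROR nodes before healthy ones are assumptions made only for illustration.

```python
def plan_recover(desired_capacity, node_statuses):
    """Return how many nodes to recover, delete and create.

    Hypothetical helper: it only counts nodes and does not call any
    Senlin API.
    """
    actual = len(node_statuses)
    errors = sum(1 for status in node_statuses if status == "ERROR")

    # Nodes beyond the desired capacity are surplus and should be deleted.
    extra = max(0, actual - desired_capacity)
    # Assumption: surplus is taken from ERROR nodes first, so that healthy
    # nodes are kept whenever possible.
    deleted_errors = min(errors, extra)

    return {
        "recover": errors - deleted_errors,
        "delete": extra,
        "create": max(0, desired_capacity - actual),
    }


# The two situations from the discussion, each with one ERROR and one ACTIVE node.
print(plan_recover(0, ["ERROR", "ACTIVE"]))
print(plan_recover(5, ["ERROR", "ACTIVE"]))
```

With desired_capacity 0 the plan only deletes nodes, while with desired_capacity 5 it recovers the ERROR node and creates the missing ones, which matches the three user intents listed earlier in the conversation.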
ruijie_	yes Qiming, maybe we need to have a clear definition of "desired_capacity"	03:02
Qiming	desired capacity defined by its name: desired capacity	03:03
Qiming	whenever you read the value from a cluster, you know, after all previous operations, what size the user wants the cluster to have	03:04
Qiming	desired capacity is used to determine cluster status	03:04
Qiming	you know, cluster status is not determined by the operations performed on it	03:04
Qiming	users and us should be aware of a simple truth: desired capacity can be the reality, but in a cloud environment, it is very common for the values to differ	03:05
ruijie_	yes Qiming, will think about it	03:07
Qiming	great	03:07
openstackgerrit	Qiming Teng proposed openstack/senlin: Remove LB_STATUS_POLLING from health detection type https://review.openstack.org/423012	03:11
*** zhurong has quit IRC		03:39
openstackgerrit	Merged openstack/senlin-dashboard: Imported Translations from Zanata https://review.openstack.org/421683	03:55
openstackgerrit	Merged openstack/senlin-dashboard: Updated from global requirements https://review.openstack.org/422971	03:56
*** shu-mutou-AWAY is now known as shu-mutou		04:03
openstackgerrit	XueFeng Liu proposed openstack/senlin: Remove LB_STATUS_POLLING from health detection type https://review.openstack.org/423012	04:39
*** catinthe_ has quit IRC		04:54
*** catintheroof has joined #senlin		05:02
*** catintheroof has quit IRC		05:14
yanyanhu	hi, xuhaiwei_, are you around?	05:16
*** eandersson_ has joined #senlin		05:30
*** eandersson__ has quit IRC		05:33
*** catintheroof has joined #senlin		05:46
*** zhurong has joined #senlin		05:49
openstackgerrit	Merged openstack/senlin: Remove LB_STATUS_POLLING from health detection type https://review.openstack.org/423012	05:59
openstackgerrit	Merged openstack/python-senlinclient: Updated from global requirements https://review.openstack.org/420865	06:15
openstackgerrit	miaohb proposed openstack/senlin: Add db purge in senlin manage https://review.openstack.org/420666	06:43
*** XueFeng has quit IRC		06:50
*** zhurong has quit IRC		07:03
*** Jeffrey4l_ has quit IRC		07:10
*** XueFeng has joined #senlin		07:19
xuhaiwei_	yanyanhu: sorry , just saw it now	07:48
yanyanhu	hi, xuhaiwei_ no problem	07:50
yanyanhu	just want to discuss with you about the proposal title	07:50
yanyanhu	this one	07:50
yanyanhu	https://etherpad.openstack.org/p/boston_summit_proposal_senlin_tacker	07:50
yanyanhu	I feel the current name is not that sparking :)	07:51
yanyanhu	and also it doesn't mention Tacker, which is an important foundation of that topic	07:52
yanyanhu	maybe we can consider changing it to something like "integrating Senlin and Tacker for scalable and highly available VDU management"?	07:53
yanyanhu	NFV is a word with too broad a scope...	07:53
yanyanhu	I guess being more specific could be helpful to attract audience :)	07:55
xuhaiwei_	yanyanhu: yes, the title is not good IMO :)	07:57
yanyanhu	yep, if "Tacker" doesn't appear in the title, it is not appropriate, since this is expected to be a joint proposal from Senlin and Tacker :)	07:58
xuhaiwei_	yanyanhu: My manager also mentioned this point, he thought the title should not contain project names in it	07:59
yanyanhu	xuhaiwei_, that is also ok I think	07:59
yanyanhu	if we mention neither tacker nor senlin	08:00
xuhaiwei_	yea	08:00
yanyanhu	if so, the title can be more specific for the issue we want to address	08:00
yanyanhu	it is hard to attract people if the name is too common :)	08:01
yanyanhu	NFV has been talked about for years	08:01
xuhaiwei_	another problem is that we should stress why senlin is used for the vnf scaling, that is, how senlin meets nfv's use case. it seems vnf scaling is not that easy, there should be many specific configurations	08:02
yanyanhu	so we just focus on the specific issue we are about to address or the features we provide	08:02
yanyanhu	xuhaiwei_, yes absolutely	08:02
yanyanhu	that should be the most important part of the presentation	08:03
yanyanhu	why we use senlin to support VDU scaling(pool management)	08:03
xuhaiwei_	so I think we should figure out how to start a VNF first, to see what configurations are needed there	08:03
yanyanhu	what is the benefit and what is the challenge	08:03
yanyanhu	xuhaiwei_, yes. We can just put the implementation details aside for a while	08:04
Qiming	VDU is only for NFV guys ... but it may be fine	08:04
yanyanhu	and figure it out in higher design level	08:04
Qiming	mentioning "senlin" or "tacker" in the title is not gonna help attracting more audience	08:05
yanyanhu	Qiming, thanks. please just correct me if I use it incorrectly :)	08:05
xuhaiwei_	yanyanhu: another question about senlin, does senlin support a policy that can be triggered by time? for example 18:00 every day, scale out a vm?	08:05
yanyanhu	actually I'm not very clear about the difference between those terms :)	08:05
Qiming	and we have already passed the stage to raise the community's awareness of this project or others	08:05
*** eandersson_ has quit IRC		08:06
yanyanhu	xuhaiwei_, we can if it is needed	08:06
*** eandersson_ has joined #senlin		08:06
Qiming	xuhaiwei_, you can create a cron tab entry easily for those requirements	08:06
yanyanhu	actually you can implement any policy if it matches the interface senlin defines for policy plugin	08:06
Qiming	it is out of senlin's current scope I think	08:06
yanyanhu	that why I said we can put the implementation detail aside in current stage	08:07
yanyanhu	that's	08:07
Qiming	the sky is your only limit for how to trigger a senlin operation	08:07
xuhaiwei_	just saw some articles which said some vnfs need to be scaled out due to time changes	08:07
Qiming	adding a new cron policy to trigger certain actions at specific point in time	08:08
Qiming	that is doable too	08:08
xuhaiwei_	got it	08:08
xuhaiwei_	currently if we want to install some applications into vms, we have to use software_config or just user_data?	08:10
xuhaiwei_	I mean make the whole story automatic	08:10
ruijie_	Hi all, just drafted a topic about our use case, hope to hear your opinions :)	08:16
ruijie_	https://etherpad.openstack.org/p/senlin-dtdream-use-case	08:16
Qiming	xuhaiwei_, you can do both ways, depending on which profile you are using	08:17
Qiming	there is no single process that fits all usage scenarios I think	08:17
xuhaiwei_	Qiming, got it, thanks	08:17
Qiming	np	08:17
Qiming	be cautious when using software config	08:18
Qiming	it is very difficult to debug ...	08:18
xuhaiwei_	ok	08:18
Qiming	you will need a vm image that has all the required agents installed	08:19
xuhaiwei_	the required agents are?	08:20
Qiming	those agents will write data here and there, /var/log, /var/lib, /var/run ...	08:20
Qiming	...	08:20
Qiming	os-collect-config, os-refresh-config, heat-config, heat-script-config, ... etc	08:21
xuhaiwei_	ohhh	08:21
xuhaiwei_	ok, where can I have it?	08:21
xuhaiwei_	you made it by diskimage-builder?	08:21
Qiming	http://git.openstack.org/cgit/openstack/heat-templates/tree/hot/software-config/elements/README.rst	08:22
Qiming	all those things have to be installed in order to have software config and software deployment work	08:23
Qiming	most of the software has now been migrated into a new repo here: http://git.openstack.org/cgit/openstack/heat-agents/	08:23
xuhaiwei_	I used to create this kind of image once	08:23
Qiming	I was a fan of software config	08:24
Qiming	after playing it for a while	08:24
Qiming	I gave up	08:24
xuhaiwei_	why	08:24
Qiming	difficult to debug	08:24
Qiming	and ... useless	08:25
Qiming	there are many twisted parameters to tune	08:25
Qiming	software_config_transport for example	08:25
xuhaiwei_	I think it can make the heat template clean at least :)	08:26
Qiming	okay, if you think that way	08:26
xuhaiwei_	honestly I am not using it too much	08:26
Qiming	what if you want to change a parameter to your software config?	08:26
Qiming	and try making a mistake like typing ${SOMETHING} as $(SOMETHING) ...	08:27
Qiming	it will take you a week to find where things go wrong	08:28
xuhaiwei_	I'd better not try	08:28
Qiming	you have to understand the whole workflow ... read all the agents source code to figure out where debug output is written	08:29
xuhaiwei_	yes, that's not good	08:29
Qiming	bookmark this link: http://git.openstack.org/cgit/openstack/heat-agents/tree/heat-config/os-refresh-config/configure.d/55-heat-config	08:30
Qiming	you will need it quickly	08:30
xuhaiwei_	done	08:31
Qiming	also this: http://git.openstack.org/cgit/openstack/heat-agents/tree/heat-config-script/install.d/hook-script.py	08:32
Qiming	if you are doing things using Bash scripts	08:32
xuhaiwei_	got it	08:33
Qiming	this is the code you will check if software deployment is not found in the VM: http://git.openstack.org/cgit/openstack/os-collect-config/tree/os_collect_config/collect.py	08:34
xuhaiwei_	ok	08:35
xuhaiwei_	I used to debug os-xxx-config a little	08:35
xuhaiwei_	in fact currently I am doing something terrible enough, that is, installing an openstack environment with ironic on a NEC server and NEC switch in a VLAN network	08:37
xuhaiwei_	now I know why hardware vendors want to contribute to openstack	08:39
Qiming	:)	08:48
Qiming	good experiences	08:48
xuhaiwei_	by the way, happy Chinese xiaonian to all, let's go home early today!	08:51
ruijie_	haha, my stage name is xiaonian :)	08:52
xuhaiwei_	what is stage name? ruijie_	08:54
ruijie_	oh, the company asks everyone to have a stage name (nickname)	08:55
ruijie_	then we call each other's nickname :)	08:55
xuhaiwei_	got it :)	08:56
xuhaiwei_	today is special to you	08:57
ruijie_	haha :)	08:58
*** openstackgerrit has quit IRC		09:02
yanyanhu	xuhaiwei_, thanks, you too :)	09:03
yanyanhu	remember to eat dumplings	09:03
yanyanhu	ruijie_, nice name, haha	09:04
xuhaiwei_	today my wife and me will eat malatang :)	09:04
yanyanhu	cool	09:05
yanyanhu	haha	09:05
xuhaiwei_	ma la tang	09:05
xuhaiwei_	leaving now, see u	09:05
yanyanhu	see u	09:05
ruijie_	I used dictionary to search malatang :)	09:09
yanyanhu	:P	09:15
*** yanyanhu has quit IRC		09:19
*** openstackgerrit has joined #senlin		09:22
*** ChanServ sets mode: +v openstackgerrit		09:22
openstackgerrit	RUIJIE YUAN proposed openstack/senlin: revise tempest api for cluster 4 https://review.openstack.org/423139	09:22
openstackgerrit	RUIJIE YUAN proposed openstack/senlin: revise tempest api test for cluster 4 https://review.openstack.org/423140	09:22
openstackgerrit	RUIJIE YUAN proposed openstack/senlin: revise tempest api for cluster 4 https://review.openstack.org/423139	09:32
openstackgerrit	Qiming Teng proposed openstack/senlin: Add developer doc for health policy https://review.openstack.org/423174	10:16
Qiming	XueFeng, online?	10:37
openstackgerrit	Qiming Teng proposed openstack/senlin: Add developer doc for health policy https://review.openstack.org/423174	10:37
*** hanwei has quit IRC		10:39
*** Jeffrey4l has joined #senlin		10:40
*** lixinhui has quit IRC		10:43
openstackgerrit	Kenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies https://review.openstack.org/423209	11:05
openstackgerrit	Kenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies https://review.openstack.org/423209	11:06
openstackgerrit	Kenji Ishii proposed openstack/senlin-dashboard: Implement action updating cluster policies https://review.openstack.org/423209	11:23
*** hanwei has joined #senlin		11:49
*** hanwei_ has joined #senlin		11:56
*** hanwei has quit IRC		11:58
*** hanwei has joined #senlin		11:59
*** hanwei_ has quit IRC		12:03
*** wllabs has quit IRC		12:09
*** catinthe_ has joined #senlin		12:32
*** catintheroof has quit IRC		12:33
*** fabian4 has quit IRC		12:37
*** Jeffrey4l_ has joined #senlin		13:00
*** Jeffrey4l has quit IRC		13:04
XueFeng	hi,QiMing	13:24
*** catintheroof has joined #senlin		14:33
*** catinthe_ has quit IRC		14:36
*** elynn has quit IRC		14:54
*** Drago1 has joined #senlin		16:18
*** Drago1 has quit IRC		18:32
*** Drago1 has joined #senlin		18:43
*** catinthe_ has joined #senlin		19:50
*** catintheroof has quit IRC		19:53
*** catinthe_ has quit IRC		21:02
*** catintheroof has joined #senlin		21:03
*** catintheroof has quit IRC		21:07
*** Jeffrey4l__ has joined #senlin		21:35
*** Jeffrey4l_ has quit IRC		21:35