13:01:05 #startmeeting senlin
13:01:06 Meeting started Tue Jun 28 13:01:05 2016 UTC and is due to finish in 60 minutes. The chair is yanyanhu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:09 The meeting name has been set to 'senlin'
13:01:20 hi
13:01:28 o/
13:01:57 hi, elynn
13:02:01 hi, lixinhui_
13:02:05 long time no see :P
13:02:33 ok, here is the agenda, plz feel free to add any item you want to discuss
13:02:38 https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Weekly_Senlin_.28Clustering.29_meeting
13:03:06 #topic newton workitem
13:03:17 https://etherpad.openstack.org/p/senlin-newton-workitems
13:03:29 here is the etherpad tracking our newton workitems
13:03:39 first item, testing
13:03:47 tempest API test has been done
13:03:56 and tempest functional test is in good progress
13:04:03 Are the release notes in?
13:04:14 I think all existing functional test cases have been migrated (some in progress)
13:04:39 elynn, not yet, several patches are still under review :)
13:04:53 but I think we can finish it this week
13:05:15 the new gate job for tempest functional test is also available now
13:05:35 although it is experimental and doesn't vote
13:05:58 after the functional test migration is done, I will propose a patch to remove the old functional test gate job
13:06:06 it will be replaced by the new one :)
13:06:27 then we will have two jobs enabled in both check and gate pipeline and they will vote
13:06:45 ok, this is the tempest test part
13:06:58 about rally plugin, didn't get time to work on it this week
13:07:13 but our patch to add cluster/profile plugin for rally got first +2
13:07:19 need another +2 and workflow
13:07:28 basically, it looks good now
13:07:40 ok
13:07:55 hi, lixinhui_ , around?
13:08:07 any update on HA related work :)
13:08:28 I think Qiming didn't get time to work on it last week
13:08:35 yes
13:08:48 I finished the fencing tests
13:08:59 great!
13:09:01 need to discuss how to bring into
13:09:03 Senlin
13:09:06 that is important for our HA solution
13:09:08 ok
13:09:26 And I tried the resilient
13:09:34 I think we can make further discussion in irc channel to decide how to merge into current HA framework
13:09:42 elastic cluster template with ceilometer/aodh/gnocchi
13:09:52 yes, yanyanhu
13:10:17 so monitoring has been included?
13:10:31 for failure detection?
13:11:47 actually I'm thinking how to build the basic workflow of our HA solution, including leveraging other monitor services to detect failure in different layers
13:11:48 what is the failure detection?
13:12:02 like host/VM crash or app failure
13:12:04 the failure detection is based on node status
13:12:18 yes
13:12:20 yanyanhu
13:12:23 yes, from nova notification, e.g.
13:12:26 for VM failure
13:12:43 I think Qiming was working on this?
13:12:44 one problem I met is about heat
13:12:49 yes?
13:13:06 do you know where to set the timeout or retry time
13:13:18 retry for?
13:13:20 creation of loadbalancer is very slow
13:13:31 then heat stack-create will keep retrying
13:13:32 yes, it is sometimes
13:13:48 I'm not sure there is such an option to customize
13:13:54 then there will be several loadbalancers under creation
13:13:57 hi, elynn, do you have any idea?
13:14:35 I think there should be somewhere it can be customized
13:14:49 and I know Qiming is adding listener
13:14:56 for detection functions
13:15:09 I guess that is a fixed value defined in heat engine?
13:15:09 so you put all those resources in a heat template?
13:15:31 yes
13:15:35 retry time for creating resource?
13:15:40 yes
13:15:43 elynn
13:16:03 lixinhui_, maybe you can try to split lb resource from other ones if the stack creation always fails because of the lb creation timeout
13:16:13 to avoid duplicated creation of loadbalancer
13:16:44 yanyanhu, I am using the same template as we presented in Austin
13:16:58 which is committed as a tutorial
13:17:03 lixinhui_, that should happen I think if you mean duplicated creation of lb for timeout retry
13:17:03 I can't remember any property or configurations about that.
13:17:20 lixinhui_, I see
13:17:30 that is okay elynn
13:17:38 octavia is so slow
13:17:38 sorry, that shouldn't happen
13:17:49 so haproxy works well
13:17:51 I can check later
13:18:01 but octavia doesn't?
13:18:05 that will be very nice, elynn
13:18:12 That would be a problem if it happens during a presentation...
13:18:18 yes, yanyanhu
13:18:30 I see
13:18:39 elynn, nsx and haproxy work well
13:18:43 elynn, hope we can find some solution for it
13:19:00 or in worst case, use some workaround
13:19:00 why so many people use octavia
13:19:12 lixinhui_, we don't :)
13:19:31 yanyanhu, what kind of backend for IBM
13:19:32 ?
13:19:35 We can at least change the hardcoded retry times in our env ;)
13:20:10 I'm not sure. But I think octavia is better than haproxy hosted in network controller
13:20:24 Saw an option client_retry_limit
13:20:32 we can try to increase it in heat.conf
13:20:43 not only reduce the risk of single failure, but also much better scalability I think
13:20:50 oh, elynn?
13:21:11 sorry, single point failure
13:21:11 okay yanyanhu
13:21:18 I see
13:21:36 BTW, I have drafted a topic proposal about HA for Barcelona summit
13:21:43 https://www.openstack.org/summit/barcelona-2016/call-for-presentations/manage/15037/summary
13:21:56 hope we can finish our HA design and make a presentation for it
13:22:07 I added you two and Qiming's name as speaker
13:22:17 And also we can specify a timeout parameter when creating new stack.
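The timeout suggestion above can be sketched in code. This is a minimal illustration, not the participants' setup: it only builds the request body for a Heat stack-create call and sends nothing. The helper name build_stack_create_body, the stack name, the template content and the 600-minute value are all hypothetical; timeout_mins is assumed to be the field the Heat API reads for the stack creation timeout in minutes.

```python
import json


def build_stack_create_body(stack_name, template, timeout_mins):
    """Hypothetical helper: assemble a Heat stack-create request body.

    Assumes the Heat API takes the creation timeout, in minutes,
    in the "timeout_mins" field of the request body.
    """
    if not isinstance(timeout_mins, int) or timeout_mins <= 0:
        raise ValueError("timeout_mins must be a positive integer")
    return {
        "stack_name": stack_name,
        "template": template,
        "timeout_mins": timeout_mins,
    }


if __name__ == "__main__":
    # Illustrative values only; the template here is a bare placeholder.
    template = {"heat_template_version": "2015-10-15", "resources": {}}
    body = build_stack_create_body("lb-demo", template, timeout_mins=600)
    print(json.dumps(body, indent=2))
```

A longer timeout gives a slow load balancer more time to finish; it does not by itself change how often Heat retries a failed client call, which is what the client_retry_limit option mentioned above is said to control.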
13:22:31 I think we can make further discussion to see how we can build this demo
13:22:40 okay, yanyanhu. will read it and raise discussion
13:22:58 sounds like a better solution, elynn
13:23:02 lixinhui_, thanks :) I think HA is an important feature we need to finish in this cycle
13:23:56 although maybe the basic one, we hope it is a complete loop, from failure detection to recovery
13:24:00 is that a cli parameter? elynn
13:24:23 lixinhui_: yes
13:24:38 okay, I will try it
13:24:47 yes, yanyanhu
13:24:53 lixinhui_, timeout works well I think
13:25:01 for heat stack creation
13:25:16 :)
13:25:37 you can have a try. I tried to increase it from 1 hour to 10 hours when I deployed a very large and complicated stack in softlayer
13:25:39 :)
13:26:09 That is so long.
13:26:11 happy to know
13:26:34 ok, I think we can collect our ideas about the HA topic using this existing etherpad: https://etherpad.openstack.org/p/senlin-ha-recover
13:26:53 elynn, yes, since the deployment of some services inside VM is very slow :)
13:26:57 like DB2
13:27:14 and we used software deployment for it
13:27:21 that's true...
13:27:33 anyway, it works well
13:27:54 ok, so this is about HA?
13:27:59 yes
13:28:01 we can have more discussion offline
13:28:06 thanks, xinhui
13:28:06 sure
13:28:22 pleasure
13:28:22 next one
13:28:27 :)
13:28:31 let's skip document
13:28:43 I guess haiwei is not here?
13:29:00 I noticed he proposed a patch to add docker driver
13:29:16 although just for most basic operations like creating, deleting
13:29:22 it's a starting point I think
13:29:45 will talk with him to add more driver interfaces
13:30:10 umm, for other left items, we have no progress I think
13:30:18 so topic 2
13:30:27 #topic proposal for summit
13:30:47 hi, elynn, lixinhui_ , any idea :)
13:30:53 besides the HA one
13:31:14 Not sure what else we can propose
13:31:25 cluster.do
13:31:42 I think that is worth a talk
13:31:47 you mean deployment? lixinhui_
13:31:52 of app
13:31:57 yes
13:32:14 and any cluster management
13:32:15 I see. It's very useful and powerful I believe
13:32:33 just I feel we may need a use case as reference
13:32:55 integration with some given agent
13:32:59 otherwise, pure technical discussion could be difficult to understand for audience who are not familiar with Senlin code
13:33:23 yes
13:33:24 lixinhui_, for agent, you mean?
13:33:37 agree
13:33:39 agent of some given monitor
13:34:05 so this is for HA solution?
13:34:15 as monitoring part?
13:34:20 I think it should be a separate topic
13:34:26 ok
13:34:27 We can think of some use cases, like scaling from a standby pool, green-blue deployment
13:34:39 cool, ekynn
13:34:41 elynn, that's interesting as well
13:34:44 elynn
13:34:54 ok, I will create an etherpad to collect these ideas
13:34:55 How to use senlin for better management.
13:35:07 and then we can discuss and refine them to see what we can propose
13:35:47 yes
13:35:48 #action yanyanhu to create an etherpad to collect proposal ideas
13:35:48 oh, BTW, I also talked with eldon to see whether we can propose one to share their experience on managing cluster in large scale
13:35:48 eldon from cmcc
13:36:16 cool yanyanhu
13:36:18 they have tried to use senlin to manage a cluster consisting of a thousand VMs?
13:36:19 I think
13:36:37 you know I always want to learn more details
13:36:39 so that can be a very good demonstration and sharing
13:36:50 they also met some problems I believe
13:36:58 lixinhui_, sure, me too :)
13:37:07 :)
13:37:14 so lessons learned
13:37:18 ok
13:37:28 will add it as well
13:37:47 I will create the etherpad and post the link into irc channel
13:38:10 the deadline is 13th ?
13:38:38 yes
13:38:39 so we need to propose before that date :)
13:38:51 about 2 weeks
13:38:55 yea
13:39:08 lots of thinking needed :P
13:39:26 ok, that's for proposal for summit
13:39:43 #topic open discussion
13:39:57 any other items you guys want to discuss?
13:40:43 none from me
13:40:55 if not, we can finish 20 minutes earlier :)
13:41:11 Qiming should be a good dad
13:41:28 Congratulations to him :)
13:41:36 lixinhui_, I believe so :)
13:41:39 yea
13:41:53 :)
13:42:11 so maybe a month later, we can go to his home to see his baby :)
13:42:18 haha
13:42:22 heihei
13:42:29 let's go together
13:42:37 he will be managed by two women
13:42:41 haha
13:42:42 since then
13:42:43 haha
13:42:49 lixinhui_, LOL
13:42:55 yep
13:43:03 ok, thank you so much for joining
13:43:12 I think we can finish the meeting now
13:43:14 see you next time
13:43:20 let's make further discussion later
13:43:20 cu
13:43:22 see U
13:43:27 #endmeeting