13:00:11 #startmeeting senlin
13:00:12 Meeting started Tue Nov 22 13:00:11 2016 UTC and is due to finish in 60 minutes. The chair is yanyanhu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:16 The meeting name has been set to 'senlin'
13:00:28 hello
13:00:32 hi
13:00:37 hi
13:00:39 hi
13:00:42 evening
13:00:56 hi, xinhui, xuefeng, haiwei
13:01:08 hi, all
13:01:17 hello
13:01:18 let's wait a while for other attendees
13:01:20 hi, Qiming
13:02:03 ok, let's start
13:02:05 https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Agenda_.282016-11-22_1300_UTC.29
13:02:10 here is the agenda
13:02:16 please feel free to add items
13:02:24 #topic ocata work items
13:02:33 ocata work items
13:02:53 ocata-1 was released last week
13:03:08 so now we are working on ocata-2
13:03:23 test
13:03:28 performance test, no progress
13:04:01 improve tempest API tests. since the versioned request support is almost done, will consider resuming this work
13:04:35 the basic idea is adding verification of exception messages
13:05:10 next one
13:05:13 HA support
13:05:20 hi, Qiming, xinhui, your turn
13:05:44 no update from me
13:05:54 ok
13:06:00 lixinhui, and you?
13:06:10 I noticed this bug is marked as won't fix
13:06:12 https://bugs.launchpad.net/neutron/+bug/1548774
13:06:12 Launchpad bug 1548774 in senlin "LBaas V2: operating_status of 'dead' member is always online with Healthmonitor" [Undecided,New]
13:06:17 the bug report about lbaas
13:06:22 I will propose a BP this week
13:06:24 yes, I challenged that decision
13:06:33 Qiming, I think you have asked armando
13:06:38 yes
13:07:02 lixinhui, great
13:07:20 lixinhui, bp about?
13:07:35 to Octavia
13:07:41 okay
13:07:46 no matter what the result is
13:07:51 we should try
13:07:56 yep
13:08:07 btw, the following patch was abandoned for no updates in the last 4 weeks
13:08:09 https://review.openstack.org/#/c/325624/
13:08:19 lixinhui, you may want to restore it if needed :)
13:08:20 yes
13:08:24 I know this
13:08:40 but it's a totally different resolution
13:08:44 ok
13:08:58 cool
13:09:05 so will you reuse this patch or propose a new one?
13:09:15 Propose a new one
13:09:25 while we are keeping an eye on the LB service
13:09:41 I'm wondering if we should look for some alternatives
13:10:20 Qiming, you mean some load balancers outside OpenStack?
13:10:29 e.g. hardware based ones
13:10:29 the reason we are investigating and even trying to make things right with LB is health checking
13:11:11 users will decide whether they will use neutron-lbaas or not
13:11:36 by alternative, I mean some 'ping', 'http get' requests sent to the cluster nodes
13:11:49 I see
13:12:00 as a poller for health checks
13:12:25 it is a little bit different from what LB is doing, because what we will be doing is from the management network
13:12:43 LB can do it from the guest network
13:13:42 guest network, you mean?
13:13:49 tenant network
13:14:10 data plane?
13:14:14 I see
13:14:16 yes
13:14:47 if we decide to do this, I'd suggest we start a new service process for this
13:14:56 if so, we may need to find a way to make senlin hm reach the tenant network
13:14:59 the poller will consume a lot of cpu cycles I believe
13:15:11 Qiming, yes, that makes sense
13:15:33 having the health manager configured into the tenant network is an option though
13:15:35 currently, all senlin components are running in the management network
13:15:41 yes
13:15:55 all VMs should be reachable from the management network
13:16:02 that is a safe assumption I think
13:16:13 or else, ceilometer cannot work, heat cannot work ...
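[Editor's note: the poller alternative discussed above — 'ping' or 'http get' requests sent to cluster nodes from the management network — could be sketched roughly as below. This is a minimal illustration only; `ping_check` and `poll_cluster` are hypothetical names, not part of senlin, and a real health manager would run as its own service process as Qiming suggests.]

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def ping_check(address, timeout=1):
    """Send one ICMP echo; True if the node answered.

    Note: as pointed out in the meeting, a successful ping only proves
    the VM is up, not that the service on it is still working.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout), address],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0

def poll_cluster(addresses, check=ping_check, max_workers=10):
    """Run the check against every node concurrently; return the
    addresses that failed. Concurrency matters because polling a huge
    cluster one by one would stretch the polling interval badly."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check, addresses))
    return [addr for addr, ok in zip(addresses, results) if not ok]
```

The `check` parameter is injectable, so the same poller could use an HTTP GET probe instead of ICMP for a service-level check.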
13:16:18 Qiming, yes, we just need a graceful and scalable way to support it
13:16:27 I never worry about management network reachability
13:16:51 because if there are lots of VMs in a huge cluster, pinging them one by one could be a big problem
13:17:36 if you have a large-scale cluster, you will very likely prolong the interval between two polling operations
13:17:48 yes
13:18:12 because the base is large, the crashes of one or two VMs should be resolved, but it won't be that urgent
13:18:30 yes
13:18:56 anyway, we hope lbaas can provide us some help here
13:19:05 there won't be a decent solution for this, even for listeners, you still have to process a lot of events
13:19:14 right
13:19:21 if not, we may consider plan B to support status checks by ourselves
13:19:31 +2
13:19:48 tacker is using ping
13:19:50 hope xinhui's work can help to solve this issue :)
13:20:12 haiwei_, any details about how they support it?
13:20:24 ping can be used, but ping won't prove to you that the service is still working
13:20:26 could you point us to the code, haiwei_
13:20:35 thx
13:20:40 they just ping the VM, nothing special
13:20:54 ok
13:21:28 haiwei_, do they perform the ping operation periodically?
13:21:35 http://git.openstack.org/cgit/openstack/tacker/tree/tacker/vnfm/monitor_drivers/http_ping/http_ping.py
13:21:36 or is it triggered by some events
13:21:54 https://github.com/openstack/tacker/blob/master/tacker/vnfm/monitor_drivers/ping/ping.py
13:21:59 you are quick
13:22:01 that is
13:22:07 :)
13:22:24 and a native ping: http://git.openstack.org/cgit/openstack/tacker/tree/tacker/vnfm/monitor_drivers/ping/ping.py
13:22:55 thanks, Qiming
13:23:05 great.
we can refer to it
13:23:07 seems it's not periodic, yanyanhu
13:23:15 haiwei_, ok
13:23:29 looks like this is how it is invoked: http://git.openstack.org/cgit/openstack/tacker/tree/tacker/vnfm/monitor.py#n96
13:23:37 periodic pings could cause huge overhead, as Qiming said
13:23:50 it is doing periodic pings
13:23:57 yes
13:24:07 oops, I saw check_intvl
13:24:20 default 10 seconds
13:24:32 the logic is not complicated
13:24:52 so we may need some evaluation here
13:25:12 once we decide to apply a similar design
13:25:16 code is easy, design is the difficult part
13:25:17 haven't tested this yet, don't know if it works or not
13:25:26 I see
13:25:30 Qiming, +1
13:26:50 ok, let's wait for xinhui's work on lbaas before going down this path
13:27:13 the other side of the HA solution is about usage scenarios I think
13:27:29 need some tutorials from xinhui someday on Mistral
13:27:35 NOT today
13:27:45 Qiming, sure, also looking forward to it
13:28:04 maybe we can get a lecture from xinhui someday :)
13:28:06 :)
13:28:07 I can pay for the lunchbox
13:28:19 haha, I can pay for beverages
13:28:22 hope so :)
13:28:49 ok, let's decide it offline after the meeting :)
13:28:49 a small senlin meetup
13:28:57 haiwei_, yep :)
13:29:02 :)
13:29:04 ptg?
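[Editor's note: the Tacker-style monitor discussed above re-runs its check on a fixed interval (check_intvl, default 10 seconds). The shape of that loop, decoupled from any real ping, could be sketched like this; `PeriodicPingMonitor` is a hypothetical name for illustration, not Tacker's or senlin's actual class.]

```python
import threading

class PeriodicPingMonitor:
    """Re-run a health check every `interval` seconds.

    `check` returns the list of dead nodes; `on_dead` is the recovery
    hook invoked for each of them. On a large cluster the interval
    would likely be stretched, as discussed, to keep overhead down.
    """

    def __init__(self, check, on_dead, interval=10):
        self.check = check
        self.on_dead = on_dead
        self.interval = interval
        self._stop = threading.Event()

    def run_once(self):
        """One polling pass: detect dead nodes and trigger recovery."""
        dead = self.check()
        for node in dead:
            self.on_dead(node)
        return dead

    def run(self):
        """Loop until stop() is called; wait() doubles as the sleep."""
        while not self._stop.wait(self.interval):
            self.run_once()

    def start(self):
        threading.Thread(target=self.run, daemon=True).start()

    def stop(self):
        self._stop.set()
```

Using `Event.wait()` as the sleep lets `stop()` interrupt the loop immediately instead of waiting out a full interval.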
13:29:14 haiwei_, can call in and provide some music as background
13:29:16 oh, right, since we won't appear at the PTG
13:29:33 :D
13:29:37 a meetup may be needed for us
13:29:41 my boss asked me today about the senlin PTG
13:30:07 haiwei_, we don't have a plan to join it :( and also because we are marked as single-vendor now
13:30:24 so one mission in this cycle is increasing our diversity :)
13:30:30 I know it, yanyanhu, so I just smiled at him
13:30:38 the reason we were "awarded" that label is review statistics
13:30:52 hope for more contributions, especially code reviews, from you guys who are not IBMers :P
13:30:53 90% of reviews are from IBM
13:30:57 yep
13:31:11 ok
13:31:35 we have been improving quickly recently
13:32:00 yep, especially thanks to XueFengLiu, lvdongbin and also Ruijie_ :)
13:32:39 let's spend more effort here :)
13:32:53 ok, we can talk about the meetup later
13:32:54 no problem :)
13:32:56 :)
13:33:02 so let's move on?
13:33:06 :)
13:33:13 IBM reviews: 68% now
13:33:27 IBM commits: 41%
13:33:51 great
13:34:06 ok, next item, documentation
13:34:10 very happy to see fresh blood in the team
13:34:10 no progress I think
13:34:19 Qiming, definitely :)
13:34:20 yes, skip that please, :(
13:34:30 ok
13:34:34 versioned request support
13:34:35 almost done
13:34:44 with the effort of the whole team
13:34:55 em, one problem though
13:34:57 https://blueprints.launchpad.net/senlin/+spec/objectify-service-requests
13:35:01 hopefully it can be finished in one or two weeks
13:35:06 no patch has been referencing this BP
13:35:18 ...that's true
13:35:23 although we may have a few dozen patches working on this
13:35:38 sigh...
13:35:51 didn't notice this issue before
13:35:56 I'll mark it ... "Good Progress", if no objections
13:36:05 agree...
13:36:10 it was my fault, I didn't do this, so ...
13:36:23 you built the foundation :)
13:36:45 and the habit of not referencing the BP
13:36:59 :P
13:37:11 ok, next one
13:37:17 container profile
13:37:31 haiwei_, any new progress?
13:37:35 not much progress, yanyanhu
13:37:45 ok
13:37:51 so, next one
13:37:54 event/notification
13:38:01 I think the dependency handling work is good
13:38:03 I believe Qiming did lots of work here
13:38:18 yes, quite some code written and deleted locally
13:38:43 I believe the design (not written out anywhere) is 70% done
13:38:48 great
13:39:11 the last mile is the generalization of cluster/node actions
13:39:44 generalization, you mean?
13:39:57 the format? or the timing that generates events?
13:40:04 extracting generic parameters/properties while ensuring important data can be presented to users, that is ... a big puzzle for me
13:40:32 I see
13:40:54 the format: do we want a ClusterScaleOutActionPayload and a ClusterResizeActionPayload, or do we stop at ClusterActionPayload
13:41:12 I myself prefer the last one :)
13:41:13 have been playing with different options recently
13:41:21 yes, me too
13:41:49 then that payload has to be very expressive, capable of delivering the event details for all actions
13:41:56 yes
13:42:05 so the generalization is very important
13:42:28 the notification object, when serialized ...
13:42:58 will be something like this: http://paste.openstack.org/show/590069/
13:43:29 wow, looks great
13:43:44 line 33 in that paste will only appear when something goes wrong
13:44:06 I see
13:44:33 so it will be empty when everything is good
13:44:36 still working on the line 44 part, trying to get the most interesting properties from all action types, while keeping the structure concise
13:44:58 you know ... exception is an ObjectField with nullable=True
13:45:07 ah, right
13:45:38 the line 19 part will be replaced by a 'node' dict when dealing with nodes
13:45:39 great
13:46:36 the structure looks good
13:46:57 thanks for this great work, Qiming :)
13:46:59 the other thing to be settled is the event_type, which isn't a big issue
13:47:34 what about the event log file?
13:47:45 there will be an output file, right?
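[Editor's note: the single-payload design discussed above — stopping at a generic ClusterActionPayload rather than one payload class per action type, with a nullable exception field that is omitted when everything is fine — could be illustrated as below. The real work uses oslo versioned objects and the paste linked above; this plain-dataclass sketch with hypothetical field names is only meant to show the shape of the idea.]

```python
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class ExceptionPayload:
    """Present only when the action failed (nullable=True in the
    real ObjectField); dropped from the serialized form otherwise."""
    type: str
    message: str

@dataclass
class ClusterActionPayload:
    """One generic payload for all cluster actions, instead of a
    ClusterScaleOutActionPayload, ClusterResizeActionPayload, etc.
    For node actions, the 'cluster' dict would be a 'node' dict."""
    cluster: dict
    action: str                       # e.g. CLUSTER_SCALE_OUT
    inputs: dict = field(default_factory=dict)
    exception: Optional[ExceptionPayload] = None

    def serialize(self):
        d = asdict(self)
        if self.exception is None:
            del d["exception"]        # empty when everything is good
        return d
```

The trade-off noted in the meeting applies: a single payload must be expressive enough to carry the interesting properties of every action type while staying concise.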
13:48:07 currently we are focusing on the database and message backends (aka drivers)
13:48:18 ok
13:48:34 haiwei_, a file will be an option
13:48:36 if there are requirements, we can quickly add a new driver, dumping these into a JSON file, for example
13:49:02 got it
13:49:09 Qiming, the target milestone for basic support is ocata-2?
13:49:30 just leave it there and see if I can work my ass off
13:49:51 great, just take your time :)
13:49:59 will propose a postponement if I figure I cannot finish it by o-2
13:50:06 ok
13:50:35 great, this will be a very important feature we will include in the O release
13:50:55 so those are all the items in the list
13:51:17 any more you guys are working on?
13:51:35 #topic open discussion
13:51:44 ok, open discussion now
13:52:08 any topic you want to discuss?
13:52:16 if there is still some bandwidth in the team, I'd propose we work on improving container cluster support
13:52:34 haiwei has done a great job, but it is still a starting point
13:52:44 yes
13:53:04 good idea
13:53:10 hope we can have at least one use case that can be successfully run on a senlin container cluster
13:53:40 my dream goal is: start a container cluster using a single senlin command, which will be load-balanced, auto-scalable and health managed
13:54:08 :)
13:54:14 Wow
13:54:31 too ambitious ?
13:54:43 honestly, a little bit :P
13:54:56 at least for this cycle, it is
13:55:04 a little bit, but we should get there
13:55:17 I didn't say the goal is for this cycle, :D
13:55:29 but we have to work on it
13:55:33 haha, understood
13:55:34 :)
13:55:35 sure
13:56:10 that will be a great showcase of senlin's capabilities
13:56:20 yes, definitely
13:56:53 been looking at the Spark architecture recently, as a beginner
13:57:04 Qiming, :)
13:57:14 comparing with its low-level architecture, we are only missing the scheduling part
13:57:48 since we are not a scheduler actually
13:57:53 senlin is more like an engine
13:57:55 we don't have one
13:58:00 not even a simple one
13:58:17 maybe the placement policy can help address this issue
13:58:22 that has been a blocking factor for haiwei I believe
13:58:43 yes, for most short-lived containers, it is just a placement decision
13:58:52 you won't migrate them around
13:59:02 yes
13:59:24 ok, time is almost over. let's move back to the senlin channel
13:59:28 aren't containers designed to be short-lived?
13:59:34 :D
13:59:37 a big question
13:59:38 Qiming, yes, it is :)
13:59:41 in most cases
13:59:57 ok, we can have further discussion later
14:00:02 thanks, everyone, for joining
14:00:05 have a good night
14:00:12 #endmeeting