03:00:57 #startmeeting Masakari
03:00:58 Meeting started Tue Jul 3 03:00:57 2018 UTC and is due to finish in 60 minutes. The chair is tpatil. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:00:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:01:01 The meeting name has been set to 'masakari'
03:01:08 Hi All
03:01:11 tpatil: Hi
03:01:16 hi
03:01:16 Sorry, a bit late
03:01:51 samP: No problem, please take the pilot seat
03:01:57 tpatil: thanks.
03:02:09 OK, then, let's start
03:02:31 #topic bugs and patches
03:02:48 Any critical bugs or patches?
03:03:11 #link https://bugs.launchpad.net/masakari/+bug/1779165
03:03:11 Launchpad bug 1779165 in masakari "masakari service docs are not published" [High,Confirmed] - Assigned to SamP (sampath-priyankara)
03:03:28 #link https://bugs.launchpad.net/masakari/+bug/1779166
03:03:28 Launchpad bug 1779166 in masakari "masakari API docs are not published" [High,Confirmed] - Assigned to SamP (sampath-priyankara)
03:03:42 The docs are not published; I will take care of this.
03:03:52 Ok
03:03:55 #link https://bugs.launchpad.net/masakari/+bug/1779752
03:03:56 Launchpad bug 1779752 in masakari "masakari segment-list returns error (openstack queens)" [Undecided,New]
03:04:30 #link https://bugs.launchpad.net/masakari/+bug/1779752
03:04:54 ^^ looks like a python-masakariclient issue
03:04:54 This bug is reported by Torin.
03:05:01 tpatil: yep
03:05:35 I remember fixing this; I may be wrong.
03:06:01 Which versions of python-masakariclient and openstacksdk should be installed for stable/queens?
03:06:37 5.10
03:06:45 5.1.0
03:07:19 openstacksdk>=0.13.0
03:09:09 Someone needs to install stable/queens to reproduce this issue. I will see if I have time to do this.
03:09:52 tpatil: I am rebuilding my env with stable/queens to test masakari-monitors issues
03:10:03 I will take a look at this too.
03:10:19 Ok, great
03:12:15 I am going to assign this to myself just for now..
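[Editor's note] The client versions agreed on above for reproducing the stable/queens segment-list bug could be pinned like this (a sketch based only on the versions stated in the discussion):

```text
# Pins for reproducing bug 1779752 on stable/queens, per the discussion above
python-masakariclient==5.1.0
openstacksdk>=0.13.0
```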
03:12:25 We should mark this issue as Critical
03:12:34 tpatil: agree
03:12:51 done
03:12:59 tpatil: thanks
03:13:19 I will confirm this after testing
03:13:33 Any other patches or bugs?
03:14:19 Need review: https://review.openstack.org/#/c/576042/
03:14:59 tpatil: sure, I will do.
03:17:38 #link https://bugs.launchpad.net/masakari/+bug/1739383
03:17:38 Launchpad bug 1739383 in masakari "'409 Conflict' occurred when adding reserved_host to aggregate" [Undecided,In progress] - Assigned to takahara.kengo (takahara.kengo)
03:17:47 This LP bug was fixed a long time ago
03:18:15 Can you please mark this bug as Fix Released or Fix Committed?
03:19:03 tpatil: set it to Fix Released, thanks
03:19:48 samP: Thanks
03:20:36 Any other issues?
03:21:10 N
03:21:10 if not, let's move to discussion
03:21:11 No
03:21:30 #topic Discussion points and subprojects
03:21:44 (1) Horizon Plugin
03:22:31 Sorry, I didn't have time to check the patches.
03:22:52 I think tpatil is mainly checking those. I will join soon
03:22:56 I have reviewed 4 patches and given some minor comments, which niraj_singh will fix soon
03:23:05 tpatil: thanks
03:23:17 yes
03:23:33 niraj_singh: thanks for the great work!
03:23:49 (2) Ansible support
03:24:13 This task is pending due to an environment issue.
03:24:32 Yes
03:24:36 I will fix the env and try to provide the required features ASAP
03:24:58 (3) recovery method customization
03:25:32 I saw your comment on the spec
03:25:35 #link https://review.openstack.org/#/c/458023/
03:26:25 I have requested Shilpa to update the spec
03:27:03 Instead of adding a new section "host_rh_failure_recovery_tasks", we will add two new config options under taskflow_driver_recovery_flows for RH failure
03:27:21 host_rh_failure_pre_recovery_tasks
03:27:43 host_rh_failure_main_recovery_tasks
03:28:10 The RH workflow is a bit different from the others
03:29:26 If the reserved_host list contains more than one reserved host and instances cannot be evacuated to one reserved host, we need to get the next one from the list
03:30:05 For that purpose we have used the ParameterizedForEach flow
03:30:29 We need to disable the compute host only once, hence this task will be part of host_rh_failure_pre_recovery_tasks
03:30:51 and all other tasks will be part of the host_rh_failure_main_recovery_tasks config option
03:31:16 tpatil: yes, understood. But can we consider this nested flow as a single flow?
03:32:45 It executes as a single workflow only, but from a configuration point of view we have provided two config options so that operators can add more custom tasks to each of them
03:33:17 For example, in the host_rh_failure_pre_recovery_tasks config option: disable the compute node and send an alert email
03:34:56 and in the case of the host_rh_failure_main_recovery_tasks config option, along with evacuation of instances, send an alert to the operator about enabling the compute service on each reserved host
03:35:26 tpatil: I understand your point. My concern is that it will break the uniformity of the config.
03:36:50 tpatil: maybe not; let me reconsider with the correct details.
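[Editor's note] The reserved-host (RH) recovery logic described above — disable the failed compute host once (pre tasks), then try evacuation against each reserved host in turn (main tasks, iterated via ParameterizedForEach) — can be sketched roughly as follows. This is an illustration of the control flow only, not Masakari's actual taskflow code; the callable names are hypothetical.

```python
class EvacuationFailed(Exception):
    """Raised when instances cannot be evacuated to a given reserved host."""


def rh_failure_recovery(failed_host, reserved_hosts, disable, evacuate):
    """Sketch of the RH recovery flow from the discussion.

    `disable` stands in for host_rh_failure_pre_recovery_tasks (run once);
    `evacuate` stands in for host_rh_failure_main_recovery_tasks (retried
    once per reserved host until one succeeds). Returns the reserved host
    that accepted the instances.
    """
    disable(failed_host)  # pre-recovery tasks: disable compute host once
    for rh in reserved_hosts:  # ParameterizedForEach over the reserved host list
        try:
            evacuate(failed_host, rh)  # main recovery tasks for this reserved host
            return rh  # evacuation succeeded; stop iterating
        except EvacuationFailed:
            continue  # this reserved host could not take the instances; try next
    raise EvacuationFailed("no reserved host could take the instances")
```

The same nesting is why the spec splits the configuration in two: the outer, run-once steps and the inner, per-reserved-host steps are configured separately even though they execute as one workflow.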
03:37:03 Yes, for RH failure there are two config options
03:38:02 but these config options will be under [taskflow_driver_recovery_flows] instead of [host_rh_failure_recovery_tasks]
03:38:48 Only in the case of RH failure do we need to iterate through the input reserved host list
03:38:59 and hence we have come up with this design
03:40:50 tpatil: In host failure, the config options are different from each other. But operators use only one of them.
03:42:58 In [taskflow_driver_recovery_flows], we can have a structure for the 4 host failure recovery patterns. Only one will be used.
03:43:25 That is configurable based on the recovery_method configured in the segment
03:44:36 tpatil: ah.. someone can have a different pattern for different segments. But we have one config file.
03:45:47 In this case, multiple host failure recovery methods need to be addressed in the config file
03:46:30 #link http://paste.openstack.org/show/724825/
03:46:53 All recovery tasks will be part of a new config file
03:48:11 LGTM
03:48:41 ok
03:49:28 auto priority and rh priority also use the same config. Do we need some config for them?
03:52:17 auto_priority = host_auto_failure_recovery_tasks followed by (host_rh_failure_pre_recovery_tasks + host_rh_failure_main_recovery_tasks)
03:52:33 rh_priority is exactly the opposite
03:53:14 If instances are evacuated successfully using the "host_auto_failure_recovery_tasks" recovery workflow, then it won't execute (host_rh_failure_pre_recovery_tasks + host_rh_failure_main_recovery_tasks)
03:54:06 tpatil: yep. Let's focus on the above. I think there is no need to add anything specific to the auto and rh priority methods
03:55:29 Please add the above discussion to the spec.
03:55:40 Ok
03:55:42 We have 5 mins left
03:55:46 #topic AOB
03:55:58 Any other issues?
03:56:43 No
03:56:55 If not, let's finish today's meeting. Thank you all for your time.
03:57:14 Thank you, I will end this meeting
03:57:23 tpatil: sure
03:57:24 #endmeeting
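[Editor's note] The recovery-flow configuration discussed in the meeting might look roughly like the sketch below. The section name and the three option names are taken from the discussion; the task names on the right-hand side are hypothetical placeholders, not Masakari's actual task classes.

```ini
# Hypothetical sketch of the proposed recovery-flow config file.
[taskflow_driver_recovery_flows]
# Used when recovery_method on the segment selects auto recovery.
host_auto_failure_recovery_tasks = disable_compute_task, evacuate_all_instances_task

# Reserved-host (RH) failure is split in two: pre tasks run exactly once,
# main tasks are repeated for each reserved host in the list until one succeeds.
host_rh_failure_pre_recovery_tasks = disable_compute_task, alert_email_task
host_rh_failure_main_recovery_tasks = evacuate_instances_task, notify_operator_task
```

Per the discussion, no extra options are needed for the priority methods: auto_priority runs host_auto_failure_recovery_tasks first and falls back to the RH pair only if evacuation fails, and rh_priority does the reverse.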