03:01:14 <tpatil> #startmeeting Masakari
03:01:15 <openstack> Meeting started Tue Jun 26 03:01:14 2018 UTC and is due to finish in 60 minutes.  The chair is tpatil. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:01:16 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:01:18 <openstack> The meeting name has been set to 'masakari'
03:01:36 <tpatil> Hi All
03:01:47 <samP> sorry I'm late
03:01:55 <samP> Hi
03:02:02 <tpatil> Hi samP
03:02:15 <tpatil> Please take the pilot seat
03:02:15 <samP> not started yet, right?
03:02:28 <samP> tpatil: sure..
03:02:32 <tpatil> already started
03:02:43 <samP> ah.. yes thanks
03:02:49 <samP> OK then,
03:03:25 <samP> #topic bugs and patches
03:04:11 <samP> any bugs or patches we need to discuss?
03:04:25 <tpatil> Louie Kwan has uploaded a new PS
03:04:38 <tpatil> I'm busy reviewing other patches. Need help in reviewing his patch
03:04:46 <tpatil> #link https://review.openstack.org/#/c/534958/
03:04:47 <patchbot> patch 534958 - masakari-monitors - Introspective Instance Monitoring through QEMU Gue...
03:04:59 <samP> tpatil: great... I will take a look at that
03:05:05 <samP> tpatil: thanks for the review
03:05:16 <tpatil> Thank you
03:05:35 <tpatil> I have proposed a fix for LP bug https://bugs.launchpad.net/masakari/+bug/1773132
03:05:35 <openstack> Launchpad bug 1773132 in masakari "masakari-engine runs recovery twice for one notification when disconnection with rabbitmq" [Undecided,In progress] - Assigned to Tushar Patil (tpatil)
03:05:48 <tpatil> #link https://review.openstack.org/#/c/576042/
03:05:49 <patchbot> patch 576042 - masakari - Avoid recovery from failure twice
03:05:56 <tpatil> Please take a look at this patch as well
03:06:31 <samP> tpatil: sure
03:06:49 <samP> tpatil: thanks for the fix
03:08:02 <tpatil> Other LP bug https://bugs.launchpad.net/masakari/+bug/1773765
03:08:03 <openstack> Launchpad bug 1773765 in masakari "There is a possibility that 'running' notification will remain" [Undecided,New]
03:08:20 <tpatil> It is a little tricky to fix this issue because we don't have much information about which instances are already evacuated
03:09:06 <tpatil> Masakari doesn't save any info about which instances are evacuated and how many are still pending during host failure notification
03:09:24 <tpatil> hence it's difficult to re-execute running status notification
03:11:15 <samP> hmm... I think we can store some of this info in masakari, and with the current status from nova, we would be able to calculate it.
03:11:40 <samP> But I'm not sure this is the best way to deal with it.
03:12:01 <tpatil> yes, it's a big change
03:12:29 <tpatil> Can we focus on this issue in the next release?
03:13:07 <tpatil> If we decide to maintain basic info in masakari, we can also add code to show the current progress of host failure notification
03:13:46 <tpatil> like source host failure, out of 10 instances, 5th instance evacuation is in progress
03:13:47 <samP> tpatil: Indeed.
03:14:41 <samP> toabctl: IIRC, nova does have an API to get the migration status,
03:17:20 <tpatil> samP: Yes, but Masakari will still need to maintain basic info to query status from nova
03:18:09 <samP> tpatil: agree. In any case, it seems like this fix is too heavy for this release.
03:18:23 <tpatil> samP: Correct
03:19:10 <samP> Let's postpone it to the next release, unless someone raises an urgent flag..!
03:20:42 <samP> any other bugs or patches?
03:21:07 <tpatil> One minor issue
03:21:11 <tpatil> #link https://review.openstack.org/#/c/574614/
03:21:12 <patchbot> patch 574614 - masakari - Segment description allows multiline characters
03:21:28 <tpatil> Description doesn't allow multiline characters
03:21:52 <tpatil> I will review this patch today
03:23:19 <samP> LGTM
03:24:02 <samP> need to rebase
03:24:19 <samP> thanks for the fix
03:24:39 <tpatil> Ok
03:25:11 <tpatil> I have voted +2 on this patch https://review.openstack.org/#/c/567825/
03:25:12 <patchbot> patch 567825 - masakari - Enable mutable config in Masakari
03:25:40 <tpatil> Need someone to review and approve this patch
03:26:17 <samP> sure, I will take a look
03:26:21 <samP> tpatil: thanks
03:27:50 <samP> Let's move to status update for features and sub-projects
03:28:13 <samP> (1) Horizon Plugin
03:28:46 <tpatil> I have reviewed 3-4 patches and posted comments
03:29:39 <samP> I want to review, but it will take some time to get there...;)
03:29:47 <tpatil> ok
03:29:50 <samP> but I will be there soon
03:30:24 <samP> tpatil: thanks for reviews
03:30:44 <samP> (2) recovery method customization
03:31:01 <tpatil> specs to be reviewed
03:31:27 <samP> I read the spec. I will post my comments soon. No critical comments, basically LGTM
03:31:40 <tpatil> Implementation is almost finished. Working on adding unit tests. Expect to push the patch next week
03:31:51 <samP> tpatil: sure, thanks a lot
03:32:46 <samP> (3) Ansible support for Masakari
03:33:13 <samP> Any issues?
03:33:19 <tpatil> This task is currently on hold due to unavailability of dev. environment
03:33:39 <tpatil> we discussed this issue in the meeting before last
03:33:40 <samP> tpatil: ah, yes. my fault
03:34:48 <samP> I will try to provide server init method asap
03:34:49 <tpatil> Still many items are pending. Looks like we will need to postpone it to the next release
03:34:59 <tpatil> samP: Thanks
03:36:10 <samP> tpatil: are you referring to ansible?
03:36:30 <tpatil> yes
03:37:34 <samP> tpatil: the monitor part is a bit difficult and I think it is OK to postpone it to the next release
03:38:28 <tpatil> samP: Agree
03:39:13 <samP> if someone comes up with a quick solution, then we can reconsider putting it into this release.
03:39:27 <tpatil> ok
03:40:09 <samP> Otherwise, let's consider it for the next release
03:40:19 <samP> #topic AOB
03:40:28 <tpatil> I want to bring up this issue report by Torin on openstack ML
03:40:30 <tpatil> #link http://lists.openstack.org/pipermail/openstack/2018-June/046620.html
03:40:40 <tpatil> s/report/reported
03:41:56 <tpatil> execute method is failing with 124 error
03:42:08 <samP> hmm... seems like Corosync failed to sync
03:43:08 <tpatil> timeout returns exit status 124, but if you pass the --preserve-status option to timeout it returns 0
03:44:06 <tpatil> In that case, I don't think it will raise ProcessExecutionError
03:45:06 <tpatil> Do we expect the host monitor process to be run by a superuser?
03:46:12 <tpatil> because run_as_root=True is passed to the utils.execute method
03:46:13 <tpatil> https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/hostmonitor/host_handler/handle_host.py#L113
03:47:36 <samP> tpatil: the case here is that tcpdump cannot be run by a normal user.
03:49:03 <tpatil> If you have any info about this issue, can you please reply to the above ML thread
03:49:21 <samP> tpatil: sure, I will join the discussion.
03:49:53 <samP> I need to read the thread carefully. I will reply to the thread
03:50:20 <samP> tpatil: thanks for answering on the thread
03:51:07 <samP> ah, and this is openstack ML, not the dev
03:51:30 <tpatil> yes
03:51:43 <samP> OK got it.
03:52:02 <tpatil> After upgrading to Queens, Torin is not able to run host monitor process
03:52:21 <samP> That is critical.
03:52:50 <samP> tpatil: I will take a look at this issue.
03:53:27 <tpatil> Thanks
03:54:15 <samP> Any other items to discuss?
03:54:34 <samP> Otherwise, let's finish today's meeting.
03:54:35 <tpatil> That's all from my end for now
03:55:18 <samP> Please use openstack-dev ML or IRC #openstack-masakari on freenode for further discussion.
03:55:31 <samP> thank you for attending today's meeting
03:55:50 <samP> tpatil: thanks
03:55:56 <samP> #endmeeting
03:56:01 <tpatil> Thank you, Bye
03:56:09 <samP> ah, you have to end it..
03:56:19 <tpatil> yes will do
03:56:22 <tpatil> #endmeeting
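
Note on the `timeout` exit status raised under AOB: the sketch below is only an illustration, assuming GNU coreutils `timeout` is available on PATH. The `sleep` and `sh` commands are stand-ins chosen for this example, not the command that masakari-monitors actually runs, and `run_with_timeout` is a hypothetical helper. By default `timeout` exits with 124 when the time limit is hit; with `--preserve-status` it reports the command's own status instead, which is 0 only if the command handles SIGTERM and exits cleanly.

```python
import subprocess


def run_with_timeout(seconds, command, preserve_status=False):
    """Run `command` under coreutils `timeout` and return the exit status.

    Hypothetical helper for illustration; assumes GNU `timeout` is on PATH.
    """
    argv = ["timeout"]
    if preserve_status:
        argv.append("--preserve-status")
    argv += [str(seconds)] + list(command)
    return subprocess.run(argv).returncode


# Stand-in for a long-running command that does not handle SIGTERM.
plain = ["sleep", "5"]
# Stand-in for a command that traps SIGTERM and exits cleanly with 0.
graceful = ["sh", "-c", "trap 'exit 0' TERM; sleep 5 & wait"]

# Time limit hit, default behaviour: timeout itself reports 124.
status_default = run_with_timeout(1, plain)

# Time limit hit, --preserve-status: the command's own status is reported.
status_preserved_graceful = run_with_timeout(1, graceful, preserve_status=True)
status_preserved_plain = run_with_timeout(1, plain, preserve_status=True)

print(status_default, status_preserved_graceful, status_preserved_plain)
```

With `--preserve-status`, a command killed by SIGTERM is reported as 128 plus the signal number rather than 0, so whether a caller still sees a non-zero exit depends on how the wrapped command handles the signal.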