09:02:47 <aspiers> #startmeeting ha
09:02:48 <openstack> Meeting started Wed Dec 21 09:02:47 2016 UTC and is due to finish in 60 minutes.  The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:52 <openstack> The meeting name has been set to 'ha'
09:03:12 <samP> hi o/
09:03:22 <aspiers> hi :) anyone else here today?
09:03:51 <aspiers> I'll try to do a better job with minutes than last time ...
09:04:54 <aspiers> OK, I guess just us
09:05:03 <aspiers> #topic specs
09:05:47 <samP> OK I've just updated the VM recovery spec.
09:05:58 <aspiers> #info VM recovery spec is making progress
09:06:38 <aspiers> yeah I saw, thanks. I will be working on these over the vacation while it is quiet
09:06:49 <aspiers> I have had a lot of urgent customer issues recently :-(
09:07:15 <samP> And I put a comment on the compute node monitoring spec also
09:07:32 <aspiers> great thanks!
09:07:37 <aspiers> I'll reply to that ASAP
09:07:50 <aspiers> a general format makes a lot of sense
09:08:11 <samP> yeh, I would like to have more discussion on that.
09:08:33 <aspiers> my feeling is that the specs need to be quite precise about the implementation
09:08:44 <aspiers> to ensure we can have compatibility between implementations
09:09:07 <samP> aspiers: sure
09:09:42 <aspiers> but ddeja does not seem so sure
09:10:58 <aspiers> well, I guess I need to think about this more
09:11:15 <aspiers> did you see the conversation from last week? if not, worth reading I think
09:11:32 <samP> yes, I read it
09:11:36 <aspiers> ok cool
09:12:11 <aspiers> I suggested that the libvirtd/nova-compute RAs should do normal process monitoring *and* recovery, but also send HTTP message when monitor fails and when starting/ending recovery
09:12:40 <haukebruno> hey
09:12:40 <aspiers> then the external process recovery component could decide its own policy
09:12:44 <aspiers> oh hey haukebruno :)
09:12:55 <samP> haukebruno: hi!
09:13:18 <samP> aspiers: I agree
09:13:36 <aspiers> samP: ok great, I think I will propose that as the preferred implementation in the spec
09:13:42 <aspiers> samP: but I will list alternatives
09:14:00 <aspiers> samP: maybe including one where Pacemaker gets enhanced
09:14:32 <samP> aspiers: I'm wondering how much further I should write in the "VM recovery" spec
09:14:33 <aspiers> and definitely including the one where we do service-disable on every stop
09:15:25 <aspiers> samP: I think VM recovery spec needs to document the interface point
09:16:31 <samP> aspiers: if we want to put more details about implementation in the spec, I'd prefer to fix the notification format first
09:16:49 <aspiers> uhhh
09:16:57 <aspiers> #topic VM recovery spec
09:17:01 <aspiers> :)
09:17:11 <aspiers> samP: yes, that makes sense
09:18:13 <aspiers> samP: did you see Qiming's suggestion about versioning?
09:18:28 <aspiers> https://review.openstack.org/#/c/406659/2/specs/newton/approved/newton-instance-ha-host-monitoring-spec.rst@205
09:19:58 <aspiers> #agreed that we should decide on a standard format for all HTTP notifications, across all components
09:20:04 <samP> how about having a simple spec for the notification format, and let's reference it from each spec
09:20:32 <aspiers> samP: ok sure
09:20:36 <samP> aspiers: yes, about JSON ver?
09:21:04 <aspiers> the versioning sounds like a good idea to me
09:21:21 <aspiers> I guess there is already an oslo standard?
09:23:32 <aspiers> hmm, maybe just nova
09:23:34 <aspiers> http://developer.openstack.org/api-guide/compute/microversions.html
09:23:57 <samP> I couldn't find any doc in oslo
09:24:20 <aspiers> this looks like a good place to start
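Editor's sketch: the versioned notification idea discussed above could look roughly like the following. This is only an illustration under stated assumptions: the field names (version, type, hostname, generated_time, payload), the "1.0" version string and the major-version check are all hypothetical, and nothing here was agreed in the meeting or taken from any spec.

```python
import json
from datetime import datetime, timezone

# Hypothetical format version; the meeting did not settle on a scheme.
FORMAT_VERSION = "1.0"


def build_notification(event_type, hostname, payload, version=FORMAT_VERSION):
    """Serialise a notification with an explicit format version.

    All field names are assumptions made for this sketch.
    """
    return json.dumps({
        "version": version,
        "type": event_type,
        "hostname": hostname,
        "generated_time": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    })


def parse_notification(raw, supported_major=1):
    """Decode a notification and reject unsupported major versions."""
    message = json.loads(raw)
    major = int(str(message["version"]).split(".")[0])
    if major != supported_major:
        raise ValueError(
            "unsupported notification version: %s" % message["version"])
    return message
```

A receiver that only understands major version 1 would accept "1.0" or "1.3" and refuse "2.0", which is one simple way to let monitors and recovery services evolve independently.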
09:24:31 * ddeja forgot about the meeting...
09:24:38 <ddeja> sorry guys, hello all
09:24:40 <aspiers> ddeja: haha no problem :)
09:24:49 <samP> ddeja: hi
09:25:00 <ddeja> but, I remember on Monday that there is no meeting
09:25:20 <aspiers> that's a good start anyway :)
09:25:22 <aspiers> ddeja: samP has a nice suggestion to standardize the common bits of HTTP message format across all components
09:25:43 <ddeja> that's good
09:26:12 <ddeja> whereas I'm still not sure if it's a good idea to standardize it at all, as I stated in one of the reviews
09:26:33 <aspiers> ddeja: yes I saw that
09:26:52 <aspiers> ddeja: I'm fine with your driver idea, but I think we have to specify the message format in the specs
09:27:22 <aspiers> ddeja: since the main point of the specs is to allow different implementations of each component to be compatible
09:27:30 <aspiers> so that they are interchangeable
09:27:49 <ddeja> OK
09:28:10 <ddeja> we can add an abstraction layer for compatibility reasons
09:28:31 <ddeja> I'm afraid that it may be seen as an overhead for others
09:28:37 <ddeja> and no one would use it
09:29:20 <aspiers> what kind of overhead do you mean?
09:29:27 <aspiers> performance, or implementation?
09:30:33 <ddeja> neither of them
09:30:39 <ddeja> let me show as an example
09:30:44 <aspiers> ok
09:30:56 <ddeja> let's say we have someone who wants to use Masakari
09:31:01 <ddeja> for VM ha
09:31:12 <ddeja> he would have to a) set up Pacemaker
09:31:22 <ddeja> b) set up an agent that monitors VMs
09:31:38 <ddeja> c) set up an agent that receives alarms from the monitor and informs Masakari
09:31:45 <ddeja> and d) Masakari itself
09:31:53 <ddeja> what would he think about it?
09:31:55 <aspiers> no, c) and d) are the same
09:32:04 <ddeja> OK
09:32:27 <ddeja> then, we need Masakari to be able to read the http message that the monitor would send
09:32:31 <ddeja> same for Mistral
09:32:31 <aspiers> yes
09:32:34 <ddeja> same for everything
09:33:19 * ddeja is now entering 'mistral core mode'
09:33:25 <ddeja> so, as for Mistral
09:33:27 <aspiers> sometimes it might be easier to write a shim which proxies notifications
09:33:34 <aspiers> but that's up to the individual implementation
09:33:48 <ddeja> unless that http message is OpenStack compliant, I see no way to implement something like that in Mistral
09:34:05 * ddeja closed 'mistral core mode' :)
09:34:12 <aspiers> ok so in the mistral case maybe a shim is required
09:34:39 <aspiers> is there a problem with that?
09:34:49 <ddeja> aspiers: for me there is none
09:35:02 <ddeja> but if we go this way
09:35:08 <ddeja> we got a, b, c and d
09:35:15 <ddeja> (as in my example)
09:35:28 <aspiers> right
09:35:38 <ddeja> and given someone may think "hm, why do I need something that just catches the alarm and sends it to the recovery service"
09:35:53 <ddeja> "why can't I just notify the recovery service itself?"
09:36:11 <ddeja> and given someone would end up writing his own monitor
09:36:30 <ddeja> that's why I think driver-based architecture would be better
09:37:26 <ddeja> in the VM monitor spec, we just specify what information would be passed to the driver layer
09:37:44 <ddeja> and what to do with it is up to the driver writer/maintainer
09:37:57 <aspiers> but then we have no guarantee of compatibility
09:38:21 <ddeja> yes, but what kind of compatibility do you need at this point?
09:38:40 <ddeja> you just want to catch the alarm and send it to the recovery service of your choice
09:38:57 <aspiers> compatibility between monitors and recovery services
09:39:19 <ddeja> it would be guaranteed by the driver
09:39:52 <aspiers> the driver in the monitor?
09:39:56 <ddeja> which means, we don't need to specify what kind of information the recovery service needs to understand
09:40:18 <ddeja> we just specify that we have a driver that can notify the recovery service and it would understand it
09:41:12 <aspiers> but if that's undocumented then there is no reliable way for multiple implementations to be compatible
09:41:29 <ddeja> aspiers: agree
09:41:48 <ddeja> aspiers: taking it other way around: why we want it to be compatible?
09:42:08 <ddeja> as I said before: you just want to catch the alarm and send it to the recovery service of your choice
09:42:11 <aspiers> so that we can incrementally move from existing vendor implementations to a single upstream converged implementation
09:42:42 <ddeja> aspiers: so you can change each piece independently?
09:42:45 <aspiers> yes
09:43:03 <ddeja> that's one thing I didn't take into account
09:43:05 <aspiers> that was the whole point of the approach we agreed in Tokyo
09:43:31 <ddeja> OK
09:43:42 <ddeja> what if we have a driver that still talks "the old way"
09:43:49 <aspiers> because several of us have existing implementations which are in production and supported for customers
09:44:04 <ddeja> and then, when you change the recovery service, you just switch the driver?
09:44:44 <aspiers> ddeja: yes, that would be perfect. any support for communicating in an "old way" is of course still allowed, but outside the scope of the spec
09:44:51 <ddeja> or, why not both?
09:44:54 <ddeja> I mean
09:44:58 <aspiers> both is fine
09:45:09 <ddeja> let's do the driver architecture
09:45:20 <ddeja> specify what information is given to it
09:45:30 <aspiers> the specs should only require support for the new way, they are not exclusive deals which prohibit the old ways :)
09:45:43 <ddeja> and give one driver that is "the" driver
09:46:01 <aspiers> for which component?
09:46:06 <ddeja> for monitor
09:46:16 <ddeja> that talks using the specified http message
09:46:17 <aspiers> VM monitor or host monitor?
09:46:24 <aspiers> or both
09:46:26 <ddeja> VM monitor
09:46:37 <ddeja> but for host monitor it should also work
09:47:09 <ddeja> then we can specify the modular architecture between monitors<->Recovery services
09:47:13 <ddeja> using standard drivers
09:47:29 <aspiers> I like the idea of drivers, but I don't see why the spec would need to require a driver architecture
09:47:51 <aspiers> any component is free to use a driver architecture, but why force all implementations to have one?
09:47:53 <ddeja> so that we can omit the point 'c' for some recovery services ;)
09:48:37 <aspiers> sure, but why does it have to be in the spec?
09:48:40 <ddeja> we just let the user choose: either your recovery service understands the standard http message
09:48:53 <ddeja> aspiers: I didn't say it needs to be in the spec :)
09:48:56 <aspiers> oh ok :)
09:49:18 <aspiers> so I think we are agreed on everything then :)
09:49:39 <aspiers> hmm
09:49:40 <ddeja> I think so
09:49:41 <aspiers> maybe not
09:49:43 <aspiers> :)
09:49:46 <ddeja> ?
09:49:47 <aspiers> I am thinking ...
09:50:08 <aspiers> OK, so let's say we implement a VM monitor with notification drivers
09:50:27 <ddeja> I just don't like the idea of writing some proxy service that translates our standard http message to mistral
09:50:31 <aspiers> then we implement a) a driver for notifying Mistral
09:50:36 <samP> sounds good, let's discuss more in the spec
09:50:44 <aspiers> b) a driver for sending standard HTTP message
09:50:48 <ddeja> aspiers: because of this -> https://xkcd.com/927/
09:51:07 * aspiers guesses that is the one about standards
09:51:18 <aspiers> right :)
09:51:21 <haukebruno> lol
09:51:46 <ddeja> aspiers: I think we need to only implement b)
09:51:59 <ddeja> as a starting point
09:52:27 <aspiers> ddeja: but driver b) requires a shim, and you want to eliminate that for mistral
09:52:46 <samP> ddeja: how do you propose to implement drivers?
09:53:00 <ddeja> aspiers: and for anything else, that cannot understand the standard http message
09:53:36 <aspiers> but if you don't provide the shim, the only way for each VM monitor implementation to be compatible with mistral is by talking directly to it
09:53:51 <ddeja> aspiers: yes, I want to eliminate it, but let's make it work in one standard way, then add the next implementation
09:53:58 <ddeja> that's how I see it
09:54:30 <ddeja> samP: I think of it as some standard base class, to which we would pass a given set of input parameters
09:54:39 <ddeja> and then call some method, like 'notify' on it
09:54:44 <ddeja> that's all
09:54:53 <ddeja> the rest is on the drivers writer side
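Editor's sketch: the "standard base class with a 'notify' method" described above might be shaped like this. It is a hedged illustration, not anything agreed in the meeting: the class names NotificationDriver and StandardHTTPDriver, the event dictionary, the injectable send callable and the URL in the usage note are all hypothetical.

```python
import abc
import json
import urllib.request


class NotificationDriver(abc.ABC):
    """Hypothetical base class: a monitor hands each event to notify()."""

    @abc.abstractmethod
    def notify(self, event):
        """Deliver one monitoring event to a recovery service."""


class StandardHTTPDriver(NotificationDriver):
    """Hypothetical driver that POSTs the event as JSON over HTTP."""

    def __init__(self, url, send=None):
        self.url = url
        # 'send' is injectable so the driver can be exercised offline.
        self._send = send or self._post

    @staticmethod
    def _post(url, body):
        request = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request) as response:
            return response.status

    def notify(self, event):
        body = json.dumps(event).encode("utf-8")
        return self._send(self.url, body)
```

A monitor would be configured with one driver instance, for example StandardHTTPDriver("http://recovery.example/notifications"), and would call notify() with whatever information the spec says the monitor provides. A service-specific driver (for Mistral, say) would subclass NotificationDriver and translate the same event itself, which is how the separate shim process could be avoided.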
09:55:01 <aspiers> ddeja: oh, so you are saying we *would* write a shim for initial case?
09:55:15 <aspiers> ddeja: and then later optimise by adding mistral-specific driver?
09:55:23 <ddeja> aspiers: maybe
09:55:28 <aspiers> that could work
09:55:40 <ddeja> or maybe just make it work with only Masakari at first
09:56:01 <ddeja> or we can write a driver at nearly the same time, it shouldn't be a big deal
09:56:24 <samP> ddeja: OK, got it.
09:56:31 <ddeja> what is more, with drivers it may be easier to be compliant with someone's existing solution
09:56:52 <ddeja> he just uses/writes a driver that is compliant with the existing recovery workflow
09:57:02 <ddeja> service
09:57:21 <aspiers> yes I'm definitely in favour of drivers where it makes sense
09:57:37 <aspiers> I'm just trying to figure out what should be in the specs though
09:57:39 <ddeja> then switches to Masakari/Mistral (are there any other standards?)
09:57:48 <ddeja> and switches the driver
09:58:00 <ddeja> aspiers: I don't think it must be specified in the spec itself
09:58:12 <aspiers> right
09:58:15 <ddeja> maybe just information about 'modular architecture'
09:58:33 <ddeja> so it would be easy to notify various recovery services?
09:58:43 <samP> IMO, the spec should contain all the info that can be obtained from monitors. Then drivers can choose what they want
09:59:00 <ddeja> samP: that's what I was thinking too :)
09:59:43 <aspiers> ok, still not sure I follow 100% but we're out of time so let's just continue working in gerrit :)
09:59:56 <aspiers> good discussion though, thanks!
10:00:10 <samP> yep. thanks ddeja
10:00:11 <ddeja> aspiers: I'll make some diagram to show my idea :)
10:00:17 <aspiers> ddeja: perfect!
10:00:19 <ddeja> thanks guys!
10:00:22 <aspiers> thanks :)
10:00:28 <aspiers> #endmeeting