09:02:36 <aspiers> #startmeeting ha
09:02:36 <openstack> Meeting started Mon Dec 14 09:02:36 2015 UTC and is due to finish in 60 minutes.  The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:38 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:42 <openstack> The meeting name has been set to 'ha'
09:02:46 <aspiers> #topic Current status (progress, issues, roadblocks, further plans)
09:03:11 <aspiers> alright, masahito you wanna start today?
09:03:17 <masahito> got it
09:04:10 <masahito> I want us to start thinking about milestones for the HA work.
09:04:21 <masahito> so I thought about how we can integrate Masakari and the Mistral workflow proposed by _gryf.
09:04:35 <aspiers> ok
09:04:38 <masahito> then, I wrote my idea in https://etherpad.openstack.org/p/automatic-evacuation at L12-32.
09:04:59 <aspiers> cool, thanks :)
09:05:14 * aspiers has a quick read
09:07:13 <aspiers> I'm not sure I fully understand the quick example
09:07:27 <aspiers> what's the difference between the two cases?
09:08:08 <masahito> the first one only disables nova-compute, and the instances on the host remain there.
09:08:25 <aspiers> #info masahito documented a proposal for short and long term goals for masakari and mistral, and convergence of the two
09:08:26 <beekhof> so technically you don't need to do anything, yet
09:08:44 <masahito> the second one is that the instances on the host will be moved to another host.
09:09:09 <aspiers> ok - but what's the goal of the first case?
09:10:04 <beekhof> vigilance?
09:10:07 <masahito> in that case, the admin tries to start nova-compute
09:10:56 <aspiers> ah ok, so e.g. if nova-compute simply crashed?
09:11:08 <masahito> aspiers: yes
09:11:34 <aspiers> ok. wouldn't pacemaker simply try to restart it then?
09:12:13 <masahito> pacemaker tries, but can't restart it.
09:12:51 <aspiers> then it will try to stop it -> fail -> fence
09:12:56 <aspiers> -> failover
09:13:03 <aspiers> is there a use case where manual intervention would be needed?
09:13:40 <masahito> we have this use-case.
09:13:56 <aspiers> since manual intervention doesn't scale
09:13:56 <ddeja> hi all, sorry for being late
09:14:01 <aspiers> hi ddeja
09:14:13 <aspiers> ddeja: take a look at https://etherpad.openstack.org/p/automatic-evacuation L12-32
09:14:24 <ddeja> aspiers: OK
09:14:46 <aspiers> actually maybe I am talking rubbish, as usual for early on Monday mornings
09:15:01 <aspiers> the stop of nova-compute could succeed
09:15:39 * aspiers checks NovaCompute RA
09:16:15 <aspiers> in fact, it always succeeds
09:16:36 <beekhof> so it should :)
09:16:53 <aspiers> I can't even remember how this is supposed to work :-/
09:16:59 <aspiers> what triggers fencing?
09:17:04 <beekhof> unless kill -9 and rm -f have stopped working, there is little reason for stop to fail for any service
09:17:32 <aspiers> sorry, I'm not awake yet :((
09:18:16 <aspiers> I mean, if nova-compute fails to start correctly multiple times
09:18:30 <aspiers> it's just a clone, so it has nowhere to migrate to
09:18:31 <beekhof> in practice they often fail though. especially systemd units :-/
09:19:04 <aspiers> in this case it just stays down, no fencing and no evacuation, right?
09:19:13 <aspiers> that's a black spot in our current approach?
09:19:18 <aspiers> s/black/blind/
09:19:51 <aspiers> assuming that clone-max isn't set to less than the # of compute nodes
09:20:09 <beekhof> yep
09:20:13 <ddeja> I'm not up to speed on the topic - but if nova-compute fails to start, it shouldn't have any VMs?
09:20:14 <beekhof> well no
09:20:30 <aspiers> ddeja: we're discussing if it fails to *re*start
09:20:35 <aspiers> ddeja: sorry for not being clear
09:20:44 <beekhof> actually, correct. no evacuations because the node wasn't fenced
09:20:45 <aspiers> ddeja: e.g. if it crashes
09:20:55 <beekhof> which is... probably unfortunate
09:21:03 <aspiers> hmm
09:21:09 <ddeja> but if nova-compute crashes
09:21:17 <ddeja> its VMs are still alive, right?
09:21:27 <aspiers> ddeja: oh that's a good point
09:21:40 <ddeja> so there's no need to evacuate anything
09:22:01 <aspiers> and finally I understand masahito's use case :)
09:22:29 <aspiers> so in that case, we want the admin to manually rescue the machine, since we can't do anything useful automatically
09:22:35 <masahito> aspiers: ddeja: the use case I mentioned is one where the VMs are still alive.
09:22:41 <aspiers> right
09:22:41 <masahito> aspiers: yap :)
09:22:53 <aspiers> masahito: ok got it now, sorry for being slow ;-)
09:23:35 <masahito> aspiers: no problem. If I had responded more quickly, it would have gone faster.
09:23:37 <aspiers> so in this case the workflow is just "scream to the admins for help", right?
09:23:48 <beekhof> masahito: how do you check instance health without a functional nova?
09:24:23 <masahito> aspiers: It depends on how we use openstack.
09:24:49 <aspiers> beekhof: if we were using mistral, it could query nova to see if any instances were running there before nova-compute died
09:24:58 <masahito> If we use OpenStack as a public cloud, notifying the admin and the client is good.
09:25:14 <aspiers> if the host was empty, probably best route is to fence without evacuation
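(For context: a minimal sketch, not from the meeting, of the kind of nova-api query aspiers describes above, assuming python-novaclient with placeholder credentials and host names. As noted later in the discussion, nova's view can be stale once nova-compute has died, so this only shows what the host was believed to be running.)

```python
# Hypothetical sketch: ask nova-api which instances it believes live on
# the failed host, so a workflow can decide whether fencing without
# evacuation is enough. Endpoint, credentials and host name are
# placeholders; nova's view may be stale if nova-compute died a while ago.
from novaclient import client

nova = client.Client('2', 'admin', 'secret', 'admin',
                     'http://keystone:5000/v2.0')  # placeholder auth

def instances_on_host(hostname):
    # admin-only query: filter by host across all tenants
    return nova.servers.list(search_opts={'host': hostname,
                                          'all_tenants': 1})

if not instances_on_host('compute-1'):
    print("host looks empty; fencing without evacuation may be enough")
```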
09:25:41 <ddeja> Or we can check libvirt for VMs
09:25:41 <masahito> on the other hand, if we use openstack as a private cloud, like an EC site, I think it's better to fence the node.
09:25:49 <aspiers> since there is a small chance fencing / reboot would fix the nova-compute issue
09:26:03 <aspiers> ddeja: yes, that's probably even better
09:26:17 <ddeja> Oh, we can have a big problem if libvirtd crashes and fails to re-start...
09:26:39 <beekhof> aspiers: i think my point is, you probably need to fence+evacuate because you can't tell if the instances you've promised to keep running, are actually running
09:27:22 <ddeja> beekhof: on the other hand, if the instance is totally OK, you can make somebody very unhappy by doing this ;)
09:27:29 <beekhof> so yeah, the instances can run without nova, but no-one can know that if nova is dead
09:27:48 <ddeja> beekhof: you can check in libvirt
09:27:51 <beekhof> ddeja: it's lose-lose
09:27:58 <aspiers> beekhof: the question is whether it makes sense for mistral to monitor the libvirt layer out of band from pacemaker
09:27:59 <ddeja> or maybe
09:28:07 <ddeja> we can trigger live-migrate in such case
09:28:31 <ddeja> #link https://libvirt.org/migration.html
09:28:42 <ddeja> using libvirt
09:28:45 <aspiers> ddeja: I doubt that's doable without nova
09:28:55 <beekhof> i think there is scope for ${something} to be looking at libvirt, dont have much of an opinion who initiates it
09:28:57 <aspiers> ddeja: by circumventing openstack you break lots of things
09:29:04 <ddeja> yup
09:29:14 <beekhof> could be a pacemaker RA, could be something else
09:29:29 <beekhof> aspiers: even if we are R/O ?
09:29:35 <ddeja> I'm just looking for options to keep instances alive as long as possible and not kill them if they're still alive
09:29:46 <beekhof> oh, i see now
09:30:04 <aspiers> ddeja: agree 100% with that goal
09:30:21 <aspiers> well we already have monitoring of libvirtd, but really that's different
09:30:33 <aspiers> here we are talking about monitoring individual VMs via libvirtd
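(A rough sketch of the libvirt-level check being discussed, assuming the libvirt-python bindings are available on the compute host; it uses the read-only access beekhof asks about above and does not touch nova at all.)

```python
# Sketch only: list which domains are still running on this host via
# libvirt, independently of nova. Assumes libvirt-python is installed;
# a read-only connection is enough for monitoring.
import libvirt

def running_domains(uri='qemu:///system'):
    conn = libvirt.openReadOnly(uri)
    try:
        return [dom.name() for dom in conn.listAllDomains()
                if dom.isActive()]
    finally:
        conn.close()

print(running_domains())
```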
09:30:39 <beekhof> ddeja: the problem is that to make the "right" decision at every stage of the process, you usually need more knowledge than is available
09:30:41 <masahito> as we're discussing now, there are a lot of use cases. So my idea is that we offer the admin a choice between several use cases.
09:30:45 <beekhof> at least to an automated process
09:31:11 <aspiers> masahito: yes, sounds like a good first step is to do manual notification to the admin for sure
09:31:16 <beekhof> correct (should always) trump optimal
09:31:23 <ddeja> I think masahito has a good point; We can leave this decision to admin
09:31:43 <aspiers> beekhof: agreed, although hopefully we can achieve both eventually :)
09:31:48 <kazuIchikawa> In the current masakari implementation, the processmonitor does monitor libvirtd, and masakari just disables nova-compute without evacuation, fyi.
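(For readers unfamiliar with the mechanism: "disabling nova-compute without evacuation" roughly corresponds to the following python-novaclient call, sketched here with placeholder names; the host stops receiving new instances, nothing is fenced, and existing VMs keep running.)

```python
# Sketch: mark the compute service disabled so the scheduler stops
# placing new instances on it; no fencing, existing VMs are untouched.
from novaclient import client

nova = client.Client('2', 'admin', 'secret', 'admin',
                     'http://keystone:5000/v2.0')  # placeholder auth

nova.services.disable_log_reason('compute-1', 'nova-compute',
                                 'nova-compute down, pending admin action')
```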
09:31:58 <beekhof> as long as we can also opt to take automated action
09:32:30 <aspiers> yup
09:32:41 <aspiers> I guess there could be strange edge cases
09:32:51 <aspiers> e.g. nova-compute dies, and so do half of the VMs
09:33:08 <aspiers> in that case do you fence or not?
09:33:13 <masahito> I think a mistral template (sorry, I don't know the correct phrase) is a good tool for offering this.
09:33:25 <aspiers> it depends on whether the half which died were more or less important than the half which are still alive
09:33:43 <aspiers> masahito: I agree, it sounds like the kind of thing mistral could do a good job on
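(As an illustration of what handing this policy choice to Mistral could look like from the triggering side, a hedged sketch with python-mistralclient; the workflow name 'notify_or_fence' and its inputs are invented for this example and may differ from _gryf's actual proposal.)

```python
# Hypothetical sketch: hand the failure off to a pre-registered Mistral
# workflow and let the workflow encode the policy (notify only, or fence
# and evacuate). 'notify_or_fence' and its inputs are made-up names.
from mistralclient.api import client as mistral_client

mistral = mistral_client.client(
    username='admin', api_key='secret', project_name='admin',
    auth_url='http://keystone:5000/v2.0')  # placeholder auth

mistral.executions.create(
    'notify_or_fence',
    workflow_input={'host': 'compute-1', 'reason': 'nova-compute down'})
```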
09:34:18 <kazuIchikawa> aspiers: just disabling nova-compute at the scheduler level, no fence.
09:34:30 <beekhof> this is assuming we have something checking instances outside of openstack?
09:34:47 <aspiers> kazuIchikawa: if the VMs which died were critical but the VMs still alive were non-critical, fencing would be a better choice
09:35:01 <aspiers> kazuIchikawa: since it is more important to resurrect the dead VMs than keep the other ones alive
09:35:55 <aspiers> beekhof: well mistral could do it by querying either nova-api or libvirtd directly
09:36:04 <ddeja> aspiers: To sum up: we would like to have VM monitoring via libvirt? And in case of a nova-compute crash, make some decision based on its information?
09:36:31 <aspiers> ddeja: sounds like a good summary to me
09:36:47 <beekhof> aspiers: how can nova-api help if nova isn't on that compute node?
09:37:07 <aspiers> beekhof: for querying the current state - but it could be out of date, so it's probably not useful
09:37:29 <aspiers> I think it has to be via libvirtd
09:37:35 <aspiers> in case VMs died after nova-compute died
09:37:49 <aspiers> which is quite possible in that failure scenario (e.g. OOM killer)
09:37:58 <beekhof> agreed
09:38:11 <aspiers> actually OOM killer is probably quite a good example of the scenario we are discussing
09:38:21 <aspiers> well, maybe not
09:38:23 <kazuIchikawa> aspiers: I agree; if we confirm some of the instances died, then we should take some action such as a rebuild. It's done by the instancemonitor rather than the processmonitor in Masakari.
09:38:31 <aspiers> I guess nova-compute could potentially restart ok
09:38:52 <aspiers> kazuIchikawa: right. this is one area where masakari is ahead right now
09:39:48 <aspiers> #info scenario was discussed where nova-compute and possibly some/all VMs on that host die and can't be restarted
09:40:50 <aspiers> #agreed appropriate action (to fence or not to fence) in this scenario is policy-based and context-sensitive
09:41:16 <aspiers> #agreed a good first step is to automatically notify the cloud operators to request manual remediation
09:42:08 <aspiers> #info notification in this scenario could be handled by mistral
09:42:28 <aspiers> ok
09:42:33 <masahito> back to my update. If we agree, I want to move on to the work items.
09:42:33 <beekhof> by?
09:42:43 <beekhof> i'd have thought "to"
09:42:53 <gibi> alex_xu: could you please check https://review.openstack.org/#/c/243562/ ? I answered your comments
09:43:04 <aspiers> gibi: wrong channel :)
09:43:16 <gibi> aspiers: sorry :)
09:43:16 <beekhof> masahito: go ahead
09:43:22 <aspiers> gibi: np
09:43:27 <aspiers> masahito: yes please, go ahead
09:43:40 <masahito> I'm wondering where we should keep masakari repo.
09:43:55 <masahito> It's under the ntt-sic/ directory now, so only I and a few other people can merge to it.
09:44:08 <aspiers> ok
09:44:29 <masahito> If someone wants to help with it, I think the repo location is not good.
09:44:37 <aspiers> masahito: it's possible to make other people collaborators on github projects
09:45:11 <beekhof> yep
09:45:20 <beekhof> where it is doesn't affect that
09:45:31 <masahito> by issues or pull request?
09:45:38 <aspiers> masahito: both
09:45:40 <beekhof> either
09:45:52 <beekhof> they can push directly to the repo if you want them to
09:46:04 <beekhof> depends how much you trust them ;-)
09:46:14 <aspiers> hmm I wonder if it's different for organizations
09:46:18 <aspiers> then maybe you have to use teams?
09:46:20 <beekhof> no
09:46:41 <beekhof> pretty sure not
09:46:43 <masahito> Oh, I'm worried that only NTT guys can merge it.
09:47:03 <aspiers> I think masahito is probably right
09:47:06 <beekhof> should be possible to allow outsiders too
09:47:08 <masahito> if you all don't mind it, I'm ok.
09:47:15 <aspiers> you can't be in an org team unless you're in the org?
09:47:37 <beekhof> CLusterLabs is an org
09:47:50 <aspiers> yes but anyone can become a member of clusterlabs
09:47:53 <beekhof> its not really any different
09:47:58 <aspiers> not anyone can become a member of the NTT org :)
09:48:13 <masahito> aspiers: yap
09:48:20 <aspiers> unless they create a team called "non-NTT OpenStack people"
09:48:29 <aspiers> but I can totally understand them not wanting to do that
09:48:43 <aspiers> it might leak other privileges within the org
09:49:14 <aspiers> masahito: well you can either move to stackforge, or to another organization e.g. NTT-FOSS
09:49:35 <aspiers> masahito: I suggest we discuss this offline since we are running out of time in this meeting
09:49:41 <masahito> the reason I asked is that others who aren't in the meeting want to contribute code, but we don't have any consensus about it.
09:49:51 <masahito> aspiers: got it.
09:49:53 <aspiers> we need to give beekhof and ddeja a chance to report quick status updates
09:50:21 <aspiers> beekhof: you wanna go ahead?
09:50:53 <beekhof> ok
09:51:23 <beekhof> continued testing of the existing solution
09:51:58 <beekhof> there wasn't a lot of error handling in there, and now that there is... i'm really not convinced anything works
09:52:13 <aspiers> hehe, oh dear :)
09:52:16 <beekhof> i can trigger evacuations all day long
09:52:24 <beekhof> but what nova does with them...
09:52:33 <beekhof> i don't know its worth the trouble
09:52:57 <beekhof> also, making use of the force_down API is... problematic
09:53:12 <beekhof> various API versioning issues
09:53:23 <aspiers> beekhof: if you found several issues, any chance you could detail them in an etherpad/wiki or mail openstack-dev?
09:53:33 <_gryf> beekhof, actually it's a client problem
09:53:34 <aspiers> then we could examine and discuss next time?
09:53:42 <_gryf> the api is working just fine :)
09:53:49 <beekhof> _gryf did a nice writeup
09:53:55 <aspiers> oh
09:54:04 <beekhof> i filed a rh bug that included it
09:54:06 <aspiers> somehow I missed that
09:54:17 <beekhof> :)
09:54:21 <beekhof> i'll get the link
09:54:24 <aspiers> thanks
09:54:25 <beekhof> apart from that, general notice that RH will quite likely drop all the A/A openstack services from the pacemaker config
09:54:37 <aspiers> why?
09:54:43 <beekhof> they can use systemd
09:54:44 <ddeja> :o
09:54:52 <_gryf> oh my
09:55:02 <aspiers> how will they be monitored?
09:55:15 <beekhof> they keep telling us they don't need pacemaker, now they get the chance to show it :)
09:55:19 <beekhof> by systemd
09:55:27 <beekhof> because its wonderful
09:55:28 <aspiers> systemd doing API-level monitoring?
09:55:36 <beekhof> does everything a person could want
09:55:46 <aspiers> oh OK, so this is a decision elsewhere in RH, not by you?
09:55:54 <beekhof> oh, not at all
09:56:04 <ddeja> beekhof: Its... beautiful
09:56:07 <beekhof> i'm one of the architects of the plan
09:56:25 <aspiers> #info _gryf and beekhof found several issues with evacuation
09:56:40 <aspiers> hmm
09:56:46 <beekhof> https://bugzilla.redhat.com/show_bug.cgi?id=1291106
09:56:46 <openstack> bugzilla.redhat.com bug 1291106 in python-novaclient "force_down API not available" [Urgent,New] - Assigned to eglynn
09:56:52 <aspiers> thanks
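(For reference, a hedged sketch of the force_down-then-evacuate path being tested: force_down requires compute API microversion 2.11, and the bug above is exactly that python-novaclient did not yet expose it, so this assumes a client new enough to support it. Host names and the shared-storage assumption are placeholders.)

```python
# Sketch assuming a python-novaclient that exposes force_down
# (compute API microversion 2.11), the missing piece in the bug above.
from novaclient import client

nova = client.Client('2.11', 'admin', 'secret', 'admin',
                     'http://keystone:5000/v2.0')  # placeholder auth

host = 'compute-1'
# Mark the service as forced down so nova accepts evacuate requests
# without waiting for the service to time out on its own.
nova.services.force_down(host, 'nova-compute', True)

# Rebuild each instance from the dead host elsewhere; shared storage is
# assumed here, hence on_shared_storage=True.
for server in nova.servers.list(search_opts={'host': host,
                                             'all_tenants': 1}):
    nova.servers.evacuate(server, on_shared_storage=True)
```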
09:57:14 <aspiers> beekhof: any chance you could write up a few more details of that plan too?
09:57:23 <aspiers> sounds rather odd to me
09:57:33 <aspiers> though I know you love systemd ;-)
09:57:45 <beekhof> magnificent codebase
09:57:47 <aspiers> _gryf / ddeja: want to report anything in the last 2 mins? (sorry)
09:58:07 <_gryf> aspiers, nope
09:58:14 <aspiers> also we need to decide whether to have a meeting next week
09:58:31 <aspiers> I suggest we can have one even if some people might be away
09:58:33 <ddeja> my status: Bringing up my test-setup, to test Mistral on something better than devstack
09:59:18 <aspiers> cool
09:59:39 <aspiers> #info ddeja is preparing to test the Mistral PoC on something better than devstack
10:00:06 <aspiers> ok, not much to report from my side except fighting with CI
10:00:31 <aspiers> #info RH will probably move management of A/A openstack services from pacemaker to systemd
10:01:05 <aspiers> unfortunately we have to end there. sorry for not being a good chair this week - next time I'll try to structure the meeting more effectively!
10:01:15 <_gryf> :D
10:01:23 <aspiers> if we have topics to discuss we should move them to a discussion section after the status update section
10:01:31 <aspiers> lesson learned :)
10:01:57 <aspiers> alright, thanks everyone and of course we can continue discussing on #openstack-ha for anyone who is still free right now!
10:02:11 <aspiers> to everyone else, thanks for coming and bye for now :)
10:02:18 <ddeja> thanks, bye
10:02:22 <_gryf> bye!
10:02:24 <masahito> thanks!
10:02:28 <kazuIchikawa> bye
10:03:33 <aspiers> #endmeeting