09:02:36 #startmeeting ha
09:02:36 Meeting started Mon Dec 14 09:02:36 2015 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:38 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:42 The meeting name has been set to 'ha'
09:02:46 #topic Current status (progress, issues, roadblocks, further plans)
09:03:11 alright, masahito you wanna start today?
09:03:17 got it
09:04:10 I want us to start thinking about the milestones of the HA work.
09:04:21 so I thought about how we can integrate Masakari and the Mistral workflow proposed by _gryf.
09:04:35 ok
09:04:38 then, I wrote my idea in https://etherpad.openstack.org/p/automatic-evacuation at L12-32.
09:04:59 cool, thanks :)
09:05:14 * aspiers has a quick read
09:07:13 I'm not sure I fully understand the quick example
09:07:27 what's the difference between the two cases?
09:08:08 the first one only disables nova-compute, and the instances on the host remain there.
09:08:25 #info masahito documented a proposal for short and long term goals for masakari and mistral, and convergence of the two
09:08:26 so technically you don't need to do anything, yet
09:08:44 in the second one, the instances on the host will be moved to another host.
09:09:09 ok - but what's the goal of the first case?
09:10:04 vigilance?
09:10:07 in that case, the admin tries to start nova-compute
09:10:56 ah ok, so e.g. if nova-compute simply crashed?
09:11:08 aspiers: yes
09:11:34 ok. wouldn't pacemaker simply try to restart it then?
09:12:13 pacemaker tries, but couldn't restart it.
09:12:51 then it will try to stop it -> fail -> fence
09:12:56 -> failover
09:13:03 is there a use case where manual intervention would be needed?
09:13:40 we have this use case.
09:13:56 since manual intervention doesn't scale
09:13:56 hi all, sorry for being late
09:14:01 hi ddeja
09:14:13 ddeja: take a look at https://etherpad.openstack.org/p/automatic-evacuation L12-32
09:14:24 aspiers: OK
09:14:46 actually maybe I am talking rubbish, as usual for early on Monday mornings
09:15:01 the stop of nova-compute could succeed
09:15:39 * aspiers checks NovaCompute RA
09:16:15 in fact, it always succeeds
09:16:36 so it should :)
09:16:53 I can't even remember how this is supposed to work :-/
09:16:59 what triggers fencing?
09:17:04 unless kill -9 and rm -f have stopped working, there is little reason for stop to fail for any service
09:17:32 sorry, I'm not awake yet :((
09:18:16 I mean, if nova-compute fails to start correctly multiple times
09:18:30 it's just a clone, so it has nowhere to migrate to
09:18:31 in practice they often fail though. especially systemd units :-/
09:19:04 in this case it just stays down, no fencing and no evacuation, right?
09:19:13 that's a black spot in our current approach?
09:19:18 s/black/blind/
09:19:51 assuming that clone-max isn't set to less than the # of compute nodes
09:20:09 yep
09:20:13 I'm not up to speed on the topic - but if nova-compute fails to start, it shouldn't have any VMs?
09:20:14 well no
09:20:30 ddeja: we're discussing if it fails to *re*start
09:20:35 ddeja: sorry for not being clear
09:20:44 actually, correct. no evacuations because the node wasn't fenced
09:20:45 ddeja: e.g. if it crashes
09:20:55 which is... probably unfortunate
09:21:03 hmm
09:21:09 but if nova-compute crashes
09:21:17 its VMs are still alive, right?
09:21:27 ddeja: oh that's a good point
09:21:40 so there's no need to evacuate anything
09:22:01 and finally I understand masahito's use case :)
09:22:29 so in that case, we want the admin to manually rescue the machine, since we can't do anything useful automatically
09:22:35 aspiers: ddeja: the use case I mentioned is that the VM is still alive.
09:22:41 right
09:22:41 aspiers: yap :)
09:22:53 masahito: ok got it now, sorry for being slow ;-)
09:23:35 aspiers: no problem. If I had responded more quickly, it would have gone faster.
09:23:37 so in this case the workflow is just "scream to the admins for help", right?
09:23:48 masahito: how do you check instance health without a functional nova?
09:24:23 aspiers: It depends on how we use OpenStack.
09:24:49 beekhof: if we were using mistral, it could query nova to see if any instances were running there before nova-compute died
09:24:58 If we use OpenStack as a public cloud, notifying the admin and the client is good.
09:25:14 if the host was empty, probably the best route is to fence without evacuation
09:25:41 Or we can check in libvirt for VMs
09:25:41 on the other hand, if we use OpenStack as a private cloud, like an EC site, I think it's better to fence the node.
09:25:49 since there is a small chance fencing / reboot would fix the nova-compute issue
09:26:03 ddeja: yes, that's probably even better
09:26:17 Oh, we can have a big problem if libvirtd crashes and fails to restart...
09:26:39 aspiers: i think my point is, you probably need to fence+evacuate because you can't tell if the instances you've promised to keep running are actually running
09:27:22 beekhof: on the other hand, if the instance is totally OK, you can make somebody very unhappy by doing this ;)
09:27:29 so yeah, the instances can run without nova, but no-one can know that if nova is dead
09:27:48 beekhof: you can check in libvirt
09:27:51 ddeja: it's lose-lose
09:27:58 beekhof: the question is whether it makes sense for mistral to monitor the libvirt layer out of band from pacemaker
09:27:59 or maybe
09:28:07 we can trigger a live-migrate in such a case
09:28:31 #link https://libvirt.org/migration.html
09:28:42 using libvirt
09:28:45 ddeja: I doubt that's doable without nova
09:28:55 i think there is scope for ${something} to be looking at libvirt, don't have much of an opinion on who initiates it
09:28:57 ddeja: by circumventing openstack you break lots of things
09:29:04 yup
09:29:14 could be a pacemaker RA, could be something else
09:29:29 aspiers: even if we are R/O ?
09:29:35 I'm just looking for options to keep instances alive as long as possible and not to kill them if they're alive
09:29:46 oh, i see now
09:30:04 ddeja: agree 100% with that goal
09:30:21 well we already have monitoring of libvirtd, but really that's different
09:30:33 here we are talking about monitoring individual VMs via libvirtd
09:30:39 ddeja: the problem is that to make the "right" decision at every stage of the process, you usually need more knowledge than is available
09:30:41 as we're discussing now, there are a lot of use cases. So my idea is we offer the admin some use cases.
09:30:45 at least to an automated process
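For reference, the kind of out-of-band libvirt check being discussed here (asking the hypervisor on the compute node directly which guests survived a nova-compute crash) might look roughly like the sketch below, using the python libvirt bindings. A read-only connection is enough for this, which matches ddeja's R/O question; the connection URI and host name are illustrative assumptions, and this is not code from masakari or the Mistral PoC.

    # Illustrative sketch only: query libvirtd on a compute node directly,
    # bypassing nova, to see which guests are still running there.
    # The connection URI and host name are assumptions for the example.
    import libvirt

    def running_instances(host):
        # Read-only access is sufficient for monitoring.
        conn = libvirt.openReadOnly("qemu+ssh://root@%s/system" % host)
        try:
            alive = []
            for dom in conn.listAllDomains():
                state, _reason = dom.state()
                if state == libvirt.VIR_DOMAIN_RUNNING:
                    alive.append(dom.UUIDString())
            return alive
        finally:
            conn.close()

    if __name__ == "__main__":
        print(running_instances("compute-1"))

Whether something like this runs from a pacemaker RA or from a Mistral task is exactly the open question in the discussion above.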
09:31:11 masahito: yes, sounds like a good first step is to do manual notification to the admin for sure
09:31:16 correct (should always) trump optimal
09:31:23 I think masahito has a good point; we can leave this decision to the admin
09:31:43 beekhof: agreed, although hopefully we can achieve both eventually :)
09:31:48 In the current masakari implementation, the processmonitor does monitor libvirtd, and masakari just disables nova-compute without evacuation, FYI.
09:31:58 as long as we can also opt to take automated action
09:32:30 yup
09:32:41 I guess there could be strange edge cases
09:32:51 e.g. nova-compute dies, and so do half of the VMs
09:33:08 in that case do you fence or not?
09:33:13 I think a mistral template (sorry, I don't know the correct phrase) is a good tool to offer this.
09:33:25 it depends on whether the half which died were more or less important than the half which are still alive
09:33:43 masahito: I agree, it sounds like the kind of thing mistral could do a good job on
09:34:18 aspiers: just disabling nova-compute at the scheduler level, no fence.
09:34:30 this is assuming we have something checking instances outside of openstack?
09:34:47 kazuIchikawa: if the VMs which died were critical but the VMs still alive were non-critical, fencing would be a better choice
09:35:01 kazuIchikawa: since it is more important to resurrect the dead VMs than keep the other ones alive
09:35:55 beekhof: well mistral could do it by querying either nova-api or libvirtd directly
09:36:04 aspiers: To sum up: we would like to have vm-monitoring on libvirt? And in case of a nova-compute crash, make some decision based on its information?
09:36:31 ddeja: sounds like a good summary to me
09:36:47 aspiers: how can nova-api help if nova isn't on that compute node?
09:37:07 beekhof: for querying the current state - but it could be out of date, so it's probably not useful
09:37:29 I think it has to be via libvirtd
09:37:35 in case VMs died after nova-compute died
09:37:49 which is quite possible in that failure scenario (e.g. OOM killer)
09:37:58 agreed
09:38:11 actually OOM killer is probably quite a good example of the scenario we are discussing
09:38:21 well, maybe not
09:38:23 aspiers: I agree, if we confirmed some of the instances died, then we should take some action such as rebuild. It's done by the instancemonitor rather than the processmonitor in Masakari.
09:38:31 I guess nova-compute could potentially restart OK
09:38:52 kazuIchikawa: right. this is one area where masakari is ahead right now
09:39:48 #info scenario was discussed where nova-compute and possibly some/all VMs on that host die and can't be restarted
09:40:50 #agreed appropriate action (to fence or not to fence) in this scenario is policy-based and context-sensitive
09:41:16 #agreed a good first step is to automatically notify the cloud operators to request manual remediation
09:42:08 #info notification in this scenario could be handled by mistral
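To make the #agreed lines above concrete, the following is a purely illustrative sketch (not a masakari or Mistral interface) of what a policy-based, context-sensitive decision could look like once a check such as the libvirt sketch above has reported which instances survived. The notion of instances the operator has marked as critical comes from the exchange with kazuIchikawa; every name here is hypothetical.

    # Purely illustrative policy sketch, not an actual masakari or Mistral API.
    # Decide what to do after nova-compute on a host has died and cannot be
    # restarted, given which instances are still running according to libvirt.
    def decide_action(expected, alive, critical):
        """expected: instance UUIDs nova believes run on the host;
        alive: UUIDs still running according to libvirt;
        critical: UUIDs the operator has marked as must-keep-running."""
        dead = set(expected) - set(alive)
        if not expected:
            # Empty host: fencing loses nothing and a reboot may fix nova-compute.
            return "fence"
        if not dead:
            # All instances survived: don't kill them, notify the operators instead.
            return "notify-operators"
        if dead & set(critical):
            # Critical instances are down: resurrecting them matters more than
            # keeping the non-critical survivors alive.
            return "fence-and-evacuate"
        return "notify-operators"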
09:42:28 ok
09:42:33 back to my update. If we agree, I want to move the work items.
09:42:33 by?
09:42:43 i'd have thought "to"
09:42:53 alex_xu: could you please check https://review.openstack.org/#/c/243562/ ? I answered your comments
09:43:04 gibi: wrong channel :)
09:43:16 aspiers: sorry :)
09:43:16 masahito: go ahead
09:43:22 gibi: np
09:43:27 masahito: yes please, go ahead
09:43:40 I'm wondering where we should keep the masakari repo.
09:43:55 It's under ntt-sic/ now, so only I and a few other people can merge to it.
09:44:08 ok
09:44:29 If someone wants to help with it, I think the repo's location is not good.
09:44:37 masahito: it's possible to make other people collaborators on github projects
09:45:11 yep
09:45:20 where it is doesn't affect that
09:45:31 by issues or pull requests?
09:45:38 masahito: both
09:45:40 either
09:45:52 they can push directly to the repo if you want them to
09:46:04 depends how much you trust them ;-)
09:46:14 hmm I wonder if it's different for organizations
09:46:18 then maybe you have to use teams?
09:46:20 no
09:46:41 pretty sure not
09:46:43 Oh, I'm worried that only NTT guys can merge it.
09:47:03 I think masahito is probably right
09:47:06 should be possible to allow outsiders too
09:47:08 if you all don't mind it, I'm OK.
09:47:15 you can't be in an org team unless you're in the org?
09:47:37 ClusterLabs is an org
09:47:50 yes but anyone can become a member of clusterlabs
09:47:53 it's not really any different
09:47:58 not anyone can become a member of the NTT org :)
09:48:13 aspiers: yap
09:48:20 unless they create a team called "non-NTT OpenStack people"
09:48:29 but I can totally understand them not wanting to do that
09:48:43 it might leak other privileges within the org
09:49:14 masahito: well you can either move to stackforge, or to another organization e.g. NTT-FOSS
09:49:35 masahito: I suggest we discuss this offline since we are running out of time in this meeting
09:49:41 the reason I asked is that others who aren't in the meeting want to contribute code, but we don't have any consensus about it.
09:49:51 aspiers: got it.
09:49:53 we need to give beekhof and ddeja a chance to report quick status updates
09:50:21 beekhof: you wanna go ahead?
09:50:53 ok
09:51:23 continued testing of the existing solution
09:51:58 there wasn't a lot of error handling in there, and now that there is... i'm really not convinced anything works
09:52:13 hehe, oh dear :)
09:52:16 i can trigger evacuations all day long
09:52:24 but what nova does with them...
09:52:33 i don't know if it's worth the trouble
09:52:57 also, making use of the force_down API is... problematic
09:53:12 various API versioning issues
09:53:23 beekhof: if you found several issues, any chance you could detail them in an etherpad/wiki or mail openstack-dev?
09:53:33 <_gryf> beekhof, actually it's a client problem
09:53:34 then we could examine and discuss next time?
09:53:42 <_gryf> the api is working just fine :)
09:53:49 _gryf did a nice writeup
09:53:55 oh
09:54:04 i filed an RH bug that included it
09:54:06 somehow I missed that
09:54:17 :)
09:54:21 i'll get the link
09:54:24 thanks
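For reference, the force_down call being discussed is a compute API microversion 2.11 feature; since the python-novaclient of the time did not expose it (the bug linked a little further down), one workaround was to hit the REST API directly. A rough sketch follows; the endpoint, token handling and host name are illustrative assumptions, and authentication is assumed to happen elsewhere.

    # Rough sketch of calling nova's force-down API over REST, working around
    # the missing python-novaclient support discussed above.  Endpoint, token
    # and host name are illustrative assumptions.
    import requests

    NOVA_ENDPOINT = "http://controller:8774/v2.1"  # hypothetical endpoint
    TOKEN = "..."  # keystone token obtained elsewhere

    resp = requests.put(
        NOVA_ENDPOINT + "/os-services/force-down",
        headers={
            "X-Auth-Token": TOKEN,
            # forced_down was added in compute API microversion 2.11
            "X-OpenStack-Nova-API-Version": "2.11",
            "Content-Type": "application/json",
        },
        json={"host": "compute-1", "binary": "nova-compute", "forced_down": True},
    )
    resp.raise_for_status()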
09:54:25 apart from that, general notice that RH will quite likely drop all the A/A openstack services from the pacemaker config
09:54:37 why?
09:54:43 they can use systemd
09:54:44 :o
09:54:52 <_gryf> oh my
09:55:02 how will they be monitored?
09:55:15 they keep telling us they don't need pacemaker, now they get the chance to show it :)
09:55:19 by systemd
09:55:27 because it's wonderful
09:55:28 systemd doing API-level monitoring?
09:55:36 does everything a person could want
09:55:46 oh OK, so this is a decision elsewhere in RH, not by you?
09:55:54 oh, not at all
09:56:04 beekhof: It's... beautiful
09:56:07 i'm one of the architects of the plan
09:56:25 #info _gryf and beekhof found several issues with evacuation
09:56:40 hmm
09:56:46 https://bugzilla.redhat.com/show_bug.cgi?id=1291106
09:56:46 bugzilla.redhat.com bug 1291106 in python-novaclient "force_down API not available" [Urgent,New] - Assigned to eglynn
09:56:52 thanks
09:57:14 beekhof: any chance you could write up a few more details of that plan too?
09:57:23 sounds rather odd to me
09:57:33 though I know you love systemd ;-)
09:57:45 magnificent codebase
09:57:47 _gryf / ddeja: want to report anything in the last 2 mins? (sorry)
09:58:07 <_gryf> aspiers, nope
09:58:14 also we need to decide whether to have a meeting next week
09:58:31 I suggest we can have one even if some people might be away
09:58:33 my status: bringing up my test setup, to test Mistral on something better than devstack
09:59:18 cool
09:59:39 #info ddeja is preparing to test the Mistral PoC on something better than devstack
10:00:06 ok, not much to report from my side except fighting with CI
10:00:31 #info RH will probably move management of A/A openstack services from pacemaker to systemd
10:01:05 unfortunately we have to end there. sorry for not being a good chair this week - next time I'll try to structure the meeting more effectively!
10:01:15 <_gryf> :D
10:01:23 if we have topics to discuss we should move them to a discussion section after the status update section
10:01:31 lesson learned :)
10:01:57 alright, thanks everyone and of course we can continue discussing on #openstack-ha for anyone who is still free right now!
10:02:11 to everyone else, thanks for coming and bye for now :)
10:02:18 thanks, bye
10:02:22 <_gryf> bye!
10:02:24 thanks!
10:02:28 bye
10:03:33 #endmeeting