08:05:34 #startmeeting ha
08:05:35 Meeting started Wed Mar 29 08:05:34 2017 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:05:36 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
08:05:38 The meeting name has been set to 'ha'
08:05:39 let's have a quick chat
08:05:47 maybe about the architecture diagram
08:06:01 sounds like a good place to start
08:06:37 so the workflows are all done?
08:06:47 no
08:06:59 oh
08:07:16 but does it make sense to you?
08:07:21 are mistral workflows HA yet?
08:07:21 as a way forward
08:07:35 I don't think so, let me check
08:07:55 #topic next generation instance HA architecture
08:08:17 #link https://drive.google.com/file/d/0B8tqeaAn45VOWEh4RmgzVEJ1Ums/view?usp=sharing
08:08:34 what was fence_evacuate?
08:08:36 #info aspiers drew this diagram, make sure you open in draw.io to view properly
08:08:53 that was ddeja's old fencing agent
08:08:59 or at least in a size that doesn't require a magnifying glass :)
08:09:11 https://github.com/gryf/mistral-evacuate/blob/master/fence_evacuate.py
08:09:41 nova compute wasn't talking to fence agents in our deployments
08:09:48 it's set to A3 paper size, I have no idea why it displays so small
08:10:39 you sure about that? :) https://github.com/openstack/openstack-resource-agents/blob/master/ocf/nova-compute-wait#L279
08:11:03 this was one of the decouplings which you and I agreed in one of these meetings a month or two
08:11:03 of course not, i always talk out of my butt :)
08:11:08 lol
08:11:42 thats compute _wait_ though. different agent
08:11:51 it was the same in NovaCompute
08:12:02 you just split NovaCompute into systemd + nova-compute-wait
08:12:05 we dont use that agent :)
08:12:10 right
08:12:26 nothing else about this aspect of the architecture is different
08:12:41 anyhooo... devil's advocate... if we get masakari in there, why both with fence_compute and attrd?
08:12:52 s/both/bother/
08:13:00 #info previous architecture discussions in http://eavesdrop.openstack.org/meetings/ha/2017/ha.2017-02-01-09.20.log.html
08:13:15 because we still need to queue evacuation work somewhere reliable
08:13:20 dude, stop using my own words against me :)
08:13:39 didnt they already have the problem solved?
08:14:25 the idea is to decouple the failure monitor/notifier from the failure recovery controller
08:14:47 which would allow mix'n'match architectures
08:15:26 e.g. mistral without masakari, or masakari without mistral, or masakari+mistral, or something completely different (who knows, maybe Senlin can recover stuff)
08:15:38 so fence_compute can lose all the talking to nova functionality
08:16:03 and nova evacuate becomes multi-evac and talks to masakari
08:16:05 y?
08:16:15 yes
08:16:26 or to whatever else
08:17:04 is masakari monitoring VMs in this scenario?
08:17:15 or are we still at compute node failures only
08:17:23 this diagram is just for the latter
08:17:54 but yes I think the way masakari monitors VMs is good
08:18:08 if we're going to switch, i need to be able to sell it. VM monitoring might be the key to that
08:18:26 sure, exactly the same for me
08:19:03 I think masakari has lots of really good aspects
08:19:10 it's much more OpenStacky
08:19:11 do those requests need to be stored in attrd?
08:19:46 well they're not requests, just failures which still need to be delegated to a recovery workflow controller
08:19:54 gotta run. i'll be back in ~30 though
08:19:57 the controller decides what to do with them
08:20:01 hmmm
08:20:10 lets continue this soon
08:20:15 that gives us way more flexibility for different policies
08:20:20 ok, I'll close this meeting for now then
08:20:27 thanks - see you in #openstack-ha in 30
08:22:13 BTW for the record, https://blueprints.launchpad.net/mistral/+spec/mistral-ha was closed as obsolete in January
08:22:32 IIRC this was discussed in one of our weeklies around that time
08:22:36 #endmeeting
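
Editor's note: the decoupling discussed in this meeting (a failure monitor/notifier that only records failed compute hosts somewhere reliable, and a separate, swappable recovery controller that decides what to do with them) might be sketched roughly as below. This is only an illustration under stated assumptions: `FailureQueue`, `record`, `drain`, and `picky_controller` are hypothetical names invented here, the in-memory list merely stands in for a reliable store such as attrd, and nothing in it reflects the actual APIs of Pacemaker, fence_compute, Masakari, or Mistral.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FailureQueue:
    """Hypothetical stand-in for a reliable store of unhandled host failures."""

    pending: List[str] = field(default_factory=list)

    def record(self, host: str) -> None:
        # The notifier side only records the failure; it never talks to nova.
        # Duplicate reports for the same host are ignored.
        if host not in self.pending:
            self.pending.append(host)

    def drain(self, recover: Callable[[str], bool]) -> List[str]:
        # Delegate each pending failure to whichever recovery controller is
        # plugged in. Hosts whose recovery fails stay queued for a retry.
        recovered = [host for host in self.pending if recover(host)]
        self.pending = [h for h in self.pending if h not in recovered]
        return recovered


def picky_controller(host: str) -> bool:
    # Hypothetical recovery controller (this could be Masakari, a Mistral
    # workflow, or anything else). It pretends compute-2 cannot be recovered.
    return host != "compute-2"


queue = FailureQueue()
queue.record("compute-1")
queue.record("compute-2")
queue.record("compute-1")  # duplicate report, ignored

recovered = queue.drain(picky_controller)
print("recovered:", recovered)
print("still pending:", queue.pending)
```

Because the queue knows nothing about how recovery happens, swapping `picky_controller` for a different callable is the only change needed to try another recovery policy, which is the "mix'n'match" property raised at 08:14:47.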