09:00:08 <aspiers> #startmeeting ha
09:00:09 <openstack> Meeting started Mon Mar 21 09:00:08 2016 UTC and is due to finish in 60 minutes.  The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:12 <openstack> The meeting name has been set to 'ha'
09:00:22 <aspiers> hi
09:00:30 <aspiers> who do we have today?
09:00:36 <ddeja> o/
09:01:10 <aspiers> if it's just us two, we can discuss our talk ;-)
09:01:16 <ddeja> yup
09:01:37 <ddeja> but maybe someone else will show, we can wait 2 minutes more ;)
09:01:44 <aspiers> sure
09:04:19 <ddeja> ok, it looks like it's really us two, aspiers
09:04:22 <aspiers> yep
09:04:24 <aspiers> no problem
09:04:47 <aspiers> well let's use normal structure anyway, it will work fine
09:04:51 <aspiers> #topic Current status (progress, issues, roadblocks, further plans)
09:04:58 <aspiers> not much from my end
09:05:06 <aspiers> I did a few tweaks to openstack-resource-agents
09:05:13 <aspiers> and reported some HA bugs
09:05:23 <aspiers> e.g. http://bugs.clusterlabs.org/show_bug.cgi?id=5271
09:05:24 <openstack> bugs.clusterlabs.org bug 5271 in Documentation "usage of status action in OCF RAs needs clarifying or eliminating" [Normal,Unconfirmed] - Assigned to oalbrigt
09:05:43 <ddeja> so only a short status from my side - I have prepared images for the "Mistral HA" demo. I haven't completed it yet since I was working on getting a US visa
09:05:47 <aspiers> when I noticed that most of the RAs have the status action, and that's pointless
09:05:49 <aspiers> so I will drop it
09:05:57 <ddeja> ok
09:06:22 <aspiers> what type of images?
09:06:34 <ddeja> oh, images == pictures
09:06:39 <aspiers> oh, heh
09:06:49 <ddeja> like diagrams in visio ;)
09:06:51 <aspiers> too early in the morning for me ;-)
09:06:57 <aspiers> cool!
09:07:16 <aspiers> I guess you also took a look at reveal.js, right?
09:07:26 <aspiers> or at least at Florian's talk
09:07:48 <ddeja> yes, I have watched Florian's talk, didn't have time to play with the technology itself but I surely will this week
09:07:52 <aspiers> ok
09:08:03 <aspiers> I was also working on our neutron-ha-tool OCF RA which we use in SUSE OpenStack Cloud 5 and 6 for neutron HA
09:08:26 <aspiers> I guess for the next release we will probably switch to the standard upstream approach
09:08:39 <ddeja> that's good
09:09:01 <aspiers> yeah
09:09:08 <aspiers> or at least in the DVR case
09:09:35 <aspiers> oh
09:09:43 <aspiers> I just remembered
09:09:48 <ddeja> hm?
09:10:06 <aspiers> last Tuesday was the cross-project meeting
09:10:20 <aspiers> that was last week, right?
09:10:28 <ddeja> I think so
09:10:37 <aspiers> I guess most people read the minutes / logs the next day
09:10:41 <bogdando> hi
09:10:43 <aspiers> but I should mention it for completeness
09:10:46 <aspiers> oh hey bogdando :)
09:10:51 <ddeja> hello :)
09:11:11 <aspiers> so we talked about the auto-evacuation spec
09:11:30 <aspiers> #link https://review.openstack.org/#/c/257809/2
09:11:44 <aspiers> there seemed to be a LOT of interest in the topic
09:12:06 <aspiers> and I explained why I thought it was cross-project
09:12:43 <aspiers> however it was decided that whilst this is important work, a high-level cross-project spec doesn't really make sense
09:13:01 <aspiers> I think specs are supposed to have concrete low-level action items
09:13:12 <aspiers> rather than be technical strategy documents
09:13:26 <masahito> hi
09:13:30 <aspiers> so I think the spec will be abandoned
09:13:35 <aspiers> hi masahito :)
09:13:52 <aspiers> I was just summarising the cross-project meeting
09:14:28 <masahito> got it. I'll check eavesdrop.
09:14:33 <aspiers> http://eavesdrop.openstack.org/meetings/crossproject/2016/crossproject.2016-03-15-21.01.log.html
09:14:47 <aspiers> http://eavesdrop.openstack.org/meetings/crossproject/2016/crossproject.2016-03-15-21.01.html
09:15:01 <bogdando> any action items to the spec decomposition then?
09:15:04 <aspiers> #info the conclusion was: keep working on it and collaborating with the different projects
09:15:09 <bogdando> we shall not just abandon and forget :)
09:15:34 <aspiers> #info and submit specs to individual projects when there are APIs missing
09:16:30 <aspiers> bogdando: well I guess until now, each group was working on their individual solutions
09:16:44 <bogdando> and looking to the comments, I believe it was a good idea to make it CP
09:16:54 <aspiers> but maybe we need to start collaborating on the convergence
09:17:06 <bogdando> otherwise we'd never have collected so many comments from folks from different areas / projects
09:17:11 <ddeja> aspiers: I think some groups are still working on their solutions
09:17:22 <aspiers> ddeja: yes, I think we all are
09:17:36 <aspiers> ddeja: maybe it is still too early
09:17:46 <aspiers> but we could start thinking about convergence action items
09:18:19 <bogdando> so, CP is eactly perfect place for the *high-level* concepts
09:18:26 <aspiers> #topic how to evaluate possible convergence strategies for auto evac
09:18:32 <bogdando> exactly
09:18:53 <aspiers> bogdando: high-level architectural discussion is cross-project for sure
09:19:04 <aspiers> but apparently specs are not the way to do it
09:19:38 <aspiers> maybe because that reaches too wide an audience
09:19:43 <bogdando> So I'd prefer we finish that instead of allowing the initiative to be split into detached local activities
09:19:59 <aspiers> I think a wiki or etherpad is fine
09:20:06 <aspiers> and mailing list / IRC
09:20:21 <bogdando> but only spec provides a path for future code reviews...
09:20:24 <aspiers> or we could move the spec somewhere else, e.g. mistral
09:20:39 <aspiers> several people said they thought it looked like a mistral spec
09:20:54 <bogdando> maybe yes
09:21:10 <ddeja> aspiers: from my experience with Mistral I don't think such a spec fits there
09:21:27 <aspiers> ddeja: yeah I'm not convinced by that either
09:21:39 <ddeja> since (despite some bugs) Mistral is only a Workflow executor
09:21:59 <ddeja> but there were some plans of having 'very good, reliable workflows' in the Mistral repo
09:22:00 <aspiers> Sean Dague said "A cross project spec should either be a thing which affects nearly all openstack projects, or a thing that all the projects involved have agreed to already"
09:22:01 <bogdando> yes, we cannot design things like fencing there
09:22:13 <bogdando> or nova API changes, if any
09:22:35 <aspiers> bogdando: I think the point with nova API changes was that currently it is believed to do everything we need
09:22:52 <aspiers> so in that sense, nova is not involved until we find a bug or a gap in functionality
09:22:54 <ddeja> bogdando: for now, fencing is done outside OpenStack, so I don't know if we should discuss fencing at all
09:23:04 <bogdando> so we may end up having 3 specs - Mistral (if anything is to be changed in Mistral), Nova (ditto), OpenStack resource agents space?
09:23:23 <ddeja> IMO we should just state that fencing must be configured
09:23:32 <aspiers> ddeja: that's a good point - fencing will probably never be inside OpenStack (unless using Ironic / Triple-O?)
09:23:43 <bogdando> yes, we can just note and leave out of scope
09:23:57 <aspiers> but fencing is a crucial part of the architecture of course
09:23:57 <ddeja> yeah, also we can provide OCF agent for Mistral
09:24:20 <aspiers> ddeja: that's a great example of something which deserves a spec in openstack-resource-agents :)
09:24:21 * ddeja has OCF agent that calls mistral API prepared
09:24:25 <bogdando> indeed, new OCF RA for Mistral fits well into the latter of 3 specs
09:24:27 <aspiers> which reminds me I need to set up a specs repo
09:24:37 <aspiers> #action aspiers to set up a specs repo for openstack-resource-agents
09:25:05 <bogdando> and the rest might just be put into the HA guide
09:25:07 <aspiers> (we also need that for planning Fuel reconvergence)
09:25:23 <aspiers> bogdando: hmm, I think it's maybe too WIP for the HA guide now?
09:25:24 <bogdando> like - make sure you configured pcmk that way, and enabled fencing
09:25:36 <aspiers> oh, you mean just fencing?
09:25:43 <bogdando> yes, if we have a clear vision, nothing blocks us
09:25:59 <bogdando> everything you mentioned would not go as a part of OpenStack setup/op
09:26:16 <ddeja> since everyone is using pacemaker for fencing, it can be mentioned in HA guide
09:26:22 <aspiers> TBH I think we need an architecture diagram
09:26:30 <bogdando> we have a section for controllers HA setup
09:26:36 <aspiers> which maps all the required components of auto-evac
09:26:44 <ddeja> +1
09:26:45 <bogdando> we could add there all details near to the existing pacemaker/corosync sections
09:26:57 <aspiers> maybe we could use google drive to collaborate on drawing on?
09:27:00 <aspiers> *one
09:27:05 <bogdando> and add details on how one shall configure pcmk remote for compute HA, for example
09:27:36 <aspiers> or some other tool which is like "ether-visio"
09:27:36 <bogdando> and we can add diagrams
09:27:40 <bogdando> we have many now :)
09:28:14 <aspiers> ddeja: I think we need this map for our talk anyway ;-)
09:29:03 <bogdando> we have bright examples Pacemaker Cluster Manager http://docs.openstack.org/ha-guide/intro-ha-arch-pacemaker.html and Keepalived http://docs.openstack.org/ha-guide/intro-ha-arch-keepalived.html architecture details and limitations
09:29:08 <ddeja> aspiers: you mean something like that https://github.com/gryf/mistral-evacuate/blob/master/Automatic%20evacuate%20design.jpg
09:29:15 <aspiers> bogdando: are there diagrams in the ha-guide?
09:29:20 <aspiers> bogdando: oh... thanks :)
09:29:22 <bogdando> :)
09:29:47 <ddeja> so it looks like we have enough diagrams ;)
09:30:02 <aspiers> bogdando: is there a standard way to produce diagrams for upstream docs?
09:30:10 <bogdando> but we need to add more that are specific to instance HA + pacemaker remote now
09:30:15 <bogdando> I have no idea ;(
09:30:24 <bogdando> let's ask openstack docs folks
09:30:31 <aspiers> good idea
09:30:38 <aspiers> the neutron docs have tons of great diagrams
09:30:44 <aspiers> I wonder how they collaborate on them
09:30:45 <bogdando> or Andrew Beekhof, who probably created those above
09:30:47 <bogdando> :)
09:31:51 <beekhof> what i do?
09:31:54 <aspiers> Google Drawings is probably the easiest way
09:31:58 <aspiers> beekhof: it's all your fault!
09:32:08 <bogdando> so do we agree we can start working on the docs update w/o waiting for accepted implementations?
09:32:19 <bogdando> as it seems 100% to be pcmk_remote with OCF RA
09:32:21 <ddeja> beekhof: out of nowhere!
09:32:24 <aspiers> :)
09:32:39 * beekhof was cooking dinner - it's been one of those days
09:32:42 <bogdando> beekhof, those? Pacemaker Cluster Manager http://docs.openstack.org/ha-guide/intro-ha-arch-pacemaker.html and Keepalived http://docs.openstack.org/ha-guide/intro-ha-arch-keepalived.html architecture details and limitations
09:33:00 <beekhof> yep, i made those
09:33:00 <aspiers> bogdando: what do you mean by docs update? a new section in ha-guide?
09:33:06 <bogdando> yes
09:33:23 <bogdando> to cover everything that's missing from the setup required for instance HA
09:33:36 <bogdando> and of that we are all 100% sure
09:33:58 <ddeja> fencing is such a thing
09:34:00 <bogdando> like, pacemaker_remote, fencing, Mistral OCF RA
09:34:21 <masahito> I found a good section.
09:34:26 <masahito> http://docs.openstack.org/ha-guide/compute-node-ha-api.html
09:34:30 <bogdando> exactly
09:35:01 <aspiers> bogdando: well, are we 100% sure on Mistral? for me it is a very good option, but I think we still have to do a lot of work and testing to be 100%
09:35:15 <bogdando> we can skip Mistral then
09:35:22 <aspiers> that's still implementation details
09:35:33 <aspiers> I think for now, only architecture should be covered in ha-guide
09:35:38 <beekhof> those diagrams were made in keynote (apple application like powerpoint)
09:35:56 <bogdando> hm, I'm only trying to find a place for things that will not go into any specs but still must be known (how-to)
09:35:56 <beekhof> i can probably export it into powerpoint which google docs might import
09:36:12 <aspiers> bogdando: yes, that is the challenge we need to figure out :)
09:36:21 <ddeja> aspiers: as far as I know triple-O guys are willing to use Mistral
09:36:31 <aspiers> bogdando: I think probably a wiki
09:36:47 <aspiers> bogdando: unless we want to use gerrit for review
09:37:22 <bogdando> do we agree on adding pacemaker_remote topics to the HA guide compute nodes HA?
09:37:35 <bogdando> it seems like 100% will go into the final solution
09:37:53 <masahito> bogdando: Just pacemaker_remote?
09:37:53 <bogdando> and fencing details!
09:37:55 <aspiers> bogdando: +1
09:38:13 <ddeja> for now I see no alternative - maybe Ironic would be able to do fencing someday...
09:38:36 <aspiers> agreed
09:38:58 <aspiers> bogdando: I think it would also be good to document that compute HA is still WIP
09:39:06 <aspiers> bogdando: the ha-guide could point to our community here
09:39:13 <bogdando> btw, aspiers I saw you asked about remote stonith agents, so maybe you could add some things you know now :)
09:39:20 <aspiers> bogdando: link to etherpad, weekly meetings etc.
09:39:32 <ddeja> aspiers: Big +1 on that
09:39:37 <bogdando> great then
09:39:42 <masahito> it sounds nice.
09:40:00 <aspiers> bogdando: can you take care of that? and add us as reviewers?
09:40:22 <bogdando> If I had time to play with setup verification
09:40:36 <aspiers> maybe we could attract more people to our community that way
09:40:56 <bogdando> it would be nice, though, if someone who already did that just shared results and notes
09:41:07 <bogdando> so I could expose them as the guide (and test them as well)
09:41:28 <aspiers> bogdando: I think the wiki is the right place to link to them
09:41:32 <masahito> aspiers: agree. I noticed I can't find our etherpad on google.
09:41:35 <aspiers> since they are WIP and change quickly
09:41:45 <ddeja> bogdando: you mean steps how to configure pacemaker_remote?
09:41:53 <aspiers> we can also use wiki for evolving arch/design docs
09:41:53 <ddeja> and fencing of compute nodes?
09:42:13 <bogdando> yes, and fencing agents to use with computes probably (w/o devices specific things)
09:42:31 <aspiers> bogdando: there's not much to say about stonith of remote nodes, it's all documented already
09:42:41 <aspiers> bogdando: I just failed to find the docs the first time
09:42:55 <bogdando> well I'd like to put only the very specific things, no cross posting
09:43:18 <bogdando> or a link if that just works the way we want
09:43:44 <bogdando> the idea is to document something verified, even if as PoC
09:44:17 <ddeja> I believe that the way RH explained how to set up stonith is OK
09:44:22 <ddeja> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Configuring_the_Red_Hat_High_Availability_Add-On_with_Pacemaker/ch-fencing-HAAR.html
09:44:52 <bogdando> folks, it should be not just common things, but things specific to OpenStack computes and pacemaker_*remote*
09:45:15 <bogdando> with existing limitations like stonith resources may not be running on the compute nodes
09:45:25 <bogdando> or which type of remote fencing agents to use
09:45:32 <bogdando> that we *do* recommend
09:45:47 <ddeja> ok, I see
09:46:04 <aspiers> I don't think there is a need to favour any type of fencing agent
09:46:11 <bogdando> like maximum of nodes supported in the cluster (16 afaik)
09:46:15 <aspiers> any is good
09:46:21 <bogdando> so, more practical things
09:46:26 <bogdando> useful for ops
09:47:14 <bogdando> so that would not be just a reference to the existing guides AFAICT
09:47:16 <aspiers> bogdando: in the period before we have a working solution to document, what do you think is the goal of this section in the ha-guide?
09:47:36 <aspiers> I'm not sure it makes sense to document how to do 20% of a full solution
09:47:41 <bogdando> the goal is to prepare the community for instance HA solutions
09:48:01 <bogdando> and shed some light on the required tooling, setups
09:48:15 <bogdando> there is not much about pacemaker remote IIUC
09:48:28 <bogdando> only a few tech talks, and many things WIP, am I right?
09:48:31 <ddeja> bogdando: I have a setup with fencing configured, but it's based on this https://access.redhat.com/articles/1544823 The only things I've added are a custom fencing agent (and Mistral of course)
09:48:35 <aspiers> IMHO it's fine to say "you will need pacemaker_remote"
09:48:43 <aspiers> and maybe link to docs on pacemaker_remote
09:48:57 <aspiers> but I'm not sure it makes sense to document details of how to set it up
09:48:59 <bogdando> okay, let's just try to draft something...
09:49:14 <aspiers> bogdando: yeah, please just submit a review and add us to cc
09:49:30 <bogdando> ok I'll try :)
09:49:30 <aspiers> bogdando: I will definitely review anything you cc me on :)
09:49:39 <aspiers> oops, that was a dangerous promise ;)
09:49:56 <ddeja> aspiers: prepare to review some fuel patches ;)
09:50:03 <aspiers> lol
09:50:19 <bogdando> haha
09:50:23 <aspiers> bogdando: I think the most important thing to add is info on our WIP
09:50:49 <aspiers> bogdando: i.e. http://eavesdrop.openstack.org/#High_Availability_Meeting
09:50:56 <aspiers> https://etherpad.openstack.org/p/automatic-evacuation
09:51:12 <aspiers> and maybe we need a wiki page which is a more friendly landing page for this topic
09:51:23 <aspiers> bogdando: also you could link to the user story
09:51:38 <aspiers> although that is the first link in the etherpad :)
09:52:46 <aspiers> #action bogdando to submit an ha-guide review adding info about community WIP on auto-evac
09:53:24 <aspiers> shall we also have a play with google drawing?
09:53:56 <aspiers> https://docs.google.com/drawings/d/1q50txuu3vVx2WadhWGAeSO25PaEy_DbmwhVryT4FXCY/edit?usp=sharing
09:54:14 <aspiers> #action anyone who wants to, to experiment with google drawing
09:54:29 <aspiers> I think an architecture map would really help us
09:55:09 <ddeja> I can paste the drawing I have prepared for my demo there, to have something to start with
09:55:16 <aspiers> ok
09:55:19 <bogdando> ddeja, great!
09:55:29 <aspiers> #topic AOB
09:55:33 <ddeja> It's Mistral-oriented, but it's still something to start
09:55:33 <aspiers> any other business before we finish?
09:55:59 <ddeja> wait one minute guys and tell if my drawing makes any sense ;)
09:56:33 <ddeja> Done, pasted
09:57:45 <masahito> I'll paste Masakari architecture.
09:57:57 <masahito> in the page.
09:58:07 <ddeja> but I don't know if it is anywhere close to what you guys have in mind :)
09:58:16 <haukebruno> well, hi \o/ being a _bit_ late, sorry :p
09:58:26 <aspiers> haha hi haukebruno, we are just finishing :)
09:58:30 <masahito> I think it's easy to compare both and others.
09:58:42 <ddeja> so feel free to delete it, I have it in another doc
09:58:46 <ddeja> masahito: +1
09:58:53 <bogdando> good point
09:58:54 <haukebruno> aspiers, yeah. I apologize... pretty bad timings today
09:59:10 <ddeja> I just wonder if we can add a second page in this drawing?
09:59:10 <aspiers> ddeja: I was thinking we need a process diagram
09:59:39 <aspiers> ddeja: I'll try to sketch something so it makes more sense
10:00:03 <aspiers> something a bit like https://github.com/ntt-sic/masakari/blob/master/contents/architecture.png
10:00:17 <aspiers> but more generic
10:00:24 <aspiers> not specific to any implementation
10:00:28 <aspiers> anyway, we are out of time
10:00:29 <ddeja> I see
10:00:36 <aspiers> let's continue on #openstack-ha
10:00:46 <aspiers> thanks all!
10:00:55 <aspiers> #endmeeting