09:02:34 #startmeeting ha
09:02:35 Meeting started Mon Aug 8 09:02:34 2016 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:37 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:39 The meeting name has been set to 'ha'
09:02:49 o/
09:02:50 ok hi everyone, and welcome to the "new" (old) meeting time
09:03:11 hi o/
09:03:17 please bear with me today if I am slow, as I am currently on two meetings at once :)
09:03:18 hi o/
09:03:30 but hopefully the other one will not require too much of my attention
09:03:37 o/
09:04:02 hi all, and welcome to any recent joiners :)
09:04:29 #topic Current status (progress, issues, roadblocks, further plans)
09:04:47 samP: thanks a lot for your spec! I am currently reviewing it
09:05:07 samP: if it's ok I will just upload a new patch set with some minor edits?
09:05:11 aspiers: Thanks
09:05:21 aspiers: sure
09:05:23 ok
09:05:49 btw I think my comment about a link in the index was wrong
09:05:53 it seems to have automatically linked
09:06:08 the HTML rendered view is viewable from the Jenkins job build link
09:06:20 http://docs-draft.openstack.org/17/352217/2/check/gate-openstack-resource-agents-specs-docs-ubuntu-xenial/84fd365//doc/build/html/specs/newton/approved/newton-instance-ha-vm-monitoring-spec.html
09:07:00 BTW this is the review we are talking about, for those who don't know: https://review.openstack.org/#/c/352217/
09:07:33 My status: only 3 days of regular work last week. Mostly adjusting to new role in mistral by doing lots of reviews, not directly related to VM HA unfortunately.
09:08:12 no problem :) good luck with the reviews
09:08:49 aspiers: about the review, I need your attention on "Proposed Change"
09:09:02 samP: sure, I will review the whole thing
09:09:15 aspiers: thanks
09:09:23 * ddeja will also review
09:09:34 ddeja: thanks
09:09:44 samP: but I think we probably already agreed on the approach - the masakari way seems best IIRC
09:10:40 aspiers: yes we did. Ok then, I will proceed with libvirt monitoring for VM monitoring
09:11:08 samP: I think the spec will be more about documenting this decision and then deciding the integration points
09:11:13 Now, I can complete the VM recovery spec
09:11:52 another update: I attended a meeting about cinder-volume active/active
09:11:52 aspiers: sure
09:11:57 I think I forgot to report that last week
09:12:24 there is still design discussion to be had, especially around fencing
09:12:39 they have a weekly meeting in case anyone is interested
09:12:55 aspiers: great. that's also a very important topic for us
09:12:56 aspiers: well, I can put you in contact with my colleague who is a Cinder core reviewer
09:12:58 https://etherpad.openstack.org/p/cinder-active-active-HA
09:13:10 and doing cinder A/A from liberty
09:13:20 and is sitting next to me in office ;)
09:14:09 oh, cool!
09:14:26 will he attend the future cinder meetings?
09:14:34 what is his name?
09:14:39 I'm sure he will
09:14:41 dulek
09:14:45 on IRC
09:15:23 maybe he can be added to the list of nicks to ping for that meeting?
09:15:48 he's on two weeks vacation starting today, so I'm not sure if he would attend.
09:16:41 ok
09:16:49 I'll add his nick for future meetings
09:16:56 OK
09:17:40 any other status reports?
09:19:00 other than spec, not from my side
09:19:06 ok
09:19:12 #topic Barcelona
09:19:23 IIRC, today is the deadline for voting for sessions
09:19:36 so if you haven't yet voted (like me), please do it quickly ;-)
09:20:44 also I got a reply from ttx about the HA track
09:20:58 as you can see on the list
09:21:37 aspiers: thanks for the mail
09:22:02 http://lists.openstack.org/pipermail/openstack-dev/2016-August/100679.html
09:22:19 then I followed up on openstack-operators
09:22:40 http://lists.openstack.org/pipermail/openstack-operators/2016-August/011154.html
09:22:44 but no replies yet :-(
09:23:01 so it looks like we will have to wait until after Barcelona
09:23:17 although maybe we can chase ops meetup organisers directly
09:23:54 aspiers: ops meetup at NY would be a good place
09:24:03 samP: yes, you are going right?
09:24:12 aspiers: yes,
09:24:17 samP: maybe you could talk to them about this last email?
09:25:24 aspiers: I'm discussing with some ops people, hopefully I can spread the word and ask for some support
09:25:30 perfect!
09:25:37 on the compute HA topic, I think we should set some goals to achieve before Barcelona
09:25:44 IMHO minimum would be:
09:25:51 - finish all specs (including cross-project spec)
09:26:01 - update HA guide with status quo
09:26:26 - start the ball rolling with at least a little bit of collaborative hacking :)
09:26:31 anything else?
09:27:03 I think no
09:27:46 shouldn't we document our roadmap? is that the presentation or the spec?
09:29:27 samP: good question
09:29:32 If the presentation gets accepted, then that will be the doc for future roadmap
09:29:54 samP: yes, although the specs will also define the roadmap
09:30:09 are we still going?
09:30:14 beekhof: yes
09:30:20 i have a few moments :)
09:30:30 beekhof: anything you wanna bring up?
09:30:35 Barcelona or otherwise?
09:30:43 i should probably do a docs blueprint
09:31:00 ok
09:31:27 beekhof: please add me for review when you submit it
09:31:29 also, are there any grievances about my arch proposals?
09:31:50 beekhof: yes :)
09:32:22 beekhof: thanks
09:32:23 beekhof: but I think I mentioned most of them privately already, so no huge surprises
09:32:55 beekhof: IIRC, the main ones are monitoring and DRBD8
09:32:56 ok, so only disagreements about things you're wrong about :)
09:33:21 if you don't want real monitoring, that's fine by me ;-p
09:33:24 i meant to ask... you're using drbd for the database?
09:33:32 some customers, not all
09:33:36 shared storage also supported
09:33:38 you're the one without real monitoring :)
09:34:29 unless you replace all the systemd scripts with OCF, i would argue that "pacemaker" monitoring is worse than nagios and friends
09:34:44 BUT
09:34:47 agreed, and that's what we'll do
09:35:03 you'll maintain OCF agents?
09:35:09 #topic control plane architecture
09:35:21 beekhof: I'm the maintainer already
09:35:36 but for every single openstack daemon?
09:35:41 sure
09:35:53 beekhof: not claiming to be doing a great job yet, but I believe in the mission
09:36:08 * beekhof runs from the mission
09:36:15 pid-only monitoring is dumb
09:36:21 agreed
09:37:23 in any case, i think we can structure the docs to accommodate both variants with minimal problems
09:37:35 someone from your side is prepared to write that bit?
09:39:12 sure
09:39:55 puzzled why you are choosing an approach you just agreed is dumb, but I won't try to stop you ;-)
09:40:16 I appreciate it makes things a lot simpler though
09:41:28 ok
09:41:32 #topic AOB
09:41:34 nagios etc won't be doing pid only monitoring either
09:41:36 anything else?
09:41:55 nagios won't be doing fencing either
09:41:56 but neither will pacemaker be doing pid only monitoring via systemd
09:42:32 no-one ever wants the node fenced because the openstack service died
09:42:51 we'll still fence (via pacemaker-remote failures) if the node as a whole goes down
09:43:04 AOB?
09:43:15 Any Other Business
09:43:18 ah
09:44:20 for some openstack services, I *definitely* want the node fenced if the service died and can't be restarted
09:45:58 which ones? because this point has been made very clear to me
09:46:02 cinder i guess?
09:47:24 aspiers: ^^^
09:47:59 anything stateful, at least
09:48:51 my understanding is that all the state is in rabbit queues
09:49:29 nope
09:49:52 and the daemons are just pulling jobs off a queue and processing them
09:50:44 the API daemons, yes
09:51:48 but maybe we should declare the meeting over before we continue this :)
09:52:26 OTOH, it might make for interesting reading in the minutes
09:52:35 so what are you planning to monitor via nagios?
09:54:00 beekhof: ^^^
09:57:00 i don't think it's nagios
09:57:12 they have some package in mind but the name escapes me
09:57:25 ok, same question applies though
09:57:28 but let's close the meeting and continue on #openstack-ha
09:57:33 happy to close the meeting for now.... bedtime run is imminent :)
09:57:40 ok :)
09:57:43 #endmeeting