#openstack-trove log

18:00:46 <SlickNik> #startmeeting trove-bp-review
18:00:47 <openstack> Meeting started Mon Sep 15 18:00:46 2014 UTC and is due to finish in 60 minutes.  The chair is SlickNik. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:51 <openstack> The meeting name has been set to 'trove_bp_review'
18:00:55 <iccha2> o/
18:01:29 <grapex> o/
18:01:31 <denis_makogon> o/
18:01:35 <amrith> ./
18:01:40 <SlickNik> agenda at:
18:01:40 <amcrn> O/
18:01:42 <SlickNik> #link https://wiki.openstack.org/wiki/Meetings/TroveBPMeeting
18:02:04 <schang> o/
18:02:30 <SlickNik> #topic OSProfiler integration
18:02:36 <SlickNik> zhiyan_: around?
18:03:51 <SlickNik> Looks like not.
18:04:08 <SlickNik> Let's come back to this one later.
18:04:25 <SlickNik> #Topic Cassandra clustering
18:04:36 <denis_makogon> it is mine
18:04:37 <vgnbkr> o/
18:04:39 <SlickNik> #link https://blueprints.launchpad.net/trove/+spec/cassandra-cluster
18:04:48 <sgotliv> o/
18:04:56 <denis_makogon> initial framework was merged
18:05:42 <denis_makogon> and as i can see, it's time to go forward with adding clustering support for other datastores
18:05:58 <denis_makogon> this topic is about Cassandra clustering
18:06:39 <SlickNik> denis_makogon: https://wiki.openstack.org/wiki/Trove/cassandra-clustering seems a little light on details.
18:06:58 <SlickNik> For example, what data partitioning strategy are we proposing to use?
18:07:09 <amcrn> agreed, it needs to resemble https://wiki.openstack.org/wiki/Trove/Clusters-MongoDB
18:07:22 <amrith> I posted some comments this AM
18:07:23 <SlickNik> Are we doing replication? How are Snitches handled?
18:07:28 <vigneshvar> amrith: denis_makogon SlickNik : pls give your valuable comments/suggestions on https://review.openstack.org/#/c/103186/
18:08:03 <amrith> using the terminology of a code review; if this were put into an RST etc., I would believe that this needs more work
18:08:15 <amrith> at this stage, I think it is premature to approve
18:08:31 <amrith> because I'm not able to tell for sure whether cassandra will 'fit' in the current framework cleanly
18:08:42 <amrith> and what the 'rough edges' could be.
18:08:46 <amrith> but I think the bp is a start
18:08:51 <amrith> and needs to be finished
18:09:06 <iccha2> also should we start using specs repo since we're looking at bps for kilo
18:09:18 <SlickNik> amrith: +1
18:09:41 <georgelorch> o/
18:09:42 <dougshelley66> o/
18:09:58 <SlickNik> iccha2: Yes, getting that repo up and going is in the works. Should be done for next week's BP meeting.
18:10:07 <amrith> In particular, I would like to make sure that whatever we implement/extend of the framework is also compatible with things (like MySQL/Percona)
18:10:14 <iccha2> sweet thanks SlickNik
18:10:15 <amrith> iccha2, +1
18:10:49 <SlickNik> iccha2 / amrith: https://review.openstack.org/#/c/121457/
18:11:32 <SlickNik> But coming back to this BP — denis_makogon can you switch to the rst format as you continue working on it?
18:11:51 <SlickNik> I suspect folks will have quite a few comments on this, and it'll be easier to review the .rst
18:12:05 <denis_makogon> ok
18:12:41 <SlickNik> denis_makogon: thanks.
18:13:35 <SlickNik> #topic Clustering int-tests
18:13:53 <denis_makogon> that one is pretty simple
18:13:57 <SlickNik> #link https://blueprints.launchpad.net/trove/+spec/clustering-int-tests
18:14:05 <denis_makogon> there's no int tests for mongo clustering at all
18:14:20 <denis_makogon> it was added to be approved by PTL
18:14:43 <denis_makogon> for now we have several API endpoints that should be tested
18:15:39 <amcrn> denis_makogon: is the intent of adding this to the agenda to ask about its status or ?
18:15:49 <SlickNik> denis_makogon: The design for this needs to be thought through as well. Should every int-test be spinning up 5 instances for a cluster? How does this affect int-test workload, and run time?
18:17:06 <denis_makogon> amcrn, i planned to work on it, if there's no objections
18:17:06 <grapex> I think we should add int-tests that are optional- as in , we don't *have* to run them all the time and ruin the gate- but *could* be run
18:17:24 <grapex> and could also run in fake mode- which would require fixing the event simulator. :/ I could do that.
18:17:31 <SlickNik> denis_makogon: afaik amcrn, and mat-lowery have been trying to answer some of these design questions to come up with something that is acceptable to run in the gate.
18:17:50 <amcrn> SlickNik: +1, but we wouldn't mind your help denis_makogon
18:18:04 <denis_makogon> SlickNik, ok, thanks
18:18:11 <amcrn> denis_makogon: we'll try to share some details fairly soon about the options we've been discussing
18:18:30 <amrith> At this point, I think the work is "definition"
18:18:35 <SlickNik> grapex: +1. Especially on getting fake mode to cover this so that we can test that in the gate.
18:18:40 <denis_makogon> amcrn, awesome, thanks
18:18:44 <amrith> I understood the approval to mean that in principle we agree with the proposal
18:18:49 <amrith> but the devil is in the details
18:18:55 <amrith> so who is ironing out the details ;)
18:19:09 <amrith> I'm assuming this is mat-lowery amrith SlickNik ...
18:19:11 <amrith> yes?
18:19:31 <grapex> mat-lowery: didn't you have tests that were almost running until you found out about the fake mode limitation?
18:19:43 <mat-lowery> Yeah due to the resources required for a real int test, I was planning on the fake mode-only route first. But then I ran into event simulator limitations.
18:19:56 <denis_makogon> amrith, i guess amcrn and mat-lowery would answer all questions since they are already working on them, but i didn't know about that (my bad =( )
18:20:01 <grapex> Let's proceed with those but not add them to any groups which would run in the gate
18:20:08 <grapex> then hopefully I can fix the event simulator limitations
18:20:36 <amcrn> denis_makogon: we should have flipped the blueprint assignee, i screwed up there, apologies.
18:20:38 <SlickNik> amrith: I believe in addition to amcrn and grapex as well.
18:20:42 <grapex> SlickNik amcrn amrith: Would that work for you?
18:20:56 <denis_makogon> amcrn, it's totally fine
18:21:19 <amrith> grapex, absolutely
18:21:24 <amrith> no objections from me.
18:21:25 <SlickNik> grapex / mat-lowery: Yes, I think adding int-tests even if it's not part of the gate would be good.
18:21:54 <amrith> I would like us to keep one thing in mind as we make these choices
18:21:56 <amcrn> not sure what i'm agreeing to, but sure :)
18:22:03 <amrith> for example
18:22:10 <amrith> SUSE is going to run their own CI
18:22:27 <amrith> we should make sure that the things we choose will work on their CI as well, without too much of a burden
18:22:46 <amrith> otherwise, things that we have approved, such as SUSE support would be impacted
18:23:02 <amrith> because we implicitly are relying on their CI to find issues in code we commit
18:23:23 <denis_makogon> amrith, not sure about that, since SUSE CI is a third-party CI, if they have any issues - they would come to us with questions
18:23:52 <denis_makogon> because we can't say for sure that it would work for them
18:24:16 <amrith> denis_makogon, my issue was related to things like the number of instances we'd spin up
18:24:19 <amrith> for each test
18:24:22 <amrith> or things like that
18:24:31 <denis_makogon> ok, i get that
18:24:34 <amrith> if we expect that to test clustering, your CI infrastructure should be enormous
18:24:40 <amrith> that may have a complicating effect
18:24:46 <amrith> on other database vendors who want to participate
18:25:01 <amrith> I don't think there is an imminent threat of that happening
18:25:06 <amrith> but just something to keep in mind
18:25:17 <amrith> dougshelley66, has a CI setup that we're operating
18:25:22 <amrith> I have to keep that in mind as well
18:25:27 <amrith> similarly, Percona
18:25:33 <amrith> georgelorch, will have one he has to operate
18:25:35 <amrith> and so on
18:25:41 <amcrn> i'm onboard for fake-mode, but real int-tests is difficult to get working on a single box for clusters; we have all sorts of hacks to get it to work on a fairly beefy box.
18:26:14 * georgelorch nods
18:26:20 <cp16net> i can imagine
18:26:30 <amcrn> amrith: so if we can't run it, i'm not sure the real value of writing it
18:26:39 <denis_makogon> we might also speak with infra guys to understand how much of resources we're able to use for our clustering tests
18:26:43 <grapex> amrith: Agreed. I think we should accept having monstrous tests like clustering in the code even if we don't run them all the time, with the expectation that we can't expect them to always be catching bugs since they won't run frequently. But it would be nice to have them since they will run in fake mode, and we could still run them before the big lettered releases.
18:26:46 <denis_makogon> for the real one
18:27:15 <amrith> amcrn, I think the value of having the tests in there is that there will be runs of those tests, even if they are not on a per commit basis
18:27:15 <amcrn> grapex: but we internally can't even run them on a single box without hacks to nova + diskimage-builder
18:27:19 <amrith> as grapex says above
18:27:39 <grapex> amcrn: Still, fake mode by itself will give a lot of feedback and keep smaller bugs from being introduced.
18:27:43 <amrith> therefore to an earlier point that SlickNik made, there is the question of how these tests should be structured
18:28:08 <grapex> The big risk left open is communication between Trove and the guest agent- but if people change clustering related RPC calls they will hopefully expect risk in those areas.
18:28:11 <amrith> and I think that at this point that thinking is something that you (grapex, amcrn, mat-lowery, SlickNik ...) are in the best position to do.
18:28:15 <amcrn> grapex: right, i get fake-mode; i'm saying int-tests that aren't runnable.
18:28:23 <grapex> (as if "hopefully expecting risk" has ever saved anyone in this profession)
18:28:30 <amcrn> bah, i mean non-fake isn't really runnable on a single box setup*.
18:28:40 <amrith> amcrn, I think that's fine
18:28:42 <amcrn> wish irc had an edit button sometimes :)
18:29:22 <amrith> I'm assuming that with replication and clustering there will be larger test infrastructures that are required for some testing. and that's goodness.
18:29:46 <amcrn> so the tl;dr here is we're going to go with fake-mode first, which has a dependency on the event simulator fix. beyond this, we need larger deployed test infras to run non-fake int-tests.
18:29:59 <amcrn> if i'm understanding correctly.
18:30:23 <SlickNik> amcrn: I agree with that tl;dr
18:30:37 <denis_makogon> guys, we might take a look at Sahara project, since they are deploying heavy clusters of Hadoop over infra gates
18:30:54 <amcrn> denis_makogon: i believe they're doing single node, but good point, we should confirm.
18:30:56 <SlickNik> In the meantime, I can start some conversations with the infra folks to see how we can get creative with testing this on infra.
18:31:35 <denis_makogon> SlickNik, that would be awesome
18:32:23 <denis_makogon> and when we'll be ready with fake-mode test we would proceed to real-mode ones, and that'll be a bit tough
18:32:24 <SlickNik> denis_makogon / amcrn: I think that's the case too — denis_makogon, can you recheck with sergey?
18:32:33 <denis_makogon> SlickNik, sure
18:32:36 <SlickNik> thanks
18:32:40 <denis_makogon> np
18:33:02 <SlickNik> Alright, I think we have a plan in place for that.
18:33:09 <SlickNik> zhiyan_: back yet?
18:33:35 <SlickNik> .
18:33:42 <SlickNik> #topic Open Discussion
18:33:46 <amrith> question, what is tl;dr?
18:33:57 <dougshelley66> Too long; didn't read
18:33:58 <SlickNik> too long; didn't read
18:34:05 <amrith> HA!
18:34:06 <SlickNik> aka summary
18:34:09 <amrith> I love these calls
18:34:12 <amrith> I learn something new
18:34:57 <cp16net> too long; didn't read
18:35:02 <cp16net> :-P
18:35:29 <SlickNik> #endmeeting