20:00:12 #startmeeting Octavia
20:00:12 never mind I'm lagged by 30 seconds
20:00:12 Meeting started Wed Aug 27 20:00:12 2014 UTC and is due to finish in 60 minutes. The chair is sbalukoff. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:15 it's our inaugural Octavia IRC meeting
20:00:16 The meeting name has been set to 'octavia'
20:00:21 o/
20:00:30 Howdy folks!
20:00:32 o/
20:00:33 hello
20:00:34 \o/
20:00:38 This is the agenda we're going to be using:
20:00:40 #link https://wiki.openstack.org/wiki/Octavia/Weekly_Meeting_Agenda#Meeting_2014-08-27
20:00:57 +1
20:01:06 So, let's get started, eh!
20:01:09 #topic Review action items from last week
20:01:26 sballe: sbalukoff will probably link them in 3..2..1..
20:01:39 Haha!
20:01:42 blogan_, ok
20:02:17 I'm here, just distracted by a production issue at the moment :) ping me if you need anything specific
20:02:28 Well, the first item on the list here is to go over the benchmarks put together by German
20:02:42 #link https://etherpad.openstack.org/p/Octavia_LBaaS_Benchmarks
20:02:55 xgerman: Can you go ahead and speak to that?
20:03:03 rm_work: great reason to have this on IRC
20:03:08 I compared two listeners in one haproxy process versus two haproxy processes with one listener each
20:03:30 the results are fairly similar so either one seems viable
20:03:49 throughput was a tad higher for two listeners/1 haproxy but I think tuning can fix that
20:04:02 haproxy - 2 listeners: 545.64 RPS, 430.57 RPS
20:04:02 haproxy - 2 processes: 437.51 RPS, 418.93 RPS
20:04:17 the two values in each line are for the two ports
20:04:18 xgerman: How many times did you re-run this benchmark?
20:04:24 each three times
20:04:36 + glided in to the right concurrent requests
20:04:39 (Given it was being done on cloud instances, where performance can sometimes be variable. ;) )
20:05:01 which version of HAProxy was used for testing?
20:05:11 1.5
20:05:12 that kind of variation can just be context switching two processes instead of one, too.
20:05:31 Yeah, given the only distinction between the two ports' configurations was the ports themselves, it seems like performance for both in one test ought to be really close.
20:05:54 agreed
20:06:05 So, this being one test: haproxy - 2 listeners: 545.64 RPS, 430.57 RPS
20:06:11 so the difference in the first listeners is not concerning?
20:06:22 545 vs 437?
20:06:24 would it be possible to share your haproxy.conf files? i am curious if you assigned listeners to cores
20:06:36 My point is that it's wider than I would have expected.
20:06:37 I can share them
20:06:56 agreed sbalukoff
20:07:06 were the tests run once or several times and averaged
20:07:09 not unless it reproduces a lot. at that rate, that's not a lot of variance, given that it's cloud instances.
20:07:29 i'd like to see benchmarks on much higher RPS
20:07:45 dougwig: Right, but that variance is larger than the variance between running 1 process versus 2.
20:07:47 I ran them several times to see if there are big variations but they clocked in together
20:08:06 that tells me the biggest factor for variable performance here is the fact that it's being done on cloud instances.
20:08:07 I was worried about the RPS, too -- but I used standard hardware and a fairly small box
20:08:11 what is fast enough? the quest for fastest can be endless. does starting simpler mean we can't add in the other?
20:08:38 sbalukoff: didn't you mention something about getting 10k RPS per instance?
20:08:50 or did I misunderstand that
20:08:50 blogan_: Yes, but that's on bare meta.
20:08:52 metal.
20:09:01 ah okay
20:09:07 also the exercise was to compare the two approaches
20:09:13 blogan_: To get that kind of performance... well, apache bench starts to be the bottleneck.
20:09:36 and since they are fairly close I was calling it a tie
20:09:40 So you have to hit the proxy / load balancer with several load generators, and the proxy / load balancer needs to be pointed at a bunch of back-ends.
20:09:40 xgerman: I know, I'm just concerned that the gap could widen with higher loads
20:10:15 was the generator at 100% cpu? ab should scale better than that; it's nowhere near line speed.
20:10:18 blogan_: I have that concern too, but probably less so:
20:10:30 The reason being that if there were a major difference in performance, it would probably show up at these levels.
20:10:35 xgerman, is HP using Xen or KVM?
20:10:43 KVM
20:10:48 danke
20:10:56 And, let's be honest, the vast majority of sites aren't pushing that kind of traffic. Those that are, are already going to want to be on the beefiest hardware possible.
20:11:16 sbalukoff, 100% agree
20:11:30 I also deliberately put the lb on the smallest machine we have to magnify context switching/memory problems
20:11:34 (if any)
20:11:36 xgerman: what size were the requests?
20:11:52 177 Bytes
20:11:52 agree, my point above is that both look fast enough to get started, and not spend too much time here.
20:12:03 dougwig: +1
20:12:29 fine by me
20:12:35 dougwig, +1 BUT it would be nice if RAX did the benchmark too to get two samples
20:12:50 did you directly load test the back end servers to get a baseline?
20:12:57 yes, I did
20:13:05 what were you seeing direct?
20:13:10 On that note, is everyone up to speed on the other points of discussion here (that happened on the mailing list, between me and Michael)?
20:13:16 xgerman, do you have the configs/files for your tests so it could be easily duplicated
20:13:28 Requests per second: 990.45 [#/sec] (mean)
20:13:45 yes, I will upload the haproxy files
20:13:47 let's talk more about the benchmarks after the meeting
20:13:53 Let me rephrase that: Has anyone not read that e-mail exchange, where we discussed pros and cons of each approach?
20:14:27 sbalukoff, I haven't but I am still catching up... Will do right after the meeting
20:14:33 i have not, but nor do i feel strongly either way.
20:14:37 I've read it and multiple processes is fine, but was waiting on benchmarks too
20:14:43 because last i read, this wasn't a corner painting event.
20:15:23 dougwig: Mostly it affects a couple of workflow issues having to do with communication with the Octavia VM. So no, if we have to change it won't be *that* hard.
20:15:26 dougwig: i think it is
20:15:37 blogan_: Oh?
20:15:55 please expand? because if it is, i think we've got a crap interface to the VMs.
20:16:28 you mean the process per listener vs process per loadbalancer?
20:16:42 blogan_: That is what we're discussing, I think.
20:16:44 correct.
20:17:50 well stats gathering would be different for each approach; provisioning, updating, all of that would be affected
20:18:00 blogan_: That is true.
20:18:04 +1
20:18:12 but that would fall under sbalukoff's "not that hard"
20:18:15 xgerman, could you please rerun your tests with the keepalive flag (-k) enabled? This will greatly increase your performance
20:18:15 dougwig, why do you think we have a "crap interface to the VMs"?
20:18:21 Haha!
20:18:32 You're right. I'm probably trivializing it too much.
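
For anyone who wants to reproduce this kind of test later, the following is a minimal sketch of how the ab runs described above could be driven and averaged. It is illustrative only, not xgerman's actual harness: the target URLs, request count, and concurrency are placeholder assumptions, and the keepalive flag is included per the -k suggestion above.

    import re
    import statistics
    import subprocess

    # Placeholder targets: the two listener ports on the load balancer under test.
    TARGETS = [
        "http://192.0.2.10:80/",
        "http://192.0.2.10:81/",
    ]
    RUNS = 3  # each configuration was benchmarked three times in the discussion above


    def measure_rps(url, requests=10000, concurrency=100, keepalive=True):
        """Run Apache Bench once against url and return its reported requests/second."""
        cmd = ["ab", "-n", str(requests), "-c", str(concurrency)]
        if keepalive:
            cmd.append("-k")  # HTTP keepalive, as suggested in the meeting
        cmd.append(url)
        output = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
        match = re.search(r"Requests per second:\s+([\d.]+)", output)
        return float(match.group(1))


    for target in TARGETS:
        samples = [measure_rps(target) for _ in range(RUNS)]
        print(f"{target}: mean {statistics.mean(samples):.2f} RPS over {RUNS} runs")
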
20:18:42 and that leads me to wondering if we should abstract the vm <-> controller interface and not just ship haproxy files back and forth
20:18:42 barclaac: if it's something other than "not that hard", it implies we're not abstracting the vm implementation enough.
20:19:00 xgerman, +1 I agree
20:19:00 dougwig +1
20:19:07 Shouldn't the decision on process/listener or process/lb be an implementation detail within the VM? It seems we're making a low level decision and then allowing that to percolate through the rest of Octavia
20:19:30 barclaac: yes, that's another way of saying exactly what i'm trying to communicate.
20:19:35 barclaac: Octavia is an implementation.
20:19:43 So implementation details are rather important.
20:20:00 But within Octavia we want to try to have as loose a coupling as possible between components.
20:20:03 (although it's vm implementation + vm driver implementation, to be precise.)
20:20:16 sbalukoff, But also a framework, so yes implementation is important but we need the right level of abstraction
20:20:21 If the control plane "knows" that HAProxy is in use we're leaking architectural concerns
20:20:33 barclaac, +1
20:20:40 +1
20:20:43 Um...
20:20:52 Octavia knows that haproxy is in use.
20:20:58 sbalukoff, We need to keep the interface clean
20:21:02 disagree, but if something outside the controller/vm driver knows about haproxy, that's too much leak.
20:21:06 I think that's the statement that I don't agree with.
20:21:09 It's sort of a central design component.
20:21:58 sbalukoff: do we need to store anything haproxy specific in the database?
20:22:01 I'm not sure haproxy is a central design component. Herding SW load balancers is the central abstraction. HAProxy is an implementation detail.
20:22:05 sbalukoff, I always looked at ha-proxy as the first implementation. We could switch ha-proxy with something else in the future
20:22:22 If I want to create a VM with nginx would that be possible?
20:22:37 barclaac: You have no idea how much I hate the phrase "it's an implementation detail", especially when we're discussing an implementation. :P
20:22:39 barclaac, +1
20:22:51 the skeleton that blogan_ and i briefly discussed had the notion of "vm drivers", which would encapsulate the haproxy or nginx or other implementation details that lived outside the VMs.
20:22:54 sbalukoff +2 :-)
20:23:17 from my understanding, we were not going to do anything besides haproxy at first, but abstract everything enough to allow easy pluggability
20:23:27 wouldn't we need to support flavors if we want to have different backends?
20:23:28 dougwig +1 that would also allow a future UDP load balancing solution to use Octavia
20:23:29 Well, the haproxy implementation will be the reference.
20:23:31 sballe +1, blogan_ +1
20:23:36 And implementations have quirks.
20:23:44 xgerman: bring on the udp.
20:24:04 :-)
20:24:04 It's going to be the job of any other implementation to work around those quirks or otherwise have similarly predictable behavior.
20:24:25 I want to avoid having each implementation talk in haproxy config files...
20:24:39 xgerman, you've just hit the nail on the head!
20:25:01 sbalukoff, I thought dougwig might want to switch ha-proxy out with his a10 back-end in the future
20:25:03 We should abstract those files
20:25:04 pull up, pull up. do we agree that we're going to have an haproxy vm, and a file or module that deals with that vm, and that module is not going to be "octavia", but rather something like "octavia.vm.driver.haproxy", right? in which case, i'm not sure this detail of how many processes really matters.
20:25:05 xgerman: Right. It depends on which level you're talking about, and we are sure as hell not being specific enough about that in this discussion.
20:25:48 sballe: Yes, and he should be able to.
20:25:52 dougwig: Thank you. Yes.
20:26:42 sbalukoff, ok so we need to make sure we have enough abstraction to allow him to do that and not add too much haproxy stuff into the main framework
20:27:04 * dougwig suspects that we are all violently agreeing.
20:27:14 So it matters for the reference implementation (which is what the point of the discussion was, I thought).
20:27:16 dougwig: We are.
20:27:16 I think the only thing that will know about haproxy is a driver on the controller, and then also if the VM is running Octavia code it would need an haproxy driver
20:27:23 AGREEMENT.
20:27:27 * rm_work is caught up
20:27:54 +1 blogan_
20:28:21 +1
20:28:23 alright, circling back to 1 vs 2 processes, does it matter?
20:28:36 it matters in the haproxy driver right?
20:28:39 dougwig: It does for the reference implementation.
20:28:53 did i just invoke "implementation detail"?
20:29:02 So, would anyone object to making a simple vote on this and moving on?
20:29:02 couldn't it be a configurable option of the haproxy driver?
20:29:20 tmc3inphilly: two drivers
20:29:21 right, let whoever writes that driver/VM decide whether to add the complexity now or in a future commit.
20:29:21 tmc3inphilly: I don't see a point in that.
20:29:40 sbalukoff: really it could just be two haproxy drivers
20:29:57 dougwig: The reason we are discussing this is because people from HP have (had?) a strong opinion about it.
20:30:04 and right now we have 0 drivers, so cart, meet the horse.
20:30:05 This conflicted with my usual strong opinions about everything.
20:30:08 so HP can write the driver.
20:30:08 Hence the discussion.
20:30:09 :)
20:30:09 blogan_, we would have to maintain extra code
20:30:36 we did our benchmark and it looks like a tie - so we bow to whatever driver is written for now
20:30:37 sballe: totally agree, but that's the point, what is the way WE want to go with this? and if we have two different views then two drivers would need to be created?
20:32:09 so it sounds to me like we can make a consensus decision on what this haproxy driver will do
20:32:09 Ok, I want to move on.
20:32:12 anyone strongly opposed to 1 lb:1 listener to start?
20:32:15 if not, let's move on.
20:32:19 But I don't want to do so without a simple decision on this...
20:32:23 So, let's vote on it.
20:32:36 abstain
20:32:41 abstain
20:32:51 #startvote Should the (initial) reference implementation be done using one haproxy process per listener?
20:32:52 Begin voting on: Should the (initial) reference implementation be done using one haproxy process per listener? Valid vote options are Yes, No.
20:32:53 Vote using '#vote OPTION'. Only your last vote counts.
20:32:55 +1 for multiple processes
20:32:59 oh
20:32:59 #vote Yes
20:33:10 #vote Yes
20:33:45 No one else wants to vote?
20:34:01 I'll give you one more minute...
20:34:06 you got a runaway victory :-)
20:34:13 Haha!
20:34:21 lol
20:34:45 #vote Yes
20:34:54 On account of failure isolation
20:35:08 Ok, voting is about to end...
20:35:21 #endvote
20:35:22 Voted on "Should the (initial) reference implementation be done using one haproxy process per listener?" Results are
20:35:32 ...
20:35:32 no results!
20:35:34 German gets my vote. :-)
20:35:36 no one wins!
20:35:38 #drumroll
20:35:38 A mystery apparently.
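
To make the abstraction being argued for here concrete, below is a minimal sketch of what a controller-side "vm driver" interface might look like, with everything haproxy-specific hidden behind it. No such interface was actually agreed in this meeting; the class and method names are hypothetical.

    import abc


    class VMDriver(abc.ABC):
        """Hypothetical controller-side interface. Nothing outside an implementation
        of this class should need to know whether haproxy, nginx, or something else
        is running on the VM."""

        @abc.abstractmethod
        def deploy_listener(self, load_balancer, listener):
            """Create or update the backend configuration serving this listener."""

        @abc.abstractmethod
        def delete_listener(self, load_balancer, listener):
            """Tear down the backend configuration for this listener."""

        @abc.abstractmethod
        def get_stats(self, load_balancer):
            """Return per-listener traffic statistics for this load balancer."""


    class HaproxyVMDriver(VMDriver):
        """Sketch of a reference driver using the one-haproxy-process-per-listener
        approach favored above. Config rendering and transport to the VM were not
        decided in this meeting, so the bodies are stubs."""

        def deploy_listener(self, load_balancer, listener):
            raise NotImplementedError  # render a per-listener haproxy config, (re)start its process

        def delete_listener(self, load_balancer, listener):
            raise NotImplementedError  # stop and remove that listener's haproxy process

        def get_stats(self, load_balancer):
            raise NotImplementedError  # aggregate stats from each listener's haproxy process
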
20:35:39 HAHA
20:35:44 I'm fine with the single/multiple HAProxy question - my issue is the control plane knowing about HAProxy instead of talking in terms of IPs, backend nodes etc.
20:36:04 barclaac: we all agree on that point.
20:36:08 barclaac: Have you been following the rest of the Octavia discussion at this point?
20:36:25 Because i think you're arguing something we've agreed upon for... probably months.
20:36:46 Anyway, moving on...
20:36:59 #topic Ready to accept v0.5 component design?
20:37:00 sbalukoff: I think he's getting at making sure we're abstracting enough, which seemed to be in question a few minutes ago
20:37:21 but i think we all agree on that now
20:37:22 blogan_ +1
20:37:32 next topic
20:37:33 Heh! Indeed.
20:37:53 I'm ready to accept it
20:37:56 Of course. It was blogan's comment above about having an haproxy driver in the controller (sorry for delay, had to read back up the stack)
20:38:06 sbalukoff: i'm not today, but i will commit to having a + or - by friday.
20:38:50 Ok, so I'm not going to +2 the design since I wrote it, but we'll need a couple of the other octavia cores to do so. :)
20:38:55 i haven't given a +2 because I think this is important enough to have everyone onboard
20:39:06 we just discovered that we can't have Neutron Floating IPs in private networks
20:39:21 On that note then: Does anyone have any major issues with the design that are worth discussing here?
20:39:47 if i have any, i'll put them on gerrit well before friday, and ping you directly.
20:39:49 sbalukoff, I am fine with the spec assuming that we are flexible with changing things as we start implementing it.
20:39:55 xgerman: I think if we are working with an abstract networking interface anyway, that shouldn't be a problem.
20:40:11 sballe: Yes, of course. This spec is to determine initial direction.
20:40:12 yep. agreed
20:40:50 Basically, I look at it as the 10,000 mile overview map of how the components fit together.
20:41:15 It's definitely open to change as we do the actual implementation and discover where our initial assumptions might have been wrong.
20:41:30 I totally expect that to happen
20:41:35 (Not that I'm ever wrong about anything, of course. I thought I was wrong once, but I was mistaken.)
20:41:42 I'd be amazed if everything went according to plan
20:41:50 blogan_, sbalukoff me too
20:42:35 I love it when a plan comes together
20:42:48 :-)
20:42:58 Ok, so: Doug, German, and/or Brandon: Can I get a commitment to do a (final-ish) review of the component design this week and work toward a merge?
20:43:10 yes
20:43:20 I can +2 it now and let doug be the final +2
20:43:23 or -1
20:43:28 depending on what he decides
20:43:29 #action v0.5 component design to be reviewed / moved to merge this week.
20:43:34 yeah, I can +2, too
20:43:43 Well, if there's a -1, we can probably fix that this week, too.
20:43:46 dougwig just ping us and we will +2
20:44:18 Ok, let's move on to the next topic
20:44:21 #topic Ready to accept Brandon's initial database migrations?
20:44:26 NO
20:44:28 lol
20:44:32 lol
20:44:35 per your comments, some need to be changed
20:44:36 Haha! Indeed.
20:44:38 well, if the creator has no confidence
20:44:55 actually brings up another topic we can discuss
20:45:14 blogan_: Ok, so, do you have enough info there to make the fixes, or are there details that will need to be discussed in the group?
20:45:27 blogan_: Oh, what's that?
20:45:41 I think we should try to start with basic load balancing features first, and then iterate on that after we have a more stable workflow and codebase
20:45:57 +1
20:46:08 so should colocation, apolocation, health monitor settings, etc. be saved for that
20:46:30 yeah, we can put them in with the understanding they might need to be refactored/refined
20:46:31 blogan_: I'm game for that, so long as we don't do anything to paint ourselves into a corner with regard to later planned features.
20:46:41 blogan_: Want to call that v0.1 or something? ;)
20:46:48 I might be naive but I am not sure how we can have an LB without health monitoring
20:46:50 0.5 Beta
20:46:51 (I don't think we need a component design document for it, per se.)
20:46:53 sbalukoff: v0.25
20:46:59 Heh! Ok.
20:47:09 sballe: i mean extra health monitor settings
20:47:14 sballe: I don't think health monitoring is one of the non-basic featured.
20:47:19 features.
20:47:20 Yes.
20:47:30 oh ok
20:47:39 blogan_ Agreed.
20:47:46 basically I'm kind of following what neutron lbaas has exposed
20:47:57 v2?
20:47:59 yes
20:48:03 Ok, that works.
20:48:10 Non-shitty core object model and all. :)
20:48:23 Any objections?
20:48:30 should be easy enough to do a v1 driver on that first pass, and get a stable cli/ui to test with.
20:48:31 #salty
20:48:41 #teamsalty
20:48:52 dougwig: +1
20:48:55 Ok then!
20:49:14 i'll get those changes up today, so please give it some love and attention
20:49:24 #agreed We will start with a basic (Neutron LBaaS v2) feature set first and iterate more advanced features on that. We shall call it v0.25
20:49:32 dougwig - no v1 driver, we might end up with users
20:49:53 #action Review blogan's changes regarding the v0.25 iteration.
20:50:16 next topic!
20:50:31 I want to skip ahead a bit on this since we're almost out of time.
20:50:43 #topic Discuss ideas for increasing project velocity
20:51:10 So! Right now, it feels like there ought to be more room for many people to be doing things to bring this project forward.
20:51:19 blueprints, milestones, track deliverables to dates.
20:51:42 sbalukoff, +1
20:51:57 should we put all these tasks as blueprints in launchpad and allow people to just take them as they want?
20:52:09 blogan_: I'm thinking that's probably a good idea.
20:52:14 blogan_, sounds like a good idea.
20:52:28 +1
20:52:28 Some have volunteered for work already
20:52:36 blogan_, We can always coordinate among ourselves
20:52:44 johnsom +1
20:52:47 johnsom: Yep! So, we'll want to make sure that's reflected in the blueprints.
20:53:09 sbalukoff, you wanted to start a standup etherpad
20:53:22 okay, I'll put up as many blueprints as I can
20:53:25 xgerman: Thanks-- yes. I forgot about that.
20:53:30 Agreed. The blueprint process is on my list to learn in the near term
20:53:35 #action sbalukoff to start a standup etherpad for Octavia
20:53:48 johnsom: I think it's on many of our lists. :)
20:54:06 same here
20:54:13 i'll add as many blueprints as i can, but they're probably not going to be very detailed
20:54:44 no problem, we can detail
20:54:45 I also wanted to ask publicly (and y'all can feel free to respond publicly or privately as you see fit): what else can I be doing to make sure you and your individual teams know what they can be doing to be useful, and what else can I do to help you convince your management it's worthwhile to spend time on Octavia?
20:54:52 blogan_, how are we going to do design review before the code shows up in gerrit? and it takes forever for us to understand the code
20:55:16 blogan_: I'll also work on adding blueprints and filling in detail.
20:55:37 sballe: good question, and that should be the spec process, but I really think the spec process will totally slow down velocity in the beginning for trivial tasks
20:55:48 blogan_: +1
20:56:01 blogan_, sbalukoff I would like to see details in the bp so we know how it will be implemented, with adjustments later if needed
20:56:21 For major features (e.g. TLS or L7) we'll definitely want them specced (probably just copied from work on neutron LBaaS), but yes, trivial things shouldn't require a spec up front right now.
20:56:26 maybe I am talking about specs and not bp
20:56:33 sballe: I think the blueprints can be created initially to give out a task list, but when someone grabs that blueprint to work on, they should be responsible for detailing it out
20:56:49 sballe: let's take one example, which is the directory skeleton for octavia. the best spec for that is going to be the code submission. and in the early days, that's going to be true for a fair number of commits.
20:56:53 blogan_, we are totally in agreement.
20:57:26 sballe: Cool. After the meeting, can you point me to a few specific areas where you'd like to see more detail?
20:57:40 i have a lot
20:57:47 We've got three minutes left. Anyone have anything dire they want to bring up?
20:57:49 dougwig, with the old LBaaS team we ran into an issue that they didn't document anything and we were told the code was the documentation and design document.
20:58:05 yeah, let's avoid that
20:58:12 also we need to bring in QA early
20:58:34 sballe: Yeah, I hate that, too.
20:58:36 has anyone spoken to or heard from mestery?
20:58:39 I loves me some good docs.
20:58:50 sbalukoff: dont we know it
20:58:57 samuelbercovici: nope we have not
20:58:58 samuelbercovici: I have not.
20:58:59 samuelbercovici: he was at the neutron meeting monday.
20:59:13 And, still no incubator update, IIRC.
20:59:26 some discussions going on the ML about it
20:59:30 i asked about that at the meeting, and was told to expect it yesterday.
20:59:32 but nothing official
20:59:42 I was told to expect it 2 or 3 weeks ago
20:59:49 I chatted with him briefly yesterday. I'd expect by the end of the week.
20:59:54 Ok, folks! Meeting time is up. Yay IRC. :P
20:59:57 i was not able to get from him a reason why lbaas should go to incubation
20:59:58 I now expect it at the same time Half-Life 3 is released
21:00:03 #endmeeting
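
As a closing illustration of the "initial database migrations" topic above, the sketch below shows roughly what a first Alembic migration limited to the basic, Neutron LBaaS v2 style objects agreed for v0.25 could look like. It is an assumption for illustration only, not blogan_'s actual patch; the revision identifiers, table names, and columns are invented.

    from alembic import op
    import sqlalchemy as sa

    # Placeholder revision identifiers; a real migration would use generated ones.
    revision = '000000000001'
    down_revision = None


    def upgrade():
        op.create_table(
            'load_balancer',
            sa.Column('id', sa.String(36), primary_key=True),
            sa.Column('name', sa.String(255), nullable=True),
            sa.Column('vip_address', sa.String(64), nullable=True),
            sa.Column('provisioning_status', sa.String(16), nullable=False),
        )
        op.create_table(
            'listener',
            sa.Column('id', sa.String(36), primary_key=True),
            sa.Column('load_balancer_id', sa.String(36),
                      sa.ForeignKey('load_balancer.id'), nullable=False),
            sa.Column('protocol', sa.String(16), nullable=False),
            sa.Column('protocol_port', sa.Integer(), nullable=False),
        )


    def downgrade():
        op.drop_table('listener')
        op.drop_table('load_balancer')
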