16:01:02 #startmeeting
16:01:03 #meetingname ceilometer
16:01:03 Meeting started Thu Jun 7 16:01:02 2012 UTC. The chair is jd___. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:03 #link https://lists.launchpad.net/openstack/msg12851.html
16:01:04 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:05 The meeting name has been set to 'ceilometer'
16:01:16 #chair nijaba dachary
16:01:17 Current chairs: dachary jd___ nijaba
16:01:28 #topic actions from previous meetings
16:01:49 dhellmann: so how was your demo? :)
16:02:00 :-)
16:02:02 it went well
16:02:24 I was able to show messages passing from the agent to the collector and being logged, both from notifications and polling
16:02:51 dhellmann: congratulations
16:03:36 thank you all, again, for all of the work you've put in. I wouldn't be nearly this far if I was doing it all myself.
16:03:42 dhellmann: impressive!
16:04:24 great dhellmann :)
16:04:26 nijaba: anything to report since you had a couple of action items?
16:05:03 Well, I commented on the bug and I started the thread on configuration handling
16:05:40 fair enough :)
16:06:14 #topic Storage backend (high availability, SPOF etc.)
16:07:15 now it's time to discuss how to store the collected data :)
16:07:39 our ops team is recommending postgres, in part because of familiarity and in part because they think it will handle the scale
16:07:46 dhellmann wrote something, cf. https://lists.launchpad.net/openstack/msg12884.html
16:07:48 #link https://lists.launchpad.net/openstack/msg12884.html
16:07:54 they have specifically warned me off of mongodb
16:08:12 but I don't expect everyone to want to use the same tool, which is why I proposed a plugin api
16:08:18 (thanks, jd___)
16:08:19 dhellmann: I agree with your ops, but I'd suggest using sqlalchemy as an abstraction, just because OS already made this choice
16:08:25 yes, absolutely
16:08:39 I intend to have an "sql" or "rdbms" plugin that uses sqlalchemy
16:08:45 but as you stated, this can be pluggable in our case, as you proved and wrote, so we can start with one plugin
16:09:11 it may be simplest to create a mongo plugin for testing and experimentation, but we wouldn't use it in production at dreamhost
16:09:25 please, do NOT use SQL Alchemy. That would prevent us from using any noSQL db.
You may want to use it in one of the plugins, but not as the plugin method
16:09:45 nijaba: it's not the plugin method
16:09:55 jd___: then that's fine
16:09:59 :)
16:10:22 * nijaba thinks that's one of the bad elements remaining in OpenStack at the moment
16:10:23 dhellmann: well, if we both want to use SQL, I think it's likely we can work on this plugin first
16:10:27 right, there would be a single plugin with a name like "rdbms" that uses sqlalchemy to talk to your DB of choice, but the plugin API would be a higher level thing
16:10:40 agreed, jd___
16:11:05 that sounds good to me
16:11:06 the plugin will need more methods so the API server can use it to query the database, too, but I haven't given those any thought
16:11:18 I would expect them to map pretty closely to the queries in the API itself, though
16:11:19 #agreed jd and dhellmann to focus on an SQL storage plugin
16:11:35 well, looks like we only have the storage part so far
16:11:44 dhellmann: indeed
16:11:58 we need to define the mapping to the API we defined earlier
16:12:03 jd___, do you know anything about how the other OS components handle database migrations?
16:12:22 dhellmann: I know they use 'migrate'
16:12:34 I even wrote one or two migrations in the last few months
16:12:49 right, nijaba. We should at least add a method to retrieve the raw data so we can test getting data in and out
16:13:05 jd___: oh, good, so you can do that part! :-)
16:13:13 dhellmann: that would sound like a nice first step
16:13:19 dhellmann: it will only be needed for version 2! :-P
16:13:32 well, we have to have something to initialize the database, right?
16:13:50 right, but sqlalchemy does that for us AFAIK
16:14:15 I know it can be used to create the schema, but I think you have to tell it to do that explicitly
16:14:21 we can take that part of the discussion to the mailing list, though
16:14:36 otherwise OS uses this to upgrade between releases: http://code.google.com/p/sqlalchemy-migrate/
16:14:41 IIRC
16:14:48 dhellmann: sure :)
16:15:01 so the plugin should abstract a migrate function...
16:15:04 so everybody agrees on the plugin system proposed by dhellmann ?
16:15:17 it's a good start, I think
16:15:39 +1
16:15:41 nijaba: if it's needed, each plugin handles its migration; i think mongodb migration can be easy to do since you don't have to do anything to add fields ;)
16:15:56 +1
16:16:23 #agreed use the plugin system proposed by dhellmann at https://lists.launchpad.net/openstack/msg12884.html
16:16:36 #action dhellmann: submit plugin branch for review and merging
16:16:40 as long as we agree to always go through the abstraction to talk to the db, I think we are fine
16:16:53 nijaba, agreed
16:17:43 sure
16:18:01 dhellmann: could you point me to the abstraction related to how the plugin is used to query the database ?
16:18:17 dachary, there aren't any query methods, yet
16:18:36 just like there's a method to store a new event, there would be one or more methods to ask for event data
16:18:55 one would ask for all of the raw events, filtered by account, user, etc -- whatever the API args are
16:19:03 in fact, I think we will have one method per API call type
16:19:09 that's non trivial to abstract. Or do you have an abstract model in mind already ?
16:19:10 yeah, probably
16:19:21 no, I haven't gotten that far
16:19:25 the API is the abstraction...
16:19:47 nijaba: then the database plugin will be in charge of interpreting the API calls. That works for me.
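The storage abstraction discussed up to this point (one method to store an event, plus query methods that mirror the API arguments) might look roughly like the sketch below. This is only an illustration under stated assumptions: the names `StorageEngine`, `InMemoryEngine`, `record_event` and `get_raw_events` are hypothetical and do not come from the meeting or from any ceilometer code, and the event is assumed to be a plain dict.

```python
import abc


class StorageEngine(abc.ABC):
    """Hypothetical interface between the collector/API server and one database."""

    @abc.abstractmethod
    def record_event(self, event):
        """Store one metering event (assumed here to be a dict)."""

    @abc.abstractmethod
    def get_raw_events(self, account=None, user=None):
        """Return stored events, optionally filtered by account and/or user."""


class InMemoryEngine(StorageEngine):
    """Toy engine for experiments; a real one might wrap sqlalchemy or mongodb."""

    def __init__(self):
        self._events = []

    def record_event(self, event):
        # Copy the event so later changes by the caller do not alter the store.
        self._events.append(dict(event))

    def get_raw_events(self, account=None, user=None):
        def matches(event):
            # A filter set to None means "do not filter on this field".
            return ((account is None or event.get("account") == account)
                    and (user is None or event.get("user") == user))

        return [dict(event) for event in self._events if matches(event)]
```

Because callers only see `StorageEngine`, an SQL-backed or NoSQL-backed engine could be swapped in without touching the collector or the API server, which is the property the participants were asking for.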
16:20:01 we may be able to build the API server using fewer plugin API calls (too many APIs…) but they will map closely
16:20:12 agreed
16:20:22 the API service will do some parameter validation, call the plugin to get the data, then format it for return
16:20:43 sounds like MVC applied to DB...
16:20:51 something like that :-)
16:21:49 jd___: what do you think?
16:22:07 I think like dhellmann :)
16:22:54 jd___: should we capture an action on building a few example API calls to the plugin?
16:22:59 we restrict the use of a storage plugin to one?
16:23:26 jd___: one at a time? yes
16:23:34 nijaba: not sure it's that useful since we have nothing (no code) related to API for now
16:23:46 jd___, yes, each collector instance would be using only one storage plugin but you could have multiple clusters writing the data to different databases if you wanted
16:23:46 so maybe we should call it an Engine rather than a plugin
16:23:49 yeah I meant one at a time at runtime :)
16:23:59 nijaba: +1
16:24:20 let's learn from quantum's mistake here ;)
16:24:26 lol
16:24:40 * dhellmann shakes head
16:24:53 "engine" it is
16:24:59 #action dhellmann rename plugin to engine for storage backend ex-plugin-now-engine system
16:25:09 I hope that's clear
16:25:24 I will do that before submitting the code for review
16:25:34 thanks dhellmann
16:25:59 how do we address SPOF ?
16:26:10 with the database?
16:26:22 well, that's why I wanted us to be able to support NOSQL dbs
16:26:38 how does nosql relate to spof?
16:26:48 dhellmann: you mean by using a postgresql setup with no spof ?
16:26:51 for instance
16:26:53 there's HA in SQL too
16:27:24 true, but it is a bit easier to set up multiple concurrent masters with some NoSQL than with postgres
16:27:26 I'm not an ops expert, but yes, my ops team didn't seem concerned about postgresql as a SPOF so I assume they are planning to cluster it
16:27:53 nijaba, that can be true
16:28:09 good ops :)
16:28:11 the feedback I was getting was that mongo might fall over if we push too much data in
16:28:24 that's anecdotal, but I have to trust my ops team, don't I?
16:28:32 in practice when you need to follow a method to implement "no SPOF", it usually does not happen.
16:28:46 * jd___ thinks it's a religious war we don't want to get into
16:28:46 dhellmann: that's not the view from our ops here, but used to be until 6 months ago
16:28:59 * nijaba agrees with jd___
16:29:03 (flat file anyone?)
16:29:10 pickle ftw
16:29:19 I think the idea of introducing a "no SPOF" in the definition of the database was to make it a default instead of a possibility that needs to be implemented on top.
16:29:48 I think it was nijaba wanting to push nosql :)
16:29:53 do any of the SPOF solutions for databases require application code changes?
16:29:54 dachary: how would you do this? Write to 2 dbs at once?
16:31:01 my point was: let's not get stuck with an SQL only implementation. If we have an abstraction layer, then we can let the community play around and come up with solutions
16:31:07 I don't think we want the app to be responsible for database reliability. All of the "real" solutions I've seen support some sort of clustering
16:31:21 exactly, we can push that responsibility down into the plugin and not worry about it in the core app
16:31:22 I'd use mongodb because it has this concern built in from the start. Otherwise i'd follow Florian Haas's advice regarding HA ;-)
16:32:00 has anyone done any work to estimate the amount of data they will be generating?
16:32:07 dachary: I am hoping to get some resources soon to work on a mongodb engine...
16:32:49 nijaba: great ;-)
16:32:59 I'm just stating a concern but I don't see this as a blocker.
16:33:08 dhellmann: I can take the action to build a calculator
16:33:23 nijaba, excellent, that would be a real help
16:33:40 * nijaba thinks about a google spreadsheet, if that suits everyone
16:33:48 I have some estimates for the number of VMs we expect to have, but have not had time to do the math on data size, yet
16:33:49 np
16:34:02 think about Swift too
16:34:32 swift as a source for metering messages, or as a storage engine?
16:34:39 I meant source for metering
16:34:45 k
16:36:35 #action nijaba to propose a google spreadsheet calculator to estimate volume of metering messages (including nova, swift, cinder, quantum)
16:37:02 anything else?
16:37:32 I think that covers everything I had related to storage
16:37:55 was there a agree on the fact that the database engine has a function to interpret the API queries in addition to the function to store the data ?
16:38:11 a agree => a "dash agree" ;-)
16:38:26 I think there was.
16:38:29 dachary, I think we agreed there would likely be several methods related to querying
16:38:42 and that we still need to define them
16:39:07 I don't see the dash agree matching this
16:39:21 oh, we may not have recorded it that way
16:39:39 I just meant we seemed to come to consensus :-)
16:39:49 absolutely, I got that too ;-)
16:39:58 ok
16:40:26 #agreed a database engine has a function to interpret the API queries in addition to the function to store the data
16:40:47 jd___: thanks
16:40:49 #action dhellmann: start mapping API queries to database engine methods
16:41:04 I'll put together a wiki page with some proposals and we can discuss on the list
16:41:30 dhellmann: if you prime me with a first example, I can take care of the variations
16:41:47 nijaba, sounds good
16:43:06 was there something else on the agenda for today?
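The volume calculator taken as an action item above could start from simple arithmetic: messages per day = resources x meters per resource x samples per day. The sketch below is only a possible starting point, not the spreadsheet nijaba agreed to produce. The function name and every input figure are hypothetical, since no numbers were given in the meeting, and it only covers polled meters, not notifications.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate_daily_volume(resources, meters_per_resource,
                          interval_seconds, bytes_per_message):
    """Return (messages_per_day, bytes_per_day) for one polled source."""
    samples_per_day = SECONDS_PER_DAY // interval_seconds
    messages = resources * meters_per_resource * samples_per_day
    return messages, messages * bytes_per_message


# Illustrative inputs only: none of these figures come from the meeting.
messages, size = estimate_daily_volume(
    resources=1000,          # e.g. number of VMs
    meters_per_resource=5,   # counters collected for each VM
    interval_seconds=600,    # one sample every 10 minutes
    bytes_per_message=500,   # assumed size of one stored message
)
print(messages, size)  # prints: 720000 360000000
```

Running the same function once per source (nova, swift, cinder, quantum) and summing the results would give a first total to compare against what a given storage engine can absorb.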
16:43:26 Agent configuration mechanism?
16:43:28 yep
16:43:32 moving on then
16:43:36 #topic Agent configuration mechanism
16:43:56 did we agree on the list that we would use text config files and leave it up to ops to manage them, as with the other components?
16:44:08 that's my point of view at least
16:44:22 so, I must say that I would not be happy with this for meter configuration
16:44:37 I am fine with this being used for the global agent config
16:44:38 I think there are merits in complementing the configuration mechanisms.
16:45:15 but I think we risk capturing unusable data if not all meters for a given value are set to report the same way, or report at all
16:45:16 Configuration engines like puppet or chef have limitations.
16:45:26 nijaba, I think I understand, but how often do you see the agent/meter configurations changing?
16:45:28 it is a real data consistency problem
16:45:52 dhellmann: as often as marketing will ask for something new ;)
16:46:35 in that case, wouldn't it make sense to just collect as much data as possible?
16:46:55 that would be the brute force approach
16:47:02 well, yeah :-)
16:47:12 I've worked with puppet recently and synchronisation with nagios : it's not pretty. We would be *much* better off using direct connections with the nagios plugins, if it was possible. Instead of going thru the puppet database.
16:48:03 I don't even see what could be configurable and go wrong with plugins for now
16:48:10 how hard do you think it is to have the meter configuration stored and retrieved by the agents?
16:48:15 except time sync problem but that we won't solve :)
16:48:39 time sync? isn't that solved by ntp?
16:48:42 jd___: imagine that you want to capture cpu, but for some reason, only half of your hosts get that
16:48:51 dhellmann: I hope so :)
16:48:58 dhellmann: I think he meant frequency
16:49:18 ah
16:49:54 nijaba, how would that happen?
the ops configuration management tool should detect that a config is out of date and fix it, no?
16:50:20 so my proposal is: agents are configured through traditional means, but agents get meter config from the central collector
16:50:44 dhellmann: in theory, yes, but practice has shown this to not always be so true
16:51:06 very much in the same way a mysql database has communication with its slave for internal purposes, even when it's configured at install time using puppet
16:51:21 and since this causes a real data consistency issue, this is why I am a bit pushy here
16:51:55 dhellmann: I would not trust puppet or chef to handle every use case
16:52:36 dachary, OK. Well, I trust my ops team to figure out how to make that work, but let's assume we need to have this feature for now and discuss what it might look like.
16:52:43 dhellmann: ask your ops if they trust puppet to set up a drbd cluster...
16:52:46 The proposed API seemed more complicated than necessary.
16:52:54 we use chef, but OK :-)
16:53:07 nijaba: another good example, yes.
16:53:08 dhellmann: I am very open to changes
16:53:36 I propose a 2 step system.
16:53:53 On startup, the agent "checks in" with the collector to retrieve its configuration
16:54:19 At any other time, when the configuration is changed, the collector sends the new configuration to the agent. The agent discards its existing configuration and replaces it with the new settings.
16:54:33 warning : 6 minutes left ;-)
16:54:47 we probably need to move this discussion back to the list, then
16:54:51 dhellmann: do we use a cast in that case, or directed message (which implies maintaining a list of gents?)
16:55:01 s/gents/agents/
16:55:09 nijaba, cast (assuming all agents are configured the same way)
16:55:30 I propose we move to a vote on the principle and move the discussion on the implementation to the list.
16:55:42 ok
16:55:46 dhellmann: that sounds good. Do you want me to rework my proposal, or do you want to have a stab at it?
16:55:59 nijaba, I don't expect to use this feature so maybe you should do it? :-)
16:56:16 dhellmann: k, fair enough
16:56:30 dachary, we should also discuss/vote on whether this is a Folsom feature or a G feature
16:57:14 dhellmann: time box: features which don't make it for a release are pushed to the next one....
16:57:21 dhellmann: is there a blocker for it to be a Folsom feature if someone works on it ?
16:57:30 ah
16:57:33 timebox of course ;-)
16:58:06 I wouldn't block someone else from working on it, but I don't think it should have a high priority given all of the other things we have to do for Folsom
16:58:16 makes perfect sense to me
16:58:20 I would rather see people working on pollsters and notification collection
16:58:21 dhellmann: here you find me in agreement
16:58:54 who is in agreement with the proposal that agents are configured through traditional means, but agents get meter config from the central collector ?
16:59:05 * nijaba really hopes to be able to put some effort where his mouth is
16:59:15 nijaba: you'll have to ;-)
16:59:18 +1
16:59:19 +1
16:59:23 -1
16:59:32 0
17:00:08 #agreed agents are configured through traditional means, but agents get meter config from the central collector
17:00:15 nijaba: you take action ?
17:00:41 yes, my action is to rework my proposal on the basis of what was proposed by dhellmann
17:00:53 nijaba: dash action please ;-)
17:01:19 #action nijaba to rework meter configuration proposal on the basis of discussion
17:01:27 end time guys
17:01:38 yes
17:01:40 thanks *
17:01:42 I can hear dhellmann's stomach
17:01:44 thanks
17:01:52 #endmeeting
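The two-step meter configuration flow proposed during the meeting (the agent checks in at startup, then the collector casts every later change and the agent replaces its settings wholesale) can be illustrated with the in-process sketch below. It is an assumption-laden toy, not the design that was to be worked out on the list: the `Collector` and `Agent` classes and their methods are hypothetical, and the list of listener callbacks merely stands in for a fanout cast on the message bus, where the collector would not track individual agents.

```python
class Collector:
    """Holds the authoritative meter configuration (hypothetical class)."""

    def __init__(self, meter_config):
        self._meter_config = dict(meter_config)
        # Stand-in for a fanout cast: callbacks of everything that listens.
        self._listeners = []

    def check_in(self, listener):
        """Step 1: called by an agent at startup; returns the current config."""
        self._listeners.append(listener)
        return dict(self._meter_config)

    def update_config(self, meter_config):
        """Step 2: on change, send the complete new config to all listeners."""
        self._meter_config = dict(meter_config)
        for listener in self._listeners:
            listener(dict(self._meter_config))


class Agent:
    """Agent whose meter settings come only from the collector."""

    def __init__(self, collector):
        self.meter_config = collector.check_in(self.replace_config)

    def replace_config(self, new_config):
        # Discard the old settings entirely instead of merging them.
        self.meter_config = dict(new_config)


collector = Collector({"cpu": 60})
agents = [Agent(collector), Agent(collector)]
collector.update_config({"cpu": 30, "disk": 300})
print([agent.meter_config for agent in agents])
```

Sending the full configuration each time, rather than a delta, keeps every agent on the same settings after any update, which addresses the data consistency concern raised in the discussion.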