22:05:59 #startmeeting db
22:06:00 Meeting started Thu Nov 15 22:05:59 2012 UTC. The chair is devananda. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:06:01 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:06:03 The meeting name has been set to 'db'
22:06:24 so who all has stuck around?
22:06:34 * dragondm waves
22:06:34 o/
22:06:40 0/
22:06:44 o/
22:06:54 :)
22:07:08 briefly, the action items from last week look all covered
22:07:26 #link https://review.openstack.org/#/c/15709/
22:07:38 devananda: meeting outline link?
22:07:43 ah, sorry
22:07:57 agenda is here
22:08:00 #link http://wiki.openstack.org/Meetings/DBTeamMeeting
22:08:34 sorry forgot to wave
22:08:39 i'll get the hang of using the bot eventually :)
22:08:44 hi
22:09:07 so, i posted a list of suspected race conditions (prior link)
22:09:30 devananda: I take it, it will be hard to make tests to trigger the race conditions
22:09:42 yep :(
22:10:09 i haven't managed to repro them locally as it's a matter of timing two threads issuing SELECT & INSERT at the same time
22:10:34 but it is easy to simulate by replaying the transactions recorded in slow log by hand
22:10:56 the solution to most of those should be adding UNIQUE constraints
22:11:08 which, it turns out, someone tried to do last night :)
22:12:05 however that needs work before it is done right
22:12:06 jog0: want to talk about sqlalchemy object leaks?
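
A note on the race-condition item above: the SELECT-then-INSERT pattern can be replayed by hand, much as described for the slow log. The sketch below is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than MySQL, and the `networks` table and `cidr` column are hypothetical names, not taken from the nova schema. It interleaves two workers so that both run their SELECT before either runs its INSERT, first without and then with a UNIQUE constraint.

```python
import sqlite3


def replay_race(create_table_sql):
    """Replay the bad interleaving by hand: both workers SELECT, then both INSERT."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_table_sql)
    select = "SELECT COUNT(*) FROM networks WHERE cidr = ?"
    insert = "INSERT INTO networks (cidr) VALUES (?)"
    cidr = "10.0.0.0/24"  # hypothetical value

    # Step 1: both workers check for an existing row, and both see none.
    a_sees_none = conn.execute(select, (cidr,)).fetchone()[0] == 0
    b_sees_none = conn.execute(select, (cidr,)).fetchone()[0] == 0

    # Step 2: both workers act on their (now stale) check result.
    rejected = 0
    for sees_none in (a_sees_none, b_sees_none):
        if sees_none:
            try:
                conn.execute(insert, (cidr,))
            except sqlite3.IntegrityError:
                rejected += 1  # the database itself refused the duplicate

    rows = conn.execute(select, (cidr,)).fetchone()[0]
    conn.close()
    return rows, rejected


# Without a constraint the duplicate slips in; with UNIQUE it is rejected.
rows_plain, rejected_plain = replay_race(
    "CREATE TABLE networks (id INTEGER PRIMARY KEY, cidr TEXT)")
rows_unique, rejected_unique = replay_race(
    "CREATE TABLE networks (id INTEGER PRIMARY KEY, cidr TEXT UNIQUE)")
print(rows_plain, rejected_plain, rows_unique, rejected_unique)
```

The application-level check is the same in both runs; only the constraint makes the second INSERT fail, which is why the fix belongs in the schema and not in more careful SELECTs.
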
22:12:16 devananda: sure
22:12:35 https://review.openstack.org/#/c/15450/
22:12:44 I have another version of the patch coming shortly
22:12:57 I got some good feedback from vishy on this
22:13:26 there are many places in the code where an object is used such as instance.id
22:13:32 instead of instance['id']
22:13:43 the first step will be to change all of those
22:14:14 and then we can either do decorator magic or explicitly include a function to convert away sqlalchemy objects
22:14:26 this should be doable for G2
22:14:47 sounds good
22:14:52 I am not sure if this will have any strange side effects
22:15:06 where someone assumes they get a sqlalchemy object and tries to modify the DB using it
22:15:17 but we will find out soon
22:15:28 IMNSHO, that shouldn't be happening outside of db/api. but i wouldn't be too surprised if it is ....
22:16:08 jog0: putting decorators on db/api, not db/sqlalchemy/api, yes?
22:16:27 I put the decorator in db/api but will be moving it to db.sqlalchemy
22:16:33 the latter would probably cause issues because there are many public functions right now that are called within db/sqlalchemy/api
22:16:53 that is, the same function is called both via db.api and internally via self
22:17:03 with an expectation of slightly different behavior
22:17:13 i'm working on cleaning that up with the db-session-cleanup bp
22:17:36 for now, probably best not to decorate any db.sqlalchemy.api methods that have "session=None" in the definition
22:18:26 devananda: I would like to leave it in db/api but I think that breaks the abstraction itself (no sqlalchemy outside of nova/db/sqlalchemy)
22:19:03 if the decorator is sqlalchemy-specific, then i agree
22:19:40 just saying i'm pretty sure it will break stuff if applied to all public methods in db.sqlalchemy.api right now.
which will probably help me find what i need to fix, actually :)
22:20:17 devananda: I tried applying to db.sqlalchemy.api and tons broke
22:20:30 heh
22:20:46 moving on then
22:21:12 anyone here to talk about db-common?
22:21:51 or no-db-compute?
22:22:19 russellb was here a minute ago
22:22:30 hi
22:22:43 ummmm! we had a thread in the last week on openstack-dev about the next steps on no-db-compute
22:23:09 #link http://lists.openstack.org/pipermail/openstack-dev/2012-November/002573.html
22:23:13 long live nova-conductor
22:23:26 general agreement on the direction, the name of the service changed since the original message
22:23:37 there's a patch up that creates the service
22:23:52 once that's in, we'll have a big flurry of patches for a while moving db accesses around :-)
22:24:02 great!
22:24:11 so, good progress ... i'm feeling very good about having db access out of nova-compute by grizzly-3
22:24:30 likely not far off at grizzly-2 time
22:25:14 i'll give that patch a review after this
22:25:30 vishy posted to that thread about concerns with db being blocking
22:25:33 i'm very interested to see how some of the longer-running db tasks will get refactored
22:25:36 right
22:25:48 and that a service that acts largely as a db writer is going to be held back because of that
22:25:54 so that's something to have on the radar...
22:25:57 mordred keeps telling me that moving away from the mysql C connector would help with that a lot
22:26:09 but there are apparently reasons we can't yet
22:26:10 yes, some rackspace guys tried the pure python one
22:26:20 and the problem was that it was so much slower that overall it wasn't beneficial
22:26:23 I played with the python one too
22:26:29 but that's heresay from me
22:26:33 it was slower and faster all at the same time
22:26:35 hearsay rather :-)
22:26:37 hah
22:26:44 it also broke migrations IIRC
22:26:48 some things were slower others were faster due to eventlet working better
22:27:02 * mordred would love to poke at the slower bits ... might be possible to fix the performance parts
22:27:12 I mean, the mysql protocol isn't exactly rocket science
22:27:40 russellb: when you say, db writer service will be held back, what does that mean?
22:27:59 slower/less efficient than it could be
22:28:12 ah
22:28:21 as long as any db operation is blocking, nothing else is going to be running
22:28:33 just due to the joys of eventlet and calling out to native code that may block
22:28:33 not "it wont be implemented". gotcha.
22:28:40 correct
22:29:05 so - there are some places where the python driver actually breaks currently, right?
22:29:12 can always just run more instances of nova-conductor to make up for it in the meantime, but obviously that's not long term ideal.
22:29:25 it seems like if we can get that part of the driver fixed - then we can treat the performance problems like other tuning things
22:29:44 russellb: what about running more connections from nova-conductor, since it is effectively acting as a connection-pooler
22:29:51 also not a great solution, but ...
22:29:57 comstud: were you going to look at the python mysql thing this week?
i thought i remembered that coming up on IRC earlier this week
22:30:22 i was going to look at re-adding db pool
22:30:28 devananda: i don't think that helps, eventlet literally won't switch threads to let another thread do something on another connection
22:30:31 python mysql sux perf wise
22:30:42 so how does the pool help?
22:30:46 it doesn't improve anything, and makes it worse in some cases
22:30:59 it allows DB queries to happen in parallel
22:31:22 right now all DB access is serialized
22:31:26 k, i'm trying to understand how that works if eventlet just sits around and blocks while a query is executed
22:31:28 if a DB query takes 5 seconds..
22:31:32 the whole python process is locked up
22:31:44 with a tpool... other greenthreads can run
22:32:04 ok, so i need to go look at tpool then ...
22:32:18 it's real threads
22:32:31 comstud: awesome.
22:32:36 so ok, makes sense then. :)
22:32:40 unfort eventlet still seems a bit uggly
22:32:43 buggy
22:32:46 when using it
22:32:48 but we'll see
22:33:00 :-/
22:33:01 it'll be an option, not a default
22:33:08 ok.
22:33:11 real threads ++
22:33:25 devananda: +1 :)
22:33:37 i dont see something like nova-conductor working at scale if it can't multi-thread db connections
22:33:44 agreed
22:33:48 +1
22:33:50 so we need to make it happen one way or another
22:34:16 comstud: thanks for jumping in :)
22:34:34 should there be blueprint to track this? or would this be part of no-db-compute?
22:34:42 should be its own thing IMO
22:34:57 no-db-compute just happens to make it even more important (because of how we're implementing it in the short term)
22:35:10 is that even a nova thing, though?
22:35:19 note on eventlet/gevent greenlets, I thought it was a cooperative multithreading, which means that if a greenlet is stuck in an infinite loop, everything is waiting?
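
A note on the tpool discussion above: the claim is that a blocking query freezes every greenthread unless the call is pushed into a real thread. The sketch below illustrates that behaviour without eventlet, so it is an analogy and not the nova code. It swaps in the standard library's asyncio as the cooperative scheduler and `run_in_executor` as the thread pool (eventlet's counterpart is `eventlet.tpool.execute`). `slow_query` and `ticker` are hypothetical stand-ins for a blocking native driver call and for the other greenthreads.

```python
import asyncio
import time


async def ticker(state):
    """Stand-in for other greenthreads: counts how often it gets to run."""
    while True:
        state["ticks"] += 1
        await asyncio.sleep(0.01)


def slow_query():
    """Stand-in for a blocking call into a native DB driver."""
    time.sleep(0.2)


async def inline():
    # The blocking call runs on the scheduler's own thread.
    slow_query()


async def in_thread():
    # The blocking call is handed to a real thread; the scheduler stays free.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, slow_query)


async def ticks_during(run_query):
    """Return how many times the ticker ran while the query was in flight."""
    state = {"ticks": 0}
    task = asyncio.create_task(ticker(state))
    await asyncio.sleep(0)  # give the ticker a chance to start
    before = state["ticks"]
    await run_query()
    during = state["ticks"] - before
    task.cancel()
    return during


blocked_ticks = asyncio.run(ticks_during(inline))
threaded_ticks = asyncio.run(ticks_during(in_thread))
print("ticks while blocking inline:", blocked_ticks)
print("ticks while query ran in a thread:", threaded_ticks)
```

With the inline call nothing else is scheduled until the query returns, which is the "whole python process is locked up" case; with the thread pool the ticker keeps running for the duration of the query.
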
22:35:23 well, it is for now
22:35:25 or more generally a python / eventlet / mysql-connector thing
22:35:49 would be good to prove it out in nova first
22:35:51 dkehn: eventlet is cooperative. eventlet.tpool lets you have a pool of "real" threads in addition, for calling blocking native code.
22:35:55 then can look at how to make it more general
22:36:04 dragondm, thx
22:36:22 dripton, thx, sorry dragondm
22:37:20 dkehn: threadpool puts the execution into a real thread
22:37:41 i think that's all on no-db-compute ... :-)
22:37:50 * vishy likes repeating stuff that dripton says :)
22:37:54 russellb: so, separate BP?
22:38:05 for the tpool stuff, yes, IMO
22:38:27 #action russellb to post a BP for db threadpool separate from no-db-compute
22:38:27 who feels like writing it up? :-)
22:38:29 * russellb looks at comstud
22:38:32 hehe
22:38:37 owned
22:38:45 k
22:39:03 fwiw, i'm happy to take a look into it, once baremetal stuff settles down a bit
22:39:25 k, sounds like comstud is the guy to coordinate with
22:39:45 * russellb will be busy making the problem worse
22:39:52 awesome :)
22:40:20 anyone want to jump in?
22:40:46 i'm probably forgetting stuff, but that's all that was on the agenda
22:40:51 I did a wiki page with all the read_deleted='yes' hits. http://wiki.openstack.org/ReadDeletedYesOrOnly
22:41:04 Actually removing them scares the heck out of me; it's easy but will probably break stuff.
22:41:13 wow thats a lot
22:41:58 definitely want to break up the removal into smaller logical pieces if you go forward
22:42:05 to make it easier to review that specific area of usage
22:42:18 well if we are going down the road of still soft deletes and make the deleted column unique this becomes a non-blocker correct?
22:42:32 not sure what makes the most sense chunk wise ... just a general comment :)
22:42:34 jog0: i believe so
22:42:53 If we keep the soft deletes we should be okay. I just don't know how to get rid of them in a portable fashion.
22:43:30 dripton: what about periodic task that cleans up db?
22:43:56 I am leaning towards keeping soft deletes, using UNIQUE(col, deleted), and periodic cleanup task which can be adjusted per-table by the deployer
22:44:07 devananda: +1
22:44:11 jog0: I don't know callers' expectations of how long deleted data has to hang around.
22:44:23 caveat being that i dont think anyone's written a db-agnostic cleanup task
22:44:29 some folks at HP have one for mysql
22:44:37 using events and stored routines
22:44:42 as devananda said it can be adjusted by deployer
22:45:03 if this is an optional feature if we support the common ones isn't that good enough, as long as its easily extensible?
22:45:10 so postgres and mysql
22:45:12 It's the best we can do.
22:46:01 dkehn: this might be something you are interested in?
22:46:11 Is the HP cleanup task open source? (Not that it should be hard to write from scratch.)
22:46:29 devananda, being pulled in a lot of directions at present
22:46:29 i'm not sure. Paul Carlton talked about it at grizzly, so I assume it is...
22:46:32 or could be
22:46:37 dkehn: sure, np
22:46:53 devananda, but, yes
22:47:11 dripton: but yea, it should be easy to rewrite
22:47:55 I haven't seen it but generally this would be straightforward if we are basing it on a time of life
22:48:31 dripton: from talking with Paul, I think he's willing to share their implementation
22:48:39 dripton: but it wont apply to trunk right now
22:49:33 anyone want to take that and run with it?
22:49:58 I'll take a look, if HP can post the code somewhere.
22:50:11 ^ action item?
22:50:11 jog0: Error: "action" is not a valid command.
22:50:13 The tricky bit will be figuring out how to do db-specific code without making it gross.
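
A note on the soft-delete proposal above (keep soft deletes, add UNIQUE(col, deleted), purge old rows periodically): the sketch below shows one way the pieces could fit together. It rests on several assumptions that the log does not confirm: it uses in-memory SQLite, the `flavors` table and its columns are hypothetical, live rows carry `deleted = 0`, and a soft delete sets `deleted` to the row's own id so that several deleted rows with the same name do not collide. The age-based `purge` function is likewise only a guess at what a "time of life" cleanup might look like, not the HP implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE flavors (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        deleted INTEGER NOT NULL DEFAULT 0,  -- 0 = live; row id once soft-deleted
        deleted_at REAL,                     -- epoch seconds of the soft delete
        UNIQUE (name, deleted)
    )
""")


def create(name):
    """Insert a live row; fails if a live row with this name already exists."""
    return conn.execute(
        "INSERT INTO flavors (name) VALUES (?)", (name,)).lastrowid


def soft_delete(row_id, now):
    """Mark a row deleted without removing it."""
    conn.execute(
        "UPDATE flavors SET deleted = id, deleted_at = ? WHERE id = ?",
        (now, row_id))


def purge(older_than):
    """Periodic cleanup: hard-delete rows soft-deleted before `older_than`."""
    cur = conn.execute(
        "DELETE FROM flavors WHERE deleted != 0 AND deleted_at < ?",
        (older_than,))
    return cur.rowcount


first = create("m1.small")
try:
    create("m1.small")           # a second live row with the same name
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

soft_delete(first, now=100.0)
second = create("m1.small")      # allowed again once the first is soft-deleted
soft_delete(second, now=200.0)   # two deleted rows coexist under the same name

purged = purge(older_than=150.0)  # only the row deleted at t=100 is old enough
remaining = conn.execute("SELECT COUNT(*) FROM flavors").fetchone()[0]
print(duplicate_rejected, purged, remaining)
```

The `older_than` cutoff is the knob a deployer could tune per table; the SQL used here is plain enough to run on MySQL or PostgreSQL as well, but that portability is an expectation, not something tested in this sketch.
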
22:50:35 #action devananda to find and post existing db-cleanup code
22:50:59 dripton: it'll probably have to be separate migrations for mysql and pgsql :(
22:51:10 at least i dont see another way
22:51:21 Some current migrations have if statements in them, but I don't know of a case where we have entirely separate migrations.
22:51:30 sqlalchemy-migrate does support that, though.
22:52:05 You can do migrations that are sql scripts rather than Python scripts, and then they get put in subdirectories by DB.
22:52:41 that might work
22:52:55 in any case, i'll try to get you the existing implementation as a starting point at least
22:53:21 Speaking of migrate, I'm planning to send mail to openstack-dev about alembic and backportable migrations next week. Waiting for dprince to finish his migration squashing (he said he's working on it this week) so I don't have to massively rebase.
22:54:07 great
22:54:38 i should probably take a closer look at alembic ...
22:55:57 anything else or shall we #end?
22:56:18 bye all
22:56:20 one more
22:56:25 https://blueprints.launchpad.net/nova/grizzly
22:56:37 there are 3 db bps that don't have assignees
22:57:00 I can take https://blueprints.launchpad.net/nova/+spec/db-api-cleanup
22:57:15 ++
22:57:17 but the other two need owners, and are marked as high
22:57:35 i can take db-reconnect, but i think that is, at best, G3, possibly H
22:57:47 devananda: can you assign db-api-cleanup to me
22:57:48 the code is already written
22:58:00 but the patch was denied, and should probably wait till we finish the rest of cleanup
22:58:04 jog0: will do
22:58:26 devananda: agreed about reconnect
22:58:50 that leaves db-archiving
22:59:09 I'll take db-archiving.
22:59:13 awesome
22:59:15 actually
22:59:27 should we add a new one for db-unique-keys?
22:59:36 or is that too fine grained
22:59:53 do you mean unique keys with soft delete?
22:59:56 yes
23:00:03 I think that should be a separate blueprint.
23:00:07 yes to bp.
23:00:15 It's much easier to get in than the rest of db-archiving
23:00:20 separate bp*
23:00:24 #action devananda to post db-unique-key blueprint
23:00:34 cool
23:01:07 I may also take backportable-db-migrations away from vishy, depending on whether people want to do alembic or not.
23:01:23 If not then it's simple but gross. If so then it's bigger.
23:01:52 so you want it if it's bigger, not simple and gross? :-)
23:02:12 I'll do it either way, but if I'm pushing alembic then I *have* to do it. If it's just adding padding numbers then anyone can do it.
23:02:24 makes sense
23:02:41 And I'd rather not have my name on the clearly wrong solution.
23:02:48 hah
23:03:12 Author: Hackity Hack
23:03:22 and we all whistle and ignore it
23:03:33 Alan Smithee.
23:04:31 all righty then :) looks like time to end...
23:04:36 bye again
23:04:38 #endmeeting