16:00:01 #startmeeting cinder
16:00:02 Meeting started Wed Jan 23 16:00:01 2013 UTC. The chair is jgriffith. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:05 The meeting name has been set to 'cinder'
16:00:38 Looks like we're missing winston and thingee...
16:00:59 DuncanT: Do you want to start or would you prefer to wait for Winston and Mike?
16:01:24 hmmm....
16:01:25 I can start... if you think they'll have opinions then I can postpone til they turn up
16:01:30 Fair
16:01:40 #topic snapshots and backups etc etc
16:01:48 DuncanT: How's that for a segue :)
16:01:58 hi
16:02:04 So my first question is snapshots and their lifecycle vs. the volume they came from (and go to)
16:02:29 * jgriffith ducks... this always gets messy :)
16:02:40 At the moment, if you create vol1, snap it to snap1, delete vol1 then try to clone snap1, badness happens
16:02:56 Actually right now you can't do that
16:03:16 It doesn't appear to stop me
16:03:19 the vol delete command complains that the vol is in use
16:03:20 jgriffith: exactly, you aren't allowed to delete a vol if a snap exists, I guess
16:03:33 You shouldn't be able to
16:03:36 If you can that's a bug
16:03:39 LVM will have a fit
16:03:47 Oh, maybe I've too many hacks in there, I'll try again on devstack later
16:04:14 DuncanT: If it does, file a bug :)
16:04:15 I want to be able to, either way
16:04:27 errr.... I knew you were going to say that
16:04:54 I just don't understand the big advantage of this
16:05:04 jgriffith: we want to be able to as well
16:05:05 :)
16:05:05 Customers who pay for disk space
16:05:07 Especially once we get backup to swift or xxxx in
16:05:09 I'm here
16:05:17 * jgriffith is always in the minority!
16:05:32 history: snapshots aren't snapshots, they're backups
16:05:38 but that's another topic
16:05:43 If you want to keep a golden image around for fast instance creation, you probably don't want to pay for the space twice
16:05:55 I don't think you should delete a vol when it has a snapshot
16:05:58 DuncanT: sure, so do a clone
16:06:01 You /could/ use a backup, but they are slow
16:06:03 :)
16:06:11 Not as slow as LVM snapshots :)
16:06:27 I'm not running LVM ;-)
16:06:32 we also allow the creation of volumes from volumes (is that what people mean by a clone?)
16:06:49 smulcahy: yup
16:06:52 smulcahy: That is the new clone interface, yes
16:07:49 Ok, seems I'm finally going to have to give in on this one
16:07:55 Clone is a possibility, but I don't see a strong reason for the limitations of snapshots, beyond LVM suckage
16:07:59 How about a -force option? As long as the user knows what he/she is doing.
16:08:00 (and it works on amazon)
16:08:05 but I still say it has to be consistent behavior for ALL drivers
16:08:28 DuncanT: Well I agree to a point
16:08:38 but really that was the WHOLE reason I did clones
16:08:48 Shouldn't need to be a force option
16:09:14 I think volume->snap dependency is completely driver dependent
16:09:31 clones lose the read-only-ness though
16:09:33 Hello everyone... I am Navneet from NetApp
16:09:37 guitarzan: yes, it is but I want that dependency abstracted out
16:09:44 hi navneet
16:09:49 abstracted out?
16:09:51 how so?
16:09:55 morning Navneet
16:09:56 I agree with DuncanT that immutability of a snap is an important property
16:10:01 jgriffith: how about making snapshot dependency on volumes a driver capability that can be reported back?
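
The constraint being debated above -- you can't delete a volume that still has snapshots, because (at least with the LVM driver) snapshots are copy-on-write children of the origin volume -- can be illustrated with a minimal, self-contained sketch. The names below (VolumeIsBusy, delete_volume, snapshots_by_volume) are made up for the example and are not Cinder's real API.

    class VolumeIsBusy(Exception):
        pass


    def delete_volume(volume_id, snapshots_by_volume):
        """Refuse to delete a volume that still has dependent snapshots."""
        dependents = snapshots_by_volume.get(volume_id, [])
        if dependents:
            raise VolumeIsBusy("volume %s still has %d snapshot(s): %s"
                               % (volume_id, len(dependents),
                                  ", ".join(dependents)))
        print("deleting %s" % volume_id)


    snaps = {"vol1": ["snap1"]}
    delete_volume("vol2", snaps)        # no snapshots -> delete proceeds
    try:
        delete_volume("vol1", snaps)    # snap1 depends on vol1 -> rejected
    except VolumeIsBusy as err:
        print(err)

The rest of the discussion is about whether that guard should stay universal, be relaxed per backend, or be replaced by something like refcounting.
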
16:10:01 bswartz: hi ben
16:10:13 In terms of: I don't want our API to be "if this back-end you can do this, if that back-end you can do that"
16:10:15 when the entity is immutable, you know precisely the state of your golden image
16:10:26 good morning guitarzan
16:10:27 bswartz: driver capability reporting is another subject I'd like to bring up ;-)
16:10:37 jgriffith: but it already is
16:10:41 our driver doesn't support clone
16:10:43 someone else's does
16:10:56 Right, but where this is going I think somebody is going to suggest changing that :)
16:11:04 clone is not (yet) supported by most drivers
16:11:07 but you can't force me to implement it
16:11:15 a la "allow delete volumes with snaps on *some* back-ends"
16:11:16 we support it
16:11:36 I guess I don't see the problem
16:11:42 jgriffith: it sounds like you're arguing for homogeneity and against driver capabilities
16:11:50 bswartz: +1
16:12:03 what if generic code supported some form of refcounting?
16:12:06 at least in the core api calls
16:12:07 the downside with homogeneity is that it impedes progress
16:12:11 We were talking about what the minimum feature set for a driver to be accepted is last week
16:12:38 if uncle rackspace decides that it really wants independent snapshots (and our customers definitely do) I'd have to just fork/patch cinder api to allow it
16:13:06 I mean, one could mark a volume for deletion, and space could be reclaimed when possible, depending on the driver capabilities
16:13:15 Same. We already support it (viz. a bug found last week) on hpcloud
16:13:20 alright, I'm going to lose this debate so no use continuing
16:13:23 either immediately, or when the last snap is deleted
16:13:27 We can go that route if folks would like
16:13:40 I still don't like it, and don't understand WTF we have clones for then
16:13:47 well, I'm just saying that the product specifies the features
16:14:03 Sorry for the language, I think when it's an acronym it's ok though, right? :)
16:14:10 yes :)
16:14:11 jgriffith: If we do read-only volumes in future, we can redo most of the snapshot capabilities with them
16:14:12 jgriffith: clones can be faster/more efficient than snapshot+create_vol_from_snapshot
16:14:19 guitarzan: Sure, but how is that communicated to the end user?
16:14:25 in our product documentation
16:14:28 jgriffith: I thought clones were cheap writable copies of a volume, while snaps were cheap immutable copies
16:14:31 openstack is just a means to an end
16:14:32 The end user shouldn't know anything about what the back-end is
16:14:44 JM1: yes
16:15:19 I agree that, in order for this feature to be merged, it should be made to work on the LVM backend (at a minimum) in some sensible manner
16:15:26 JM1: i like your definition
16:15:34 guitarzan: DuncanT: or are you suggesting it's a global config, and admin beware whether their back-end supports it or not?
16:15:50 DuncanT: Yeah, that can be done
16:16:00 jgriffith: I'd like the drivers to report it, ideally, rather than needing flags, but yeah
16:16:16 DuncanT: but that's what I don't like
16:16:28 I'm ok with either choice
16:16:29 DuncanT: So if some provider has multiple back-ends
16:16:35 one supports it and one doesn't
16:16:41 hi, sorry I'm late
16:16:43 how does the user know what to expect?
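
The refcounting idea floated above -- let the user-visible delete succeed immediately, and reclaim the backing space only once the last dependent snapshot is gone -- might look roughly like the toy model below. This is a sketch of the proposal, not how Cinder actually handles deletes.

    class Volume(object):
        """Toy model of 'delete now, reclaim space when the last snap is gone'."""

        def __init__(self, name):
            self.name = name
            self.snapshots = set()
            self.pending_delete = False
            self.reclaimed = False

        def delete(self):
            # The user-visible delete succeeds immediately...
            self.pending_delete = True
            self._maybe_reclaim()

        def delete_snapshot(self, snap_name):
            self.snapshots.discard(snap_name)
            self._maybe_reclaim()

        def _maybe_reclaim(self):
            # ...but the backing storage is only reclaimed once nothing
            # depends on it any more.
            if self.pending_delete and not self.snapshots and not self.reclaimed:
                self.reclaimed = True
                print("reclaiming space for %s" % self.name)


    vol = Volume("vol1")
    vol.snapshots.add("snap1")
    vol.delete()                  # volume disappears from the user's view
    vol.delete_snapshot("snap1")  # last snapshot gone -> space actually freed
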
16:16:46 The end user that is
16:16:55 that's certainly a complication
16:16:56 I prefer drivers to report it
16:16:58 Just run it and see if it fails
16:16:59 jgriffith: Ideally I'd make LVM able to do create-from-snap-of-deleted-volume by some manner
16:17:26 jgriffith: Possibly by doing some clone magic behind the scenes
16:17:36 DuncanT: Sure, but back up to my issue with different back-ends having different behavior
16:17:38 jgriffith: I can't do that for all backends though
16:17:48 DuncanT: right
16:17:50 We already have that with clone
16:17:54 Ok, we're on the same page
16:18:00 haha!!!!
16:18:08 If I use e.g. solidfire and bock, one supports clone, one doesn't
16:18:24 Well, get clones added to bock already :)
16:18:32 if every backend uses clone it can be done
16:18:37 It's on the list
16:18:46 There.. problem solved :)
16:18:53 we shouldn't force all drivers to support delete vol with snapshot
16:18:53 But going forward, some drivers will always lag others
16:19:17 It is going to happen the minute you add a new feature to the driver
16:19:33 Some mechanism to manage that seems sensible
16:19:40 then you could have a slow fallback strategy for said new feature
16:19:43 clone support is different. If it is not supported by a driver, mark it not implemented.
16:19:58 for the force flag to work, that's very specific.
16:20:28 And since some drivers have different abilities (like snap/clone/delete not needing to be run on the volume node 'owning' the volume), using the same mechanism to express capabilities would be nice
16:20:33 DuncanT: sure, but major things like "deleting" a volume should work the same always IMO
16:21:03 I just think it's bad user experience to sometimes be able to delete a volume and sometimes not, and never know why
16:21:42 The advantage of a driver expressing capabilities is that you can test in the API and give back a sensible error message, rather than a silent failure like you get with raise NotImplementedException
16:21:47 Just give your customers free or cheap snaps, you're not using any real space anyway :)
16:21:57 DuncanT: Sure, but the end user doesn't see that
16:22:07 but the volume is expensive, and they don't want it anymore!
16:22:10 what's the use case for this feature anyway?
16:22:12 The end user just knows this volume can be deleted, but this one can't
16:22:19 if you have cheap snaps, the cost of keeping the volume around should be low
16:22:22 guitarzan: I know.. I hear ya
16:22:45 jgriffith: If we add a get_capabilities call to the driver, the API can give some explanation why, rather than a silent failure
16:22:59 Like I said, you guys are going to win on this, I'm just saying there's a major concern IMO
16:23:05 DuncanT: that can make higher-level automation tedious
16:23:14 DuncanT: I understand that
16:23:17 JM1: Silent failures make it worse!
16:23:19 JM1: how so?
16:23:39 DuncanT: I never pushed for silent failures :)
16:23:50 we're not giving our customers much credit here... they're going to know what product they're buying
16:23:59 are we talking about a force flag in clone volume or delete volume? that is different
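
The get_capabilities idea could work roughly as sketched below: the driver advertises what it supports, and the API layer checks that up front so the caller gets a clear error instead of a silent failure or a bare NotImplementedError. The capability keys and driver classes here are hypothetical, purely to show the shape of the check.

    class LVMLikeDriver(object):
        capabilities = {"clone": True, "delete_volume_with_snapshots": False}


    class CloneHappyDriver(object):
        capabilities = {"clone": True, "delete_volume_with_snapshots": True}


    def api_delete_volume(driver, volume_id, has_snapshots):
        """API-level check against the driver's advertised capabilities."""
        if has_snapshots and not driver.capabilities.get(
                "delete_volume_with_snapshots", False):
            return (400, "Cannot delete %s: this backend does not support "
                         "deleting a volume that still has snapshots" % volume_id)
        return (202, "delete of %s accepted" % volume_id)


    print(api_delete_volume(LVMLikeDriver(), "vol1", has_snapshots=True))
    print(api_delete_volume(CloneHappyDriver(), "vol1", has_snapshots=True))
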
16:24:07 xyang_: that's diff
16:24:27 xyang_: We're talking about being able to delete a vol that has a snap on "some" back-ends that support it
16:24:43 * jgriffith thinks it needs to be all or nothing
16:24:47 I'd like to push towards fixing up all drivers to support the facility where possible, just like clone, but having a mechanism to express to the user where that hasn't been done yet / is impossible is also good to have
16:24:49 a force flag in delete volume makes sense
16:25:04 force flag is meaningless though
16:25:06 xyang_: We have a force flag
16:25:07 guitarzan: well what if you mix different backends? I think it's planned
16:25:14 and it doesn't have anything to do with the problem here
16:25:19 ok
16:25:21 If your backend doesn't support it, what is --force supposed to do?
16:25:25 JM1: I personally think that's less likely than others seem to think
16:25:33 DuncanT: silently fail!
16:25:57 guitarzan: Silent failures bad. Worse than just about anything else IMO
16:26:03 sorry, couldn't help it :)
16:26:25 anyway, this horse is dead
16:26:29 I'll code up a concrete patch and get comments then, I think
16:26:40 It sounds like at least some people agree with me
16:26:40 DuncanT: excellent, I'd like to see it
16:26:59 haha
16:27:08 guitarzan: dead horses
16:27:44 DuncanT: Just for the record, I don't disagree with you, I just don't like different behaviors
16:27:45 You've heard about our burgers, right? We can make good use of dead horses
16:27:51 but alas, I'll approve the patch
16:28:00 hey now... remember who you're talking to
16:28:01 :)
16:28:29 Us horsey people are funny about that sort of thing
16:28:44 jgriffith: I'll try to fix up can't-do-clone error reporting in the same patch
16:28:52 Talk to Tesco about the horses, not me
16:28:59 ah, horse breeders vs. horse eaters, always funny to watch :)
16:29:06 haha
16:29:07 :)
16:29:22 Ok... DuncanT I think you had another topic?
16:30:04 Capability reporting, but I think that will fall out in the same patch then get extended...
16:30:12 Multi AZ was my next one
16:30:20 DuncanT: I would prefer it's a separate patch
16:30:30 jgriffith: +1
16:30:37 jgriffith: Sure, ok, same patch series
16:30:42 sure
16:30:56 Plans to expose that as an admin extension?
16:31:16 I'm assuming we don't want it customer visible
16:31:22 Indeed
16:31:31 righty oh then
16:31:48 DuncanT: anything else?
16:31:55 Multi-AZ
16:32:02 #topic multi-az
16:32:06 Is anybody else doing it?
16:32:09 So I have a blueprint
16:32:18 I'm going to have to look at it sooner rather than later
16:32:33 We have to do it really, else ec2 is going to be broken
16:32:37 link?
16:32:41 Nova put it in via aggregates
16:33:11 Question 1) Should cross-az volume mounting work?
16:33:30 DuncanT: I don't think EC2 allows it, no
16:33:35 (Nova's AZ support is broken in places - this is being looked at)
16:33:59 guitarzan: https://blueprints.launchpad.net/cinder/+spec/aggregates-in-cinder
16:34:03 Good, because as ideas go, allowing it sucks ;-)
16:34:03 Not much there yet
16:34:08 hehe
16:34:20 cool, thanks
16:34:32 Everybody feel free to provide input on this one
16:34:53 I'm getting a handle on AZs and now aggregates, but honestly I dislike all of the above :)
16:35:16 Not on a philosophical or implementation level or anything, I just find it a bit tedious :)
16:35:20 Synchronising the default AZ in nova with the default AZ in cinder, particularly if you want that to vary per tenant for load balancing, is an issue, though probably not a major one (e.g. let keystone pick it)
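
The cross-AZ question above reduces to a check like the one sketched here: with plain availability zones each side has exactly one zone and the test is simple equality, while with Nova's aggregate-based model a node can effectively belong to several zones, so the test becomes whether the two sides share at least one. A hypothetical helper, not the actual Nova or Cinder code.

    def can_attach(instance_zones, volume_zones):
        """True if the instance and the volume share at least one AZ.

        With classic AZs each side has exactly one zone, so this reduces
        to a straight equality check; with aggregates a host may belong
        to several zones at once.
        """
        return bool(set(instance_zones) & set(volume_zones))


    print(can_attach({"az-1"}, {"az-1"}))                   # True: same zone
    print(can_attach({"az-1"}, {"az-2"}))                   # False: cross-AZ attach rejected
    print(can_attach({"az-east", "rack-12"}, {"az-east"}))  # True: shared zone via aggregates
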
16:35:31 jgriffith: could you elaborate a little bit on this BP?
16:35:37 DuncanT: That's actually the point of that BP
16:35:47 winston-d_: sure
16:36:06 As DuncanT just pointed out, AZs between Nova and Cinder need to be synched
16:36:29 In other words, instance and volume have to be in the same AZ for attach to work
16:36:41 right
16:36:47 We cheated before and just said, admin sets the AZ
16:36:58 But now it's not going to be quite so simple
16:37:15 The changes in Nova will allow aggregates, which results in multiple AZs for a single node
16:37:19 in essence
16:37:25 Also, it's now a bit dynamic
16:37:33 what about cells? How do cells affect this?
16:37:35 well, not a *bit*, it is
16:37:47 cells are different, and I have chosen to ignore them :)
16:37:53 cells shouldn't make much difference I think
16:37:57 xyang_: i don't think so. cells are transparent to end-users
16:38:08 storage location isn't very dynamic by nature
16:38:23 JM1: well yes and no
16:38:48 JM1: You may start with an AZ based on a data center, then break it down to racks
16:38:52 then break it down to PDUs
16:38:54 etc etc
16:39:07 do people expect such small AZs?
16:39:17 although folks like guitarzan and DuncanT would have better use case descriptions than I
16:39:20 jgriffith: so you want cinder to be able to support aggregates?
16:39:28 they actually deal with this stuff on a daily basis :)
16:39:35 winston-d_: correct
16:39:35 In my experience, AZs for storage never get smaller than a rack -- usually a group of racks
16:39:41 winston-d_: I think it's going to have to
16:40:02 So we have 3 AZs in a datacenter, each in their own fire-cell and independent core switching etc
16:40:04 bswartz: Yeah, but doesn't the netapp cluster take up an entire rack :)
16:40:20 jgriffith: lol
16:40:24 bswartz: :)
16:40:53 So DuncanT, does this line up with what you were wanting to talk about?
16:41:07 so even with small AZs, data is either here or there, and has to move somehow
16:41:21 it's still not very dynamic I think
16:41:41 JM1: the dynamic piece is the ability to create/change names at any time
16:41:43 jgriffith: Yes, though I'm not yet massively familiar with aggregates vs. cells vs. AZs - I can grab a nova person and get them to educate me though
16:42:02 jgriffith: It sounds like you're heading in the same general direction though, which is good
16:42:11 jgriffith: ah ok, that sounds more realistic
16:42:13 DuncanT: so what are your use cases for multiple-AZ?
16:42:28 Basically just making sure we can sync with Nova and not break
16:42:39 sorry.. that was for DuncanT regarding my direction
16:42:51 winston-d_: Multiple firecells within a datacentre... scalability beyond a few hundred compute nodes
16:43:04 jgriffith: Yeah, but we're also evolving nova at the same time ;-)
16:43:11 and as an aside, I don't think rackspace uses AZs
16:43:15 DuncanT: Yeah, that's the problem :)
16:43:22 guitarzan: oh... really?
16:43:23 we just use cells to scale out, and regions for datacenters
16:43:31 Yay!
16:43:38 Can you connect storage across cells?
16:43:41 DuncanT: yes, that i can understand.
16:43:43 well... doesn't the region correlate to an AZ?
16:43:47 DuncanT: yes
16:43:56 jgriffith: I don't know, maybe?
16:43:59 guitarzan: aren't cells really just a tag?
16:44:11 i.e. metadata
16:44:24 So our idea with multiple AZs in a region/datacentre is that if one goes down, the other is pretty much independent
16:44:29 it's passing messages through different queues
16:44:39 but you'll have to talk to comstud to find out what all it entails :)
16:44:39 jgriffith: no, my understanding is a cell is a small cloud with almost everything: API/DB/scheduler/ComputeNodes
16:45:01 winston-d_: haha.. that's what I thought of as an AZ :)
16:45:03 DuncanT: isn't that a region within a region?
16:45:06 winston-d_: api calls tend to go to the parent cell
16:45:19 alright, I think we're all going to get much smarter on cells and AZs in the coming weeks
16:45:34 guitarzan: that's right, but as i said, cells are transparent to end users.
16:45:41 yep
16:45:44 Anything else on the topic of AZs, cells etc?
16:45:47 JM1: Pretty much, currently each of our AZs has its own set of endpoints, which we don't really like
16:45:48 and a child cell can have its own child cell
16:46:19 #topic G3
16:46:23 jgriffith: I'll probably have more in future, for now I need to get reading up on nova approaches a bit more
16:46:51 DuncanT: sounds like we all need to do that homework
16:46:53 :)
16:47:03 I've updated the G3 targets again
16:47:17 https://launchpad.net/cinder/+milestone/grizzly-3
16:47:30 There's an awful lot going on again :)
16:47:40 Big chunk of it is new drivers
16:47:45 sorry :)
16:48:05 :)
16:48:19 jgriffith: need to add cinder protocol enhancements aka 'nas as service'
16:48:20 I'll probably have another small driver + driver update for new API and bug fixes
16:48:49 rushiagr1: the BP is there, we can target it if you think you'll have something in the next couple weeks
16:48:49 rushiagr1: +1
16:49:03 How are things going on that BTW?
16:49:11 Haven't heard anything in a bit
16:49:18 jgriffith: we're aiming to be COMPLETE on 2/14 with some WIP submitted before then
16:49:31 2/14 with some WIP?
16:49:39 Cutting it kinda close for a large change, isn't it?
16:50:15 bswartz: that sounds good if you guys can hit it
16:50:26 jgriffith: you yourself said a week before the milestone would be okay as long as the overlap with existing code was minimized
16:50:35 jgriffith: a WIP version before 2/14
16:50:39 we're doing the best we can to make it available before then
16:50:41 That was when the milestone was G2 :)
16:50:46 bswartz: I hear ya
16:50:51 bswartz: I think that's fine
16:50:59 bswartz: I'm just saying be kind to us
16:51:14 I've got patches that are only in the teens in terms of lines
16:51:22 folks haven't been able to review them in a week
16:51:34 jgriffith: Still waiting for some core nova reviewers (russellb and/or vishy) to look at the FC nova changes https://review.openstack.org/#/c/19992/ - only one review since we put the review in on Jan 17th
16:51:35 Your patch is going to be significantly larger :)
16:51:42 we have also submitted NetApp direct drivers for review
16:51:54 we'll help out with reviews from our side
16:52:00 Navneet: Yeah, I'm working on that one
16:52:08 Navneet: are those done or going to be revised once more?
16:52:14 bswartz: how about helping with reviews that are non-netapp?
16:52:23 jgriffith: absolutely
16:52:30 bswartz: awesome
16:52:36 ok.. that all sounds good
16:52:46 I'm looking forward to what you guys came up with
16:52:47 jgriffith: I would be there for reviews too... sorry, wasn't able to dedicate time in the first half of this week
16:52:50 jgriffith: that's final from us... it contains clone vol and filter scheduler related capabilities
16:52:55 Navneet, rushiagr1: please review stuff from https://launchpad.net/cinder/+milestone/grizzly-3
16:52:58 remember, if you can share the source before review, even better
16:52:58 jgriffith: we're absolutely interested in doing so
16:53:03 i.e. github or draft etc
16:53:34 bswartz: Navneet, rushiagr1: actually if you could just poke around here: https://review.openstack.org/#/q/status:open+cinder,n,z
16:53:54 esker: cool.. thanks
16:54:02 We can use all the review help we can get
16:54:14 jgriffith: haven't been doing reviews lately, will catch up
16:54:17 ok
16:54:17 kmartin: when do we get the cinder FC patches? is it waiting on nova getting merged?
16:54:26 That link I gave shows all open cinder patches
16:54:49 and don't forget there are cinderclient patches too :)
16:55:15 avishay: yes, that was the plan
16:55:26 kmartin: ok thanks
16:55:36 Ok folks, we've got 5 minutes til the other John kicks us out :)
16:55:44 #topic open discussion
16:55:47 Can winston-d give an update on the volume stats changes that we talked about last week?
16:55:56 jgriffith: sure. Would take some more time before I start cinderclient reviews
16:55:58 winston-d_: ^^
16:56:52 jgriffith: the latest submission from us is large in size... we'd request review comments soon so we can get it in
16:57:11 kmartin: i'm working on RetryFilter, should be able to submit this week. by then the scheduler will be able to handle those 3 cases we talked about last week.
16:57:39 winston-d_: thanks.
16:57:48 winston-d_: speaking of which, any chance you can throw out some info on how to configure the filters, and how we might do custom filters in the future?
16:57:59 Also how to set up back-end filters :)
16:58:18 maybe a wiki?
16:58:31 design theory, control flow type thing?
16:58:39 jgriffith: sure. my documentation plan is pending, i have something already but it's not complete yet.
16:58:48 winston-d_: Ok, great!
16:58:56 alright.. anybody else have anything?
16:59:02 we've got two minutes :)
16:59:08 DOH one minute
16:59:12 When you send a request to cinder, it returns a unique request id - presumably for support/diagnostics
16:59:21 this is currently called X-Compute-Request-Id
16:59:26 should it be called something else?
16:59:29 jgriffith: sure. if you like i can even do a session on filter/weigher development at the next summit. :)
16:59:37 winston-d_: +1
16:59:43 winston-d_: fantastic idea!
16:59:54 jgriffith: any thoughts on safely disconnecting iscsi connections w/ multiple LUNs/target?
16:59:58 and currently there is very little difference between /volumes and /volumes/detail - is that intentional or are there plans to strip down the output of /volumes?
17:00:01 winston-d_: +1
17:00:08 winston-d_: +1, but we would like to get it in Grizzly :)
17:00:09 smulcahy: probably, I'll have to look
17:00:36 smulcahy: is that the same header name as from compute?
17:00:50 avishay: I think you're the only one with multiple luns per target
17:00:55 jgriffith: I'm guessing something like X-Volume-Request-Id or maybe just X-Request-Id
17:01:00 kmartin: yup, all basic filter scheduler features to make cinder work will get into G, and more advanced filters/weighers can come later.
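
As a rough preview of the filter/weigher documentation winston-d is planning, a custom scheduler filter could look something like the sketch below. The base-class name and the shape of host_state and filter_properties are assumptions for illustration; the authoritative interface is whatever the Cinder filter scheduler actually defines.

    class BaseHostFilter(object):
        def host_passes(self, host_state, filter_properties):
            raise NotImplementedError()


    class EnoughFreeSpaceFilter(BaseHostFilter):
        """Only pass backends reporting enough free capacity for the request."""

        def host_passes(self, host_state, filter_properties):
            requested_gb = filter_properties.get("size", 0)
            return host_state.get("free_capacity_gb", 0) >= requested_gb


    hosts = [{"host": "lvm-1", "free_capacity_gb": 50},
             {"host": "lvm-2", "free_capacity_gb": 500}]
    flt = EnoughFreeSpaceFilter()
    print([h["host"] for h in hosts if flt.host_passes(h, {"size": 100})])
    # -> ['lvm-2']
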
17:01:06 xyang_ has also :)
17:01:08 We also have multiple luns per target
17:01:10 guitarzan: didn't check but I'm guessing we inherited it in the move from nova
17:01:11 might be easiest to just override the disconnect, but I honestly never got back to it
17:01:27 LVM supports it too
17:01:31 winston-d_: ok great, looking forward to it
17:01:31 smulcahy: sure, but since those request ids are the same across the service, it might not make sense to use different header names
17:01:41 avishay: That would be a band-aid at least
17:02:03 guitarzan: smulcahy: IIRC there was a reason I left those headers
17:02:10 guitarzan: right, so maybe then something service-neutral like X-Request-Id?
17:02:18 smulcahy: no argument about that from me :)
17:02:22 xyang_: can you explain about LVM to me and jgriffith?
17:02:23 and I believe it had something to do with attach/detach as well as avoiding some keystone work
17:02:26 but not sure
17:02:38 not a big issue, just thought I'd mention it
17:02:40 avishay: sure, meet me in #openstack-cinder
17:02:44 Ok we're out of time
17:02:47 thanks everyone
17:02:52 #openstack-cinder
17:02:59 and reviews, reviews, reviews!!!!!
17:03:03 #endmeeting