16:00:40 <jgriffith> #startmeeting cinder
16:00:40 <openstack> Meeting started Wed May 27 16:00:40 2015 UTC and is due to finish in 60 minutes.  The chair is jgriffith. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:43 <DuncanT> hi
16:00:44 <openstack> The meeting name has been set to 'cinder'
16:00:46 <opencompute> hi
16:00:49 <tbarron> hi
16:00:50 <kmartin> o/
16:00:54 <Yogi2> hi
16:00:56 <jgriffith> We've got a pretty full agenda so let's get on with it
16:01:00 <dannywilson> hi
16:01:05 <e0ne> #link https://wiki.openstack.org/wiki/CinderMeetings#Next_meeting
16:01:06 <jgriffith> #link https://goo.gl/XG062E
16:01:19 <jgriffith> e0ne: :) thanks
16:01:36 <jgriffith> reminder: please put your name next to your proposed topic
16:01:56 <jgriffith> #topic Live-Migration changes
16:02:04 <jgriffith> hemna: I'm assuming this is you?
16:02:24 <jgriffith> hemna: ?
16:02:37 <jgriffith> giving hemna 30 seconds, then moving on and we'll come back
16:03:00 <jgriffith> 15 more seconds...
16:03:10 <geguileor> Hi
16:03:12 <dulek_home> Can we get back to it at the end? Guys from my team are in a traffic jam and will be able to discuss helping with Nova stuff in 30 minutes.
16:03:18 <jgriffith> Ok, we'll come back *if* we have time
16:03:24 <dulek_home> :)
16:03:25 <kmartin> hemna, is presenting to SNIA right now
16:03:26 <jgriffith> #topic Cinder internal tenant
16:03:31 <jgriffith> patrickeast: you're up
16:03:31 <patrickeast> hi
16:03:40 <patrickeast> so this got brought up a bit at the summit
16:03:49 <patrickeast> for fixing the hidden volumes problem
16:04:01 <patrickeast> i have some mention of it in this spec https://review.openstack.org/#/c/182520/
16:04:03 <mtanino> hi
16:04:03 <patrickeast> for the image cache
16:04:29 <patrickeast> after talking with some folks over the last couple days i wanted to bring it up at a meeting and make sure there wasn’t any strong resistance
16:04:43 <DuncanT> Seems like a good idea to me
16:04:56 <patrickeast> and it looks like there is a review up proposing a hidden flag to volumes, so i think we need to maybe make a decision on which direction we go
16:05:05 <patrickeast> and unify all the various efforts on that approach
16:05:15 <patrickeast> questions? comments? concerns?
16:05:42 * DuncanT prefers a special tenant to the hidden flag
16:05:51 <jgriffith> patrickeast: so I need to read your spec more carefully, but i mentioned the other day another spin
16:05:51 <jungleboyj> o/  Here now.  Sorry.
16:06:01 <tbarron> internal tenant seems potentially useful for lots of stuff
16:06:06 <jgriffith> patrickeast: "special tenant" and public snapshots
16:06:18 <jgriffith> patrickeast: but I know the public snapshots is contentious
16:06:34 <e0ne> DuncanT: +1 for special tenant
16:06:36 <jgriffith> partially because of my own statements :)
16:06:46 <patrickeast> jgriffith: hehe yea id rather not tie special tenants to doing public snapshots, more like if we go down that road we could use the special tenant
16:06:50 <jgriffith> Anybody object to internal tenant?
16:06:58 <tbarron> patrickeast: +1
16:07:22 <jgriffith> patrickeast: well... the problem IMO is your spec is actually to "solve" the image-caching issue
16:07:34 <jgriffith> patrickeast: not "should we do special tenant"
16:07:48 <geguileor> +1 to special tenant
16:07:51 <patrickeast> jgriffith: yea i was wondering about that… maybe we should split it out?
16:08:05 <jgriffith> I think we're all agreed on the tenant idea so that's great
16:08:06 <xyang2> special tenant can be used for temporary volume, temp snapshot, and image cache.  fine with me
16:08:15 <jgriffith> patrickeast: might be good to split it out
16:08:27 <jgriffith> patrickeast: for the sake of clarity and "argument" :)
16:08:35 <patrickeast> sounds good
16:08:38 <jgriffith> xyang2: +1
16:08:43 <jungleboyj> Sounds like something that can be used by multiple people.
16:08:48 <jungleboyj> xyang2: ++
16:08:49 <e0ne> xyang2: +1
16:08:54 <patrickeast> i’ll make a new bp and spec that we can use as a dependency for the other ones that would need it
16:08:56 <jgriffith> We need to be careful though
16:09:01 <rajinir> Like the idea of a special internal tenant. Seems like it could be multi-use
16:09:02 <jgriffith> and VERY specific on its usage
16:09:14 <jgriffith> It's easy for something like this to become HORRIBLY abused!!!!
16:09:17 <patrickeast> jgriffith: +1
16:09:25 <cebruns> jgriffith: +1
16:09:27 <jgriffith> including circumventing things like quotas in public clouds etc
16:09:29 <kmartin> agree, the special tenant would be useful
16:09:30 <patrickeast> we are shooting for just the right amount of abuse
16:09:34 <jgriffith> or private for that matter
16:09:42 <jgriffith> so let's clarify...
16:09:58 <jgriffith> I propose we're very specific and it's NOT just a "special" tenant
16:10:02 <ameade> o/
16:10:07 <jgriffith> which can be anything anybody wants it to be
16:10:23 <jgriffith> in this case I suggest it's something like "cinder-image" tenant
16:10:43 <DuncanT> +1
16:10:45 <jgriffith> and it's specifically for image caching and management, nothing more
16:11:02 <patrickeast> oh that brings up another thing i wanted some feedback on, should we have multiple of these? one for image caching, or for migration helping, etc?
16:11:06 <e0ne> agree about caching
16:11:06 <jgriffith> if other valid use cases come up we can adjust and deal with them
16:11:17 <e0ne> jgriffith: what type of management do you mean?
16:11:18 <xyang2> jgriffith: I also need it for non-disruptive backup
16:11:23 <jgriffith> patrickeast: so that's the rat-hole I'm hoping to avoid here
16:11:28 <xyang2> jgriffith: vincent needs it for migration
16:11:34 <jgriffith> e0ne: so my use-case is something like this;
16:11:42 <patrickeast> jgriffith: yea, but we already have 3(?) use cases wanting it
16:11:45 <geguileor> If we create specific users we'll end up with a bunch of them
16:11:58 <jgriffith> image-tenant creates bootable volumes from glance images on some sort of periodic schedule
16:12:03 <e0ne> we need to specify all use-cases in spec
16:12:13 <jgriffith> provides public snapshot or volume to "other" tenants
16:12:16 <DuncanT> Specific users makes figuring out WTF is going on easier
16:12:26 <jgriffith> DuncanT: +1
16:12:44 <jgriffith> DuncanT: so this could turn into a SERIOUS mess if we're not very careful and explicit
16:12:59 <jgriffith> start throwing around migration blah blah blah and we're pretty well sunk IMO
16:13:12 <DuncanT> Indeed, it is already starting to feel like a new nail...
16:13:28 <jgriffith> IMO the image-tenant is just a sort of "glance proxy" to start with
16:13:30 <jgriffith> that's all
16:13:42 <cFouts> o/
16:13:54 <patrickeast> i agree 100% that we don’t want to mis-use this, but if we don’t then we end up with a hidden flag on the volume table *and* special tenants
16:14:06 <jgriffith> patrickeast: ?
16:14:17 <patrickeast> we can’t just exclude migrations or whatever else
16:14:26 <jgriffith> patrickeast: sure we can
16:14:39 <jgriffith> patrickeast: I don't see what migrations has to do with the topic?
16:14:39 <jungleboyj> jgriffith: To be clear, you don't want to have a general user, you want a specific user for a specific purpose.
16:14:41 <patrickeast> right, but then we get https://review.openstack.org/#/c/185857/
16:14:52 <jungleboyj> We can start with image-tenant and then expand.
16:14:58 <patrickeast> the whole point of the special tenant is to avoid a hidden flag on the volume table
16:15:02 * flip214 was reminded to act as timekeeper again....   so, ¼ of the time is gone.
16:15:04 <patrickeast> thats why it came up in the first place
16:15:15 <patrickeast> maybe we can discuss after the meeting more
16:15:22 <patrickeast> i don’t want to hog all of the time
16:15:22 <jgriffith> patrickeast: there... that's fixed
16:15:36 <e0ne> jungleboyj: looks like we need to start using trusts from the keystone api v3
16:15:53 <xyang2> jgriffith, DuncanT: I do need to create temp volume and snapshot for the non disruptive backup case, so either I need a hidden flag or a cinder tenant for it.  I thought at the summit, cinder tenant was the preferred approach
16:15:55 <e0ne> to make user management more fine-grained
16:16:32 <jungleboyj> xyang2:  ++
16:16:44 <geguileor> Yes cinder tenant was preferred because it didn't mean changes to Quota
16:16:48 <geguileor> among other things
16:16:50 <jgriffith> xyang2: So that may in fact be something that this idea expands to, although I don't understand why it has to be "hidden"
16:17:13 <xyang2> jgriffith: it is a temp snapshot, we don't want users to do operations on it
16:17:16 <jgriffith> xyang2: so online backup requires creation of a snapshot... no problem IMO
16:17:31 <jgriffith> xyang2: well, keep in mind users can't really *do* anything with snapshots anyway :)
16:17:38 <DuncanT> jgriffith: It's the vol from snap for legacy drivers that is the issue
16:17:42 <jgriffith> xyang2: and frankly what would they *do*
16:17:45 <DuncanT> jgriffith: The snap is fine
16:17:58 <DuncanT> jgriffith: But we need to create a hidden volume
16:18:04 <jgriffith> DuncanT: so mark it as R/O
16:18:04 <xyang2> jgriffith, DuncanT: they can still list it.  is that okay?
16:18:18 <jgriffith> frankly I don't care if they can list it
16:18:20 <xyang2> DuncanT: ya, hidden volume is another issue
16:18:36 <DuncanT> jgriffith: quota? Them deleting it in the middle of the operation?
16:18:45 <jgriffith> DuncanT: ummm
16:18:54 <jgriffith> DuncanT: quota = deal with it
16:19:10 <jgriffith> DuncanT: deleting in the middle of an operation we check that sort of thing in the API all the time
16:19:15 <jungleboyj> jgriffith: I think you do care.  People get confused if they see volumes show up that they didn't create.
16:19:18 <jgriffith> DuncanT: if !available raise
16:19:20 <DuncanT> jgriffith: Volumes coming out of nowhere was shown to be a very confusing UI for migrations, I don't think it is going to get any less confusing
16:19:26 <jgriffith> Ok
16:19:32 <xyang2> DuncanT, jgriffith: right, quota is the other issue; that's why we thought cinder tenant is preferred
16:19:36 <jungleboyj> Maybe it was DuncanT that really cared, but I think we should care.
16:19:46 <tbarron> jungleboyj: +1
16:19:49 <jgriffith> I'll leave it to you all to sort out then, but I suspect you're not going to like the end result :(
16:20:02 <e0ne> jungleboyj: good point
16:20:06 <jgriffith> jungleboyj: and no, I don't care on that particular item
16:20:34 <jungleboyj> jgriffith: Ok, must have been DuncanT
16:20:40 <jgriffith> jungleboyj: people get "more" confused when there's invisible things happening and stuff fails and they have zero idea why
16:20:59 <jgriffith> Ok...
16:21:04 <DuncanT> So we see what it looks like in code having a backup tenant too... it won't be used for any driver that is updated to be able to backup snaps directly, so it is hopefully temporary
16:21:16 <jgriffith> so it sounds like everybody is on board with "special" tenants
16:21:17 <xyang2> jgriffith: what if it is visible to admin only, just not to regular tenant?
16:21:29 <jgriffith> I'll let everybody else argue about where/how they can be used
16:21:35 <patrickeast> ok so… heres my proposal, i’ll write up a spec for the special tenant and put as many of these use cases on there as i can
16:21:43 <jungleboyj> xyang2: ++  We need the tenants for that though, right?
16:21:46 <patrickeast> we can hash out which of them are ‘valid’ or not in the spec review
16:21:54 <xyang2> jungleboyj: yes
16:21:55 <geguileor> patrickeast: +1
16:21:58 <jungleboyj> patrickeast: ++
16:22:07 <jgriffith> xyang2: sure, maybe that works
16:22:10 <jgriffith> ok
16:22:23 <jgriffith> so patrickeast did we at least cover the main points for you to move forward?
16:22:36 <patrickeast> jgriffith: haha yea, i think everyone seems to be on board
16:22:43 <jgriffith> cool
16:22:45 <patrickeast> just a matter of figuring out exactly how we use them
16:22:51 <jgriffith> :)
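(A rough sketch of the sort of thing being agreed on above: a dedicated Cinder-internal tenant whose credentials come from config, used to own cached images and other internal resources. This is a minimal illustration only — the option names and helper function are assumptions, not anything decided in the meeting or in patrickeast's spec.)

```python
# Illustrative sketch only: build a RequestContext for a dedicated
# Cinder-internal tenant so cached images / temp volumes are owned by
# Cinder itself rather than hidden inside a user's tenant.
# Option names here are invented for illustration.
from oslo_config import cfg

from cinder import context

CONF = cfg.CONF
CONF.register_opts([
    cfg.StrOpt('internal_tenant_project_id',
               help='ID of the project (tenant) that owns Cinder-internal '
                    'resources such as cached images.'),
    cfg.StrOpt('internal_tenant_user_id',
               help='ID of the user acting on behalf of that project.'),
])


def get_internal_tenant_context():
    """Return a context for the internal tenant, or None if unconfigured."""
    if not (CONF.internal_tenant_project_id and
            CONF.internal_tenant_user_id):
        return None
    return context.RequestContext(
        user_id=CONF.internal_tenant_user_id,
        project_id=CONF.internal_tenant_project_id,
        is_admin=True,
        overwrite=False)
```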
16:22:55 <jgriffith> #topic Thin provisioning volume capacity consideration in filter scheduler
16:23:11 <jgriffith> xyang2: you're up
16:23:14 <xyang2> is winston here?
16:23:15 <xyang2> ok
16:23:27 <xyang2> this was brought up by patrickeast
16:23:45 <xyang2> so currently we deduct the size of the new volume for thin provisioning from free capacity
16:23:50 <tbarron> xyang2: winston said he had a conflict today
16:24:02 <xyang2> this is the conservative approach we started with in the design
16:24:23 <xyang2> the concern is that for a thin volume, the space is not actually consumed when it is first provisioned
16:24:47 <xyang2> the proposed patch allows two ways of handling this
16:25:09 <xyang2> a flag was added, if it is true, we deduct the size of the new volume
16:25:19 <xyang2> if it is false, we don't deduct it
16:25:28 <xyang2> anyone have an opinion on this
16:25:37 <patrickeast> xyang2: so i think there are actually two things here, my bug originally wasn’t even for that issue (and now our conversation yesterday makes more sense)
16:25:37 <xyang2> ?
16:25:39 <jgriffith> xyang2: well I do of course :)
16:25:54 <xyang2> jgriffith: go ahead
16:26:02 <patrickeast> the bug https://bugs.launchpad.net/cinder/+bug/1458976 is that you can create 100 2TB devices on a 100TB backend but not 1 200TB volume
16:26:02 <openstack> Launchpad bug 1458976 in Cinder "cannot create thin provisioned volume larger than free space" [Undecided,In progress] - Assigned to Xing Yang (xing-yang)
16:26:14 <patrickeast> then there is the issue of as you create thin volumes
16:26:18 <patrickeast> it eats up ‘free’ space
16:26:22 <patrickeast> until the next stats update
16:26:40 <jgriffith> xyang2: well, I've always been of the opinion that we need to quit pontificating and screwing around with how we report capacities
16:27:15 <jgriffith> xyang2: that means "available" which uses the over-prov ratio you implemented
16:27:21 <jgriffith> in the case of thin
16:27:34 <jgriffith> and distinguish between allocated and provisioned
16:27:48 <jgriffith> I've proposed this no less than half a dozen times in the last 18 months
16:27:54 <guitarzan> that's a crazy bug report :)
16:28:34 <guitarzan> thin provisioning is a crazy pit I'm glad I'm not jumping into :)
16:28:37 <xyang2> jgriffith: sorry, I don't think I completely follow you:(.  what is your suggestion?  by the way, our definition may be a little different
16:29:01 <jgriffith> xyang2: I'm sure our definitions are different which has always been the problem
16:29:14 <jgriffith> xyang2: everybody wants "their" definition and won't compromise on anything
16:29:33 <jgriffith> xyang2: so my proposal is and has been just report the same way the reference LVM thin driver does
16:29:42 <jgriffith> xyang2: allocated vs actual
16:29:58 <jgriffith> xyang2: and calculate deltas to report and schedule placement
16:30:15 <jgriffith> xyang2: so if you have thin provisioning and a backend with 100G of free space
16:30:37 <jgriffith> it reports free-space * over-prov-ratio
16:31:04 <jgriffith> and free-space = physical - allocated
16:31:14 <jgriffith> allocated is "actual" blocks used
16:31:30 <jgriffith> xyang2: make sense?
16:32:19 <bswartz> jgriffith: the term "allocated" is problematic because it doesn't match the definition of allocated_space in cinder -- I understand what you mean though
16:32:21 <xyang2> jgriffith: so "allocated" means actually used capacity.  I think that is how free is calculated currently, just the term is not the same
16:32:53 <patrickeast> so to make sure i understand if you had 100G of free space, and a 2.0 ratio, you could place a 200G thin volume, right?
16:32:59 <patrickeast> with what you described
16:33:10 <jgriffith> patrickeast: correct
16:33:14 <patrickeast> perfect
16:33:24 <patrickeast> thats what my bug report is for… we can’t do that today
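(A back-of-the-envelope version of the math just described, using patrickeast's confirmed numbers. The function name and exact formula are illustrative assumptions; as the rest of the discussion shows, the precise accounting was still being argued.)

```python
# Illustrative only: "apparent" free space for a thin-provisioned backend.
# With 100G physically free and a 2.0 over-provisioning ratio, a 200G
# thin volume should be placeable -- which is exactly what the current
# capacity filter rejects (bug 1458976).
def apparent_free_gb(physical_gb, allocated_gb, over_prov_ratio):
    free = physical_gb - allocated_gb   # actual blocks still unused
    return free * over_prov_ratio       # virtual/effective free space

assert apparent_free_gb(100, 0, 2.0) >= 200  # patrickeast's example passes
```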
16:33:34 <jgriffith> bswartz: yeah, so the other suggestion for the "name" conflict was apparent or effective
16:33:45 <jgriffith> bswartz: specifically for the scheduler
16:34:20 * flip214 mentions that half of the time is gone.
16:34:28 <bswartz> I think using different terms in different places is part of what leads to the madness and misunderstandings
16:34:53 <jgriffith> flip214: thanks for the reminder sir
16:35:09 <flip214> jgriffith: no problem. glad to be of service!
16:35:11 <xyang2> bswartz, jgriffith: do we really want to start discussing the terms now?  I added a whole section on terms in the cinder spec
16:35:16 <jgriffith> bswartz: well, for an end user allocated should not take into account anything to do with thin
16:35:24 <jgriffith> :)
16:35:32 <jgriffith> xyang2: thanks!  You saved me
16:35:34 <tbarron> xyang2: +1
16:35:37 <jgriffith> and my blood pressure
16:35:49 <bswartz> I just wanted to point out that we don't want to argue about what the terms should mean, we want whatever the terms are to be crisply defined so there is no confusion
16:36:05 <xyang2> jgriffith: ok, I don't think I explained the problem clearly.
16:36:07 <patrickeast> so… terms and formulas aside, i think the original topic isn’t for *how* we calculate the virtual space
16:36:21 <xyang2> jgriffith: there's definitely a bug that reported by patrick
16:36:27 <jgriffith> xyang2: agreed
16:36:29 <xyang2> jgriffith: and I want to fix it.
16:36:40 <jgriffith> xyang2: yes, and that's FANTASTIC
16:36:51 <xyang2> jgriffith: the question is whether we want to also preserve the existing behavior
16:36:57 <jgriffith> xyang2: I'm proposing that we fix it by having the drivers report capacities in a way that isn't stupid
16:37:11 <jgriffith> xyang2: which frankly right now they kinda are
16:37:20 <patrickeast> the bug is a flaw in how we do the filter logic, not the virtual space
16:37:28 <patrickeast> or at least i see it as a flaw
16:37:30 <xyang2> jgriffith: it is not the driver here actually.  the filter scheduler deducts the volume size
16:37:34 <winston-d> patrickeast: +1
16:37:41 <jgriffith> patrickeast: it can be addressed on either side
16:37:50 <jgriffith> patrickeast: and the scheduler may be the right place
16:38:02 <jgriffith> patrickeast: xyang2 my thing was I didn't like the fix and adding a flag etc
16:38:04 <patrickeast> jgriffith: nono, the problem is that the filter doesn’t ever get to the virtual capacity stuff as-is
16:38:11 <tbarron> the issue is about filter scheduler behavior , not driver behavior - given the spec that was approved and implemented in kilo
16:38:12 <winston-d> i don't think we abused the term 'free space' so far
16:38:18 <jgriffith> patrickeast: xyang2 IMHO the drivers and scheduler should work together to just "do the right thing"
16:38:19 <patrickeast> jgriffith: it fails before then on an 'if free < volume_size'
16:38:27 <xyang2> jgriffith: ok, no one likes the flag so far:)
16:38:29 <jungleboyj> tbarron:  ++
16:38:34 <patrickeast> this line needs to be changed
16:38:36 <patrickeast> https://github.com/openstack/cinder/blob/master/cinder/scheduler/filters/capacity_filter.py#L83
16:38:43 <jgriffith> patrickeast: yeah, what I'm trying to say is that our reporting of free is wrong
16:38:45 <patrickeast> or moved *after* we check thin provisioning stuff
16:38:54 <patrickeast> jgriffith: oooo
16:38:57 <patrickeast> ok i see
16:39:04 <winston-d> free space means it's physically available space, without overprovision.
16:39:08 <jgriffith> patrickeast: free in the case of thin support should be "free * over-prov-ratio"
16:39:09 <xyang2> anyone who wants to keep the ability to preserve the volume size, please speak up
16:39:31 <patrickeast> jgriffith: gotcha, later on we call that virtual_free or something
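(A sketch of the reordering patrickeast is describing for the capacity filter, written against the Kilo over-subscription stats — total_capacity_gb, provisioned_capacity_gb, max_over_subscription_ratio. This is an assumption about the shape of the fix, not the actual patch.)

```python
# Sketch, not the real capacity_filter.py: the point is that the raw
# "free < requested size" check must not fire before the thin-
# provisioning branch has had a chance to apply the ratio.
def backend_passes(host_state, volume_size_gb, thin_request):
    if (thin_request and
            getattr(host_state, 'thin_provisioning_support', False) and
            host_state.max_over_subscription_ratio >= 1):
        virtual_free = (host_state.total_capacity_gb *
                        host_state.max_over_subscription_ratio -
                        host_state.provisioned_capacity_gb)
        return virtual_free >= volume_size_gb
    # Thick (or thin-unaware) path: raw free space must cover the volume.
    return host_state.free_capacity_gb >= volume_size_gb
```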
16:39:32 <xyang2> otherwise, since no one likes the flag, we'll just not preserve it
16:39:33 <jgriffith> winston-d: so that's the quirk
16:39:53 <tbarron> xyang2: I'm ok with that
16:40:02 <jgriffith> winston-d: which is why I then said "ok... add an apparent/virtual/effective" or whatever "-free"
16:40:07 <jgriffith> and use that instead
16:40:22 <jgriffith> honestly if it's thin I don't necessarily know why the scheduler should care
16:40:44 <xyang2> I agree the flag looks ugly.  I just want to see if anyone wants to preserve the existing behavior
16:40:48 <hemna> ok I'm back.  sorry guys, I had a preso dry run to do at the same time as our meeting.
16:40:52 <winston-d> jgriffith: you can create a type to explicitly ask for thin or thick
16:41:15 <DuncanT> jgriffith: thick and thin on the same backend
16:41:24 <xyang2> jgriffith: so I was given a comment 6 months back when I first started working on this, that we should be conservative and assume the new volume will be consumed
16:41:26 <jgriffith> winston-d: Oh, the crazy pool nonsense lets you do both
16:41:34 <jungleboyj> hemna: is back.  Back to discussion.  ;-)
16:41:53 <xyang2> jgriffith: that was why it was deducted in the filter scheduler.
16:42:00 <jgriffith> winston-d: so seems like both numbers are good to have, and you use the one that's applicable based on the volume-type being created/requested?
16:42:03 <jgriffith> xyang2: thoughts?
16:42:04 <DuncanT> jgriffith: You can do both without pools...
16:42:13 <jgriffith> DuncanT: ok
16:42:15 <winston-d> jgriffith: right
16:42:28 <jungleboyj> jgriffith: ++
16:42:31 <jgriffith> xyang2: winston-d so can we just fix it that way rather than flags etc?
16:42:43 <xyang2> yes, some backend can support both thin and thick.
16:42:47 * DuncanT suggests that it looks like we can dump the existing behaviour for thin types... it is broken
16:43:03 <xyang2> I have added an extra-spec for thin/thick in the cinder spec
16:43:04 <jgriffith> DuncanT: let's discuss that offline
16:43:12 <jgriffith> DuncanT: oh!
16:43:13 <jgriffith> LOL
16:43:14 <xyang2> so driver can do it if it wants to
16:43:14 <jgriffith> yes
16:43:19 <jgriffith> DuncanT: I agree with you
16:43:37 <winston-d> i'm all for fixing this bug without a flag
16:43:37 <xyang2> for the particular implementation, this is for thin actually
16:43:46 <jgriffith> xyang2: so driver reports back to scheduler free and apparent-free
16:43:47 <tbarron> winston-d: +1
16:43:51 <patrickeast> winston-d: +1
16:43:59 <jungleboyj> winston-d: +1
16:43:59 <jgriffith> scheduler uses apparent-free for thin type scheduling and free for thick
16:44:09 <jgriffith> ok
16:44:18 <patrickeast> sounds reasonable to me
16:44:21 <jgriffith> so we're all on the same page I think
16:44:29 <xyang2> jgriffith: that is almost there, I don't think we need more reporting
16:44:30 <hemna> so every driver that supports both has to now report apparent-free and free ?
16:44:31 <jgriffith> xyang2: we can chat more between you and I or you and winston-d
16:44:32 <DuncanT> Looks like we're enthusiastically agreeing here, shall we stamp it and move on?
16:44:47 <winston-d> DuncanT: +1
16:44:51 <jgriffith> DuncanT: yeah, but I don't think xyang2 agrees
16:44:57 <patrickeast> although i’m wondering what the odds are of backporting that type of change vs a bug fix for my original problem
16:44:57 <xyang2> jgriffith: sure, but those are already reported
16:45:00 <jgriffith> we can discuss in channel after meeting
16:45:10 <jgriffith> think it's just 'details' at this point
16:45:18 <xyang2> jgriffith: that is what the provisioned_capacity is about
16:45:29 <xyang2> jgriffith: we already have that in driver stats
16:45:38 <tbarron> xyang2: ++
16:45:38 <jgriffith> xyang2: ok, let's talk after... but I'd say then "just use that"
16:45:41 <jgriffith> there's your fix
16:45:56 <xyang2> jgriffith: I'm fine with that
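(For reference, the kind of stats a dual thin/thick backend already reports under the Kilo spec xyang2 refers to — i.e. the scheduler already has the inputs it needs without a new flag. The numbers below are invented.)

```python
# Invented numbers; field names follow the Kilo over-subscription spec.
stats = {
    'total_capacity_gb': 1000.0,
    'free_capacity_gb': 100.0,           # physical space left
    'provisioned_capacity_gb': 1800.0,   # sum of all volumes' sizes
    'max_over_subscription_ratio': 20.0,
    'thin_provisioning_support': True,
    'thick_provisioning_support': True,
}
```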
16:45:59 <jgriffith> #topic https://bugs.launchpad.net/cinder/+bug/1458976
16:46:00 <openstack> Launchpad bug 1458976 in Cinder "cannot create thin provisioned volume larger than free space" [Undecided,In progress] - Assigned to Xing Yang (xing-yang)
16:46:01 <jgriffith> GAAAA
16:46:12 <jgriffith> #topic Downgrades in new db migrations
16:46:22 <patrickeast> lol
16:46:22 <jgriffith> DuncanT: you're on
16:46:27 <jgriffith> patrickeast: :)
16:46:31 <e0ne> i asked this question several weeks ago. DuncanT was against this solution because it makes versioned objects debugging and testing harder
16:47:02 <e0ne> imo, downgrades make our migrations more complex.
16:47:02 <DuncanT> So hopefully a quick question: Have we come to a conclusion about removing these? None of the new reviews have them
16:47:14 <e0ne> but they are useful for new migrations development
16:47:32 <jgriffith> DuncanT: I am not aware that we had
16:48:05 <DuncanT> Ok, so nobody is going to scream at me for -1ing changes without them. Excellent
16:48:06 <patrickeast> have new changes merged without them, or is this just reviews?
16:48:09 <e0ne> jgriffith, DuncanT: the cross-project spec about removing downgrade migrations merged
16:48:24 <hemna> e0ne, url ?
16:48:24 <jgriffith> e0ne: so that's the answer then I guess
16:48:28 <DuncanT> e0ne: cross project specs are advisory
16:49:06 <e0ne> #link https://github.com/openstack/openstack-specs/blob/master/specs/no-downward-sql-migration.rst
16:49:17 <winston-d> in other words, we don't have to follow?
16:49:29 <e0ne> DuncanT: agree. fair enough
16:49:52 <hemna> e0ne, thanks, have to read through that to figure out their justification.
16:50:06 <vilobhmm> will go through the spec eone
16:50:09 <bswartz> why don't people want to support downgrades?
16:50:16 <hemna> downgrades are a bit of a PITA to do in some cases, regarding foreign key constraints, etc
16:50:26 <ameade> esp hard with data migrations
16:50:27 <DuncanT> bswartz: They're hard to write and don't get tested
16:50:30 * bswartz mutters under his breath
16:50:30 <e0ne> bswartz: because operators don't use them in prod
16:50:31 <ameade> if not impossible
16:50:34 <hemna> bswartz, the url above seems to explain their justifications.
16:50:41 <e0ne> hemna: +1
16:50:45 * bswartz reading...
16:50:54 <hemna> the problem is the downgrades aren't for operators
16:51:02 <hemna> they are for us to ensure our upgrades work
16:51:07 <e0ne> hemna: that's true
16:51:15 <hemna> fwiw
16:52:13 <hemna> "Update oslo.db to return appropriate errors when trying to perform a schema downgrade"
16:52:25 <hemna> so if that gets implemented, our downgrades might start to puke
16:52:30 <bswartz> this sounds like laziness to me
16:52:37 <hemna> and we won't have any choice but to get rid of them
16:52:38 <DuncanT> The idea of returning to a db dump is just broken in the case of live upgrade though....
16:52:43 * jungleboyj is surprised by it.
16:52:53 <bswartz> downgrades are hard to do right -- so we propose to not do downgrades
16:53:07 <hemna> heh
16:53:08 <hemna> yah
16:53:20 <hemna> I had issues w/ downgrades for the multi-attach patch
16:53:24 <hemna> but worked through it.
16:53:42 <hemna> it seemed like a good exercise to me.
16:53:47 <DuncanT> You didn't want to actually keep the volumes you created recently, right?
16:53:47 <bswartz> nobody cares if downgrades are wildly inefficient, but having them is better than not having them
16:53:57 <jgriffith> bswartz: I think the real point was "they don't really work when there's data" so we shouldn't pretend they do
16:53:58 <jungleboyj> bswartz: ++
16:53:59 <DuncanT> Oh, and the ones you deleted are back
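(For context, a minimal migration pair of the kind being debated, in the sqlalchemy-migrate style Cinder used at the time; the table change is invented for illustration. The upgrade half is uncontroversial — the downgrade half is what the cross-project spec drops.)

```python
# Invented example of an upgrade/downgrade pair (sqlalchemy-migrate style,
# run by the migration framework, which loads the changeset extensions).
from sqlalchemy import Boolean, Column, MetaData, Table


def upgrade(migrate_engine):
    meta = MetaData(bind=migrate_engine)
    volumes = Table('volumes', meta, autoload=True)
    volumes.create_column(Column('hidden', Boolean, default=False))


def downgrade(migrate_engine):
    # This is the half the cross-project spec removes: schema-only
    # reverts like this are easy, but once a migration moves or rewrites
    # data there is often nothing sane to come "down" to.
    meta = MetaData(bind=migrate_engine)
    volumes = Table('volumes', meta, autoload=True)
    volumes.drop_column('hidden')
```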
16:54:20 <jgriffith> bswartz: there's also a good question on whether they're actually useful for anything other than debug
16:54:38 <bswartz> useful for debugging seems like a good enough reason to keep them
16:54:51 <bswartz> the real problem is if they're buggy -- the solution is to test them and fix the bugs
16:55:11 <jgriffith> Ok, so I think you all want to do the opposite of all the folks on the ML
16:55:14 <jgriffith> that's ok by me
16:55:27 <DuncanT> If oslo.db is blocking them we're screwed
16:55:29 <e0ne> imo, re-creating the db is faster than creating downgrades+tests for debugging
16:55:37 <jgriffith> e0ne: +1
16:55:38 <hemna> according to that spec though, oslo.db is going to be updated to not even allow them
16:55:42 <hemna> so I think the point might be moot.
16:55:55 <jgriffith> e0ne: I'm not sure I see why anybody is upset by this but ok
16:56:01 <DuncanT> e0ne: Have you ever tried to recreate a prod db? It's a nightmare
16:56:08 <hemna> DuncanT, +1
16:56:18 <e0ne> DuncanT: i'm talking about debugging
16:56:18 <jgriffith> DuncanT: have you ever tried to run our downgrades on a production DB?
16:56:24 <jgriffith> I can't imagine that it would work
16:56:28 <patrickeast> but i thought the point was that these were not used in production?
16:56:30 <e0ne> jgriffith: :)
16:56:33 <jgriffith> patrickeast: :)
16:56:44 <DuncanT> jgriffith: We used them on a dev cluster, in that case they actually worked
16:56:49 <jgriffith> and thus the circular-reference argument ensues
16:56:54 <hemna> DuncanT, I think there are 2 purposes for db migrations though.  My primary is during development to ensure the changes I've made actually make it in place.  In this use case, doing a complete db recreate is just as effective as a downgrade.
16:57:09 <jgriffith> hemna: that's what grenade does fyi
16:57:14 <hemna> the other purpose is for live production data, and I'm not sure I ever see a use case where a customer wants to downgrade ?
16:57:20 <jgriffith> hemna: your changes are upgrades, not downgrades
16:57:25 <hemna> yup
16:57:29 <hemna> jgriffith, agreed.
16:57:35 <jgriffith> ok, I think this horse is dead... not sure why we're beating it
16:57:36 <DuncanT> What do you do if your system is busted after an upgrade?
16:57:40 <hemna> so I think I'd be ok with nuking downgrades.
16:57:53 <e0ne> DuncanT: revert from backup?
16:58:00 <DuncanT> e0ne: Live upgrade
16:58:01 <jgriffith> DuncanT: I'm not sure how/why anybody thinks the downgrade scripts are going to help in that situation anyway?
16:58:23 <hemna> jgriffith, +1
16:58:28 <DuncanT> jgriffith: If they work, they do... the one time I've tried them they worked fine
16:58:36 <hemna> if your upgrade is roasted, the downgrade most likely won't even work.
16:58:43 <e0ne> DuncanT: i'm not sure that if live upgrade fail, downgrade will work
16:58:44 <jgriffith> DuncanT: so you've used them once in 4 years?
16:58:51 <hemna> jgriffith, :)
16:58:53 <DuncanT> jgriffith: Yes
16:59:06 <jgriffith> DuncanT: don't ever become a sales person :)
16:59:19 <hemna> lol
16:59:24 <bswartz> hemna: the DB upgrade may be fine but the new code could have critical bugs making you want to go back to an older version
16:59:28 * jungleboyj is enjoying that mental image
16:59:37 <DuncanT> bswartz: ++
16:59:38 <jgriffith> DuncanT: so honestly I don't care either way, but it sounds as if you need to take it up on the dev ML
16:59:44 <jgriffith> DuncanT: with bswartz
17:00:10 <jgriffith> because it would seem the rest of the OpenStack community has moved on and may be removing the capability from oslo.db anyhow
17:00:19 <jungleboyj> Really seems like the DB should be snapshotted before an upgrade so that it can be rolled back if a disaster occurs.
17:00:21 <jgriffith> and on that note... we're out of time
17:00:22 <hemna> bswartz, we have bugs ?
17:00:22 <DuncanT> jgriffith: Yup, didn't realise that there was an official policy. That answers my question for now
17:00:23 <bswartz> it feels to me like the devs are screwing the users with this change
17:00:26 <hemna> :P
17:00:47 <jgriffith> bswartz: there was never a user that came back and said they have ever used it though
17:00:49 <winston-d> i've never used my
17:00:50 <jgriffith> ok
17:00:55 <DuncanT> jungleboyj: You can't snapshot a live db and expect it to work later... new volumes are lost, deleted volumes are back, it is totally broken
17:01:02 <jgriffith> thanks every one
17:01:04 <winston-d> fire extinguisher in the car, but i want to make sure it works when i need it
17:01:04 <bswartz> but it's less work for us, so it's all good </sarcasm>
17:01:05 <jgriffith> #endmeeting