14:00:03 <rosmaita> #startmeeting cinder
14:00:04 <openstack> Meeting started Wed Jan 29 14:00:03 2020 UTC and is due to finish in 60 minutes.  The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:07 <openstack> The meeting name has been set to 'cinder'
14:00:12 <rosmaita> #topic roll call
14:00:14 <lseki> hi
14:00:19 <eharney> hey
14:00:27 <LiangFang> hi
14:00:43 <rosmaita> greetings thierry
14:00:59 <rosmaita> #link https://etherpad.openstack.org/p/cinder-ussuri-meetings
14:01:03 <ttx> hi! just lurking :)
14:01:09 <whoami-rajat> Hi
14:01:12 <raghavendrat> hi
14:01:12 <sfernand> hi
14:01:33 <jungleboyj> o/
14:01:42 <rosmaita> looks like a good turnout
14:01:51 <rosmaita> #topic announcements
14:01:53 <tosky> o/
14:02:29 <rosmaita> i've been meaning to mention that, as you may have noticed, i'm not as good as jay was about keeping notes in the agenda etherpad
14:02:37 <rosmaita> so if you miss a meeting and want to know what went on
14:02:43 <rosmaita> you need to look at the meeting log
14:02:53 <rosmaita> otherwise, you may think nothing happened!
14:03:00 <rosmaita> ok, first real announcement
14:03:09 <jungleboyj> :-)  I can try to get back to doing notes.
14:03:12 <rosmaita> #link https://etherpad.openstack.org/p/cinder-ussuri-meetings
14:03:20 <rosmaita> that wasn't what i meant
14:03:23 <enriquetaso> o/
14:03:33 <rosmaita> rocky goes to "extended maintenance" status next month
14:03:40 <smcginnis_> I think the meeting logs are the best. Especially with the use of #action, #info, etc.
14:03:52 <jungleboyj> smcginnis_:  :-)
14:03:52 <whoami-rajat> jay for notes ++
14:04:01 <rosmaita> yeah, jungleboyj i'd kind of like to push people to using the meeting logs
14:04:11 <rosmaita> ok but about rocky going to EM ...
14:04:13 <rosmaita> final release must happen before 24 February
14:04:19 <jungleboyj> rosmaita:  Ok.  Sounds good.
14:04:20 <enriquetaso> you are doing great rosmaita
14:04:27 <rosmaita> doesn't look like there are any/many outstanding patches for rocky
14:04:28 <enriquetaso> :P
14:04:50 <whoami-rajat> rosmaita, i think one is mine
14:04:53 <rosmaita> so this is really a notice that if there *is* something that looks like it should be backported, please propose it soon
14:05:22 <rosmaita> whoami-rajat: right, i will keep an eye on that one
14:05:52 <whoami-rajat> rosmaita, thanks
14:06:07 <rosmaita> so we'll do the final rocky release 20 Feb
14:06:42 <rosmaita> second announcement:
14:06:43 <rosmaita> spec freeze on Friday 31 January (must be merged by 23:59 UTC)
14:06:48 <rosmaita> that's this friday
14:06:54 <kaisers> hi
14:06:57 <rosmaita> looks like we have 3 specs still in play for ussuri
14:07:06 <rosmaita> they are on the agenda later
14:07:28 <rosmaita> #topic Continued discussion about 3rd Party CI
14:07:37 <rosmaita> thanks to jungleboyj for keeping on top of this
14:07:44 <rosmaita> jungleboyj: you have the floor
14:07:51 <jungleboyj> :-)
14:08:14 <jungleboyj> Thanks.  So, we started this topic last week and it seemed we needed to continue the discussion this week.
14:08:30 <jungleboyj> Or actually I guess it was during the virtual mid-cycle.
14:08:44 <rosmaita> last week as well
14:08:55 <LiangFang> :)
14:09:01 <jungleboyj> Anyway, I sent an e-mail to the mailing list and also targeted the CI e-mails for failing vendors.
14:09:13 <jungleboyj> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012151.html
14:09:30 <jungleboyj> We got some responses as you can see in the etherpad.
14:09:31 <raghavendrat> hi, i am from HPE. we are trying to bring up our CI
14:09:45 <rosmaita> raghavendrat: that is good to hear
14:09:46 <raghavendrat> it's in progress
14:09:47 <jungleboyj> raghavendrat:  Awesome.
14:09:52 <jungleboyj> Thank you for being here.
14:10:26 <jungleboyj> Thanks to ttx  for working with the OSF to reach out to vendors as well.
14:10:53 <jungleboyj> So, the list of additional drivers to be unsupported has shrunk.
14:11:12 <jungleboyj> The question that is left, however, is what do we do now?
14:11:33 <jungleboyj> Do we need to re-address what we are doing with 3rd Party CI?
14:11:40 <rosmaita> we had floated the idea last week about maybe just unsupporting but not removing drivers
14:11:53 <rosmaita> i think smcginnis had a good point that you can't do that for very long
14:12:03 <rosmaita> as libraries get updated, you will start to get failures
14:12:41 <jungleboyj> True.  We are at the point that we have unsupported/removed nearly half the drivers over the last couple of releases.
14:12:50 <rishabhhpe> Hi, I am from HPE, we are trying to set up our CI, but facing some difficulties. Is there any documentation available, or automated scripts to bring the setup up in a single shot?
14:12:51 <rosmaita> i am hoping the Software Factory project may help with CI
14:13:06 <smcginnis_> An alternative being that we could move them to a different repo with a noop CI job.
14:14:22 <ttx> yeah, only keep CI-tested ones in mainline, and use a separate repo for everything else
14:14:25 <jungleboyj> rosmaita: Do we have someone working on setting up an example of how to use that?
14:14:38 <rosmaita> rishabhhpe: take a look at https://softwarefactory-project.io/docs/index.html
14:14:38 <ttx> The current doc is certainly lacking
14:15:00 <jungleboyj> smcginnis_:  It seems keeping them somewhere is somewhat better than totally removing.
14:15:03 <e0ne> hi
14:15:06 <rosmaita> jungleboyj: tosky was speaking with someone in the cinder channel the other day about it
14:15:32 <rosmaita> i forget who though, but they were setting up a cinder CI
14:15:33 <smcginnis_> jungleboyj: Then if a distro wants to include them: "apt install openstack-cinder openstack-cinder-unsupported-drivers"
14:16:01 <jungleboyj> Ok.  So, that is an option.
14:16:18 <tosky> I just jumped in a discussion started by rosmaita :)
14:16:47 <tosky> smcginnis_: you don't need to move them into a separate repository for distributions to split the packages
14:16:54 <rosmaita> basically, for the Software Factory situation, we need someone to actually set it up for cinder and then report back
14:17:05 <whoami-rajat> it was Hitachi i guess rosmaita tosky
14:17:11 <smcginnis_> tosky: Effect, not cause. ;)
14:17:21 <rosmaita> there is a community around Software Factory, and RDO is using it for CI, so it is pretty solid
14:17:35 <rishabhhpe> <rosmaita> : ok
14:17:36 <rosmaita> whoami-rajat: ty, that's right, it was Hitachi
14:17:45 <tosky> smcginnis_: moving code around complicates the usage of the history; my suggestion would be to keep them in-tree and mark them somehow with some annotation
14:18:15 <smcginnis_> tosky: That's what we have today.
14:18:21 <jungleboyj> smcginnis_:  ++
14:18:22 <smcginnis_> The issue raised is that will eventually break.
14:18:27 <rosmaita> i guess we could blacklist them from tests?
14:18:29 <tosky> smcginnis_: but with the removals part
14:18:42 <smcginnis_> So the options are either to remove them completely, or move them somewhere out of the way.
14:18:42 <eharney> putting drivers in a separate repo also means you have to figure out how to keep dependencies in sync, or nobody will actually be able to install the unsupported drivers
14:18:58 <smcginnis_> eharney: Yeah, it just moves the problem really.
14:18:59 <jungleboyj> :-(
14:19:23 <jungleboyj> And since the vendors aren't maintaining them then it is unlikely anyone is going to do that work.
14:20:00 <e0ne> jungleboyj: +1
14:20:10 <m5z> maybe we could move them to the unsupported list and remove them when any dependency fails?
14:20:23 <tosky> smcginnis_: wouldn't it be possible to disable the setuptools entry points (if they are used; at least for sahara we used them)
14:20:23 <tosky> IMHO, and from the past experience with sahara, either everything should stay in-tree as it is, or each driver should have its own repository from the start
14:20:23 <tosky> any other solution is looking for troubles :)
14:20:24 <smcginnis_> m5z: Was just thinking that.
14:20:32 <smcginnis_> That might be a good compromise.
14:21:02 <rosmaita> m5z: that is a good idea
14:21:14 <rosmaita> i'd prefer to just have one repo
14:21:20 <jungleboyj> rosmaita: ++
14:21:21 <smcginnis_> But then it's a fire and we can't wait to see if they get an update to any dependencies.
14:21:32 <smcginnis_> But probably better than just nuking them right away.
14:21:47 <jungleboyj> :-)
14:22:36 <rosmaita> maybe we could have unsupported -> unit test failures -> removal before next release
14:22:46 <lseki> ++
14:22:52 <rosmaita> we would blacklist as soon as we hit unit test failures
14:22:54 <smcginnis_> We couldn't do removal before next release.
14:22:55 <sfernand> ++
14:23:08 <smcginnis_> It would have to be removal before we can merge anything else, because suddenly the gate is borked.
14:23:18 <jungleboyj> Yeah.
14:23:43 <rosmaita> if we blacklisted the tests, wouldn't that unblock the gate?
14:23:44 <jungleboyj> So, it goes away in that release, but that is ok because it was already unsupported.
14:24:25 <smcginnis_> rosmaita: So add a SkipTest to get around it right away, then remove by ~milestone-3 if not fixed?
14:24:37 <smcginnis_> I think I'd rather just remove it at that point.
14:24:59 <jungleboyj> Yeah, not sure the value of delaying the removal.
14:24:59 <rosmaita> well, the skip test would give them a final few weeks to get it done
14:25:00 <smcginnis_> They can always propose a revert if dependencies are fixed, but considering it is already unsupported, that's not likely.
14:25:09 <jungleboyj> Fair enough.
14:25:58 <m5z> smcginnis_: +1
14:26:52 <rosmaita> ok, so we would remove an unsupported driver from the tree immediately upon it causing test failures in the gate
14:26:55 <jungleboyj> smcginnis_:  That is like what we are currently doing.
14:27:17 <smcginnis_> jungleboyj: We wouldn't remove it the cycle after marking unsupported though.
14:27:25 <smcginnis_> Only as soon as it starts causing failures.
14:27:30 <eharney> i think in a lot of cases we can opt to just fix the failing tests ourselves -- this is part of why it's useful to keep them in the tree
14:27:46 <smcginnis_> That might be an option if it's something trivial.
14:28:07 <rosmaita> we could keep that as an unadvertised option
14:28:18 <jungleboyj> smcginnis_:  eharney ++
14:28:18 <eharney> yeah
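[A minimal sketch of the "add a SkipTest" stopgap smcginnis_ floats above: skip an unsupported driver's unit tests so the gate stays green until the driver is fixed or removed. The driver and test names here are hypothetical, not an actual cinder test class.]

```python
import unittest


class FooDriverTestCase(unittest.TestCase):
    """Unit tests for a (hypothetical) unsupported Foo backend driver."""

    def setUp(self):
        super().setUp()
        # Temporary skip: the driver is unsupported and its tests break
        # against newer dependency versions; the driver is slated for
        # removal if no fix lands before the deadline.
        raise unittest.SkipTest(
            "Foo driver is unsupported; tests skipped pending removal")

    def test_create_volume(self):
        # Never runs while the skip in setUp() is in place.
        self.fail("would fail against the updated dependency")
```

Raising SkipTest in setUp() skips every test in the class, so the whole driver's suite is blacklisted with a one-line change that is equally easy to revert.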
14:28:56 <rosmaita> alright, this sounds good ... i will write up something for us to look at before we announce this
14:29:02 <rosmaita> but i think it's a good direction
14:29:09 <smcginnis_> Soooo... if we adopt this policy, are we going to revert some of the removals we've already done?
14:29:15 <ttx> I see a lot of value in the CI we run ourselves (for "open source software" drivers). I'm unsure of the real value of 3rd-party CI for us. It's really a service for the vendors, to help them check they are not broken by changes
14:29:34 <rosmaita> smcginnis_: foos uwarion
14:29:45 <ttx> So i'm unsure we should support or unsupport them based on availability of CI
14:29:49 <rosmaita> did not mean to say that
14:29:56 <smcginnis_> ttx: It's also good for the project as a whole as it prevents cases where someone installs cinder and has a lot of trouble getting it to run.
14:30:07 <smcginnis_> That looks just as bad for cinder as it does for the vendor.
14:30:20 <ttx> smcginnis_: assuming that the 3rd-party CI actually tests the driver
14:30:22 <smcginnis_> Sometimes more so, because they think it's cinder's problem, not the vendors problem.
14:30:36 <smcginnis_> ttx: Yes, but that's what I'm saying.
14:30:38 <rosmaita> yeah, i would prefer to keep 3rd party CI
14:31:00 <smcginnis_> We need 3rd party CI, or we need to remove non-open drivers from tree.
14:31:09 <jungleboyj> rosmaita:  It is at least an indication that the vendor is engaged.
14:31:11 <ttx> yeah
14:31:34 <rosmaita> smcginnis_: i guess we should consider re-instating the drivers removed during this cycle
14:31:43 <jungleboyj> And I think that there should be some incentive to stay engaged.
14:31:54 <ttx> those are the two options. But I'd say the more difficult we make 3rdparty CI, the less likely it is to report useful results
14:32:21 <smcginnis_> It's been a constant headache, but as a whole, I think our 3rd party CI has been useful.
14:32:22 <rosmaita> ttx: that is why we are pushing Software Factory
14:32:26 <ttx> So the two options really are... simplify 3rd-party CI setup, or remove drivers that require special hardware from the tree
14:32:33 <jungleboyj> Well, that is the thing being worked on in parallel: making 3rd Party CI easier.
14:32:44 <ttx> rosmaita: I agree, just trying to reframe why :)
14:33:11 <smcginnis_> It certainly can be simple: https://github.com/j-griffith/sos-ci
14:33:22 <smcginnis_> Just everyone wants to duplicate how infra works.
14:33:23 <jungleboyj> :-)
14:33:49 <jungleboyj> I thought at some point infra was pushing people to do that?
14:34:00 <smcginnis_> I don't think so.
14:34:08 <smcginnis_> This has been a headache for them too.
14:34:25 <jungleboyj> Ok.  Yeah, I was surprised when they came back with that.  I was unaware.
14:34:36 <rosmaita> ok, we need to wrap this up for today
14:34:45 <smcginnis_> Yeah, let's move along.
14:34:53 <smcginnis_> rosmaita: Want to summarize the plan?
14:34:59 <rosmaita> i think we made some progress
14:35:00 <jungleboyj> rosmaita:  Please.
14:35:06 <raghavendrat> one query: what's the end date ... when would drivers be marked as unsupported/removed?
14:35:19 <rosmaita> unsupported would be same as now
14:35:33 <rosmaita> removal would be when first failure in our gate occurs
14:35:38 <jungleboyj> rosmaita: ++
14:35:51 <rosmaita> i will write something up for us to review
14:36:04 <smcginnis_> #link https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers#Non-Compliance_Policy
14:36:21 <rosmaita> #action rosmaita write up summary of what we decided or edit ^^
14:36:31 <raghavendrat> ok. will have a look and also keep close watch
14:36:59 <rosmaita> you may want to reach out to the hitachi people and combine efforts on Software Factory
14:37:06 <jungleboyj> Sounds good.  Should I revert the removals that I pushed up this cycle?
14:37:11 <rosmaita> check the openstack-cinder channel log for yesterday
14:37:18 <raghavendrat> ok
14:37:39 <rosmaita> jungleboyj: i would hold off until after we are absolutely sure about this
14:37:52 <rosmaita> (just in case someone thinks of a major objection we haven't considered)
14:37:52 <smcginnis_> Upgrade checkers too.
14:37:58 <rosmaita> right
14:38:06 <rosmaita> thanks jungleboyj and ttx
14:38:08 <jungleboyj> Ok.  So, continue discussion.
14:38:24 <rosmaita> #topic Spec: Volume local cache
14:38:30 <LiangFang> hi
14:38:32 <jungleboyj> Thank you guys.
14:38:34 <rosmaita> #link https://review.opendev.org/#/c/684556/
14:38:55 <LiangFang> should we do a microversion change for this?
14:39:00 <rosmaita> my questions have been met except for the microversion one
14:39:09 <rosmaita> https://review.opendev.org/#/c/684556/12/specs/ussuri/support-volume-local-cache.rst@180
14:39:45 <eharney> i'm not sure "volume details" is the right place for that information unless i'm misunderstanding what that refers to
14:39:57 <eharney> it should be part of the connection info etc, not the volume metadata?
14:40:09 <LiangFang> it is in connection info
14:40:22 <rosmaita> well, the volume-type extra specs will have the cacheable property
14:40:25 <LiangFang> cinder fills in the fields in that
14:40:38 <eharney> "volume details" sounds like it would appear on "cinder show" etc
14:41:06 <rosmaita> yes, that's how it sounded to me
14:41:19 <LiangFang> sorry for misleading
14:42:39 <LiangFang> should I change the wording "volume details", and then keep the microversion unchanged?
14:42:59 <rosmaita> yes
14:43:10 <LiangFang> ok, thanks
14:43:14 <rosmaita> no microversion impact if the API response doesn't change
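[A toy illustration of the distinction being drawn here: the "cacheable" property lives in the volume-type extra specs and in the connection_info handed to the consumer, not in the user-facing volume-show response, so no API microversion bump is needed. All dict contents below are hypothetical stand-ins, not cinder's actual payloads.]

```python
# Operator-set property on the volume type (internal plumbing).
volume_type = {
    'name': 'fast-type',
    'extra_specs': {'cacheable': '<is> True'},
}

# Filled in by cinder during attach, read by the connecting service.
connection_info = {
    'driver_volume_type': 'iscsi',
    'data': {
        'target_iqn': 'iqn.2020-01.org.example:vol-1',
        'cacheable': True,
    },
}

# What a user sees from "cinder show": unchanged, hence no new
# microversion is required.
volume_show_response = {
    'id': 'vol-1',
    'status': 'available',
}

assert 'cacheable' not in volume_show_response
assert connection_info['data']['cacheable'] is True
```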
14:43:44 <rosmaita> ok, other than that, i think eharney and geguileo had a bunch of comments on earlier versions of the spec
14:43:45 <LiangFang> ok
14:44:07 <rosmaita> would be good if you could make sure the current version addresses your concerns
14:44:52 <rosmaita> LiangFang: did you have any questions?
14:45:10 <LiangFang> no more questions now:) thanks
14:45:20 <rosmaita> ok, great
14:45:32 <rosmaita> #topic src_backup_id
14:45:41 <rosmaita> #link https://review.opendev.org/#/c/700977/
14:45:59 <rosmaita> this is close to being done
14:46:16 <rosmaita> we talked last week about could it be a bug instead of a spec
14:46:28 <smcginnis_> Yeah, I still think this should just be dropped as a spec. Just add it.
14:46:30 <rosmaita> but eric brought up a point about us using volume metadata for the field
14:46:42 <rosmaita> i think that needs to be documented
14:46:57 <rosmaita> mainly, that operators can't rely on it being there or accurate
14:47:13 <rosmaita> but otherwise, i think the proposal is fine
14:47:27 <rosmaita> also there was an issue about which id is used for incrementals
14:47:33 <rosmaita> it's addressed in the spec
14:47:58 <rosmaita> so, this will just need quick reviews once it's revised
14:48:07 <rosmaita> but i don't think there's anything controversial
14:48:27 <rosmaita> #topic Spec: 'fault' info in volume-show response
14:48:37 <rosmaita> #link https://review.opendev.org/#/c/689977/
14:48:56 <rosmaita> this is probably not ready
14:49:15 <rosmaita> it's still not clear why the user messages won't work
14:49:33 <rosmaita> and i don't like the idea of adding another DB table until we are sure it's necessary
14:49:39 <eharney> yeah, i still don't have a sense of why we want to add this when we already have a system that attempts to mostly do the same thing
14:49:54 <jungleboyj> ++
14:50:09 <eharney> there are probably some subtle differences but i suspect the answer is to just improve what we have rather than creating a new API for this
14:50:15 <rosmaita> eharney: ++
14:50:25 <rosmaita> i will keep an eye on it for revisions
14:50:41 <whoami-rajat> seems like it's inspired by nova instances having 'fault' property
14:51:19 <rosmaita> yes, it's just not clear to me that it's going to provide the info the proposer is looking for
14:51:19 <eharney> we currently have a scheme that ties faults to operations rather than the object being acted on
14:51:26 <eharney> it's different, but seems to work well
14:51:49 <eharney> if you want something like nova faults you can query our user messages by volume id already
14:52:29 <rosmaita> well, i left enough comments asking for specific answers for what exactly can't be done
14:52:34 <rosmaita> so we'll see what happens
14:52:37 <whoami-rajat> yep, agreed. it's different but works
14:52:45 <rosmaita> #topic sqlalchemy update to 1.3.13 breaks cinder
14:52:54 <rosmaita> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-January/012210.html
14:53:06 <rosmaita> ok, so the situation is that one of our unit tests fails
14:53:33 <rosmaita> i took a look, but it turns out what we're doing in the test *only* happens in that test
14:53:41 <rosmaita> so we could fix this by just changing the test
14:53:56 <rosmaita> or by slightly modifying the db.sqlalchemy.api
14:54:31 <rosmaita> i am inclined to just change the test at this point
14:54:53 <rosmaita> because the db api change loads the glance metadata into each volume object
14:54:55 <eharney> geguileo fixed some DetachedInstanceError problems a while ago, i wonder if this is a similar bug in our objects code that is just being revealed in tests now
14:55:13 <rosmaita> that could be
14:55:42 <rosmaita> most of the time when we want the glance info, we just make a call to get it, we don't expect it in the volume object
14:56:47 <rosmaita> i'll grep the logs for geguileo's fix and see whether it's the same kind of thing
14:56:58 <rosmaita> because i guess we'd do the same fix now to be consistent
14:57:15 <rosmaita> ok, i'll take a look and then update my patch
14:57:37 <geguileo> the issue is usually us trying to do a lazy load when we no longer have the transaction in place...
14:57:41 <rosmaita> i'm not sure how anxious the requirements team is to get sqlalchemy 1.3.13 into u-c
14:57:56 <geguileo> it works if it happens fast enough, but that's not usually the case iirc
14:58:12 <rosmaita> maybe that's why it's suddenly broken
14:58:21 <rosmaita> they may have optimized some code
14:58:30 <rosmaita> and now it can't happen fast enough
14:58:36 <geguileo> in other words, it's usually bad code in cinder, something that could happen in a production env
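[A self-contained toy reproduction of the lazy-load-after-detach failure geguileo describes, with the eager-load fix. The models are simplified stand-ins for cinder's schema, not its actual tables; requires SQLAlchemy.]

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import joinedload, relationship, sessionmaker
from sqlalchemy.orm.exc import DetachedInstanceError
try:
    from sqlalchemy.orm import declarative_base
except ImportError:  # SQLAlchemy < 1.4
    from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()


class Volume(Base):
    __tablename__ = 'volumes'
    id = Column(Integer, primary_key=True)
    name = Column(String(36))
    glance_metadata = relationship('VolumeGlanceMetadata')  # lazy by default


class VolumeGlanceMetadata(Base):
    __tablename__ = 'volume_glance_metadata'
    id = Column(Integer, primary_key=True)
    volume_id = Column(Integer, ForeignKey('volumes.id'))
    key = Column(String(255))


engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

session = Session()
session.add(Volume(id=1, name='vol',
                   glance_metadata=[VolumeGlanceMetadata(key='image_id')]))
session.commit()
session.close()

# Load the volume, close the session, *then* touch the relationship:
# the lazy load has no session to run in and raises DetachedInstanceError.
session = Session()
vol = session.query(Volume).filter_by(id=1).one()
session.close()
try:
    _ = vol.glance_metadata
    lazy_ok = True
except DetachedInstanceError:
    lazy_ok = False

# Eager-loading while the session is still open avoids the problem.
session = Session()
vol = (session.query(Volume)
       .options(joinedload(Volume.glance_metadata))
       .filter_by(id=1).one())
session.close()
eager_md = [m.key for m in vol.glance_metadata]
```

Whether the failure shows up can depend on timing and session lifetime, which matches the "it works if it happens fast enough" observation: the error only fires once the object is actually detached.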
14:59:04 <rosmaita> as far as i can tell, this particular pattern is only used in that one unit test
14:59:19 <whoami-rajat> i think the bot automatically updates u-c when a lib is released.
14:59:25 <whoami-rajat> i mean puts up a patch for it
14:59:48 <rosmaita> looks like we are out of time
15:00:01 <rosmaita> thanks everyone! will try to have some open discussion next week
15:00:04 <jungleboyj> Thanks!
15:00:05 <whoami-rajat> thanks!
15:00:07 <rosmaita> but the CI discussion was helpful
15:00:16 <raghavendrat> thanks
15:00:27 <rosmaita> #endmeeting