14:00:05 <rosmaita> #startmeeting cinder
14:00:05 <opendevmeet> Meeting started Wed Jun 16 14:00:05 2021 UTC and is due to finish in 60 minutes.  The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:05 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:05 <opendevmeet> The meeting name has been set to 'cinder'
14:00:12 <sfernand> hi
14:00:16 <rosmaita> #topic roll call
14:00:32 <rosmaita> (#topic doesn't work, but i will use it anyway)
14:00:41 <eharney> hi
14:00:43 <walshh_> hi
14:00:54 <e0ne> hi
14:01:03 <enriquetaso> hi
14:01:28 <rosmaita> #link https://etherpad.opendev.org/p/cinder-xena-meetings
14:01:36 <rosmaita> ok, that doesn't work either
14:01:51 <tbarron> hi
14:02:31 <rosmaita> a lot on the agenda today, so let's get started
14:02:40 <rosmaita> #topic announcements
14:02:54 <rosmaita> cinder-tempest-plugin-lvm-lio-barbican job is failing because sqlalchemy 1.4 broke barbican's alembic migration
14:03:03 <rosmaita> this is now fixed, courtesy of geguileo
14:03:11 <rosmaita> https://review.opendev.org/c/openstack/barbican/+/796284/
14:03:34 <rosmaita> barbican isn't currently included in the requirements check job
14:03:44 <rosmaita> https://review.opendev.org/c/openstack/requirements/+/796647
14:04:04 <rosmaita> ^^ proposes a barbican crosscheck job
14:04:37 <rosmaita> are there any other projects we depend on that should be checked?
14:04:39 <jungleboyj> o/
14:04:45 * jungleboyj sneaks in late
14:04:57 <geguileo> rosmaita: I don't think that would have detected this issue
14:05:16 <geguileo> rosmaita: isn't that the unit tests?
14:05:18 <rosmaita> geguileo: it would have caught the ut failures you fixed
14:05:27 <rosmaita> not the migration, though
14:05:37 <geguileo> rosmaita: which was the one that blocked our gate
14:06:14 <geguileo> they need something like our cinder/tests/unit/db/test_migrations.py
14:07:10 <rosmaita> we can suggest that, but the barbican team seems to be understaffed these days
14:08:21 <geguileo> good to know  :-(
14:08:34 <rosmaita> also, i noticed that the nova functional tests are run
14:08:52 <rosmaita> could add ours, not sure how much that would help detect problems
14:09:59 <rosmaita> we should keep this in mind for the next time sqlalchemy is updated to 1.5 or 2.0
14:10:19 <rosmaita> next item: vulnerability:managed tag accepted for os-brick
14:10:37 <rosmaita> which doesn't really change anything because we all thought it was already managed
14:10:51 <rosmaita> next item: request from jungleboyj
14:11:00 <rosmaita> Help me vote for the Y release name:  https://twitter.com/jungleboyj/status/1404464680349929474
14:11:18 <rosmaita> jungleboyj: when is the deadline for that?
14:11:26 <jungleboyj> Yes.  :-)  Just a note that I have a naming poll out there for the Y release.  Have had good participation.
14:11:44 <jungleboyj> I need to submit my vote today, so if you want to help me pick the name, please vote.
14:12:03 <rosmaita> my personal favorite is "You"
14:12:11 <rosmaita> so that no one will know what release you are talking about
14:12:24 * jungleboyj isn't surprised
14:12:26 <rosmaita> next item: reminder about festival of reviews on Friday
14:12:34 <eharney> Yoghurt but no Yogurt?  i dunno...
14:12:35 <jungleboyj> Enter the Chaos Monkey
14:12:37 <rosmaita> info here: http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023100.html
14:13:05 <rosmaita> friday is a holiday in some locations, but not enough to reschedule
14:13:16 <rosmaita> at least that's my impression?
14:14:06 <rosmaita> hearing nothing to the contrary, next item:
14:14:17 <rosmaita> cinder-coresec: need comments on https://bugs.launchpad.net/bugs/1929223 before 23:59 UTC on Friday
14:14:43 <rosmaita> so please comment at your earliest convenience
14:14:58 <rosmaita> finally, reminder that the spec freeze is next friday
14:15:28 <rosmaita> so we need to review specs in a responsive manner, that is, right away
14:15:38 <rosmaita> #link https://review.opendev.org/q/project:openstack%252Fcinder-specs+status:open
14:16:07 <rosmaita> that's it for announcements, unless someone else has something to share?
14:16:52 <rosmaita> ok, moving on
14:17:02 <rosmaita> #topic Two cinder patches blocking glance feature
14:17:07 <rosmaita> whoami-rajat: that's you
14:18:17 <rosmaita> not sure whoami-rajat is around
14:18:23 <rosmaita> but he left enough info in the agenda
14:18:35 <rosmaita> he's been working on hardening the glance_store cinder driver
14:18:50 <rosmaita> and found a cinder issue that needs to be addressed
14:18:59 <rosmaita> #link https://review.opendev.org/c/openstack/cinder/+/783389
14:19:18 <rosmaita> i think ^^ is fine and corrects a mistake when the validation schema stuff was added to cinder
14:19:23 <rosmaita> see my comment on the patch
14:19:42 <rosmaita> the other one is a cinderclient change
14:19:46 <rosmaita> #link https://review.opendev.org/c/openstack/python-cinderclient/+/783628
14:20:26 <rosmaita> i think that one addresses a similar problem in the cinderclient, that is, it is requiring an optional parameter
14:21:02 <rosmaita> anyway, please review so rajat can get that glance_store patch out of his life, which will allow him to concentrate on cinder
14:21:07 <eharney> makes sense
14:21:28 <rosmaita> ok, next topic is a big one
14:21:40 <rosmaita> #topic some concerns about the frequency of Cinder failures in the gate
14:21:51 <rosmaita> jungleboyj is getting pressure from other members of the TC
14:21:59 <rosmaita> and in turn, is passing some pressure onto us
14:22:06 <jungleboyj> :-)  Yes.
14:22:07 <eharney> is there anything going on here other than the known issues with LVM crashing?
14:22:30 <jungleboyj> Based on the discussion yesterday it appears that that is the likely cause for concern.
14:22:30 <rosmaita> eharney: it's hard to tell
14:22:32 <whoami-rajat> rosmaita: sorry was afk, thanks for covering it
14:22:37 <jungleboyj> But it is hard to tell.
14:22:53 <eharney> well it should be easy to quantify the LVM issues with elastic-recheck, has anyone tried that?
14:23:08 <rosmaita> eharney: "should be" and "no"
14:23:43 <enriquetaso> LVM issues is https://bugs.launchpad.net/cinder/+bug/1901783/ ?
14:24:12 <eharney> yes but it fails on more operations than just volume delete
14:24:19 <rosmaita> short-term, i would like to propose no more naked rechecks
14:24:29 <jungleboyj> rosmaita: ++
14:24:37 <rosmaita> for one thing, the first two items i looked at weren't cinder's fault
14:26:04 <jungleboyj> We should also work on getting the LVM crashes fixed.
14:26:16 <rosmaita> yes, i agree
14:26:18 <jungleboyj> And fix the barbican issues.
14:26:30 <eharney> we know how to work around them in a messy way, i think we don't have a way to properly fix them
14:26:39 <eharney> the barbican issues are already fixed AFAIK
14:26:49 <rosmaita> anyway, short term i propose that when you issue a recheck, do this:
14:26:58 <rosmaita> recheck <job> <failed test name>
14:27:08 <rosmaita> and maybe some info about the failure if relevant
14:27:15 <rosmaita> and you can add info here:
14:27:21 <rosmaita> https://etherpad.opendev.org/p/cinder-xena-ci-tracking
14:27:43 <rosmaita> but if someone has time to set up an elasticsearch query to automate this, that would be better
14:27:58 <rosmaita> but short term it would be good to get some quick data about what is going on
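Until an elasticsearch query exists, comments written in the proposed `recheck <job> <failed test name>` form can be tallied by hand. The following is a minimal Python sketch, assuming the review comments have already been collected as plain strings. `parse_recheck` and `tally_rechecks` are hypothetical helpers, not part of any Gerrit or Zuul tooling.

```python
from collections import Counter


def parse_recheck(comment):
    """Split a 'recheck <job> <failed test name>' comment into its parts.

    Returns (job, test), or None for a naked recheck, a recheck that
    names only the job, or an unrelated comment.
    """
    parts = comment.strip().split(maxsplit=2)
    if len(parts) < 3 or parts[0].lower() != "recheck":
        return None
    return parts[1], parts[2]


def tally_rechecks(comments):
    """Count how often each (job, test) pair was given as a recheck reason."""
    parsed = (parse_recheck(c) for c in comments)
    return Counter(p for p in parsed if p is not None)
```

Calling `tally_rechecks(comments).most_common()` would list the most frequently blamed job and test pairs first, which is the kind of quick data the etherpad is meant to collect.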
14:28:42 <rosmaita> i thought that most of the failures are in teardown, and not related to actual tests,  but i am not sure whether that's true or not
14:29:53 <rosmaita> any questions about our short term data collection?
14:30:29 <whoami-rajat> from a personal standpoint, the gates were passing consistently before this barbican issue and are working again now; maybe failures are seen more often in other projects' gates
14:30:37 <jungleboyj> And the teardown failures are often due to volumes being left around due to something like the LVM crash.
14:30:53 <jungleboyj> whoami-rajat:  That is a concern.
14:31:28 <rosmaita> well, i think it's mostly in tempest-integrated-storage
14:31:35 <rosmaita> so nova, glance, and cinder
14:33:06 <rosmaita> anyway, no more naked rechecks ... at least pretend you care
14:33:26 <rosmaita> and i guess, review the lvm patches
14:33:36 <eharney> i think we still need to write more lvm patches
14:33:52 <eharney> lvdisplay is not covered, not sure what else
14:36:00 <Guest2396> rosmaita: could we somehow keep that kind of info in the cinder wiki or something?
14:36:19 <rosmaita> Guest2396: which kind of info?
14:36:38 <rosmaita> btw, everyone keep an eye on https://review.opendev.org/c/openstack/cinder/+/772126 (it's in recheck now)
14:36:50 <geguileo> rosmaita: the etherpad link for ci tracking, and the recheck job failedtestname thingy
14:37:11 <eharney> 772126 doesn't fix the crash (which was the initial hope)
14:37:17 <rosmaita> geguileo: glad you asked, i think i want to put it into the channel topic
14:37:28 <rosmaita> and the wiki, that is a good idea
14:37:46 <geguileo> rosmaita: having the etherpads we are currently using in the wiki could be useful
14:37:59 <geguileo> (for those of us with fish memory)
14:38:01 <rosmaita> #action rosmaita check with opendev team about getting topic changed
14:38:19 <rosmaita> #action rosmaita add current etherpads to wiki
14:38:37 <eharney> 772126 needs a follow-up to retry on crash
14:38:40 <rosmaita> geguileo: there's also the spotlight links on the meeting agenda
14:39:58 <jungleboyj> rosmaita:  We used to have that list in the Wiki.  Needs to be updated.
14:40:03 <rosmaita> eharney: what's the best way to track these? use https://launchpad.net/bugs/1901783 or other bugs?
14:40:17 <rosmaita> jungleboyj: noted
14:40:40 <rosmaita> #action rosmaita publicize the ci-tracking effort in all available methods
14:41:03 <eharney> we should probably write a new bug for retrying all of the other lvm commands that can segfault that weren't covered by 1901783, since that bug already has backports spanning a few branches
14:41:37 <enriquetaso> i can help with that eharney
14:41:43 <rosmaita> ok, let's discuss that during the upcoming bug squad meeting
14:41:58 <enriquetaso> sure
14:41:58 <eharney> ok
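The follow-up eharney describes, retrying an LVM command when the tool itself crashes, might look roughly like the sketch below. It is not the code in the patch under review. `run_with_crash_retry` is a hypothetical helper, and treating exit status 139 (128 + SIGSEGV, as a shell reports it) or -11 (as Python's `subprocess` reports it) as "crashed" is an assumption about how the segfault surfaces.

```python
import signal
import subprocess
import time

# Assumed crash indicators: subprocess reports death by signal as a negative
# return code; a shell wrapper reports it as 128 + the signal number.
SEGFAULT_CODES = {-signal.SIGSEGV, 128 + signal.SIGSEGV}


def run_with_crash_retry(cmd, attempts=3, delay=0.0, runner=subprocess.run):
    """Run cmd, retrying only when it appears to have segfaulted.

    Ordinary non-zero exits are returned to the caller on the first try,
    so genuine errors are not hidden behind retries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    result = None
    for attempt in range(attempts):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode not in SEGFAULT_CODES:
            return result
        if attempt < attempts - 1:
            time.sleep(delay)  # brief pause before the next try
    return result  # still crashing after the last attempt
```

Wrapping every LVM invocation (for example `lvdisplay`, which eharney notes is not yet covered) in one helper like this would avoid patching each command separately, at the cost of the workaround being, as he puts it, messy rather than a proper fix.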
14:42:20 <rosmaita> ok, sounds like we have some strategy to address our CI problems
14:42:47 <rosmaita> #topic Community goal proposal: Test with TLS (formerly SSL) by Default
14:42:51 <rosmaita> enriquetaso: that's you
14:43:04 <enriquetaso> Hello, just a quick question around TLS. As this is sort of a community-goal  and I'm not familiar with TLS, I wonder if enabling TLS is a problem for us?
14:43:16 <enriquetaso> Or should we prioritize this XS review? Maybe for our XS meeting next Friday.
14:43:28 <rosmaita> well, the review will not pass
14:43:33 * enriquetaso reading brian's comments on the etherpad
14:43:44 <rosmaita> yeah, i looked into this a bit yesterday
14:44:43 <rosmaita> not sure how big a deal it is if CI can't reliably use TLS for this one test job
14:45:00 <rosmaita> my impression is that there's something going on in the botocore library
14:45:10 <rosmaita> since 2014
14:45:13 <enriquetaso> oops
14:45:16 <enriquetaso> OK
14:45:18 <tosky> I see a reference to an aws-cli issue, but is it relevant for that different S3 implementation?
14:45:37 <rosmaita> tosky: i don't know, you wouldn't think so
14:45:47 <rosmaita> but i think the cli also uses botocore
14:46:25 <rosmaita> it's weird that we would also see it in a fake s3 implementation
14:46:34 <tosky> alternative, not fake :)
14:47:07 <rosmaita> well, apparently it is so close to the original that it is causing the same problem
14:47:14 <rosmaita> :)
14:48:12 <rosmaita> i guess at this point, i can leave a comment on ricolin's etherpad about this and send a note to the ML for anyone interested in the s3 backup service to please take a look?
14:48:23 <enriquetaso> +1
14:48:31 <rosmaita> i guess file a bug as well if there isn't one already for this
14:48:56 <enriquetaso> #action(enriquetaso): reply on ricolin's etherpad and file a bug
14:49:08 <rosmaita> enriquetaso: thanks!
14:49:23 <rosmaita> ok, next topic
14:49:36 <rosmaita> #topic Finishing up snapshotting in-use volumes spec
14:49:39 <rosmaita> eharney: that's you
14:50:10 <eharney> geguileo found one complication i needed to cover here, wanted to make sure people were generally on-board as we get close to the specs deadline
14:50:40 <eharney> my initial spec left out what should happen for people who are, for whatever reason, passing force=False as a parameter to snapshot create
14:51:05 <eharney> since i'm not sure those people exist, i'm leaning toward just not allowing that going forward, as part of this change
14:51:31 <geguileo> eharney: not allowing the force parameter in general, or just the force=false?
14:51:51 <eharney> hrm
14:52:34 <eharney> the spec at the moment says the latter, but you have me wondering if we should do the former now
14:53:50 <rosmaita> #link https://review.opendev.org/c/openstack/cinder-specs/+/781914
14:54:04 <eharney> i outlined some of the options (apparently not including that one) in comments on the previous patchset version
14:54:06 <eharney> rosmaita: right, thanks
14:55:06 <rosmaita> what's the use case for force=false ? just to remind yourself that the volume is attached?
14:55:20 <eharney> i'm not sure there is a very good use case for it
14:55:27 <eharney> it's more just that you could do it..
14:56:30 <eharney> geguileo: thoughts?
14:57:00 <geguileo> rosmaita: the case is where you have code that unconditionally passes the parameter, but uses a variable to set it to true or false
14:57:09 <geguileo> eharney: I like the removal of the force parameter
14:57:26 <geguileo> but failing on force=false is probably less code
14:57:37 <eharney> the downside of just removing it altogether is that it's more of a hurdle for people who had just added force=True (which would be the common case)
14:57:53 <geguileo> yeah, that's a big one
14:58:08 <geguileo> they may be using the highest microversion and just passing it like you say
14:58:20 <geguileo> that's a big reason for accepting it
14:58:34 <rosmaita> ok, we are about out of time ... let's discuss on the spec
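The two options on the table can be compared in a small sketch. Everything here is illustrative: `check_force_param`, `new_behavior`, and the use of `ValueError` are hypothetical, and the sketch assumes the new behavior is gated on a microversion, as geguileo's comment implies. The spec review is where the actual decision lives.

```python
def check_force_param(body, new_behavior, reject_all_force=False):
    """Validate the optional 'force' key of a snapshot-create request body.

    new_behavior: True when the request uses the (assumed) microversion
        at which in-use volumes can be snapshotted without force.
    reject_all_force: False models "fail only on force=False";
        True models "remove the force parameter entirely".
    """
    if not new_behavior or "force" not in body:
        return  # old microversions and requests without force are unchanged
    if reject_all_force:
        raise ValueError("'force' is no longer accepted")
    if body["force"] is False:
        raise ValueError("force=False is no longer accepted")
```

With `reject_all_force=False`, a caller that already sends `force=True` at the highest microversion keeps working, which is the compatibility concern eharney and geguileo raise against removing the parameter outright.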
15:00:06 <rosmaita> thanks everyone, join the bug squad meeting in #openstack-cinder
15:00:22 <rosmaita> #endmeeting