14:00:20 #startmeeting cinder 14:00:21 Meeting started Wed Sep 2 14:00:20 2020 UTC and is due to finish in 60 minutes. The chair is rosmaita. Information about MeetBot at http://wiki.debian.org/MeetBot. 14:00:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 14:00:24 The meeting name has been set to 'cinder' 14:00:31 #topic roll call 14:00:33 Hi 14:00:38 hi 14:00:39 hi 14:00:43 ji 14:00:44 hi 14:00:44 hi 14:00:45 hi* 14:00:47 .o. 14:00:52 mep 14:01:33 good turnout 14:01:44 #link https://etherpad.openstack.org/p/cinder-victoria-meetings 14:01:56 #topic announcements 14:02:12 reminder about Forum brainstorming 14:02:25 i sent something to the ML to solicit ideas from operators and other cinder users 14:02:34 #link http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016659.html 14:02:47 and here is where you can add ideas: 14:02:57 #link https://etherpad.opendev.org/p/2020-Wallaby-cinder-brainstorming 14:03:21 hi =] 14:03:40 we will pick what topics to propose at next week's meeting 14:03:51 o/ 14:03:56 in the meantime, you need to be registered for the Summit to attend the Forum 14:04:16 it's free, so go ahead and register 14:04:28 #link https://openinfrasummit2020.eventbrite.com/ 14:05:04 also, we have the Wallaby PTG coming up 14:05:09 is the deadline for proposing topics on Wed 9 sept ? 14:05:35 enriquetaso: i thought friday that week? 14:05:52 thanks! 14:06:18 but if you have an idea, add it to the forum etherpad 14:06:39 ++ 14:06:40 and the wallaby PTG etherpad is 14:06:51 #link https://etherpad.opendev.org/p/wallaby-ptg-cinder-planning 14:07:16 #topic updates - releases 14:07:29 ok, os-brick for victoria has to be released tomorrow 14:07:36 we have one patch in the gate now 14:07:52 #link https://review.opendev.org/730376 14:08:02 previous release was 3.2.1 14:08:11 this will be 4.0.0 because we removed a bunch of connectors for drivers that were removed from cinder 14:08:14 thanks everybody for the feedback to the patch above 14:08:50 we have some stuff piled up for stable/ussuri and train in os-brick 14:09:04 so i will put up patches for those releases soon 14:09:17 next up: python-cinderclient release next week 14:09:26 someone please look at this one! it's a 2 line change! cinderclient support for mv 3.61 (which has merged on cinder side) 14:09:32 #link https://review.opendev.org/#/c/742994/ 14:09:57 the big thing to focus on is cinder patch for per-project default volume types: 14:10:04 #link https://review.opendev.org/#/c/737707/ 14:10:05 * e0ne is feeling like a release-blocker person 14:10:18 because that has a cinderclient support patch: 14:10:25 #link https://review.opendev.org/739223 14:10:39 looks like that patch only bumps the microversion number ? 14:10:52 hemna: yes 14:10:54 hemna: yes 14:11:25 ok done 14:11:32 whoami-rajat has revised https://review.opendev.org/#/c/742994/ 14:11:46 smcginnis, hemna: thank you! 14:12:00 so it could use some eyes, need to make sure it's acceptable so that we know the cinderclient side is correct 14:12:01 rosmaita, that's not the patch ... 14:12:15 oops 14:12:24 #link https://review.opendev.org/#/c/739223/ 14:12:25 #link https://review.opendev.org/#/c/737707/ 14:12:47 we need the cinder-side patch merged first, i think 14:13:05 so to be completely clear: these are really high priority patches 14:13:18 #link https://review.opendev.org/#/c/737707/ 14:13:21 oh the API one 14:13:23 ok 14:13:24 #link https://review.opendev.org/739223 14:13:41 ok, next priority are other cinder features for victoria 14:13:52 use the blueprints: 14:13:52 #link https://blueprints.launchpad.net/cinder/victoria 14:14:06 but i will point out some really easy reviews (and they have been sitting quite a while ...) 14:14:13 support zstd compression: 14:14:14 #link https://review.opendev.org/726765 14:14:21 extra-specs support for volume local cache: 14:14:22 #link https://review.opendev.org/#/c/700799/ 14:14:36 stop sending notifications to deprecated nonstandard publisher_id 14:14:36 #link https://review.opendev.org/747540 14:14:36 hemna: would like you to take a look ^^ 14:14:46 ok 14:14:46 remove deprecated rbd option for OSSN-0085 14:14:46 #link https://review.opendev.org/#/c/747494/ 14:14:46 and there are the driver features 14:15:14 ah yes cool 14:15:50 ok, so to be completely clear: top priority is whoami-rajat's per-project-default-type stuff 14:15:58 because we must release cinderclient next week 14:16:13 so no possibility of FFE to help there 14:16:26 and a reminder to people with driver features 14:16:49 reviewing other people's patches will help you get yours reviewed faster 14:16:53 (we do pay attention) 14:16:55 lots of failures on the per project default type patch :( 14:17:28 yeah, the gate has been really unstable 14:17:57 any other announcements? 14:18:03 (that's all from me) 14:18:42 #topic RBD connector fix 14:18:56 there was some discussion about this today in the cinder channel 14:19:21 it may be possible to simplify the patch, but it requires more testing against various ceph versions 14:20:03 #link https://review.opendev.org/#/c/730376/ 14:20:20 I tested only with octopus and nautilus 14:21:22 the idea is to test against supported versions to check if [global] section works for every supported ceph release 14:21:50 so to be clear,the patch is backward compatible right now, and since the issue has only been reported for octopus, we should be OK 14:22:02 but it may be possible to remove the conditional logic 14:22:22 so for ceph-inclined people, that's something to take a look at 14:22:30 e0ne: anything else? 14:22:41 rosmaita: that's all for me 14:22:53 I raised this topic only because release deadline 14:23:06 and quite appropriately, too 14:23:23 #topic service creation race condition 14:23:29 #link https://bugs.launchpad.net/cinder/+bug/1891330 14:23:30 Launchpad bug 1891330 in Cinder "Duplicate Cinder services DB entries after upgrade" [Low,Triaged] 14:23:40 i mostly just want to raise some awareness 14:23:46 there is an abandoned patch (link in the bug) 14:23:53 but the major objection was that cinder didn't support A/A yet 14:23:58 which isn't the case any more 14:24:02 I'm all for restoring proposed patch to add unique constraints into the DB 14:24:26 well, ordinarily, i would be too, but my experience this past week is making me a bit gun-shy 14:24:31 as we will discuss in a minute 14:25:25 but, it does look like an easy fix 14:25:58 #topic ussuri upgrade issue 14:26:23 this is the issue 14:26:28 #link https://bugs.launchpad.net/cinder/+bug/1893107 14:26:29 Launchpad bug 1893107 in Cinder "Upgrade to Ussuri fails if deleted volumes exist prior to Train" [High,In progress] - Assigned to Mohammed Naser (mnaser) 14:27:06 well, i spent a bunch of time trying to figure out a way to fix this that was low-impact on operators 14:27:12 and i thought i had a solution 14:27:18 but, wound up going back to the original proposal 14:27:50 the issue in a nutshell is that the online migrations to set __DEFAULT__ as the volume type for all untyped volumes/snapshots in Train 14:28:02 did not take into account the soft-deleted volumes/snapshots 14:28:12 so when you try to upgrade to ussuri 14:28:24 and the db_sync puts a non-nullability constraint on the columns 14:28:28 the db_sync fails 14:28:55 this is complicated by the fact that someone discovered a way to delete the __DEFAULT__ volume type in train 14:29:06 so, we don't know for sure that it actually exists 14:29:31 but, the issue only applies to people who upgraded from Stein to Train without first purging the database 14:29:38 (which may actually be a lot of people) 14:30:11 anyway, the best thing i could come up with are these patches: 14:30:21 #link https://review.opendev.org/#/c/748481/ 14:30:28 (but if they don't have issue purging it then they can purge anytime and upgrade successfully) 14:30:33 #link https://review.opendev.org/#/c/748482/ 14:31:00 so, my first ask is for careful reviews of those patches 14:31:08 the code change is simple, and there are tests 14:31:16 whoami-rajat: sometimes operators can't do purge because they use deleted volumes for billing 14:31:28 but there is manual operator intervention required when you don't do the purge 14:31:45 so PLEASE read the release notes carefully to see if i covered everything 14:31:45 e0ne, yep, that's why i said if they don't have issue/don't need that info 14:32:13 and if you are interested in all the crazy schemes i thought of that didn't work, you can read the sorry story here: 14:32:21 fwiw, we do purging 14:32:22 #link https://etherpad.opendev.org/p/ussuri-upgrade-issue 14:33:46 yes, so this shows a weakness in grenade testing of T->U upgrades 14:33:59 the problem only happens in S->T->U upgrades 14:34:13 but i am not going to propose a 3 stage grenade job 14:35:09 the reason i had this on the agenda was that i was going to propose using some of the "placeholder" db migrations 14:35:24 but it turned out that wasn't going to really help 14:35:41 so we are left with really long release notes explaining how to do it yourself 14:36:11 that's all from me, unless anyone has questions 14:36:51 excluding the part that they've renamed the __DEFAULT__ type, they just have to run migrations 14:38:37 #topic open discussion 14:38:58 If i may, just a small note: 14:39:18 Our CI (Quobyte) is one of the failing 3rdparty CIs. 14:39:30 I've been trying to fix it in the last but that did not work out. 14:39:41 I'm currently setting up a completely new CI stack 14:39:49 So i hope i get this back online in the next days 14:39:58 14:40:06 Thanks kaisers 14:40:11 ok, thanks for the update 14:40:41 i would like some feedback on the idea to backport https://review.opendev.org/#/c/741498/ 14:40:41 this doesn't change anything on a normal deployment just this allows the operators to delete the __DEFAULT__ type if they've 14:40:41 set a valid default_volume_type in the conf 14:41:28 on the plus side, that would make the behavior consistent from Train through Victoria 14:41:43 and it doesn't really impact the upgrade issue we just talked about 14:41:48 that patch makes setting the default_volume_type conf entry manditory 14:42:03 which might bust older deployments if you are backporting this 14:42:11 how? 14:42:12 yep, and it defaults to __DEFAULT__ 14:42:20 if that setting isn't in cinder.conf 14:42:22 what happens? 14:42:26 since it's manditory 14:42:33 cinder won't start? 14:42:36 ^^ 14:42:52 well, if they don't set it, they get __DEFAULT__ 14:43:00 if they do set it, they get whatever they set 14:43:02 so then it's not manditory 14:43:12 hemna, https://review.opendev.org/#/c/741498/16/cinder/common/config.py 14:43:14 i think mandatory here means it isn't set to empty/none ? 14:43:15 well, depends on what you mean 14:43:15 * hemna has a confusion 14:43:25 it's required but it also has a default value 14:43:34 IMHO manditory means that you must set it in cinder.conf and have a valid value 14:43:49 if it has a default value then it's not manditory to set it 14:44:11 hemna: +1 14:44:15 it's confusing 14:45:18 to preserve the current behavior, we have defaulted it to __DEFAULT__ so there 14:45:26 's no effort to set it manually 14:45:39 but if you're going to set it, it should be a valid type 14:46:02 also deleting the value set in default_volume_type is not allowed so we atleast have 1 type in deployment and don't allow untyped volumes 14:46:14 of all the confusing things in cinder, this is like the least confusing thing i can think of! :) 14:47:11 hemna: I had the same reaction. ;) 14:48:10 i think what it comes down to is that we want to encourage operators to set that config value 14:48:42 but, they don't have to 14:49:16 you must have a default_volume_type correctly defined at all times, or you can't create volumes 14:49:33 whether you let us define it as __DEFAULT__ or whether you do it yourself 14:49:52 so that strikes me as "mandatory" in at least some aspect of the term 14:51:07 all these changes are a part of a bug saying operators/users get *confused* with the name __DEFAULT__ when they've a different default_volume_type set 14:51:54 I guess it's just the wording that makes it confusing is all. 14:52:41 The confusion is all based on the fact that we didn't do that right and should have kept __DEFAULT__ a hidden internal thing if someone didn't provide a type. 14:52:49 if it's manditory, then the deployer must set it. but he doesn't because there is a default. hence the confusing. 14:52:53 it's not too late to revise the release note to be more clear 14:53:00 https://review.opendev.org/#/c/741498/16/releasenotes/notes/allow-deleting-__DEFAULT__-type-d35dfb5d89760b9b.yaml 14:53:17 so, anyone interested in this issue, please read through ^^ 14:53:50 we can get that clarified and then decide about the backports, squashing in the revised release note 14:56:17 the reported bug mentions the train and ussuri releases so that's also a consideration to backport it 14:57:38 coming up on 2 minutes left 14:58:59 ok, looks like we're done for today! 14:58:59 Please ... look at the release notes on the upgrade issue patches -- with Victoria release coming up, i think people will realize they need to upgrade to ussuri 14:59:09 https://review.opendev.org/#/c/748481/ 14:59:09 https://review.opendev.org/#/c/748482/ 14:59:34 ok, thanks everyone! let's clear out so horizon can start on time 14:59:39 happy reviewing! 14:59:43 #endmeeting