Thursday, 2023-12-07

*** tobias-urdin9 is now known as tobias-urdin13:03
croeland1o/14:00
pranali#startmeeting glance14:00
opendevmeetMeeting started Thu Dec  7 14:00:28 2023 UTC and is due to finish in 60 minutes.  The chair is pranali. Information about MeetBot at http://wiki.debian.org/MeetBot.14:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.14:00
opendevmeetThe meeting name has been set to 'glance'14:00
pranali#topic roll call14:00
pranali#link https://etherpad.openstack.org/p/glance-team-meeting-agenda14:00
pranalio/14:00
mrjoshio/14:00
pranaliok, so assuming everyone is back here let's start :)14:01
rosmaitao/14:01
pranali#topic release/periodic jobs updates14:01
pranaliM2 is 5 weeks from now which will be spec freeze for us as well14:01
pranaliPeriodic jobs are all green except intermittent TIME_OUTs on fips jobs14:02
pranalimoving to next 14:02
pranali#topic RBD deletion Issue14:02
pranali#link https://bugs.launchpad.net/glance/+bug/2045769 - Image remains in active state even image data is deleted from the rbd store14:02
pranaliso i've observed this issue during the new add location api testing, when a delete is attempted while hash calculation is still ongoing after the image has been set to active14:03
rosmaitayou asked me to look at this yesterday but i forgot14:04
pranaliyeah np14:04
pranaliI just thought you might have an idea about this because i have seen one of your old patches where the commit msg mentions that when the store throws an in-use exception it deletes the data as well14:05
pranali#link https://github.com/openstack/glance/commit/f267bd6cde0e2b3ef5d08ae7c91831e1c88ed99014:05
pranalithis one ^14:05
rosmaitaok, i will claim that i co-authored the part that doesn't have a bug14:06
pranaliohh ok14:06
rosmaita(just kidding)14:07
pranali:D14:07
pranaliI've tried to fix that in my current location import patch by marking the image to deleted after catching the exception14:08
pranali#link https://review.opendev.org/c/openstack/glance/+/886749/31/glance/async_/flows/location_import.py#8314:08
pranalibut after noticing this same issue for image download as well i think it should be handled in the delete operation itself, right?14:09
rosmaitasorry, i'm still trying to figure out the context (looking at the bug https://bugs.launchpad.net/glance/+bug/2045769 )14:10
rosmaitawith that bug, for step #114:10
rosmaitathe image has been uploaded and gone active before you go to step #2, is that right?14:11
pranaliyes14:11
rosmaitaok, and since that was a regular 'glance image-create', the hash would be done during the upload (not later? or have we changed that?)14:12
pranaliyeah i think so14:13
rosmaitaok, what i'm getting at is that i don't think the hash computation is involved in this issue14:13
rosmaitathe error in step #2 i'm pretty sure is coming from the client14:14
pranalihmm need to check that but download has got NotFound error since the data was lost14:15
pranali#link https://paste.opendev.org/show/bg8hJ7kF7CYJVM4lZMe2/14:16
rosmaitaright14:16
pranaliI'm just not sure why store raises InUseByStore exception if it deletes the data14:17
rosmaitaright14:17
rosmaitai wonder if the image cache has anything to do with this14:18
abhishekkrosmaita, let me explain the issue here14:18
rosmaitahave you tried it without the cache (or do we always cache these days)14:18
rosmaitaplease!14:18
abhishekkI created the image A of 5 gb (hash is calculated) and image is active now14:19
abhishekkI sent image download request, download started and in 2nd window I sent delete image request14:19
abhishekkNow what happens is download interrupts as data is deleted but delete call fails by saying image in use14:20
abhishekkand image remains active state14:20
abhishekkon 2nd download call we get error that image has no data14:20
abhishekkProblem is store returns image is busy but it also deletes the data from the store14:20
abhishekkAnd user gets delete call failed and he sees image is still active14:21
rosmaitaare all the locations gone at that point?14:21
abhishekk(assume) There is only one location, store deletes the data and returns Busy exception to glance14:22
abhishekkglance does not delete the location and keeps the image in active state14:22
rosmaitabut does it leave the location on the image14:22
abhishekkyes14:22
rosmaitaok14:22
abhishekkI think this is serious issue14:23
abhishekkThere are two possibilities, 14:23
abhishekkregression in ceph14:24
abhishekkor store code is wrong14:24
rosmaitaor both!14:24
abhishekk:D14:24
abhishekkmy suggestion to pranali is deploy quincy and check this scenario14:25
rosmaitaok, on the plus side, though, the user deleted the image and the data is gone, so they will be annoyed that it still shows active, but shouldn't be too annoyed because they deleted it14:25
abhishekkcorrect14:25
rosmaitado we have debug logs from the first image delete in this scenario?14:26
pranaliabhishekk, i've tried to change the ceph version in nova-ceph-multistore job but it's failing  14:27
pranali#link https://zuul.opendev.org/t/openstack/build/e62d4a18b87f4be1872c84c0560f61d314:27
abhishekkpranali, might have it14:28
* pranali is checking 14:28
abhishekkfind out the error, and try, because we need to rule out the possibilities 14:28
abhishekkthis issue is easily reproducible, so we can get logs again 14:29
rosmaitaso basically, glance_store rbd driver asks ceph to delete the data, it gets back an is-busy-error, but ceph deletes the data anyway14:29
rosmaitaglance thinks that the delete failed, so it keeps the image in 'active'14:30
rosmaitaand doesn't remove the location where it thinks the data is14:30
abhishekkcorrect14:30
rosmaitabut since ceph deleted the data, all downloads fail14:30
abhishekkyes14:30
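The failure mode summarized above can be sketched as a small, self-contained simulation. This is not glance or glance_store code: FakeRbdStore, delete_image, and the location URL are hypothetical names invented for illustration, and InUseByStore is a local stand-in that only borrows the exception name mentioned in the discussion. It assumes a single location and a store that removes the data and still reports the image as busy.

```python
class InUseByStore(Exception):
    """Local stand-in for the store's 'image in use' error (assumption)."""


class FakeRbdStore:
    """Hypothetical store that reproduces the reported behaviour."""

    def __init__(self):
        self.data = {}

    def delete(self, location):
        # Reported behaviour: the data is removed AND a busy error is raised.
        self.data.pop(location, None)
        raise InUseByStore(location)


def delete_image(image, store):
    """Simplified delete path: leave the image untouched if the store says busy."""
    try:
        for location in list(image["locations"]):
            store.delete(location)
            image["locations"].remove(location)
    except InUseByStore:
        # The delete call fails; status and locations stay as they were.
        return False
    image["status"] = "deleted"
    return True


store = FakeRbdStore()
store.data["rbd://pool/image-a"] = b"payload"
image = {"status": "active", "locations": ["rbd://pool/image-a"]}

deleted = delete_image(image, store)
```

After the call, the image is still 'active' and still lists its location, but the store no longer holds the data, which matches the state in which later downloads fail.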
rosmaitaand this is with current master code, and which ceph?14:31
pranalii think the latest ceph, Reef 14:32
rosmaitaok14:32
pranaliwe have not yet confirmed whether it's there with previous version of ceph as well14:32
pranali#link https://etherpad.opendev.org/p/image-delete-from-rbd-issue14:33
pranaliI've these logs atm14:33
rosmaitaok, thanks14:34
abhishekketherpad is empty?14:34
pranali:O14:34
pranaliI can see the logs in that etherpad14:35
abhishekkits empty for me14:36
mrjoshiIt's empty for me too14:36
abhishekk\o14:37
abhishekk\o/ voodoo14:37
* croeland1 sees nothing14:38
pranalihmm not sure why it's showing me now reconnecting continuously :/ 14:39
* abhishekk it's magic, issue does not want us to solve it except pranali 14:39
pranaliLOL14:39
pranaliok, I think we should move ahead and can continue this discussion on glance channel14:41
pranali#link https://paste.opendev.org/show/b8Lt6CgF5Sjd7Sd3g8SV/, tried to add the logs here14:42
rosmaitaok, i can see that one14:42
abhishekkI think your logs broke etherpad :P14:42
pranaliplz ignore the above link, #link https://paste.opendev.org/show/b8sruRYp2tcRcJ9sWwqB/ 14:43
abhishekkI think you can explore above possibilities to isolate the problem14:45
abhishekklet's move ahead,14:45
pranaliyeah14:45
pranali#topic Specs14:45
pranali#link https://review.opendev.org/c/openstack/glance-specs/+/899367 - Use Centralized database for cache operations14:46
abhishekkalso do one check, upload large image and during upload delete the image and see what happens14:46
pranali#link https://review.opendev.org/c/openstack/glance-specs/+/900267 - New API to restore image14:46
pranali#link https://review.opendev.org/c/openstack/glance-specs/+/899804 - [Spec Lite] Deprecate location strategy14:46
pranali#link https://review.opendev.org/c/openstack/glance-specs/+/899805 - [Spec Lite] Deprecate cachemanage middleware14:46
pranali#link https://review.opendev.org/c/openstack/glance-specs/+/899857 - Caracal project priorities14:46
pranaliabhishekk, yeah that also should be tried, I will do that14:47
pranaliplease have a look at these specs; I've sent emails about the deprecation specs to the ML, so we can wait on those till the end of this month in case anyone has an objection 14:48
abhishekkplease review centralized db spec, that is most important this cycle14:49
pranaliyes14:49
pranalithat's it from me14:50
pranalilet's move to open discussions14:51
pranali#topic Open Discussions14:51
abhishekkI don't have anything else 14:51
mrjoshiI would like to highlight14:51
rosmaitaok, somebody please bug me tomorrow about reviewing specs14:51
pranalirosmaita, ack :)14:52
abhishekkhaha14:52
* abhishekk signing out14:52
abhishekkthank you all14:52
pranaliThanks everyone for joining !!14:52
pranali#endmeeting14:52
opendevmeetMeeting ended Thu Dec  7 14:52:59 2023 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)14:52
opendevmeetMinutes:        https://meetings.opendev.org/meetings/glance/2023/glance.2023-12-07-14.00.html14:52
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/glance/2023/glance.2023-12-07-14.00.txt14:52
opendevmeetLog:            https://meetings.opendev.org/meetings/glance/2023/glance.2023-12-07-14.00.log.html14:52
*** tobias-urdin34 is now known as tobias-urdin17:27
