16:00:16 #startmeeting Cinder
16:00:16 Meeting started Wed Jul 12 16:00:16 2017 UTC and is due to finish in 60 minutes. The chair is smcginnis. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:20 The meeting name has been set to 'cinder'
16:00:26 Hello.
16:00:42 Hi
16:00:45 hello!
16:00:49 hi
16:01:06 Hello :)
16:01:11 o/
16:01:13 hi!
16:01:24 Hi
16:01:28 Hello all.
16:01:29 * geguileo is missing the ping };-)
16:01:30 hi
16:01:48 ping dulek duncant eharney geguileo winston-d e0ne jungleboyj jgriffith thingee smcginnis hemna xyang1 tbarron scottda erlon rhedlind jbernard _alastor_ bluex karthikp_ patrickeast dongwenjuan JaniceLee cFouts Thelo vivekd adrianofr mtanino karlamrhein diablo_rojo jay.xu jgregor lhx_ baumann rajinir wilson-l reduxio wanghao thrawn01 chris_morrell watanabe.isao tommylikehu mdovgal ildikov wxy
16:01:53 viks ketonne abishop sivn
16:01:55 thanks! ;-)
16:02:01 :)
16:02:09 I need a macro or something
16:02:10 <_pewp_> hemna ヘ(°¬°)ノ
16:02:12 .o/
16:02:21 hi
16:02:51 hi
16:02:58 o/
16:03:07 OK, guess we can get going.
16:03:12 #topic Announcements
16:03:21 #link https://etherpad.openstack.org/p/cinder-spec-review-tracking Review focus
16:03:42 Still some open work on merged specs for Pike in there.
16:03:47 And we're running really low on time.
16:04:05 So if at all possible, please help with reviews and updates.
16:04:26 I hope to be a little more proactive this time around and move merged specs that don't fully land.
16:04:34 Since we've just left them for the most part in the past.
16:04:40 smcginnis: ++
16:04:44 Then people come along a few releases later and ask why it's not working.
16:05:25 smcginnis: I didn't find the API-WG decision that was asked for on 'Backup Service Enabled API', so let's move it to Queens
16:05:55 o/ hi
16:06:00 e0ne: OK, great. Want to propose that?
16:06:02 o/
16:06:22 smcginnis: I need to look at it a bit deeper first
16:06:33 e0ne: OK, sounds good.
16:06:35 #link https://etherpad.openstack.org/p/cinder-ptg-queens Planning etherpad for PTG
16:06:48 We have a topic planning etherpad started for the PTG.
16:07:04 Please add any topics you think would be good to discuss face to face there.
16:07:14 Hopefully many of you can attend.
16:07:42 Oh, even if you don't have a topic, please add your name if you are planning to attend so we can get an idea of who will be there.
16:07:44 even if you can't attend - please add topics you are interested in
16:07:55 If you don't think you can, please apply for TSP.
16:07:56 e0ne: ;)
16:08:13 diablo_rojo_phon: Jumping the gun! :D
16:08:15 #link https://www.eventbrite.com/e/project-teams-gathering-denver-2017-tickets-33219389087 PTG registration
16:08:19 I think if a topic is really important, it should be discussed
16:08:27 #link https://openstackfoundation.formstack.com/forms/travelsupportptg_denver
16:08:27 maybe with some hangout session too
16:08:38 The travel support program is accepting applications.
16:08:41 diablo_rojo_phon, TSP link?
16:08:44 I know, it's just important :)
16:08:46 * e0ne still is not sure about PTG attendance
16:08:58 If you are involved but are unable to get company funding, please apply for travel support.
16:09:03 e0ne: :(
16:09:14 We will definitely try to stream again.
16:09:20 hemna: smcginnis has it on the agenda
16:09:22 Hopefully with decent audio.
16:09:26 oh yeah, it still costs $100 to attend
16:09:49 hemna: $100 + hotel + flight to US :(
16:09:56 yeah
16:10:00 thanks for the links
16:10:09 Swim.
16:10:17 diablo_rojo_phon: Does the TSP cover anything with the registration cost?
16:10:42 TSP can cover all of it.
16:10:50 Sweet.
16:10:56 Reg, flight, and hotel
16:10:56 So definitely apply if you need it.
16:11:01 Or some subset
16:11:27 #topic Update on mysql/pymysql issues with oslo.db
16:11:34 No name on this one. Was that you, arnewiebalck?
16:11:43 smcginnis: Sorry, that was me.
16:11:50 nope
16:11:52 Just for awareness.
16:11:57 not me, I mean :)
16:12:13 arnewiebalck: Well, it was you. ;)
16:12:14 I talked to the Oslo team about this and they agreed that it was something to improve.
16:12:38 #link https://bugs.launchpad.net/oslo.db/+bug/1692956
16:12:38 Launchpad bug 1692956 in oslo.db "Warn about potentially misconfigured connection string" [Undecided, Fix committed]
16:12:44 <_alastor_> o/
16:12:54 * smcginnis marks _alastor_ as tardy
16:12:59 They will add a warning to the logs when the config option is used, and they will update the help text to indicate that the option can cause deadlocks.
16:13:07 gcb was going to push the patch up.
16:13:15 jungleboyj: OK, great.
16:13:23 <_alastor_> smcginnis: sorry teach :P
16:13:29 Looks like he already has a patch linked in there.
16:13:43 I know there was some debate about just internally switching it to the pymysql connection, but they didn't want to change behavior on people.
16:13:44 arnewiebalck: Would you mind taking a look and making sure it looks good to you?
16:13:58 sure, will do
16:14:16 arnewiebalck: Thank you, sir.
16:14:23 I will go look at the patch in the bug as well.
16:15:07 That was all on that from me. :-)
16:15:22 jungleboyj: Thanks for the update.
16:15:48 #info oslo.db proposal is to log a warning and improve the help text.
16:15:58 smcginnis: Welcome.
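For context on the misconfiguration discussed above: the warning targets deployments that point oslo.db at the C-based MySQL-Python driver, whose blocking calls do not yield to eventlet and can deadlock the service. A minimal illustration of the two connection string forms (host and credentials are placeholders):

```ini
[database]
# mysql:// selects the C-based MySQL-Python driver; its blocking calls
# can deadlock services running under eventlet, which is what the new
# oslo.db log warning and help text are about.
# connection = mysql://cinder:CINDER_DBPASS@controller/cinder

# mysql+pymysql:// selects the pure-Python PyMySQL driver, which
# cooperates with eventlet and is the recommended configuration.
connection = mysql+pymysql://cinder:CINDER_DBPASS@controller/cinder
```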
16:16:01 #topic Documentation Migration
16:16:07 It's the jungleboyj show today. :)
16:16:17 :-)
16:16:23 :)
16:16:27 #link https://review.openstack.org/481756
16:16:29 Just wanted to make people aware of how this is progressing.
16:16:34 #link https://review.openstack.org/481847
16:16:42 #link https://review.openstack.org/481848
16:16:43 So, the main admin-guide content has been migrated and merged.
16:16:54 We have some CLI documentation that hasn't merged yet.
16:17:14 Found that the detailed driver config info hasn't been moved yet either.
16:17:22 So in case anyone missed what was going on, it's been decided all documentation is moving out of openstack-manuals into the individual projects.
16:17:32 So we can update docs along with patches.
16:17:41 So the first step is getting things moved over.
16:17:45 smcginnis: Oh yeah, background is important. :-)
16:17:50 Then we can improve any formatting and other issues.
16:17:58 jungleboyj: ;)
16:18:06 This is all because of the brain drain from the documentation team.
16:18:19 Which reminds me - no real need for DocImpact tags now.
16:18:27 hemna is super excited that we get to maintain our own documentation now.
16:18:33 Since if there is a DocImpact, the patch should just include the doc updates.
16:18:35 smcginnis: Good note.
16:18:45 wow
16:18:47 I raised the subject of whether or not it makes sense to have openstack commands in all the docs when we don't actually maintain that client...
16:19:01 Anybody else think this is kind of "weird"?
16:19:01 All English grammar and style questions can go to hemna
16:19:03 :D
16:19:03 smcginnis: ++ So now reviewers need to enforce doing documentation updates in their patches.
16:19:18 geguileo: +1
16:19:27 smcginnis: :)
16:19:27 geguileo: Yeah, I kind of agree there. But not sure if there's a way around that now.
16:19:31 jungleboyj: +1, that's one of the key aspects here
16:19:42 smcginnis: OK, so it is what it is :-(
16:20:06 ildikov: :-) I will be the documentation czar but would appreciate help enforcing the need for doc updates.
16:20:16 jungleboyj: or they could go in a different patch dependent on the code one, but they should be available before merging the code
16:20:16 Yeah, I think things like the admin guide should show openstack CLI commands, since that's what we want end users to move to.
16:20:27 Even though we don't really have much to do with that.
16:20:37 geguileo: That is good too.
16:20:38 geguileo: +1
16:21:01 jungleboyj: Sorry, I kind of hijacked that. Anything else?
16:21:03 geguileo: We need to move the cinderclient content right now.
16:21:15 jungleboyj: with the Ceilometer team we've been in favor of moving the admin-guide for a while, to be able to handle the doc updates along with the code changes
16:21:26 I don't know that we want that to go away, but we do need to work on moving people to the OpenStack client. Let me think about that.
16:22:19 Anyway, so I will be pushing up a patch for the driver config stuff soon. Then we need to enable handling Sphinx warnings as errors.
16:22:20 jungleboyj: are you just talking about the docs or the cinderclient code?
16:22:30 xyang1: Just the docs.
16:22:34 Ok
16:23:06 There are a lot of docstring issues in our code that I am going to have to push patches up to resolve. Will do that in bite-sized pieces before enabling errors.
16:23:21 Some have seen me -1 patches for docstring issues.
16:23:49 jungleboyj: :)
16:23:52 I need to work on better understanding what is right and wrong. Hope to have that together for everyone to look at before next week's meeting, and then I can answer questions.
16:24:17 Right now I just know what causes the doc build to fail and how to fix it. I want to find out if there are better ways to avoid the warnings.
16:24:45 jungleboyj: I think we all need to learn what correct formatting is, but once we are able to enable warnings as errors, at least it will be pretty obvious.
16:25:04 If you find missing documentation, please let me know.
16:25:23 jungleboyj: Careful what you ask for. :)
16:25:32 smcginnis: ++ Yeah, that will help.
16:25:51 I think people will need to get in the habit of doing a docs build with their changes.
16:26:05 Or we could make it a part of pep8 maybe?
16:26:30 it's already a separate job, doesn't need to be done as part of pep8
16:26:39 I would just run the tox check build or whatever after
16:26:41 jungleboyj: we already have a docs job. Can we re-use it?
16:26:45 Not combine them
16:26:56 jungleboyj: the docs job is fairly quick, so it should be fine
16:27:19 jungleboyj: it's only a matter of raising awareness that it counts from now on
16:27:19 diablo_rojo_phon: Then people will need to remember to do 'tox -e docs' before they do a review.
16:27:32 ildikov: Right.
16:27:41 jungleboyj: people will learn from the docs job failures, I wouldn't add it to pep8 either
16:27:47 eharney: I wonder if we should have a "tox -e pregitreview" target that does fast8, docs, and py27 or something.
16:27:54 people can just run "tox", dragging docs into pep8 is not the right thing to do
16:28:08 Kind of a "here are the things you should really run before proposing a patch".
16:28:16 smcginnis: That would be nice.
16:28:20 smcginnis: shouldn't that be the list of what runs by default when no environment is specified...?
16:28:48 eharney: Well, I think that does full pep8, not fast8, and both py27 and 35.
16:29:04 eharney: And now some jerk wants to add py36 as well.
16:29:05 :P
16:29:12 lol
16:29:16 hehe
16:29:44 Anyway... anything else to cover, jungleboyj?
16:30:05 Anyway, we can bikeshed on that piece when I have the doc builds all working. :-)
16:30:11 +1
16:30:27 smcginnis: Not right now. Appreciate everyone's support getting through the migration.
16:30:41 And being aware that the docs have a new level of importance.
16:30:51 jungleboyj: Thanks!
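On the "tox -e pregitreview" idea floated above: no such target exists, but a sketch of what one could look like in tox.ini follows. The name and the environment list are just the suggestion from the discussion, and the sketch assumes fast8 and docs environments are already defined, as they are in most OpenStack projects.

```ini
# Hypothetical pre-review convenience target; "pregitreview" is only
# the name suggested in the meeting, not an existing environment.
[testenv:pregitreview]
commands =
    # Reuse the commands from the existing style and docs envs.
    {[testenv:fast8]commands}
    {[testenv:docs]commands}
```

In practice the same effect is available without any new target, since tox accepts a comma-separated environment list on the command line, e.g. `tox -e fast8,docs,py27`.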
16:31:02 #topic Gathering of thin provisioning stats in Ocata
16:31:07 arnewiebalck: OK, now it's you.
16:31:25 Ok :)
16:31:58 As mentioned yesterday, we upgraded to Ocata and hit the problem that the provisioning stats gathering broke the upgrade.
16:32:39 The problem is that it cycles through our 4000+ volumes, and that takes too long.
16:33:13 So, we had to disable it to upgrade.
16:33:41 arnewiebalck: is that on a specific backend?
16:33:43 #link https://github.com/openstack/cinder/commit/d4fd5660736a1363a4e78480b116532c71b5ce49
16:33:49 Ceph.
16:33:52 it's a ceph issue
16:34:08 eharney: I thought so, but wanted to be sure
16:34:18 I have to admit I'm also struggling with the overall idea behind this.
16:34:30 'cause I remember Jon Bernard mentioning that that was slow
16:35:03 arnewiebalck: I believe it was the only mechanism to get the data (though I don't remember the specifics)
16:35:13 From what I see, the code queries Ceph for every volume to get the image size.
16:35:24 Cinder knows these sizes already.
16:35:35 It's not getting the actual usage, from what I see.
16:35:51 it only knows the provisioned size already, not the amount of data actually written/consumed
16:36:12 The code doesn't give you that either.
16:36:25 It gives you the allocated size.
16:36:29 the goal of this is to gather that to be able to calculate the overprovisioning ratio
16:36:47 arnewiebalck: it does give you the real size
16:37:28 there was a lengthy discussion in reviews about this code (i think the first attempt was wrong and later fixed), so i hope it's doing the right thing at this point
16:37:28 geguileo: I don't think so. We patched the code and it gives you the allocated size, not the used space of each volume.
16:37:30 it iterates over the diffs to calculate the real size
16:38:37 geguileo: even if it does this, what you want to know in the end is how much space is used in the pool, no?
16:38:54 and that is the sum of all the diffs
16:39:23 geguileo: right (if you get the real size :-) )
16:39:32 arnewiebalck: correct
16:39:42 assuming that there isn't other data written into the pool than just the cinder volumes
16:39:52 in our case you do 4000 calls to Ceph on startup
16:39:57 and on each create
16:40:01 and on each delete
16:40:01 jbernard: could you chime in?
16:40:23 i agree with the direction, as long as pools are cinder-exclusive
16:40:28 else, stats will be misleading
16:40:29 arnewiebalck: no, not on each create (afaik)
16:40:36 Seems like there should be a more efficient way than making 4000 calls.
16:40:48 smcginnis: I don't think there is
16:40:49 geguileo: ok ok, I got excited ;)
16:40:52 and we need to preserve allocated reporting, and not virtual
16:41:01 there was also a pending optimization to move to diff_iterate2 which hasn't been tried afaik
16:41:05 but i think we're all on the same page about that, from reading the backlog
16:41:11 smcginnis: the pool could be used for other volumes
16:41:47 geguileo: then a per-pool call will be inaccurate
16:41:51 geguileo: if it is used by other volumes, it will be very difficult to do the oversubscription calculation correctly
16:41:59 jbernard: yes, that's what I meant
16:42:10 jbernard: that there's no easier way, because it could be shared
16:42:39 if it can, then we must iterate
16:42:41 arnewiebalck: yes, but it's possible
16:43:00 or, document that we prefer it not be
16:43:04 geguileo: sounds pretty complicated
16:43:06 or add a setting
16:43:13 but let's not do that
16:43:25 arnewiebalck: unless we explicitly prevent it somehow, it's possible
16:43:58 geguileo: the admin would have to adhere to a policy, and deploy as such
16:44:42 I don't suppose a call could be added to ceph to take a collection and return the result in one call?
16:44:51 jbernard: Or we could check on start that all volumes belong to cinder (or look like they do) and report a warning that data will be inaccurate if not
16:45:00 Not sure passing 4000 IDs is much better. Or possible.
16:45:04 smcginnis: it could, but we'd have to iterate there
16:45:15 smcginnis: and it would take time to adopt
16:45:19 Why isn't it enough to know how full the pool is?
16:45:38 In the end, the admin needs to take action when some threshold of real usage is reached, no?
16:46:08 Yeah, seems like you would want to know the pool usage total, not just the cinder usage. So if it's used by non-cinder data you can actually take some of that into account.
16:46:09 No matter what filled the pool.
16:46:30 smcginnis: Good point. :-)
16:46:31 arnewiebalck: it looks like it got broken and you are right
16:46:39 I was looking at the original code
16:46:53 but this broke it: https://review.openstack.org/#/c/410884/5/cinder/volume/drivers/rbd.py
16:46:57 geguileo: ok, thx for checking!
16:47:02 yeah, i also thought the current code still did diff_iterate, apparently not
16:47:25 arnewiebalck: it's adding the total size, so it's like you say: not doing what it should
16:47:50 So 2 issues: right now it's not returning the right data
16:48:12 And it's inefficient
16:49:08 arnewiebalck: Not just inefficient. It breaks large deployments, right?
16:49:36 smcginnis: c-vol for that pool didn't start
16:49:37 presumably it breaks them by causing a timeout somewhere to be exceeded that could be raised in config?
16:49:41 arnewiebalck: the problem is not knowing how full the pool is, but how much space WE are using
16:49:53 service-list reported that c-vol as XXX
16:52:41 Maybe enough for the meeting? Sounds like there will need to be some follow-up discussion later.
16:52:41 geguileo: You would use a pool for something other than just Cinder volumes (and not have a separate pool)?
16:53:05 arnewiebalck: or you could have 2 different cinder-volume services using the same pool
16:53:06 smcginnis: Shall we open a bug for the follow-up?
16:53:21 arnewiebalck: Sounds like that might be good if the current bug doesn't cover all of it.
16:53:25 arnewiebalck: +1 for a bug for it
16:54:11 smcginnis: you mean https://bugs.launchpad.net/cinder/+bug/1698786
16:54:12 Launchpad bug 1698786 in Cinder "cinder-volume fails on start when rbd pool contains partially deleted images" [Undecided, In progress] - Assigned to Ivan Kolodyazhny (e0ne)
16:54:12 ?
16:55:19 geguileo: I can see how you would use 2 pools for 1 service, but the other way round?
16:55:32 arnewiebalck: I've seen it done
16:55:47 I'm not saying it makes sense, buuuuut, I've seen it
16:55:51 geguileo: was there an explanation? ;)
16:56:02 geguileo: ah, I see :-D
16:56:19 arnewiebalck: Yeah, that's what I was thinking of.
16:56:40 Anything else we need to discuss? 4 minutes.
16:57:42 OK, let's wrap up then. Thanks, everyone.
16:57:44 arnewiebalck, smcginnis: it's a different bug
16:58:10 e0ne: ok, I'll open one then
16:58:27 #endmeeting
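For reference on the thin provisioning topic: the "real size" the driver is supposed to report is the sum of each image's written extents, which RBD only exposes per image via diff_iterate(). A minimal sketch of that calculation with the python rbd bindings follows; it is illustrative only, not the Cinder driver code, and the function names are made up. It also shows why the startup cost scales with the volume count (one diff walk per image, 4000+ in the deployment discussed), and the result only matches pool usage if the pool is cinder-exclusive, as noted in the meeting.

```python
# Illustrative sketch, not Cinder driver code: compute per-volume
# consumed space with the python rbd bindings. Function names are
# made up; ioctx is assumed to be an open rados pool I/O context.
import rbd


def volume_used_bytes(ioctx, volume_name):
    """Sum the extents actually written to one RBD image."""
    used = [0]

    def _accumulate(offset, length, exists):
        # diff_iterate() invokes this once per extent; 'exists' is
        # true for extents that hold data.
        if exists:
            used[0] += length

    with rbd.Image(ioctx, volume_name, read_only=True) as image:
        # Diffing against no snapshot (None) over the whole image
        # visits every allocated extent.
        image.diff_iterate(0, image.size(), None, _accumulate)
    return used[0]


def pool_used_by_volumes(ioctx):
    """Total consumed space across all images in the pool -- the
    input to the overprovisioning ratio. This loop is the
    one-call-per-volume cost discussed in the meeting."""
    return sum(volume_used_bytes(ioctx, name)
               for name in rbd.RBD().list(ioctx))
```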