16:00:22 #startmeeting cinder
16:00:22 Meeting started Wed Mar 25 16:00:22 2015 UTC and is due to finish in 60 minutes. The chair is thingee. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:23 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:24 hi!
16:00:26 The meeting name has been set to 'cinder'
16:00:31 o/
16:00:33 hi everyone!
16:00:34 hi
16:00:35 hi
16:00:37 hello
16:00:38 o/
16:00:39 o/
16:00:40 good day folks
16:00:40 Hi
16:00:42 hi
16:00:42 o/
16:00:46 hello
16:00:47 * jungleboyj waves from the beach
16:00:48 hi
16:00:50 hi
16:00:52 Hi
16:01:00 agenda items for today:
16:01:02 * smcginnis doesn't think jungleboyj does vacations well.
16:01:02 #link https://wiki.openstack.org/wiki/CinderMeetings#Next_meeting
16:01:26 #topic Future of taskflow in cinder
16:01:28 abhishekk: hi
16:01:37 o/
16:01:37 thingee: hi
16:01:39 hello
16:01:41 o/
16:01:48 abhishekk: so there has been zero progress on discussion with Cinder and taskflow
16:01:53 yes
16:01:56 abhishekk: just like the last time this was brought up
16:02:17 abhishekk: there have been a bit more pressing issues going on in Cinder, if you're keeping up with the mailing list
16:02:17 thingee: yes, we are ready to put constructive efforts for L
16:02:22 thingee, abhishekk: we've got design session proposals for it
16:02:46 that's great
16:03:02 abhishekk: so yes there's a summit session. unless there is anything new to bring to the table on this topic
16:03:05 yes, I've proposed it, but to prepare it would be great to know the direction
16:03:27 I agree with dulek
16:03:33 ok what's new?
16:04:13 dulek: i propose to update our patches before summit
16:04:27 thingee: I guess not much, I would just love to know if the decision on stepping out from TaskFlow is taken or not
16:04:41 I would like to ask that there is a decided approach and not X number of approaches
16:04:41 thingee: yes, Are we going to step out from using TaskFlow?
16:04:54 I just find the whole thing confusing otherwise...which is why I think no progress has been made.
16:05:02 on getting things merged
16:05:25 thingee: TBH, we started work on it too late for K
16:05:29 abhishekk: that's the point of the summit
16:05:40 abhishekk: which is a topic I will say comes up every summit and midcycle meetup
16:05:46 abhishekk: which is not a great sign for a project
16:05:54 e0ne: ^
16:06:18 thingee: agree, it's reasonable
16:06:25 thingee: that's ok for me, so we'll decide at the summit
16:06:36 thingee: right, hopefully we can come to some conclusion in this summit
16:07:02 I agree that it is concerning that we keep discussing it and not making progress. Taskflow is hard to wrap our heads around.
16:07:10 if there is a split in the decision, I'll weigh in more after giving my initial thoughts at the summit based on the decided approach from you all on state transitions.
16:07:32 who wants to make sure that there is a decided approach to give at the summit?
16:07:43 and ideally decided early before the summit so we can familiarize ourselves
16:07:57 thingee: +1
16:08:05 is there an overview of cons of using taskflow?
16:08:12 We need better examples and reasons for why taskflow is better if we are going to make a decision at the Summit.
16:08:24 thingee: +1
16:08:41 I for one need guidance on how to debug issues when taskflow is involved. I get lost.
16:08:56 thingee: I can gather approach ideas and communicate to the ML before the summit.
16:08:57 tbarron: +2
16:09:00 eikke: there are a lot of scattered reviews of us realizing some issues we hit along the way. If you take the recent allocation bug that jgriffith was trying to solve for some time and then dulek had to write a hack for to get around...
16:09:04 dulek: which by the way, thank you
16:09:22 thingee: np, but I got to agree, it was harsh
16:09:34 #action dulek to make sure there is a decided approach to give at the summit and sent to the mailing list beforehand
16:09:37 it worries me that most of the taskflow discussions seem to revolve around general feeling and not analysis of real pros/cons... it would be good to get some organized information
16:09:51 eharney: +2
16:09:58 eharney: +2
16:09:58 eharney: +1 eikke mentioning that made me realize that as well
16:10:04 eharney: +1
16:10:05 +10
16:10:10 eharney: +1
16:10:14 I just don't understand what it improves yet.
16:10:15 eharney: +1
16:10:20 to help the taskflow folks, I think it would be good if someone gathered the issues from the cinder side
16:10:24 Need concrete pros and cons.
16:11:05 * thingee would vote for jgriffith to provide some information to whoever collects this info
16:11:06 jungleboyj: as i said a few months before, it's not very useful w/o persistence
16:11:31 anyone able to volunteer to gather the issues with taskflow + cinder?
16:11:45 thingee: i can do it
16:11:57 #action e0ne to gather the cons with taskflow in Cinder
16:12:03 e0ne: please talk to jgriffith
16:12:07 e0ne: and thank you!
16:12:14 thingee: sure, i'll do
16:12:16 e0ne: Would you mind to use ML to communicate that? We may start a discussion here which is good.
16:12:28 dulek: great idea
16:12:30 s/here/there
16:12:31 thank you e0ne
16:12:37 #undo
16:12:38 Removing item from minutes:
16:12:43 Adding to etherpad for summit would be good too.
16:12:55 #action e0ne to gather the cons with taskflow in Cinder and post to ML for discussion
16:13:01 cebruns: it's there
16:13:03 also remember to actually compare not using taskflow to an alternative (missing functionality we'd like, implementing some complex thing ourselves...)
16:13:06 from what others told me
16:13:12 * thingee checks
16:13:17 dulek: i'll chat you too for it:)
16:13:24 e0ne: cool :)
16:13:49 #link https://etherpad.openstack.org/p/cinder-liberty-proposed-sessions
16:13:50 Yeah - meant adding the investigation/pros/cons results to etherpad as well as ML
16:13:52 cebruns: ^
16:13:57 cebruns: good idea
16:14:19 ok I think this was great to discuss to get ourselves more organized for the topic at the summit
16:14:23 eharney: yes, imo, the main problem, we don't use needed features from taskflow
16:14:23 thingee: I guess we'll take care of that with e0ne
16:14:25 thanks dulek e0ne and abhishekk
16:14:28 it could be easier to understand the real benefit of taskflow if there were some carefully designed synthetic 'failure' cases, which one can actually reproduce if needed, that taskflow helps.
16:16:23 winston-d_zzZ: +1
16:16:43 winston-d: +1 and showing "when Cinder encounters this..." taskflow can do "this" to alleviate it...
16:16:45 winston-d: +1
16:16:46 winston-d: I agree. I think some of the persistence patches have not been completely clear about what scenario they actually fix. e.g. if this node fails, we have to wait for that node to come back to resume
16:16:49 winston-d: +1
16:17:17 * thingee might be wrong but remembers DuncanT being a bit involved with that and calling them out
16:17:42 Erk, sorry, hadn't noticed the time
16:17:54 also with some of the persistence stuff being limited to use sqlite on the node itself. did that get resolved?
16:17:56 thingee: we don't use taskflow for cinder recovering at all
16:18:14 e0ne: correct. there are patches in discussion about this though
16:18:33 thingee: no.
we can use any DB that is supported by TF (MySQL, PostgreSQL, etc)
16:18:38 thingee: This creates new issues, which I wanted to start discussion in ML thread mentioned by abhishekk
16:18:53 thingee: there was a concern for it in dulek's spec
16:18:54 thingee: But that's true, we should revisit that
16:19:22 dulek: ok so nothing has changed there? I apologize but I'm not up to speed on the particular patch that this was brought up on
16:20:11 thingee: we moved our focus from persistence to K-3, afair
16:20:12 thingee: I've changed the priorities, will make sure we have clear vision before the summit.
16:20:23 so nobody was involved in it
16:20:43 got it
16:20:58 dulek, abhishekk, e0ne anything else for this topic? I think this was great to start preparing
16:21:13 thingee: I'm fine, thanks a lot!
16:21:20 thingee: thanks a lot!
16:21:34 #topic CI Status Page
16:21:40 dulek, e0ne: thank you
16:21:47 #idea The status page either needs to be updated or we should link to Mike's spreadsheet.
16:21:47 the 3PAR driver is using taskflow for retype, as we have to make 2-3 calls to the array depending on what needs to be changed and rollback if one of those fails.
16:21:49 jungleboyj: hi
16:21:58 thingee: Hey, this is just real quick.
16:22:23 Had a question about whether the status page that is currently linked here: https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers
16:22:36 Should be updated or if we should change that link to the spreadsheet you have for CI status.
16:22:54 I don't think my non-editable spreadsheet is a good idea. I realize it's creating confusion with people as being what others should follow.
It was purely for my tracking purposes to temporarily keep track of CIs
16:22:55 This is out of date: https://wiki.openstack.org/wiki/Cinder/third-party-ci-status
16:23:00 i thought the decision a while ago was that we were not updating the wiki page
16:23:08 #link https://wiki.openstack.org/wiki/Cinder/third-party-ci-status
16:23:11 people should be keeping their thirdpartysystems wiki pages up to date
16:23:26 yeah do we even need that page?
16:23:31 https://wiki.openstack.org/wiki/ThirdPartySystems
16:23:57 ameade: it's a good point
16:23:59 patrickeast: +1
16:24:02 anyone disagree?
16:24:15 +1, one less thing to keep track of
16:24:20 agreeing to not update it is rather confusing if we leave it there for people to find and link to...
16:24:21 The page was my attempt to do an open version of Thingee's spreadsheet... feel free to nuke it
16:24:35 eharney: +1, we should probably remove it
16:24:56 #agreed nuke https://wiki.openstack.org/wiki/Cinder/third-party-ci-status in favor of https://wiki.openstack.org/wiki/ThirdPartySystems
16:25:13 patrickeast: +1 if that page isn't used we should get rid of it and link to the Third Party systems page instead.
16:25:14 I will also leave a link on my spreadsheet at the top
16:25:21 in case people find that
16:25:24 Sounds good.
16:25:37 jungleboyj: sounds good...I'll do a redirect
16:25:50 #action thingee to redirect old CI page to decided CI page
16:26:10 #action thingee to leave a link on his spreadsheet to the decided page as well
16:26:15 jungleboyj: anything else?
16:26:34 thingee: No, that was it. I will make sure we have our pages updated.
16:26:40 Thank you for the decision.
16:26:59 #action thingee to communicate to driver maintainers about making sure pages are up-to-date
16:27:06 should the https://wiki.openstack.org/wiki/CinderSupportMatrix be updated with the drivers removed?
16:27:35 kmartin: good point.
Nothing is final right now...after RC I'll do a clean up if needed
16:28:02 #action thingee to clean up driver support matrix page after RC
16:28:09 jungleboyj: thanks
16:28:15 #topic Deadline for readding volume drivers and RC
16:28:27 Thank you!
16:28:57 I have a couple questions on how to test unmerged drivers once we figure this out
16:29:08 so RC is projected for 4/9 on the launchpad page. I learned this morning from ttx that's not a real date.
16:29:41 we will do a cut when the bug list is empty
16:29:47 ameade: submit a patch with your drivers added. then you should be able to test it
16:30:09 xyang2: would that be sufficient to show stability?
16:30:22 let's figure out the deadline stuff first
16:30:27 thingee: What does that mean?
16:30:51 * thingee looks at calendar
16:30:56 ameade, https://wiki.openstack.org/wiki/Cinder/tested-3rdParty-drivers & specific CINDER_BRANCH as shown there
16:31:07 thingee: I assume you mean when the targeted bug list is empty?
16:31:20 jungleboyj: yes
16:31:40 so the bug list can change
16:31:43 can grow
16:31:49 if there are real showstoppers
16:31:57 that's why setting deadlines in RC isn't easy
16:32:08 can we set a minimum date?
16:32:09 ameade, or manually cherry-pick your driver's patch on top of the latest
16:32:11 #link https://launchpad.net/cinder/+milestone/kilo-rc1
16:32:16 ameade: yea
16:32:22 so I was thinking 4/6?
16:32:23 asselin: that's what i'm doing now but i gotta make sure everyone agrees that's fine
16:32:29 thingee: only show stoppers can be targeted?
16:32:42 xyang2: should be at this point
16:32:58 xyang2: I'll definitely help drivers focus on big bug fixes for them though of course
16:33:06 xyang2: already did it for a couple of drivers so far
16:33:39 so is 4/6 fair?
16:33:48 thingee: that sounds ok to me
16:33:49 thingee: so you mean don't target driver bugs, but we can still review and approve them?
16:33:56 thingee: sounds good
16:33:56 gives me enough time to not drive other projects insane on waiting on cinder to do a cut
16:33:57 how long do we want to see new CIs run for?
16:34:02 thingee: could we target medium bugs for RC-1 and not for RC-2, etc?
16:34:19 Sure. Is cinder the long pole?
16:34:42 ameade: I'll defer to DuncanT and asselin on that.
16:34:52 on how long things should be running stable
16:35:07 ameade, here's what I would do to run your unmerged driver on all cinder patches: http://paste.openstack.org/show/196670/
16:35:18 ameade, assuming you're using devstack-gate to setup the job
16:35:36 DuncanT: any opinions?
16:35:38 I'd like for us to lay out some of these requirements
16:36:04 or asselin
16:36:18 should i be posting on sandbox? is it ok if I just append to our other CIs? Should i post on every single patch or just the ones that pass jenkins?
16:36:44 i know that running all 304 tests is now a requirement
16:36:57 sandbox is good for setting up and testing. after that works, you should really be commenting on all cinder patches
16:37:06 thingee: We don't have long to make the decision, so we can't make it very long. If the earliest you're setting the RC is 6th, then I'd say 7 days with a ping on channel for all breakages to say they've been noticed in that period?
16:37:14 How long should the driver be stable?
16:37:46 That gives nearly a week to sort out any issues
16:37:54 DuncanT: 7 days? hmm...how far along are you ameade ?
16:38:16 well i specifically pretty much have it done
16:38:31 scaling it to all patches is my particular challenge
16:38:34 with limited hardware
16:39:18 How to test that the driver is stable? According to the reporting?
16:39:51 I would like to propose five days of the CI being stable on posting results. I think anyone who is going to make this exception window should be pretty much there.
16:39:53 by now
16:40:03 ameade: There's been a general consensus that running after a jenkins +1 is reasonable
16:40:17 DuncanT: I think asselin and you would conflict there.
16:40:30 but yes there are CI's doing this today
16:40:34 >.<
16:40:36 asselin: Oh?
16:40:50 I've no objection to running after jenkins +1
16:40:51 so here are my thoughts on that...
16:41:04 thingee: +1 Seems reasonable.
16:41:04 I think this is really great progress we have made on all vendors
16:41:14 let's not ruin that
16:41:47 if people are having scaling issues, let's allow it and once we've got things more figured out/stable let's revisit the idea of being proactive on all patches
16:41:49 Ok,
16:42:17 thingee: that would be great, we have more hardware coming mid april
16:42:28 and i think we get 99% of the value just running on most patches
16:43:02 ameade: I was impressed netapp was already addressing scaling issues, but I can certainly say some people are running on some pretty "interesting" setups to meet the requirement
16:43:27 thingee: we definitely went all in with CI
16:43:58 so my plan is to have this thing run silently for a day or two more
16:44:06 #agreed 4/6 cinder community will verify CI's for exception
16:44:23 #agreed CI's for exception must be stable/reporting for 5 days before 4/6
16:44:35 #action thingee to communicate to ML about this
16:45:15 thingee: including weekends so starting at 4/1?
16:45:15 #action thingee to do reminder in next cinder meeting
16:45:27 xyang2: yes =/
16:45:30 Who all is hoping for an exception?
16:45:44 jungleboyj: <--this guy
16:45:46 i mean
16:45:49 <--this guy
16:45:52 lol
16:45:55 jungleboyj: huawei, netapp, oracle have a good chance from the sound of things
16:46:16 the rest of the drivers have no been communicating anything on the ML
16:46:18 :-)
16:46:19 not*
16:46:21 Thanks
16:46:24 thingee: thanks:)
16:46:35 jungleboyj, I think huawei is
16:46:46 Ok, that was what I expected. Wanted to make sure there weren't others.
16:46:47 We will do our best
16:46:58 I appreciate it
16:46:58 thingee: are there also a few windows drivers too?
16:47:03 that said, the community and specifically asselin has been responding to people on the ML about open issues on their CI's
16:47:21 thingee: I would like to note that you are being more than considerate with this exception.
16:47:33 xyang2: I heard they would like to get exceptions on those. haven't seen ci's reporting yet or communication on the ML
16:47:41 Maybe we hit some network problem, we will fix it
16:47:50 jungleboyj: yes
16:49:13 To be clear, I didn't have to grant anyone exceptions on Kilo RC. I know people are angry with me, but regardless I saw this was an opportunity to continue to help people that were removed.
16:49:48 thingee, +1
16:49:54 thingee: what seems to be the important thing is we know the drivers work lol
16:49:54 regardless I'm very happy with progress we're all making
16:49:59 +1
16:50:01 Just to be clear, drivers have to be in on 4/6. We will continue to accept bugs until we cut for RC.
16:50:21 Should we only merge targeted bug fixes?
16:50:22 and it's all thanks to driver maintainers getting their ci's ready to better openstack
16:50:41 and then people like jgriffith patrickeast and asselin helping others with their ci's
16:50:47 thingee: +2
16:50:58 jungleboyj: Certainly you should -1 any bug fixes you think are high risk
16:51:13 DuncanT +1
16:51:15 and i think we should all recognize asselin and anteaya for their hard work of the third party ci meetings and helping in infra channels with any issues people hit
16:51:35 also thanks to DuncanT for help in the channel with CI questions and being our liaison as well!
16:52:04 yep !
16:52:53 ok anything else?
16:53:07 thingee: nope
16:53:11 thingee: for liberty,
16:53:23 thingee: do we require driver to have CI before merge?
16:53:30 xyang2: good question
16:53:31 thingee: by Liberty-1?
16:53:37 my thoughts are yes
16:53:43 thingee: +1
16:53:44 wanted to poll others what they thought
16:53:47 +1
16:53:55 we should... no point going through the stress of another deadline like this
16:53:56 +1
16:54:06 I think it is better this way. so we don't have to remove them later
16:54:06 +1
16:54:14 xyang2: agreed
16:54:19 Agreed. Now that we have enforced the requirement, let's not do this again.
16:54:22 I'd do new drivers the same way as kilo. Takes time to move big companies.
16:54:25 Have a deadline.
16:54:34 I'd like to avoid another ML thread like this later
16:54:59 Swanson: I think we should not even merge drivers without CI, so we don't have to remove them
16:55:01 The thing is, we don't need to alter our plans for big companies... we just say no
16:55:12 #agreed in liberty, volume drivers are required to have a CI *before* merge
16:55:27 #action thingee to update wiki with this information
16:55:41 #action thingee to make sure to send in beginning of L about volume driver plans
16:55:50 what about target drivers?
16:55:52 hemna: ^
16:55:54 heh
16:56:20 I think we need to talk about the plan to CI target drivers as well as the Fibre Channel Zone manager drivers (2 of them) in Vancouver
16:56:20 Target drivers are already setting up CI's for K-3...didn't even have to ask people :)
16:56:31 hemna: yes good ideas
16:56:38 hemna: anyone from brocade showing up?
16:56:44 I think brocade has said they are starting to look at their driver
16:56:49 hemna: also is there a proposed session?
16:56:52 but I don't know the status. haven't heard from Cisco at all.
16:57:02 * asselin had a meeting with brocade on ci
16:57:11 hemna: I don't think this needs to be in the fishbowl sessions. Doesn't need user input
16:57:21 thingee, agreed
16:57:29 asselin: care to share?
16:57:34 I have also toyed with the idea of taking the FCZM out of cinder itself
16:57:37 and make it a lib.
16:57:38 dunno
16:57:47 hemna: brocade has started to look at their driver....
OK that's a start!
16:57:55 just gave them an overview. what it is, why, etc.
16:58:19 2min reminder
16:58:22 asselin: well received in your opinion?
16:58:29 thingee, yes
16:58:32 great
16:58:41 That is good news.
16:58:47 progress!
16:58:55 #topic open discussion
16:59:07 Hi all
16:59:11 For the drivers that have been removed now, can I change the stack.sh script to add the removed driver before starting c-volume in the CI?
16:59:33 to test against the real storage backend
16:59:41 Liu, yes
17:00:01 Liu: http://paste.openstack.org/show/196670/ was recommended earlier
17:00:03 thanks everyone
17:00:04 Liu, I posted this earlier which is one way to do that: http://paste.openstack.org/show/196670/
17:00:06 #endmeeting
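
Note on the taskflow topic: the retype case mentioned at 16:21:47 (a few backend calls, rolled back if one fails) and the reproducible synthetic failure case asked for at 16:14:28 can be illustrated with a very small sketch. This is plain Python written for illustration only. It is not TaskFlow's API and not the 3PAR driver's code, and the step names (change_qos, change_provisioning, move_domain) are hypothetical.

```python
class Step:
    """One reversible unit of work (hypothetical; not a TaskFlow class)."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail  # set True to inject a synthetic failure

    def execute(self, log):
        if self.fail:
            raise RuntimeError(f"{self.name} failed")
        log.append(f"do:{self.name}")

    def revert(self, log):
        log.append(f"undo:{self.name}")


def run_with_rollback(steps):
    """Run steps in order; on failure, revert completed steps in reverse."""
    log = []
    done = []
    try:
        for step in steps:
            step.execute(log)
            done.append(step)
    except RuntimeError as exc:
        log.append(f"error:{exc}")
        for step in reversed(done):
            step.revert(log)
        return False, log
    return True, log


# Synthetic failure: the third (hypothetical) backend call fails,
# so the two completed calls are undone in reverse order.
ok, log = run_with_rollback([
    Step("change_qos"),
    Step("change_provisioning"),
    Step("move_domain", fail=True),
])
print(ok, log)
```

The sketch covers only in-process rollback. It does not address persistence or resuming after a node failure, which is the part of the discussion above that remained open.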