16:00:01 #startmeeting Cinder
16:00:02 Meeting started Wed Sep 28 16:00:01 2016 UTC and is due to finish in 60 minutes. The chair is smcginnis. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:05 The meeting name has been set to 'cinder'
16:00:06 dulek duncant eharney geguileo winston-d e0ne jungleboyj jgriffith thingee smcginnis hemna xyang1 tbarron scottda erlon rhedlind jbernard _alastor_ bluex patrickeast dongwenjuan JaniceLee cFouts Thelo vivekd adrianofr mtanino yuriy_n17 karlamrhein diablo_rojo jay.xu jgregor baumann rajinir wilson-l reduxio wanghao thrawn01 chris_morrell stevemar watanabe.isao,tommylike.hu mdovgal
16:00:06 hi
16:00:09 hello
16:00:09 <_alastor_> o/
16:00:12 hi
16:00:13 o/
16:00:14 yough
16:00:16 hi
16:00:16 hey!
16:00:16 o/
16:00:17 hi
16:00:18 Hey everyone
16:00:18 hi
16:00:20 Hello!
16:00:20 Hi!
16:00:21 hola
16:00:25 Agenda: https://wiki.openstack.org/wiki/CinderMeetings#Next_Cinder_Team_meeting
16:00:25 goobily hoobily
16:00:30 Hi there
16:00:34 \o/
16:00:42 hi
16:00:49 hi
16:00:53 .o/
16:01:19 #topic Announcements
16:01:47 hi
16:01:47 There was a question earlier in channel, so I'll state it here to make it official - I'd like Ocata to be mostly bugfix and stabilization.
16:02:00 smcginnis, +1
16:02:05 There are some features in flight that I consider part of that "stabilization".
16:02:06 smcginnis: +1
16:02:14 <_alastor_> smcginnis: +!
16:02:22 <_alastor_> smcginnis: +1
16:02:23 smcginnis: WILL IT BE OPEN TO NEW DRIVERS?
16:02:27 opss
16:02:28 I'd love to see some of that get wrapped up and not be "in-progress" for another release.
16:02:30 :D
16:02:32 erlon: YES
16:02:33 :)
16:02:39 smcginnis: +1
16:02:40 erlon, inside voice.
16:02:42 can we explicitly add "testing" to that list, please?
16:02:43 * erlon screaams!
16:02:44 hah!
16:03:12 DuncanT: Yeah, I think testing will be a very important part of both stabilization and bugfixing.
16:03:19 (I'm very slow typing at the moment due to injury, sorry)
16:03:31 But I'll try to remember to call that out explicitly from now on.
16:03:35 DuncanT: How's the hand?
16:03:57 Not fallen off yet. Not actually working very well though
16:04:02 :/
16:04:10 Hopefully it improves.
16:04:13 o/
16:04:18 #link https://etherpad.openstack.org/p/ocata-cinder-designsummit-planning Summit planning
16:04:28 Hello
16:04:36 With those goals in mind - we need to plan our summit sessions.
16:04:54 Testing will be part of it for sure.
16:05:00 Please add ideas to the etherpad.
16:05:17 Once we get a little closer we can prioritize and get sessions assigned.
16:05:28 smcginnis: I guess we need a session on all the things we're considering "in-progress"?
16:05:48 dulek: Sure, we could.
16:06:07 dulek: It needs to be discussed somewhere at least. :)
16:06:39 One more announcement before moving on with the agenda - I need to cut RC2.
16:06:40 And what are we actually considering in-progress besides A/A?
16:07:03 dulek: Maybe I'll start an etherpad.
16:07:06 dulek: rolling upgrades
16:07:07 smcginnis: you are talking about core features, right? new drivers and driver features are still ok?
16:07:46 xyang2: Yes, contained to drivers it should be OK. Just no changes to Cinder core that could destabilize things or slow down completion of things like HA A/A.
16:07:51 Cinder-Nova API changes and multi-attach are in-progress
16:07:54 bswartz: I would love to continue the work on stabilizing that in Ocata, but I don't think it should make a priority per-se.
16:07:56 Nova-cinder api
16:08:04 DuncanT: +1000 :)
16:08:21 scottda, I think multi-attach is probably out for O
16:08:24 DuncanT: That will be a big one. I think we're close there now.
16:08:27 scottda, we need to stabilize the new api first
16:08:37 my $0.02
16:08:38 Okay, these 3 things probably will drain enough time to fill-up Ocata.
16:08:48 hemna: Yeah, if we can at least get the new APIs in palce, then hopefully we're in a good spot for Pike.
16:08:48 dulek: getting better validation and testing of rolling upgrade would be great to see
16:08:59 DuncanT: Working on it! :)
16:09:16 So on the subject of me needing to cut the RC2...
16:09:17 hemna: Nooooo!!!!
16:09:18 hemna, smcginnis: There will be riots, you remember? ;)
16:09:18 smcginnis: hemna I do plan to have multi-attach shortly after the new api calls
16:09:21 #topic FFE RBD Replication
16:09:33 geguileo: You have the floor.
16:09:38 jgriffith: +1
16:09:44 smcginnis: thanks
16:10:01 Ok, so RBD replication was accepted as a FFE
16:10:20 And all dependent patches have merged in master
16:10:32 Just the RBD driver patch remains
16:10:39 geguileo: So you're probably not going to like this, but...
16:10:40 And it needs some love
16:10:55 geguileo: This has gotten late and that's a lot of changes to push in right before cutting a release candidate.
16:11:08 geguileo: I'd actually feel better leaving that for O at this point.
16:11:11 :''-(
16:11:13 geguileo: I did some testing today, it works for me
16:11:22 I'll review the code tonight
16:11:33 smcginnis: In my defense all those other patches are bugs we had in Cinder
16:11:49 smcginnis: So a backport and merge at this point is no longer an option?
16:11:56 geguileo: If folks are comfortable with it and feel the risk is low, I can probably be convinced otherwise.
16:12:10 smcginnis: Should we vote or something?
16:12:12 geguileo: But I'd want to see those go through today if we're going to do it.
16:12:17 I can't believe it's not merged already! I thought that FFE was granted more than a week ago
16:12:26 Who has tested the patches in a real deployment?
16:12:38 smcginnis: I have ;-)
16:12:50 geguileo: Well that's good! :)
16:12:51 smcginnis: And apparently e0ne has as wel
16:12:59 s/wel/well
16:13:25 Anybody can easily test it, I provided a script to deploy everything
16:13:32 2 Ceph clusters properly configured
16:13:36 geguileo, smcginnis: I tested it with multinode devstack with 2 separate ceph clusters
16:13:39 Devstack with the patches, etc
16:13:49 e0ne: OK, great. That does help.
16:14:02 IMO it's way too late to be merging features into Newton -- I would feel nervous about even merging a bugfix this late
16:14:21 bswartz, +1
16:14:30 bswartz: That's my dilemma.
16:14:35 It's hopefully isolated to the driver.
16:14:48 But changes should be very minimal by this point.
16:15:04 But it's contained to a single driver and enabled by config only, right?
16:15:14 when the FFE request game through, I thought it was debatable, and I assumed it would merge immedately after it was granted
16:15:26 dulek: The remaining patch to master is
16:15:27 bswartz: +1
16:15:39 I think it's too late now.
16:15:42 dulek: But for the backport it requires the other patches as well
16:16:03 how likely are we to end up backporting the bugfix patches regardless of the rbd replication code?
16:16:06 geguileo: Oh, sure, I don't find them itrusive.
16:16:36 geguileo: so, we should allow FFE not only for RBD replication patch:(
16:16:37 dulek: If those are not intrusive the RBD part is disabled by default
16:16:56 String changes as well...
16:17:19 e0ne: Yep
16:17:41 e0ne: At least 1 of the patches already has 2 +2 and +A
16:17:42 e0ne: Wait, bugfixes aren't features.
16:17:45 e0ne: And another one 1 +2
16:18:16 So, my 2c, I think for pretty much every driver that implemented replication there were additional fixes needed after the fact as people started using and testing it in different scenarios
16:18:17 dulek: not critical bugfixes are not allowed for Newton at the moment
16:18:29 geguileo: Sorry, I think we're going to have to push to O at this point.
16:18:39 e0ne: That's a fair point.
16:18:40 smcginnis: OK
16:18:41 So imo the odds of these not breaking *something* is pretty low
16:19:13 smcginnis: I'd like to withdraw my agenda item, on consideration. It's way too late.
16:19:14 patrickeast: By that you mean breaking something out of the replication stuff or the replication having bugs?
16:19:23 patrickeast: ++
16:19:23 DuncanT: Was wondering about that. ;)
16:19:37 geguileo: both I suppose
16:19:41 XD
16:19:45 smcginnis: I've not been paying enough attention to where we were in the cycle
16:20:30 geguileo, DuncanT: Both of these I think we can get going in master. But just too late with too much risk with an RC2 imminent.
16:21:04 noted
16:21:30 geguileo: I'll buy you a (free) beer in Barcelona. :)
16:21:39 XD XD XD
16:21:42 Thanks!
16:21:54 Moving on then...
16:21:58 #topic Getting ActiveActive/HA in the O release
16:22:04 So, here's a chance to help geguileo mop up some of the tears shed for not getting RBD replication into Newton...
16:22:06 scottda, geguileo:
16:22:08 Buying a "free" beer is a paradox. :P
16:22:12 :)
16:22:16 dulek: ;)
16:22:29 dulek: I'll even buy two.
16:22:31 We're into the 3rd? release for AA/Ha
16:22:44 scottda: Yup
16:22:45 scottda: Yep
16:22:48 I think we've all seen the architecture, and merged a bunch of patches in N.
16:23:03 IF we're committed to getting this in O, let's get it in soon.
16:23:09 scottda: +1
16:23:14 +1
16:23:16 This will allow focus on testing and finding bugs.
16:23:20 scottda: +1
16:23:31 scottda: +1
16:23:43 do we want to move feature freeze to O-1 milestone?
16:23:44 I semi-jokingly proposed a review day and Merge Fest. But maybe that's a good idea?
16:24:15 When I look at the Ocata schedule I see 8 weeks that are not holidays between Design summit and Feature Freeze
16:24:17 So with no features going in can we expect every company to pull resources?
16:24:18 e0ne: 0-1 is just 2 weeks after the summit. This is a bad idea IMO.
16:24:25 WE merged about 12 patches on the last day of the mid-cycle, and really moved this along. Should we try something like that soon? Before the summit?
16:24:39 e0ne: We'll just be restrictive about what we do allow in.
16:24:46 dulek: I had to agree with you
16:24:55 bswartz: Yeah, really short span on this one.
16:24:58 scottda: +1
16:25:08 * scottda waits patiently for the conversation to settle down
16:25:30 scottda: Sounds like a plan.
16:25:50 scottda: Good idea.
16:26:40 geguileo: Can you go over the patches in the BP and make sure they are ready to go, and status updated as to which are ready for review?
16:26:40 Schedule reminder for folks: https://releases.openstack.org/ocata/schedule.html
16:26:52 #link https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support
16:26:53 scottda: OK
16:26:54 is there a clear mark at this point of when we will consider HA "complete"? (i.e. after patchsets X, Y, and Z land?)
16:27:29 eharney: Good question
16:27:37 scottda: I'll work on the replication to A/A patch and create unit tests for all WIP patches by next week
16:27:42 scottda: And then update the BP
16:27:59 scottda: So on next meeting we should be able to set a day for the merging
16:28:06 Maybe geguileo can also indicate the "complete" point, as opposed to any "optional" stuff?
16:28:10 eharney: how about when it's deployed in production?
16:28:10 geguileo: +1
16:28:41 bswartz: well i assume we will be shaking out bugs for a bit, just wondering as far as what it means to "get it in O"
16:28:46 geguileo: We have other patches we can merge in the meantime, right?
16:28:59 geguileo: I mean while you're working on the replication piece.
16:29:03 #action geguileo Will update AA/HA BP and prepare for code reviews and merging
16:29:20 The idea at the midcycle was to get things merged so they could get more runtime and shake out bugs.
16:29:20 smcginnis: Yeah, I'll update the BP and remove the -2 I have on the first patch
16:29:31 I think now's a good time for doing that again for O.
16:29:36 geguileo: Cool, thanks!
16:29:46 smcginnis: In the BP I'll reflect which patches are ready for merging
16:29:50 smcginnis: geguileo I updated that BP a couple weeks ago, just to keep Merged vs. WIp stuff up to date
16:29:54 geguileo: what your blog link on how to test AA again? Is it in one of the etherpad?
16:30:03 eharney: that should be determined by the stakeholders who want this feature so badly
16:30:17 #link http://gorka.eguileor.com/manual-validation-of-cinder-aa-patches/
16:30:20 xyang2: ^^
16:30:29 scottda: thanks
16:30:37 xyang2: I have to update the post, but it's this one: http://gorka.eguileor.com/manual-validation-of-cinder-aa-patches/
16:30:38 And automated tests are also being worked on?
16:30:48 scottda: You were faster than me :-)
16:30:51 geguileo: thanks
16:31:35 I don't think there is enough detail on that blog page.....
16:31:48 smcginnis: We were just discussing last hour in cinder_testing how to do automated tests...
16:31:56 hemna: I'll update it and add a little bit more including the API cleanup stuff ;-)
16:32:00 :P
16:32:00 smcginnis: Some of it will prove tricky.
16:32:29 Yeah, it's tricky because you can't tell if they are getting evenly distributed and if the DB is being properly cleaned up
16:32:36 But once code is in, we get existing Tempest testing for free. So there'll be some indication that things still work and don't get broken.
16:32:44 By evenly I mean round robin
16:33:51 Is there a way to trigger some issues without having A/A deployment?
16:34:33 winston-d_: We had discussed how some type of error-injection would be good. But there's no infra for that ATM
16:34:34 winston-d_: What do you mean? r:-??
16:35:38 Killing nodes?
16:35:53 I mean to prove Cinder A/A works, job distribution is one thing, the other is things like DLM is working properly.
16:36:14 winston-d_: Oh, yeah, that's the next step of what I want to manually test
16:36:21 probably need some rally runs for that
16:36:49 winston-d_: But I want to get all the cinder stuff in before focusing on the DLM part
16:36:57 Yeah, and some manual testing with sleeps would be good for checking the races. But hard to do in an automated infra-approved way.
16:37:31 scottda: I have some ideas on how to do that, I just have to find the time to do a PoC
16:38:50 OK, well it looks like some of the Cleanup patches could be reviewed and merged anytime.
16:39:27 I really want to see how Cinder with A/A is running differently from what it is now, e.g. using the hostname hack and rabbit to do round-robin 'A/A'.
16:39:27 and geguileo is going to look at some of the stuff marked WIP at the moment, and clean up the BP before next week.
16:39:45 winston-d_: Look at geguileo 's blog for manual testing.
16:40:03 winston-d_: It show really good details.
16:40:11 reading
16:40:42 Anything else on this we should cover in the meeting? Or just follow up in channel as we go?
16:40:54 Nothing more from me.
16:41:03 I'm good
16:41:07 Thanks guys.
16:41:23 DuncanT's topic is deferred. Any other topics?
16:41:56 Going once...
16:42:06 Going twice...
16:42:14 OK, thanks everyone!
16:42:18 Thanks!
16:42:21 thx
16:42:23 see you next week
16:42:28 #endmeeting