17:02:22 #startmeeting Designate
17:02:23 Meeting started Wed Sep 3 17:02:22 2014 UTC and is due to finish in 60 minutes. The chair is Kiall. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:02:24 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:02:26 The meeting name has been set to 'designate'
17:02:28 Hey all - Who's about?
17:02:32 o/
17:02:34 o/
17:02:34 o/
17:02:35 o/
17:02:36 padkrish: switching to neutron channel
17:02:39 o/
17:02:47 #topic Action Items from last week
17:02:50 o/
17:02:59 1st was rjrjr_ to incorporate periodic sync into SP manager spec
17:03:10 rjrjr: I saw a review a couple of mins ago, was it ^?
17:03:19 Kiall: That was storage.
17:03:21 should have the spec done today. next couple of hours.
17:03:26 rjrjr: perfect :)
17:03:36 timsim: whoops! Haven't looked at it yet ;)
17:03:44 2nd was Kiall to review j-3 viability of TSIG BPs
17:04:02 We hit a bug in dnspython that made it impossible to implement TSIG :(
17:04:13 There was a fix, but it wasn't released until a day or so ago
17:04:34 So - I've pushed those to kilo-1
17:05:00 3rd was mugsie to look at the client release
17:05:13 Glad dnspython devs were responsive
17:05:23 yup - that was for betsy's SOA change
17:05:30 which wasn't the issue
17:05:37 eankutse: yep, he did the release the day after I emailed, but time was difficult!
17:05:54 cool
17:05:58 mugsie: OK, ekarlso's asking for a release too for the rally stuff he did.. I'll cut one today.
17:06:06 #action kiall to do client release today
17:06:22 #topic Release Status
17:06:26 yay
17:06:29 #link https://launchpad.net/designate/+milestone/juno-3
17:06:46 j3 is tomorrow.. So, anything we don't land today is pushed!
17:07:03 and the gate is up to mad levels ;)
17:07:10 As always ;)
17:07:34 ttx emailed people and asked all projects to avoid any unnecessary commits being +A'd
17:07:41 emailed the -dev list*
17:07:43 Anyway - I think we're in a good position - https://review.openstack.org/#/c/118436/ and https://review.openstack.org/#/c/118625/ would be nice to land, that's all I know of before hand
17:07:50 off hand*
17:07:58 yup
17:07:59 Anyone have any other reviews they feel need to land?
17:08:28 nope - but i think we should have a FF exception for the domain transfer between tenant
17:08:31 s
17:08:54 what is FF exception ?
17:08:57 Yes, I think so too, have you had time to finish the rebase?
17:09:10 not yet - but the next few evenings should see it done
17:09:10 feature freeze exception.. i.e. landing a feature after j-3
17:09:18 What about the remove priority in v2?
17:09:27 That’s done, but I don’t see it in the list
17:09:51 betsy: also agreed, getting rid of that before juno proper would be great - as it's an API diff
17:10:15 Or, did it go in j-2?
17:10:33 I didn't think it did! Can't remember off hand
17:11:08 I didn’t think so either
17:11:16 #action Kiall to discuss FF exceptions with thierry during 1:1 tomorrow.
17:11:54 betsy: I'll check them out tomorrow, just in case it hasn't ;)
17:12:12 So.. Anything else on j3 before we move on
17:12:12 kiall: thx
17:12:18 before we move on?*
17:12:48 I'm taking silence as a no!
17:12:52 #topic Server Pools - Minidns finalize outstanding comments (vinod)
17:13:00 #link https://review.openstack.org/#/c/112688/1/specs/juno/server-pools-minidns-support.rst
17:13:22 vinod1: you're up
17:13:59 I wanted to clarify the comments on poll_for_serial_number
17:14:11 In particular the return status
17:14:38 My reasoning on having the different return values was "If there are multiple updates in a short time, we could end up with 2 or more outstanding poll requests. So a response could potentially have a serial number newer than the one we are looking for. So if we return SUCCESS_NEWER, the pool manager can update all the outstanding requests up to the returned serial number"
17:14:47 Does this make sense?
17:15:38 It does, but can the same not be done with a simple return of (SUCCESS, serial) rather than (SUCCESS_NEWER, serial) ?
17:15:43 so what if there are more than 2 outstanding?
17:15:54 Shouldn't a successful return automatically set all of the ones with a lower serial?
17:15:54 should we return the serial number?
17:16:02 timsim: exactly :)
17:16:36 rjrjr: The proposal does, it returns 2 values - a status and the serial. I think the comments are questioning what the possible statuses should be
17:16:48 How about for the error case?
17:17:12 ERROR_LOWER
17:17:12 expire time has elapsed. The actual_serial number from the last SOA response is returned.
17:17:31 ERROR_TIMEOUT
17:17:31 expire time has elapsed. There was no SOA response at all from the nameserver. In this case the serial number returned is invalid and should not be used.
17:18:11 Do we need 2 different ERROR values or do we have a special serial number to indicate the ERROR_TIMEOUT case?
17:18:21 I guess I wonder what value we get from distinguishing between ERROR_TIMEOUT and ERROR_LOWER; a user of the API really should only have to care if it worked or not
17:18:25 not sure why we need status outside of serial number. return the serial number. if there is an error, throw an exception. the calling side will know what to do with the serial number.
17:18:28 I feel like either way it's going to be in ERROR and be retried anyway right?
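The point raised above, that a successful poll could settle every outstanding request with a lower or equal serial, can be sketched as follows. This is only an illustration under stated assumptions: `complete_up_to` and `pending_serials` are hypothetical names that do not come from the spec, and serials are compared as plain integers, whereas real DNS serials wrap and are compared with RFC 1982 serial arithmetic.

```python
def complete_up_to(pending_serials, reported_serial):
    """Split pending serials into (completed, still_pending).

    Hypothetical helper: any pending request whose serial is less than
    or equal to the serial reported by the nameserver is treated as done.
    Assumes plain integer comparison (no RFC 1982 wrap-around handling).
    """
    completed = sorted(s for s in pending_serials if s <= reported_serial)
    still_pending = sorted(s for s in pending_serials if s > reported_serial)
    return completed, still_pending


# Three updates are outstanding; the nameserver reports serial 1003.
completed, still_pending = complete_up_to({1001, 1002, 1005}, 1003)
print(completed, still_pending)  # → [1001, 1002] [1005]
```

With this shape the caller needs only the serial itself, which is the argument made in the log for dropping a separate SUCCESS_NEWER status.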
17:19:05 rjrjr: good point actually, we use exceptions to indicate a failure rather than a return value
17:19:29 With the ERROR_LOWER case, we could update all the pending up to the serial number returned with a success
17:19:35 well, this is an outgoing call
17:19:53 so throwing an exception in mdns won't let central know
17:20:01 s/central/pool manager/
17:20:39 seems we are encapsulating logic in this call that should be on the calling side. i thought server pools was making this call?
17:20:40 vinod1: well, pretend we only returned the serial number for a minute.. With just that, we can already update everything that's completed
17:20:47 mugsie: yes, it will be returned over RPC
17:21:00 this is an outgoing call from mdns -> pool manager
17:21:46 mugsie: that's not how I read the spec?
17:22:08 that's what was discussed in Seattle
17:22:18 as it is a long running operation
17:22:18 also, timeout, retry, expire should just be timeout and retry.
17:22:25 If we make poll_for_serial_number non-blocking (i.e. an rpc cast) then we would need an outgoing call
17:22:34 so we cast the request to do it, and the mdns calls back with the result
17:22:50 tim mentioned this in my server pools spec. the expire is not needed.
17:23:26 we said we were going to make it a cast as it could be quite a long running operation
17:23:45 mugsie / vinod1 - Okay, I see what you're saying now. poll_for_serial_number is a PoolMgr->mDNS call, as discussed. But the *response* from that is a separate mDNS->PoolMgr call
17:24:04 yeah
17:24:05 rjrjr: I put in timeout, retry, expire to mimic the soa. if we remove expire, then we would need a number_of_retries
17:24:45 timeout, retry interval, number of retries. expire seems strange to me.
17:25:14 i will remove expire and add a number_of_retries
17:25:52 also, why make it a cast? just have server pool make the call and handle it.
17:26:18 rjrjr: it's too long running
17:26:26 rjrjr: So, poolmgr will need to make N of these each time a change happens, if you use a call, you have to do them in serial rather than parallel
17:26:39 rjrjr: It also involves DNS protocol (ie checking if something is active) which we wanted to isolate in MiniDNS
17:26:50 If we could fire them all off, and then wait for all to complete, that would be fine IMO
17:26:53 we could kick off threads for each of the requests?
17:27:37 rjrjr: True, we could.
17:27:57 that makes more sense to me than having a cast (callback?)
17:28:01 So - If this was a call rather than cast, does the issue go away?
17:28:33 I think it does, since the method can just return the serial, or raise an exception.
17:28:52 but, on a busy pool manager, this could present a problem
17:28:59 kicking off 100s of threads
17:29:01 how well can we scale up in the pool manager having calls vs casts?
17:29:10 Out of the Pool Manager to MiniDNS? I feel like that could fall over if a lot of the DNS servers were unresponsive.
17:29:11 mugsie: greenthreads, so virtually no overhead
17:29:33 but there is an upper limit...
17:29:43 won't minidns have the same issue though?
17:29:45 I am not sure what the problem with the callback solution is
17:29:53 mugsie: me either.
17:30:36 there is no real problem.... just disagreements about a function's arguments
17:30:39 With the large number of these that could get kicked off, it seems cleaner to me to just fire them off, not wait for them.
17:30:55 mugsie: Well, as rjrjr said in not so many words, it's a break from the norm in Designate.. And, as a result, we're trying to figure out how best to handle "callbacks" rather than worrying about implementing the feature ;)
17:31:04 not really
17:31:18 (I'm not saying callbacks are bad, or that we should change our minds, simply discussing the options)
17:32:04 Should we vote?
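The alternative floated above (blocking calls issued in parallel, "fire them all off, and then wait for all to complete") might look roughly like the sketch below. It is not Designate code: `poll_all`, `poll_one` and `fake_poll` are hypothetical names, and standard-library threads stand in for the greenthreads mentioned in the log. A failed poll is recorded as `None` rather than aborting the batch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def poll_all(servers, poll_one, max_workers=10):
    """Poll every server in parallel; return {server: serial or None}.

    poll_one(server) is a caller-supplied (hypothetical) blocking function
    that returns the serial it saw or raises on failure.
    """
    results = {}
    if not servers:
        return results
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit one blocking poll per server, then collect as they finish.
        futures = {pool.submit(poll_one, server): server for server in servers}
        for future in as_completed(futures):
            server = futures[future]
            try:
                results[server] = future.result()
            except Exception:
                # Serial unknown for this server (e.g. no SOA response).
                results[server] = None
    return results


def fake_poll(server):
    """Stand-in for a real SOA query, used only for this example."""
    if server == "ns3":
        raise TimeoutError("no SOA response")
    return 1003


print(poll_all(["ns1", "ns2", "ns3"], fake_poll))
```

The `max_workers` cap reflects the concern in the log that a busy pool manager cannot start an unbounded number of workers; the cast-plus-callback design avoids holding a worker open per request at all.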
17:32:08 yeah - we went around this rabbit hole for 30 mins in Seattle, and decided it was a better option to do the callbacks.
17:32:26 timsim: Yea, I do agree - Fire and forget is certainly cleaner..
17:33:02 so, this function returns nothing then?
17:33:09 sorry, method?
17:33:18 in minidns? no
17:33:25 rjrjr: correct
17:34:02 It calls back out to poolmgr whenever it's ready and has something useful to report
17:34:55 so I am assuming 'cast' it is
17:35:03 sounds like it
17:35:13 seems like we are deciding on a solution to a problem that may or may not exist.
17:35:20 Okay, So back to the issue at hand ;) Regardless of if it's a callback or not.. the data passed back to central would be the same - the Q is what should that data be
17:35:34 what that data should be*
17:36:05 If you have the serial in hand, I don't see a reason not to pass it back?
17:36:22 I'm a fan of a simple status - SUCCESS/FAILURE, along with the current serial number (if known)
17:36:31 yeah: ++ ^
17:36:37 Kiall: +1
17:36:51 In the case of ERROR_TIMEOUT what do we return for the serial number - a null value?
17:37:28 vinod1: Yea, None would make sense where it's unknown
17:38:02 Okay - I will update the spec to reflect this discussion
17:38:20 The code on the far side needs to be able to deal with that anyway (it's either `if serial is None` or an `if status == "ERROR_TIMEOUT" or status == "ERROR_BADPACKET" or .. ` etc)
17:39:28 how many of these casts do we have? trying to fit the server pool spec to this spec is why i'm wondering.
17:40:00 sorry, pool manager spec.
17:40:04 rjrjr: Do you mean how many functions?
17:40:14 rjrjr: every time the content of a zone change's multiplied by the number of backend DNS servers
17:40:28 changes*
17:40:43 correct. i've been following your spec for specifics in my spec (hence my using expire as a parameter).
17:41:18 is this the only callback i need to add?
17:41:55 yes - that is the only callback. notify_zone_changed is also a cast - but in this case, there is no callback
17:41:55 i think so...
17:41:57 does python allow me to pass the callback method as an argument?
17:42:36 you won't need to do that, just have an externally facing rpc api method that mdns calls
17:42:45 s/calls/casts to/
17:42:50 with the logic in it
17:42:55 rjrjr: so, these are less like JS callbacks where you pass a function reference and more along the lines of the mDNS method making an outgoing RPC call to poolmgr
17:43:38 And, to answer the first Q - Yes, I believe this is the only method where the CB pattern is used
17:44:06 and we are absolutely sure we need/want it?
17:44:16 I think so
17:44:20 Yessir.
17:45:47 we ok to move on then?
17:45:48 rjrjr: I think we could live without it. But, It's definitely cleaner for this kind of thing. If the implementation turns into a rat's nest, we can circle back. For now, let's stick with the agreed plan!
17:45:59 +1
17:46:29 Okay, moving on :)
17:46:31 #topic Server Pools - what items can we start next? (vinod)
17:46:35 And.. Pools again :D
17:47:01 I thought once the Server Pools -mdns spec is approved, I can start working on it
17:47:16 yup - but it will be hard to test it
17:47:41 but i think that work can happen
17:48:01 Yes - but since I have time now to work on the next item - I thought I will get started on it
17:48:13 as can the work of getting the manager started (in a basic form) as well
17:48:24 We're a day off j3, which is feature freeze.. So we can't land any of these in master.. But, We can A) submit WIP reviews.. (Hard to collaborate tho), or B) Fork to a shared GH for those working on it, then submit once master opens to new features again
17:48:39 Which brings us to the next agenda item
17:48:42 Server Pools - do we have a branch to work on next?
17:48:46 No feature branch, eh?
17:49:07 None of the other projects do that, so, I'd prefer not to start a new trend ;)
17:49:28 i would prefer to ;)
17:50:02 we can always remove the branch when we are back in the master with the work. are there things that would prevent us from branching and working?
17:50:10 yeap
17:50:13 we already have branches for earlier stable releases - so logically makes sense to have branches for future unstable ones too
17:50:14 rjrjr: yep, all the Gerrit ACLs prevent us
17:50:36 WIP Reviews are probably fine at first. Once they're ready to land we can put them in a fork or something.
17:50:37 https://wiki.openstack.org/wiki/Branch_Model
17:50:38 we can ask in infra to have those permissions opened,
17:50:40 didn't know. makes sense then.
17:50:56 I will ask if there is any problem doing it
17:51:07 seems like branching should be allowed by the core team at least.
17:51:08 nova is discussing doing it as well on the mailing list
17:51:32 It kind of seems like something people would want to do, I'm surprised it hasn't been done.
17:52:01 yeah - I think there is some philosophical objection to it in openstack
17:52:05 mugsie: BTW come Sept 25th, master is open to new features again..
17:52:14 3 weeks ago
17:52:15 i'm with vinod1. specs aside, i'm ready to start coding pieces of this as well.
17:52:16 away*
17:52:41 rjrjr: Some of us have some spare cycles too, do you need any help with the pool manager etc stuff?
17:52:48 yeah - but some of this will have to happen in tandem, and a feature branch will make collaboration much easier
17:53:20 Also - The *intent* of feature freeze is to shift developers' focus from new features to stabilizing a release.. Creating a feature branch and moving everyone there somewhat defeats that purpose
17:53:26 vinod1, i'd love to talk to you about IXFR. it is absolutely needed by us.
17:53:52 rjrjr: We'd be happy to take one of the coding pieces off your plate, if you'd like.
17:53:59 yes - but if people have cycles now, to do feature work, we should be using them.
17:54:02 sure, IXFR.
17:54:05 And - We're talking about 3 weeks here, which isn't 6 months etc
17:54:27 rjrjr: We're more thinking the bits that can be done right now, Pool Manager, Storage, etc?
17:55:08 Realistically, it'll probably be at least 2 weeks before anything is ready to merge, we can probably wait another week to merge stuff?
17:55:54 Kiall: To your point of 3 weeks, as timsim suggests, we can start off with WIP and then if it is ready to merge before that - come back to this discussion again
17:56:13 vinod1: Yep - Agreed.
17:56:29 timsim, let's chat afterwards.
17:56:34 vinod1, +1.
17:56:55 rjrjr: ok.
17:57:26 Okay - Moving on so :)
17:57:31 #topic Open Discussion
17:57:42 paris talks. any updates?
17:57:44 3 mins ;)
17:57:47 ;)
17:57:55 only one got in
17:58:02 rjrjr: did you not get the email etc for yours?
17:58:03 which one?
17:58:27 https://openstacksummitnovember2014paris.sched.org/event/d375ee7f87608f12d1dfb0868f0eb826
17:58:46 and sched is down already
17:58:54 the general overview of designate
17:58:55 lol - loaded for me
17:59:02 refresh, and it fails.
17:59:21 wow. guess i'm not going to paris. 8^(
17:59:31 rjrjr: no chance without a talk?
17:59:37 none.
17:59:47 Urgh.. Annoying!
17:59:55 Okay.. Well, time's up. SlickNik will start getting abusive if we don't clear out! ;)
18:00:03 #endmeeting