14:00:44 #startmeeting oslo
14:00:45 Meeting started Fri Aug 16 14:00:44 2013 UTC and is due to finish in 60 minutes. The chair is markmc. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:46 \o
14:00:47 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:49 The meeting name has been set to 'oslo'
14:00:50 \o/
14:00:50 #topic hello everybody
14:00:53 o/
14:01:06 yo yo
14:01:07 casual nick friday, eh? :)
14:01:11 :D
14:01:14 yeah
14:01:47 #link https://etherpad.openstack.org/HavanaOsloMessaging
14:02:35 ok
14:02:46 I'm figuring on this for an agenda:
14:02:53 agenda:
14:02:53 oslo.messaging status
14:02:54 secure messaging
14:02:54 reject/requeue/ack etc.
14:02:59 anyone got anything else?
14:03:35 * dhellmann shakes head
14:03:40 markmc: ++
14:03:43 #topic oslo.messaging status
14:04:02 so, I'm a bit distracted here because I'm nose deep in nova unit tests
14:04:10 started with like 10k failing, down to about 300
14:04:20 nice
14:04:20 o/
14:04:39 feeling confident I'll finish them up today or over the weekend
14:04:56 no idea whether more issues will turn up under devstack
14:05:05 awesome. We said today was the go / no-go day for oslo.messaging
14:05:12 the tests so far have found a couple of bloopers in oslo.messaging, but seems pretty good so far
14:05:21 yeah
14:05:26 I'd like to be further along today
14:05:42 but the nova port is coming along so well, and nearly there ...
14:05:52 I'm still feeling like it's right to move forward with it
14:06:08 I figure it's best to just limit oslo.messaging to Nova for Havana, though
14:06:19 any thoughts?
14:06:39 yeah, I agree with that! I actually blocked a Glance patch just waiting for oslo.messaging to land in Nova
14:06:51 flaperboon, which patch was this?
14:06:58 I think it is fair to focus on the big brother first
14:07:01 lemme get that for you
14:07:05 cheers
14:07:09 so long as the messages on the notification exchanges are the same as before, there should be no breakage
14:07:19 markmc: https://review.openstack.org/#/c/37511/
14:07:49 o/
14:07:49 the patch pulls in oslo.rpc and my suggestion is to do neither 'til the Ith development starts
14:08:00 o/
14:08:11 jd__: simo o/
14:08:41 sandywalsh, yeah, there should be zero wire format changes
14:08:49 cool
14:08:59 sandywalsh, the drivers themselves are pretty minimally modified
14:09:10 I want to avoid refactoring them until we've got Nova moved over
14:09:31 don't want to be trying to debug regressions and being unsure whether it's the refactoring of the drivers, or the porting to the lib
14:09:43 markmc: should we move over other projects before starting any change?
14:09:51 flaperboon: not sure it's a worthwhile idea to block this for Havana seeing how little intrusive it is to glance
14:09:51 s/change/refactor/
14:09:52 makes sense. I think notifications are the only things that go cross-service, yes?
14:10:14 sandywalsh, yes
14:10:32 flaperboon, I'd be happy to start refactoring once Nova is moved over, think that's a big enough user
14:10:41 flaperboon, but, might be best to leave it until Icehouse
14:10:50 markmc: agreed
14:10:54 if the oslo-incubator and oslo.messaging drivers are kept similar in Grizzly
14:11:03 jd__: you reckon? My thought was to avoid duplicating work
14:11:07 that'll make fixing anything that comes up a hell of a lot easier
14:11:14 I mean, pulling in oslo.rpc and then migrating to oslo.messaging
14:11:48 I may have been too paranoid, though
14:11:51 flaperboon: now that flwang wrote the patch… we'll update with oslo.messaging for Icehouse, IMHO it's still better to use Oslo's code than Glance's code for H :)
14:11:51 markmc: grizzly?
14:12:06 heh, sorry
14:12:08 Havana
14:12:17 brain is fried from nova's tests
14:12:32 :-)
14:12:54 assuming the nova work goes ok over the next few days
14:13:07 the big risk is going to be finding nova-core folks to review it
14:13:15 jd__: right, I'll give it an extra thought afterwards, thanks for the advice!
14:13:18 large and tedious patch, breaking it up as much as I can
14:13:21 but it's still huge
14:13:24 flaperboon: np :)
14:13:38 ok, seems we don't have much to talk about here
14:13:41 * markmc moves on
14:13:46 #topic secure messaging
14:13:50 simo, yo
14:14:00 simo, care to summarize the current state of affairs?
14:15:13 hmm
14:15:24 markmc: yeah, unfortunately work has been slowed down
14:15:38 api review on Keystone side is going on
14:15:54 there are some minor changes to the API that will need to be reflected in securemessage.py code
14:16:16 thierry suggested that KDS not land in Havana though and we instead commit it when Icehouse is started
14:16:40 given nova will probably not be able to use this stuff until Icehouse anyway I guess that's reasonable at this point
14:17:02 yeah, I'm tending to agree
14:17:21 we're close, but it's that 80/20 thing
14:17:27 I have a few patches to add group key support not merged in oslo yet, but I have been holding off, waiting for KDS review so we do not have to change them again after commit
14:17:46 I'd much rather see this land early in Icehouse and be in awesome shape come release time
14:18:00 rather than include in Havana with a "it's there, but you can't really use it" caveat
14:18:01 markmc: we tried and we got at least a poc we know works (well I know it does at any rate, don't think anybody else has tested)
14:18:28 markmc: ideally I keep patches up to date, and test and we land them as soon as Icehouse opens
14:18:35 yeah
14:18:51 simo, assuming the oslo.messaging port of nova makes it
14:18:57 markmc: should we port them to oslo.messaging before landing them at this point ?
14:19:04 it probably makes sense to re-focus the work on oslo.messaging
14:19:08 ok
14:19:16 but also wait until the icehouse branch of oslo.messaging opens
14:19:20 then I'll ping you as soon as the nova work is done
14:19:24 cool
14:19:39 this is not that urgent that I need to distract you now
14:19:44 I can keep testing on the current base
14:19:48 and have patches ready
14:19:49 thinking about the API implications, the main one will be to add a "source" name to RPCClient
14:20:00 then we simply 'translate' them into oslo.messaging
14:20:10 cool
14:20:18 markmc: yes I looked into that and it was 'hard'
14:20:28 simo, why hard?
14:20:32 in fact I was going to ask for leniency and wait until oslo.messaging to do that change
14:20:43 markmc: lots of changes required in callers :)
14:20:54 outside of oslo-incubator I mean
14:20:55 oh, you're talking about having to add the source name to all cast()/call()s
14:21:04 yup
14:21:06 right, it's all changed in oslo.messaging
14:21:33 yep so it is better if we do that in the context of oslo.messaging given major changes to users of the rpc code already happen there
14:21:37 http://docs.openstack.org/developer/oslo.messaging/rpcclient.html
14:21:51 self._client = messaging.RPCClient(transport, target)
14:21:57 self._client.call(ctxt, 'test', arg=arg)
14:22:12 you'll want to add a 'source' arg to messaging.RPCClient() I think
14:22:25 confusingly, the type of 'source' would be messaging.Target
14:22:27 markmc would it make sense to split topic in topic, host there ?
14:22:33 yes
14:22:44 source = messaging.Target(topic='compute', server='myhost')
14:23:00 ah target is already a tuple ?
14:23:19 the Target data structure contains exchange, namespace, version, topic, server, fanout
14:23:20 http://docs.openstack.org/developer/oslo.messaging/target.html
14:23:23 so, yeah
14:23:28 ok
14:23:31 we're keeping topic and server separate already
14:23:36 from the example I didn't realize it:
14:23:43 target = messaging.Target(topic='testtopic', version='2.0')
14:23:48 see no 'host' part in here :)
14:24:05 right, you typically do it this way
14:24:10 anyway
14:24:15 no need to go in specifics here
14:24:22 self._client = messaging.RPCClient(transport, Target(topic='compute'))
14:24:47 cctxt = self._client.prepare(server='myhost')
14:24:48 I think we agree to move to have the rest land in Icehouse immediately after oslo.messaging goes in
14:24:54 cctxt.cast(...)
14:25:05 great, sounds good
14:25:06 it was always a big ask to get this into havana
14:25:06 onwards to icehouse :)
14:25:10 yup
14:25:12 no problem
14:25:12 * markmc moves onto the next topic
14:25:22 #topic reject/requeue/ack etc.
14:25:30 sandywalsh, you're up :)
14:25:41 :) where to start ...
14:25:47 sandywalsh, from the vagueness of the topic, you can see I'm not totally following this yet
14:25:55 * markmc is being a slacker
14:26:01 so, in nova we're currently using queuing systems in two ways
14:26:20 1. api -> scheduler = classic queueing. N producers, M consumers
14:26:33 2. scheduler -> compute = "rpc" 1:1
14:26:43 and 3. notifications = classic queueing
14:26:56 for classic queueing, we need reliability
14:27:02 ack/requeue/reject
14:27:14 focusing on rpc-only, we don't need that
14:27:46 but for things like billing events (which we can't drop and the reason notification support was added) ...
we need reliability
14:28:22 :) not sure what to add
14:28:41 interesting, you see a need for that reliability for #1 too
14:28:54 yes, definitely
14:28:57 (not disagreeing, just interesting we've had it so long without it)
14:29:03 and orchestration/state management will need it too
14:29:12 I think the current proposal is very low risk in the short term.
14:29:19 indeed, I think it makes sense in lots more topics than we might think initially
14:29:31 I'd be very hesitant to change the behaviour of #1 in havana
14:29:43 makes total sense to me for #3
14:29:56 markmc: that'd be a difference between "works most of the time" and "works reliably" ;)
14:30:04 markmc, agreed ... we've hacked around it in the past by putting "retry" code in every service
14:30:25 so, unwinding that now would be risky
14:30:39 I'm not sure how we expose the semantics in oslo.messaging, but I'm not really concerned
14:30:46 we'll figure it out
14:30:57 mmh, perhaps we could revert this commit: https://github.com/openstack/oslo.messaging/commit/84a0693737814665f7b0eb2f42fd9ec99bdb80d5
14:30:59 at the very least, in message, we should make the message part of the api
14:31:01 API flags per rpc server, transport specific flags, user config
14:31:07 some combination of those, we'll see
14:31:12 and put ack/reject/requeue methods on it
14:31:21 the particular implementations can handle accordingly
14:31:52 nah, messages aren't explicitly part of the API
14:32:15 i.e. for RPC it's client.cast('mymethod', **mykwargs)
14:32:29 and on the server side, it gets dispatched to an endpoint method called mymethod
14:32:33 so there's no message handle
14:32:37 right, for notifications it makes more sense
14:32:52 I actually think this behaviour (for rpc) is something you'd configure per server
14:32:55 and delivery_info metadata is handy for N:M communications
14:33:12 e.g. server = messaging.RPCServer(reject_on_error=True)
14:33:16 or whatever
14:33:34 I think my current branch does it pretty well ...
it's very specific to join_consumer_pool(..., ack_on_error=False) as the only way to enable those operations.
14:33:41 I wonder is there anything interesting on the RPCClient side
14:34:05 RPCClient(server_shouldnt_drop_messages=True)
14:34:10 reject ... the consumer has to throw RejectMessageException() to tell the rpc layer to reject the message.
14:34:15 this is kind of klunky
14:34:39 but since Message isn't a first-class citizen there's not many other ways
14:35:08 well
14:35:19 the notification client's API
14:35:25 surely, when you dispatch a notification
14:35:38 any exceptions raised by it should cause the message to be rejected?
14:35:47 is RejectMessageException() just an implementation detail?
14:36:03 no, they should requeue() by default
14:36:09 and let another worker pick it up
14:36:11 oh, ok
14:36:12 sorry
14:36:22 explain the use case for reject() ?
14:36:27 mmh, sounds more like a target thing to me, which is what the Driver and Listener get
14:36:33 "I've seen this message before"
14:36:35 requeue() is where there's an error in the consumer's processing, right?
14:36:39 oh
14:36:41 correct
14:37:06 don't we have automatic duplicate message rejection for RPCs ?
14:37:17 why are we pushing dup detection to the API user for notifications?
14:37:49 hmm, not sure what you mean?
14:38:23 I think sandywalsh's case is when you receive a notification, start treating it, fail, call requeue() to retry later, receive it again, finally decide you already handled it and so call reject()
14:38:55 why would we reject the message if we get it a second time?
14:39:04 ok, so you only know you've handled it by looking in ceilometer's database
14:39:13 as opposed to an in-memory cache of messages you've seen before
14:39:14 jd__, correct, that's the common use-case ... for example, save to db, do some down-stream processing which fails. ...
we don't want to reprocess the message another time
14:39:18 markmc: yes
14:39:30 dhellmann, right, that's my next question - why reject() vs just ack()ing duplicate messages
14:39:32 but that's a larger discussion, more around CM pipelines (a different ML thread :)
14:39:36 sandywalsh: why does the message delivery system care about that though?
14:39:40 maybe the question is "was it wise to call requeue() the first time?"
14:39:41 markmc: exactly
14:40:14 dhellmann, the message may have to go to another worker
14:40:30 remember we're talking N:M here ... lots of producers, lots of consumers
14:41:03 don't get it
14:41:10 if a worker in a pool has seen this before
14:41:17 why would you want another worker to get it?
14:41:30 sandywalsh: if the workers are different, they should each have their own queue. If they are for load balancing, then they should pull from the same queue and ack messages that are delivered properly.
14:41:56 and handled properly, even if handling means deciding that it has been seen and doesn't need to be processed again
14:42:01 the simple use case is: get an event, store an event. fail on the return.
14:42:12 we don't want to repeat that process (we'll fill the db)
14:42:38 but we can't ack() since we could lose the event
14:42:52 and we never want to lose a billing event
14:43:58 btw what happens if there's no ack nor reject nor anything? (if that's possible?)
14:44:04 if there is an error, then yes, report the error to trigger re-delivery
14:44:15 if the error is "I've seen this already", that's not an error IMO
14:44:21 But in that example you already stored the event.
14:44:26 So you haven't lost anything.
14:44:32 dhellmann, correct, which is the raise RejectMessage() situation
14:45:05 jd__, there has to be an ack() at the very least or the queuing system will never drain
14:45:19 mmhh, but, why wouldn't requeue work here?
14:45:23 sandywalsh, why are we worried about losing messages we've marked as "I've seen this already"
14:45:45 surely we want to lose them?
14:45:48 sandywalsh: oh I'm just wondering what the queueing system does in that case
14:45:56 won't another worker see it, notice it's a dup and also reject it?
14:45:56 flaperboon, requeueing is what would happen with any other exception type (not RejectMessage)
14:46:10 sorry, nevermind the last comment
14:46:10 what would RejectMessage do, then?
14:46:19 sandywalsh: ^
14:46:25 dhellmann, flush the message from the queue
14:46:32 how is that different from ack?
14:46:34 sandywalsh: but that's done automatically now
14:46:48 I mean, ack is always called (as of now)
14:46:55 yeah, reject and ack() are essentially the same
14:47:07 ok
14:47:16 then why have reject as a special case?
14:47:30 there are some minor things about dead-letter queues, etc
14:47:42 but, not critical for this discussion
14:47:53 I'd be happy to just have ack() and requeue()
14:48:08 i.e. notice the duplicate message, but do nothing special, it's a no-op
14:48:13 rather than raise RejectMessage
14:48:15 that seems like enough to me
14:48:29 sure, I'm cool with that
14:48:35 coolness
14:48:51 gets rid of a mess too :)
14:49:10 +1
14:49:35 sandywalsh, ok, on the general point of maintaining API compatibility
14:49:47 any API in oslo-incubator can be changed incompatibly
14:49:54 sometimes a compatibility hack is nice
14:50:06 but if the incompatible change isn't going to affect too many users
14:50:10 or the hack is too gross
14:50:22 then I'd actually prefer the incompatible change
14:50:47 it's in oslo.messaging that we'll have to go to great lengths to avoid incompatible changes :)
14:51:03 k
14:51:08 sandywalsh, oh, yes - delivery_info
14:51:12 what's the use case for that?
14:51:37 there are possible workarounds on the client side, but basically it's a means to get metadata about the event with the message
14:51:45 "which topic did this come in on"
14:51:51 "what queue did it come from"
14:51:58 "when did it arrive"
14:52:24 so, it's not a deal breaker, but there's some good stuff in the delivery info
14:52:27 ok
14:52:31 I'd be happy to have that
14:52:37 so long as we put an abstraction in place
14:52:41 +1
14:52:42 yep, certainly
14:52:49 so let's put it on the todo list for the oslo.messaging notification client API
14:52:56 yeah, that sounds really close to the Target class already, right?
14:53:08 I don't remember if Target has the queue name
14:53:30 no, it doesn't
14:53:35 dhellmann: nope, just topic
14:53:37 and exchange
14:53:42 makes sense
14:53:56 so a new class, I guess
14:54:05 "number of retries"
14:54:16 potentially very handy info
14:54:37 ok, cool
14:54:40 yeah, if we make a class it's easy to include all sorts of things like that and add new ones later
14:54:47 yep
14:54:55 just about out of time, so ...
14:54:59 #topic open floor
14:55:05 and it doesn't have to be messaging :)
14:55:15 * markmc notes boris-42's oslo.db proposal
14:55:36 #link http://lists.openstack.org/pipermail/openstack-dev/2013-August/013748.html
14:55:42 anything else?
14:56:20 Would it make sense to create an oslo.string (perhaps a different name)? I was thinking we could collect strutils, gettext there
14:56:22 gettext, I'd put into oslo.i18n
14:56:23 fairly specialized area
14:56:32 I don't really know what to do about *utils
14:56:40 could do oslo.utils
14:56:46 markmc: me neither
14:56:50 but fear it would become a dumping ground
14:57:01 markmc: I think it would :/
14:57:25 Yeah, utils ~= I don't know where this should go, I'll put it in utils
14:57:36 however, if we don't have oslo.utils we'll end up with a bunch of repos with 1 module in them
14:57:43 oslo.strings just for strutils
14:58:03 oslo.utils.strutils, and friends
14:58:20 just some way of figuring out what should and shouldn't go into oslo.utils is needed
14:58:25 some way of explaining it
14:58:35 e.g. "function <10 lines"
14:58:35 meh
14:58:48 Probably going to be a lot of judgment calls there either way.
14:58:54 well, actually utilities instead of just common code
14:59:00 I'd rather we stick to libraries having clear purposes and not have a "generic" lib
14:59:01 maybe the categories of core python libs that the utilities relate to
14:59:10 dhellmann: +1
14:59:17 oslo.coreutils :)
14:59:21 LOL
14:59:24 because "core" clarifies everything
14:59:30 huahuahua
14:59:40 if that means we put gettext and strutils in oslo.text instead of oslo.i18n and oslo.utils maybe that's not a bad thing
14:59:53 dhellmann: yeah, that's what I had in mind
14:59:53 dhellmann, yeah, maybe
14:59:57 oslo.text works
15:00:12 having something like oslo.text (I'm bad with mod names) w/ both in there
15:00:18 we'll also have to keep an eye on the circular dependencies, of course
15:00:32 * jd__ suggests oslo.data
15:00:55 jd__, oslo.data vs oslo.db :)
15:01:05 oslo.teletext, a blast from the past
15:01:07 ok, we're outta time
15:01:12 good topic for HK
15:01:17 #endmeeting
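
The ack()/requeue() behaviour agreed under the "reject/requeue/ack etc." topic, together with the delivery-info class floated near the end of it, can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not oslo.messaging code: DeliveryInfo, Notification, ToyQueue, make_handler and dispatch are hypothetical names invented for the sketch, and an in-memory list stands in for a real broker queue.

```python
from dataclasses import dataclass


@dataclass
class DeliveryInfo:
    """Hypothetical metadata about how a message arrived (not an oslo.messaging class)."""
    topic: str
    queue: str
    arrived_at: float
    retries: int = 0


@dataclass
class Notification:
    """Hypothetical notification: an id, a payload and its delivery metadata."""
    message_id: str
    payload: dict
    info: DeliveryInfo


class ToyQueue:
    """In-memory stand-in for a broker queue, for illustration only."""

    def __init__(self, messages):
        self.pending = list(messages)
        self.acked = []

    def ack(self, msg):
        # The message is done with and leaves the queue for good.
        self.acked.append(msg.message_id)

    def requeue(self, msg):
        # Put the message back so that any worker can pick it up again.
        msg.info.retries += 1
        self.pending.append(msg)


def make_handler(store):
    """Build a handler that stores the event, then fails downstream once."""
    state = {"failed_once": False}

    def handler(msg):
        if msg.message_id in store:
            return  # duplicate: do nothing special, the caller will ack it
        store[msg.message_id] = msg.payload
        if not state["failed_once"]:
            state["failed_once"] = True
            raise RuntimeError("downstream processing failed")

    return handler


def dispatch(queue, handler):
    """Deliver one pending message: requeue on any exception, ack otherwise."""
    msg = queue.pending.pop(0)
    try:
        handler(msg)
    except Exception:
        queue.requeue(msg)
        return "requeued"
    queue.ack(msg)
    return "acked"
```

With one billing event in the queue, the first dispatch() stores the event, hits the simulated downstream failure and requeues it. The second dispatch() finds the event already stored, does nothing and acks it, so the store is not filled twice and no separate reject path is needed.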