14:00:44 <markmc> #startmeeting oslo
14:00:45 <openstack> Meeting started Fri Aug 16 14:00:44 2013 UTC and is due to finish in 60 minutes.  The chair is markmc. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:46 <oldben> \o
14:00:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:49 <openstack> The meeting name has been set to 'oslo'
14:00:50 <flaperboon> \o/
14:00:50 <markmc> #topic hello everybody
14:00:53 <dhellmann> o/
14:01:06 <flaperboon> yo yo
14:01:07 <markmc> casual nick friday, eh? :)
14:01:11 <flaperboon> :D
14:01:14 <flaperboon> yeah
14:01:47 <markmc> #link https://etherpad.openstack.org/HavanaOsloMessaging
14:02:35 <markmc> ok
14:02:46 <markmc> I'm figuring on this for an agenda:
14:02:53 <markmc> agenda:
14:02:53 <markmc> oslo.messaging status
14:02:54 <markmc> secure messaging
14:02:54 <markmc> reject/requeue/ack etc.
14:02:59 <markmc> anyone got anything else?
14:03:35 * dhellmann shakes head
14:03:40 <flaperboon> markmc: ++
14:03:43 <markmc> #topic oslo.messaging status
14:04:02 <markmc> so, I'm a bit distracted here because I'm nose deep in nova unit tests
14:04:10 <markmc> started with like 10k failing, down to about 300
14:04:20 <dhellmann> nice
14:04:20 <sandywalsh> o/
14:04:39 <markmc> feeling confident I'll finish them up today or over the weekend
14:04:56 <markmc> no idea whether more issues will turn up under devstack
14:05:05 <flaperboon> awesome. We said today was the go / no-go day for oslo.messaging
14:05:12 <markmc> the tests so far have found a couple of bloopers in oslo.messaging, but seems pretty good so far
14:05:21 <markmc> yeah
14:05:26 <markmc> I'd like to be further along today
14:05:42 <markmc> but the nova port is coming along so well, and nearly there ...
14:05:52 <markmc> I'm still feeling like it's right to move forward with it
14:06:08 <markmc> I figure it's best to just limit oslo.messaging to Nova for Havana, though
14:06:19 <markmc> any thoughts?
14:06:39 <flaperboon> yeah, I agree with that! I actually blocked a Glance patch just waiting for oslo.messaging to land in Nova
14:06:51 <markmc> flaperboon, which patch was this?
14:06:58 <flaperboon> I think it is fair to focus on the big brother first
14:07:01 <flaperboon> lemme get that for you
14:07:05 <markmc> cheers
14:07:09 <sandywalsh> so long as the messages on the notification exchanges are the same as before, there should be no breakage
14:07:19 <flaperboon> markmc: https://review.openstack.org/#/c/37511/
14:07:49 <jd__> o/
14:07:49 <flaperboon> the patch pulls in oslo.rpc and my suggestion is to do neither 'til the Ith development starts
14:08:00 <simo> o/
14:08:11 <flaperboon> jd__: simo o/
14:08:41 <markmc> sandywalsh, yeah, there should be zero wire format changes
14:08:49 <sandywalsh> cool
14:08:59 <markmc> sandywalsh, the drivers themselves are pretty minimally modified
14:09:10 <markmc> I want to avoid refactoring them until we've got Nova moved over
14:09:31 <markmc> don't want to be trying to debug regressions and being unsure whether it's the refactoring of the drivers, or the porting to the lib
14:09:43 <flaperboon> markmc: should we move over other projects before starting any change?
14:09:51 <jd__> flaperboon: not sure it's worth blocking this for Havana, seeing how unintrusive it is to glance
14:09:51 <flaperboon> s/change/refactor/
14:09:52 <sandywalsh> makes sense. I think notifications are the only things that go cross-service, yes?
14:10:14 <markmc> sandywalsh, yes
14:10:32 <markmc> flaperboon, I'd be happy to start refactoring once Nova is moved over, think that's a big enough user
14:10:41 <markmc> flaperboon, but, might be best to leave it until Icehouse
14:10:50 <flaperboon> markmc: agreed
14:10:54 <markmc> if the oslo-incubator and oslo.messaging drivers are kept similar in Grizzly
14:11:03 <flaperboon> jd__: you reckon? My thought was to avoid duplicating work
14:11:07 <markmc> that'll make fixing anything that comes up a hell of a lot easier
14:11:14 <flaperboon> I mean, pulling in oslo.rpc and then migrating to oslo.messaging
14:11:48 <flaperboon> I may have been too paranoid, though
14:11:51 <jd__> flaperboon: now that flwang wrote the patch… we'll update with oslo.messaging for Icehouse, IMHO it's still better to use Oslo's code than Glance's code for H :)
14:11:51 <dhellmann> markmc: grizzly?
14:12:06 <markmc> heh, sorry
14:12:08 <markmc> Havana
14:12:17 <markmc> brain is fried from nova's tests
14:12:32 <dhellmann> :-)
14:12:54 <markmc> assuming the nova work goes ok over the next few days
14:13:07 <markmc> the big risk is going to be finding nova-core folks to review it
14:13:15 <flaperboon> jd__: right, I'll give it an extra thought afterwards, thanks for the advice!
14:13:18 <markmc> large and tedious patch, breaking it up as much as I can
14:13:21 <markmc> but it's still huge
14:13:24 <jd__> flaperboon: np :)
14:13:38 <markmc> ok, seems we don't have much to talk about here
14:13:41 * markmc moves on
14:13:46 <markmc> #topic secure messaging
14:13:50 <markmc> simo, yo
14:14:00 <markmc> simo, care to summarize the current state of affairs?
14:15:13 <markmc> hmm
14:15:24 <simo> markmc: yeah, unfortunately work has been slowed down
14:15:38 <simo> api review on Keystone side is going on
14:15:54 <simo> there are some minor changes to the API that will need to be reflected in securemessage.py code
14:16:16 <simo> thierry suggested that KDS not land in Havana though and we instead commit it when Icehouse is started
14:16:40 <simo> given nova will probably not be able to use this stuff until Icehouse anyway, I guess that's reasonable at this point
14:17:02 <markmc> yeah, I'm tending to agree
14:17:21 <markmc> we're close, but it's that 80/20 thing
14:17:27 <simo> I have a few patches to add group key support not merged in oslo yet, but I have been holding off waiting for the KDS review so we do not have to change them again after commit
14:17:46 <markmc> I'd much rather see this land early in Icehouse and be in awesome shape come release time
14:18:00 <markmc> rather than include in Havana with a "it's there, but you can't really use it" caveat
14:18:01 <simo> markmc: we tried and we got at least a PoC we know works (well, I know it does at any rate, don't think anybody else has tested)
14:18:28 <simo> markmc: ideally I keep patches up to date and tested, and we land them as soon as Icehouse opens
14:18:35 <markmc> yeah
14:18:51 <markmc> simo, assuming the oslo.messaging port of nova makes it
14:18:57 <simo> markmc: should we port them to oslo.messaging before landing them at this point ?
14:19:04 <markmc> it probably makes sense to re-focus the work on oslo.messaging
14:19:08 <simo> ok
14:19:16 <markmc> but also wait until the icehouse branch of oslo.messaging opens
14:19:20 <simo> then I'll ping you as soon as the nova work is done
14:19:24 <markmc> cool
14:19:39 <simo> this is not that urgent that I need to distract you now
14:19:44 <simo> I can keep testing on the current base
14:19:48 <simo> and have patches ready
14:19:49 <markmc> thinking about the API implications, the main one will be to add a "source" name to RPCClient
14:20:00 <simo> then we simply 'translate' them into oslo.messaging
14:20:10 <markmc> cool
14:20:18 <simo> markmc: yes I looked into that and it was 'hard'
14:20:28 <markmc> simo, why hard?
14:20:32 <simo> in fact I was going to ask for leniency and wait until oslo.messaging to do that change
14:20:43 <simo> markmc: lots of changes required in callers :)
14:20:54 <simo> outside of oslo-incubator I mean
14:20:55 <markmc> oh, you're talking about having to add the source name to all cast()/call()s
14:21:04 <simo> yup
14:21:06 <markmc> right, it's all changed in oslo.messaging
14:21:33 <simo> yep so it is better if we do that in the context of oslo.messaging given major changes to users of the rpc code already happen there
14:21:37 <markmc> http://docs.openstack.org/developer/oslo.messaging/rpcclient.html
14:21:51 <markmc> self._client = messaging.RPCClient(transport, target)
14:21:57 <markmc> self._client.call(ctxt, 'test', arg=arg)
14:22:12 <markmc> you'll want to add a 'source' arg to messaging.RPCClient() I think
14:22:25 <markmc> confusingly, the type of 'source' would be messaging.Target
14:22:27 <simo> markmc: would it make sense to split topic into topic, host there?
14:22:33 <markmc> yes
14:22:44 <markmc> source = messaging.Target(topic='compute', server='myhost')
14:23:00 <simo> ah target is already a tuple ?
14:23:19 <markmc> the Target data structure contains exchange, namespace, version, topic, server, fanout
14:23:20 <markmc> http://docs.openstack.org/developer/oslo.messaging/target.html
14:23:23 <markmc> so, yeah
14:23:28 <simo> ok
14:23:31 <markmc> we're keeping topic and server separate already
14:23:36 <simo> from the example I didn't realize it:
14:23:43 <simo> target = messaging.Target(topic='testtopic', version='2.0')
14:23:48 <simo> see no 'host' part in here :)
14:24:05 <markmc> right, you typically do it this way
14:24:10 <simo> anyway
14:24:15 <simo> no need to go in specifics here
14:24:22 <markmc> self._client = messaging.RPCClient(transport, Target(topic='compute'))
14:24:47 <markmc> cctxt = self._client.prepare(server='myhost')
14:24:48 <simo> I think we agree to move to have the rest land in Icehouse immediately after oslo.messaging goes in
14:24:54 <markmc> cctxt.cast(...)
14:25:05 <markmc> great, sounds good
14:25:06 <markmc> it was always a big ask to get this into havana
14:25:06 <markmc> onwards to icehouse :)
14:25:10 <simo> yup
14:25:12 <simo> no problem
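A minimal sketch of the client API being discussed, pulling together markmc's snippets above and the rpcclient docs he linked. The final comment describes the proposed secure-messaging change only; a 'source' argument does not exist in oslo.messaging, and the import paths and placeholder names here are assumptions:

    from oslo.config import cfg
    from oslo import messaging

    transport = messaging.get_transport(cfg.CONF)

    # Today's API: address the compute topic, then narrow to a single
    # server with prepare() for a 1:1 cast.
    target = messaging.Target(topic='compute')
    client = messaging.RPCClient(transport, target)

    cctxt = client.prepare(server='myhost')
    cctxt.cast({}, 'some_method', arg='value')  # placeholder method/args

    # The secure-messaging work discussed above would add something like
    # source=messaging.Target(topic='scheduler', server='thishost') to the
    # RPCClient() constructor; that argument is only a proposal for now.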
14:25:12 * markmc moves onto the next topic
14:25:22 <markmc> #topic reject/requeue/ack etc.
14:25:30 <markmc> sandywalsh, you're up :)
14:25:41 <sandywalsh> :) where to start ...
14:25:47 <markmc> sandywalsh, from the vagueness of the topic, you can see I'm not totally following this yet
14:25:55 * markmc is being a slacker
14:26:01 <sandywalsh> so, in nova we're currently using queuing systems in two ways
14:26:20 <sandywalsh> 1. api -> scheduler = classic queueing. N producers, M consumers
14:26:33 <sandywalsh> 2. scheduler -> compute = "rpc" 1:1
14:26:43 <sandywalsh> and 3. notifications = classic queueing
14:26:56 <sandywalsh> for classic queueing, we need reliability
14:27:02 <sandywalsh> ack/requeue/reject
14:27:14 <sandywalsh> focusing on rpc-only, we don't need that
14:27:46 <sandywalsh> but for things like billing events (which we can't drop and the reason notification support was added) ... we need reliability
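For reference, a rough sketch of how those three patterns map onto oslo.messaging targets; the topic and server names are placeholders, and the notification topic is an assumption about how consumers such as ceilometer would subscribe:

    from oslo import messaging

    # 1. api -> scheduler: N producers, M consumers sharing one topic queue.
    pooled = messaging.Target(topic='scheduler')

    # 2. scheduler -> compute: the topic narrowed to a single server, 1:1 "rpc".
    direct = messaging.Target(topic='compute', server='compute-01')

    # 3. notifications: consumers listen on a notification topic
    #    (e.g. 'notifications.info'), again N producers to M consumers.
    notifications = messaging.Target(topic='notifications.info')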
14:28:22 <sandywalsh> :) not sure what to add
14:28:41 <markmc> interesting, you see a need for that reliability for #1 too
14:28:54 <sandywalsh> yes, definitely
14:28:57 <markmc> (not disagreeing, just interesting we've had it so long without it)
14:29:03 <sandywalsh> and orchestration/state management will need it too
14:29:12 <sandywalsh> I think the current proposal is very low risk in the short term.
14:29:19 <jd__> indeed, I think it makes sense in lots more topics than we might think initially
14:29:31 <markmc> I'd be very hesitant to change the behaviour of #1 in havana
14:29:43 <markmc> makes total sense to me for #3
14:29:56 <jd__> markmc: that'd be the difference between "works most of the time" and "works reliably" ;)
14:30:04 <sandywalsh> markmc, agreed ... we've hacked around it in the past by putting "retry" code in every service
14:30:25 <sandywalsh> so, unwinding that now would be risky
14:30:39 <markmc> I'm not sure how we expose the semantics in oslo.messaging, but I'm not really concerned
14:30:46 <markmc> we'll figure it out
14:30:57 <flaperboon> mmh, perhaps we could revert this commit: https://github.com/openstack/oslo.messaging/commit/84a0693737814665f7b0eb2f42fd9ec99bdb80d5
14:30:59 <sandywalsh> at the very least, in messaging, we should make the message part of the api
14:31:01 <markmc> API flags per rpc server, transport specific flags, user config
14:31:07 <markmc> some combination of those, we'll see
14:31:12 <sandywalsh> and put ack/reject/requeue methods on it
14:31:21 <sandywalsh> the particular implementations can handle accordingly
14:31:52 <markmc> nah, messages aren't explicitly part of the API
14:32:15 <markmc> i.e. for RPC it's client.cast('mymethod', **myargs)
14:32:29 <markmc> and on the server side, it gets dispatched to an endpoint method called mymethod
14:32:33 <markmc> so there's no message handle
14:32:37 <sandywalsh> right, for notifications it makes more sense
14:32:52 <markmc> I actually think this behaviour (for rpc) is something you'd configure per server
14:32:55 <sandywalsh> and delivery_info metadata is handy for N:M communications
14:33:12 <markmc> e.g. server = messaging.RPCServer(reject_on_error=True)
14:33:16 <markmc> or whatever
14:33:34 <sandywalsh> I think my current branch does it pretty well ... it's very specific to join_consumer_pool(..., ack_on_error=False) as the only way to enable those operations.
14:33:41 <markmc> I wonder is there anything interesting on the RPCClient side
14:34:05 <markmc> RPCClient(server_shouldnt_drop_messages=True)
14:34:10 <sandywalsh> reject ... the consumer has to throw RejectMessageException() to tell the rpc layer to reject the message.
14:34:15 <sandywalsh> this is kind of klunky
14:34:39 <sandywalsh> but since Message isn't a first-class citizen there's not many other ways
14:35:08 <markmc> well
14:35:19 <markmc> the notification client's API
14:35:25 <markmc> surely, when you dispatch a notification
14:35:38 <markmc> any exceptions raised by it should cause the message to be rejected?
14:35:47 <markmc> is RejectMessageException() just an implementation detail?
14:36:03 <sandywalsh> no, they should requeue() by default
14:36:09 <sandywalsh> and let another worker pick it up
14:36:11 <markmc> oh, ok
14:36:12 <markmc> sorry
14:36:22 <markmc> explain the use case for reject() ?
14:36:27 <flaperboon> mmh, sounds more like a target thing to me, which is what the Driver and Listener get
14:36:33 <sandywalsh> "I've seen this message before"
14:36:35 <markmc> requeue() is where there's an error in the consumers processing, right?
14:36:39 <markmc> oh
14:36:41 <sandywalsh> correct
14:37:06 <markmc> don't we have automatic duplicate message rejection for RPCs ?
14:37:17 <markmc> why are we pushing dup detection to the API user for notifications?
14:37:49 <sandywalsh> hmm, not sure what you mean?
14:38:23 <jd__> I think sandywalsh's case is when you receive a notification, start processing it, fail, call requeue() to retry later, receive it again, finally decide you already handled it, and so call reject()
14:38:55 <dhellmann> why would we reject the message if we get it a second time?
14:39:04 <markmc> ok, so you only know you've handled it by looking in ceilometer's database
14:39:13 <markmc> as opposed to an in-memory cache of messages you've seen before
14:39:14 <sandywalsh> jd__, correct, that's the common use-case ... for example, save to db, do some down-stream processing which fails. ... we don't want to reprocess the message another time
14:39:18 <jd__> markmc: yes
14:39:30 <markmc> dhellmann, right, that's my next question - why reject() vs just ack()ing duplicate messages
14:39:32 <sandywalsh> but that's a larger discussion, more around CM pipelines (a different ML thread :)
14:39:36 <dhellmann> sandywalsh: why does the message delivery system care about that though?
14:39:40 <jd__> maybe the question is "was it wise to call requeue() the first time?"
14:39:41 <dhellmann> markmc: exactly
14:40:14 <sandywalsh> dhellmann, the message may have to go to another worker
14:40:30 <sandywalsh> remember we're talking N:M here ... lots of producers, lots of consumers
14:41:03 <markmc> don't get it
14:41:10 <markmc> if a worker in a pool has seen this before
14:41:17 <markmc> why would you want another worker to get it?
14:41:30 <dhellmann> sandywalsh: if the workers are different, they should each have their own queue. If they are for load balancing, then they should pull from the same queue and ack messages that are delivered properly.
14:41:56 <dhellmann> and handled properly, even if handling means deciding that it has been seen and doesn't need to be processed again
14:42:01 <sandywalsh> the simple use case is: get an event, store an event. fail on the return.
14:42:12 <sandywalsh> we don't want to repeat that process (we'll fill the db)
14:42:38 <sandywalsh> but we can't ack() since we could lose the event
14:42:52 <sandywalsh> and we never want to lose a billing event
14:43:58 <jd__> btw what happens if there's no ack nor reject nor anything? (if that's possible?)
14:44:04 <dhellmann> if there is an error, then yes, report the error to trigger re-delivery
14:44:15 <dhellmann> if the error is "I've seen this already", that's not an error IMO
14:44:21 <oldben> But in that example you already stored the event.
14:44:26 <oldben> So you haven't lost anything.
14:44:32 <sandywalsh> dhellmann, correct, which is the raise RejectMessage() situation
14:45:05 <sandywalsh> jd__, there has to be an ack() at the very least or the queuing system will never drain
14:45:19 <flaperboon> mmhh, but why wouldn't requeue work here?
14:45:23 <markmc> sandywalsh, why are we worried about losing messages we've marked as "I've seen this already"
14:45:45 <markmc> surely we want to lose them?
14:45:48 <jd__> sandywalsh: oh I'm just wondering what the queueing system does in that case
14:45:56 <markmc> won't another worker see it, notice it's a dup and also reject it?
14:45:56 <sandywalsh> flaperboon, requeueing is what would happen with any other exception type (not RejectMessage)
14:46:10 <markmc> sorry, nevermind the last comment
14:46:10 <dhellmann> what would RejectMessage do, then?
14:46:19 <flaperboon> sandywalsh: ^
14:46:25 <sandywalsh> dhellmann, flush the message from the queue
14:46:32 <dhellmann> how is that different from ack?
14:46:34 <flaperboon> sandywalsh: but that's done automatically now
14:46:48 <flaperboon> I mean, ack is always called (as for now)
14:46:55 <sandywalsh> yeah, reject and ack() are essentially the same
14:47:07 <dhellmann> ok
14:47:16 <dhellmann> then why have reject as a special case?
14:47:30 <sandywalsh> there are some minor things about dead-letter queues, etc
14:47:42 <sandywalsh> but, not critical for this discussion
14:47:53 <sandywalsh> I'd be happy to just have ack() and requeue()
14:48:08 <markmc> i.e. notice the duplicate message, but do nothing special, it's a no-op
14:48:13 <markmc> rather than raise RejectMessage
14:48:15 <dhellmann> that seems like enough to me
14:48:29 <sandywalsh> sure, I'm cool with that
14:48:35 <markmc> coolness
14:48:51 <sandywalsh> gets rid of a mess too :)
14:49:10 <flaperboon> +1
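A rough sketch of the semantics just agreed: an unhandled exception means requeue so another worker can retry, a recognised duplicate is simply acked as a no-op, and a separate RejectMessage exception goes away. None of the names below are real oslo.messaging API (the notification listener interface was still being designed at this point); _store_event and the in-memory set are stand-ins for ceilometer's real processing and database lookup:

    import logging

    LOG = logging.getLogger(__name__)

    # Stand-in for "have we seen this before?"; in practice this would be
    # a lookup against ceilometer's database, not an in-memory set.
    _seen = set()


    def _store_event(payload):
        # Hypothetical downstream processing that may fail (db write, etc.)
        pass


    def handle_notification(message_id, payload):
        # Returning normally acks the message; raising any exception asks
        # the messaging layer to requeue it so another worker can retry.
        if message_id in _seen:
            LOG.debug("duplicate message %s, acking as a no-op", message_id)
            return
        _store_event(payload)
        _seen.add(message_id)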
14:49:35 <markmc> sandywalsh, ok, on the general point of maintaining API compatibility
14:49:47 <markmc> any API in oslo-incubator can be changed incompatibly
14:49:54 <markmc> sometimes a compatibility hack is nice
14:50:06 <markmc> but if the incompatible change isn't going to affect too many users
14:50:10 <markmc> or the hack is too gross
14:50:22 <markmc> then I'd actually prefer the incompatible change
14:50:47 <markmc> it's in oslo.messaging that we'll have to go to great lengths to avoid incompatible changes :)
14:51:03 <sandywalsh> k
14:51:08 <markmc> sandywalsh, oh, yes - delivery_info
14:51:12 <markmc> what's the use case for that?
14:51:37 <sandywalsh> there are possible workarounds on the client side, but basically it's a means to get metadata about the event with the message
14:51:45 <sandywalsh> "which topic did this come in on"
14:51:51 <sandywalsh> "what queue did it come from"
14:51:58 <sandywalsh> "when did it arrive"
14:52:24 <sandywalsh> so, it's not a deal breaker, but there's some good stuff in the delivery info
14:52:27 <markmc> ok
14:52:31 <markmc> I'd be happy to have that
14:52:37 <markmc> so long as we put an abstraction in place
14:52:41 <flaperboon> +1
14:52:42 <sandywalsh> yep, certainly
14:52:49 <markmc> so let's put it on the todo list for the oslo.messaging notification client API
14:52:56 <dhellmann> yeah, that sounds really close to the Target class already, right?
14:53:08 <dhellmann> I don't remember if Target has the queue name
14:53:30 <markmc> no, it doesn't
14:53:35 <flaperboon> dhellmann: nope, just topic
14:53:37 <flaperboon> and exchange
14:53:42 <dhellmann> makes sense
14:53:56 <dhellmann> so a new class, I guess
14:54:05 <sandywalsh> "number of retries"
14:54:16 <sandywalsh> potentially very handy info
14:54:37 <markmc> ok, cool
14:54:40 <dhellmann> yeah, if we make a class it's easy to include all sorts of things like that and add new ones later
14:54:47 <sandywalsh> yep
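A sketch of the kind of small class dhellmann is suggesting for delivery metadata; the name and fields are invented for illustration and nothing like this exists in oslo.messaging yet:

    import datetime


    class DeliveryInfo(object):
        """Illustrative container for per-message delivery metadata."""

        def __init__(self, topic, queue_name, received_at=None, retries=0):
            self.topic = topic              # which topic it came in on
            self.queue_name = queue_name    # which queue it was pulled from
            self.received_at = received_at or datetime.datetime.utcnow()
            self.retries = retries          # "number of retries" from above

Wrapping the metadata in a class like this is what makes it easy to add new fields later without changing endpoint signatures.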
14:54:55 <markmc> just about out of time, so ...
14:54:59 <markmc> #topic open floor
14:55:05 <markmc> and it doesn't have to be messaging :)
14:55:15 * markmc notes boris-42's oslo.db proposal
14:55:36 <markmc> #link http://lists.openstack.org/pipermail/openstack-dev/2013-August/013748.html
14:55:42 <markmc> anything else?
14:56:20 <flaperboon> Would it make sense to create an oslo.string (perhaps a different name)? I was thinking we could collect strutils and gettext there
14:56:22 <markmc> gettext, I'd put into oslo.i18n
14:56:23 <markmc> fairly specialized area
14:56:32 <markmc> I don't really know what to do about *utils
14:56:40 <markmc> could do oslo.utils
14:56:46 <flaperboon> markmc: me neither
14:56:50 <markmc> but fear it would become a dumping ground
14:57:01 <flaperboon> markmc: I think it would :/
14:57:25 <oldben> Yeah, utils ~= I don't know where this should go, I'll put it in utils
14:57:36 <flaperboon> however, if we don't have oslo.utils we'll end up with a bunch of repos with 1 module in them
14:57:43 <flaperboon> oslo.strings just for strutils
14:58:03 <markmc> oslo.utils.strutils, and friends
14:58:20 <markmc> just some way of figuring out what should and shouldn't go into oslo.utils is needed
14:58:25 <markmc> some way of explaining it
14:58:35 <markmc> e.g. "function <10 lines"
14:58:35 <markmc> meh
14:58:48 <oldben> Probably going to be a lot of judgment calls there either way.
14:58:54 <flaperboon> well, actually utilities instead of just common code
14:59:00 <dhellmann> I'd rather we stick to libraries having clear purposes and not have a "generic" lib
14:59:01 <markmc> maybe the categories of core python libs that the utilities relate to
14:59:10 <flaperboon> dhellmann: +1
14:59:17 <markmc> oslo.coreutils :)
14:59:21 <flaperboon> LOL
14:59:24 <markmc> because "core" clarifies everything
14:59:30 <flaperboon> huahuahua
14:59:40 <dhellmann> if that means we put gettext and strutils in oslo.text instead of oslo.i18n and oslo.utils maybe that's not a bad thing
14:59:53 <flaperboon> dhellmann: yeah, that's what I had in mind
14:59:53 <markmc> dhellmann, yeah, maybe
14:59:57 <markmc> oslo.text works
15:00:12 <flaperboon> having something like oslo.text (I'm bad with mod names) w/ both in there
15:00:18 <dhellmann> we'll also have to keep an eye on the circular dependencies, of course
15:00:32 * jd__ suggests oslo.data
15:00:55 <markmc> jd__, oslo.data vs oslo.db :)
15:01:05 <simo> oslo.teletext, a blast from the past
15:01:07 <markmc> ok, we're outta time
15:01:12 <markmc> good topic for HK
15:01:17 <markmc> #endmeeting