15:01:31 #startmeeting Zaqar
15:01:32 Meeting started Mon Nov 17 15:01:31 2014 UTC and is due to finish in 60 minutes. The chair is flaper87. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:33 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:36 The meeting name has been set to 'zaqar'
15:01:42 * flaper87 still remembers how this thing works
15:01:45 #topic Roll Call
15:01:55 vkmc: cpallares dynarro_ zhiyan ?
15:01:57 around ?
15:01:58 o/
15:02:01 o/
15:02:02 * cpallares is not here
15:02:25 ok, looks like it's just us
15:02:37 #topic Agenda
15:02:39 #link https://wiki.openstack.org/wiki/Meetings/Zaqar#Weekly_Zaqar_.28queuing.29_Team_Meeting
15:02:51 kgriffs?
15:02:57 short list, straight items, easy to go through but long
15:03:12 we'll start with some of those specs and then continue next week
15:03:18 o/
15:03:22 * flaper87 wanted to say something else but he forgot
15:03:24 kgriffs: hey :D
15:03:28 #link https://wiki.openstack.org/wiki/Meetings/Zaqar#Weekly_Zaqar_.28queuing.29_Team_Meeting
15:03:32 kgriffs: ^ agenda
15:04:01 #topic Persistent Transport https://review.openstack.org/#/c/134567/
15:04:10 vkmc: want to say something ?
15:04:20 flaper87, of course
15:04:40 so we have discussed about this addition during the design sessions
15:04:59 there are several uses cases that we are not covering right now with the WSGI transport
15:05:15 so having a persistent transport alternative aims to cover those
15:05:30 I was worried about how coupled is Zaqar's transport with WSGI
15:05:42 but according flaper87's comments on the spec that shouldn't be a problem
15:05:52 flaper87, could you explain more how you are expecting that to work?
15:05:58 vkmc: sure
15:06:27 precisely, using WSGI transport as a reference for other transports
15:06:32 at the very beginning, many moons ago, we wanted to have a common API layer for *everything* - including the wsgi transport - but that resulted in way too much duplication
15:07:00 the idea now is to keep the wsgi transport as-is and implement that common API layer (this is something cpallares worked on) just for persistent transports
15:07:09 or well, "message based transports", if you will
15:07:28 the wsgi transport is the reference in terms of documentation, implementation, supported endpoints etc
15:07:42 and it'll, at least for now, be our recommended transport
15:08:13 +1 for that
15:08:15 Zaqar offers a RESTful API as its main way of communication
15:08:31 I reviewed the spec, it looks good to me. I dropped some comments there
15:08:49 the milestone proposed sounds reasonable and I'm happy you're willing to be the primary contact on it
15:08:57 so this transport addition will depend on the creation of the common API layer
15:08:58 does anyone want to jump in as a secondary contact?
15:09:14 vkmc: yes. Oh, btw, you'll have to complete that :P
15:09:21 jokes apart, cpallares worked on it
15:09:23 flaper87, that's what I wanted to know
15:09:26 :D
15:09:39 if she's got time to help completing it, it'd be amazing
15:09:46 I do want to help
15:09:50 vkmc: otherwise, you'll have to
15:09:57 vkmc: could you add cpallares as a secondary contact?
15:09:59 I don't remember exactly what is needed.
15:10:02 before she changes her mind
15:10:04 cpallares told me that she found a dead end on that, or I misunderstood that?
15:10:04 :P
15:10:04 lol
15:10:23 probably related to the wsgi thing?
15:10:26 * vkmc rush to add cpallares name to the spec
15:10:32 probably yes
15:10:39 we wanted to change that when cpallares was working on that cross-api thingy
15:10:51 #action vkmc to add cpallares as a secondary contact
15:11:08 #action cpallares and vkmc to coordinate on the things to do on that spec
15:11:23 #action kgriffs to say something ;)
15:11:29 #undo
15:11:29 Removing item from minutes:
15:11:34 #action kgriffs to say something related to that spec ;)
15:11:37 so, I don't see any mention of data encoding in the spec
15:11:44 * kgriffs does that
15:11:44 kgriffs: good point
15:11:50 vkmc: ^
15:11:58 i would like to have that decided before approving
15:12:04 good point
15:12:08 btw, I looked at apache avro
15:12:16 We talked about msgpack, protobuf and I've heard capnprot is good
15:12:24 https://avro.apache.org/docs/current/#compare
15:12:25 we were considering apache avro, protobuf and msgpack
15:12:32 http://kentonv.github.io/capnproto/
15:12:38 it's like protobuf with steroids
15:12:39 * kgriffs clicks
15:12:40 not sure
15:12:55 ok, the website already gained my attention
15:13:04 cerealization protocol lol
15:13:12 doge approves
15:13:20 vkmc: mind doing a small research and adding it to the spec ?
15:13:33 flaper87, sure thing
15:13:49 ok
15:13:55 vkmc: Who can resist that infinitely faster sticker?
15:13:56 #link http://jparyani.github.io/pycapnp/
15:14:07 cpallares, someone without feelings
15:14:17 fwiw, avro i don't think will work for us
15:14:24 unless we do multipart mime or something
15:14:42 since you have to have a spec defined in advance, and we want to treat the message body as "spec-less" or "we don't care"
15:15:00 i could be wrong, but that is my first impression
15:15:00 kgriffs: +2
15:15:10 ok, lets wait for vkmc feedback on that
15:15:24 vkmc: it'd be great to have one proposed and the others as alternatives
15:15:34 flaper87, sounds good to me
15:15:43 coolio
15:15:45 lets move on
15:15:48 what are the evaluation criteria?
15:16:12 I'd like to see each option scored by criteria
15:16:15 kgriffs: dude, you're giving me more work, I was offloading everything on vkmc
15:16:17 :P
15:16:28 :)
15:16:37 Lets put some critirias on an etherpad
15:16:47 right after the meeting so that vkmc can work based on that
15:16:54 i think it will actually make things easier to eval and communicate the decision
15:16:58 let's do that yeah
15:17:08 flaper87, woooooork!
15:17:20 Roughly I think dynamicity, performance and adoption
15:17:26 cool, that was the only thing I had to add re that spec
15:17:31 As in support in the different languages and whatnot
15:17:43 perf on serializing/deserializing
15:17:44 vkmc: can you create the etherpad?
15:17:53 drop it here when you do, we'll all contribute to it
15:17:55 right. also keep in mind that javascript in the browser would be really nice to support
15:18:03 sure
15:18:06 kgriffs: I kinda think that's a must
15:18:07 since horizon wants push notifications
15:18:25 since browsers are one of the things driving this websocket work
15:18:39 we could complete the etherpad during the open discussion
15:18:41 or later
15:18:50 #action vkmc to evaluate serialization protocols for the websocket transport
15:18:55 roger
15:18:58 lets move on
15:19:03 #topic https://review.openstack.org/#/c/129192/ Notifications
15:19:44 As far as new features go, our priorities for this cycle are Notifications and persistent transport
15:19:57 everything blocking those 2 blueprints should be re-schedule after k-1
15:20:14 flwang: and vkmc wrote some comments in that spec
15:20:16 #link https://review.openstack.org/#/c/129192/
15:20:33 I think the most critical one is what the endpoint should look like
15:21:01 IMO notifications should be done after queue/topics migration
15:21:02 The currently proposed endpoint is `/v2/subscribe/
15:21:06 I don't like that idea but...
15:21:20 vkmc: we might make it work in parallel
15:21:30 flaper87, yeah, that would work too
15:21:37 bet lets give the s/queue/topic/ thing a week to gather feedback
15:21:37 sorry, i'm late!
15:21:42 zhiyan: np
15:22:08 flaper87: should we think more in terms of resources when designing the URLs?
15:22:10 I have to update that spec and add an subscription request example to it
15:22:16 i.e., subscribe --> subscriptions
15:22:41 kgriffs: wait, we just found a typo in the spec
15:22:43 :P
15:22:47 /v2/subscritions/{subscription}/
15:22:54 then below
15:22:55 kewl
15:22:57 /v2/subscribe
15:23:03 kgriffs: comment ? :D
15:23:07 otherwise I'll forget
15:23:15 kgriffs: in other words, I totally agree with you
15:24:41 kk
15:24:45 I can comment
15:24:50 Any comments on the milestone ?
15:25:07 It's K-2 because k-1 seems to be too close to have something working
15:25:30 I'd like to give this thing enough time for discussion and it should definitely have more iterations
15:26:15 * flaper87 watches a butterfly passing by
15:26:22 * flaper87 starts following that butterfly
15:26:34 * flaper87 starts singing barney's songs
15:26:46 * flaper87 is jumping around
15:26:49 no, wait, stop!
15:26:56 * kgriffs can't stand the barney
15:26:59 * flaper87 stops and slaps himself 4 times
15:27:02 * kgriffs would rather go see a dentist
15:27:09 * flaper87 knew that would get kgriffs attention
15:27:12 LOOOL
15:27:12 so...
15:27:37 seems like the big thing to figure out on this is how to push the notifications
15:27:56 because we need some sort of worker pool that is out-of-band from the web server, right?
15:27:58 mmh, actually, I thought that was clear
15:28:00 :/
15:28:02 ah wait
15:28:04 ok yeah
15:28:10 sorry, I read publishing
15:28:12 (facepalm)
15:28:22 kgriffs: correct
15:28:24 i don't see mention of worker pool in the spec
15:28:33 because I wanted to avoid workers
15:28:35 :/
15:28:41 I took a look at how ceilo does it
15:28:54 they use eventlet + calls to the many webhooks
15:29:06 I'm also doing some research on what the best way to do this is
15:29:23 oh, so they have eventlet worker pool that lives inside the web server?
15:29:28 thing is, we need to communicate with the workers and using a message broker is a no-go her
15:29:34 kgriffs: yup
15:29:34 (conceptually)
15:29:35 AFAICT
15:29:41 s/her/here/
15:29:55 hey, I know, let's use celery
15:29:59 * kgriffs runs and hides
15:30:04 OMG, (plop)
15:30:06 lol
15:30:41 aaanyway, yeah, I think we should sort that out and add talk about it in the spec
15:30:43 I don't think each backend should have a specific implementation to communicate with the workers
15:30:52 ok
15:31:05 flaper87: I feel like we should define the "worker" model, whatever that is
15:31:09 #action to clarify how notifications will be pushed to clients
15:31:22 but then you can load drivers for different targets
15:31:26 kgriffs: wait, now that you mention that, I think I did say taskflow there
15:31:28 (webhook, email, sms, etc.)
15:31:38 oh I didn't
15:31:52 anyway, taskflow has this worker model based on eventelt that we could probably use
15:32:00 oh.
15:32:06 would it be overkill?
15:32:14 the workflow is like 1 step long
15:32:14 #link http://docs.openstack.org/developer/taskflow/engines.html#parallel
15:32:28 kgriffs: you could create 1 task and just execute
15:32:32 execute it*
15:32:50 I'm still doing some research there
15:32:52 #undo
15:32:53 Removing item from minutes:
15:32:58 #action flaper87 to clarify how notifications will be pushed to clients
15:33:11 #link http://docs.openstack.org/developer/taskflow/engines.html#parallel
15:33:13 damnit
15:33:13 ok
15:33:15 anyway
15:33:38 I just don't see what taskflow provides over just using a simple python queue and eventlet/gevent/tulip pool
15:33:38 * flaper87 thinks every bot command should've an id
15:34:04 kgriffs: exactly that, it has support for gevent, eventlet and tulip (?) already
15:34:09 we don't have to write it
15:34:10 oic
15:34:34 flaper87: does it handle retries?
15:34:38 yup
15:34:55 ok, gtk
15:35:09 it's basically celery^W
15:35:12 lol
15:35:18 ;)
15:35:28 ok, I'll update the spec
15:35:34 in-process celery, anyway
15:35:54 one thing before we move on, I'd really like to get both specs merged this week (persistent transport and notifications) so, lets keep an eye on those
15:36:01 I don't like to chase people
15:36:09 before we wrap out this notifications discussion
15:36:18 flaper87: I'd like to see a little info on the pros/cons of doing it in-process (ceilometer way) vs. a dedicated worker pool.
15:36:29 kgriffs: sure thing
15:36:30 kgriffs, any suggestions on how we can manage the subscriptions?
15:36:32 vkmc: 'sup ?
15:36:47 considering the fact that queues/topics will be removed
15:36:59 does it make sense to have another endpoint for those?
15:38:06 after putting some extra thoughts on this, I'd probablt just go with what's proposed because it works with both queues and topics
15:38:22 ok, sounds good
15:38:27 and it separates messaging and notifications
15:38:31 at least logically
15:38:41 vkmc: kgriffs thoughts?
15:38:46 cpallares: ^ everyone ^
15:38:53 someone up there? ^
15:38:58 someone down there? ^
15:39:07 Gandalf ^ ?
15:39:12 Sauron ^ ?
15:39:15 yeah, considering the fact that we are not relying in containers anymore
15:39:20 I think that is the best approach
15:39:26 coolio
15:39:32 I'm a bit concened on how we are going to keep it updated thoguh
15:39:42 though*
15:39:43 vkmc: what do you mean?
15:39:49 what do we have to keep updated ?
15:39:56 I mean, the mappings queue/topic - subscriber
15:41:08 vkmc: ah well, mmh, I think that's the user's responsibility
15:41:15 k k
15:41:18 Should we consider the service crash when notification is working? May part of the subscriber get the message and others not.
15:41:19 vkmc: you mean, if a queue is deleted we should delete the subscription?
15:41:24 am I following you ?
15:41:26 yup
15:41:27 you are
15:41:51 if a client dies, the subscription should be removed as well
15:41:54 and that kind of scenarios
15:41:56 jeffrey4l: totally, that will have to consider but I prefered to leave that part out of the spec and make it part of the review
15:42:13 ok.
15:42:15 vkmc: yeah, for the permanent ones we'll leave that to the user
15:42:35 k
15:42:36 :)
15:42:40 vkmc: for the client ones, I think it'll be easier. Once the push to the client fails, we can remove it
15:42:41 :D
15:42:59 sounds great
15:43:04 thanks for clearing up
15:43:07 wait
15:43:10 np, my pleasure
15:43:15 OMG, kgriffs WHAT ?
15:43:17 :P
15:43:24 lol
15:43:33 I think jeffrey4l's question has bearing on whether we use an in-process task queue for notifications
15:43:37 are we talking some thing like `auto_delete` in rabbitmq?
15:43:50 or does taskflow store the "task" in a DB?
15:43:57 so it can survive a process crash
15:44:01 kgriffs: I think it does, in a sqlite db
15:44:10 I'm not sure, though
15:44:18 ok, that is definitely something to verify
15:44:22 but even though, I think that's something we can keep track of
15:44:30 iirc, it could use a dedicated db
15:44:32 that's why I said we should keep it in mind for reviews
15:45:02 flaper87: well, I just don't want to pick something that we know off the bat can't handle a process crash.
15:45:09 if it can use a sqlite db to keep track of things, I'd rather have 1 for each worker than a centralized one
15:45:18 kgriffs: totally, +@
15:45:20 kgriffs: totally, +2
15:45:33 I'll dig more and make it explicit in the spec
15:45:33 flaper87: also, performance/scaling is going to be an important evaluation criteria
15:46:03 kgriffs: right, but that also depends a lot on the task itself
15:46:16 I mean, the things the task does
15:46:24 anyway, we basically ate our time
15:46:29 we've like 14 mins left
15:46:37 sure, it depends on a lot of things. :p
15:47:01 at least we know whom to blame
15:47:10 :P
15:47:19 we always know whom to blame
15:47:19 anything else on this topic?
15:47:33 I'll update the spec and get back to you tomorrow looking for more feedback
15:47:35 just saw taskflow has a persistence backend. including impl_dir impl_sqlalchemy impl_zookeeper
15:47:46 jeffrey4l: +2
15:47:52 * flaper87 remembers reading something
15:48:34 we also need to dig into how difficult it'll make deploying zaqar
15:48:47 agreed, and operation
15:48:50 We're already splitting management/data layers
15:49:01 which in most cases will require 2 different DB instances
15:49:42 yes ;)
15:50:02 There are several more specs to review so I'll probably ping you all throughout the week to get your feedback on those
15:50:09 No spec, no patches, no reviews, no progress
15:50:20 the sooner we agree/merge those, the better
15:50:30 kk.
I have to write some specs too
15:50:44 I'll get that done ASAP
15:50:47 kgriffs: yup, please, ping me (and everyone) as soon as they're up
15:50:55 fwiw, they are:
15:50:55 kgriffs: thanks a lot
15:51:05 * flaper87 reads carefully
15:51:30 redis pool, large-scale load testing/locust, and TempURL thingy
15:51:56 kgriffs: ahhh, many important things there. Are you going to be the main contact for all of them ?
15:52:14 yes, but I will need some help getting them implemented
15:52:19 if not, lets find someone interested in helping you out
15:52:23 awesome
15:52:27 I'm sure jeffrey4l wants too
15:52:33 * flaper87 just volunteered jeffrey4l
15:52:38 ah and kragniz too
15:52:41 I'm sure of that
15:52:53 anyway, lets have some open discussion
15:52:54 my pleasure ;p
15:53:02 #topic Open Discussion
15:53:11 I just have 1 thing to say... actually 2
15:53:49 The first one is that besides the specs and things we have to work on, I'd like us to focus *a lot* on tackling technical debt
15:54:05 I did a full bug-list triage last week to clear some bugs out
15:54:17 there are still many and there are many TODOs/FIXMEs in the code base
15:54:27 please, grep the code base and help tackling those
15:54:45 +1
15:54:45 jeffrey4l has been doing an amazing bug-squashing series of patches
15:54:50 so, thanks a lot jeffrey4l
15:54:58 I haven't checked the ML list yet, but any update on "layers" from the TC?
15:55:08 So, again, focus on technical debt and features
15:55:17 but give technical debt enough priority
15:55:21 the second thing is:
15:55:56 You're all amazing, I had an amazing time with you all in Paris and I'm sad that some of you couldn't make it there.
The project would be nothing without you so please, keep it up and lets keep it going
15:56:11 kgriffs: no updates so far
15:56:17 just the TC voting by numbers thread
15:56:28 Another notification use case: should we support work queues like http://www.rabbitmq.com/tutorials/tutorial-two-python.html
15:56:36 I haven't attended TC meetings lately so it's probably been discussed
15:56:37 I'd like to second that; we have a totally awesome crew and let's keep doing innovative work
15:57:14 jeffrey4l: some of those patterns are in the TODO dreams for Zaqar :P
15:57:28 jeffrey4l: but yeah, I think we would in a future
15:57:29 :D greatest team ever
15:57:39 please, do feel free to write specs, etherpads and start discussions
15:57:45 flaper87, great
15:57:47 we love to discuss things even if they can't happen yet
15:58:09 oh, one last thing
15:58:15 kgriffs: 2 mins
15:58:16 everyone let's keep improving our docs
15:58:23 +1!
15:58:25 that will really help adoption
15:58:38 and...
15:58:39 I also have a last thing: I chatted with Boris from Rally earlier today , he asked me if we are willing to add rally job to zaqar check pipeline
15:58:41 kgriffs: +2
15:58:45 the wiki page probably needs to be updated
15:58:52 (we can discuss later)
15:58:59 vkmc: last time we all talked about it, I think we agreed on doing that
15:59:04 k k
15:59:05 esp. "state of the project"
15:59:19 kk, gtg
15:59:23 +1 kgriffs
15:59:26 great meeting, catch you later
15:59:29 #endmeeting
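
Note on the notifications discussion: a rough sketch of the "simple python queue plus worker pool" alternative that was weighed against taskflow, with per-target drivers (webhook, email, sms, etc.), retries, and a list of subscribers whose pushes kept failing (candidates for removal, as suggested for client subscriptions). This is not Zaqar code and not taskflow's API. The class name NotificationPool, its methods and the driver call signature are hypothetical, and plain threads stand in for an eventlet/gevent pool. The queue lives in memory only, so queued pushes are lost on a process crash, which is the concern raised in the meeting.

```python
import queue
import threading


class NotificationPool:
    """Hypothetical in-process worker pool for pushing notifications."""

    def __init__(self, drivers, workers=2, max_retries=3):
        # drivers maps a target name ("webhook", "email", ...) to a
        # callable taking (subscriber, message); names are illustrative.
        self._drivers = drivers
        self._max_retries = max_retries
        self._tasks = queue.Queue()
        self._lock = threading.Lock()
        self.failed = []  # subscribers whose pushes exhausted all retries
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def notify(self, target, subscriber, message):
        # Enqueue a single-step "task"; the last field is the attempt count.
        self._tasks.put((target, subscriber, message, 0))

    def _run(self):
        while True:
            target, subscriber, message, attempt = self._tasks.get()
            try:
                self._drivers[target](subscriber, message)
            except Exception:
                if attempt + 1 < self._max_retries:
                    # Re-queue for another attempt.
                    self._tasks.put((target, subscriber, message, attempt + 1))
                else:
                    # Give up; the caller may drop this subscription.
                    with self._lock:
                        self.failed.append(subscriber)
            finally:
                self._tasks.task_done()

    def join(self):
        # Block until every queued push has succeeded or been given up on.
        self._tasks.join()
```

A caller would build the pool with one driver per target, call notify() for each subscriber when a message is posted, and periodically inspect failed to clean up dead client subscriptions. Whether this is preferable to taskflow depends on the persistence question left open above.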