15:00:03 <rhochmuth> #startmeeting monasca
15:00:03 <openstack> Meeting started Wed Mar 30 15:00:03 2016 UTC and is due to finish in 60 minutes.  The chair is rhochmuth. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:07 <rhochmuth> o/
15:00:07 <openstack> The meeting name has been set to 'monasca'
15:00:10 <Kamil> o/
15:00:15 <fabiog> o/
15:00:20 <rhochmuth> Agenda is at, https://etherpad.openstack.org/p/monasca-team-meeting-agenda
15:00:26 <rhochmuth> Agenda for Wednesday March 30, 2016 (15:00 UTC)
15:00:26 <rhochmuth> 1. Non-periodic metrics and aggregate functions, are all suitable ?
15:00:27 <rhochmuth> 2. Anyone else seeing https://bugs.launchpad.net/monasca/+bug/1559165?
15:00:27 <rhochmuth> 3. Anyone seeing periodic monasca-api hangs (java)?
15:00:27 <rhochmuth> 4. Compression support for POSTing metrics?
15:00:27 <openstack> Launchpad bug 1559165 in Monasca "monasca-api log flooded with jetty parser errors (java api)" [Undecided,New]
15:00:27 <rhochmuth> 5. Non-periodic/periodic metrics - update
15:00:27 <bklei> o/
15:00:27 <rhochmuth> 6. Grafana 2 horizon update https://review.openstack.org/#/c/295983/
15:00:27 <hosanai> o/
15:00:27 <rhochmuth> Needed for devstack integration
15:00:31 <rhochmuth> hi everyone
15:00:32 <shinya_kwbt> o/
15:00:48 <tomasztrebski> o/
15:00:54 <bklei> good morning
15:00:55 <slogan> morning
15:01:05 <rhochmuth> looks like a few agenda items are up today
15:01:13 <rbak_> o/
15:01:16 <rhochmuth> but hopefully we should get through them relatively quickly
15:01:16 <ddieterly> o/
15:01:22 <rhochmuth> and then can have other discussion
15:01:31 <tomasztrebski> cool
15:01:38 <rhochmuth> so first topic
15:01:47 <rhochmuth> #topic Non-periodic metrics
15:02:00 <rhochmuth> tomasz: Is that you?
15:02:08 <tomasztrebski> yes, actually it was a question: Non-periodic metrics and aggregate functions, are all suitable ?
15:02:47 <rhochmuth> are you referring to the case of applying a statistics function to a non-periodic metric
15:02:54 <tomasztrebski> yeah
15:03:11 <tomasztrebski> something like: does avg make sense for metrics that are non-periodic
15:03:40 <bklei> it does in our case
15:03:41 <rhochmuth> i was hoping to not do something in this area, but now that you've brought it up, there are probably items worth addressing
15:03:47 <rhochmuth> i think it is ok
15:03:59 <rhochmuth> but, the question is what should the average be
15:04:14 <rhochmuth> for a metric that you haven't received in a long time
15:04:22 <tomasztrebski> bklei: good to know
15:04:33 <rhochmuth> should it be the average of the last value
15:04:39 <qwebirc46327> o/
15:04:39 <rhochmuth> or should it be 0
15:04:43 <rhochmuth> or NaN
15:05:07 <tomasztrebski> I am not sure how it works in thresh right now... does it calculate the average from all measurements or just the last few
15:05:10 <rhochmuth> personally, i'm ok addressing the non-periodic metrics in just the threshold engine for now
15:05:22 <slogan> it doesn't make sense without a time range specified.
15:05:58 <rhochmuth> in the threshold engine it should just be the last value
15:06:34 <rhochmuth> so, if the value hasn't been sent within the period, the value is just the last value that was sent
15:06:42 <rhochmuth> that was the interpretation that i had
15:07:24 <rhochmuth> so, let's say normally a periodic metric is sent every 30 seconds approximately
15:07:46 <rhochmuth> if it is a non-periodic metric then the assumption is that the value for the metric, is just the last value
15:08:22 <tomasztrebski> slogan: if the time period had to be specified, that would mean modifying the UI as well, not to mention other components... that's not so trivial I guess
15:08:36 <slogan> But average could be useful if a range is specified? Consider rainfall. Answers are completely different if averaged by day, month, year, season.
15:08:47 <slogan> nod
15:09:09 <tomasztrebski> roland: so, right now let's stay with what we have and keep this topic open, in case someone (either Fujitsu or someone else) comes up with a decent idea of how to implement it right
15:09:16 <tomasztrebski> and basically I agree with slogan
15:09:25 <tomasztrebski> but again, doing all that right now is too much
15:09:37 <slogan> makes sense
15:09:50 <rhochmuth> that works for me
15:10:00 <tomasztrebski> works for me too
15:10:20 <slogan> could lead to some interesting insights in a UI to be able to compute an average on the fly by dialing a period knob
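To make the open question concrete, here is a minimal sketch (not monasca code) of the two interpretations discussed above: carrying the last value forward for an empty period versus reporting NaN. The function and its names are illustrative assumptions only.

```python
import math

def avg_per_period(samples, period_starts, period, carry_forward=True):
    """samples: list of (timestamp, value) pairs sorted by timestamp."""
    results, last_value = [], None
    for start in period_starts:
        in_window = [v for t, v in samples if start <= t < start + period]
        if in_window:
            last_value = in_window[-1]
            results.append(sum(in_window) / len(in_window))
        elif carry_forward and last_value is not None:
            # option A: a silent period just repeats the last value seen
            results.append(last_value)
        else:
            # option B: a silent period has no defined average
            results.append(math.nan)
    return results

# A non-periodic metric sent only at t=0 and t=95, averaged in 30s periods:
print(avg_per_period([(0, 5.0), (95, 7.0)], [0, 30, 60, 90], 30))
# carry_forward=True  -> [5.0, 5.0, 5.0, 7.0]
# carry_forward=False -> [5.0, nan, nan, 7.0]
```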
15:11:26 <rhochmuth> ok, are we good
15:11:35 <rhochmuth> is it time to move on?
15:11:56 <tomasztrebski> for me it is, I guess that answers how the community feels about it
15:12:10 <rhochmuth> thanks tomasz
15:12:13 <rhochmuth> #topic Anyone else seeing https://bugs.launchpad.net/monasca/+bug/1559165?
15:12:14 <openstack> Launchpad bug 1559165 in Monasca "monasca-api log flooded with jetty parser errors (java api)" [Undecided,New]
15:12:16 <bklei> that's me
15:12:26 <rhochmuth> stop bringing bugs
15:12:30 <tomasztrebski> just a quick question => build from stable/mitaka ?
15:12:35 <bklei> since we updated to the latest monasca-api (java) -- logs flooded with
15:12:36 <slogan> probably user error
15:12:37 <bklei> org.eclipse.jetty.http.HttpParser: Parsing Exception: java.lang.IllegalStateException: too much data after closed for HttpChannelOverHttp@6f19e5fb{r=1,a=IDLE,uri=-}
15:12:39 <bklei> :)
15:12:39 * slogan runs
15:12:58 <rhochmuth> lost my train of thought, that was a joke
15:13:06 <bklei> we hadn't brought in an api since 12/10, but i've narrowed it to a change between 12/10 and 1/19
15:13:14 <bklei> not that it narrows it down much
15:13:18 <rhochmuth> still come back to your question, tomasz
15:13:29 <bklei> i can keep digging -- but is helion seeing this?
15:13:37 <bklei> or anyone else using latest monasca-api?
15:13:39 <rhochmuth> bklei: we are not aware of any issues, and have been doing a lot of testing
15:13:41 <bklei> (java)
15:13:52 <rhochmuth> we are using latest java monasca-api in helion
15:13:52 <tomasztrebski> we are in the middle of updating monasca-* to stable/mitaka
15:13:53 <bklei> i'm talking FLOODING the logs
15:13:56 <bklei> but api works
15:14:00 <rhochmuth> and are getting ready for a release
15:14:07 <rhochmuth> so have been doing a lot of testing
15:14:17 <tomasztrebski> so I'd have to ask Artur tomorrow about that, so far nothing like that has been spotted
15:14:29 <bklei> ok, maybe we're special at TWC, will keep digging
15:14:35 <rhochmuth> i'll check with our testers, but they would have told us if they saw something
15:14:40 <bklei> k, thx
15:14:46 <rhochmuth> you guys are always special IMHO
15:14:52 <bklei> short bus
15:15:08 <slogan> might be interesting to see if the exception handling causes a retry or is just informative
15:15:25 <rhochmuth> i'll also update my devstack env with a java build to see what happens
15:15:27 <slogan> you say things still work, no apparent loss of data, functionality?
15:15:31 <bklei> from what i read, it could be a bad client call -- sending extra stuff
15:15:38 <bklei> but the old monasca-api didn't care
15:15:49 <bklei> exactly -- seems that it works fine
15:16:07 <tomasztrebski> maybe something that pushes data to monasca was changed in a way that's causing the problem
15:16:15 <rhochmuth> that is a possibility
15:16:21 <bklei> could be, and the new api is just more sensitive
15:16:23 <rhochmuth> we continue to improve the input validation
15:16:30 <tomasztrebski> after all monasca-api is a server that monasca-agent (mainly or only) sends data to, right
15:16:31 <tomasztrebski> ?
15:16:34 <rhochmuth> although, i would have expected a better error message
15:16:45 <bklei> yeah, and some custom scripts/crons
15:17:03 <rhochmuth> the log that you've supplied doesn't seem very informative
15:17:11 <rhochmuth> which isn't expected
15:17:18 <bklei> that's all that's there -- and lots of it
15:17:36 <bklei> i'll keep digging and report back
15:17:43 <rhochmuth> ok, thanks
15:17:53 <bklei> or let me know if you see/figure it out
15:17:53 <rhochmuth> i'll ask around, and do some test too
15:17:55 <bklei> thx
15:18:14 <slogan> maybe this lends some clues: http://stackoverflow.com/questions/29527803/eliminating-or-understanding-jetty-9s-illegalstateexception-too-much-data-aft
15:19:04 <bklei> yeah, that's the page that talks about a badly written client :)
15:19:10 <slogan> perhaps an issue with closing the http connection, hard to say
15:19:19 <slogan> so what is the client in this case?
15:19:30 <bklei> primarily monasca-agent
15:19:44 <bklei> but also some cron-driven scripts/POSTs
15:19:45 <slogan> ok
15:19:50 <rhochmuth> aha
15:20:20 <bklei> i can shut different clients off in my sandbox, that's a good idea i hadn't thought of, to narrow it down
15:20:24 <slogan> should be isolatable with some tracing I suppose
15:20:32 <slogan> yes
15:20:37 <slogan> divide and conquer
15:20:40 <bklei> :)
15:20:55 <tomasztrebski> LOG_LEVEL=All and go for it ]:->
15:21:14 <bklei> yup
15:21:15 <rhochmuth> that does seem the first place to look, since no one else has seen it
15:21:45 <bklei> ok, i think we can move on
15:21:57 <rhochmuth> ok, thanks bklei
15:22:04 <rhochmuth> tomasz you were asking a question
15:22:22 <rhochmuth> about stable/mitaka
15:22:35 <tomasztrebski> it was to bklei: did the bug start happening on a stable/mitaka build of monasca-api
15:23:12 <bklei> yeah, we brought in a new java api about a week ago, and hadn't done that since 12/10, that's when it began
15:23:21 <rhochmuth> sounds like it was well before stable/mitaka as that was only created last Friday
15:23:48 <bklei> we built it locally with master at 2016-03-17 23:14
15:24:04 <tomasztrebski> yeah, so we are now updating to this branch as well, so I'll ask Artur, who's doing the upgrade, to take a look at the logs for monasca-api
15:24:19 <tomasztrebski> specifically I guess it is requests.log, right ?
15:25:23 <bklei> oh -- in our environment, /var/log/monasca/monasca-api.log
15:26:11 <rhochmuth> #topic Compression support for POSTing metrics?
15:26:27 <rhochmuth> started next topic, while previous one wraps-up
15:26:36 <bklei> did we skip a topic -- api hang?
15:26:48 <rhochmuth> #topic Anyone seeing periodic monasca-api hangs (java)?
15:26:50 <rhochmuth> oops
15:26:53 <tomasztrebski> i guess we did :)
15:26:54 <bklei> we've seen this happen 3 times this week, anyone else see this in testing?
15:27:16 <rhochmuth> no
15:27:19 <tomasztrebski> again...nope :(
15:27:25 <bklei> nothing in the logs, process unresponsive, doesn't respond to thread dump, just dead
15:27:54 <bklei> we're in the process of trying to get better debug msgs by leaving verbose logging on, hope to catch it
15:28:05 <bklei> just wondering if anyone else has noticed this
15:28:21 <rhochmuth> we haven't ever hit any problems like this
15:28:34 <bklei> k
15:28:42 <tomasztrebski> and this is also with the master build you mentioned?
15:29:00 <bklei> exactly -- hadn't seen it prior to that upgrade
15:29:15 <rhochmuth> possibly a related problem then
15:29:19 <rhochmuth> to your previous one
15:29:27 <rhochmuth> is memory growing
15:29:34 <rhochmuth> are file handles being used up
15:29:37 <rhochmuth> and not released
15:29:45 <rhochmuth> is vertica and kafka up
15:30:05 <rhochmuth> just throwing out the usual places to look
15:30:08 <bklei> will validate that next time (file descriptors, etc) -- yeah the whole stack was still up and healthy
15:30:53 <bklei> ok, moving on i guess
15:31:14 <rhochmuth> i guess, i don't have any great suggestions
15:31:31 <rhochmuth> we just did a huge amount of scale testing and haven't seen any issues like this occur
15:31:36 <bklei> i hadn't thought of the FD check, that's helpful
15:31:41 <tomasztrebski> just one question: does this hang seem to happen right away, or does it take some time?
15:31:46 <bklei> over time
15:31:52 <tomasztrebski> mhm
15:32:21 <rhochmuth> i'll check around and see if i can come up with more ideas
15:32:25 <bklei> thx
15:32:44 <rhochmuth> we had an issue with kafka using a lot of fd's, but not sure if that would impact you
15:33:08 <slogan> easy way to check that would be to look at /proc/<pid>/fd perhaps
15:33:09 <bklei> hmm, can look there, we haven't updated kafka in a while in our env
15:33:14 <slogan> see if it is growing
15:33:17 <tomasztrebski> well file descriptors are limited per user (soft and hard), so if the process owner reached the limit
15:33:18 <bklei> +1
15:33:27 <tomasztrebski> that would cause problems only for the given application
15:33:34 <tomasztrebski> as far as I know how limits work :D
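For the file-descriptor theory above, a quick check on Linux (a sketch, not part of monasca) is to sample /proc/<pid>/fd over time and compare the count against the process limits; if it keeps climbing toward the soft limit, something is leaking.

```python
import os
import resource

def open_fd_count(pid):
    # needs permission to read that pid's /proc entry (same user or root)
    return len(os.listdir('/proc/%d/fd' % pid))

# limits of the *current* process; for another pid read /proc/<pid>/limits instead
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(open_fd_count(os.getpid()), 'of', soft, 'soft /', hard, 'hard')
```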
15:33:53 <rhochmuth> #topic Compression support for POSTing metrics?
15:34:08 <bklei> this is more of a question -- have we considered this?
15:34:19 <slogan> is this gzip compression of http payloads?
15:34:21 <tomasztrebski> you mean something like gzip and stuff ?
15:34:29 <tomasztrebski> slogan: :P
15:34:42 <bklei> one of the primary consumers of monasca here at TWC asked about zipping up data to POST
15:34:52 <bklei> like zlib
15:35:00 <slogan> I got flustered yesterday by horizon doing this, I was trying to sniff the traffic and see what was being sent... :-(
15:35:42 <rhochmuth> brain lapse on my http
15:35:46 <tomasztrebski> did the customer ask only to zip traffic toward monasca-api? or also the data that is sent via the kafka queue?
15:35:55 <tomasztrebski> because I think that'd also be possible
15:36:13 <bklei> just straight to monasca endpoint
15:36:17 <rhochmuth> but are you asking about just applying a compression encoding
15:36:24 <bklei> yes
15:36:58 <slogan> are we sure the compression/decompression time might not become a bottleneck somehow?
15:37:07 <bklei> not a huge issue/request, just throwing it out there
15:37:08 <rhochmuth> there isn't anything in the monasca-api to accept different encodings
15:37:25 <bklei> that's what i figured, and we've got bigger fish to fry atm
15:37:27 <rhochmuth> so, it could involve code changes
15:37:43 <rhochmuth> not necessarily code, but decorators and annotations
15:37:47 <bklei> yeah
15:38:02 <rhochmuth> i'm assuming dropwizard could handle it automatically
15:38:04 <rhochmuth> if enabled
15:38:11 <rhochmuth> and not sure about falcon
15:38:24 <rhochmuth> so, it could be low-hanging fruit to enable this
15:38:49 <slogan> would this be something you'd want a config knob for, or perhaps something in the API so it could be enabled, disabled by a client?
15:38:57 <bklei> i have to drop and take my kid to the dentist, thx for discussing the topic(s)
15:39:11 <rhochmuth> bye bklei
15:39:23 <tomasztrebski> in falcon I did a quick search and it looks like it should be possible directly in gunicorn (or simply at the WSGI server)
15:39:33 <rhochmuth> that makes sense
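For reference, the client side of such a request would look roughly like the sketch below if the API were taught to accept a gzip Content-Encoding; the endpoint, port and token here are placeholders, and today's monasca-api does not advertise this support, so treat it purely as an illustration.

```python
import gzip
import json
import requests

metrics = [{"name": "cpu.idle_perc",
            "dimensions": {"hostname": "devstack"},
            "timestamp": 1459350000000,
            "value": 97.0}]

body = gzip.compress(json.dumps(metrics).encode('utf-8'))
resp = requests.post(
    "http://localhost:8070/v2.0/metrics",            # placeholder endpoint/port
    data=body,
    headers={"Content-Type": "application/json",
             "Content-Encoding": "gzip",             # the hypothetical opt-in
             "X-Auth-Token": "ADMIN_TOKEN"})         # placeholder token
print(resp.status_code)
```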
15:40:14 <rhochmuth> since bklei is gone
15:40:17 <rhochmuth> #topic Non-periodic/periodic metrics - update
15:40:24 <tomasztrebski> that's me
15:40:55 <tomasztrebski> just wanted to give an update on the latest patch sets, which should allow reviewers to better grasp the concept I am trying to introduce there
15:41:37 <tomasztrebski> so: following roland's and craig's advice, I modified the schema for the SubAlarm table by adding a sporadic field (boolean, false by default)
15:42:10 <tomasztrebski> that handles the case where the topology is restarted somehow and thresh is able to recognize the sub alarms for which the sent metrics were sparse
15:42:18 <tomasztrebski> since most of them will be periodic
15:42:22 <rhochmuth> cool
15:42:28 <tomasztrebski> https://review.openstack.org/#/c/292758
15:42:52 <rhochmuth> i'll start reviewing and testing it out
15:43:14 <tomasztrebski> this change contains code that pulls that information from the metric and saves it in the SubAlarm right before it is persisted (either mysql or ORM)
15:43:28 <tomasztrebski> I've added a comparison of metric definitions between the metric and the sub alarm
15:43:40 <tomasztrebski> but I am not sure if that accomplishes multitenant support
15:43:51 <tomasztrebski> meaning to say, in the metric definition there is no information about the tenant
15:44:20 <tomasztrebski> so I'd still need to figure out how to get that information in order to mark only the specific SubAlarm as sporadic, not all of them for a given Alarm
15:45:01 <tomasztrebski> so that's more or less that for now
15:45:08 <rhochmuth> ok, i'll let craig know what is going on
15:45:18 <rhochmuth> and i'll need to dive in deeper in the code
15:45:26 <rhochmuth> to be able to help out or comment at this point
15:45:32 <tomasztrebski> I will try to put up references to either the mysql schema or a mysql migration script that would add the sporadic field to the existing schema
15:45:37 <tomasztrebski> possibly as a gist
15:46:05 <rhochmuth> yes, i'll need that
15:46:18 <rhochmuth> did you just add a single column or multiple?
15:46:23 <tomasztrebski> a single column
15:46:50 <tomasztrebski> I dug a bit through the tables and it looks like SubAlarm is the best place to put it
15:47:20 <rhochmuth> that sounds right
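Ahead of the promised gist, a hypothetical migration for that single column might look like the sketch below; the table name, column type and credentials are assumptions based on the discussion, not the actual patch.

```python
import pymysql

DDL = ("ALTER TABLE sub_alarm "
       "ADD COLUMN sporadic TINYINT(1) NOT NULL DEFAULT 0")   # assumed name/type

conn = pymysql.connect(host='127.0.0.1', user='monapi',
                       password='password', db='mon')         # placeholder credentials
try:
    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.commit()
finally:
    conn.close()
```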
15:47:47 <tomasztrebski> before that: one functional requirement should also be mentioned (at least I hope I implemented it right)
15:48:01 <tomasztrebski> meaning to say: once an alarm for a sporadic metric enters the OK/ALARM state
15:48:13 <tomasztrebski> it won't ever go back to the UNDETERMINED state
15:48:25 <rhochmuth> that is my understanding too
15:48:44 <tomasztrebski> that's what we all agreed upon :)
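A toy illustration (not the monasca-thresh code) of the agreed rule: a sub-alarm on a sporadic metric that has reached OK or ALARM simply holds that state when no new measurement arrives, while periodic metrics still age out to UNDETERMINED.

```python
def next_state(current_state, new_value, threshold, sporadic):
    """Toy evaluator: ALARM when the latest value exceeds the threshold."""
    if new_value is not None:
        return 'ALARM' if new_value > threshold else 'OK'
    if sporadic and current_state in ('OK', 'ALARM'):
        return current_state        # sporadic: keep the last determined state
    return 'UNDETERMINED'           # periodic with no data: age out as before

print(next_state('OK', None, 90, sporadic=True))    # -> 'OK'
print(next_state('OK', None, 90, sporadic=False))   # -> 'UNDETERMINED'
```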
15:49:32 <rhochmuth> should we move on?
15:49:35 <tomasztrebski> in the meantime there are some useful log outputs (debug or trace) I've added that make it possible to track what's going on with a Metric, Alarm or SubAlarm between the different bolts
15:49:39 <tomasztrebski> yes, that's all
15:49:40 <tomasztrebski> :)
15:49:55 <rhochmuth> ok, thanks tomasz, i'll start getting into it
15:50:01 <rhochmuth> #topic Grafana 2 horizon update https://review.openstack.org/#/c/295983/
15:50:10 <rhochmuth> rbak: u there?
15:50:10 <rbak_> That's me
15:50:34 <rbak_> Just a reminder that this patch exists.  I had one question, but no reviews so far.
15:50:58 <rbak_> I can't really integrate grafana 2 into devstack unless this gets merged.
15:51:30 <rhochmuth> looks like shinya tested it
15:52:15 <rhochmuth> does this work in the old monasca-vagrant environment ?
15:52:23 <rhochmuth> i was wondering how to test?
15:52:44 <rbak_> It defaults to the old grafana, so you'd have to configure it yourself.
15:52:44 <shinya_kwbt> +1
15:52:53 <rbak_> You'd also need to build grafana.
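For anyone who wants to try the patch before it merges, the horizon-side configuration is presumably a single override in local_settings.py; the exact setting name and shape used by monasca-ui may differ from this assumption.

```python
# Assumed local_settings.py snippet -- verify the setting name against the patch.
GRAFANA_URL = {
    'RegionOne': 'http://localhost:3000',   # Grafana 2's default port
}
```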
15:53:05 <rhochmuth> If shinya says it is ready, i'm ready to merge
15:53:10 <rhochmuth> up to you shinya
15:53:22 <rbak_> Works for me.  I just didn't want to merge without input.
15:53:35 <shinya_kwbt> Yes I tested with grafana2
15:54:18 <rhochmuth> sounds like i should +2, unless anyone else wants to take a closer look
15:54:19 <shinya_kwbt> I configured grafana2
15:54:30 <rhochmuth> i would like to see this added to the DevStack env ASAP
15:54:39 <rhochmuth> then i won't need to worry about monasca-vagrant anymore
15:54:47 <rhochmuth> if that works for everyone
15:56:38 <slogan> works for me, I just +1'd it
15:56:39 <shinya_kwbt> +1
15:56:56 <rhochmuth> sorry for the delay, but if anyone else wants to review further, please let me know in the next 5 seconds,
15:57:07 <rhochmuth> else, i'm ready to merge it
15:57:19 <slogan> time's up!
15:57:33 <rhochmuth> ok, it is merged
15:57:36 <rbak_> thanks
15:57:45 <rbak_> I'll get back to devstack today then
15:57:49 <rhochmuth> if you need any help with the devstack, please let me know
15:57:52 <rbak_> will do
15:58:11 <rhochmuth> well, it is almost top of the hour again
15:58:17 <rhochmuth> seems like we could go on again
15:58:23 <rhochmuth> any questions in closing
15:58:27 <tomasztrebski> https://review.openstack.org/#/c/297039/, Roland, if you would take a look at this change that would be great; you added the v3 implementation, which I adjusted in several places, so it would be great if you could check that
15:58:32 <tomasztrebski> mine :D
15:58:32 <Kamil> so grafana2 will be a part of mitaka release, right?
15:58:57 <rhochmuth> grafana 2 won't be part of mitaka release as it isn't an official openstack project
15:59:12 <rhochmuth> but it will work and be integrated into the DevStack plugin shortly
15:59:19 <rhochmuth> tomasz: i'll review your code
15:59:23 <Kamil> ok. thx
15:59:26 <tomasztrebski> roland: thx
15:59:31 <slogan> rhochmuth: just a ping: we need to settle on talks in Austin, or not. Soon :-)
15:59:36 <hosanai> thanks & bye
15:59:47 <rhochmuth> yes slogan
15:59:53 <rhochmuth> i'm still waiting to hear back
16:00:05 <shinya_kwbt> bye
16:00:07 <slogan> from the event about rooms?
16:00:07 <rhochmuth> bye hosanai
16:00:13 <rhochmuth> thanks shinya
16:00:14 <tomasztrebski> cheers and nice day...or evening...depends on time zone you guys are in... (laughing....)
16:00:35 <rhochmuth> #endmeeting