15:00:32 <gordc> #startmeeting telemetry
15:00:33 <openstack> Meeting started Thu Feb 25 15:00:32 2016 UTC and is due to finish in 60 minutes.  The chair is gordc. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:34 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:36 <openstack> The meeting name has been set to 'telemetry'
15:00:38 <ildikov> o/
15:00:51 <r-mibu> o/
15:01:03 <liusheng> o/
15:01:06 <sileht> o/
15:01:10 <ityaptin> o/
15:01:50 <_nadya_> o/
15:02:09 <nijaba> o/
15:02:44 <gordc> ok let's start, i think some people are on PTO
15:02:57 <gordc> #topic recurring: roadmap items (new/old/blockers) https://wiki.openstack.org/wiki/Telemetry/RoadMap
15:03:12 <gordc> we're basically up for time on features for Mitaka
15:03:14 <idegtiarov> o/
15:03:23 <gordc> the items we were tracking last week seem to be ok
15:03:56 <gordc> i'll run through each in the subtopics
15:04:03 <gordc> but any pressing concerns?
15:04:29 <gordc> m-3 is next week so basically all features now will need to be very very small
15:05:18 <gordc> cool. let's move to the projects
15:05:29 <gordc> #topic aodh topics
15:05:44 <gordc> right now we're tracking composite alarms for Mitaka
15:05:48 <gordc> main patch is in
15:05:57 <gordc> we just need approval on api and client
15:06:01 <liusheng> gordc: thanks
15:06:20 <gordc> https://review.openstack.org/#/c/257722/
15:06:44 <gordc> https://review.openstack.org/#/c/284022/
15:07:02 <gordc> if we can get reviews and get that merged that'd be great
15:07:35 * gordc nudges sileht
15:08:03 <gordc> has anyone looked at switching ceilometerclient to aodhclient in heat?
15:08:03 <liusheng> it seems the jenkins has been broken :(
15:08:23 <gordc> liusheng: yeah, it's being fixed
15:08:31 <sileht> liusheng, gordc I will check that one last time
15:08:37 <gordc> we have some time next week to merge as well.
15:09:00 <liusheng> gord, sileht, cool, thanks!
15:09:07 <gordc> i'll try taking a look at porting ceilometerclient to aodhclient in heat... i'm guessing it won't make it for Mitaka though
15:09:27 <liamji> liusheng: the problem we found during our working day is fixed. Now it is the second one :)
15:09:35 <ildikov> gordc: can we get an exception for it?
15:09:47 <gordc> ildikov: in heat?
15:09:57 <ildikov> gordc: yes
15:10:17 <gordc> ildikov: i'll give them a ping and see.
15:10:25 <gordc> i'm hoping it's an easy swap
15:10:45 <ildikov> gordc: me too, this is why I asked
15:11:18 <gordc> #action see if heat will allow aodhclient FFE
15:11:42 <gordc> aside from that, i think Aodh is what it is?
15:11:49 <gordc> r-mibu: did you have success with tempest?
15:12:10 <gordc> i believe pradk mentioned whatever we have now is conflicting with what exists in tempest repo
15:12:14 <r-mibu> you mean running test with plugins?
15:12:23 <gordc> r-mibu: correct
15:12:38 <r-mibu> right, so I'll fix the id of test that may fix the bug
15:12:51 <r-mibu> but didn't check yet
15:13:04 <r-mibu> will do by this week :)
15:13:05 <gordc> r-mibu: do you have time next 2 weeks?
15:13:08 <gordc> ok
15:13:21 <r-mibu> yep, other stuffs done :)
15:13:22 <gordc> i'm going to make tempest stuff FFE since it's only tests.
15:13:31 <gordc> anyone have concerns?
15:13:31 <r-mibu> ok
15:14:01 <r-mibu> docs...
15:14:19 <r-mibu> as you pointed in review
15:14:37 <gordc> r-mibu: ok. let's try to get it working as soon as possible. but i'm ok with cutting m-3 and merging tempest stuff in an rc build unless someone has an issue
15:14:55 <gordc> r-mibu: docs for tempest tests?
15:15:01 <ildikov> I will run a config guide update for Aodh
15:15:04 <r-mibu> docs for aodh
15:15:20 <r-mibu> but, yes, that's not big problem for m-3
15:15:24 <gordc> ildikov: cool cool. thanks for tracking that
15:15:29 <ildikov> I think we will need to look into the Alarming section of the Admin Guide in OS Manuals
15:15:57 <gordc> ildikov: definitely. we can still make changes to docs after m-3 correct?
15:15:58 <ildikov> I'm not sure how much that part is outdated, so prolly pretty much :)
15:16:19 <ildikov> gordc: sure, we have some time during the stabilization period
15:16:26 <gordc> ildikov: awesome
15:16:50 <gordc> we need to do docs for all of aodh, our dev docs are non-existent too
15:16:59 <ildikov> gordc: of course sooner the better, but still we will have a better picture right after m-3 regarding what made it and what to document
15:17:07 <gordc> ildikov: sounds good
15:17:16 <gordc> anything else for aodh?
15:17:31 <llu-laptop> test :(
15:18:01 <llu-laptop> anyone see me? can't read any messages :(
15:18:32 <liusheng> llu-laptop: I can see you :)
15:18:33 <r-mibu> llu-laptop: i can see your message
15:18:33 <ildikov> llu-laptop: I see your messages
15:18:40 <neelashah> I can see your messages llu-laptop
15:18:57 <gordc> and he's gone. lol
15:19:05 <gordc> let's move on for now
15:19:14 <gordc> #topic ceilometer topics
15:19:27 <gordc> we have two items here to get merged
15:19:56 <gordc> ityaptin's patch for minimising nova-api load: https://review.openstack.org/#/c/284322/
15:20:14 <gordc> i think that needs a docimpact since we added a new option
15:20:48 <gordc> and liamji's patch for neutron v2: https://review.openstack.org/#/c/277434/
15:21:24 <gordc> r-mibu: same FFE for tempest in ceilometer
15:21:48 <r-mibu> got it
15:21:57 <neelashah> gordc neutron v2 is failing due to gate issues?
15:22:17 <gordc> we all happy? let's hold off on any other features
15:22:27 <idegtiarov> gordc, what about event transformers FFE?
15:22:29 <gordc> neelashah: yeah, we need to fix a gnocchiclient issue first and we should be ok
15:22:41 <neelashah> gordc - ok, thanks
15:22:59 <gordc> idegtiarov: i have some concerns but we can talk about that now
15:23:22 <gordc> #topic event bracketer transformer
15:23:26 <gordc> #link https://review.openstack.org/#/c/266488/
15:23:30 <idegtiarov> great what is your main concern?
15:23:50 <gordc> i don't understand why latency is an event.
15:24:04 <gordc> and why it needs to be calculated inline/stream
15:24:38 <gordc> also, the code seems to be very inflexible.
15:24:40 <idegtiarov> it is an event that could be published to tosamplenotifier:// and will be stored as a sample
15:25:45 <gordc> idegtiarov: and it has real-time requirement because?
15:26:50 <idegtiarov> gordc,  if you need to get alarm based for example on latency_time event/sample it is
15:26:59 <_nadya_> gordc: quick question about alarms transformer
15:26:59 <_nadya_> gordc: events, sorry
15:26:59 <_nadya_> #link https://review.openstack.org/#/c/266488/5 perhaps we can start with instances only in Mitaka?
15:27:40 <gordc> idegtiarov: but the alarm scenario is handled by the timeout mechanism in aodh no?
15:28:53 <idegtiarov> gordc, as an example we could be interested in why instances are booting longer than 10 minutes and create an alarm for that case
15:29:57 <idegtiarov> the main idea is to have a tool for event transformation and store the result as an event/sample
15:30:05 <_nadya_> can we convert only to samples? I agree that "latency" is mostly about sample
15:30:17 <gordc> shouldn't timeouts be done by event alarm?
15:30:42 <idegtiarov> we will be able to once https://review.openstack.org/#/c/227106/ is merged
15:31:17 <_nadya_> it looks it's ready to be
15:31:21 <r-mibu> i agree gordc - alarming logics can be put in aodh rather than ceilometer itself
15:32:04 <r-mibu> if creation time of instance can be meter, i'm ok
15:32:39 <gordc> _nadya_: yeah. i think it's definitely a measurement. i'm wondering how many measurements it is though.
15:32:44 <idegtiarov> r-mibu, it is not alarming logic it is logic of event transformation that could be used for statistics of booting instances or alarming based on new samples/event
15:32:55 <gordc> we will really only have one latency measurement per resource (you can only ever create once)
15:33:25 <gordc> is this better as a query feature in events api
15:34:29 <gordc> usually that's how most BI tools work. you calculate the data from specific log records
15:34:45 <idegtiarov> much better as for me because it is rather expensive to index event traits and we already have event_type indexing in mongodb so api requests for new events will be pretty fast
15:35:17 <_nadya_> gordc: it is only one for now, right. But we can have latency for many different resources: instances, volumes and so on. In M* we may start with instances only
15:36:38 <gordc> is the 'resource' == host? because it doesn't matter how many different resources you'll have, it still just one entry for each instance/volume/etc... no?
15:37:39 <gordc> for me, i think this functionality is better as post-storage work, i don't really see the real-time requirement of it. that's my main point
15:38:00 <ildikov> gordc: I tend to agree with you on this point
15:38:46 <r-mibu> adding new logic might affect event processing and having data in workers makes it difficult in HA/multi-worker
15:38:52 <_nadya_> It looks so great to have alarm: "look, your instances start to boot more than 10 minutes"
15:39:00 <idegtiarov> it is not only about booting time, but for example instances update
15:39:06 <gordc> so just referencing stacktach and how they were planning to implement alarms, i believe they also do these calculations in post
15:39:59 <gordc> r-mibu: yeah, it definitely complicates stuff having a global cache shared across workers/systems...
15:40:15 <gordc> although notification agent is/should be smart enough to redirect to common queues.
15:40:57 <_nadya_> dunno, we already have "online" mechanism for transformers
15:41:44 <ildikov> _nadya_: the instance booting time issues is more an alert in definition, also I would assume it gets interesting when it happens with all of them not just one
15:41:45 <gordc> _nadya_: so the alarm comment i believe we want to have it handled by Aodh
15:41:54 <gordc> you are defining the rule there already.
15:42:10 <liusheng> IIUC, if we emit measurements on event latency, these measurements are sparse in the timeline; if we have alarms on these measurements, the state of the alarms may always be "insufficient data"
15:42:33 <idegtiarov> ildikov, not only alerts but could be used for statistics of booting vms
15:42:56 <ildikov> idegtiarov: that can be a post operation/query as well
15:43:04 <gordc> ildikov: +
15:43:10 <idegtiarov> not now
15:43:22 <idegtiarov> even not with ceilo api
15:43:24 <gordc> idegtiarov: but it could be, if someone worked on it ;)
15:43:35 <ildikov> gordc: +1 :)
15:44:04 <r-mibu> idegtiarov: i understand your use case, but it sometimes won't work since we cannot make sure that ceilometer receives the set of start and end messages
15:44:08 <_nadya_> gordc: store events in time series storage and have post processing? in Gnocchi?
15:44:22 <gordc> idegtiarov: i had this topic for making events more useful for BI at last summit? i just didn't do anything, so it's kinda my fault (but i won't admit it)
15:45:59 <gordc> r-mibu: right, the potential latency in MQ may cause weird results from real-time pov
15:46:08 <_nadya_> r-mibu: I hear this very often. But actually, doesn't it mean that we cannot provide a reliable billing? We lose notifications about instances, sorry
15:46:18 <idegtiarov> r-mibu when we do not receive the end event it should be an alarm; otherwise we will have the data and could use it if we need it
15:46:47 <gordc> _nadya_: i'm not sure if we need gnocchi specifically. gnocchi i think is continuous measurements over time
15:47:09 <gordc> latency seems to be measurements in set time
15:47:48 <r-mibu> i assume for billing purpose operator will check db record as well, otherwise I will boot many instances on that system :)
15:47:49 <_nadya_> gordc: so...what storage will be used for events? sql?
15:48:40 <ildikov> for billing we usually have some freedom I think regarding messages and it usually does not have to be real-time
15:49:03 <_nadya_> r-mibu: :)
15:49:32 <gordc> _nadya_: existing storage: sql/elasticsearch.
15:49:36 <r-mibu> idegtiarov: yep, and that can be done in aodh + events storage
15:49:49 <_nadya_> let's move on, I see community point here
15:49:55 <idegtiarov> actually doing the same operation on non-indexed data will be a big issue for a big event collection
15:50:01 <gordc> _nadya_: in theory, this should be doable in elasticsearch.
15:50:35 <gordc> _nadya_: i also believe stacktach offers some mechanisms to handle related events (i don't know status of all that though)
15:50:40 <_nadya_> gordc: I don't like it is external, not ceilometer-core. But perhaps I need to think more
15:50:56 <gordc> what's external?
15:51:32 <_nadya_> gordc: that this statistics should be calculated outside ceilometer, in external system
15:51:47 <gordc> idegtiarov: but we index event_type and all the traits...
15:52:37 <idegtiarov> we do not index traits
15:53:15 <gordc> ceilometer gathers data, normalises and transforms. gnocchi does a lot of stuff 'outside ceilometer' but it's still our project
15:54:45 <gordc> idegtiarov: https://github.com/openstack/ceilometer/blob/master/ceilometer/storage/sqlalchemy/models.py#L294
15:55:05 <gordc> i don't understand, it seems index'd. it's a primary key
15:55:39 <gordc> if not, it should be.
15:55:48 <gordc> sileht: you have anything for gnocchi?
15:56:16 * sileht is reading backlog
15:56:17 <idegtiarov> gordc, I mean in mongodb
15:56:32 <gordc> sileht: no backlog. just asking if we can leave gnocchi topics :)
15:57:00 <gordc> idegtiarov: we probably should? or not use mongodb :P
15:57:09 <idegtiarov> :P
15:57:14 <sileht> gordc, oh I have released gnocchiclient 2.2.0 and I will start working on gnocchi dispatcher for batching measurements
15:57:30 <gordc> tbh, it seems like i'm not the only person who has issues so maybe we should punt it for Mitaka
15:57:58 <idegtiarov> o no :(
15:58:12 <sileht> idegtiarov, gordc why not having both indexes, the new one and the old one  ?
15:58:20 <ildikov> gordc: agreed
15:58:26 <gordc> idegtiarov: let's move this to main channel post meeting
15:58:35 <gordc> #topic gnocchi topics
15:58:35 <idegtiarov> k
15:59:00 <gordc> sileht: i had a question, do we want to make the gnocchi dispatcher use new batching support
15:59:15 <sileht> gordc,  why not ?
15:59:19 <gordc> for mitaka?
15:59:31 <gordc> i just want to know if we should track it
15:59:41 <gordc> someone is going to yell soon
15:59:42 <sileht> I'm a bit lost on where we are on the roadmap
15:59:54 <gordc> sileht: :) i'll ask in main channel
16:00:05 <gordc> thanks everyone
16:00:09 <gordc> #endmeeting
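
Note on the event bracketer discussion: the post-storage alternative raised in the meeting (computing latency from already stored start/end events instead of inline in the notification agent) could look roughly like the sketch below. This is only an illustration, not code from Ceilometer or from the patch under review. The event dictionaries, the field names (instance_id, generated) and the function boot_latencies are hypothetical; the event type names follow Nova's compute.instance.create.start and compute.instance.create.end notifications.

```python
from datetime import datetime

# Hypothetical stored events. The dictionary layout is an assumption made
# for this sketch and is not the Ceilometer events API schema.
EVENTS = [
    {"event_type": "compute.instance.create.start",
     "instance_id": "a", "generated": "2016-02-25T15:00:00"},
    {"event_type": "compute.instance.create.end",
     "instance_id": "a", "generated": "2016-02-25T15:12:30"},
    {"event_type": "compute.instance.create.start",
     "instance_id": "b", "generated": "2016-02-25T15:01:00"},
]


def boot_latencies(events, start_type, end_type):
    """Pair start/end events per instance and return (latencies, unmatched).

    latencies maps instance_id to seconds between start and end.
    unmatched lists instance ids with a start event but no end event,
    which is the case r-mibu raised about lost notifications.
    """
    starts = {}
    latencies = {}
    # Sort by timestamp so a late-arriving message cannot be paired out of order.
    for ev in sorted(events, key=lambda e: e["generated"]):
        ts = datetime.strptime(ev["generated"], "%Y-%m-%dT%H:%M:%S")
        rid = ev["instance_id"]
        if ev["event_type"] == start_type:
            starts[rid] = ts
        elif ev["event_type"] == end_type and rid in starts:
            latencies[rid] = (ts - starts.pop(rid)).total_seconds()
    return latencies, sorted(starts)


latencies, unmatched = boot_latencies(
    EVENTS,
    "compute.instance.create.start",
    "compute.instance.create.end",
)
# Instances that took longer than 10 minutes to boot.
slow = [rid for rid, secs in latencies.items() if secs > 600]
```

Because the pairing runs over stored events, no state has to be shared across notification agent workers; the trade-off is that the query needs the event_type and instance identifier to be indexed in the event store.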