08:00:14 #startmeeting vitrage
08:00:19 Meeting started Wed Nov 30 08:00:14 2016 UTC and is due to finish in 60 minutes. The chair is ifat_afek. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:00:20 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
08:00:23 The meeting name has been set to 'vitrage'
08:00:27 Hi everyone :-)
08:00:56 hi
08:01:00 Hello there
08:03:22 hi
08:03:58 #topic plans for Ocata
08:04:16 let’s start by talking about Ocata. I have some suggestions, will be happy to hear yours
08:04:23 hi guys!
08:04:48 Top priority: We should finish the integration with RedHat, so we will have vitrage rpms ready
08:05:21 Hi guys
08:05:33 Integrations with other projects are also very important. Aodh for example - we want to be able to show Vitrage alarms in Aodh
08:05:54 Another integration is with collectd and DPDK
08:06:54 We must complete our commitments to OPNFV Doctor project. We should install Vitrage as part of OPNFV installers (Apex should be the first one), and add Vitrage test scripts to OPNFV testing environment
08:07:37 We should improve Vitrage UI, mainly the entity graph
08:08:06 And support multi-tenancy in the UI. It is partially supported already, but we need to add for example Vitrage menu to Admin
08:08:47 And obviously - we should add more use cases to Vitrage. We can explore existing Zabbix alarms, and think of ways to add deduced alarms and RCA on top
08:09:31 Another important goal is to add a persistent graph database, in favor of HA support and alarm history. As this requires a lot of work and Ocata is a very short cycle, I’m not sure we’ll be able to complete this.
08:09:45 Your comments? other ideas?
08:09:47 ifat_afek: sounds good
08:10:04 Great success!
08:11:25 I think making vitrage more compatible with all these projects is critical, to help vitrage become more embedded in the community. The DPDK is also very good when thinking about OPNFV context.
08:11:46 elisha_r: I agree
08:12:05 elisha_r: +1
08:13:42 multi tenancy is already supported in Vitrage, but
08:14:49 in the horizon at the moment we need to add in the system tab to be able to show all the alarms of all the tenants, and the whole entity_graph of all the tenants
08:15:10 alexey_weyl: right. thanks for the clarification
08:15:53 basically our main goals are to add a persistent graph and history on top of this graph.
08:16:20 what about supporting sensu ?
08:16:26 also to add a database that will store the templates
08:16:36 eyalb: is there a request for that?
08:16:52 redhat guys suggested it
08:17:06 eyalb: ok. I guess adding another datasource is not a lot of work
08:17:12 and also to handle the new vitrage_id that will be more like a UUID and not what we have today
08:17:38 the new vitrage_id is very important indeed
08:18:06 eyalb: the only problem to add such a datasource is to find out how exactly it works, and what we can extract from it
08:19:08 it's a monitoring tool #link https://sensuapp.org/
08:19:28 @danoffek is currently working on Vitrage ID
08:19:55 it is designed to replace nagios and zabbix
08:20:39 from their site: Sensu allows you to reuse monitoring checks and plugins from legacy monitoring tools like Nagios, Icinga, Zabbix, and many more. Sensu was designed from day one as a replacement for an aging Nagios installation, and to this day monitoring plugin compatibility remains as one of Sensu's most compelling features
08:21:07 eyalb: sounds great
08:21:50 let’s consider it then
08:22:05 #topic Status and Updates
08:22:24 I’m working on a bug with the evaluator
08:22:32 #link https://bugs.launchpad.net/vitrage/+bug/1645659
08:22:32 Launchpad bug 1645659 in Vitrage "template actions override one another" [High,In progress] - Assigned to Ifat Afek (ifat-afek)
08:22:41 turns out that if a scenario has both set-state and mark-down actions, the second one “wins” and the changes of the first one are overwritten.
I’m trying to figure out why it happens
08:22:53 That’s it on my side
08:23:10 I will update
08:23:43 I have done some name changes in the constants file.
08:24:04 we have changed the sync_type to datasource_action
08:24:19 sync_mode to graph_action
08:24:38 so it will be more clear and understandable what each one of them means
08:25:08 alexey_weyl: I think it significantly improved the code readability
08:25:30 I am glad :)
08:25:34 I will update
08:25:49 I added support for osc in the vitrage client
08:26:31 I fixed the devstack installation of the vitrage dashboard
08:26:54 and updated the bash completion file of the vitrage client
08:26:55 eyalb: osc is openstack client?
08:26:57 that's all
08:27:00 yes
08:27:08 great
08:27:15 About the persistent graph which is going to be in Ocata, I think that we should use neo4j for that.
08:27:55 Neo4j has a community and many companies use it.
08:27:58 neo4j is probably a good option. we should check what it means to use it in openstack, since I don’t think it is there yet
08:28:17 It also has very nice python plugins
08:28:41 It isn't in OpenStack, and we need to see how we push it in
08:29:02 right, we should make sure there is no license issue
08:29:12 anyone else has updates?
08:29:24 I'll update
08:29:36 I finished the code for getting aodh alarm notifications, and am now adding the test case.
08:29:55 Still have an issue to improve: we may lose notification messages. How to remedy it, synchronize with snapshot periodically or something else? The probability is low. We can improve it after supporting the function
08:30:23 dwj: thanks for the first change, I think it improved the code
08:30:30 as for the second issue - can you elaborate?
08:31:38 for example, if we lost the alarm.creation notification, then receive another notification about the alarm
08:32:06 so you mean we will not know the details of the alarm, if we missed its creation?
08:32:14 we can't get all the alarm info to update the vertex
08:32:20 that’s a problem, I agree
08:32:25 yes
08:32:53 get_all will fix it, but it happens only once in about 10 minutes I think
08:33:35 from your experience, does it happen that notifications get lost? because if we start with get_all, and then listen to all notifications, it should be ok, right?
08:34:27 No, it doesn't happen, but it is still an issue to consider
08:35:33 but if you have never seen it happen, then why consider it? I agree that it can be a bug, but if it is a rare occasion then maybe it will be good enough to fix it by the next get_all after a few minutes
08:37:27 I’m just trying to understand if you have a specific use case you are worried about. You are right for being worried, but I am trying to understand how often we should expect it to happen
08:38:07 ListenerService should synchronize with snapshotService for get_all, then update the cached alarms
08:38:45 you suggest this as a solution?
08:40:44 yes, it can be a solution, or when we can't find the alarm in cache, we can query aodh
08:41:07 ok, I think it may work
08:42:14 getting notification from Aodh is very important, thanks for taking care of it :-)
08:42:39 no problem. :)
08:43:50 any other issues?
08:44:16 No
08:44:39 anyone else would like to update?
08:46:36 goodbye then, see you next week
08:47:02 #endmeeting
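
Editor's note: the two remedies proposed between 08:38:07 and 08:40:44 (refresh the cached alarms on every get_all, and query Aodh when a notification refers to an alarm that is missing from the cache) could be sketched roughly as below. This is only an illustration under stated assumptions: AlarmCache, fetch_one, fetch_all and the alarm fields are hypothetical names, not Vitrage or Aodh APIs, and a plain dictionary stands in for the alarm service.

```python
from typing import Callable, Dict, Iterable, Optional

Alarm = Dict[str, str]


class AlarmCache:
    """Hypothetical cache of alarms keyed by alarm id (not Vitrage code)."""

    def __init__(
        self,
        fetch_one: Callable[[str], Optional[Alarm]],
        fetch_all: Callable[[], Iterable[Alarm]],
    ) -> None:
        self._fetch_one = fetch_one  # assumed: query one alarm from the service
        self._fetch_all = fetch_all  # assumed: the periodic get_all snapshot
        self._alarms: Dict[str, Alarm] = {}

    def get(self, alarm_id: str) -> Optional[Alarm]:
        return self._alarms.get(alarm_id)

    def resync(self) -> None:
        """Replace the cache with a fresh snapshot, repairing any missed events."""
        self._alarms = {a["alarm_id"]: dict(a) for a in self._fetch_all()}

    def on_notification(self, alarm_id: str, changes: Alarm) -> Optional[Alarm]:
        """Apply a change notification, querying the service on a cache miss."""
        alarm = self._alarms.get(alarm_id)
        if alarm is None:
            # The creation notification was never seen: fetch the full alarm.
            fetched = self._fetch_one(alarm_id)
            if fetched is None:
                return None  # unknown alarm; leave it to the next resync
            alarm = dict(fetched)
        alarm.update(changes)
        self._alarms[alarm_id] = alarm
        return alarm


# Usage with an in-memory stand-in for the alarm service.
backend = {"a1": {"alarm_id": "a1", "name": "cpu_high", "state": "ok"}}
cache = AlarmCache(fetch_one=backend.get, fetch_all=lambda: list(backend.values()))

# No creation notification was received for "a1", yet the update still resolves.
updated = cache.on_notification("a1", {"state": "alarm"})
print(updated)
```

The fallback query covers the gap until the next snapshot, while resync() bounds how long a missed notification can leave the cache stale, which matches the "about 10 minutes" get_all interval mentioned at 08:32:53.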