09:01:09 <ifat_afek> #startmeeting vitrage
09:01:10 <openstack> Meeting started Wed Jan  6 09:01:09 2016 UTC and is due to finish in 60 minutes.  The chair is ifat_afek. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:01:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:01:14 <openstack> The meeting name has been set to 'vitrage'
09:01:18 <alexey_weyl> Hello There :)
09:01:23 <eyalb> hello
09:01:27 <ifat_afek> Hi everyone, welcome back 
09:01:34 <umargolin> hi
09:02:23 <lhartal> hi all :)
09:02:27 <emalin> hi
09:02:30 <Ohad> hi
09:02:39 <lhartal> long time...
09:02:47 <nadav_yakar> hi
09:03:05 <elisha_r> hi all
09:03:07 <emalin> long time no see
09:06:23 <ifat_afek> Our agenda:
09:06:32 <ifat_afek> * Current status and progress
09:06:38 <ayah> hi
09:06:40 <ifat_afek> * Review action items
09:06:50 <ifat_afek> * Next steps
09:06:57 <ifat_afek> * Open Discussion
09:07:07 <ifat_afek> #topic Current status and progress
09:07:25 <ifat_afek> A short update about Vitrage documentation: Maty checked this issue, and we cannot place our documentation in the official openstack place (http://docs.openstack.org) until we are accepted under the big tent.
09:07:37 <ifat_afek> I suggest that for now we add our detailed design diagrams in Vitrage main page
09:07:58 <ifat_afek> Update on what I did: I started working on Nagios plugin for the synchronizer. As a first stage, I’m going to implement the get_all for nagios services (tests).
09:08:01 <inbar_stolberg> hello
09:08:16 <ifat_afek> For notifications, we have decided not to register to Nagios event handlers, as it raises security issues (how will Nagios call vitrage). Instead, we will take Nagios snapshots periodically and compare them.
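The periodic snapshot comparison described above could be sketched roughly as follows. This is a minimal illustration, not the Vitrage implementation: the snapshot shape (a dict keyed by host and service name), the function name `diff_snapshots`, and the event fields are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical snapshot shape: (host, service) -> status string, e.g. "OK" or "CRITICAL".
Snapshot = Dict[Tuple[str, str], str]


def diff_snapshots(previous: Snapshot, current: Snapshot) -> List[dict]:
    """Return one change event per service that appeared, changed status, or vanished."""
    events = []
    # Walk the union of keys so that added and removed services are both detected.
    for key in sorted(set(previous) | set(current)):
        old: Optional[str] = previous.get(key)
        new: Optional[str] = current.get(key)
        if old == new:
            continue  # status unchanged, nothing to report
        host, service = key
        events.append({"host": host, "service": service, "old": old, "new": new})
    return events


# Two made-up snapshots taken one polling interval apart.
before = {("host-1", "CPU load"): "OK", ("host-1", "Disk"): "OK"}
after = {("host-1", "CPU load"): "CRITICAL", ("host-2", "Ping"): "OK"}
changes = diff_snapshots(before, after)
```

Here a `None` value for `old` or `new` marks a service that was added or removed between the two snapshots, so the caller needs no inbound call from Nagios to learn about changes.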
09:08:34 <ifat_afek> I was also involved in the discussions on the consistency process (with alexey_weyl, elisha_r and Asi), and plan to document the use cases and challenges. We should continue with the design this week.
09:08:57 <ifat_afek> We worked on the first integration of the synchronizer, processor, graph, api and UI.
09:09:04 <ifat_afek> alexey_weyl, can you update?
09:09:10 <alexey_weyl> I would love to
09:09:25 <amir_gur> Hi
09:09:59 <alexey_weyl> I have performed the integration of the synchronizer + processor + transformer. Now it works and runs
09:10:51 <alexey_weyl> In addition, I have created an openstack service for "vitrage-graph" which runs the synchronizer, entity graph, consistency and api handler oslo services
09:11:17 <alexey_weyl> if you want to run it, you can do: "sudo pip install -e ."
09:11:34 <alexey_weyl> which will install the services, and then you can run "vitrage-graph"
09:11:39 <ifat_afek> cool!
09:12:03 <emalin> very nice
09:12:21 <ifat_afek> Ohad, can you update about our discussions with PinPoint?
09:12:26 <elishar_r> cool!
09:12:43 <Ohad> We had a meeting with PinPoint, an OPNFV project aiming to provide an RCA framework for the NFVI and VIM layers, focusing on network issues.
09:12:56 <Ohad> We found good alignment between the use cases of both projects, covering failures in the physical and virtual layers. PinPoint is working on a gap analysis to find out which information/data exists in different projects or tools, in order to understand the root cause of failures and to define the APIs needed for it.
09:13:32 <elishar_r> @Ohad - can you explain what "aiming" means?
09:13:33 <Ohad> It looks like Vitrage is a perfect match for providing the get physical/virtual topology and mappings APIs, and we will keep working together on this.
09:14:08 <lhartal> @alexey_weyl: cool - let's do next week integration including zones, hosts and instances :)
09:14:52 <alexey_weyl> @lhartal: sounds great :)
09:15:25 <ifat_afek> alexey_weyl: this is great, let's do the full integration next week
09:15:59 <ifat_afek> #action alexey_weyl continue with the integration, including zones, hosts and instances
09:16:01 <Ohad> Elisha: PinPoint is a requirement project
09:16:35 <alexey_weyl> Ok :)
09:17:06 <ifat_afek> Ohad, elisha_r: we are in the process of finalizing vitrage API so we can send the definition to PinPoint, and verify it matches their use cases
09:17:29 <eyalb> I am still working on api
09:17:45 <eyalb> first version was written with a simple filter
09:18:05 <eyalb> next we will use a more complex filter
09:18:22 <eyalb> i did an integration with UI
09:18:55 <eyalb> they are using the client and were able to retrieve a mock graph
09:19:02 <Ohad> Eyalb: once we have a version, please share it with PinPoint
09:19:23 <eyalb> still need to work with dany to call the api handler
09:19:33 <eyalb> ohad sure
09:19:43 <eyalb> that's it
09:19:51 <ifat_afek> eyalb, so once Dani is done, we will have an end-to-end integration?
09:20:10 <eyalb> hopefully yes
09:20:16 <ifat_afek> great
09:20:38 <ifat_afek> nadav_yakar, can you update about the synchronizer status?
09:20:59 <nadav_yakar> we have finalized the synchronizer design which includes hosts, zones and instances snapshotting and notifications propagation. I have checked in the synchronizer's plugin execution framework and worked with Alexey to integrate it with the vitrage graph
09:22:06 <inbar_stolberg> get_all for host and zone are ready
09:22:40 <nadav_yakar> yes, the instances snapshotting process is also checked in
09:22:57 <ifat_afek> great
09:23:24 <ifat_afek> who else wants to update?
09:24:52 <elishar_r> I've started compiling information on how Vitrage will work with Neo4J or Titan (or any other persistent GraphDB) that can replace NetworkX.
09:25:05 <emalin> I did a little research about oslo.service and its multi-thread support
09:26:07 <emalin> It seems that we can use one process with multiple greenlet threads while working with networkx
09:26:33 <emalin> And multiple processes while working with Neo4j
09:26:51 <emalin> or another graph db that supports access from multiple processes
09:27:24 <alexey_weyl> Sounds great! good solution!
09:27:29 <idan_hefetz> my update: currently working to implement the Get Topology query api over NetworkX, so we can request a filtered subgraph of the entity graph.
09:27:43 <ifat_afek> emalin, so the design for networkx is finished for now?
09:28:26 <emalin> ifat_afek: if think is finished
09:28:34 <ifat_afek> great
09:28:42 <emalin> I think it's finished
09:29:01 <ifat_afek> any other updates? if not, let's move on
09:29:28 <ifat_afek> #topic Review action items
09:29:39 <ifat_afek> • ifat_afek check Aodh integration workaround and update Ceilometer blueprints
09:29:50 <ifat_afek> I sent an email to Aodh mailing list, and specifically to Julien and Ryota. Got no reply, could be because of the holidays. Will try again in a week or two.
09:30:12 <ifat_afek> I also emailed Aodh and asked them not to remove the ability to send notifications about alarm status changes (they planned to remove it), because we want to register to these notifications.
09:30:26 <ifat_afek> • nadav_yakar checkin a basic synchronizer FW for the vitrage graph to interface with and see that we are on the same page
09:30:40 <nadav_yakar> done
09:30:44 <ifat_afek> • ifat_afek check how we should add vitrage documentation
09:30:47 <gsagie> i have a question, NetworkX is persistent or it has a way to keep the graph in RAM?
09:30:55 <gsagie> sorry for stepping in :)
09:31:13 <nadav_yakar> in memory only
09:31:19 <gsagie> cool, thanks
09:31:55 <gsagie> looks like an interesting project
09:32:03 <elishar_r> Let me expand a bit on NetworkX
09:32:53 <elishar_r> We started with NetworkX b/c it's pure python (and the only significant python graph DB project we could find).
09:33:06 <elishar_r> It's in-memory
09:33:51 <elishar_r> However, for good performance we are already working now on a design that will allow using a persistent graph DB, such as Neo4J or Titan, instead.
09:34:52 <elishar_r> Already now, in our design, we use an interface called "Graph Driver" that will remain the same even when we replace the graph DB backend in the future.
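A backend-agnostic "Graph Driver" of the kind mentioned above might look roughly like the sketch below. Everything in it is hypothetical: the method names, the `InMemoryDriver` class, and the vertex properties are illustrative only and do not reflect Vitrage's actual interface. A plain dict-based backend stands in for NetworkX so that the example has no third-party dependencies.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional, Tuple

Predicate = Callable[[Dict[str, str]], bool]


class GraphDriver(ABC):
    """Hypothetical interface that callers depend on, whatever the backend."""

    @abstractmethod
    def add_vertex(self, vertex_id: str, **properties: str) -> None: ...

    @abstractmethod
    def add_edge(self, source_id: str, target_id: str, label: str) -> None: ...

    @abstractmethod
    def get_vertices(self, predicate: Optional[Predicate] = None) -> List[str]: ...


class InMemoryDriver(GraphDriver):
    """Illustrative in-memory backend; a persistent one would implement the same methods."""

    def __init__(self) -> None:
        self._vertices: Dict[str, Dict[str, str]] = {}
        self._edges: List[Tuple[str, str, str]] = []

    def add_vertex(self, vertex_id: str, **properties: str) -> None:
        self._vertices[vertex_id] = dict(properties)

    def add_edge(self, source_id: str, target_id: str, label: str) -> None:
        # Refuse edges whose endpoints were never added.
        for vertex_id in (source_id, target_id):
            if vertex_id not in self._vertices:
                raise KeyError(vertex_id)
        self._edges.append((source_id, target_id, label))

    def get_vertices(self, predicate: Optional[Predicate] = None) -> List[str]:
        # Return the sorted ids of all vertices whose properties satisfy the predicate.
        return sorted(
            vertex_id
            for vertex_id, props in self._vertices.items()
            if predicate is None or predicate(props)
        )


driver: GraphDriver = InMemoryDriver()
driver.add_vertex("zone-1", type="zone")
driver.add_vertex("host-1", type="host")
driver.add_vertex("vm-1", type="instance")
driver.add_edge("zone-1", "host-1", "contains")
driver.add_edge("host-1", "vm-1", "contains")
hosts = driver.get_vertices(lambda props: props.get("type") == "host")
```

Because callers only hold a `GraphDriver` reference, swapping the in-memory backend for a persistent one would not change the calling code.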
09:35:32 <gsagie> elishar_r : why is there a performance problem? i would assume that in-memory should be faster than a persistent one in general
09:35:42 <gsagie> or the package itself (NetworkX) is not so good?
09:36:34 <elishar_r> NetworkX itself has reasonable performance, as-is. however, there are a few other performance issues we want to address.
09:36:58 <elishar_r> first of all, it's not persistent. that means that if Vitrage fails, we need to rebuild the DB when it goes back online
09:37:40 <elishar_r> second of all, in pure python there is little support for real multi-threading, while when using a persistent DB like Neo4J we can access it in parallel from different sources.
09:38:26 <gsagie> elisha_r: thanks for explaining, it makes sense
09:38:45 <elishar_r> finally, doing things in-memory means we cannot leverage distribution, and limits the graph size as well. Those are the main points
09:38:53 <elishar_r> sure :)
09:39:18 <emalin> But networkx is very useful for dev env
09:39:30 <emalin> you don't need to install any 3rd party DB
09:40:00 <ifat_afek> BTW, we found out that networkx is already in use in openstack (by TaskFlow project if I'm not mistaken)
09:40:41 <emalin> And networkx is really fast
09:41:10 <ifat_afek> ok, let's go back to the action items
09:41:12 <ifat_afek> • ifat_afek check how we should add vitrage documentation
09:41:16 <ifat_afek> done, already discussed it
09:41:31 <ifat_afek> • decide on Vitrage next use cases
09:43:09 <ifat_afek> We talked about the second use case. It will include the evaluator for RCA purposes, nagios synchronizer (only snapshots), and physical resources synchronizer
09:43:35 <ifat_afek> #topic Next Steps
09:43:42 <ifat_afek> so we already discussed the integration
09:43:54 <ifat_afek> and the second use case
09:43:58 <ifat_afek> anything else?
09:44:48 <nadav_yakar> I want to adapt our synchronizer framework
09:44:53 <emalin> We hope to start working on a message bus listener plugin
09:44:54 <ifat_afek> #action finalize get topology API
09:44:56 <emalin> for nova
09:45:35 <nadav_yakar> adapt our framework per our design changes and oslo conventions
09:45:45 <ifat_afek> #action ifat_afek update the documentation on vitrage main page with our latest design diagrams (of vitrage graph and the synchronizer)
09:46:13 <ifat_afek> #topic Open Discussion
09:46:26 <ifat_afek> I had a look yesterday at Telemetry and Monasca IRC meeting logs, to see if they are doing anything that interests us.
09:46:48 <ifat_afek> Monasca started working on a cassandra time-series DB. This is not related directly to Vitrage, but if they introduce cassandra to Openstack and handle cassandra installation, it might help us in our future “real” graph-database implementation.
09:47:00 <ifat_afek> #link https://blueprints.launchpad.net/monasca/+spec/monasca-cassandra
09:47:19 <ifat_afek> As for Ceilometer, I noticed two interesting issues:
09:47:40 <ifat_afek> They want to improve their alarm rules. They will define complex conditions of and/or over several threshold conditions.
09:47:51 <ifat_afek> #link https://blueprints.launchpad.net/ceilometer/+spec/composite-threshold-rule-alarm
09:48:07 <ifat_afek> one of their future targets (i.e. not for mitaka?) is application level monitoring
09:48:17 <ifat_afek> #link https://wiki.openstack.org/wiki/Telemetry/RoadMap
09:48:27 <ifat_afek> anything else?
09:51:29 <lhartal> we are going to present Vitrage first demo next week
09:51:37 <lhartal> We're planning to display the first use case: Vitrage show topology
09:52:12 <lhartal> including zones, hosts and instances
09:54:01 <gsagie> cool, is this going to be recorded?
09:54:07 <lhartal> yes
09:54:11 <gsagie> great
09:55:02 <lhartal> we will put a link in Vitrage website
09:55:38 <ifat_afek> cool! we will also email it to openstack dev list
09:56:15 <lhartal> #action: presenting first Vitrage demo
09:56:34 <alexey_weyl> Thumbs up!
09:56:34 <ifat_afek> an update on behalf of Marina: she prepared a dev stack that we can use for our tempest tests
09:57:33 <ifat_afek> see you next week then
09:57:54 <gsagie> cya!
09:57:57 <eyalb> bye
09:58:05 <alexey_weyl> bye bye :)
09:58:09 <elishar_r> bye
09:58:12 <lhartal> bye
09:58:17 <amir_gur> bye
09:58:32 <ifat_afek> #endmeeting