Several sources of data consist of events representing relationships between entities, like user interactions in a social network, clicks on pages linking to each other, purchases of products on web stores, etc. These streams of real-time data can be represented as dynamic graphs, where each event adds or updates an edge in the graph.

Processing dynamic graphs is a challenging task that requires sophisticated state management, snapshotting mechanisms, and incremental graph algorithms. Luckily, several graph computations, like graph statistics, aggregates, and graph sketches, as well as more complex algorithms like connected components and bipartiteness detection, can be computed in a single-pass fashion. Single-pass algorithms process each edge once and do not need to store or access the complete graph state.
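To make the single-pass idea concrete, here is a minimal sketch (not taken from any particular system; all names are illustrative) of connected components computed over an edge stream with a union-find structure: each edge is processed exactly once, state is proportional to the number of vertices, and the full graph is never stored.

```python
# Single-pass connected components over an edge stream using union-find.
# State is O(vertices); edges are consumed once and never revisited.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Walk to the root, then flatten the chain (path compression).
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def count_components(edge_stream):
    uf = UnionFind()
    for src, dst in edge_stream:   # each edge seen exactly once
        uf.union(src, dst)
    return len({uf.find(v) for v in uf.parent})

edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(count_components(edges))  # 2 components: {a, b, c} and {d, e}
```

The same pattern (constant work per edge against compact state) underlies streaming computation of degree statistics, aggregates, and graph sketches.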

We want to present the openCypher project, whose purpose is to make Cypher available to everyone – every data store, every tooling provider, every application developer. openCypher is a continual work in progress. Over the next few months, we will move more and more of the language artifacts over to GitHub to make them available to everyone.

openCypher is an open source project that delivers four key artifacts released under a permissive license: (i) the Cypher reference documentation, (ii) a technology compatibility kit (TCK), (iii) a reference implementation (a fully functional implementation of key parts of the stack needed to support Cypher inside a data platform or tool), and (iv) the Cypher language specification.

We also seek to make the process of specifying and evolving the Cypher query language as open as possible, and we actively welcome comments and suggestions on how to improve the language.

The purpose of this talk is to provide more details regarding the above-mentioned aspects.

Bruno Latour wrote a book about philosophy (An Inquiry into Modes of Existence). He decided that the paper book was no place for the numerous footnotes, documentation, or glossary, instead giving access to all of this surrounding information through a web application that presents itself as a reading companion. He also invited the community of readers to contribute to his inquiry by writing new documents to be added to the platform.

The first version of our web application was built on PHP Yii and MySQL on the server side. This soon proved to be a nightmare to maintain because of the ultra-relational nature of our data. We refactored it completely to use Node.js and Neo4j. We went from a tree structure with internal links modeled inside a relational database to a graph of paragraphs included in documents, subchapters, etc., all sharing links between them. On the way, we learned Neo4j thoroughly, from graph data modeling to Cypher tricks, and developed our own graphical Cypher query monitor using sigma.js in order to check the consistency of our remodeled data.

During this journey, we stumbled upon data model questions: ordered links, the need to group sub-items, data output constraints from Neo4j, and finally the limitations of the Neo4j Community Edition. In the end, we feel much more comfortable as developers in our new system. Reasoning about our data has become much easier and, moreover, our users are also happier, since the platform's performance has never been better.

This talk will cover:

- our application's data needs
- our shift from a MySQL data model to a Neo4j graph model
- our feedback on using a graph database, and more precisely Neo4j, including our custom admin tool [Agent Smith](https://github.com/Yomguithereal/agent-smith)
- a very quick description of the admin tools we built to let the researchers write or modify content (a markdown web editor)

This research has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) / ERC Grant ‘IDEAS’ 2010 n° 269567.

I'll introduce differential dataflow, an open-source analytics platform, and describe how it enables fundamentally new approaches to large-scale graph processing. Specifically, we'll see how to write and run standard graph analyses fairly easily, with output results that are automatically updated as their inputs change. On billion-edge graphs this approach can both be more efficient than platforms like GraphX and provide sub-second update times.
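A toy illustration of the core idea (in plain Python, not differential dataflow itself, whose actual API is in Rust): a derived result, here per-node out-degree counts, is kept up to date from input deltas rather than recomputed from scratch. Insertions and retractions are both just weighted changes.

```python
# Incremental maintenance of a derived result from input deltas.
# weight +1 = edge inserted, weight -1 = edge retracted.
from collections import defaultdict

class IncrementalDegree:
    def __init__(self):
        self.degree = defaultdict(int)  # derived result, always current

    def apply(self, edge, weight):
        src, _dst = edge
        self.degree[src] += weight
        if self.degree[src] == 0:       # drop nodes whose count cancels out
            del self.degree[src]

d = IncrementalDegree()
d.apply(("a", "b"), +1)
d.apply(("a", "c"), +1)
d.apply(("a", "b"), -1)   # retraction: the result adjusts, no recomputation
print(dict(d.degree))     # {'a': 1}
```

Differential dataflow generalizes this delta-propagation idea to arbitrary compositions of operators, including iterative ones, which is what makes incrementally updated graph analyses like connected components possible.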

At Leipzig University, we develop Gradoop [1], a framework for distributed, declarative graph analytics on top of Apache Flink [2]. Gradoop is designed around the so-called Extended Property Graph Model (EPGM) and supports semantically rich, schema-free graph data. In this model, a database consists of multiple property graphs, which we call logical graphs. These graphs are application-specific subsets of shared vertex and edge sets. The EPGM provides operators for both single graphs and collections of graphs. Operators may also return single graphs or graph collections, thus enabling the definition of analytical workflows in a declarative way.
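A minimal Python sketch of the EPGM idea described above: logical graphs are subsets of shared vertex and edge sets, and operators map graphs to graphs, so they compose into workflows. Class and method names here are illustrative, not Gradoop's actual Java API.

```python
# Toy model of EPGM logical graphs: subsets of shared vertex/edge sets,
# closed under graph-to-graph operators.

class LogicalGraph:
    def __init__(self, vertices, edges):
        self.vertices = frozenset(vertices)
        self.edges = frozenset(edges)        # edges as (src, dst) pairs

    def combine(self, other):
        # Binary operator: union of two logical graphs over the
        # shared vertex and edge space.
        return LogicalGraph(self.vertices | other.vertices,
                            self.edges | other.edges)

    def subgraph(self, predicate):
        # Unary operator: keep vertices matching the predicate plus the
        # edges running between the kept vertices.
        kept = {v for v in self.vertices if predicate(v)}
        edges = {(s, t) for (s, t) in self.edges if s in kept and t in kept}
        return LogicalGraph(kept, edges)

g1 = LogicalGraph({"alice", "bob"}, {("alice", "bob")})
g2 = LogicalGraph({"bob", "carol"}, {("bob", "carol")})
g = g1.combine(g2)
print(sorted(g.vertices))  # ['alice', 'bob', 'carol']
```

Because every operator again yields a logical graph (or a collection of them), calls can be chained declaratively, which is the workflow style the EPGM is built for.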

In my talk, I would like to give an overview of Gradoop, the EPGM, and its operators, and show how Apache Flink helps us by presenting a subset of our operator implementations. Furthermore, I will sketch the usefulness of Gradoop by presenting an analytical use case from the business intelligence domain.

Gradoop is open source and licensed under GPLv3. The Gradoop source code and a short documentation can be found on GitHub [3]; a more detailed explanation of the data model and our operators can be found in a technical report [4].

Massive graph data sets are pervasive in contemporary application domains. Hence, graph database systems are becoming increasingly important. In the study of these systems, it is vital that the R&D community has shared benchmarking solutions for the generation of database instances and query workloads having predictable and controllable properties. Similarly to TPC benchmarks for relational databases, benchmarks for graph databases have been important drivers for the Semantic Web and graph data management communities. Current benchmarks, however, are either limited to fixed graphs or graph schemas, or provide limited or no support for generating tailored query workloads to accompany graph instances.

To move the community forward, a benchmarking approach which overcomes these limitations is crucial. In this talk, we present the design and engineering principles of gMark, a domain- and query language-independent open-source graph benchmark addressing these limitations of current solutions. A core contribution of gMark is its ability to target and control the diversity of properties of both the generated graph instances and the generated query workloads coupled to these instances. A further novelty is the support of recursive regular path queries, a fundamental graph query paradigm. We illustrate the flexibility and practical usability of gMark by showcasing the framework's capabilities in generating high quality graphs and workloads, and its ability to encode user-defined schemas across a variety of application domains.
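To illustrate the recursive regular path queries mentioned above, here is a small sketch of evaluating the query `knows+` (one or more `knows` edges) as reachability over a labelled graph. The triple encoding is illustrative; gMark itself generates such graphs and query workloads rather than evaluating them.

```python
# Evaluate the recursive regular path query  label+  from a source node:
# all nodes reachable via one or more edges carrying the given label.
from collections import deque

def eval_plus(graph, label, source):
    """graph is a list of (src, label, dst) triples."""
    adjacency = {}
    for src, lbl, dst in graph:
        if lbl == label:
            adjacency.setdefault(src, []).append(dst)
    seen, queue = set(), deque(adjacency.get(source, []))
    while queue:                          # breadth-first transitive closure
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(adjacency.get(node, []))
    return seen

g = [("a", "knows", "b"), ("b", "knows", "c"), ("c", "likes", "d")]
print(sorted(eval_plus(g, "knows", "a")))  # ['b', 'c']
```

Full regular path queries generalize this to arbitrary regular expressions over edge labels; the closure step above is what makes them recursive and hard to express in benchmarks limited to fixed query shapes.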

This is joint work with Guillaume Bagan, Angela Bonifati, Radu Ciucanu, Aurélien Lemay, and Nicky Advokaat.

With more than 9 million users and 21 million repositories, GitHub is the world's biggest code sharing platform. Its API offers a window onto its public activity of about 600,000 events a day. In this talk, Christophe will present how he transformed this activity into a graph and mapped the network flow between users, communities, programming languages, and code repositories. He will demonstrate how to gain new insights by building interest graphs and recommendation engines on top of this valuable data.