Quite extraordinary release: Neo4j 2.0 drops today

Neo Technologys Rik Van Bruggen explains how Neo4j 2.0 makes complex things easy with key new features and overhauls of the graph database.

This September, Neo4j’s creator Emil Eifrem told JAXenter
that version 2.0 of Neo Technology’s signature graph database
represented the biggest change in the graph database formula
since its launch in 2000.

Although the community has been able to follow
each step of Neo4j 2.0’s development, the final,
ready-to-drive version dropped today after months of waiting.
You can get tinkering right away by downloading it here.

Prior to this release, in the November edition of
the publication, Neo4j’s Rik Van Bruggen gave JAX
Magazine readers a snapshot of some of Neo4j 2.0’s key
features. Before you take a test drive, check out how the graph
database has evolved.

The (r)evolutionary (r)evolution of the
graph database

There have been a number of nice articles written about
graphs, graph databases and, more specifically, Neo4j in the past
couple of months, each one jumping on the hype train and brimming
with ‘revelations’ about the coolness of graph databases in general,
and of Neo4j in particular. In this article, we would like to continue
this love-in by shining a spotlight on a couple of fantastic new
features that are part of Neo4j 2.0 and that, in our humble
opinion, are quite frankly better than sliced bread. Much
better.

Under the hood: evolving from a “property
graph” to a “labeled property graph”

For the longest time, Neo4j has used a very
specific, rich and expressive data model to represent networks and
graphs natively in the database. This makes queries (aka
traversals) much easier to think of, translate into a query
language (Cypher), execute against a running database server, and
then maintain and share with others.

The “property graph” was, and is, a fine data model
for highly connected data, allowing you to store data in a) nodes
(aka vertices), b) relationships (aka edges) and c) properties on
a) and b). Nodes and relationships were always equal citizens, and
will continue to be so.
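To make the model concrete, here is a minimal sketch of a property graph in Python. It is not Neo4j's API: the class names Node and Relationship, the KNOWS relationship type and the sample people are all invented for illustration. It only shows that nodes and relationships are both first-class records that carry properties.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """A vertex that carries its own key/value properties."""
    id: int
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge; it carries properties just like a node does."""
    start: Node
    end: Node
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data: two people and one relationship between them.
alice = Node(1, {"name": "Alice"})
bob = Node(2, {"name": "Bob"})
graph: List[Relationship] = [
    Relationship(alice, bob, "KNOWS", {"since": 2010}),
]

# A tiny traversal: follow outgoing KNOWS relationships from one start node.
friends = [r.end.properties["name"] for r in graph
           if r.start is alice and r.type == "KNOWS"]
print(friends)
```

Note that the relationship holds its own "since" property, which is what makes it an equal citizen rather than a mere pointer between two nodes.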

But there were issues with the data model: it did not
allow for a “meta-model” that would describe the data structures in
the database, so many users had to emulate one
themselves. Everyone was reinventing the wheel, introducing “type
nodes” and “type properties” into the database to achieve
the meta-model goals somehow, which really was not a very nice
solution.

That’s why Neo4j 2.0 introduces the concept of a “node
label” into the data model. This is not just a cosmetic change to a
perfectly fine graph database: it’s a fundamental new data model
concept that allows users to create “subgraphs” within the property
graph.

Labels do many things for you today, but will do even
more for you in the future. Today, Labels:

Provide you with a much simpler data model by doing
away with the need for you to create the meta-structure yourself
(see above).

Allow for a much cleaner, simpler, and guaranteed
indexing mechanism for the data. In the past, indexing of the data
in Neo4j was a bit of a problem: there was never a hard guarantee
that the data in the index would be the same as the data in the
graph. It was left up to the user of the graph to ensure that
consistency. Now, the database takes care of this.

Allow for an even more declarative query language
(Cypher). Previously, a Cypher query always had to begin with a START
clause, which made it very clear that your queries would be “graph local”
or “egocentric”: starting at a starting point and crawling out
from there. But you had to decide where to start. You had to tell
the database how to approach the crawl, which is not how
a declarative query language should work. You are supposed to
declare what you want and then let the database figure out how to
get it. Now, with labels and the new indexing,
you can forget about the START clause. Just define the pattern in
the MATCH clause and be done with it; Neo4j will figure it out
from there. Much more intuitive.
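The three points above can be sketched with a toy in-memory store in Python. This is an illustration under stated assumptions, not Neo4j's implementation: the LabeledGraph class and its methods are hypothetical, as are the legacy index name "people" and the sample data in the comments. The sketch shows labels replacing hand-made type nodes, an index that the store itself keeps in step with every write, and a lookup that only declares what to find.

```python
from collections import defaultdict

# For orientation, the Cypher contrast looks roughly like this
# (the legacy index name "people" is an assumption):
#   before: START p=node:people(name = "Alice") RETURN p
#   2.0:    MATCH (p:Person) WHERE p.name = "Alice" RETURN p


class LabeledGraph:
    def __init__(self):
        self.nodes = {}                 # node id -> (labels, properties)
        self.indexed = set()            # (label, key) pairs with an index
        self.index = defaultdict(set)   # (label, key, value) -> node ids

    def create_index(self, label, key):
        """Declare an index and populate it from existing nodes."""
        self.indexed.add((label, key))
        for node_id, (labels, props) in self.nodes.items():
            if label in labels and key in props:
                self.index[(label, key, props[key])].add(node_id)

    def create_node(self, labels, **props):
        node_id = len(self.nodes)
        self.nodes[node_id] = (set(labels), props)
        # The store updates the index on every write; the caller never does.
        for label in labels:
            for key, value in props.items():
                if (label, key) in self.indexed:
                    self.index[(label, key, value)].add(node_id)
        return node_id

    def match(self, label, key, value):
        """Declare what you want; the store picks index lookup or scan."""
        if (label, key) in self.indexed:
            ids = self.index.get((label, key, value), set())
        else:
            ids = {i for i, (labels, props) in self.nodes.items()
                   if label in labels and props.get(key) == value}
        return sorted(ids)


g = LabeledGraph()
g.create_index("Person", "name")
person = g.create_node(["Person"], name="Alice")
company = g.create_node(["Company"], name="Alice")
print(g.match("Person", "name", "Alice"))
```

The caller of match never says where to start or which index to use. That decision sits inside the store, which is the point the article makes about dropping the START clause.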

But imagine what we could do with labels in the
future. Labels provide us with structure in the graph, and that
structure opens up a number of things that we would like to do at
some point. We could use Labels for things like:

Imposing constraints on the data. We have done a bit
of this in 2.0 – but the possibilities for expansion in this domain
are large.

Taking the knowledge of the graph that Labels give to
us to do much more query optimisation.

Implementing security structures in the graph.

Distributing the graph across multiple machines
(provide sharding) – should that ever be required.
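As an illustration of the first item in this list, here is a minimal Python sketch of a uniqueness constraint scoped to a label. It is a toy model, not Neo4j's implementation: the ConstrainedGraph class, the UniquenessViolation exception and the sample e-mail address are all hypothetical. For reference, the corresponding 2.0 Cypher statement has roughly the shape CREATE CONSTRAINT ON (p:Person) ASSERT p.email IS UNIQUE.

```python
class UniquenessViolation(Exception):
    """Raised when a write would break a declared uniqueness constraint."""


class ConstrainedGraph:
    def __init__(self):
        self.nodes = []        # list of (labels, properties)
        self.unique = set()    # declared (label, key) constraints
        self.taken = set()     # (label, key, value) triples already in use

    def add_unique_constraint(self, label, key):
        # For brevity, this sketch assumes constraints are declared
        # before any data is written.
        self.unique.add((label, key))

    def create_node(self, labels, **props):
        # Collect every constrained value this node would claim.
        claims = [(label, key, props[key])
                  for (label, key) in self.unique
                  if label in labels and key in props]
        for claim in claims:
            if claim in self.taken:
                raise UniquenessViolation("%s.%s = %r already exists" % claim)
        self.taken.update(claims)
        self.nodes.append((set(labels), props))
        return len(self.nodes) - 1


g = ConstrainedGraph()
g.add_unique_constraint("Person", "email")
g.create_node(["Person"], email="alice@example.com")
try:
    g.create_node(["Person"], email="alice@example.com")
    rejected = False
except UniquenessViolation:
    rejected = True
print(rejected, len(g.nodes))
```

Because the constraint is tied to the Person label, a node with a different label can still reuse the same value. Without labels there would be nothing to scope the rule to.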

The long and short of it is that Labels are new, but they are
extremely powerful and make things much simpler for novice graph users.

New Neo4j shell tools – MUCH easier import
capabilities

What do people usually want to do with a database?
Store data in it, right? Well, there used to be a time when
importing your data into Neo4j was complicated. Many people have
voiced concerns about this, and I really believe that Neo4j is finally
addressing them in a great way. Yes, of course, if you’re
already a Java-loving rocket scientist, these new techniques won’t
mean much to you, but to the average mortals out there they will
make a world of difference.

Two things usually complicated the import process:

Do you want to be importing into a running
database?

What kind of scale are we looking at? Thousands or
millions of things?

In each of these cases, the Neo4j-shell-tools allow you
to parametrize the import process, and go from model to reality in
a very reasonable timeframe. All you need to do is learn the Neo4j
shell tools syntax, fire up the Neo4j shell, and get going.
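The idea of a parametrized import can be sketched in a few lines of Python. This is not the Neo4j shell tools syntax, which the article does not show. The function name import_nodes, the batch_size parameter and the sample CSV are assumptions made for illustration. Each CSV row supplies the parameters for one node, and writes are grouped into batches so that the same code copes with thousands or millions of rows.

```python
import csv
import io


def import_nodes(source, label, batch_size=1000):
    """Turn each CSV row into a labelled node, committing in batches."""
    nodes, commits, batch = [], 0, []
    for row in csv.DictReader(source):
        # The header names become property keys; the row fills in values.
        batch.append((label, dict(row)))
        if len(batch) == batch_size:
            nodes.extend(batch)   # stands in for committing a transaction
            commits += 1
            batch = []
    if batch:                     # commit whatever is left over
        nodes.extend(batch)
        commits += 1
    return nodes, commits


# Hypothetical input: in practice this would be an open file.
people_csv = io.StringIO("name,city\nAlice,Antwerp\nBob,London\nCarol,Oslo\n")
nodes, commits = import_nodes(people_csv, "Person", batch_size=2)
print(len(nodes), commits)
```

The batch size is the knob that answers the scale question: a small import can go in as a single commit, while a large one is split so that no single transaction has to hold everything.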

New Neo4j Browser tool – visualisation and
more

And then there is the all-new Neo4j Browser tool that
is part of the 2.0 release. For those of you who are new to the
graph database world, a word of context.

First of all: graph visualisations are important.
Almost every user of the Neo4j graph database uses some kind of
graph visualisation as part of the user interface. Whether they are
using the stock Neo4j webadmin tool, a custom-developed
visualisation solution (using tools like d3.js, vivagraph.js, or
similar) or a commercial product (like Linkurio.us or KeyLines),
the human-navigable nature of graph exploration solutions is
super-interesting, and very different from the traditional “Excel
spreadsheet” approach of interacting with data.

Secondly: ad hoc queries are important. That’s why
Neo4j initiated the development of a declarative,
easy-to-write-but-even-easier-to-read query language called Cypher
in the first place. But query languages require tools to be able to
exploit them in a productive manner: tools that allow you to
experiment, learn, retry and iterate on your queries so that you
can gradually make more sense of your data.

These are exactly the two things that the new
Neo4j Browser delivers: a powerful visualisation solution that
allows for flexible colouring of nodes, relationships and paths,
and a powerful test and development environment for ad hoc Cypher
queries. Very useful.

Conclusion

The new 2.0 release of the world’s leading graph
database has a number of amazing new features that make this
powerful release really worth the wait. From a fundamental
architecture point of view (Labels), an adoption point of view
(easy import) and a development point of view (the new
Browser), great progress has been made that should pave the way for
many great things to come. It’s not a revolution, it’s an
evolution. But it is an evolution that could have revolutionary
consequences for the graph database market. I, for one, can’t
wait.

Rik Van Bruggen is the regional territory manager
for Neo Technology for the BeNeLux, UK, and the Nordics. He has
been working for startup companies for most of his career. He has a
keen interest in technology, and is really passionate about business
and about Belgian beer. Twitter: @rvanbruggen