iansrobinson.com

Archive for June, 2013

This example shows a simple solution for calculating the shortest weighted path between two nodes in a reasonably dense graph (say 30 relationships per node) using Cypher.

Consider the following graph:

Let’s say we want to calculate the shortest weighted path between A and I. All we have to do is sum the weights along each path. Doing this, we find that (A)->(B)->(C)->(I) is weight 15, (A)->(D)->(E)->(I) is weight 12, and (A)->(F)->(G)->(H)->(I) is weight 8. Therefore, (A)->(F)->(G)->(H)->(I) is the shortest weighted path, even though it has one more relationship than either of the other two paths.

The summing of weights is accomplished using Cypher’s reduce function.

For modestly sized graphs with up to 30 relationships per node, this query performs reasonably well. For very large, extremely densely connected graphs, however, it’s not going to cut it: those types of graph require an implementation of a shortest weighted path algorithm such as Dijkstra or A*. Running on a MacBook Air against a graph comprising 1 million nodes, 12.5 million relationships, with between 4 and 20 outgoing CONNECTED_TO relationships per node (average 12), average execution time is 120 ms. Increasing the depth of the search and/or the density of the graph will increase query times.

This example has been tested against Neo4j 1.9. The code for the example can be downloaded from here.

Alternative implementation

Instead of using r in relationships(p) to iterate the relationships along each path, we could bind a path’s CONNECTED_TO relationships to an identifier instead, like this:

This example shows how to add a node to the head of a linked list using Cypher.

Let’s assume we have a graph containing an indexed node that can act as the list’s owner. What we want to do is build up a linked list, with the head of the list attached to the owner node using a HEAD relationship. When a new node is added to the list, it becomes the head node. The HEAD relationship is detached from the old head node and attached to this new head node. At the same time, the new head node is linked to the old head node using a PREV relationship.

The diagram below shows a linked list comprising three nodes. The list is attached to an owner node named ‘my-list’.

To build the linked list shown above, we would execute this Cypher statement three times.

How it works

The START clause looks up the node representing the list’s owner in an index. The CREATE UNIQUE clause then creates a new node and links it to the owner using a HEAD relationship. The {newElement} parameter here allows us to pass a map of the new node’s properties to the query.

Having created the new head node, we then pipe it and the owner node to a second query using WITH.

In this second query we try to match the list owner, the new head node and the old head node (if there is one):

If we find an old head node, we delete its HEAD relationship, because it is no longer valid. We then create a PREV relationship that links the new head node to this old head node:

The MATCH clause here is similar to a pattern matching clause in a functional language: if Cypher can match the (newHead)<-[:HEAD]-(owner)-[oldRel:HEAD]->(oldHead) pattern in the graph, it will invoke the subsequent DELETE and CREATE clauses; otherwise it will terminate without deleting or creating any further relationships.

This example has been tested against Neo4j 1.8.1 and 1.9. The code for the example can be downloaded from here.

The current version of Neo4j has two different types of index: named indexes and automatic indexes. The Cypher query examples throughout the Graph Databases book use named indexes. In this query, for example, venue, city and author are named node indexes:

There’s a problem, however: data created using a Cypher CREATE statement won’t be indexed in a named index. This has led to some confusion for anyone wanting to code along with the examples: if you use the CREATE statements as published, the query examples won’t work.

If you want to code along with the examples, the easiest thing to do is create the data using Cypher and automatic indexes. You’ll need to modify the queries to use the automatic indexes, but the changes aren’t complicated.

Configure Automatic Indexes

The examples below show how to run the Shakespeare examples (pp.47-50) using automatic indexes.

Before you create any data, you’ll need to configure the automatic indexes. The approach you take will depend on whether you are running Neo4j in embedded or server mode.