Presentation Materials and other URLs

Contributing

Have you found a bug, need help or have a patch? Just clone neo4j.rb and
send me a pull request or email me (andreas.ronge at gmail dot com).
Please also check/add issues at lighthouse, neo4j.lighthouseapp.com

Installation

Running all RSpecs

To check that neo4j.rb is working:

cd neo4j # the folder containing the Rakefile
rake # you may have to type jruby -S rake depending how you installed JRuby

Three Minute Tutorial

Neo node space consists of three basic elements: nodes, relationships that
connect nodes and properties attached to both nodes and relationships. All
relationships have a type, for example if the node space represents a
social network, a relationship type could be KNOWS. If a relationship of
the type KNOWS connects two nodes, that probably represents two people that
know each other. A lot of the semantics, the meaning, of a node space is
encoded in the relationship types of the application.
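To make the three elements concrete, here is a toy model in plain Ruby. It
does not use neo4j.rb at all: GraphNode and GraphRelationship are
hypothetical names invented for this sketch, and the 'since' property is
made-up illustration data.

```ruby
# Toy model of a node space: nodes, typed relationships, and
# properties on both. Hypothetical structs, not the neo4j.rb API.
GraphNode = Struct.new(:props)
GraphRelationship = Struct.new(:type, :start_node, :end_node, :props)

andreas = GraphNode.new({ 'name' => 'andreas' })
kalle   = GraphNode.new({ 'name' => 'kalle' })

# A KNOWS relationship that carries its own property
knows = GraphRelationship.new(:KNOWS, andreas, kalle, { 'since' => 2008 })

summary = "#{knows.start_node.props['name']} #{knows.type} #{knows.end_node.props['name']}"
puts summary
```

The relationship type (KNOWS) is what gives the connection its meaning; the
nodes themselves are just property holders.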

Creating Nodes

Transactions

Almost all Neo4j operations must be wrapped in a transaction.
In the following examples we assume that the operations are inside a Neo4j
transaction. Neo4j.rb also supports auto commits, which removes the need to
wrap Neo4j operations in a Neo4j::Transaction. To enable that, require the
file 'neo4j/auto_tx' instead of 'neo4j'.

Example

require 'neo4j/auto_tx'
# no need to wrap operations in Neo4j::Transaction.run do .. end
Node.new

Creating Relationships

Accessing Relationships

Example of getting relationships

node1.relationships.empty? # => false
# The relationships method returns an enumeration of relationship objects.
# The nodes method on the relationships returns the nodes instead.
node1.relationships.nodes.include?(node2) # => true
node1.relationships.first # => the first relationship this node1 has which is between node1 and node2 of any type
node1.relationships.nodes.first # => node2 first node of any relationship type
node2.relationships.incoming(:friends).nodes.first # => node1 first node of relationship type 'friends'
node2.relationships.incoming(:friends).first # => a relationship object between node1 and node2

The Node and NodeMixin

In the next tutorial you will learn how to use this Neo4j::NodeMixin in
your own domain model classes.

Ten Minute Tutorial

Creating a Model

The following example specifies how to map a Neo4j node to a Ruby Person
instance.

require "rubygems"
require "neo4j"

class Person
  include Neo4j::NodeMixin
  # define Neo4j properties
  property :name, :salary
  # define a one-way relationship to any other node
  has_n :friends
  # adds a lucene index on the following properties
  index :name, :salary
  index 'friends.age' # index each friend's age as well
end

Neo properties and relationships are declared using the 'property' and
'has_n'/'has_one' NodeMixin class methods. New types of properties and
relationships can also be added without declaring them, by using the '[]'
operator on Neo4j::NodeMixin and the '<<' operator on the
Neo4j::Relationships::RelationshipTraverser.

By using the NodeMixin and by declaring properties and indices, all
instances of the Person class can now be stored in the Neo4j node space
and be retrieved/queried by traversing the node space or performing Lucene
queries.

A lucene index will be updated when the name or salary property changes.
The age of each friend is also indexed, which means we can query for
people who have friends of a certain age.

Creating a node

Creating a Person node instance

person = Person.new

Setting properties

Setting a property:

person.name = 'kalle'
person.salary = 10000

If a transaction is not specified then the operation will automatically be
wrapped in a transaction.

Dynamic Properties

Notice that it is not required to specify which attributes should be
available on a node. Any attribute can be set using the [] operator.
Declared properties set an expectation, not a requirement. They can be used
for documenting your model objects and catching typos.

Example:

person['an_undefined_property'] = 'hello'

So, why declare properties in the class at all? By declaring a property in
the class, you get the sexy dot notation. But also, if you declare a Lucene
index on the declared property and update the value, then the Lucene index
will automatically be updated. The property declaration is required
before declaring an index on the property.

Dynamic Relationships

Like dynamic properties, relationships do not have to be defined using
has_n or has_one for a class. A relationship can be added at any time on
any node.

Example:

person.relationships.outgoing(:best_friend) << other_node
person.relationship(:best_friend).end_node # => other_node (if there is only one relationship of type 'best_friend' on person)

Finding Nodes and Queries

There are three ways of finding/querying nodes in neo4j:

1. by traversing the graph
2. by using lucene queries
3. using the unique neo4j id (Neo4j::NodeMixin#neo_node_id).

When doing a traversal one starts from a node and traverses one or more
relationships (one or more levels deep). This start node can be either the
reference node, which is always found (Neo4j.ref_node), or a node found
by a lucene query.

Lucene Queries

There are different ways to write lucene queries. Using a hash:

Person.find(:name => 'kalle', :age => 20..30) # find people with name kalle and age between 20 and 30

Deleting a Relationship

If a node is deleted then all its relationships will also be deleted.
Deleting a node is performed by using the delete method:

person.delete

Node Traversals

The has_one and has_n methods create a convenient method for traversals
and managing relationships to other nodes. Example:

Person.has_n :friends # generates the friends instance method
# all instances of Person now have a friends method so that we can do the following
person.friends.each {|n| ... }

Traversing using a filter

person.friends{ salary == 10000 }.each {|n| ...}

Traversing with a specific depth (depth 1 is default)

person.friends{ salary == 10000}.depth(3).each { ... }

There is also a more powerful method for traversing several relationships
at the same time - Neo4j::NodeMixin#traverse, see below.

Example on Relationships

In the first example the friends relationship can have relationships to any
other node of any class. In the next example we specify that the
'acted_in' relationship should use the Ruby classes Actor, Role and
Movie. This is done by using the has_n class method:

class Role
  include Neo4j::RelationshipMixin
  # notice that neo4j relationships can also have properties
  property :name
end

class Actor
  include Neo4j::NodeMixin
  # The following line defines the acted_in relationship
  # using the following classes:
  # Actor[Node] --(Role[Relationship])--> Movie[Node]
  #
  has_n(:acted_in).to(Movie).relationship(Role)
end

class Movie
  include Neo4j::NodeMixin
  property :title
  property :year
  # defines a method for traversing incoming acted_in relationships from Actor
  has_n(:actors).from(Actor, :acted_in)
end

Lucene Document

In lucene everything is a Document. A document can represent anything
textual: a Word document, a DVD (the textual metadata only), or a Neo4j.rb
node. A document is like a record or row in a relational database.

The following example shows how a document can be created by using the
'<<' operator on the Lucene::Index class and found using the
Lucene::Index#find method.

Example of how to write a document and find it:

require 'lucene'
include Lucene
# the var/myindex parameter is either a path where to store the index or
# just a key if index is kept in memory (see below)
index = Index.new('var/myindex')
# add one document (a document is like a record or row in a relational database)
index << {:id=>'1', :name=>'foo'}
# write to the index file
index.commit
# find a document with name foo
# hits is a ruby Enumeration of documents
hits = index.find{name == 'foo'}
# show the id of the first document (document 0) found
# (the document contains all stored fields - see below)
hits[0][:id] # => '1'

Notice that you have to call the commit method in order to update the index
on the disk/RAM. Performing several update and delete operations before
a single commit is much faster than committing after each operation.
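The buffering behaviour can be pictured with a toy index in plain Ruby.
ToyBufferedIndex is a hypothetical class written for this sketch; it is not
the Lucene::Index implementation, and its find takes a field and a value
rather than a block.

```ruby
# Toy index: writes are buffered and only become searchable on commit.
# Hypothetical class, not the Lucene::Index implementation.
class ToyBufferedIndex
  attr_reader :commit_count

  def initialize
    @pending = []      # documents added since the last commit
    @committed = []    # documents visible to find
    @commit_count = 0
  end

  def <<(doc)
    @pending << doc
    self               # allow chaining of <<
  end

  def commit
    @committed.concat(@pending)
    @pending.clear
    @commit_count += 1
  end

  def find(field, value)
    @committed.select { |doc| doc[field] == value }
  end
end

index = ToyBufferedIndex.new
index << { :id => '1', :name => 'foo' } << { :id => '2', :name => 'bar' }
before = index.find(:name, 'foo')   # nothing is committed yet
index.commit                        # one commit covers both documents
after = index.find(:name, 'foo')
```

Here before is empty and after holds the document with id '1', after a
single commit for both writes.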

Keep indexing on disk

By default Neo4j::Lucene keeps indexes in memory. That means that when the
application restarts the index will be gone and you have to reindex
everything again.

Notice that even if it looks like a new Index instance object was created,
index.uncommited may return a non-empty array. This is because
Index.new is a singleton - a new instance object is not created.
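One way such behaviour can arise is a class-level new that hands back a
cached instance per key. The sketch below is plain Ruby under that
assumption; ToyIndex and its uncommitted method are hypothetical and are not
taken from the Lucene::Index source.

```ruby
# Toy keyed singleton: new returns the same object for the same path.
# Hypothetical class, not the Lucene::Index implementation.
class ToyIndex
  @instances = {}

  class << self
    # Reuse the instance registered for this path, or create one.
    def new(path)
      @instances[path] ||= super(path)
    end
  end

  attr_reader :path, :uncommitted

  def initialize(path)
    @path = path
    @uncommitted = []
  end

  def <<(doc)
    @uncommitted << doc
    self
  end
end

first = ToyIndex.new('var/myindex')
first << { :id => '1', :name => 'foo' }

second = ToyIndex.new('var/myindex')  # looks like a fresh index, but is not
same_object = second.equal?(first)
pending = second.uncommitted.size
```

Because second is the very same object as first, the document added through
first shows up as uncommitted on second.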

The Neo4j Module

The Neo4j module is used to map Ruby objects to nodes and relationships in
a network. It supports two different ways of retrieval/querying: traversing
the graph and lucene queries.

Starting and Stopping Neo4j

Unlike the Java Neo4j implementation it is not necessary to start Neo4j
explicitly. It will automatically be started when needed. It also uses a
hook to automatically shut down Neo4j. Neo4j can also be shut down using
the stop method, example:

Neo4j.stop

Neo4j Configuration

Before using Neo4j the location where the database is stored on disk should
be configured. The neo4j configuration is kept in the Neo4j::Config class:

Neo4j::Config[:storage_path] = '/home/neo/neodb'

Lucene Integration

Neo4j.rb uses the Lucene module. That means that the Neo4j::NodeMixin has
methods for both traversal and lucene queries/indexing.

Lucene Configuration

By default lucene indexes are kept in memory. Keeping index in memory will
increase the performance of lucene operations (such as updating the index).

The reindexer extension will, for each created node, create a relationship
from the index node (Neo4j.ref_node.relationships.outgoing(:index_node)) to
that new node. The all method uses these relationships in order to return
nodes of a certain class. The update_index method also uses this all method
in order to update the index for all nodes of a specific class.

Relationship has_n and has_one

Neo relationships are asymmetrical. That means that if A has a relationship
to B then it may not be true that B has a relationship to A.

Relationships can be declared by using the 'has_n' or
'has_one' Neo4j::NodeMixin class methods.

has_n

The has_n Neo4j::NodeMixin class method creates a new instance method that
can be used for both traversing and adding new objects to a specific
relationship type.

For example, let's say that Person can have a relationship to any other
node class with the type 'knows':

class Person
  include Neo4j::NodeMixin
  has_n :knows # will generate a knows method for outgoing relationships
end

The generated knows method will allow you to add new relationships,
example:

By doing this you can add a relationship on either the incoming or
outgoing node. The from method can also take an additional class parameter
if it has incoming nodes from a different node class (see the
Actor-Role-Movie example at the top of this document).

The block { name == 'andreas' } will be evaluated on each node in
the relationship. If the evaluation returns true the node will be included
in the filter search result.

Traversing Nodes

The Neo4j::NodeMixin#traverse method is more powerful than the generated
has_n and has_one methods. Unlike those generated methods it can traverse
several relationship types at the same time. The types of relationships
being traversed must therefore always be specified in the incoming,
outgoing or both method. Those three methods can take one or more
relationship type parameters if more than one type of relationship should
be traversed.

Traversing Nodes of Arbitrary Depth

The depth method allows you to specify how deep the traverse should be. If
not specified only one level traverse is done.

Example:

me.traverse.incoming(:friends).depth(4).each {} # => people with an incoming friends path to me, up to four levels deep
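What a depth limit means for an incoming traversal can be sketched in plain
Ruby over a hash of outgoing friend links. This is only an illustration of
the idea; FRIENDS_OF and incoming_friends are hypothetical names and the
code does not use neo4j.rb.

```ruby
# Toy depth-limited traversal over incoming edges.
# FRIENDS_OF maps a person to the people they consider friends (outgoing).
FRIENDS_OF = {
  'a'  => ['me'],
  'b'  => ['a'],
  'c'  => ['b'],
  'me' => []
}

# Returns everyone who reaches start by following at most
# max_depth friend relationships, nearest first.
def incoming_friends(graph, start, max_depth)
  found = []
  frontier = [start]
  seen = { start => true }
  max_depth.times do
    # people with an outgoing link into the current frontier
    sources = graph.select { |_, targets| (targets & frontier).any? }.keys
    frontier = sources.reject { |name| seen[name] }
    break if frontier.empty?
    frontier.each { |name| seen[name] = true }
    found.concat(frontier)
  end
  found
end

depth_one   = incoming_friends(FRIENDS_OF, 'me', 1)  # => ["a"]
depth_three = incoming_friends(FRIENDS_OF, 'me', 3)  # => ["a", "b", "c"]
```

With depth 1 only the direct friend is returned; raising the depth pulls in
friends of friends as well.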

Traversing Nodes With Several Relationship Types

It is possible to traverse several relationship types at the same time. The
incoming, both and outgoing methods take a list of arguments.

Example, given the following holiday trip domain:

# A location contains a hierarchy of other locations
# Example region (asia) contains countries which contains cities etc...
class Location
  include Neo4j::NodeMixin
  has_n :contains
  has_n :trips
  property :name
  index :name
end

# A Trip can be specific for one global area, such as "see all of sweden" or
# local such as a 'city tour of malmoe'
class Trip
  include Neo4j::NodeMixin
  property :name
end

# create all nodes
# ...
# setup the relationship between all nodes
@europe.contains << @sweden << @denmark
@sweden.contains << @malmoe << @stockholm
@sweden.trips << @sweden_trip
@malmoe.trips << @malmoe_trip
@malmoe.trips << @city_tour
@stockholm.trips << @city_tour # the same city tour is available both in malmoe and stockholm

Then we can traverse both the contains and the trips relationship types.
Example:

Traversing Nodes With a Filter

It's possible to filter which nodes should be returned from the
traverser by using the filter function. This filter function will be
evaluated differently depending on if it takes one argument or no
arguments, see below.

Filtering: Using Evaluation in the Context of the Current Node

If the provided filter function does not take any parameter it will be
evaluated in the context of the current node being traversed. That means
that one can write filter functions like this:

Filtering: Using the TraversalPosition

If the filter method takes one parameter then it will be given an object of
type TraversalPosition which contains information about the current node,
how many nodes have been returned, depth etc.

The information contained in the TraversalPosition can be used in order to
decide if the node should be included in the traversal search result. If
the provided block returns true then the node will be included in the
search result.

The filter function will not be evaluated in the context of the current
node when this parameter is provided.

For example if we only want to return the Trip objects in the example
above:

# notice how the tp (TraversalPosition) parameter is used in order to only
# return nodes included in a 'trips' relationship.
traverser = @sweden.traverse.outgoing(:contains, :trips).filter do |tp|
  tp.last_relationship_traversed.relationship_type == :trips
end
traverser.to_a # => [@sweden_trip]
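The two filter styles differ only in how many parameters the block takes. A
minimal plain-Ruby sketch of how such a dispatch on block arity could work
is shown here. FilterNode, FilterPosition and toy_filter are hypothetical
names for this illustration and are not part of neo4j.rb; the sketch also
fixes the depth at 1 for simplicity.

```ruby
# Toy sketch of arity-based filter dispatch (hypothetical, not neo4j.rb).
FilterNode = Struct.new(:name, :salary)
FilterPosition = Struct.new(:current_node, :depth)

def toy_filter(nodes, &block)
  nodes.select do |node|
    if block.arity == 1
      # one parameter: hand over a position object, block keeps its own self
      block.call(FilterPosition.new(node, 1))
    else
      # no parameter: evaluate the block with the node as self
      node.instance_eval(&block)
    end
  end
end

people = [FilterNode.new('kalle', 10000), FilterNode.new('sune', 20000)]

# evaluated in the context of each node, so salary is node.salary
well_paid = toy_filter(people) { salary > 15000 }

# evaluated with a position object instead
named_kalle = toy_filter(people) { |tp| tp.current_node.name == 'kalle' }
```

The parameterless form reads more naturally for simple property checks,
while the one-parameter form gives access to traversal state such as depth.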

Relationships

A relationship between two nodes can have properties just like a node.

If a Relationship class has not been specified for a relationship then any
properties can be set on the relationship. It has a default relationship
class: Neo4j::DynamicRelation

If you instead want to use your own class for a relationship use the
Neo4j::NodeMixin#has_n.relationship method, example:

class Role
  # This class can be used as the relationship between two nodes
  # since it includes the following mixin
  include Neo4j::RelationshipMixin
  property :name
end

class Actor
  include Neo4j::NodeMixin
  # use the Role class above in the relationship between Actor and Movie
  has_n(:acted_in).to(Movie).relationship(Role)
end

Finding Relationships

The Neo4j::NodeMixin#relationships method can be used to find incoming or
outgoing relationship objects. Example of listing all types of outgoing
(default) relationship objects (of depth one) from the me node.

me.relationships.each {|rel| ... }

If we instead want to list the nodes that those relationships point to
then the nodes method can be used.

me.relationships.nodes.each {|node| ... }

Listing all incoming relationship objects of any relationship type:

me.relationships.incoming.each { ... }

Listing both incoming and outgoing relationship objects of a specific type:

Deleting a relationship

Use the Neo4j::RelationshipMixin#delete method. For example, to delete the
relationship between n1 and n2 from the example above:

n1.relationships.outgoing(:friends)[n2].delete

Finding nodes in a relationship

If you do not want those relationship objects but instead want the nodes,
you can use the 'nodes' method in the Neo4j::RelationshipMixin object.

For example:

n2.relationships.incoming.nodes # => [n1]

Finding outgoing/incoming nodes of a specific relationship type

Let's say we want to find who has my phone number and who considers me a
friend:

# who has my phone numbers
me.relationships.incoming(:phone_numbers).nodes # => people with my phone numbers
# who considers me a friend
me.relationships.incoming(:friends).nodes # => people with a friend relationship to me

Remember that relationships are not symmetrical. Notice there is also
another way of finding nodes, see the Neo4j::NodeMixin#traverse method below.

Transactions

All operations that work with the node space (even read operations) must be
wrapped in a transaction. For example, all get, set and find operations will
start a new transaction if none is already running (for that thread).

If you want to perform a set of operations in a single transaction, use the
Neo4j::Transaction.run method:

Example

Neo4j::Transaction.run {
  node1.foo = "value"
  node2.bar = "hi"
}

There is also an auto commit feature available which is enabled by
requiring 'neo4j/auto_tx' instead of 'neo4j', see the Three Minute
Tutorial above.

Rollback

Neo4j supports rollbacks on transactions. Example:

require 'neo4j'
node = MyNode.new
Neo4j::Transaction.run { |t|
  node.foo = "hej"
  # something failed so we signal for a failure
  t.failure # will cause a rollback, node.foo will not be updated
}
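The effect of signalling failure can be pictured with a toy transaction in
plain Ruby that stages writes and applies them only on success.
ToyTransaction and its set method are hypothetical and exist only for this
sketch; this is not how Neo4j::Transaction is implemented.

```ruby
# Toy transaction: property writes are staged and only applied
# if the block finishes without failure being signalled.
# Hypothetical class, not the Neo4j::Transaction implementation.
class ToyTransaction
  def self.run
    tx = new
    yield tx
    tx.finish
  end

  def initialize
    @staged = []
    @failed = false
  end

  # Record a write instead of applying it immediately.
  def set(target, key, value)
    @staged << [target, key, value]
  end

  def failure
    @failed = true
  end

  def finish
    return false if @failed   # rollback: staged writes are dropped
    @staged.each { |target, key, value| target[key] = value }
    true                      # commit
  end
end

rolled_back = {}
ToyTransaction.run do |t|
  t.set(rolled_back, :foo, 'hej')
  t.failure                   # signal a failure, nothing is written
end

committed = {}
ToyTransaction.run { |t| t.set(committed, :foo, 'hej') }
```

After this runs, rolled_back is still empty while committed holds the new
value.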

Every time a node of type SomeNode (or a subclass) is created, deleted or
updated, the lucene index will be updated.

Reindexing

Sometimes it's necessary to change the index of a class after a lot of
node instances have already been created. To delete an index use the class
method 'remove_index'. To update an index use the class method
'update_index', which will update all already created nodes in the
neo database.

In order to use the update_index method you must include the reindexer
neo4j.rb extension. This extension will keep a relationship to each
created node so that it later can recreate the index by traversing those
relationships.

Updating Lucene Index

The lucene index will be updated after the transaction commits. It is not
possible to query for something that has been created inside the same
transaction as where the query is performed.

Querying (using lucene)

You can declare properties to be indexed by lucene by the index method:

Unmarshalling

The neo module will automatically unmarshal nodes to the correct ruby
class. It does this by reading the classname property and loading that ruby
class with that node.

class Person
  include Neo4j::NodeMixin
  def hello
  end
end
f1 = Person.new {}
# load the node again
f2 = Neo4j.load(f1.neo_node_id)
# f2 will now be a new instance of Person, but will be == f1
f1 == f2 # => true
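The idea of picking the Ruby class from a stored classname property can be
sketched in plain Ruby. TOY_STORE, StoredPerson and toy_load are
hypothetical names for this illustration; the real neo4j.rb internals may
differ.

```ruby
# Toy unmarshalling: the class name is stored as a property and used
# to pick the Ruby class on load. Hypothetical, not neo4j.rb internals.
TOY_STORE = {}

class StoredPerson
  attr_reader :id, :props

  def initialize(id, props = {})
    @id = id
    @props = props.merge('classname' => self.class.name)
    TOY_STORE[id] = @props
  end

  # Two wrappers are equal when they wrap the same stored node.
  def ==(other)
    other.class == self.class && other.id == id
  end
end

def toy_load(id)
  props = TOY_STORE.fetch(id)
  klass = Object.const_get(props['classname'])  # look the class up by name
  klass.new(id, props)
end

f1 = StoredPerson.new(1, { 'name' => 'kalle' })
f2 = toy_load(1)
```

Here f2 is a different Ruby object than f1, but it has the same class and
compares equal because both wrap the same stored node.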

Reference node

There is one node that can always be found - the reference node,
Neo4j::ReferenceNode. Example:

Neo4j.ref_node

This node can have a relationship to the index node (Neo4j::IndexNode),
which has relationships to all created nodes. You can add relationships
from this node to your nodes.

Performance Issues

It is recommended to wrap several Neo4j operations, including read
operations, in a single transaction if possible for better performance.
Updating a lucene index can be slow. A solution to this is to keep the
index in memory instead of on disk.

I'm currently looking at how to scale neo4j.rb by a simple master-slave
cluster by using REST, see the REST extension below.

Extensions: Replication

There is an experimental extension that makes it possible to replicate a
neo4j database to another machine. For an example of how to use it, see
test/replication/test_master.rb and test_slave.rb. It has only been tested
to work with a very simple node space.

Extension: REST

There is a REST extension to Neo4j.rb. It requires the following gems:

* Sinatra >= 0.9.4
* Rack >= 1.0
* json-jruby >= 1.1.6

For RSpec testing it also needs:

* rack-test

For more information see the examples/rest/example.rb or the examples/admin
or Neo4j::RestMixin.

Extension: find_path

Extension which finds the shortest path (in terms of number of links)
between two nodes. Use something like this:

This extension is still rather experimental. The algorithm is based on the
one used in the Neo4j Java IMDB example. For more information see
Neo4j::Relationships::NodeTraverser#path_to or the RSpec find_path_spec.rb.
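For readers unfamiliar with the idea, a shortest path counted in links can
be found with a breadth-first search. The sketch below is plain Ruby over an
adjacency hash; shortest_path and the links data are hypothetical, and this
is a standard textbook technique rather than the extension's actual code.

```ruby
# Breadth-first search for a shortest path, counted in links.
# Plain Ruby over an adjacency hash; not the find_path extension itself.
def shortest_path(graph, from, to)
  previous = { from => nil }   # node => node we arrived from
  queue = [from]
  until queue.empty?
    node = queue.shift
    if node == to
      # walk the previous-links back to the start to build the path
      path = []
      while node
        path.unshift(node)
        node = previous[node]
      end
      return path
    end
    graph.fetch(node, []).each do |neighbour|
      next if previous.key?(neighbour)  # already visited
      previous[neighbour] = node
      queue << neighbour
    end
  end
  nil # no path exists
end

links = { 'a' => ['b', 'c'], 'b' => ['d'], 'c' => ['d'], 'd' => ['e'] }
path = shortest_path(links, 'a', 'e')  # => ["a", "b", "d", "e"]
```

Breadth-first search visits nodes in order of distance from the start, so
the first time the target is reached the path to it has the fewest links.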

Ruby on Rails with Neo4j.rb

Neo4j.rb works nicely with Ruby on Rails (RoR). There are two ways to use
neo4j.rb with rails: embedded or accessing it via REST.