RDF.rb: A Public-Domain RDF Library for Ruby

We have just released version 0.1.0 of RDF.rb, our RDF library for
Ruby. This is the first generally useful release of the library, so here I
will introduce the design philosophy and object model of the library as
well as provide a tutorial on using its core classes.

Once installed, to load up the library in your own Ruby projects you need
only do:

require 'rdf'

The RDF.rb source code repository is hosted on GitHub. You can
obtain a local working copy of the source code as follows:

$ git clone git://github.com/bendiken/rdf.git

The Design Philosophy

The design philosophy for RDF.rb differs somewhat from previous efforts at
RDF libraries for Ruby. Instead of a feature-packed RDF library that
attempts to include everything but the kitchen sink, we have rather aimed
for something like a lowest common denominator with well-defined, finite
requirements.

Thus, RDF.rb is perhaps most quickly described in terms of what it isn't and
what it doesn't have:

RDF.rb does not have any dependencies other than the Addressable gem,
which provides improved URI handling over Ruby's standard library. We
also guarantee that RDF.rb will never add any hard dependencies that would
compromise its use on popular alternative Ruby implementations such as
JRuby.

RDF.rb does not provide any resource-centric, ORM-like abstractions to
hide the essential statement-oriented nature of the API. Such
abstractions may be useful, but they are beyond the scope of RDF.rb
itself.

RDF.rb does not, and will not, include built-in support for any RDF
serialization formats other than N-Triples and N-Quads.
However, it does define a DSL and common API for adding
support for other formats via third-party plugin gems. There presently
exist RDF.rb-compatible RDF::JSON and RDF::TriX gems that add
initial RDF/JSON and TriX support, respectively.

RDF.rb does not, and will not, include built-in support for any particular
persistent RDF storage systems. However, it does define the interfaces
that such storage adapters could be written to. Again, add-on gems are
the way to go, and there already exists an in-the-works RDF.rb-compatible
RDF::Sesame gem that enables using Sesame 2.0 HTTP endpoints
with the repository interface defined by RDF.rb.

RDF.rb does not, and will not, include any built-in RDF Schema or
OWL inference capabilities. There exists an in-the-works
RDF.rb-compatible RDFS gem that is intended to provide a naive
proof-of-concept implementation of a forward-chaining inference
engine for the RDF Schema entailment rules.

RDF.rb does not include any built-in SPARQL functionality per se,
though it will soon provide support for basic graph pattern
(BGP) matching and could thus conceivably be used as the basis for a
SPARQL engine written in Ruby.

RDF.rb does not come with a license statement, but rather with the
stringent hope that you have a nice day. RDF.rb is 100% free and
unencumbered public domain software. You can copy, modify,
use, and hack on it without any restrictions whatsoever.
This means that authors of other RDF libraries for Ruby are perfectly
welcome to steal any of our code, with or without attribution. So, if
some code snippet or file may be of use to you, feel free to copy it and
relicense it under whatever license you have released your own library
with -- no need to include any copyright notices from us (since there are
none), or even to mention us in the credits (we won't mind).

So that's what RDF.rb is not, but perhaps more important is what we want it
to be. There's no reason for simple RDF-based solutions to require enormous
complex libraries, storage engines, significant IDE configuration or XML
pushups. We're hoping to bring RDF to a world of agile programmers and
startups, and to bring existing Linked Data enthusiasts to a platform
that encourages rapid innovation and programmer happiness. And maybe
everyone can have some fun along the way!

It is also our hope that the aforementioned minimalistic design approach and
extremely liberal licensing can help lead to the emergence of a
semi-standard Ruby object model for RDF, that is, a common core class
hierarchy and API that could be largely interoperable between a number of
RDF libraries for Ruby.

With that in mind, let's proceed to have a look at RDF.rb's core object
model.

The Object Model

While RDF.rb is built to take full advantage of Ruby's duck typing and
mixins, it does also define a class hierarchy of RDF objects. If
nothing else, this inheritance tree is useful for case/when matching and
also adheres to the principle of least surprise for developers
hailing from less dynamic programming languages.

The RDF.rb core class hierarchy looks like the following, and will seem
instantly familiar to anyone acquainted with Sesame's object model:

The five core RDF.rb classes, all of them ultimately inheriting from
RDF::Value, are:

RDF::Literal represents plain, language-tagged or datatyped literals.

RDF::URI represents URI references (URLs and URNs).

RDF::Node represents anonymous nodes (also known as blank nodes).

RDF::Statement represents RDF statements (also known as triples).

RDF::Graph represents anonymous or named graphs containing zero or
more statements.

In addition, the two core RDF.rb interfaces (known as mixins in Ruby
parlance) are RDF::Enumerable and RDF::Queryable.

URI references (URLs and URNs) are represented in RDF.rb as instances of the
RDF::URI class, which is based on the excellent
Addressable::URI library.

Creating a URI reference

The RDF::URI constructor is overloaded to take either a URI string
(anything that responds to #to_s, actually) or an options hash of URI
components. This means that the following are two equivalent ways of
constructing the same URI reference:

Blank nodes are represented in RDF.rb as instances of the RDF::Node class.

Creating a blank node with an implicit identifier

The simplest way to create a new blank node is as follows:

bnode = RDF::Node.new

This will create a blank node with an identifier based on the internal Ruby
object ID of the RDF::Node instance. This nicely serves us as a unique
identifier for the duration of the Ruby process:

bnode.id #=> "2158816220"
bnode.to_s #=> "_:2158816220"

Creating a blank node with a UUID identifier

You can also provide an explicit blank node identifier to the RDF::Node
constructor. This is particularly useful when serializing or parsing RDF
data, where you generally need to maintain a mapping of blank node
identifiers to blank node instances.

The constructor argument can be any string or any object that responds to
#to_s. For example, say that you wanted to create a blank node instance
having a globally-unique UUID as its identifier. Here's how you would do
this with the help of the UUID gem:

require 'uuid'
bnode = RDF::Node.new(UUID.generate)

The above is a fairly common use case, so RDF.rb actually provides a
convenience class method for creating UUID-based blank nodes. The following
will use either the UUID or the UUIDTools gem, whichever happens to be
available:

Creating a plain literal

Note, however, that in most RDF.rb interfaces you will not in fact need
to wrap language-agnostic, non-datatyped strings into RDF::Literal
instances; this is done automatically when needed, allowing you the
convenience of, say, passing in a plain old Ruby string as the object value
when constructing an RDF::Statement instance.

Creating a language-tagged literal

To create language-tagged literals, pass in an additional ISO language
code to the :language option of the RDF::Literal
constructor:

The datatype URI can be given as any object that responds to either the
#to_uri method or the #to_s method. For instance, calling the #date method
on the RDF::XSD vocabulary class, which represents the XML Schema datatypes
vocabulary, returns an RDF::URI instance representing the URI for the
xsd:date datatype.

Creating implicitly datatyped literals

You'll be glad to hear that you don't always have to specify a datatype URI
explicitly when creating a datatyped literal. RDF.rb supports a
degree of automatic mapping between Ruby classes and XML Schema datatypes.

In most common cases, you can just pass in the Ruby value to the
RDF::Literal constructor as-is, with the correct XML Schema datatype being
automatically set by RDF.rb:

RDF statements are represented in RDF.rb as instances of the
RDF::Statement class. Statements can be triples -- constituted of a
subject, a predicate, and an object -- or they can be quads that
also have an additional context indicating the named graph that they are
part of.

The subject should be an RDF::Resource, the predicate an RDF::URI, and
the object an RDF::Value. These constraints are not enforced, however,
allowing you to use any duck-typed equivalents as components of statements.

Creating an RDF statement with a context

Pass in a URI reference in an extra :context option to the
RDF::Statement constructor to create a quad:

Since statements can also have an optional context, the following will
return either nil or else an RDF::Resource instance:

statement.context #=> an RDF::Resource or nil

Working directly with triples and quads

Because RDF.rb is duck-typed, you can often directly use a three- or
four-item Ruby array in place of an RDF::Statement instance. This can
sometimes feel less cumbersome than instantiating a statement object, and it
may also save some memory if you need to deal with a very large number of
in-memory RDF statements. We'll see some examples of doing this later
on.

RDF graphs are represented in RDF.rb as instances of the RDF::Graph class.
Note that most of the functionality in this class actually comes from the
RDF::Enumerable and RDF::Queryable mixins, which we'll examine further below.

Creating an anonymous graph

Creating a new unnamed graph works just as you'd expect:

graph = RDF::Graph.new
graph.named? #=> false
graph.to_uri #=> nil

Creating a named graph

To create a named graph, just pass in a blank node or a URI
reference to the RDF::Graph constructor:

RDF::Queryable is a mixin that provides RDF-specific query methods for any
object capable of yielding RDF statements. At present this means simple
subject-predicate-object queries, but extended basic graph pattern matching
will be available in a future release of RDF.rb.

In what follows we will consider RDF::Queryable methods specifically as
used in instances of the RDF::Graph class.

Querying for specific statements

The simplest type of query is one that specifies all statement components,
as in the following:

statements = graph.query([subject, predicate, object])

The result set here would contain either no statements if the query didn't
match (that is, the given statement didn't exist in the graph), or otherwise
just the single matched statement.

The #query method can also take a block, in which case matching statements
are yielded to the block one after another instead of returned as a result
set:

The Mailing List

Coming Up

In upcoming RDF.rb tutorials we will see how to work with existing RDF
vocabularies, how to serialize and parse RDF data using RDF.rb, how to write
an RDF.rb plugin, how to use RDF.rb with Ruby on Rails 3.0, and much
more. Stay tuned!