RDF: Ready for Prime Time

Not long ago, Marc Canter, one of the founders of Macromedia, talked about RDF and
the Semantic Web in his weblog. Specifically, he wrote:

"I've been spending more and more time trying to grok the RDF folks. I have to say
I like
what I see and hear, but what I don't see are many apps and services actually up
and running and working.

...

We have a saying over here: "put up or shut up." I'm still looking for two different
RDF
apps or services to work together in some meaningful way. Then bring on the
books."

Considering that I'm "bringing on a book" on RDF this month, I thought it appropriate
to answer Marc's plea for meaningful, working examples of RDF apps and services,
especially those that work with other RDF-based services. My problem, though, is that
I have only a limited amount of time and space in this article, so I can cover only a
few of them. Still, it's best to just start. But first, a little digression into RDF
and XML.

RDF/XML: The Syntax That Could

You probably know that RDF has both a defined model and a preferred serialization,
RDF/XML. In many ways there's been far less criticism of RDF itself than there has been
of the RDF/XML syntax. Tim Bray, one of the creators of XML, has said:

"Speaking only for myself, I have never actually managed to write down a chunk of
RDF/XML
correctly, even when I had the triples laid out quite clearly in my head. Furthermore,
once
again speaking for myself, I find most existing RDF/XML entirely unreadable. And I
think I
understand the theory reasonably well."

Tim even went so far as to offer his own version of RDF/XML, which he called RPV.

I've found that the more a person works with markup such as XML, the more they dislike
RDF/XML. I've also found that no matter the alternative proposed, someone else will
dislike
it just as much, which makes RDF/XML a bit of a "damned if you do, damned if you don't"
proposition.

Ultimately, if RDF is ready for prime time, then so is RDF/XML. Regardless of our
views of
it, it's official, it's real, and it's here now. So on to the RDF applications, starting
with the basics: the APIs.

RDF APIs

For every programming language you're interested in, there's most likely an RDF API
and a library implementing it. If you're interested in Java, one of the most popular
RDF libraries is Jena, from HP's Semantic Web Research Lab. The current version of Jena
is 1.6.1, which is the one I've used, but there is a beta release of a new version
(Jena2), and it's the one you'll most likely want to investigate. As you'll see later,
Jena is used in several utilities and applications.

For those interested in Python, the most popular RDF library -- which also includes
a
triplestore with several different backends -- is Daniel Krech's RDFLib. Want something a little more unusual? Try Wilbur, a Common Lisp RDF library, written
by Ora Lassila, one of the creators of RDF.

For those who work primarily with Microsoft development environments, there is a C#
RDF
Parser called Drive, which provides an API to parse
RDF/XML into an in-memory RDF graph for manipulation. It's fully compatible with the
.NET
platform, and it can also be used with the open source variant of .NET, Mono.

If Perl is more your thing, there's Ginger Alliance's PerlRDF, a library I've used
in several small applications at my site. Other popular applications, such as Six
Apart's weblogging application, Movable Type, also use it. Six Apart extended the
PerlRDF module by creating a new module, XML::FOAF, which enables autodiscovery and
processing of FOAF files. FOAF, or Friend-of-a-Friend, is an RDF vocabulary for
describing people and their networks of acquaintances, and is now one of the most
popular uses of RDF/XML.

If you want support for multiple programming languages as well as a more sophisticated
framework and data persistence, you'll want to check out Dave Beckett's Redland. In
addition to providing a persistent data store and support for multiple languages
(Python, Perl, Java, Tcl, and Ruby), Redland also provides support for an independent
RDF parser called Raptor. Raptor has been used on its own in other applications,
including several FOAF apps, as well as RDF Gateway, a commercial product I'll discuss
later in this article.

RDF Vocabularies

FOAF is one of the more popular RDF vocabularies. Just a quick perusal of the FOAF web
site will show dozens of uses of FOAF, in tools ranging from FOAFBot, created by Edd
Dumbill and used to provide services within chat forums, to desktop tools within the
OS X environment for managing contacts. My own FOAF file is at
http://weblog.burningbird.net/foaf.rdf, and consists of pointers to friends I know
online, though the list is incomplete.

The beauty of FOAF lies in its simple way of describing personal information, including
our
work and academic affiliations. The power of FOAF lies in its ability to list acquaintances
who themselves may have FOAF files. Over time, this interlinked network can expand
until it's a simple matter of mapping out who is connected, directly or indirectly, to
each other.
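
The "who is connected to whom" idea can be sketched in a few lines of Python. This is only an illustration: the names and the `knows` dictionary are invented, and they stand in for the foaf:knows links that a real tool would gather by reading each person's FOAF file with one of the RDF libraries mentioned earlier.

```python
from collections import deque

# Hypothetical data: each key is a person, and each value lists the people
# named as acquaintances (foaf:knows) in that person's FOAF file.
knows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": [],
    "dave": ["erin"],
    "erin": [],
    "frank": ["alice"],
}


def connections(start, graph):
    """Return {person: number of hops} for everyone reachable from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for friend in graph.get(person, []):
            if friend not in distance:  # first visit is the shortest path
                distance[friend] = distance[person] + 1
                queue.append(friend)
    del distance[start]  # the starting person is not their own connection
    return distance


if __name__ == "__main__":
    for person, hops in sorted(connections("alice", knows).items()):
        kind = "direct" if hops == 1 else "indirect"
        print(person, hops, kind)
```

A breadth-first walk is used here because it finds the shortest chain of acquaintances first, which is usually what "degrees of separation" means. Note that the links are followed in one direction only, so a person who lists you is not reached unless you also list them.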

Another RDF vocabulary in popular use is RSS 1.0. Weblogs and other online publications
use RSS 1.0 to provide information about updates at their web sites, including the
date of the update, the author, an excerpt of the material, and so on.
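
To make that concrete, here is a rough sketch of pulling those fields out of an RSS 1.0 feed using only Python's standard library. The feed itself is invented for illustration (the example.org URLs, titles, and author are placeholders). The namespace URIs are the ones used by RDF, RSS 1.0, and Dublin Core. Because the sketch treats the feed as plain XML, it ignores the RDF model entirely, which is fine for a quick look but is not a substitute for a real RDF parser.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the lookups below.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# An invented feed with a single entry, for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://example.org/weblog/">
    <title>Example Weblog</title>
    <link>http://example.org/weblog/</link>
    <description>An invented feed</description>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/weblog/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/weblog/1">
    <title>First post</title>
    <link>http://example.org/weblog/1</link>
    <description>A short excerpt of the entry.</description>
    <dc:creator>Example Author</dc:creator>
    <dc:date>2003-07-01T12:00:00Z</dc:date>
  </item>
</rdf:RDF>
"""


def read_items(feed_text):
    """Return one dictionary per <item> in an RSS 1.0 document."""
    root = ET.fromstring(feed_text)
    items = []
    for item in root.findall("rss:item", NS):
        items.append({
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "link": item.findtext("rss:link", default="", namespaces=NS),
            "excerpt": item.findtext("rss:description", default="", namespaces=NS),
            "author": item.findtext("dc:creator", default="", namespaces=NS),
            "date": item.findtext("dc:date", default="", namespaces=NS),
        })
    return items


if __name__ == "__main__":
    for entry in read_items(SAMPLE_FEED):
        print(entry["date"], entry["author"], entry["title"])
```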

A third RDF vocabulary is the RDF/XML used to describe Creative Commons licenses, a new way to provide
more detailed information about use of copyrighted material.

All three vocabularies use, in one way or another, elements from the Dublin Core Metadata
Initiative (DCMI), as
defined in RDF/XML. However, these vocabularies aren't the only ones available using
RDF/XML. In fact, the W3C uses RDF/XML to define the underlying syntax for its own
Web Ontology Language (OWL) effort. With RDF
providing the underlying model, and OWL adding higher-level ontology support, it's
only a
matter of time before a host of sophisticated, domain-specific ontologies spring up,
all of
them interoperable because of the underlying use of RDF/XML.

In fact, there's a host of tools and utilities people can use right now to work with
RDF/XML directly or with OWL.

Tools and Utilities to Work with RDF/XML

As much as I like RDF/XML, even I'll admit that it requires time to understand and
work with, and not everyone has the desire or the inclination for this effort.
Thankfully, there are plenty of tools available that allow people to create or read
RDF/XML manually.

The most commonly used RDF utility is the RDF
Validator, a tool to check your RDF/XML to ensure that it's valid, as well as to
generate different views of the model data. I find that when working with an API,
I'll use
the Validator to validate my sample RDF/XML, view the model to ensure I've created
the
appropriate one, and then create the triples to use as a pattern with my RDF/XML API
calls,
in whatever language I'm coding.
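
To show what "the triples" look like for a small document, here is a deliberately tiny sketch in Python. It is not how the Validator works, and it is no replacement for a real RDF/XML parser: it handles only top-level rdf:Description elements with an rdf:about attribute and simple property elements, which is an assumption made to keep the example short. The sample document and its example.org subject are invented.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# An invented RDF/XML document with one subject and two properties.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/article">
    <dc:title>An Example Article</dc:title>
    <dc:creator>Example Author</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdfxml):
    """Return (subject, predicate, object) tuples for a tiny RDF/XML subset."""
    root = ET.fromstring(rdfxml)
    triples = []
    for node in root.findall("{%s}Description" % RDF_NS):
        subject = node.get("{%s}about" % RDF_NS)
        for prop in node:
            # ElementTree writes tags as "{namespace}local"; joining the two
            # parts gives the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            resource = prop.get("{%s}resource" % RDF_NS)
            if resource is not None:
                value = resource  # the object is another resource
            else:
                value = (prop.text or "").strip()  # the object is a literal
            triples.append((subject, predicate, value))
    return triples


if __name__ == "__main__":
    for triple in simple_triples(SAMPLE):
        print(triple)
```

Each tuple is one statement in the model: a subject, a predicate, and an object. These are the patterns I then reproduce with whichever API I'm using.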

Another handy utility for working with RDF/XML is the BrownSauce RDF Browser. This web application
uses Jena. It can open an RDF/XML document and provide easily readable and hypertext-linked
pages of the RDF data contained in the document. Best of all, the browser also opens
any
associated RDF Schema documents that provide information about the RDF elements themselves,
through the relationships described in the schema, and through comments provided with
the
schema elements.

A long-time advocate of RDF and a friend of mine, Danny Ayers, has been busy at work
on Ideagraph, a tool for visually mapping ideas and then
generating RDF/XML from the results. In addition to this, the tool can act as an
RDF-based weblogging tool, as well as an RSS aggregator.

IsaViz is another popular visual-editing tool for creating, importing, and working
with RDF documents in RDF/XML, as well as in other serialization formats such as
Notation 3 and N-Triples. It's particularly useful when you're creating a new RDF
vocabulary and want to use a visual tool for this effort rather than trying to create
the vocabulary in RDF/XML manually. However, I prefer to use the tool to work with
existing RDF/XML documents, particularly larger ones, because it can zoom in on
components of a model, create snapshots of particular paths, and query on specific
elements. In particular, if you're documenting an existing RDF/XML vocabulary, IsaViz
can be useful for providing snapshots of particular instances of data.

Most of these tools are geared more for working directly with RDF/XML vocabularies.
If you're working with an ontology instead, then you'll want to look at Protege, from
Stanford University. This tool not only allows you to define an ontology through an
easy-to-use interface, it also lets you create forms to capture the ontology data.
Once the forms are defined, the tool can then be used to capture instances of data
based on the ontology. Currently, an effort is underway to provide support for OWL
files and for mapping between Protege's own ontology language and the W3C language.
Regardless, the data captured by Protege can be output in multiple formats, most
notably RDF/XML.

Peripheral RDF Support in Other Tools and Utilities

Of course, tools that focus purely on RDF, whether to create RDF or to consume RDF,
are
handy when you're starting work with RDF--but what about RDF in the real world?

Probably one of the first uses of RDF/XML was by those involved in the Mozilla effort,
which still uses RDF/XML for all of its automated Table of Contents data and processing.
In
fact, it was through my interest in the Mozilla development environment that I became
exposed to RDF/XML (see www.mozilla.org/rdf/doc/).

If you've worked with Linux then you're most likely familiar with RPM, a way of packaging
Linux applications for easy installation. What you may not know is that RDF has been
used
with RPM to provide metadata about the package being installed. A utility created
by Daniel
Veillard, rpmfind, uses RDF to discover RPM installations
on Rpmfind.Net, a database of RPM packages maintained by the W3C. Though the original
creator of the product is no longer maintaining rpmfind directly, the source is now
located
at sources.redhat.com, and I'm still using rpmfind
for my own server.

Earlier I mentioned Movable Type and its use of RDF for autodiscovery of FOAF files.
The
application also uses RDF/XML to annotate weblog entries with trackback information, which can be used
to document links from one weblog to another and provide reverse link information.
This same
functionality has been isolated for use by other tools, weblogging or otherwise.

Spring, a Mac OS X desktop tool created by Robb Beal, provides support for dragging
and dropping FOAF files. Find a FOAF link in a web page? Click on it and drag it to
Spring to automatically import the FOAF contents into the tool.

As ubiquitous as RDF is becoming, creeping its way into a favorite tool or utility
near
you, the power of the RDF model's inferential capability is particularly apparent
when you
look at some of the larger applications that are being built on RDF.

Larger Applications

People at MIT are working on an application called DSpace, which will maintain a
digital repository of information. The application is geared to any larger
organization, such as a college or university, that wants to maintain a searchable
index of publications from its members. DSpace is a freely available, open source
application that makes use of an ontology, Harmony/ABC, and RDF to maintain the
historical subsystem.

RDF Gateway is a Semantic Web application server that uses RDF as the core of all of
its services. With the Gateway, you get access to a persistent data store that can be
queried using an inferential engine that goes beyond normal SQL-like queries. Included
with the application is support for server-side scripting similar in nature to both
ASP (Active Server Pages) and JSP (JavaServer Pages, the Java equivalent).

Siderean Software's Seamark is another commercial
application that makes use of RDF and a persistent data source, but Seamark focuses
primarily on site navigation. Plugged In Software's Tucana Knowledge Store provides
sophisticated knowledge-based querying of large stores of data, again based on RDF.

These companies are just the first to start looking at RDF and the RDF data model
for use
in large-scale, sophisticated applications. And then there's the Semantic Web.

The Semantic Web

It's funny in a way, but I can sit down and rattle off a dozen uses for the RDF data
model
and the associated RDF/XML without once mentioning its primary purpose, which is to
provide
support for the Semantic Web efforts. All uses of RDF for any purpose are good because
they
increase our familiarity with the specification as well as the syntax. In addition,
applications that increase the level of RDF/XML out on the web add to the pool of
accessible
data on which we are slowly building the Semantic Web. Through the use of RDF, we
know that
all of the vocabularies are compatible.

Beyond these good and practical uses of RDF I've described earlier in the article,
and
unlike XML or HTML or XHTML, the RDF model, and its associated syntax, brings with
it the
ability to define statements about data, rather than to just record pieces of data.
Add to
this the use of OWL, and we begin to have the ability to mine for knowledge, not just
words.

Consider poetry. My favorite poem is Walt Whitman's "Song of the Open Road", with
its
friendly and positive imagery of life as an adventure, a road to follow with glee.
In fact,
the use of "road" as a metaphor for life and life's journey is quite common in poetry.
(See
an excellent article, "Poetry of
the Open Road.") However, it's the very use of imagery and metaphor in poetry that
defeats traditional web discovery techniques.

Currently we have the ability to use keyword searches within search engines such as
Google,
and with this we can find poems that mention the word "road". This is all well and
good, but
in the future, as the use of RDF and RDF/XML expands, we'll be able to do searches
that not
only provide links to poems that have used "road", but also know which poems use the
word as
a metaphor for life, which have used it metaphorically to describe freedom, and which
are
just talking about roads as roads.
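
As a rough sketch of the difference, suppose statements about poems were available as simple subject, predicate, object triples. Everything in the following Python example is an assumption made for illustration: the "ex:" property names are invented rather than taken from any real vocabulary, and the two "Example" poems are placeholders. Only the Whitman poem and its road-as-life imagery come from this article.

```python
# Hypothetical statements, each a (subject, predicate, object) triple.
STATEMENTS = [
    ("poem:SongOfTheOpenRoad", "ex:author", "Walt Whitman"),
    ("poem:SongOfTheOpenRoad", "ex:mentions", "road"),
    ("poem:SongOfTheOpenRoad", "ex:roadStandsFor", "life"),
    ("poem:ExampleFreedomPoem", "ex:mentions", "road"),
    ("poem:ExampleFreedomPoem", "ex:roadStandsFor", "freedom"),
    ("poem:ExampleHighwayPoem", "ex:mentions", "road"),
]


def match(statements, subject=None, predicate=None, obj=None):
    """Return the statements that fit a pattern; None matches anything."""
    return [
        (s, p, o)
        for (s, p, o) in statements
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def poems_mentioning_road(statements):
    """The keyword-style search: every poem that mentions "road"."""
    return sorted(s for s, _, _ in match(statements, predicate="ex:mentions", obj="road"))


def poems_where_road_means(statements, meaning):
    """Poems that mention "road" and use it as a metaphor for `meaning`."""
    mentions = set(poems_mentioning_road(statements))
    metaphor = {s for s, _, _ in match(statements, predicate="ex:roadStandsFor", obj=meaning)}
    return sorted(mentions & metaphor)


def poems_about_actual_roads(statements):
    """Poems that mention "road" with no metaphor recorded for it."""
    metaphor = {s for s, _, _ in match(statements, predicate="ex:roadStandsFor")}
    return sorted(set(poems_mentioning_road(statements)) - metaphor)


if __name__ == "__main__":
    print(poems_mentioning_road(STATEMENTS))
    print(poems_where_road_means(STATEMENTS, "life"))
    print(poems_about_actual_roads(STATEMENTS))
```

The first function is all a keyword search can offer. The other two work only because someone has recorded statements about what the word means in each poem, which is the kind of data RDF is designed to carry.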

Eventually, as RDF insinuates itself throughout the web, as it has already started to
do, you'll be able to search on "road" and "poem" and "metaphor for life" and not get
this
article back as a result. As much as I like the thought of people reading this article,
that
search result will be a good thing because this article is not about poems, metaphors,
and
life. It's about RDF and how it is now more than ready for prime time.