
Generating Clojure from an Ontology

I’ve been fascinated with RDF for years, but I always end up frustrated when I try to use it. How do you read/write/manipulate RDF data in code? Sure, there are lots of libraries, but they all represent RDF data as its primitive structures: statements, resources, literals, etc. Working with data through these APIs feels like using a glovebox. To get anything useful done, you have to define mappings between RDF properties/classes and normal data structures in your programming language — classes, maps, lists, whatever. In effect, you have to define everything twice.

Some Java APIs let you add annotations to classes and methods, with the annotations defining the mapping between Java objects and RDF triples. It’s convenient, and familiar if you’ve used Java persistence frameworks like Hibernate, but you still have to define everything twice: once in your RDF schema, once in Java code.

Other libraries generate Java source code from RDFS or OWL ontologies. This means you don’t have to define everything twice, but adds another step to the write-compile-run cycle, and limits you to the semantics that the code generator can understand. In particular, certain features of RDFS/OWL — multiple inheritance, sub-properties — do not map well into Java.

What I really wanted was a way to create and work with RDF data in Clojure, using the same map/set/sequence APIs that I use for any other Clojure data structure. I flirted with implementing RDF in Clojure but lost interest when I realized that 1) there’s a lot more to implementing RDF than datatype conversions; and 2) my Clojure library suffered from the same glovebox problem as the Java RDF libraries.

The solution, however, was staring me in the face all along. Clojure is a Lisp. I can generate functions directly, without any intermediate “source” representation. I can use my own customized validation and type-checking functions. Furthermore, I can extend the definitions in my RDF schema with new Clojure functions.

Here’s what I ended up with: I designed a simple OWL ontology using Protégé 4 and saved it as RDF/XML. Then I used the Sesame 2 library to find all the RDF classes and properties defined in my ontology, and create the appropriate getter, setter, and constructor functions in Clojure. The core of it works as follows.

A resource-to-symbol function creates a symbol named for the local name of the RDF class, with the full URI of its XML namespace in the symbol’s metadata. A call to intern then defines a new function, named by that symbol, that takes no arguments and returns a Clojure map with the symbol as its :type.
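
To make that concrete, here is a minimal sketch of the idea, not the original listing. It skips Sesame entirely and starts from a plain vector of class URI strings; the example.org namespace, the class names, the :xmlns metadata key, and the define-constructor! helper are all assumptions made up for illustration.

```clojure
;; Hypothetical stand-in for the class URIs that would be read from the
;; ontology; the example.org namespace is invented for this sketch.
(def class-uris
  ["http://example.org/ontology#Document"
   "http://example.org/ontology#Person"])

(defn resource-to-symbol
  "Returns a symbol named for the local name of the URI, carrying the
  namespace part of the URI in its metadata (under an assumed :xmlns key)."
  [^String uri]
  (let [idx (inc (max (.lastIndexOf uri "#") (.lastIndexOf uri "/")))]
    (with-meta (symbol (subs uri idx))
      {:xmlns (subs uri 0 idx)})))

(defn define-constructor!
  "Interns a zero-argument function, named for the class, in the current
  namespace. The function returns a map with the class symbol as its :type."
  [uri]
  (let [sym (resource-to-symbol uri)]
    (intern *ns* sym (fn [] {:type sym}))))

;; Generate one constructor per class.
(doseq [uri class-uris]
  (define-constructor! uri))

(Document)
;; => {:type Document}

(meta (:type (Document)))
;; => {:xmlns "http://example.org/ontology#"}
```

Because intern creates the var at run time, nothing here is written out as source code: loading a different ontology would simply produce a different set of functions.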

Suppose I have a class named Document in my ontology. I now have a Clojure function named Document that creates a new instance of that class, represented as a Clojure map. Furthermore, using Clojure hierarchies and the isa? function, I can generate Clojure code that implements the subclass relationships defined in the ontology. Whee!
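
The subclass part can be sketched along the same lines. This is only an illustration under assumptions: the [child parent] pairs below stand in for rdfs:subClassOf statements read from the ontology, and the class names and the build-hierarchy helper are hypothetical. It uses a local hierarchy from make-hierarchy rather than the global one, which also lets the tags be plain, non-namespaced symbols.

```clojure
;; Hypothetical [child parent] pairs, standing in for the subclass
;; statements found in the ontology.
(def subclass-pairs
  '[[Report Document]
    [Memo Document]
    [Memo Correspondence]    ; multiple inheritance is fine here
    [AnnualReport Report]])

(defn build-hierarchy
  "Builds a Clojure hierarchy by calling derive once per pair."
  [pairs]
  (reduce (fn [h [child parent]] (derive h child parent))
          (make-hierarchy)
          pairs))

(def class-hierarchy (build-hierarchy subclass-pairs))

;; Subclass relationships are transitive.
(isa? class-hierarchy 'AnnualReport 'Document)
;; => true

;; A class may have more than one parent.
(parents class-hierarchy 'Memo)
;; => #{Document Correspondence}

;; Instances are maps, so the check runs against their :type.
(isa? class-hierarchy (:type {:type 'Memo}) 'Correspondence)
;; => true
```

The same hierarchy can be passed to defmulti through its :hierarchy option, so dispatch on :type could follow the ontology's class tree.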

I don’t entirely know where I’m headed with this, but I like the way it’s going. I can define my own data types, decide how they map to Clojure data structures, and have code that’s always up-to-date with my RDF vocabulary.

4 thoughts on “Generating Clojure from an Ontology”

Thanks for the interesting post. I am playing around with Clojure and RDF technologies as well. I am invoking the OpenRDF Sesame Java API via Clojure. Like you, I do not know exactly where I am going, but my intuition tells me something interesting may come out of this exploration. For one, you mention Protégé. It seems like you may be able to bypass Protégé and have a “command-line” or “shell” (or even a DSL) to create/read/update ontologies/RDF directly from the Clojure REPL. Also, if you load various triple stores, and you are good enough at Clojure, you can start slicing and dicing RDF in interesting and perhaps novel ways from the REPL. Good luck, and let us know how all this is going.

I took a slightly different approach to RDF in Clojure, since I really wanted to both leverage SPARQL (particularly the new 1.1 stuff) and work with my result sets in Incanter. It’s my first project with Clojure, so it’s still a learning experience, but I decided to throw it up on Clojars and GitHub at https://github.com/ryankohl/seabass . Nonetheless, your idea of a more-or-less ORM layer between straight-up Clojure data structures and RDF graphs is pretty cool. Have you returned much to this recently?