Normally, the unit of matching in ARQ is the basic graph pattern (a sequence of triple patterns). These sets of triple patterns are dispatched to Jena for matching by Jena's graph-level query handler.
Each kind of storage provides the appropriate query handler. For example, the database fastpath
is a translation of a set of triple patterns into a single SQL query involving joins.

There is also a default implementation that works by using plain graph find (a triple with possible wildcards), so a new storage system does not need to provide its own query handler until it wants to exploit some feature of the storage.
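To make the find-based fallback concrete, here is a minimal sketch of matching a basic graph pattern using nothing but a wildcard find. It is not Jena's actual query handler: the class FindBasedMatcher, the use of String[] for triples, and the "?name" convention for variables are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Jena code: triples are String[3] {subject, predicate, object}.
class FindBasedMatcher {

    // Graph-level find: null in any position acts as a wildcard.
    static List<String[]> find(List<String[]> graph, String s, String p, String o) {
        String[] want = {s, p, o};
        List<String[]> out = new ArrayList<>();
        for (String[] t : graph) {
            boolean ok = true;
            for (int i = 0; i < 3; i++) {
                if (want[i] != null && !want[i].equals(t[i])) ok = false;
            }
            if (ok) out.add(t);
        }
        return out;
    }

    // Assumption for this sketch: a pattern term starting with '?' is a variable.
    static boolean isVar(String term) {
        return term.startsWith("?");
    }

    // A constant stays as it is; a bound variable becomes its value; an unbound one becomes null.
    static String resolve(String term, Map<String, String> binding) {
        return isVar(term) ? binding.get(term) : term;
    }

    // Match the patterns one at a time, extending each partial solution via find.
    static List<Map<String, String>> match(List<String[]> graph, List<String[]> patterns) {
        List<Map<String, String>> solutions = new ArrayList<>();
        solutions.add(new HashMap<>());
        for (String[] pat : patterns) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> b : solutions) {
                for (String[] t : find(graph, resolve(pat[0], b), resolve(pat[1], b), resolve(pat[2], b))) {
                    Map<String, String> ext = new HashMap<>(b);
                    boolean consistent = true;
                    for (int i = 0; i < 3; i++) {
                        if (!isVar(pat[i])) continue; // constants were already checked by find
                        String old = ext.putIfAbsent(pat[i], t[i]);
                        if (old != null && !old.equals(t[i])) consistent = false;
                    }
                    if (consistent) next.add(ext);
                }
            }
            solutions = next;
        }
        return solutions;
    }

    public static void main(String[] args) {
        List<String[]> graph = Arrays.asList(
            new String[] {"a", "knows", "b"},
            new String[] {"b", "knows", "c"});
        List<String[]> bgp = Arrays.asList(
            new String[] {"?x", "knows", "?y"},
            new String[] {"?y", "knows", "?z"});
        System.out.println(match(graph, bgp));
    }
}
```

Each triple pattern costs one find call per partial solution, which is exactly the work a storage-specific handler (such as the SQL translation) can collapse into a single operation.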

If a property function is encountered, it is treated internally as a call to an extension. A registry maps
property functions to their implementing code.

where ext:list is a function that binds its arguments
(unlike a FILTER function). The property function form is legal SPARQL.

So, this mechanism shows that collection access can be done in SPARQL
without resorting to handling told blank nodes.

cwm (which is a forward chaining rules engine) and Euler
(which is a backward-chaining rules engine) already provide this style of access. Their property is - the subject and object meanings are the other way.

11 February 2006

This approach means that the same source code is used for both the Java world
and the .Net world, making future improvements visible to both from a single source tree.

I tried doing it for .Net on Windows with C# Express and IKVM-0.24.0.1.

Summary

SPARQL queries work.

Using Jena from C# works for small scale cases - lots of checking to do but
it should be a matter of verifying everything from the dependent libraries works properly.

Some things aren't working, but there are a few hotspots of trouble that, when
fixed, should mean that the majority (maybe all) of the Jena test suite will run. As
it is at the moment, quite a lot can be done, including using the ARQ command line programs.

The Conversion

The IKVM bytecode conversion route is my preferred choice because it means one source
codebase, not two. When I tried this before, I got an early version of ARQ up and running.
But it wasn't complete; the first big block was the lack of java.nio.charset
support in GNU Classpath. Jena and ARQ have lots of tests of internationalization and
charsets. That alone was enough to make it not worthwhile exploring further at the time.

The process is simple: run ikvmc on all the jars to get a library. Ignore all
the warnings about missing stuff. It's surprising what various libraries actually
reference - Log4j has references to a lot of log record transports. At the simplest:

ikvmc *.jar -out:XXX.dll -target:library

I've now broken this into two DLLs: jena-libs.dll (all the jars except the jena ones)
and jena.dll (jena.jar, jenatest.jar, arq.jar, iri.jar)
but that is just because I keep building the DLLs while testing.

It takes a minute or so (less time than building jena.jar itself).
The result is two DLLs totaling about 16M - the whole assembly is about
23M including the three IKVM DLLs. Not small - but it works and it is simple to do.

file:///c:/absolute was incorrectly turned into a Windows filename. Worked OK with
Sun's Java but not IKVM. Fixed.

GNU Classpath bugs:

InputStreamReader(InputStream, Charset) is broken although the other two
constructors that allow the charset conversion to be explicitly
controlled do seem to work. This can be worked around in Jena.
Bugzilla Entry.
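
As a sketch of what such a workaround could look like (not necessarily the change made in Jena), the helpers below open a reader through the two other InputStreamReader constructors, the one taking a charset name and the one taking a CharsetDecoder. The class name CharsetReaders and its method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper: avoids calling InputStreamReader(InputStream, Charset) directly.
class CharsetReaders {

    // Uses InputStreamReader(InputStream, CharsetDecoder).
    // A fresh decoder reports bad input by default, so it is set to REPLACE here
    // to behave like the Charset-based constructor.
    static Reader viaDecoder(InputStream in, Charset cs) {
        return new InputStreamReader(in, cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE));
    }

    // Uses InputStreamReader(InputStream, String), passing the charset's canonical name.
    static Reader viaName(InputStream in, Charset cs) {
        try {
            return new InputStreamReader(in, cs.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Drain a reader into a String; kept simple for the demo.
    static String readAll(Reader r) {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = r.read()) != -1) sb.append((char) c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");
        String text = "h\u00e9llo";
        byte[] bytes = text.getBytes(utf8);
        // Both routes should round-trip the non-ASCII character.
        System.out.println(text.equals(readAll(viaDecoder(new ByteArrayInputStream(bytes), utf8))));
        System.out.println(text.equals(readAll(viaName(new ByteArrayInputStream(bytes), utf8))));
    }
}
```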

Zero-width lookbehind regexes aren't implemented. They are used by JJC's new IRI code.
Bugzilla Entry.
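
For readers who haven't met the construct: a zero-width lookbehind, written (?<=X), requires that X matches immediately before the current position without consuming it. The small probe below checks whether the running class library supports it. The pattern is an invented example, not one taken from the IRI code, and LookbehindProbe is a made-up name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical probe: does this runtime's regex engine handle zero-width lookbehind?
class LookbehindProbe {

    static boolean supported() {
        try {
            // Match a '.' only when it is immediately preceded by '/'.
            Matcher m = Pattern.compile("(?<=/)\\.").matcher("a/.b.");
            boolean first = m.find() && m.start() == 2; // the '.' right after '/'
            boolean second = m.find();                  // the trailing '.' follows 'b', so no match
            return first && !second;
        } catch (RuntimeException e) {
            // An engine without lookbehind may reject the pattern outright.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("lookbehind supported: " + supported());
    }
}
```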