Spatial searches with SPARQL

This module was first released with Jena 2.11.0.

This is an extension to Apache Jena ARQ that combines SPARQL with simple spatial queries.
It gives applications the ability to perform simple spatial searches within SPARQL queries.
Spatial indexes hold additional information used for accessing the RDF graph.

The spatial index can be either Apache Lucene, for a
same-machine spatial index, or Apache Solr,
for a large-scale enterprise search application.

Important note: to read geo data in 2) WKT literal format, jena-spatial uses the JTS Topology Suite,
which is under the LGPL licence. jena-spatial does not have a hard dependency on JTS. In other words,
an end user who only uses feature 1) (latitude/longitude format) does not need JTS at all (i.e. nothing needs to be done). A user who wants 2)
can enable it by setting the SpatialContextFactory of
EntityDefinition
to JtsSpatialContextFactory, which is optional. In that case, the JTS libraries must be on the classpath. Here's the sample code:

Other data sources may use different predicates for both 1) and 2).
jena-spatial provides an interface for consuming all kinds of custom geo predicates.
You can add such predicates so that jena-spatial recognizes them using
EntityDefinition:

Queries for each ?place within the given radius of the location (latitude, longitude). The distance unit is passed as an optional string and can be "kilometres"/"km", "miles"/"mi", "metres"/"m", "centimetres"/"cm", "millimetres"/"mm" or "degrees"/"de" (the default is "kilometres"). limit is an optional integer parameter that caps the number of query results (if limit<0, all query results are returned).
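The unit and limit rules described above can be modelled with a small JDK-only sketch. This is not how jena-spatial evaluates the query (it delegates to the Lucene or Solr spatial index); it only illustrates the semantics of the parameters. The class `NearbySketch`, its methods, the example URIs and coordinates, the use of the haversine formula and the Earth radius value are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Conceptual model of a "within radius" search; not jena-spatial's implementation. */
public class NearbySketch {
    // Mean Earth radius in kilometres (an assumption of this sketch).
    public static final double EARTH_RADIUS_KM = 6371.0;

    /** A place with a URI and a latitude/longitude in decimal degrees. */
    public static final class Place {
        public final String uri;
        public final double lat;
        public final double lon;

        public Place(String uri, double lat, double lon) {
            this.uri = uri;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Converts a radius to kilometres; a null unit means the default, "kilometres". */
    public static double toKilometres(double value, String unit) {
        String u = (unit == null) ? "kilometres" : unit.toLowerCase(Locale.ROOT);
        switch (u) {
            case "kilometres": case "km": return value;
            case "miles": case "mi": return value * 1.609344;
            case "metres": case "m": return value / 1_000.0;
            case "centimetres": case "cm": return value / 100_000.0;
            case "millimetres": case "mm": return value / 1_000_000.0;
            // Degrees of arc along a great circle of the assumed sphere.
            case "degrees": case "de": return Math.toRadians(value) * EARTH_RADIUS_KM;
            default: throw new IllegalArgumentException("Unknown distance unit: " + unit);
        }
    }

    /** Great-circle distance in kilometres between two points (haversine formula). */
    public static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /**
     * Returns the URIs of places within the radius, in input order.
     * A negative limit returns all matches; a non-negative limit caps the count
     * (treating 0 as "no results" is this sketch's choice).
     */
    public static List<String> nearby(List<Place> places, double lat, double lon,
                                      double radius, String unit, int limit) {
        double radiusKm = toKilometres(radius, unit);
        List<String> hits = new ArrayList<>();
        for (Place p : places) {
            if (limit >= 0 && hits.size() >= limit) {
                break;
            }
            if (haversineKm(lat, lon, p.lat, p.lon) <= radiusKm) {
                hits.add(p.uri);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Place> places = Arrays.asList(
            new Place("http://example.org/bath", 51.3811, -2.3590),
            new Place("http://example.org/london", 51.5074, -0.1278));
        // Search 20 (default unit: kilometres) around Bristol, with no limit.
        System.out.println(nearby(places, 51.4545, -2.5879, 20, null, -1));
    }
}
```

Passing `null` for the unit and `-1` for the limit corresponds to leaving both optional parameters out of the query.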

The usual way to describe an index is with a Jena assembler description. Configurations can also be built with code. The assembler describes a "spatial dataset", which has an underlying RDF dataset and a spatial index. The spatial index description names the index technology (Lucene or Solr) and gives the details needed for each.

A spatial index has an
EntityDefinition,
which defines the properties to index, the name of the Lucene/Solr field used for storing the URI itself (e.g. "entityField"), the name of the field for its geo information (e.g. latitude/longitude as "geoField"), and the custom geo predicates.

For common RDF spatial queries, only "entityField" and "geoField" are required, because the built-in geo predicates work without further configuration. More complex setups, with multiple custom geo predicates in addition to the two fields, are possible.
You can also optionally use JtsSpatialContextFactory to support indexing WKT literals.
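Because JTS is an optional dependency, an application may want to confirm that it is on the classpath before opting in to WKT support. The following is a minimal JDK-only sketch of such a check. The class `OptionalJtsCheck` and its methods are hypothetical, and the two JTS class names are assumptions that depend on the JTS version deployed (older releases use the `com.vividsolutions.jts` package, newer ones `org.locationtech.jts`).

```java
/** Hypothetical helper: checks whether an optional library is present on the classpath. */
public class OptionalJtsCheck {
    // Assumed marker classes; adjust to the JTS version actually deployed.
    public static final String[] JTS_MARKER_CLASSES = {
        "com.vividsolutions.jts.geom.Geometry",
        "org.locationtech.jts.geom.Geometry"
    };

    /** Returns true if the named class can be located, without initializing it. */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, OptionalJtsCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns true if any of the assumed JTS marker classes is available. */
    public static boolean jtsAvailable() {
        for (String name : JTS_MARKER_CLASSES) {
            if (isOnClasspath(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(jtsAvailable()
            ? "JTS found: the JTS spatial context factory can be selected"
            : "JTS not found: stay with latitude/longitude indexing only");
    }
}
```

An application could run this check at start-up and only select the JTS factory when it succeeds, falling back to latitude/longitude indexing otherwise.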

Once set up this way, any data added to the spatial dataset is automatically indexed as well.

The key point is that the assembler contains two dataset definitions, one for the spatial dataset and one for the base data. Therefore, the application needs to identify the spatial dataset by its URI 'http://localhost/jena_example/#spatial_dataset'.

The field definitions for "entityField" and "geoField" must be added to Solr's schema.xml.
The field names in EntityDefinition must match those in schema.xml.
Here is an example defining the names of "entityField" as "uri" and "geoField" as "geo":
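As a hedged illustration of keeping the two in step, the sketch below embeds an assumed schema.xml fragment that declares "uri" and "geo" fields, and uses the JDK's XML parser to check that the field names expected by an EntityDefinition are actually declared. The fragment's attributes and field types (`string`, `location_rpt`), the class `SchemaFieldCheck` and its methods are assumptions for illustration; they are not part of jena-spatial or Solr.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: verifies that schema.xml declares the fields an index expects. */
public class SchemaFieldCheck {
    // Assumed fragment; the field types are placeholders, check them against your Solr setup.
    public static final String SCHEMA_FRAGMENT =
        "<schema name=\"example\" version=\"1.5\">"
      + "<fields>"
      + "<field name=\"uri\" type=\"string\" indexed=\"true\" stored=\"true\""
      + " required=\"true\" multiValued=\"false\"/>"
      + "<field name=\"geo\" type=\"location_rpt\" indexed=\"true\" stored=\"true\""
      + " multiValued=\"true\"/>"
      + "</fields>"
      + "</schema>";

    /** Collects the name attribute of every field element in the given schema XML. */
    public static Set<String> declaredFields(String schemaXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(schemaXml.getBytes(StandardCharsets.UTF_8)));
            NodeList fields = doc.getElementsByTagName("field");
            Set<String> names = new HashSet<>();
            for (int i = 0; i < fields.getLength(); i++) {
                names.add(((Element) fields.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse schema XML", e);
        }
    }

    /** Returns true if both the entity field and the geo field are declared. */
    public static boolean matches(String schemaXml, String entityField, String geoField) {
        Set<String> names = declaredFields(schemaXml);
        return names.contains(entityField) && names.contains(geoField);
    }

    public static void main(String[] args) {
        // The names passed here should be the ones given to EntityDefinition.
        System.out.println(matches(SCHEMA_FRAGMENT, "uri", "geo"));
    }
}
```

A check like this could run before indexing starts, so that a name mismatch is reported up front instead of surfacing later as a Solr error.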

When working at scale, or when preparing a published, read-only SPARQL service, creating the index by loading the spatial dataset is impractical. Instead, the index and the dataset can be built using command line tools in two steps: first load the RDF data, then create an index from the existing RDF dataset.