Geo-Spatial Indexes

ArangoDB features a Google S2 based geospatial index
since version 3.4.0, which supersedes the previous geo index implementation.
Indexing is supported for a subset of the GeoJSON geometry types
as well as simple latitude/longitude pairs.

AQL’s geospatial functions and GeoJSON constructors are described in
Geo functions.

Using a Geo-Spatial Index

The geospatial index supports containment and intersection
queries for various geometric 2D shapes. AQL queries are the recommended way
to perform these types of operations. The index can operate in two different
modes, depending on whether you want to use the GeoJSON data format or not. The mode
is selected with the geoJson field when creating the index.

This index assumes coordinates with the latitude between -90 and 90 degrees and the
longitude between -180 and 180 degrees. A geo index will ignore all
documents which do not fulfill these requirements.
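
As a rough illustration of the range requirement, the following JavaScript sketch checks a coordinate pair before a document is stored. The function name isValidLatLng is hypothetical and is not part of ArangoDB; the index performs its own validation independently of such a check.

```javascript
// Hypothetical helper (not an ArangoDB API): applies the ranges stated above.
function isValidLatLng(latitude, longitude) {
  return (
    Number.isFinite(latitude) &&
    Number.isFinite(longitude) &&
    latitude >= -90 && latitude <= 90 &&
    longitude >= -180 && longitude <= 180
  );
}
```

A pair that fails this check would be ignored by the index rather than rejected on insert.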

GeoJSON Mode

In this mode, the index is created on all documents and uses geometry as the indexed
attribute, whose value is either a GeoJSON
Geometry Object or a coordinate array. The array must contain at least two numeric values:
the longitude (first value) and the latitude (second value). This corresponds
to the format described in
RFC 7946 Position.

All documents that do not have the attribute path or that have a non-conforming
value in it are excluded from the index.

A geo index is implicitly sparse, and there is no way to control its sparsity.
If the index is created successfully, an object with the index
details, including the index identifier, is returned.
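
A GeoJSON-mode index definition and the two document shapes it accepts might look like the sketch below. The collection name places and the sample values are assumptions made for illustration; only the type, fields and geoJson settings come from the description above.

```javascript
// Index definition for GeoJSON mode; "geometry" is the indexed attribute path.
const geoJsonIndex = { type: "geo", fields: ["geometry"], geoJson: true };

// Two document shapes this mode accepts (made-up sample data):
const asGeometryObject = {
  name: "point as GeoJSON object",
  geometry: { type: "Point", coordinates: [13.4, 52.5] }, // [longitude, latitude]
};
const asCoordinateArray = {
  name: "point as coordinate array",
  geometry: [13.4, 52.5], // [longitude, latitude]
};

// In arangosh, assuming a collection named "places" exists:
//   db.places.ensureIndex(geoJsonIndex);
//   db.places.save(asGeometryObject);
```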

Non-GeoJSON mode

This index mode exclusively supports indexing on coordinate arrays. Values that
contain GeoJSON or other types of data will be ignored. In the non-GeoJSON mode
the index can be created on one or two fields.

The following examples will work in the arangosh command shell.

To create a geo-spatial index on all documents using latitude and
longitude as separate attribute paths, two paths need to be specified
in the fields array:
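
The two variants of a non-GeoJSON index definition can be sketched as plain objects. The collection name places and the sample documents are assumptions for illustration.

```javascript
// Two separate attribute paths: latitude first, longitude second.
const twoFieldIndex = { type: "geo", fields: ["latitude", "longitude"] };
const twoFieldDoc = { name: "sample", latitude: 52.5, longitude: 13.4 };

// One attribute path holding a [latitude, longitude] array.
const oneFieldIndex = { type: "geo", fields: ["location"] };
const oneFieldDoc = { name: "sample", location: [52.5, 13.4] };

// In arangosh, assuming a collection named "places" exists:
//   db.places.ensureIndex(twoFieldIndex);
```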

A geospatial index can also be created on all documents using a single attribute,
location, as the path to the coordinates. The value of the attribute has to be an
array with at least two numeric values. The array must contain the latitude
(first value) and the longitude (second value).

All documents that do not have the attribute path(s) or that have a non-conforming
value in it are excluded from the index.

A geo index is implicitly sparse, and there is no way to control its sparsity.
If the index is created successfully, an object with the index
details, including the index identifier, is returned.

Indexed GeoSpatial Queries

The geospatial index supports a variety of AQL queries, which can be built with the help
of the geo utility functions. There are three specific
geo functions that can be optimized, provided that they are used correctly:
GEO_DISTANCE, GEO_CONTAINS, GEO_INTERSECTS. Additionally, there is built-in support for optimizing
the older geo functions DISTANCE, NEAR and WITHIN (the last two only if they are
used in their four-argument version, without distanceName).

When in doubt whether your query is being properly optimized,
check the AQL explain output for index usage.

Query for Results near Origin (NEAR type query)

The first parameter of GEO_DISTANCE can be a GeoJSON object or a coordinate array in [longitude, latitude] order.
The second parameter is the document field on which the index was created. The function
GEO_DISTANCE always returns the distance in meters, so a filter of <= 100000 returns results
up to 100 km away.
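
A distance filter of this kind might look like the following sketch, which keeps the AQL text and its bind variables together in one object. The collection name places, the attribute geometry and the origin coordinates are assumptions; whether the index is actually used should still be confirmed with explain.

```javascript
const withinDistance = {
  query: `
    FOR doc IN @@collection
      FILTER GEO_DISTANCE(@origin, doc.geometry) <= @maxMeters
      RETURN doc
  `,
  bindVars: {
    "@collection": "places", // assumed collection name
    origin: [13.4, 52.5],    // [longitude, latitude]
    maxMeters: 100 * 1000,   // 100 km expressed in meters
  },
};

// In arangosh:
//   db._query(withinDistance.query, withinDistance.bindVars).toArray();
```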

Query for Sorted Results near Origin (NEAR type query)

A basic example of a query for the 1000 nearest results to an origin point (ascending sorting):
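
One way to write such a query is sketched below. The collection name places, the attribute geometry and the origin are assumptions for illustration.

```javascript
const nearestQuery = {
  query: `
    FOR doc IN @@collection
      SORT GEO_DISTANCE(@origin, doc.geometry) ASC
      LIMIT 1000
      RETURN doc
  `,
  bindVars: {
    "@collection": "places", // assumed collection name
    origin: [13.4, 52.5],    // [longitude, latitude]
  },
};

// In arangosh:
//   db._query(nearestQuery.query, nearestQuery.bindVars).toArray();
```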

The first parameter of GEO_INTERSECTS must be a polygon. Other types are not valid.
The second parameter must contain the document field on which the index was created.
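
An intersection query following these two rules could be sketched as below. The polygon coordinates, the collection name places and the attribute geometry are made up for illustration.

```javascript
// A small rectangular polygon; the ring is closed (first position == last).
const area = {
  type: "Polygon",
  coordinates: [
    [[13.0, 52.3], [13.8, 52.3], [13.8, 52.7], [13.0, 52.7], [13.0, 52.3]],
  ],
};

const intersectsQuery = {
  query: `
    FOR doc IN @@collection
      FILTER GEO_INTERSECTS(@polygon, doc.geometry)
      RETURN doc
  `,
  bindVars: { "@collection": "places", polygon: area },
};

// In arangosh:
//   db._query(intersectsQuery.query, intersectsQuery.bindVars).toArray();
```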

GeoJSON

GeoJSON is a geospatial data format based on JSON. It defines several different
types of JSON objects and the way in which they can be combined to represent
data about geographic shapes on the Earth’s surface. GeoJSON uses a geographic
coordinate reference system, World Geodetic System 1984 (WGS 84), and units of decimal
degrees.

MultiLineString

Polygon

A GeoJSON Polygon consists
of a series of closed LineString objects (ring-like). These Linear Ring objects
consist of four or more vertices with the first and last coordinate pairs
being equal. Coordinates of a Polygon are an array of linear ring coordinate
arrays. The first element in the array represents the exterior ring.
Any subsequent elements represent interior rings (holes within the surface).

A linear ring may not be empty; it needs at least three distinct coordinates.

Within the same linear ring, consecutive coordinates may be the same. Otherwise
(except for the first and last one), all coordinates need to be distinct.

A linear ring defines two regions on the sphere. ArangoDB will always interpret
the region of smaller area to be the interior of the ring. This introduces a
practical limitation that no polygon may have an outer ring enclosing more
than half the Earth’s surface.
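
The ring rules above can be approximated with a small client-side check. This is a sketch only: checkLinearRing is a hypothetical helper, not an ArangoDB function, and it does not test the half-sphere limitation.

```javascript
// Hypothetical validator for the linear ring rules described above.
// A ring is an array of [longitude, latitude] positions.
function checkLinearRing(ring) {
  const errors = [];
  if (!Array.isArray(ring) || ring.length < 4) {
    errors.push("a ring needs at least four positions");
    return errors;
  }
  const same = (a, b) => a[0] === b[0] && a[1] === b[1];
  if (!same(ring[0], ring[ring.length - 1])) {
    errors.push("first and last position must be equal");
  }
  // Collapse consecutive repeats (which are allowed), then drop the closing position.
  const collapsed = ring.filter((p, i) => i === 0 || !same(p, ring[i - 1]));
  const open = collapsed.slice(0, -1);
  const distinct = new Set(open.map((p) => p.join(",")));
  if (distinct.size < 3) {
    errors.push("at least three distinct coordinates are required");
  }
  if (distinct.size !== open.length) {
    errors.push("non-consecutive positions must be distinct");
  }
  return errors;
}
```

An empty result means the ring passed every check; otherwise the array lists the rules that were violated.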

MultiPolygon

A GeoJSON MultiPolygon consists
of multiple polygons. The “coordinates” member is an array of
Polygon coordinate arrays.

Polygons in the same MultiPolygon may not share edges, but they may share coordinates.

Polygons and rings must not be empty.

A linear ring defines two regions on the sphere. ArangoDB will always interpret
the region of smaller area to be the interior of the ring. This introduces a
practical limitation that no polygon may have an outer ring enclosing more
than half the Earth’s surface.
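
As an illustration of the nesting, the sketch below builds a MultiPolygon from two squares that touch at a single corner, which shares a coordinate but no edge. The coordinates are made up.

```javascript
// Each polygon is an array of rings; each ring is an array of [longitude, latitude].
const twoSquares = {
  type: "MultiPolygon",
  coordinates: [
    [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]], // first polygon, exterior ring only
    [[[1, 1], [2, 1], [2, 2], [1, 2], [1, 1]]], // second polygon, touches at [1, 1]
  ],
};

// Number of polygons and number of rings in the first polygon.
const polygonCount = twoSquares.coordinates.length;
const ringsInFirst = twoSquares.coordinates[0].length;
```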

Arangosh Examples

Creates a geospatial index on all documents using location as the path to the
coordinates. The value of the attribute has to be an array with at least two
numeric values. The array must contain the latitude (first value) and the
longitude (second value).

All documents that do not have the attribute path or that have a non-conforming
value in it are excluded from the index.

A geo index is implicitly sparse, and there is no way to control its sparsity.

If the index is created successfully, an object with the index
details, including the index identifier, is returned.

To create a geo index on an array attribute that contains longitude first, set
the geoJson attribute to true. This corresponds to the format described in
RFC 7946 Position.
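
The difference in ordering can be sketched as follows. The helper toLngLat and the sample values are hypothetical; only the geoJson setting comes from the text above.

```javascript
// Hypothetical helper: swap a [latitude, longitude] pair into the
// [longitude, latitude] order used by RFC 7946 positions.
const toLngLat = ([latitude, longitude]) => [longitude, latitude];

// Index on an array attribute that stores longitude first.
const lngLatIndex = { type: "geo", fields: ["location"], geoJson: true };
const lngLatDoc = { name: "sample", location: toLngLat([52.5, 13.4]) };
```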

Returns a geo index object if an index was found. The near or
within operators can then be used to execute a geo-spatial query on
this particular index.

This is useful for collections with multiple defined geo indexes.

collection.geo(location_attribute, true)

Looks up a geo index on a compound attribute location_attribute.

Returns a geo index object if an index was found. The near or
within operators can then be used to execute a geo-spatial query on
this particular index.

collection.geo(latitude_attribute, longitude_attribute)

Looks up a geo index defined on the two attributes latitude_attribute
and longitude_attribute.

Returns a geo index object if an index was found. The near or
within operators can then be used to execute a geo-spatial query on
this particular index.

Note: this method is not yet supported by the RocksDB storage engine.

Note: the geo simple query helper function is deprecated as of ArangoDB
2.6. The function may be removed in future versions of ArangoDB. The preferred
way for running geo queries is to use their AQL equivalents.

Examples

Assume you have a location stored as a list in the attribute home
and a destination stored in the attribute work. Then you can use the
geo operator to select which geo-spatial attributes (and thus which
index) to use in a near query.

collection.near(latitude, longitude)

Constructs a near query for a collection.

The returned list is sorted according to the distance, with the nearest
document to the coordinate (latitude, longitude) coming first.
If there are near documents of equal distance, documents are chosen randomly
from this set until the limit is reached. It is possible to change the limit
using the limit operator.

In order to use the near operator, a geo index must be defined for the
collection. This index also defines which attribute holds the coordinates
for the document. If you have more than one geo-spatial index, you can use
the geo operator to select a particular index.

Note: near does not support negative skips.
However, you can still use limit followed by skip.

collection.near(latitude, longitude).distance()

This will add an attribute distance to all documents returned, which
contains the distance between the given point and the document in meters.

collection.near(latitude, longitude).distance(name)

This will add an attribute name to all documents returned, which
contains the distance between the given point and the document in meters.

Note: this method is not yet supported by the RocksDB storage engine.

Note: the near simple query function is deprecated as of ArangoDB 2.6.
The function may be removed in future versions of ArangoDB. The preferred
way for retrieving documents from a collection using the near operator is
to use the AQL NEAR function in an AQL query as follows:
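
A query of that form might be sketched as below, using the four-argument version of NEAR. The collection name places and the bind values are assumptions for illustration.

```javascript
const nearQuery = {
  query: `
    FOR doc IN NEAR(@@collection, @latitude, @longitude, @limit)
      RETURN doc
  `,
  bindVars: {
    "@collection": "places", // assumed collection name
    latitude: 52.5,
    longitude: 13.4,
    limit: 10,               // maximum number of documents to return
  },
};

// In arangosh:
//   db._query(nearQuery.query, nearQuery.bindVars).toArray();
```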

collection.within(latitude, longitude, radius)

Constructs a within query for a collection.

This will find all documents within a given radius around the coordinate
(latitude, longitude). The returned array is sorted by distance,
beginning with the nearest document.

In order to use the within operator, a geo index must be defined for the
collection. This index also defines which attribute holds the coordinates
for the document. If you have more than one geo-spatial index, you can use
the geo operator to select a particular index.

collection.within(latitude, longitude, radius).distance()

This will add an attribute _distance to all documents returned, which
contains the distance between the given point and the document in meters.

collection.within(latitude, longitude, radius).distance(name)

This will add an attribute name to all documents returned, which
contains the distance between the given point and the document in meters.

Note: this method is not yet supported by the RocksDB storage engine.

Note: the within simple query function is deprecated as of ArangoDB 2.6.
The function may be removed in future versions of ArangoDB. The preferred
way for retrieving documents from a collection using the within operator is
to use the AQL WITHIN function in an AQL query as follows:
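
A query of that form might be sketched as below, using the four-argument version of WITHIN. The collection name places and the bind values are assumptions for illustration.

```javascript
const withinQuery = {
  query: `
    FOR doc IN WITHIN(@@collection, @latitude, @longitude, @radius)
      RETURN doc
  `,
  bindVars: {
    "@collection": "places", // assumed collection name
    latitude: 52.5,
    longitude: 13.4,
    radius: 5000,            // search radius in meters
  },
};

// In arangosh:
//   db._query(withinQuery.query, withinQuery.bindVars).toArray();
```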

Since ArangoDB 2.5, ensureGeoConstraint is an alias for ensureGeoIndex because
geo indexes are always sparse, meaning that documents that do not contain
the index attributes or have non-numeric values in the index attributes
will not be indexed. ensureGeoConstraint is deprecated and ensureGeoIndex
should be used instead.

The index does not provide a unique option because of its limited usability.
It would only prevent identical coordinates from being inserted; even a
slightly different location (like 1 inch or 1 cm off) would be unique again and
not considered a duplicate, although it probably should be. The desired threshold
for detecting duplicates may vary for every project (including how to calculate
the distance) and needs to be implemented on the application layer as
needed. You can write a Foxx service for this purpose and
make use of the AQL geo functions to find nearby
coordinates supported by a geo index.
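
To make the idea of a project-specific threshold concrete, here is a minimal in-memory sketch of a near-duplicate check. It uses the haversine formula on a spherical Earth as a stand-in for a geo function; the function names, the mean radius value and the threshold are assumptions, and a real service would more likely query the geo index through AQL than scan an array.

```javascript
const EARTH_RADIUS_M = 6371008.8; // mean Earth radius; an assumption of this sketch

// Great-circle distance in meters between two [latitude, longitude] pairs.
function haversineMeters([lat1, lng1], [lat2, lng2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLng = toRad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// True if any existing point lies within thresholdMeters of the candidate.
function isNearDuplicate(candidate, existing, thresholdMeters) {
  return existing.some((p) => haversineMeters(candidate, p) <= thresholdMeters);
}
```

With a threshold of 1 meter, for example, a point that is a few centimeters away from a stored one would be reported as a duplicate, which a unique index on exact coordinates could not detect.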