Data can be stored in different formats. CrateDB has different types that can
be specified when a table is created using the CREATE TABLE
statement. Data types play a central role: they limit what kind of data can
be inserted and how it is stored, and they influence the behaviour when
records are queried.

Data type names are reserved words and need to be escaped when used as column
names.

The float and double data types are inexact, variable-precision numeric
types. This means that values of these types are stored as approximations.
Therefore, storage, calculation, and retrieval of a value will not always
result in an exact representation of the actual value.

For instance, the result of applying the sum or avg aggregate functions may
vary slightly between query executions, and comparing floating-point values
for equality might not always give the expected result.
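The effect is not specific to CrateDB; it follows from binary floating-point arithmetic in general. The following is a minimal Python sketch (Python floats are IEEE 754 doubles on common platforms) showing why equality checks and the order of summation matter. The variable names are illustrative only.

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation, so the
# computed sum differs from the literal 0.3 in the last bits.
total = 0.1 + 0.2
exact_match = (total == 0.3)
close_match = math.isclose(total, 0.3, rel_tol=1e-9)

# Floating-point addition is not associative: combining the same
# four values in a different order gives a different result.
order_a = ((1e16 + 1.0) - 1e16) + 1.0
order_b = ((1e16 - 1e16) + 1.0) + 1.0

print(exact_match, close_match)  # False True
print(order_a, order_b)          # 1.0 2.0
```

When comparing floating-point results, a tolerance-based comparison such as the one above is usually safer than strict equality.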

All types have the same ranges as the corresponding Java types. You can insert
any number for any type, be it a float, integer, or byte, as long as
it is within the corresponding range.
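As a rough illustration, the ranges of the Java integer primitives can be written down and checked as follows. This is a sketch in Python, not CrateDB code; the `fits` helper and the type-name keys are assumptions made for the example.

```python
# Ranges of the Java primitive integer types (two's complement).
# The keys are illustrative names for the example only.
JAVA_INTEGER_RANGES = {
    "byte": (-2**7, 2**7 - 1),       # -128 .. 127
    "short": (-2**15, 2**15 - 1),    # -32768 .. 32767
    "integer": (-2**31, 2**31 - 1),  # -2147483648 .. 2147483647
    "long": (-2**63, 2**63 - 1),
}


def fits(type_name: str, value: int) -> bool:
    """Return True if value lies within the range of the named type."""
    low, high = JAVA_INTEGER_RANGES[type_name]
    return low <= value <= high


print(fits("byte", 127))   # True
print(fits("byte", 128))   # False
```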

CrateDB conforms to the IEEE 754 standard concerning special values for
floating point types (float, double). This means that it also
supports NaN, Infinity, -Infinity (negative infinity), and -0
(signed zero).
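The behaviour of these special values is defined by IEEE 754 itself, so it can be explored in any conforming environment. The short Python sketch below demonstrates the general rules; it does not query CrateDB.

```python
import math

nan = float("nan")
pos_inf = float("inf")
neg_inf = float("-inf")
neg_zero = -0.0

# NaN is unordered: it is not equal to anything, including itself.
nan_equals_itself = (nan == nan)

# Negative zero compares equal to positive zero but keeps its sign bit.
zeros_equal = (neg_zero == 0.0)
neg_zero_sign = math.copysign(1.0, neg_zero)

# Multiplying beyond the largest finite double overflows to infinity.
overflowed = 1e308 * 10

print(nan_equals_itself, math.isnan(nan))  # False True
print(zeros_equal, neg_zero_sign)          # True -1.0
print(overflowed == pos_inf, neg_inf < pos_inf)  # True True
```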

The timestamp type is a special type which maps to a formatted string.
Internally, it maps to the UTC milliseconds since 1970-01-01T00:00:00Z, stored
as a long. Timestamps are always returned as long values.
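To make the mapping concrete, the sketch below converts an ISO 8601 string to UTC milliseconds since the epoch in Python. The function name `to_epoch_millis` is hypothetical, and treating strings without an offset as UTC is an assumption of this example, not a statement about CrateDB's parser.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(iso_string: str) -> int:
    """Convert an ISO 8601 timestamp string to UTC milliseconds since epoch."""
    # Older Python versions do not accept a trailing "Z", so normalise it.
    dt = datetime.fromisoformat(iso_string.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        # Assumption for this sketch: strings without an offset are UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return (dt - EPOCH) // timedelta(milliseconds=1)


print(to_epoch_millis("1970-01-01T00:00:00Z"))  # 0
print(to_epoch_millis("2000-01-01T00:00:00Z"))  # 946684800000
```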

distance_error_pct:

(Default: 0.025 (2.5%)) The measure of acceptable
error for shapes stored in this column, expressed as a
percentage value of the shape size. The allowed maximum is
0.5 (50%).

The percentage is taken from the diagonal distance
from the center of the bounding box enclosing the shape to
the closest corner of that box. In effect, bigger
shapes are indexed with lower precision than smaller
shapes. The ratio of precision loss is determined by this
setting: the higher the distance_error_pct,
the lower the indexing precision.

This has the effect of enlarging the indexed shape
internally, so, for example, points that are not exactly inside
the shape will end up inside it at query time,
because the shape grew when it was indexed.
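The relationship between shape size and tolerated error can be approximated with a small calculation. The following Python sketch works in plain planar coordinates and is only an illustration of the rule described above; the function name `allowed_error` is hypothetical, and the actual computation inside the index implementation may differ.

```python
import math


def allowed_error(min_x, min_y, max_x, max_y, distance_error_pct=0.025):
    """Approximate the tolerated error for a shape's bounding box.

    The error is the given fraction of the distance from the centre of
    the bounding box to one of its corners (planar approximation).
    """
    if not 0.0 <= distance_error_pct <= 0.5:
        raise ValueError("distance_error_pct must be between 0 and 0.5")
    center_x = (min_x + max_x) / 2
    center_y = (min_y + max_y) / 2
    center_to_corner = math.hypot(max_x - center_x, max_y - center_y)
    return distance_error_pct * center_to_corner


small = allowed_error(0, 0, 6, 8)    # centre-to-corner distance 5
large = allowed_error(0, 0, 60, 80)  # centre-to-corner distance 50
print(small < large)  # True: bigger shapes tolerate a bigger error
```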

tree_levels:

Maximum number of layers to be used by the PrefixTree defined
by the index type (either geohash or quadtree, see
Geo Shape Index Structure). This can be used to control
the precision of the index. Since this parameter requires a
certain level of understanding of the underlying implementation,
users may use the precision parameter instead. CrateDB uses
the tree_levels parameter internally, and this is what is
returned by the SHOW CREATE TABLE statement even if you use
the precision parameter. Defaults to the value equivalent to a
precision of 50m, converted according to the index type.

Computations on very complex polygons and geometry collections are exact but
very expensive. To provide fast queries even on complex shapes, CrateDB uses a
different approach to store, analyze and query geo shapes.

The surface of the earth is represented as a number of grid layers each with
higher precision. While the upper layer has one grid cell, the layer below
contains many cells for the equivalent space.

Each grid cell on each layer is addressed in 2d space either by a Geohash
for geohash trees or by tightly packed coordinates in a Quadtree. Those
addresses conveniently share the same address-prefix between lower layers and
upper layers. So we are able to use a Trie to represent the grids, and
Tries can be queried efficiently as their complexity is determined by the
tree depth only.
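The shared-prefix property can be seen with a compact implementation of the standard public geohash encoding. This is a self-contained Python sketch for illustration; it is not the code CrateDB uses, and the function name `geohash` is chosen for the example.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat: float, lon: float, length: int) -> str:
    """Encode a coordinate as a geohash string of the given length."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # five bits make one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


coarse = geohash(57.64911, 10.40744, 2)
fine = geohash(57.64911, 10.40744, 8)
print(coarse)                   # u4
print(fine.startswith(coarse))  # True
```

Because each additional character only refines the cell named by the preceding characters, every cell on a lower layer carries the address of its parent cell as a prefix, which is what makes a trie a natural fit.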

A geo shape is transformed into these grid cells. Think of this transformation
process as dissecting a vector image into its pixelated counterpart, reasonably
accurately. We end up with multiple images each with a better resolution, up to
the configured precision.

Every grid cell that is processed up to the configured precision is stored in
an inverted index, creating a mapping from a grid cell to all shapes that touch
it. This mapping is our geographic index.
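The idea of mapping grid cells to the shapes that touch them can be sketched with a toy model. The Python example below simplifies heavily: the world is the unit square, shapes are axis-aligned boxes, and cells are addressed by `(level, column, row)` tuples instead of geohash or quadtree addresses. All names are hypothetical.

```python
from collections import defaultdict


def cells_touching(box, level):
    """Return the grid cells at the given level that a box touches."""
    min_x, min_y, max_x, max_y = box
    n = 2 ** level  # the grid at this level has n x n cells

    def cell_index(coord):
        return max(0, min(n - 1, int(coord * n)))

    return {
        (level, col, row)
        for col in range(cell_index(min_x), cell_index(max_x) + 1)
        for row in range(cell_index(min_y), cell_index(max_y) + 1)
    }


def build_index(shapes, max_level):
    """Map every touched cell on every level to the names of the shapes."""
    index = defaultdict(set)
    for name, box in shapes.items():
        for level in range(max_level + 1):
            for cell in cells_touching(box, level):
                index[cell].add(name)
    return index


def candidates(index, x, y, level):
    """Look up the shapes registered for the cell containing a point."""
    n = 2 ** level
    cell = (level, min(n - 1, int(x * n)), min(n - 1, int(y * n)))
    return index.get(cell, set())


shapes = {"a": (0.1, 0.1, 0.2, 0.2), "b": (0.6, 0.6, 0.9, 0.9)}
index = build_index(shapes, max_level=3)
print(candidates(index, 0.15, 0.15, 3))  # {'a'}
print(candidates(index, 0.05, 0.05, 3))  # {'a'}
```

The second lookup returns "a" even though the point lies outside the box, because the point falls into a cell that the box touches. This is the loss of precision that comes with a coarse grid, and it shrinks as more levels are indexed.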

The main difference between the two index types is that the geohash tree
supports higher precision than the quadtree. Both tree implementations support
precision on the order of fractions of millimeters.

The column policy can be configured to be strict, rejecting any subcolumn
that is not defined upfront in the schema. As you might have guessed, defining
strict objects without subcolumns results in an unusable column that will
always be null, which is the most useless column one could create.

Another option is dynamic, which means that new subcolumns can be added to this object.

Note that adding new columns to an object with a dynamic policy will affect
the schema of the table. Once a column is added, it shows up in the
information_schema.columns table and its type and attributes are fixed.
Such columns have the type that was guessed from the inserted or updated value,
and they are always not_indexed, which means they are analyzed with the
plain analyzer, that is, as-is.

If a new column a was added with type integer, adding strings to this
column will result in an error.

The third option is ignored, which results in an object that allows
inserting new subcolumns without affecting the schema. The new subcolumns are
not mapped according to their type, so their type is not guessed either. You
can in fact add a value of any type to an added column of the same name:
unlike with dynamic objects, the first value added does not determine what
you can add afterwards.

An object configured like this will simply accept and return the columns
inserted into it, but otherwise ignore them.

Ignored objects should be mainly used for storing and fetching.
Filtering by and ordering on them is possible but very performance
intensive. Ignored objects are a black box for the storage engine, so
the filtering/ordering is done using an expensive table scan and a
filter/order function outside of the storage engine.
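The three column policies described above can be compared side by side with a toy model. The Python class below is a simplified illustration of the rules as stated in this section, not CrateDB's implementation; `ToyObjectColumn` and its methods are hypothetical, and Python types stand in for column types.

```python
class ToyObjectColumn:
    """Toy model of an object column with a column policy."""

    def __init__(self, policy, declared=None):
        if policy not in ("strict", "dynamic", "ignored"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        # Maps subcolumn name to the Python type standing in for its type.
        self.schema = dict(declared or {})

    def insert(self, value: dict) -> dict:
        for key, item in value.items():
            if key in self.schema:
                # Known subcolumns keep their fixed type.
                if not isinstance(item, self.schema[key]):
                    raise TypeError(f"wrong type for subcolumn {key!r}")
            elif self.policy == "strict":
                raise KeyError(f"subcolumn {key!r} is not defined")
            elif self.policy == "dynamic":
                # The first value determines the type from now on.
                self.schema[key] = type(item)
            # "ignored": accept the value without touching the schema.
        return value


dynamic = ToyObjectColumn("dynamic")
dynamic.insert({"a": 1})
print(dynamic.schema)  # {'a': <class 'int'>}

ignored = ToyObjectColumn("ignored")
ignored.insert({"a": 1})
ignored.insert({"a": "text"})  # accepted, since no type was recorded
print(ignored.schema)  # {}
```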

Arrays can be written using the array constructor ARRAY[] or the short form [].
The array constructor is an expression that accepts both literals and
expressions as its parameters. Parameters may contain zero or more elements.

Synopsis:

[ ARRAY ] '[' element [ , ... ] ']'

All array elements must have the same data type, which determines the inner
type of the array. If an array contains no elements, its element type will be
inferred by the context in which it occurs, if possible.
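The typing rule for arrays can be expressed as a small function. The Python sketch below is only a model of the rule stated above, with Python types standing in for column types; `infer_element_type` and its `context_type` parameter are hypothetical names.

```python
def infer_element_type(elements, context_type=None):
    """Return the common element type, or fall back to the context."""
    types = {type(element) for element in elements}
    if len(types) > 1:
        raise TypeError("all array elements must have the same data type")
    if types:
        return types.pop()
    # Empty array: the type can only come from the surrounding context,
    # and stays unknown (None) if there is none.
    return context_type


print(infer_element_type([1, 2, 3]))  # <class 'int'>
print(infer_element_type([], str))    # <class 'str'>
print(infer_element_type([]))         # None
```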