PostGIS Raster Beta Documentation

NOTE: Some parts of this documentation need to be updated.

1 - Introduction

PostGIS Raster is an extension of PostGIS aiming at developing support for raster. It is a new project very different from the previous PGRaster project and also very different from Oracle GeoRaster...

PostGIS Raster's goal is to implement the RASTER type as much as possible like the GEOMETRY type is implemented in PostGIS and to offer a single set of overlay SQL functions (like ST_Intersects) operating seamlessly on vector and raster coverages.

Load your raster data using gdal2wktraster.py and make your queries! In PostGIS Raster...

RASTER is a new column type (like the PostGIS GEOMETRY type)

one table with a column of type raster = one raster coverage (like a one table PostGIS vector coverage)

one table row with a column of type raster = one tile (like a one row PostGIS vector geometry feature)

each raster tile has: a pixel size, a width and a height, a georeference, a variable number of bands, a pixeltype per band and a nodata value per band; everything essential to do basic GIS raster operations.

...is an extension of PostGIS to be installed separately...

installable over any version of PostGIS higher than 1.3.5.

merge with PostGIS might occur in the future...

...has a loader similar to shp2pgsql.exe (gdal2wktraster.py)...

allowing loading of a single raster or a set of rasters (using wildcard) into a tiled coverage.

for efficient overlay analysis operations between vector and raster layers...

...or OUTSIDE the database (as JPEG or TIFF)...

so desktop and web applications can quickly access and load raster tiles and nevertheless benefit from the powerful PostGIS GiST spatial index. Every PostGIS Raster SQL function that works with in-the-db raster tiles works seamlessly with out-the-db raster tiles.

...introduces the concept of raster objects...

geographic features are stored as variable size raster tiles instead of polygons.

allows vector to raster conversion without loss of information.

...is much simpler than PGRaster and Oracle GeoRaster! PostGIS Raster supports...

only one type (instead of two in Oracle Spatial: SDO_GEORASTER & SDO_RASTER). In PostGIS Raster there are no differences between rasters and tiles: a tile is a raster and a raster is a tile. i.e. one row = one tile = one raster; one table = one raster coverage.

no metadata (like PostGIS)

no masks (you can create a mask as a band)

no multiple dimensions (only two: x, y). Not to be confused with bands; PostGIS Raster DOES support multiband rasters...

no pyramids (reduced resolution coverages can be stored as a separate layer)

Now, in case you checked out the code with an SVN client, you'll need to generate the configure script with:

>./autogen.sh

In case you got the last development snapshot, the configure script is packed with the code. The next steps are:

>./configure --with-raster
> make
> sudo make install

You now have PostGIS 2.0 with the PostGIS Raster extension installed on your system.

PostgreSQL provides a utility called pg_config to enable extensions like PostGIS to locate the PostgreSQL installation directory. If ./configure didn't find pg_config, try using the --with-pgconfig=/path/to/pg_config switch to specify a particular PostgreSQL installation.

3 - Using PostGIS Raster

3.1 - The PostGIS Raster Type

Like the PostGIS "geometry" type, the PostGIS "raster" type is a new PostgreSQL type. This means each raster or raster tile is stored as a row of data in a PostgreSQL database table. It is a complex type, embedding information about the raster itself (width, height, number of bands, pixeltype for each band and nodata value for each band) along with its geolocalisation (pixelsize, upper left pixel center, rotation and SRID).

3.1.1 - PostGIS Raster Rationale

PostGIS Raster was designed with many objectives, to accommodate a myriad of dataset structures and a diversity of applications:

Objective 1: Simplicity, Complementarity and Functionality - PostGIS Raster provides a "raster" type that complements the existing PostGIS vectorial "geometry" type. This new extension offers operators and functions that are similar to those available with the existing geometry type. They work in the same user-friendly manner, but are associated with a matricial geospatial data structure to support the use of rasters. PostGIS Raster also includes a simple loader to import many raster formats into the database.

Objective 2: Seamless Integration with the PostGIS geometry type - PostGIS Raster enables PostGIS operators and functions to work seamlessly on both raster and geometry types, so that users:

may use their prior knowledge of PostGIS in general, and of its geometry type operators and functions, when building SQL queries;

expect similar behaviors when using these operators and functions with the raster type without having to consider whether their data are in vectorial or matricial form;

expect existing applications to work with new data loaded as rasters without (many) changes (to a certain conceptual limit).

This makes PostGIS Raster an abstraction level, not only over the raster format (like GDAL) or over the vector format (like OGR), but more generally over the two most used data structures in the geospatial industry (raster AND vector). Even if many operations performed on a vector object (e.g. all the functions working only on LINESTRING geometries) are not really possible to apply to raster objects, and if there are also some raster specific functions (e.g. ST_Resample() or ST_SetBandNoDataValue()) not working on vector objects, we think, and we will demonstrate as we extend PostGIS Raster in the future, that most operations have their equivalent in both raster and vector worlds (e.g. ST_Intersections(), ST_Accum(), ST_Area(), ST_MapAlgebra()), even if this does not appear obvious at first sight.

We think that providing a single paradigm (SQL) for dealing with both rasters and vectors should allow developers to write better GIS applications, by simplifying graphical user interfaces and softening the learning curves. Developers should be able to build a single graphical user interface to deal with raster and vector data, so that users have to learn only one set of operators to work with vector data and raster data, instead of two. We think this should generally enhance users’ experiences with geospatial applications and allow them to focus on real geographic problems rather than struggling with data formats and structures.

We also think that this approach is compliant (at least in philosophy) with the ISO 19123 "Abstract Specification Schema for Coverage Geometry and Functions" specification (not to be confused with the OGC Web Coverage Service) in which a coverage can be represented as a point layer, a polygon layer, a TIN layer or a raster layer, and in which any positional value of the coverage is accessible without having to know the data structure upon which the coverage is based.

do not have to retrieve images from the database but might instead access them directly from the filesystem (e.g. as JPEG files);

can nevertheless use most PostGIS Raster operators and functions on those rasters, transparently.

This is "out-db" storage as opposed to the more natural "in-db" storage.

Objective 4: Interoperability – PostGIS Raster uses GDAL (http://www.gdal.org/) as its main connector to filesystem raster files. GDAL is involved when loading rasters into the database (using the loader) and when working with out-db rasters. It enables PostGIS Raster to work with nearly a hundred different file formats. (For a complete list of supported file formats see http://www.gdal.org/formats_list.html). (GDAL is also used to implement some raster processing functions, but this is another matter.)

Objectives 2 and 3 respond to the most discussed idea that "storing rasters in the database is slower and useless if it does not benefit from specific databases’ features". Objective 2 makes PostGIS Raster an analytical toolkit, going beyond the simple data format. Objective 3 ensures storage flexibility, retrieval efficiency and transparency to users. Both objectives benefit greatly from the indexing of raster data in the database.

We believe that these additional functionalities (over other raster-in-the-database implementations) make PostGIS Raster much more than just a new raster format, but also a necessary complement to PostGIS.

3.1.2 - PostGIS Raster Implementation

To fulfill these objectives, PostGIS Raster implements a minimalistic yet complete raster data structure, and adopts a simple single-type and single-table relational schema. This data structure is very similar to the PostGIS geometry vector type structure, and very different, for example, from the Oracle Spatial SDO_GEORASTER and SDO_RASTER raster types.

A single raster type - In PostGIS Raster there is no such thing as a raster and its raster tiles or an image and its image tiles. Even though single raster attributes may be considered tiles of a coverage or tiles of an image, any attribute of type raster is a complete and self sufficient georeferenced raster. It is not necessarily related to other rasters in the same table, and not necessarily located in a specific grid. This choice makes everything very simple and flexible, letting users and applications decide how they structure their data and how they name their raster elements (rasters, images, bitmap, blocks, tiles, etc…).

A single table relational schema – Similarly, in PostGIS Raster there is not one table for storing the raster data (like Oracle Spatial SDO_RASTER) and another table for storing the georeference and the metadata (like Oracle Spatial SDO_GEORASTER). Everything is stored in a single raster attribute, and raster attributes composing a table are not necessarily related to each other to form a significant coverage.

These choices make the PostGIS raster structure very similar to the existing PostGIS vector structure in which a layer can be geometrically topological or not, depending on the user data and their choices of application.

This also means, contrary to most raster formats, that raster attributes from the same table:

may be of different sizes. E.g. one raster can be 256 pixels wide and 256 pixels high while another can be 64 pixels wide and 64 pixels high.

may "snap" to different grids. E.g. the upper left corner of a raster with a pixel size of 10m can be at 0,0 in a coordinate system with a map unit in meters, and the upper left corner of another raster with the same pixel size in the same coordinate system can be at 0,1.25. This means that pixels of different raster attributes are not necessarily grid aligned.

may overlap, like polygons in a vector layer may overlap. This is fundamental to implementing meaningful vector to raster conversions in which all attributes accompanying a vector feature are conserved in the resulting raster table. E.g. 10 lake vector features with attributes like "name, type, area, perimeter, etc…" are converted to 10 lake raster features with the same attributes. In this case, even if the vector features and the “withdata values” part of the resulting raster features do not necessarily overlap, the "nodata values" of the raster features will overlap.

These characteristics enable easy integration with vectors, but the flexibility also allows the loading of raster assemblages that are not necessarily suitable for geoanalysis. As with vector layers, you are responsible for ensuring that your layer meets the minimum topological requirements for analysis. The ideal case is when all the raster tiles of a continuous layer are the same size, snap to the same grid and do not overlap...
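Whether two rasters snap to the same grid can be checked with simple arithmetic. The following is a minimal sketch in Python, not part of PostGIS Raster: the function name is_grid_aligned and the tolerance value are hypothetical, and it assumes two unrotated rasters sharing the same pixel size in both directions.

```python
def is_grid_aligned(upper_left_a, upper_left_b, pixel_size, tolerance=1e-9):
    """Return True when the offset between the two upper left corners
    is a whole number of pixels along both x and y.

    Assumption: both rasters are unrotated and use the same pixel size.
    """
    for a, b in zip(upper_left_a, upper_left_b):
        offset_in_pixels = (b - a) / pixel_size
        if abs(offset_in_pixels - round(offset_in_pixels)) > tolerance:
            return False
    return True


# The example from the text: 10 m pixels, corners at 0,0 and 0,1.25.
misaligned = is_grid_aligned((0.0, 0.0), (0.0, 1.25), 10.0)
# A second raster shifted by exactly 2 pixels in x and 3 pixels in y.
aligned = is_grid_aligned((0.0, 0.0), (20.0, -30.0), 10.0)
print(misaligned, aligned)
```

With the values from the text, the y offset is 1.25 / 10 = 0.125 pixel, so the two rasters are not grid aligned.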

Hence, a single attribute of type raster can store:

a "complete image" (e.g. Landsat),

an image "tile",

a "raster object"; a new kind of raster object introduced by PostGIS Raster resulting from the rasterization of a vector feature.

All these objects are different terminologies and concepts related to the general concept of "raster". We will use the latter term throughout the rest of this documentation.

Similarly, a table with a column of type raster may be seen, depending on the structure and the relationships between the raster objects it contains, as storing one of the arrangements listed below. Table 1 provides a summary of the characteristics inherent to each arrangement, and figure 1 provides a corresponding graphical representation of each one.

a) an image warehouse of untiled and (possibly) unrelated images. These images may or may not overlap since every image has its own georeference. They may also have no georeference at all.

b) an irregularly tiled raster coverage. It might not necessarily be rectangular, there might be some missing tiles, and they might be different sizes. Tiles should not overlap.

c) a regularly tiled raster coverage. It might not necessarily be rectangular, there might be some missing tiles, and the tiles should be the same size. Tiles should not overlap.

d) a rectangular regularly tiled raster coverage. It is necessarily rectangular, there are no missing tiles, they are all the same size and they do not overlap.

e) a tiled image. It is necessarily rectangular, there are no missing tiles, they are all the same size, and they do not overlap. This type is different from type d) in that it does not represent a complete coverage; other images forming the rest of the coverage are stored as other tables of tiled images. This structure is not very practical from a GIS analytical point of view since any operations applied to the coverage must also be applied to every table.

f) a raster object coverage resulting from the rasterization of a vector coverage. Each vector feature becomes a small raster with the same extent as the original vector feature. This type of coverage is not necessarily complete, nor rectangular; tiles should be of different sizes and might overlap. It all depends on the characteristics of the vector layer being rasterized. An exhaustive (or continuous) layer of mutually exclusive geometries (without gaps or holes like a forest cover) would result in a raster object coverage in which significant pixels (withdata values) would not overlap, but non-significant pixels (nodata values) would. On the other end of the spectrum, a discontinuous layer of mutually exclusive geometries (like a lake or building layer) would result in a coverage of mostly disjoint raster objects.

All these arrangements are possible, and as for a geometry layer which can be implicitly topological or not, PostGIS Raster does not impose one over the other (even though types c), d) and f) are the most practical for a GIS analyst). The fact that users of a database might (contrary to a raster file format) add or delete rows (or “tiles”) in a table, along with the fact that we must support variable-sized tiles (for vector to raster conversion), makes it very difficult to enforce a certain type of configuration.

Regularly Blocked Table

You can inform PostGIS that the raster layer you are loading meets certain useful criteria by adding the -k option to the gdal2wktraster.py Python loader. This will set the "regular_blocking" attribute of the raster_columns table to true. It implies that:

All loaded tiles have the same width and height,

No tiles overlap, and their upper left corners follow a regular block grid,

The global extent of the layer is rectangular and not rotated.

Some blocks (or tiles) can be missing in a regularly blocked table. Missing tiles are assumed to be filled with the proper nodata value for each band as determined in the raster_columns table.

This regular tiling (or regular blocking) is expected from arrangements c), d) and e) in figure 1. PostGIS Raster provides this mechanism because raster applications are often optimized to deal with these arrangements. There is, however, no mechanism to enforce (constrain) these criteria and, as mentioned above, adding, modifying or deleting a row from the table might break this regular blocking, and the regular_blocking attribute will not be automatically updated.
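Since the regular_blocking attribute is not updated automatically, an application may want to verify the criteria itself. The following Python sketch illustrates one way to do so under stated assumptions: the Tile class and the is_regularly_blocked function are hypothetical names, tiles are assumed unrotated, and the first tile is used as the origin of the block grid. It is not how PostGIS Raster checks anything internally.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    """Hypothetical stand-in for one raster row (unrotated)."""
    upperleftx: float
    upperlefty: float
    width: int   # in pixels
    height: int  # in pixels


def is_regularly_blocked(tiles, pixelsize_x, pixelsize_y, tolerance=1e-9):
    """Check that all tiles share one size, sit on a regular block grid
    and never occupy the same block. Missing blocks are allowed."""
    if not tiles:
        return True
    first = tiles[0]
    block_width = first.width * abs(pixelsize_x)
    block_height = first.height * abs(pixelsize_y)
    occupied = set()
    for tile in tiles:
        # Criterion 1: same width and height for every tile.
        if (tile.width, tile.height) != (first.width, first.height):
            return False
        # Criterion 2: upper left corners follow the block grid.
        col = (tile.upperleftx - first.upperleftx) / block_width
        row = (tile.upperlefty - first.upperlefty) / block_height
        if abs(col - round(col)) > tolerance or abs(row - round(row)) > tolerance:
            return False
        # Same-size tiles on the grid overlap only if they share a block.
        cell = (round(col), round(row))
        if cell in occupied:
            return False
        occupied.add(cell)
    return True


# Three 256x256 tiles with 10 m pixels; one block of the 2x2 grid is missing.
regular = [Tile(0, 0, 256, 256), Tile(2560, 0, 256, 256), Tile(0, -2560, 256, 256)]
# Adding a tile shifted by half a block breaks the regular blocking.
broken = regular + [Tile(1280, 0, 256, 256)]
print(is_regularly_blocked(regular, 10.0, -10.0))
print(is_regularly_blocked(broken, 10.0, -10.0))
```

Because rotation is not modelled here, the third criterion (a rectangular, unrotated global extent) is taken for granted rather than tested.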

Structure of the "raster" Type

Like the geometry type, the raster type is a complex PostgreSQL type composed of many attributes accessible using various functions. A raster attribute may be composed of many bands sharing a common size, pixel size and georeference. Each band has a pixel type and may have a nodata value.

Table 2 summarizes the composition of a raster type.

Table 2 - Structure of a raster type

| Attribute | Description | Storage | Accessible with function... |
|---|---|---|---|
| version | WKB format version (now 0). | unsigned 16 bit integer | Not accessible. |
| number of bands | Number of raster bands stored in the raster. | unsigned 16 bit integer | ST_NumBands() |
| width | Width of the raster. | unsigned 16 bit integer | ST_Width() |
| height | Height of the raster. | unsigned 16 bit integer | ST_Height() |
| Georeference information | | | |
| pixelsizex | Pixel size in the x-direction in the same map units as the coordinate system. Same as parameter A in a world file. | 64 bit double | ST_PixelSizeX() |
| pixelsizey | Pixel size in the y-direction in the same map units as the coordinate system. Same as parameter E in a world file. | 64 bit double | ST_PixelSizeY() |
| upperleftx | X-coordinate of the center of the upper left pixel. Same as parameter C in a world file. | 64 bit double | ST_UpperLeftX() |
| upperlefty | Y-coordinate of the center of the upper left pixel. Same as parameter F in a world file. | 64 bit double | ST_UpperLeftY() |
| rotationx | Rotation about x-axis. Same as parameter B in a world file. | 64 bit double | ST_RotationX() |
| rotationy | Rotation about y-axis. Same as parameter D in a world file. | 64 bit double | ST_RotationY() |
| srid | Spatial reference id. | 32 bit integer | ST_SRID() |
| Band information (one set per band) | | | |
| isoffline | Flag specifying if the band data is stored in the database or as a file in the file system. | 1 bit | If ST_BandPath() returns an empty string, the data is stored in the database. |
| hasnodatavalue | Flag specifying if the stored nodatavalue is significant or not. | 1 bit | ST_BandHasNoDataValue() |
| pixeltype | Pixel type for this band. | 4 bits | ST_BandPixelType() |
| nodatavalue | Nodata value for the band. | Depends on band pixel type. | ST_BandNodataValue() |
| Band data (one set per band) for in-db raster | | | |
| values[] | Value of each pixel. | Depends on band pixel type. | ST_Value() |
| Band data (one set per band) for out-db raster | | | |
| bandnumber | Number of the out-db band. | unsigned 8 bit integer | Not accessible yet. |
| path | Path to the out-db raster file. | string | ST_BandPath() |

Min and max values for:

"unsigned 8 bits integer" are 0 and 255.

"unsigned 16 bits integer" are 0 and 65535.

"32 bits integer" are -2 147 483 648 and 2 147 483 647.

"64 bits double" are -1.7*10^308 and 1.7*10^308.
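The georeference attributes of Table 2 correspond to the six world file parameters A to F, which together define an affine transformation from pixel positions to map coordinates. The Python sketch below illustrates that relationship using the standard world file equations; the Georeference class, its method name and the sample values are hypothetical, and the negative pixelsizey for a north-up raster follows the usual world file convention rather than anything stated in this document.

```python
from dataclasses import dataclass


@dataclass
class Georeference:
    """Hypothetical container for the georeference attributes of Table 2."""
    pixelsizex: float  # world file parameter A
    rotationx: float   # world file parameter B
    upperleftx: float  # world file parameter C
    rotationy: float   # world file parameter D
    pixelsizey: float  # world file parameter E
    upperlefty: float  # world file parameter F

    def pixel_center_to_map(self, col, row):
        """Map coordinates of the center of pixel (col, row), zero-based,
        using the world file equations x = A*col + B*row + C and
        y = D*col + E*row + F."""
        x = self.pixelsizex * col + self.rotationx * row + self.upperleftx
        y = self.rotationy * col + self.pixelsizey * row + self.upperlefty
        return x, y


# Sample values: 10 m pixels, no rotation, north-up (negative pixelsizey).
georef = Georeference(
    pixelsizex=10.0, rotationx=0.0, upperleftx=500000.0,
    rotationy=0.0, pixelsizey=-10.0, upperlefty=4000000.0,
)
origin = georef.pixel_center_to_map(0, 0)
other = georef.pixel_center_to_map(3, 2)
print(origin, other)
```

Pixel (0, 0) maps to the upperleftx, upperlefty pair itself, which matches the table's description of those attributes as the center of the upper left pixel.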

PostGIS Raster is expressed in different forms, depending on the level at which it is referred to:

WKT - "Well Known Text" refers to the human readable text format used when inserting a raster with ST_RasterFromText() (not yet implemented) and retrieving a raster with ST_AsText() (not yet implemented). This format can result in the loss of precision when used with floating point values. This is why the HEXWKB form is preferred when importing/exporting in textual form.

WKB - "Well Known Binary" refers to the binary equivalent of the WKT. It is used when inserting a raster using ST_RasterFromWKB() and retrieving a raster using ST_AsBinary().

HEXWKB - "Hexadecimal WKB" is an exact hexadecimal representation of the WKB form. It is also called the "canonical" form. It is what you get from the loader (gdal2wktraster.py), what is accepted by the raster type input function (ST_Raster_In), and what you get when outputting the value of a raster field without conversion (e.g. SELECT rast FROM table).

Serialized - The "serialized" format is what is written to the filesystem by the database. It differs from WKB in that it does not have to store the endianness of the data and that it must be aligned. Serializing is the action of writing data to the database file, deserializing is the action of reading this data.
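The reason a hexadecimal form is preferred over plain text for floating point values can be illustrated without any database. The Python sketch below is only an illustration: the six-decimal text format is an arbitrary assumption and not the actual WKT output, and the eight bytes packed here are a single double, not a complete raster WKB.

```python
import struct

# A double whose exact value needs more than six decimals to be written out.
value = 0.1 + 0.2

# Text form with a fixed number of decimals (assumed format): digits are lost.
as_text = "%.6f" % value
from_text = float(as_text)

# Hexadecimal form of the 8 raw bytes of the double: nothing is lost.
as_hex = struct.pack("<d", value).hex()
from_hex = struct.unpack("<d", bytes.fromhex(as_hex))[0]

print(as_text, from_text == value)
print(as_hex, from_hex == value)
```

The text round trip yields a different number than the one that was written, while the hexadecimal round trip restores the original bits exactly.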

The raster_columns Table

Like the PostGIS "geometry_columns" table, the PostGIS Raster "raster_columns" table is a way for applications to get a quick overview of which tables have a raster column, and to get the main characteristics (metadata) of the rasters stored in these columns. Applications should maintain this table using the AddRasterColumn() and DropRasterColumn() functions, as there is no automatic mechanism in the database to keep this table up to date with every raster column created or deleted.

Table 3 summarizes the attributes of the raster_columns table.

| Attribute | PostgreSQL Type | Description |
|---|---|---|
| r_table_catalog | character varying(256) | Name of the catalog containing the table. This attribute exists only to follow the geometry_columns table schema. It is generally left empty. |
| r_table_schema | character varying(256) | Name of the schema containing the table. Generally equal to “public”. |
| r_table_name | character varying(256) | Name of the table containing a column of type raster. |
| r_column | character varying(256) | Name of the raster column in the table. |
| srid | integer | ID of the spatial reference system used for the raster column in this table. It is a foreign key reference to the PostGIS spatial_ref_sys table. |
| | | Array of flags for in-db or out-db storage; one per band. True when raster data is not stored in the database but resides as files in the filesystem. Paths to these files (one per row) are determined by ST_BandPath(). Index for first band is 1. |
| regular_blocking | boolean | Flag specifying that the regular blocking constraint is enforced for this table. |
| nodata_values | array[double] | Array of nodata values; one for each band. Each nodata value is stored as a double (even when the pixel type is integer). Index for first band is 1. |
| pixelsize_x | double | Pixel size of the raster in the x-direction. In the units of the coordinate system determined by srid. |
| pixelsize_y | double | Pixel size of the raster in the y-direction. In the units of the coordinate system determined by srid. |
| blocksize_x | integer | Width of a block of raster data when "regular_blocking" is set to true. NULL otherwise. |
| blocksize_y | integer | Height of a block of raster data when "regular_blocking" is set to true. NULL otherwise. |
| extent | geometry | Polygon geometry encompassing all raster rows of this table. NULL if predefined bounds are not known. When "regular_blocking" is set to true this polygon is a simple rectangle. In other cases it might be an irregular polygon. |

3.1.3 - Storing in-db raster or registering out-db raster?

One of the major PostGIS Raster features is the ability to store raster data in the database or merely register them as external files residing in the filesystem. When registering a raster, only the raster metadata are stored in the database (width, height, number of bands, georeference and path to the actual raster file), not the values associated with the pixels. Registering is done via the loader’s -R option.

>raster2pgsql.py -r c:/imagesets/landsat/image01.tif -t landsat -R

Out-db raster is targeted at read-only applications. It has the following advantages:

Access Speed - Speeds up the reading (and transmission) of web-ready raster files (JPEG, GIF, PNG) for web applications as there is no need to convert them from the PostGIS raster type to a web friendly format.

Seamless SQL Operators and Functions – All the operators and functions (except those involving write operations to the raster like ST_SetValue()) work seamlessly with out-db rasters.

Simplified Backup – Rasters residing in the filesystem can be backed up once and for all since they are not expected to be edited.

Faster Database Loading – Initial table creation is faster since data do not have to be loaded/copied in the database.

In-db raster is targeted at editing and analysis applications. It has the following advantages:

Fast Analysis – Analysis operations involving pixel per pixel interpretation (ST_AsPolygon(), ST_Intersections(), etc…) are faster on in-db rasters since there is no need to read and extract the actual data from a JPEG or a TIFF format.

Single storage solution – Provides a single storage solution for vector and raster datasets. There is no need to back up the raster datasets separately from the vector datasets.

To get full information about the loader syntax. It creates a SQL file, ready to be loaded in your database. The database must have been enabled with PostGIS and PostGIS Raster. Check section 2.3.4 to know how to create and enable a new PostgreSQL database.

Some basic examples:

Generate the SQL output (as a mytable.sql file) to load a TIF file into the database. The file is loaded into a one-row table. The table is previously created.

python raster2pgsql.py -r myfile.tif -t mytable -o mytable.sql

Generate the SQL output (as a mytable.sql file) to load a TIF file into the database and georeference it using SRID 4326. The file is loaded into a one-row table. The table is previously created.

Generate the SQL output (as a mytable.sql file) to load a TIF file into the database, georeference it using SRID 4326, split the file into tiles of 40x20 px (one per row) and create a GiST index over the raster column. The number of rows depends on the raster size, in pixels. The table is previously created.
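The dependence of the number of rows on the raster size can be estimated before loading. The Python sketch below is only an estimate under stated assumptions: the tile_count function is a hypothetical helper, the raster dimensions are made up, and it assumes that partially filled tiles along the right and bottom edges still produce one row each, which may not match the loader's actual behavior.

```python
import math


def tile_count(raster_width, raster_height, tile_width, tile_height):
    """Number of tiles (table rows) needed to cover a raster, counting
    partially filled edge tiles as full rows (assumption)."""
    columns = math.ceil(raster_width / tile_width)
    rows = math.ceil(raster_height / tile_height)
    return columns * rows


# A hypothetical 1000x500 px raster split into 40x20 px tiles.
exact_fit = tile_count(1000, 500, 40, 20)
# Ten extra pixels in width add one more column of tiles.
with_partial_column = tile_count(1010, 500, 40, 20)
print(exact_fit, with_partial_column)
```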