Wednesday, August 21, 2013

PostGIS 2.x (latest release, 2.1) enables users to do fairly sophisticated raster processing directly in a database. For many applications, these data can stay in the database; it's the insight into spatial phenomena that comes out. Sometimes, however, you need to get file data (e.g. a GeoTIFF) out of PostGIS. It isn't immediately obvious how to do this efficiently, despite the number of helpful functions that serialize a raster field to Well-Known Binary (WKB) or other "flat" formats.

Background

In particular, I recently needed to create a web service that delivers PostGIS raster outputs as file data. The queries that we needed to support were well suited for PostGIS and sometimes one query would consume another (one or more) as subquer(ies). These and other considerations led me to decide to implement the service layer in Python using either GeoDjango or GeoAlchemy. More on that later. Suffice to say, a robust and stable solution for exporting and attaching file data from PostGIS to an HTTP response was needed. I found at least six (6) different ways of doing this; there may be more:

Export an ASCII Grid

This works great! Because an ASCII grid file (or "ESRI Grid" file, with the *.asc or *.grd extension, typically) is just plain text, you can directly export it from the database. The GDAL driver name is "AAIGrid" which should be the second argument to ST_AsGDALRaster(). Be sure to remove the column header from your export (see image below). However, what you get is a file that has no projection information that you may need to convert to another format. This can present problems for your workflow, especially if you're trying to automate the production of raster files, say, through a web API.

Connect Using the QGIS Desktop Client

There is a plug-in for QGIS that promises to allow you to load raster data from PostGIS directly into a QGIS workspace. I used the Plugins Manager ("Plugins" > "Fetch Python Plugins...") in QGIS to get this plug-in package. The first time I selected the "Load PostGIS Raster to QGIS" plug-in and tried to install it, I found that I couldn't write to the plug-ins directory (this with a relatively fresh installation of QGIS). After creating and setting myself as the owner of the python/plugins directory, I was able to install the plug-in without any further trouble. Connecting to the database and viewing the available relations was also no trouble at all. One minor irritation is that you need to enter your password every time the plug-in interfaces with the database, which can be very often, at every time the list of available relations needs to be updated.

You'll be doing this a lot.

There are a few options available to you in displaying raster data from the database: "Read table's vector representation," "Read one table as a raster," "Read one row as a raster," or "Read the dataset as a raster." It's not clear what the second and last choices are, but "Reading the table as a raster" did not work for me where my table has one raster field and a couple of non-raster, non-geometry/geography fields; QGIS hung for a few seconds then said it "Could not load PG..." Reading one row worked, however, you have to select the row by its primary key (or row number in a random selection, not sure which it is returning). This may not be the easiest way for you to find a particular raster image in a table.

Using the COPY Declaration in SQL

My colleague suggested this method, demonstrated in Python, which requires the pygresql module to be installed; easy enough with pip:

pip install psycopg2 pygresql

The basic idea is to use the COPY declaration in SQL to export the raster to a hexadecimal file, then to convert that file to a binary file using xxd:

This needs to be done on the file system of the database server, which is where PostgreSQL will write.

Using an Output Function and Serializing from a Byte Array

Despite the seeming complexity of this option (then again, compare it to the above), I think it is the most flexible approach. I'll provide two examples here, with code: using GeoDjango to execute a raw query and using GeoAlchemy2's object-relational model to execute the query. Finally, I'll show an example of writing the output to a file or to a Django HttpResponse() instance.

Using GeoDjango

First, some setup. We'll define a RasterQuery class to help with handling the details. While a new class isn't exactly an idiomatic example, I'm hoping it will succinctly illustrate the considerations involved in using performing raw SQL queries with Django.

Seem simple enough? To write to a file instead, see the write_all() method of the RasterQuery class. The query.fetch_all()[0] at the end is contrived. I'll show a better way of getting to a nested buffer in the next example.

Here we see a better way of getting at a nested buffer. If we wanted all of the rasters that were returned (all of the buffers), we could call ST_Union on our final raster selection before passing it to ST_AsGDALRaster.

After considering all my (apparent) options, I found this last technique, using the PostGIS raster output function(s) and writing the byte array to a file-attachment in an HTTP response, to be best suited for my application. I'd be interested in hearing about other techniques not described here.

Follow by Email

Blog Feeds

About the Authors

These articles are authored by members of AmericaView, a nationwide program that focuses on satellite remote sensing data and technologies in support of applied research, K-16 education, workforce development, and technology transfer. If you have comments or questions, please contact us.

If you are affiliated with AmericaView and would like to contribute to the blog, read this post and follow the instructions.