lundi 5 décembre 2011

The USGS has recently released GMTED2010, a new dataset of elevation data, available in resolutions of 30, 15 and 7.5 arc-seconds. A look at the download page shows that the dataset is delivered as tiles whose dimensions are 30° of longitude x 20° of latitude.

There are also global grids available, but due to the fact they are contained in a .zip file, seeking to arbitrary offsets will undoubtedly be very slow. The content of the be75_grd.zip file can be obtained with the new gdal_ls.py sample script and a combination of /vsizip/ and /vsicurl/ virtual filesystem prefixes :

With the build_gmted.py Python script leveraging on GDAL Python and utilities, we create a GDAL virtual dataset that merges all the tiles into global datasets, so that we can access them seamlessly. Of course, we also take advantage of the multi-resolution nature of the dataset to use the tiles of resolutions 30 and 15 arc-seconds as overviews that can speed-up sub-sampling processing. The script currently uses the breakline emphasis layer of the GMTED dataset, but can easily be customized to use other layers (minimum, maximum, mean, medium).

For best performance, the script and the following operations must be used with the latest GDAL development version (1.9.0dev, >= r23468) to benefit from new optimizations.

$ python build_gmted.py

After a few minutes, the script has created a gmted subdirectory that contains several VRT files : one for each tile at each resolution, and 3 global ones : gmted/all075.vrt, gmted/all150.vrt and gmted/all300.vrt.

Let's have a look at the content of gmted/all075.vrt with the gdalinfo utility :

The global dataset covers 172 200 x 86 400 elevation points, i.e. 29.8 GB of data. all075.vrt also references all150.vrt and all300.vrt as overviews. And we can verify that it is consistent with the global grid provided by USGS (except that our VRT extends more towards the poles).

(The white strip between Patagonia and Antarctica is filled with no-data values)

2 options (new in GDAL 1.9.0dev) have been used to speed up access :

The CPL_VSIL_CURL_ALLOWED_EXTENSIONS option is set to ".tif" so that only TIF files are queried from the server, and no other auxiliary files (.aux.xml, .ovr, .msk, etc..)

The GTIFF_DIRECT_IO option is set to YES so that the minimum number of bytes are fetched from the remote files. Partial TIFF strips can be read. It also takes advantage of the ability of the HTTP server to return multi-range of data at once, minimizing client-server round-trips. On the downside, it makes no use of the GDAL block cache mechanism, hence it is not an appropriate default choice for general purpose I/O. It is also restricted to simple configurations of TIFF files : single-band, untiled, uncompressed.

We can now edit gmted/all075.vrt with our favorite editor to take advantage of the local overview that we have just generated. We just have to insert the following snippet before the </VRTRasterBand> element at the end of the file :

Depending on the zoom level, either the local 2048x1024 dataset, or the remote 300, 15 and 7.5 arc-seconds datasets will be used.

For example, let's zoom in on Reunion island. We can easily distinguish its 3 magnificent cirques (the 3 dark areas near the center/west) and the Piton de la Fournaise volcano (white circle at the east) :