If you have a very large archive of data and need to find which scenes intersect a point often, having a spatial database with the scene outlines would be a better approach. However, if it isn’t something you do often this quick approach using only the GDAL utilities and a bit of Python is worth knowing.

The latest version of GDAL (2.1) has a driver to read Sentinel 2 data (see http://gdal.org/frmt_sentinel2.html). Like HDF files they are read as subdatasets. Running ‘gdalinfo’ on the zipped folder or the .xml file contained within the .SAFE directory will display all the subdatasets, as well as all the metadata (so quite a lot of information!).

As with HDF5 you can pass the subdataset names to gdalinfo to get more information or gdal_translate to extract them as a separate dataset.

To get the extension for all supported drivers, and some creation options the ‘get_gdal_drivers’ module from arsf_dem_scripts is optionally used. You can just download this file and copy into the same directory ‘extract_s2_data.py’ has been saved to. For Linux or OS X you can run:

Recently I took a lot of photos on my phone during fieldwork (survey for the NERC-ARF LiDAR and hyperspectral calibration flights) and wanted to extract the GPS coordinates from each photo so I could load them into QGIS using the add delimited text dialogue. The aim wasn’t to have precise locations for each photo (GPS positions for each of the points surveyed were recorded separately) but to give a quick idea of the locations we’d visited before the main GPS data were processed.

It code isn’t particularly tidy (despite my recent updates) but it basically pulls out the GPS location from the EXIF tags and writes this to a CSV file. To create a KMZ a thumbnail image is created and a corresponding KML file. Both the thumbnail and KML are then zipped together to generate the KMZ which can be opened in GoogleEarth.

For working with ENVI files I normally use GDAL as code can then be applied to different formats. However, there are a couple of limitations with GDAL when working with hyperspectral data in ENVI format:

GDAL doesn’t copy every item from the header file to a new header file if they don’t fit in with the GDAL data model. Examples are FWHM and comments. Sometimes extra attributes are copied to the aux.xml file GDAL creates, these files aren’t read by ENVI or other programs based on IDL (e.g., ATCOR).

For data stored Band Interleaved by Line (BIL) rather than Band Sequential (BSQ) reading and writing a band at a time is inefficient as it is necessary to keep jumping around the file

To overcome these issues NERC-ARF-DAN use their own Python functions for reading / writing header files and loading BIL files a line at a time. These functions have been tidied up and released through the NERC-ARF Tools repository on GitHub (https://github.com/pmlrsg/arsf_tools). The functions depend on NumPy.

Unzip and within a ‘Command Prompt’ or ‘Terminal’ window navigate to the the folder using (for example):

cd Downloads\arsf_tools-master

Install the tools and library by typing:

python setup.py install

Note, if you are using Linux you can install the arsf_binary reader from https://github.com/arsf/arsf_binaryreader which is written in C++. The ‘arsf_envi_reader’ module will import this if available as it is faster than the standard NumPy BIL reader.

If you are a UK based researcher with access to the JASMIN system the library is already installed and can be loaded using:

module load contrib/arsf/arsf_tools

An simple example of reading each line of a file, adding 1 to every band and writing back out again is:

For developing algorithms using spatial data, in particular multiple files it is recommended to convert the files to another band-sequential format using the ‘gdal_translate’ so they can be read using programs which use GDAL, for example RIOS or RSGISLib.

To convert files from ENVI BIL to ENVI BSQ and copy all header attributes the ‘envi_header’ module can be used after converting the interleave with GDAL. For example:

I have a couple of general scripts repositories for myself and shared with colleagues. These are for scripts which are useful but don’t fit into existing projects or justify having their own repository. An example is rsgis_scripts on bitbucket. The scripts are split into different directories which aren’t available on the main path. To make them available when needed I use EnvMaster (described in a previous post). I have an envmaster module which will search the repository for folders containing executables and add them to $PATH. It will also add folders containing ‘__init__.py’ to $PYTHONPATH.

The newer ‘ratapplier’ module is modelled after the ‘applier’ module for images and allows a function to be applied to chunks of rows, making it particularly useful for a RAT which is too large to load to memory.