Description

Exposing structure

The current grid coverage API is based on the idea of a coverage reader containing a single coverage, which can be subsetted, rescaled and possibly reprojected according to the read parameters. Some readers, like image mosaic, are actually based on a collection of internal raster data (referred to as "granules") which:

is spatially distributed

can have a time and elevation associated

can have other custom dimensions associated

is associated with a number of attributes that allow for general feature selection

All of the above is unfortunately mostly opaque to the caller, apart from some level of access to dimensions provided via the metadata strings. This is suboptimal in several ways, first and foremost because all the available data has to be coerced into an in-memory string representation, which then has to be parsed on the other side. Also, the choice of dimensions made by the reader is somewhat arbitrary, and the user of the reader might want to use other time/elevation attributes without having to reconfigure the reader.
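
To make the string round trip concrete, here is a minimal sketch of the parsing a caller has to do when a time domain is handed over as a single metadata string. The class name `TimeDomainParser` and the sample format (a comma-separated list of ISO-8601 instants) are assumptions made for illustration, not part of the API described here.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only: shows the parsing step a
// caller needs when a time domain arrives as one metadata string.
final class TimeDomainParser {

    // Assumes a comma-separated list of ISO-8601 instants (an assumption
    // of this sketch, not a documented format).
    static List<Instant> parse(String metadataValue) {
        List<Instant> times = new ArrayList<>();
        if (metadataValue == null || metadataValue.trim().isEmpty()) {
            return times;
        }
        for (String token : metadataValue.split(",")) {
            // Every value is rebuilt from its textual form.
            times.add(Instant.parse(token.trim()));
        }
        return times;
    }

    public static void main(String[] args) {
        String value = "2008-10-31T00:00:00Z,2008-11-01T00:00:00Z";
        for (Instant t : parse(value)) {
            System.out.println(t);
        }
    }
}
```

Every value is serialized by the reader and rebuilt by the caller, which is the overhead a structured API would avoid.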

The current image mosaic is not the only example of a "structured" reader: NetCDF data sources, PostGIS rasters and other in-database rasters are further examples where variables and dimensions could be exposed with more control and in more detail. Callers can leverage this to gain flexibility in exposing dimensions, and to inspect the inner structure of a complex coverage and better tailor their requests (WCS-EO is an example: it exposes the inner structure of a mosaic and lets the caller work against the single granules composing it).

Allowing granule addition/removal

Very often these structured readers need to be modified in terms of the granules they contain by:

adding new granules

removing existing granules

A simple and typical case is keeping a moving time window of data, e.g., the last month of satellite observations for a certain atmospheric gas. In order to support these cases we propose interfaces mimicking the vector data source, GranuleSource and GranuleStore, that allow access to, and modification of, the granule internals. A structured reader might be read-only, in which case it can only return a GranuleSource, or be writable, in which case a GranuleStore is returned instead. FeatureSource/FeatureStore have not been used directly, in order to keep the work needed to implement a structured reader to a minimum (their interface is significantly larger).
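
The read-only/writable split and the moving time window can be sketched as follows. This is a deliberately simplified illustration: the method names, the `Granule` class and the use of `java.util.function.Predicate` as a filter are assumptions of this sketch and should not be read as the proposed GranuleSource/GranuleStore signatures.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified sketch; none of these signatures are the proposed API.
final class GranuleSketch {

    // Hypothetical granule record: a file reference plus a time attribute.
    static final class Granule {
        final String location;
        final Instant time;

        Granule(String location, Instant time) {
            this.location = location;
            this.time = time;
        }
    }

    // Read-only access to the granules.
    interface GranuleSource {
        List<Granule> getGranules(Predicate<Granule> filter);
    }

    // Writable access: adds modification on top of the read-only view.
    interface GranuleStore extends GranuleSource {
        void addGranule(Granule granule);

        // Returns the number of granules removed.
        int removeGranules(Predicate<Granule> filter);
    }

    // Toy in-memory store, standing in for a real index.
    static final class InMemoryGranuleStore implements GranuleStore {
        private final List<Granule> granules = new ArrayList<>();

        @Override
        public List<Granule> getGranules(Predicate<Granule> filter) {
            List<Granule> result = new ArrayList<>();
            for (Granule g : granules) {
                if (filter.test(g)) {
                    result.add(g);
                }
            }
            return result;
        }

        @Override
        public void addGranule(Granule granule) {
            granules.add(granule);
        }

        @Override
        public int removeGranules(Predicate<Granule> filter) {
            int before = granules.size();
            granules.removeIf(filter);
            return before - granules.size();
        }
    }

    public static void main(String[] args) {
        GranuleStore store = new InMemoryGranuleStore();
        store.addGranule(new Granule("no2_20130101.tif", Instant.parse("2013-01-01T00:00:00Z")));
        store.addGranule(new Granule("no2_20130215.tif", Instant.parse("2013-02-15T00:00:00Z")));

        // Moving window: drop everything older than the cutoff.
        Instant cutoff = Instant.parse("2013-02-01T00:00:00Z");
        int removed = store.removeGranules(g -> g.time.isBefore(cutoff));
        System.out.println("removed " + removed + ", kept " + store.getGranules(g -> true).size());
    }
}
```

A read-only reader would hand out only the `GranuleSource` view, so callers cannot reach the modification methods.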

In order to allow each reader to acquire extra data the way it prefers, a "harvest" operation has been added. It works against the file system and lets each reader add data to its backend storage in its preferred way: a file-oriented tool like ImageMosaic will just add references to the harvested files internally, while a database-oriented tool like PostGIS raster can instead read the files and copy them into its internal storage.
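
As a rough illustration of the file-oriented flavour of harvesting, the sketch below walks a file or directory and records references to the matching files without copying any data. The class `FileReferenceHarvester`, its method names and the extension-based matching are hypothetical and only meant to show the idea; a database-oriented reader would read and store the file contents at the same point instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical file-oriented harvester: it records references to files
// and never copies their contents.
final class FileReferenceHarvester {

    // Stand-in for the reader's internal granule index.
    private final List<Path> index = new ArrayList<>();

    // Harvests a single file or, recursively, every matching file in a
    // directory. Returns only the files that were newly added.
    List<Path> harvest(Path source, String extension) throws IOException {
        String suffix = extension.toLowerCase(Locale.ROOT);
        List<Path> found;
        try (Stream<Path> paths = Files.walk(source)) {
            found = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(suffix))
                    .map(Path::toAbsolutePath)
                    .filter(p -> !index.contains(p)) // skip already harvested files
                    .sorted()
                    .collect(Collectors.toList());
        }
        index.addAll(found);
        return found;
    }

    List<Path> indexedFiles() {
        return new ArrayList<>(index);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("harvest");
        Files.createFile(dir.resolve("no2_20130101.tif"));
        Files.createFile(dir.resolve("readme.txt"));

        FileReferenceHarvester harvester = new FileReferenceHarvester();
        System.out.println("harvested: " + harvester.harvest(dir, ".tif").size());
    }
}
```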

Implementors of the harvest operation will have to consider the case of harvesting from another structured data source. For example, an image mosaic could be built from NetCDF files, which in turn have their own internal structure; the harvest has to take that structure into account when building new granules in the respective existing coverages. As an illustration, a mosaic could be made of NetCDF files each holding three variables, NO2, O3 and BrO. Each file contains granules for the three gases at different batches of times and elevations, and the mosaic exposes the same three coverages while hiding the fact that the bits and pieces are split among various source NetCDF files.
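
The NO2/O3/BrO example can be sketched as a mosaic index that routes the granules found inside each harvested file to the coverage named after their variable. All names here (`MultiCoverageMosaicSketch`, `SourceGranule`, the file names) are hypothetical, and the file contents are hard-coded rather than read from real NetCDF files. In this sketch an unseen variable simply starts a new coverage; that behaviour is an assumption of the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mosaic index: one coverage per variable, each coverage
// backed by granules that live in different source files.
final class MultiCoverageMosaicSketch {

    // Hypothetical description of one granule found inside a source file.
    static final class SourceGranule {
        final String file;
        final String variable;
        final String time;
        final double elevation;

        SourceGranule(String file, String variable, String time, double elevation) {
            this.file = file;
            this.variable = variable;
            this.time = time;
            this.elevation = elevation;
        }
    }

    // Coverage name -> granules, in harvest order.
    private final Map<String, List<SourceGranule>> coverages = new LinkedHashMap<>();

    // Routes every granule of a harvested file to the coverage that
    // matches its variable name.
    void harvest(List<SourceGranule> granulesInFile) {
        for (SourceGranule g : granulesInFile) {
            coverages.computeIfAbsent(g.variable, k -> new ArrayList<>()).add(g);
        }
    }

    List<String> coverageNames() {
        return new ArrayList<>(coverages.keySet());
    }

    List<SourceGranule> granules(String coverage) {
        return coverages.getOrDefault(coverage, Collections.<SourceGranule>emptyList());
    }

    public static void main(String[] args) {
        MultiCoverageMosaicSketch mosaic = new MultiCoverageMosaicSketch();
        // Two source files, each holding the same three variables.
        for (String file : Arrays.asList("gases_a.nc", "gases_b.nc")) {
            String time = file.equals("gases_a.nc") ? "2013-01-01T00:00:00Z" : "2013-01-02T00:00:00Z";
            mosaic.harvest(Arrays.asList(
                    new SourceGranule(file, "NO2", time, 10.0),
                    new SourceGranule(file, "O3", time, 10.0),
                    new SourceGranule(file, "BrO", time, 10.0)));
        }
        for (String name : mosaic.coverageNames()) {
            System.out.println(name + ": " + mosaic.granules(name).size() + " granules");
        }
    }
}
```

The caller sees three coverages and does not need to know which source file each granule came from.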

Implementation

The proposal comes with two implementations:

ImageMosaic, improved to expose its internal structure. This one will be writable, allowing for harvesting and granule removal.

NetCDF (as a new unsupported module), as a read-only structured grid coverage that plays well with the ImageMosaic harvesting code, allowing multidimensional mosaics of NetCDF files to be constructed.