Surfaces and fields

The data described in Section 6.1.1, Test datasets, are representations of physical surfaces. For every location in the set {x,y} there is a single scalar value z representing the height of the surface at that point. There is no reason, however, to restrict the set of z-values to heights: they could equally well represent some other variable, such as the cost of land or the level of a soil trace element. The range of values that {z} may take would typically be the set of (positive) real numbers, although only integer values may be recorded, depending on the dataset in question.

In some instances a number of z-values are associated with each (x,y) location. This is the case, for example, with multi-spectral remote sensing data, where separate spectral bands are coded for each pixel of an image. In this case each band can be considered as a distinct surface, and the bands are typically stored as separate layers within an image folder. Frequently such data are integer-coded color values, and so have a well-defined and limited range, e.g. [0,255]. Another example is the case of spatial datasets (surfaces) that are recorded over a number of time periods, for example atmospheric pressure or temperature at a specified altitude, recorded at hourly intervals.
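As a minimal sketch of the layered view described above, a multi-band image can be held as a three-dimensional array in which each band is an ordinary two-dimensional surface. The array sizes, the random values and the names `image` and `band_surfaces` are purely illustrative assumptions and do not come from any particular dataset or file format.

```python
import numpy as np

# Hypothetical 3-band image, 4 rows by 5 columns, with integer-coded
# values in the range [0, 255] (random values, for illustration only).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(3, 4, 5)).astype(np.uint8)

# Treat each band as a separate surface z = f(x, y): one 2D layer per band.
band_surfaces = [image[b] for b in range(image.shape[0])]

for b, surface in enumerate(band_surfaces):
    print(f"band {b}: shape={surface.shape}, "
          f"min={surface.min()}, max={surface.max()}")
```

The same layout extends to time series of surfaces, with the first axis indexing the time period rather than the spectral band.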

In all of these examples the recorded information is a single interval or ratio-scaled value. Spatial datasets of this type are considered as surfaces or (scalar) fields within GIS. Datasets in which multiple values for the same variable are recorded for a single location and time (e.g. geological borehole data), and/or which have a directional component (e.g. wind speed and direction), are typically excluded from the set of objects described as surfaces or fields, or are re-cast where possible to fit such a model (e.g. as multiple layers). The term vector field is used to refer to fields that include both a magnitude and a directional value at every point.
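One way of re-casting a vector field as multiple scalar layers can be sketched as follows: wind speed and direction are converted into eastward (u) and northward (v) components, each of which is then an ordinary scalar surface. The function name `wind_to_uv` and the sample grids are hypothetical, and the sketch assumes the meteorological convention that direction is the bearing, in degrees clockwise from north, from which the wind blows.

```python
import numpy as np

def wind_to_uv(speed, direction_deg):
    """Split wind speed/direction grids into two scalar layers (u, v).

    Assumes direction is the bearing the wind blows FROM, measured in
    degrees clockwise from north (meteorological convention).
    """
    theta = np.deg2rad(direction_deg)
    u = -speed * np.sin(theta)  # eastward component
    v = -speed * np.cos(theta)  # northward component
    return u, v

# Hypothetical 2 x 2 grids of wind speed (m/s) and direction (degrees).
speed = np.array([[10.0, 5.0], [0.0, 8.0]])
direction = np.array([[270.0, 0.0], [90.0, 180.0]])

u, v = wind_to_uv(speed, direction)
print(np.round(u, 6))
print(np.round(v, 6))
```

The magnitude can be recovered from the two layers with `np.hypot(u, v)`, so no information is lost by the re-casting.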

Having defined the kind of datasets that qualify, the question of how such information is obtained and stored becomes central to the process of analyzing and interpreting such data. A very large proportion of data that describes the physical world (terrain, vegetation, land-use etc.) is obtained from one of two sources (possibly both): national mapping agencies and sample surveys.

National agencies provide details of topography, land-use, geology etc., derived from a mixture of terrestrial, aerial and satellite surveys. Traditionally such data were analyzed and then recorded on paper maps, although this has now been largely superseded by digital encoding and storage, with paper maps as one of a range of possible output forms. For terrain data, output consisted largely of contour maps with spot heights, and the great majority of current digital terrain datasets, including those available in grid or raster format, have been derived from such contour maps. One result of this process has been to make available grid datasets or digital elevation models (DEMs) for many areas of the world. However, since these have been programmatically generated, artifacts such as small ridges, troughs and hollows tend to appear, and these can be seen even in the most carefully produced national datasets, including the GB Ordnance Survey and United States Geological Survey (USGS) datasets. For example, the test dataset for the OS tile TQ91SW includes an area of sea, which the file records as having a range of values from 0 to -3m! The same issue applies to the conversion utilities built into many GIS packages, which convert input contour datasets to grid format, or grid data to contours or TIN format.

Another principal source of field-like data is sample surveys. In this case one or more sample values are known for a set of distinct (point-like) locations. These might be soil samples, radioactivity data, meteorological station data or some other variable that is known to be spatially continuous. Values at unsampled locations, typically defined as a fine grid, are estimated from the measured data using a variety of interpolation and prediction methods. Hence, again, the resulting grid data may be subject to artifacts and to additional estimated or unknown modeling errors.
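To illustrate how values on a grid might be estimated from point samples, the sketch below uses inverse distance weighting (IDW), which is only one of the many interpolation methods available and is chosen here for brevity. The function name `idw_grid`, the `power` parameter default and the four sample points are illustrative assumptions rather than data from any survey.

```python
import numpy as np

def idw_grid(xs, ys, zs, grid_x, grid_y, power=2.0):
    """Estimate z at every grid node by inverse distance weighting."""
    gx, gy = np.meshgrid(grid_x, grid_y)  # node coordinates, shape (ny, nx)
    # Distance from every grid node to every sample: shape (ny, nx, n).
    d = np.hypot(gx[..., None] - xs, gy[..., None] - ys)
    # Avoid division by zero where a node coincides with a sample point;
    # the huge weight makes the estimate equal to that sample's value.
    d = np.maximum(d, 1e-12)
    w = d ** (-power)
    return (w * zs).sum(axis=-1) / w.sum(axis=-1)

# Hypothetical samples at the four corners of a 10 x 10 region.
xs = np.array([0.0, 10.0, 0.0, 10.0])
ys = np.array([0.0, 0.0, 10.0, 10.0])
zs = np.array([1.0, 2.0, 3.0, 4.0])

grid = idw_grid(xs, ys, zs, np.linspace(0, 10, 3), np.linspace(0, 10, 3))
print(np.round(grid, 3))
```

Because each estimate is a weighted average of the samples, IDW never produces values outside the sampled range, and it tends to create local "bull's-eye" patterns around sample points, one example of the kind of artifact noted above.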

In each of these examples the surface or grid can be thought of as a single-valued function z=f(x,y). Mathematically derived functions that apply to an entire region or to sub-sections of regions may be used to create grid datasets. For example, the surface shown in Figure 6‑3 is a representation obtained by fitting a linear regression surface of the form z=ax+by+c to the set of spot heights provided by the GB Ordnance Survey for tile NT04.

Figure 6‑3 Linear regression surface fit to NT04 spot heights
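A surface of the form z=ax+by+c can be fitted by ordinary least squares. The following is a minimal sketch using NumPy; the function name `fit_plane` and the synthetic spot heights are assumptions made for illustration and are not the NT04 data or the procedure used to produce the figure.

```python
import numpy as np

def fit_plane(x, y, z):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c)."""
    # Design matrix with one row [x_i, y_i, 1] per spot height.
    A = np.column_stack([x, y, np.ones(len(x))])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs

# Synthetic spot heights scattered about a known plane (illustrative only).
rng = np.random.default_rng(42)
x = rng.uniform(0, 1000, size=50)
y = rng.uniform(0, 1000, size=50)
z = 0.05 * x - 0.02 * y + 300 + rng.normal(0, 2.0, size=50)

a, b, c = fit_plane(x, y, z)
print(f"z = {a:.4f}x + {b:.4f}y + {c:.2f}")

# Evaluate the fitted surface on a regular grid to create a grid dataset.
gx, gy = np.meshgrid(np.linspace(0, 1000, 11), np.linspace(0, 1000, 11))
surface = a * gx + b * gy + c
```

The fitted coefficients should lie close to the values used to generate the synthetic heights, with the discrepancy reflecting the added noise.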

More complex mathematical models are frequently used to generate and analyze surface datasets. Many of those available in modern GIS packages and related software (notably geostatistical packages) are described in the subsections that follow.