Context-based classification of objects in topographic data

Abstract

Large-scale topographic databases model real world features as vector data objects. These can be point, line or area features. Each of these map objects is assigned to a
descriptive class; for example, an area feature might be classed as a building, a garden or a road. Topographic data is subject to continual updates from cartographic surveys
and ongoing quality improvement. One of the most important aspects of this is assigning a class description to each area feature and verifying it. These class attributes
can be added manually but, given the vast volume of data involved, automated techniques for classifying these polygons are desirable.
Analogy is a key thought process that underpins learning and has been the subject of much research in the field of artificial intelligence (AI). An analogy identifies
structural similarity between a well-known source domain and a less familiar target domain. In many cases, information present in the source can then be mapped to the
target, yielding a better understanding of the latter. The solution of geometric analogy problems has been a fruitful area of AI research. We observe that there is a correlation
between objects in geometric analogy problem domains and map features in topographic data. We describe two topographic area feature classification tools that use
descriptions of neighbouring features to identify analogies between polygons: content vector matching (CVM) and context structure matching (CSM). CVM and CSM classify an area feature by matching its neighbourhood context against those of analogous polygons whose class is known.
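
As a rough illustration of the vector-matching idea only, the sketch below assumes that a polygon's context can be summarised as a count of the classes of its neighbouring features, and that matching uses cosine similarity with an abstention threshold. The abstract does not specify any of these details, so the vector contents, the similarity measure, the threshold value and all names (`content_vector`, `cosine`, `classify`) are hypothetical and should not be read as the CVM implementation itself.

```python
from collections import Counter
from math import sqrt


def content_vector(neighbour_classes):
    """Count how many neighbours of each class surround a polygon (assumed representation)."""
    return Counter(neighbour_classes)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (assumed similarity measure)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(target_neighbours, known, threshold=0.9):
    """Return the class of the best-matching known polygon, or None to abstain.

    `known` is a list of (class, neighbour_classes) pairs; `threshold` is a
    hypothetical cut-off below which no classification is attempted.
    """
    target = content_vector(target_neighbours)
    best_class, best_score = None, 0.0
    for cls, neighbours in known:
        score = cosine(target, content_vector(neighbours))
        if score > best_score:
            best_class, best_score = cls, score
    return best_class if best_score >= threshold else None


# Toy data, invented for illustration only.
known = [
    ("building", ["garden", "garden", "road"]),
    ("garden", ["building", "garden", "garden"]),
]
print(classify(["garden", "road", "garden"], known))  # matches the "building" context
print(classify(["water"], known))                     # no similar context, so abstains
```

Abstaining when no known context is similar enough is one way a classifier could end up attempting only a proportion of the features, as both CVM and CSM do.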
Both classifiers were implemented and then tested on high-quality topographic polygon data supplied by Ordnance Survey (Great Britain). Area features were found to exhibit a high degree of variation in their neighbourhoods. CVM correctly classified 85.38% of the 79.03% of features it attempted to classify. CSM correctly classified 85.96% of the 62.96% of features it attempted to classify. Thus, CVM attempts 25.53% more features than CSM in relative terms (16.07 percentage points more), but is slightly less accurate. Both techniques excelled at identifying the feature classes that predominate in suburban data. Our structure-based classification approach may also benefit other types of spatial data, such as topographic line data, small-scale topographic data, raster data, architectural plans and circuit diagrams.
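
The reported percentages combine coverage (the share of features attempted) and accuracy (the share of attempted features classified correctly). The short calculation below multiplies the two to estimate the share of all features that each classifier labels correctly. It uses only the rounded figures quoted above, so the derived values are approximate, and the function name `overall_correct` is introduced here purely for illustration.

```python
# Figures as reported in the abstract, in percent.
results = {
    "CVM": {"attempted": 79.03, "accuracy": 85.38},
    "CSM": {"attempted": 62.96, "accuracy": 85.96},
}


def overall_correct(attempted, accuracy):
    """Share of ALL features classified correctly, in percent."""
    return attempted * accuracy / 100.0


for name, r in results.items():
    print(f"{name}: {overall_correct(**r):.2f}% of all features correctly classified")

# Relative difference in coverage, derived from the rounded percentages.
relative_gain = (results["CVM"]["attempted"] / results["CSM"]["attempted"] - 1) * 100
print(f"CVM attempts {relative_gain:.2f}% more features than CSM")
```

On these rounded inputs, CVM labels roughly 67.5% of all features correctly and CSM roughly 54.1%, and the relative coverage gain comes out at about 25.5%, in line with the reported 25.53% to within rounding.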