Unit 188 - Artificial Neural Networks
for Spatial Data Analysis

Written by Sucharita Gopal, Department of Geography and Centre for Remote Sensing
Boston University, Boston MA 02215

DRAFT - comments invited

This unit is part of the NCGIA
Core Curriculum in Geographic Information Science. These materials
may be used for study, research, and education, but please credit the author,
Sucharita Gopal, and the project, NCGIA Core Curriculum in GIScience.
All commercial rights reserved. Copyright 1998 by Sucharita Gopal.

Your comments on these materials are welcome. A link to an evaluation
form is provided at the end of this document.

Advanced Organizer

Topics covered in this unit

Intended Learning Outcomes

After learning the material in this unit, students should be able to:

Define ANN and describe different types and some applications of ANN

Explain the applications of ANN in geography and spatial analysis

Explain the differences between ANN and AI, and between ANN and statistics


1. Introduction

1.1. What are Artificial Neural Networks (ANN)?

Artificial neural networks offer an alternative information-processing
paradigm that involves

large interconnected networks of processing elements (PEs), or units

units relatively simple and typically non-linear

units connected to each other by communication channels or "connections"

connections carry numeric (as opposed to symbolic) data, encoded by any
of various means

units operate only on their local data and on the inputs they receive via
the connections

1.2. Some Definitions of ANN

According to the DARPA Neural Network Study (1988, AFCEA International
Press, p. 60):

a neural network is a system composed of many simple processing
elements operating in parallel whose function is determined by network
structure, connection strengths, and the processing performed at computing
elements or nodes.

A neural network is a massively parallel distributed processor that
has a natural propensity for storing experiential knowledge and making
it available for use. It resembles the brain in two respects:

Knowledge is acquired by the network through a learning process.

Interneuron connection strengths known as synaptic weights are used to
store the knowledge.

1.3. Brief History of ANN

ANN were inspired by models of biological neural networks since much of
the motivation came from the desire to produce artificial systems capable
of sophisticated, perhaps "intelligent", computations similar to
those that the human brain routinely performs, and thereby possibly to
enhance our understanding of the human brain.

1.4. Applications of ANN

ANN is a multi-disciplinary field, and as such its applications are numerous,
including

finance

industry

agriculture

business

physics

statistics

cognitive science

neuroscience

weather forecasting

computer science and engineering

spatial analysis and geography

1.5. Differences between ANN and AI Approaches

Several features distinguish this paradigm from conventional computing
and traditional artificial intelligence approaches. In ANN

information processing is inherently parallel

knowledge is distributed throughout the system

ANNs are extremely fault tolerant

function estimation is adaptive and model-free, following a non-algorithmic strategy

1.6. ANN in Spatial Analysis and Geography

Fischer (1992) outlines the role of ANN in both exploratory and explanatory
modeling.

Some neural network methods have close relatives in the existing
statistical literature:

Kohonen nets for adaptive vector quantization are very similar to k-means
cluster analysis.

Hebbian learning is closely related to principal component analysis.

Some neural network areas that appear to have no close relatives in the
existing statistical literature are:

Kohonen's self-organizing maps.

Reinforcement learning (although this is treated in the operations research
literature on Markov decision processes).

2. Types of ANN

There are many types of ANNs.

Many new ones are being developed (or at least variations of existing ones).

2.1. Networks based on Supervised and Unsupervised
Learning

2.1.1. Supervised Learning

the network is supplied with a sequence of both input data and desired
(target) output data; the network is thus told precisely by a "teacher" what
should be emitted as output.

The teacher can, during the learning phase, "tell" the network how well it
performs ("reinforcement learning") or what the correct behavior would
have been ("fully supervised learning").
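A minimal sketch of fully supervised learning: a "teacher" supplies (input, target) pairs, and an error-correction rule nudges the weights toward the correct behavior. The threshold unit, logical-OR data, learning rate, and number of passes are illustrative assumptions.

```python
# Training pairs: input vector plus desired (target) output.
pairs = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]  # logical OR
w = [0.0, 0.0]
b = 0.0
lr = 0.1

def predict(x):
    """Simple threshold unit: fire if the weighted sum exceeds zero."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

for _ in range(20):                  # repeated passes over the training pairs
    for x, target in pairs:
        err = target - predict(x)    # the teacher's correction signal
        w[0] += lr * err * x[0]
        w[1] += lr * err * x[1]
        b += lr * err
```

After training, the unit reproduces the teacher's targets on all four inputs.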

2.1.2. Self-Organization or Unsupervised
Learning

a training scheme in which the network is given only input data; the
network finds out about some of the properties of the data set and
learns to reflect these properties in its output, e.g. the network learns
some compressed representation of the data. This type of learning presents
a biologically more plausible model of learning.

Exactly which properties the network can learn to recognise depends on
the particular network model and learning method.
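One compressed representation a network can discover is a small set of prototype vectors, as in the competitive learning that underlies Kohonen-style vector quantization (discussed below). The one-dimensional data, initialization, and learning rate here are illustrative assumptions.

```python
# Unlabelled inputs only: two 1-D clusters around 1.0 and 5.0.
data = [0.9, 1.1, 1.0, 4.9, 5.1, 5.0] * 10
protos = [data[0], data[3]]   # prototypes start on two of the samples
lr = 0.1

for x in data:
    # The prototype nearest the input "wins" the competition...
    i = min(range(len(protos)), key=lambda j: abs(protos[j] - x))
    # ...and is moved a small step toward that input.
    protos[i] += lr * (x - protos[i])
```

With no teacher at all, the two prototypes drift toward the two cluster centers, a compressed two-number summary of sixty inputs.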

2.2. Networks based on Feedback and Feedforward connections

The following shows some types in each category

Unsupervised Learning

Feedback Networks:

Binary Adaptive Resonance Theory (ART1)

Analog Adaptive Resonance Theory (ART2, ART2a)

Discrete Hopfield (DH)

Continuous Hopfield (CH)

Discrete Bidirectional Associative Memory (BAM)

Kohonen Self-organizing Map/Topology-preserving map (SOM/TPM)

Feedforward-only Networks:

Learning Matrix (LM)

Sparse Distributed Associative Memory (SDM)

Fuzzy Associative Memory (FAM)

Counterpropagation (CPN)

Supervised Learning

Feedback Networks:

Brain-State-in-a-Box (BSB)

Fuzzy Cognitive Map (FCM)

Boltzmann Machine (BM)

Backpropagation through time (BPTT)

Feedforward-only Networks:

Perceptron

Adaline, Madaline

Backpropagation (BP)

Artmap

Learning Vector Quantization (LVQ)

Probabilistic Neural Network (PNN)

General Regression Neural Network (GRNN)

3. Methodology: Training, Testing and Validation
Datasets

In the ANN methodology, the sample data is often subdivided into training,
validation,
and test sets.

The distinctions among these subsets are crucial.

Ripley (1996) defines the following (p.354):

Training set: A set of examples used for learning, that
is to fit the parameters [weights] of the classifier.

Validation set: A set of examples used to tune the parameters
of a classifier, for example to choose the number of hidden units in a
neural network.

Test set: A set of examples used only to assess the performance
[generalization] of a fully-specified classifier.
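Ripley's three subsets can be produced with a simple random partition. The 60/20/20 proportions, the fixed shuffling seed, and the function name below are illustrative assumptions, not part of his definitions.

```python
import random

def split(samples, train=0.6, valid=0.2, seed=0):
    """Randomly partition samples into training, validation, and test sets."""
    s = list(samples)
    random.Random(seed).shuffle(s)          # shuffle reproducibly
    n_train = int(len(s) * train)
    n_valid = int(len(s) * valid)
    return s[:n_train], s[n_train:n_train + n_valid], s[n_train + n_valid:]

train_set, valid_set, test_set = split(range(100))
```

The crucial point is that the test set touches neither the weight fitting nor the architecture tuning; it is reserved solely for the final assessment.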

4. Application of a Supervised ANN for a Classification
Problem

In this section, we describe how two neural networks, the Multi-Layer
Perceptron (MLP) and fuzzy ARTMAP, classify data and estimate unknown
functions.

4.1. Multi-Layer Perceptron (MLP) Using Backpropagation

A popular ANN classifier is the Multi-Layer Perceptron (MLP) architecture
trained using the backpropagation algorithm.

In overview, an MLP is composed of layers of processing units that are interconnected
through weighted connections.

The first layer consists of the input vector

The last layer consists of the output vector representing the output class.

Intermediate layers, called "hidden" layers, receive the entire input pattern
as modified by its passage through the weighted connections. The hidden
layers provide the internal representation of neural pathways.

The network is trained using backpropagation with three major phases.

First phase: an input vector is presented to the network, which leads via
the forward pass to the activation of the network as a whole. This generates
a difference (error) between the output of the network and the desired output.

Second phase: compute the error
factor (signal) for the output units and propagate this factor successively
back through the network (error backward pass).

Third phase: compute the changes
for the connection weights by feeding the summed squared errors from the
output layer back through the hidden layers to the input layer.

Continue this process until the connection weights in the network have
been adjusted so that the network output has converged, to an acceptable
level, with the desired output.

Assign "unseen" or new data

The trained network is then given the new data and processing and flow
of information through the activated network should lead to the assignment
of the input data to the output class.

For the basic equations relevant to the backpropagation model based
on generalized delta rule, the training algorithm that was popularized
by Rumelhart, Hinton, and Williams, see chapter 8 of Rumelhart and McClelland
(1986).
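The three phases above can be sketched on a tiny network. This is a hedged illustration, not the exact formulation of Rumelhart, Hinton, and Williams: the 2-2-1 architecture, logical-OR training data, learning rate, and epoch count are all assumptions made for the example.

```python
import math
import random

random.seed(1)

def sig(s):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-s))

# A 2-input, 2-hidden, 1-output MLP; the last weight in each row is a bias.
w_h = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_o = [random.uniform(-0.5, 0.5) for _ in range(3)]
lr = 0.5
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]  # logical OR

def forward(x):
    h = [sig(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    y = sig(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, y

def sse():
    """Summed squared error over the training set."""
    return sum((t - forward(x)[1]) ** 2 for x, t in data)

err_before = sse()
for _ in range(3000):
    for x, t in data:
        h, y = forward(x)                        # phase 1: forward pass
        d_o = (y - t) * y * (1 - y)              # phase 2: output error signal
        d_h = [d_o * w_o[j] * h[j] * (1 - h[j])  # ...propagated backward
               for j in range(2)]
        for j in range(2):                       # phase 3: weight changes
            w_o[j] -= lr * d_o * h[j]
            for i in range(2):
                w_h[j][i] -= lr * d_h[j] * x[i]
            w_h[j][2] -= lr * d_h[j]
        w_o[2] -= lr * d_o
err_after = sse()
```

Repeating the three phases drives the summed squared error down until the network output converges, to an acceptable level, with the desired output.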

4.1.1. Things to note while using the backpropagation
algorithm

Learning rate:

Standard backprop can be used for incremental (on-line) training (in which
the weights are updated after processing each case) but it does not converge
to a stationary point of the error surface. To obtain convergence, the
learning rate must be slowly reduced. This methodology is called "stochastic
approximation."

In standard backprop, too low a learning rate makes the network learn very
slowly. Too high a learning rate makes the weights and error function diverge,
so there is no learning at all.

Trying to train an NN using a constant learning rate is usually a tedious
process requiring much trial and error. There are many variations proposed
to improve standard backpropagation, as well as other learning algorithms
that do not suffer from these limitations: for example, stabilized Newton
and Gauss-Newton algorithms, including various Levenberg-Marquardt and
trust-region algorithms.
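The learning-rate behavior described above can be seen on a toy one-weight error surface. The surface E(w) = (w - 2)^2 and the specific rates are illustrative assumptions used only to make the effect visible.

```python
def descend(lr_fn, steps=50, w=0.0):
    """Gradient descent on the toy error surface E(w) = (w - 2)^2,
    whose gradient is 2(w - 2); the minimum is at w = 2."""
    for t in range(steps):
        w -= lr_fn(t) * 2.0 * (w - 2.0)
    return w

w_low  = descend(lambda t: 0.01)           # too low: learning is very slow
w_high = descend(lambda t: 1.1)            # too high: the weight diverges
w_dec  = descend(lambda t: 0.4 / (1 + t))  # slowly reduced rate converges
```

The decaying schedule is the "stochastic approximation" idea: a rate that shrinks over time lets the weight settle at the minimum that a constant rate either crawls toward or overshoots.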

Output Representation:

use 1-of-C coding or dummy variables.

For example, if the categories are Water, Forest and Urban, then the target
output data would be coded as:

Water: 1 0 0
Forest: 0 1 0
Urban: 0 0 1
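A minimal sketch of 1-of-C coding for these three classes; the helper function name is hypothetical.

```python
# The category list matches the Water/Forest/Urban example in the text.
categories = ["Water", "Forest", "Urban"]

def one_of_c(label):
    """Return the 1-of-C (dummy-variable) target vector for a class label."""
    return [1 if c == label else 0 for c in categories]

targets = [one_of_c(c) for c in ["Water", "Urban", "Forest"]]
```

Each target vector has exactly one element set to 1, so each output unit of the network corresponds to exactly one class.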

We are very interested in your comments and suggestions for improving this
material. Please follow the link above to the evaluation form if
you would like to contribute in this manner to this evolving project.

Citation

To reference this material use the appropriate variation of the following
format:
Sucharita Gopal. (1998) Artificial Neural Networks for Spatial Data
Analysis, NCGIA Core Curriculum in GIScience, http://www.ncgia.ucsb.edu/giscc/units/u188/u188.html,
posted December 22, 1998.
The correct URL for this page is: http://www.ncgia.ucsb.edu/giscc/units/u188/u188.html
Created: November 23, 1998. Last revised: December 22, 1998.
To the Core Curriculum Outline