Let's start by taking a look at a biological neuron.
Figure 1 shows such a neuron.

Figure 1. A Biological Neuron

A neuron operates by receiving signals from other neurons through
connections called synapses. If the combination of these signals exceeds a
certain threshold, or activation level, the neuron fires, that is, it sends a
signal on to the other neurons connected to it. Some signals act as
excitations and others as inhibitions to a neuron firing. What we call
thinking is believed to be the collective effect of the presence or absence of
firings in the pattern of synaptic connections between neurons.

This sounds very simplistic until we recognize that the human brain contains
approximately one hundred billion (100,000,000,000) neurons, each connected to
as many as one thousand (1,000) others. The massive number of neurons and the
complexity of their interconnections results in a "thinking machine", your
brain.

Each neuron has a body, called the soma. The soma is much like the body
of any other cell. It contains the cell nucleus, various biochemical
factories, and other components that support ongoing activity.

Surrounding the soma are dendrites. The dendrites are receptors for
signals generated by other neurons. These signals may be excitatory or
inhibitory. All signals present at the dendrites of a neuron are combined and
the result will determine whether or not that neuron will fire.

If a neuron fires, an electrical impulse is generated. This impulse starts at
the base, called the hillock, of a long cellular extension, called the
axon, and proceeds down the axon to its ends.

The end of the axon is actually split into multiple ends, called the
boutons. The boutons are connected to the dendrites of other neurons
and the resulting interconnections are the previously discussed synapses.
(Actually, the boutons do not touch the dendrites; there is a small gap
between them.) If a neuron has fired, the electrical impulse that has been
generated stimulates the boutons and results in electrochemical activity which
transmits the signal across the synapses to the receiving dendrites.

At rest, the neuron maintains an electrical potential of about 40-60
millivolts. When a neuron fires, an electrical impulse is created which is the
result of a change in potential to about 90-100 millivolts. This impulse
travels at between 0.5 and 100 meters per second and lasts for about 1
millisecond. Once a neuron fires, it must rest for several milliseconds
before it can fire again. In some circumstances, the repetition rate may be as
fast as 100 times per second, equivalent to 10 milliseconds per firing.

Compare this to a very fast electronic computer, whose signals travel at
about 200,000,000 meters per second (the speed of light in a wire is about 2/3
of that in free space), whose impulses last for 10 nanoseconds, and which can
repeat such an impulse immediately, in each succeeding 10 nanoseconds,
continuously. Electronic computers have at least a 2,000,000-times advantage
in signal transmission speed and a 1,000,000-times advantage in signal
repetition rate.

It is clear that if signal speed or rate were the sole criterion for
processing performance, electronic computers would win hands down. What the
human brain lacks in speed, it makes up in the number of its elements and the
complexity of the interconnections between them. This difference in structure
manifests itself in at least one important way: the human brain is not as
quick as an electronic computer at arithmetic, but it is many times faster and
hugely more capable at recognizing patterns and perceiving relationships.

The human brain differs in another, extremely important respect beyond speed:
it is capable of "self-programming", or adaptation in response to
changing external stimuli. In other words, it can learn. The brain has
developed ways for neurons to change their response to new stimulus patterns
so that similar events may affect future responses. In particular, the
sensitivity to new patterns seems greater in proportion to their importance to
survival, or when they are reinforced by repetition.

Neural networks are models of biological neural structures. The starting point
for most neural networks is a model neuron, as in Figure 2. This neuron
consists of multiple inputs and a single output. Each input is modified by a
weight, which multiplies the input value. The neuron will combine
these weighted inputs and, with reference to a threshold value and activation
function, use these to determine its output. This behavior follows closely
our understanding of how real neurons work.

Figure 2. A Model Neuron

While there is a fair understanding of how an individual neuron works, there
is still a great deal of research and mostly conjecture regarding the way
neurons organize themselves and the mechanisms used by arrays of neurons to
adapt their behavior to external stimuli. There are a large number of
experimental neural network structures currently in use reflecting this state
of continuing research.

In our case, we will only describe the structure, mathematics and behavior of
that structure known as the backpropagation network. This is the most
prevalent and generalized neural network currently in use. If the reader is
interested in finding out more about neural networks or other networks, please
refer to the material listed in the bibliography.

To build a backpropagation network, proceed in the following fashion. First,
take a number of neurons and array them to form a layer. A layer has
all its inputs connected to either a preceding layer or the inputs from the
external world, but not both within the same layer. A layer has all its
outputs connected to either a succeeding layer or the outputs to the external
world, but not both within the same layer.

Next, multiple layers are arrayed one succeeding the other so that there
is an input layer, multiple intermediate layers and finally an output layer,
as in Figure 3. Intermediate layers, that is, those that have no inputs or
outputs to the external world, are called hidden layers.
Backpropagation neural networks are usually fully connected. This means
that each neuron is connected to every output from the preceding layer (or to
one input from the external world if the neuron is in the first layer) and,
correspondingly, each neuron has its output connected to every neuron in the
succeeding layer.

Figure 3. Backpropagation Network
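The layered, fully connected structure just described can be sketched in a few lines of Python. This is a minimal illustration, not Neuralyst's actual implementation; the function name `make_network` and the initial weight range are assumptions chosen for the example. The indexing `weights[l][j][i]` mirrors the wij convention used in the equations below: input i to neuron j.

```python
import random

def make_network(layer_sizes, seed=0):
    # A fully connected backpropagation network is, structurally, just a
    # list of weight matrices, one per layer after the input layer.
    # weights[l][j][i] connects input i to neuron j of layer l.
    rng = random.Random(seed)
    weights = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # every neuron receives every output of the preceding layer
        weights.append([[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                        for _ in range(n_out)])
    return weights

# input layer of 3, one hidden layer of 4, output layer of 2
net = make_network([3, 4, 2])
print(len(net))                      # 2 weight matrices
print(len(net[0]), len(net[0][0]))   # hidden layer: 4 neurons x 3 inputs each
```

The input layer itself carries no weights here, matching its role as a distributor of signals rather than a processor of them.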

Generally, the input layer is considered a distributor of the signals from the
external world. Hidden layers are considered to be categorizers or feature
detectors of such signals. The output layer is considered a collector of the
features detected and producer of the response. While this view of the neural
network may be helpful in conceptualizing the functions of the layers, you
should not take this model too literally as the functions described may not be
so specific or localized.

With this picture of how a neural network is constructed, we can now proceed
to describe the operation of the network in a meaningful fashion.

The output of each neuron is a function of its inputs. In particular, the
output of the jth neuron in any layer is described by two sets of
equations:

Uj = Σi (wij × Xi)                                  [Eqn 1]

and

Yj = Fth(Uj + tj)                                   [Eqn 2]

For every neuron, j, in a layer, each of the i
inputs, Xi, to that layer is multiplied by a
previously established weight, wij. These are
all summed together, resulting in the internal value of this operation,
Uj. This value is then biased by a previously
established threshold value, tj, and sent
through an activation function, Fth. This activation
function is usually the sigmoid function, which has an input to output mapping
as shown in Figure 4. The resulting output,
Yj, is an input to the next layer or it is a
response of the neural network if it is the last layer. Neuralyst allows other
threshold functions to be used in place of the sigmoid described here.

Figure 4. Sigmoid Function

In essence, Equation 1 implements the combination operation of the neuron
and Equation 2 implements the firing of the neuron.
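Equations 1 and 2 translate directly into code. The following Python sketch (an illustration, not Neuralyst's code; the function names are invented for the example) computes a single neuron's output from its inputs, weights, and threshold, using the sigmoid as the activation function:

```python
import math

def sigmoid(u):
    # the usual activation function: maps any input to the range (0, 1)
    return 1.0 / (1.0 + math.exp(-u))

def neuron_output(inputs, weights, threshold):
    # Equation 1: combine the weighted inputs
    u = sum(w * x for w, x in zip(weights, inputs))
    # Equation 2: bias by the threshold and "fire" through the activation
    return sigmoid(u + threshold)

y = neuron_output([1.0, 0.5], [0.8, -0.4], threshold=0.1)
print(round(y, 3))  # prints 0.668
```

Note that an all-zero input gives sigmoid(0) = 0.5, the midpoint of Figure 4's S-curve, not zero output.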

From these equations, a predetermined set of weights, a predetermined set of
threshold values and a description of the network structure (that is the
number of layers and the number of neurons in each layer), it is possible to
compute the response of the neural network to any set of inputs. And this is
just how Neuralyst goes about producing the response. But how does it learn?

Learning in a neural network is called training. Like training in
athletics, training in a neural network requires a coach, someone that
describes to the neural network what it should have produced as a response.
From the difference between the desired response and the actual response, the
error is determined and a portion of it is propagated backward through the
network. At each neuron in the network, the error is used to adjust the
weights and threshold values of the neuron, so that the next time, the error
in the network response will be less for the same inputs.

Figure 5. Neuron Weight Adjustment

This corrective procedure is called backpropagation
(hence the name of the neural network) and it is applied continuously and
repetitively for each set of inputs and corresponding set of outputs produced
in response to the inputs. This procedure continues so long as the individual
or total errors in the responses exceed a specified level or
until there are no measurable errors. At this point, the
neural network has learned the training material and you can stop the training
process and use the neural network to produce responses to new input
data.

[There is some heavier going in the next few paragraphs. Skip ahead if you
don't need to understand all the details of neural network learning.]

Backpropagation starts at the output layer with the
following equations:

wij = w'ij + (LR × ej × Xi)                         [Eqn 3]

and

ej = Yj × (1 - Yj) × (dj - Yj)                      [Eqn 4]

For the ith input of the jth neuron in the output
layer, the weight wij is adjusted by adding to
the previous weight value, w'ij, a term
determined by the product of a learning rate, LR, an
error term, ej, and the value of the
ith input, Xi. The error term,
ej, for the jth neuron is determined by the
product of the actual output, Yj, its
complement, 1 - Yj, and the difference between
the desired output, dj, and the actual output.

Once the error terms are computed and weights are adjusted for the output
layer, the values are recorded and the next layer back is adjusted. The same
weight adjustment process, determined by Equation 3, is followed, but the
error term is generated by a slightly modified version of Equation 4. This
modification is:

ej = Yj × (1 - Yj) × Σk (ek × w'jk)                 [Eqn 5]

In this version, the difference between the desired output and the actual
output is replaced by the sum of the error terms for each neuron,
k, in the layer immediately succeeding the layer being processed
(remember, we are going backwards through the layers so these terms have
already been computed) times the respective pre-adjustment weights.
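Equations 3 through 5 amount to three small helper functions. Again, this is an illustrative sketch rather than Neuralyst's code, with invented names:

```python
def output_error(y, d):
    # Equation 4: error term for an output-layer neuron with
    # actual output y and desired output d
    return y * (1.0 - y) * (d - y)

def hidden_error(y, succ_errors, succ_weights):
    # Equation 5: (d - y) is replaced by the sum, over the succeeding
    # layer's neurons k, of their (already computed) error terms times
    # the pre-adjustment weights connecting this neuron to them
    return y * (1.0 - y) * sum(e * w for e, w in zip(succ_errors, succ_weights))

def update_weight(w_prev, lr, e, x):
    # Equation 3: new weight = old weight + learning rate * error * input
    return w_prev + lr * e * x

e = output_error(0.5, 1.0)           # 0.5 * 0.5 * 0.5 = 0.125
w = update_weight(0.2, 0.1, e, 1.0)  # 0.2 + 0.1 * 0.125 * 1.0 = 0.2125
```

The Yj × (1 - Yj) factor common to both error terms is the derivative of the sigmoid, which is why it appears regardless of which layer is being adjusted.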

The learning rate, LR, applies a greater or lesser portion of
the respective adjustment to the old weight. If the factor is set to a large
value, then the neural network may learn more quickly, but if there is a large
variability in the input set then the network may not learn very well or at
all. In real terms, setting the learning rate to a large value is analogous to
giving a child a spanking, but that is inappropriate and counterproductive to
learning if the offense is as simple as forgetting to tie their shoelaces.
Usually, it is better to set the factor to a small value and edge it upward if
the learning rate seems slow.

In many cases, it is useful to use a revised weight adjustment process. This
is described by the equation:

wij = w'ij + (LR × ej × Xi) + M × (w'ij - w''ij)    [Eqn 6]

This is similar to Equation 3, with a momentum factor, M, the previous weight,
w'ij, and the next-to-previous weight, w''ij, included in the last term. This
extra term allows for momentum in weight adjustment. Momentum basically
allows a change to the weights to persist for
a number of adjustment cycles. The magnitude of the persistence is controlled
by the momentum factor. If the momentum factor is set to 0, then the equation
reduces to that of Equation 3. If the momentum factor is increased from 0,
then increasingly greater persistence of previous adjustments is allowed in
modifying the current adjustment. This can improve the learning rate in some
situations, by helping to smooth out unusual conditions in the training set.
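Equation 6 is a one-line change to the Equation 3 update. In this sketch (names invented for illustration), the previous weight change, w'ij - w''ij, is what carries the momentum:

```python
def update_weight_momentum(w_prev, w_prev2, lr, e, x, m):
    # Equation 6: the Equation 3 adjustment plus a momentum term that
    # carries part of the previous weight change (w_prev - w_prev2) forward
    return w_prev + lr * e * x + m * (w_prev - w_prev2)

# With m = 0 this reduces exactly to Equation 3:
w_a = update_weight_momentum(0.2, 0.15, 0.1, 0.125, 1.0, 0.0)  # 0.2125
# With m > 0, part of the previous change (0.2 - 0.15 = 0.05) persists:
w_b = update_weight_momentum(0.2, 0.15, 0.1, 0.125, 1.0, 0.9)  # 0.2125 + 0.045 = 0.2575
```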

As you train the network, the total error, that is the sum of the errors over
all the training sets, will become smaller and smaller. Once the network
reduces the total error to the limit set, training may stop. You may then
apply the network, using the weights and thresholds as trained.
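To see the whole procedure end to end, here is a compact, self-contained Python sketch that trains a tiny backpropagation network on the XOR problem. It is an illustration under stated assumptions (one hidden layer of three neurons, a learning rate of 0.5, and thresholds adjusted alongside the weights as if they were weights on a constant input of 1), not Neuralyst's implementation:

```python
import math, random

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

rng = random.Random(1)
# one hidden layer of 3 neurons, one output neuron
W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)]  # hidden weights
T1 = [rng.uniform(-1, 1) for _ in range(3)]                      # hidden thresholds
W2 = [rng.uniform(-1, 1) for _ in range(3)]                      # output weights
T2 = rng.uniform(-1, 1)                                          # output threshold
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]      # XOR training set
LR = 0.5

def forward(x):
    # Equations 1 and 2, applied layer by layer
    h = [sigmoid(sum(w * xi for w, xi in zip(W1[j], x)) + T1[j]) for j in range(3)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + T2)
    return h, y

def total_error():
    # sum of squared errors over the whole training set
    return sum((d - forward(x)[1]) ** 2 for x, d in data)

before = total_error()
for _ in range(2000):
    for x, d in data:
        h, y = forward(x)
        e_out = y * (1 - y) * (d - y)                  # Equation 4
        e_hid = [h[j] * (1 - h[j]) * e_out * W2[j]     # Equation 5, with the
                 for j in range(3)]                    # pre-adjustment weights
        for j in range(3):                             # Equation 3, output layer
            W2[j] += LR * e_out * h[j]
        T2 += LR * e_out                               # threshold treated as a
        for j in range(3):                             # weight on a constant 1
            for i in range(2):                         # Equation 3, hidden layer
                W1[j][i] += LR * e_hid[j] * x[i]
            T1[j] += LR * e_hid[j]

print(total_error() < before)  # the total error shrinks as training proceeds
```

Note that the hidden-layer error terms are computed before the output weights are adjusted, since Equation 5 calls for the pre-adjustment weights.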

It is a good idea to set aside some subset of all the inputs available and
reserve them for testing the trained network. By comparing the output of
a trained network on these test sets to the outputs you know to be correct,
you can gain greater confidence in the validity of the training. If you are
satisfied at this point, then the neural network is ready for running.

Usually, no backpropagation takes place in this running mode as was done in
the training mode. This is because there is often no way to be immediately
certain of the desired response. If there were, there would be no need for the
processing capabilities of the neural network! Instead, as the validity of the
neural network's outputs or predictions is verified or contradicted over time,
you will either be satisfied with the existing performance or determine a need
for new training. In the latter case, the additional input sets collected
since the last training session may be used to extend and improve the training
data.