Imagine if your robot could learn to characterize its sensations. Could it evolve its own language to describe its “feelings?” They might be literal sensations derived from sensors rather than self-reflection, but it is still a provocative idea ...

Artificial Intelligence literature calls this characterization
ability “clustering” and it is often used for data-mining
and analyzing very complex data sets where a human
cannot easily determine a pattern in the data by simply
looking at a graph. This is a fascinating topic in its own right,
but for this month’s article we are going to look at how a
particular clustering technique called a Self-Organizing
Map can enable a simple microcontroller circuit to learn to
dynamically categorize the data it senses over time.

A Self-Organizing Map is a form of unsupervised
learning, which means that there is no “teacher” present
telling the program whether its answer is right or wrong.
Instead, this neural network model is concerned with finding
patterns in sensory data and classifying them. There is no
correct or incorrect answer in this case; however, there is an
inductive bias, or embedded prejudice, in the network towards
analyzing the data in the specific manner you set up. But
more on inductive bias in a bit.

Self-Organizing Maps are a particular form of
competitive learning referred to as winner take all.
Competitive learning is a class of neural network learning in
which neurons in a given network compete for activation.
“Winner take all” means that neurons are arranged in a
single layer and only one can fire at a time; hence they
“compete” for activation.
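
To make the idea concrete, here is a minimal winner take all routine in C. The two-field Sample type anticipates the temperature-and-light example described next; the names NUM_PROTOTYPES, prototypes, and find_winner are my own illustrative choices, so treat this as a sketch rather than a finished listing.

    /* One "neuron" per prototype vector; only the closest one wins. */
    #define NUM_PROTOTYPES 4          /* how many categories to look for */

    typedef struct {
        float temp;                   /* reading from the temperature sensor */
        float light;                  /* reading from the light sensor */
    } Sample;

    Sample prototypes[NUM_PROTOTYPES];

    /* Every prototype competes; the one closest to the input wins
       and is the only neuron activated. */
    int find_winner(Sample in)
    {
        int   winner = 0;
        float best   = 1e30f;

        for (int i = 0; i < NUM_PROTOTYPES; i++) {
            float dt   = in.temp  - prototypes[i].temp;
            float dl   = in.light - prototypes[i].light;
            float dist = dt * dt + dl * dl;   /* squared Euclidean distance */
            if (dist < best) {
                best   = dist;
                winner = i;
            }
        }
        return winner;
    }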

A two-dimensional Self-Organizing Map can be visualized easily as a graph. Data points show the relationship
between data collected from two sensors, in this case
temperature and light. They are plotted together and, as you
can see, certain patterns or clusters are visible in the data.

A Self-Organizing Map is initialized by creating random
prototype vectors, or points on the graph. Over time, as the
learning rule is applied, these prototype vectors move
towards different clusters in the data and come to represent
the general properties of the region. This eventually creates
an emergent data topography which extracts features
from the data and — more interesting for our purposes —
allows a program to classify new input data in light of
past experience.
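
Continuing the C sketch above, the initialization and the learning rule might look something like this. The rand()-based starting points, the assumed 0.0 to 1.0 sensor range, and the single learning rate are illustrative assumptions, not prescriptions:

    #include <stdlib.h>   /* rand(), RAND_MAX */

    /* Scatter the prototype vectors randomly over the (assumed 0.0-1.0)
       sensor range before any learning takes place. */
    void init_prototypes(void)
    {
        for (int i = 0; i < NUM_PROTOTYPES; i++) {
            prototypes[i].temp  = (float)rand() / (float)RAND_MAX;
            prototypes[i].light = (float)rand() / (float)RAND_MAX;
        }
    }

    /* The learning rule: nudge the winning prototype a small step toward
       the input sample. Repeated over many samples, each prototype drifts
       into a cluster and comes to represent it. */
    void train_one(Sample in, float rate)
    {
        int w = find_winner(in);
        prototypes[w].temp  += rate * (in.temp  - prototypes[w].temp);
        prototypes[w].light += rate * (in.light - prototypes[w].light);
    }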

The learning method for a Self-Organizing Map is
actually quite simple and intuitive. Once a data set is
acquired, the programmer needs to decide how many
prototype vectors should represent the data space. This is
where the inductive bias I mentioned earlier comes in,
because here you are telling the program how many
categories to look for in the data. If you want the program to
think like you do, this isn’t a problem, but if you want to learn
something new from seeing how the program categorizes
messy data, I suggest using more prototype vectors than you
think you might need; extra prototypes will just be redundant
but missing prototypes are harder to spot.
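
Tying those pieces together, a training pass over a logged data set could look like the sketch below. The sample count, number of passes, decay factor, and the current_reading variable are all hypothetical; tune them to your own data.

    #define NUM_SAMPLES 200           /* size of the logged data set */

    Sample data[NUM_SAMPLES];         /* filled in by your sensor-logging code */

    void train_map(void)
    {
        float rate = 0.5f;            /* starting learning rate */

        init_prototypes();
        for (int pass = 0; pass < 20; pass++) {
            for (int i = 0; i < NUM_SAMPLES; i++)
                train_one(data[i], rate);
            rate *= 0.9f;             /* slowly "freeze" the map */
        }
    }

    /* Later, in the main loop, a fresh reading is classified with:
       int category = find_winner(current_reading);              */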

FIGURE 1. A winner take all neural network layout with two
inputs and one prototype vector. Only the winner is activated.