Image classification consists in extracting added-value information from images. Such processing methods
classify pixels within images into geographical connected zones with similar properties, and identified by a
common class label. The classification can be either unsupervised or supervised.

Unsupervised classification does not require any additional information about the properties of the input
image to classify it. On the contrary, supervised methods need a preliminary learning to be computed over
training datasets having similar properties than the image to classify, in order to build a classification
model.

The OTB classification is implemented as a generic Machine Learning framework, supporting several
possible machine learning libraries as backends. The base class otb::MachineLearningModel defines
this framework. As of now libSVM (the machine learning library historically integrated in
OTB), machine learning methods of OpenCV library ([14]) and also Shark machine learning
library ([67]) are available. Both supervised and unsupervised classifiers are supported in the
framework.

The current list of classifiers available through the same generic interface within the OTB is:

LibSVM: Support Vector Machines classifier based on libSVM.

SVM: Support Vector Machines classifier based on OpenCV, itself based on libSVM.

Bayes: Normal Bayes classifier based on OpenCV.

Boost: Boost classifier based on OpenCV.

DT: Decision Tree classifier based on OpenCV.

RF: Random Forests classifier based on the Random Trees in OpenCV.

GBT: Gradient Boosted Tree classifier based on OpenCV (removed in version 3).

KNN: K-Nearest Neighbors classifier based on OpenCV.

ANN: Artificial Neural Network classifier based on OpenCV.

SharkRF : Random Forests classifier based on Shark.

SharkKM : KMeans unsupervised classifier based on Shark.

These models have a common interface, with the following major functions:

SetInputListSample(InputListSampleType ⋆in) : set the list of input samples

SetTargetListSample(TargetListSampleType ⋆in) : set the list of target samples

Train() : train the model based on input samples

Save(...) : saves the model to file

Load(...) : load a model from file

Predict(...) : predict a target value for an input sample

PredictBatch(...) : prediction on a list of input samples

The PredictBatch(...) function can be multi-threaded when called either from a multi-threaded filter, or
from a single location. In the later case, it creates several threads using OpenMP. There is a factory
mechanism on top of the model class (see otb::MachineLearningModelFactory ). Given an input file,
the static function CreateMachineLearningModel(...) is able to instantiate a model of the right
type.

For unsupervised models, the target samples still have to be set. They won’t be used so you can fill a
ListSample with zeros.

The models are trained from a list of input samples, stored in a itk::Statistics::ListSample
. For supervised classifiers, they also need a list of targets associated to each input sample.
Whatever the source of samples, it has to be converted into a ListSample before being fed into the
model.

Then, model-specific parameters can be set. And finally, the Train() method starts the learning step. Once
the model is trained it can be saved to file using the function Save(). The following examples show how to
do that.

The source code for this example can be found in the fileExamples/Learning/TrainMachineLearningModelFromSamplesExample.cxx.

This example illustrates the use of the otb::SVMMachineLearningModel class, which inherits from the
otb::MachineLearningModel class. This class allows the estimation of a classification model (supervised
learning) from samples. In this example, we will train an SVM model with 4 output classes, from 1000
randomly generated training samples, each of them having 7 components. We start by including the
appropriate header files.

The input parameters of the sample generator and of the SVM classifier are initialized.

int nbSamples=1000;int nbSampleComponents=7;int nbClasses=4;

Two lists are generated into a itk::Statistics::ListSample which is the structure used
to handle both lists of samples and of labels for the machine learning classes derived from
otb::MachineLearningModel . The first list is composed of feature vectors representing multi-component
samples, and the second one is filled with their corresponding class labels. The list of labels is composed of
scalar values.

In this example, the list of multi-component training samples is randomly filled with a random number
generator based on the itk::Statistics::MersenneTwisterRandomVariateGenerator class. Each
component’s value is generated from a normal law centered around the corresponding class label of each
sample multiplied by 100, with a standard deviation of 10.

Once both sample and label lists are generated, the second step consists in declaring the machine learning
classifier. In our case we use an SVM model with the help of the otb::SVMMachineLearningModel class
which is derived from the otb::MachineLearningModel class. This pure virtual class is based on the
machine learning framework of the OpenCV library ([14]) which handles other classifiers than the SVM.

Once the classifier is parametrized with both input lists and default parameters, except for the kernel type in
our example of SVM model estimation, the model training is computed with the Train method. Finally, the
Save method exports the model to a text file. All the available classifiers based on OpenCV are
implemented with these interfaces. Like for the SVM model training, the other classifiers can be
parametrized with specific settings.

SVMClassifier->Train(); SVMClassifier->Save(outputModelFileName);

The source code for this example can be found in the fileExamples/Learning/TrainMachineLearningModelFromImagesExample.cxx.

This example illustrates the use of the otb::MachineLearningModel class. This class allows the
estimation of a classification model (supervised learning) from images. In this example, we will train an
SVM with 4 classes. We start by including the appropriate header files.

Now, we need to declare the machine learning model which will be used by the classifier. In this example,
we train an SVM model. The otb::SVMMachineLearningModel class inherits from the pure virtual class
otb::MachineLearningModel which is templated over the type of values used for the measures and the
type of pixels used for the labels. Most of the classification and regression algorithms available through this
interface in OTB is based on the OpenCV library [14]. Specific methods can be used to set classifier
parameters. In the case of SVM, we set here the type of the kernel. Other parameters are let with their
default values.

The source code for this example can be found in the fileExamples/Classification/SupervisedImageClassificationExample.cxx.

In OTB, a generic streamed filter called otb::ImageClassificationFilter is available to classify any
input multi-channel image according to an input classification model file. This filter is generic because it
works with any classification model type (SVM, KNN, Artificial Neural Network,...) generated within the
OTB generic Machine Learning framework based on OpenCV ([14]). The input model file is smartly parsed
according to its content in order to identify which learning method was used to generate it. Once the
classification method and model are known, the input image can be classified. More details are
given in subsections ?? and ?? to generate a classification model either from samples or from
images. In this example we will illustrate its use. We start by including the appropriate header
files.

Moreover, it is necessary to define a otb::MachineLearningModelFactory which is templated over its
input and output pixel types. This factory is used to parse the input model file and to define which
classification method to use.

The classifiers are integrated in several OTB Applications. There is a base class that provides an easy access
to all the classifiers: otb::Wrapper::LearningApplicationBase . As each machine learning model has
a specific set of parameters, the base class LearningApplicationBase knows how to expose each type of
classifier with its dedicated parameters (a task that is a bit tedious so we want to implement it only once).
The DoInit() method creates a choice parameter named classifier which contains the different
supported classifiers along with their parameters.

The function Train(...) provide an easy way to train the selected classifier, with the corresponding
parameters, and save the model to file.

On the other hand, the function Classify(...) allows to load a model from file and apply it on a list of
samples.

Kernel based learning methods in general and the Support Vector Machines (SVM) in particular, have been
introduced in the last years in learning theory for classification and regression tasks, [133]. SVM have
been successfully applied to text categorization, [74], and face recognition, [103]. Recently,
they have been successfully used for the classification of hyperspectral remote-sensing images,
[15].

Simply stated, the approach consists in searching for the separating surface between 2 classes by the
determination of the subset of training samples which best describes the boundary between the 2 classes.
These samples are called support vectors and completely define the classification system. In the case where
the two classes are nonlinearly separable, the method uses a kernel expansion in order to make projections
of the feature space onto higher dimensionality spaces where the separation of the classes becomes
linear.

This subsection reminds the basic principles of SVM learning and classification. A good tutorial on SVM
can be found in, [17].

We have N samples represented by the couple (yi,xi),i = 1…N where yi∈{-1,+1} is the class label and
xi∈ℝn is the feature vector of dimension n. A classifier is a function

where
α are the classifier parameters. The SVM finds the optimal separating hyperplane which fulfills the
following constraints :

The samples with labels +1 and -1 are on different sides of the hyperplane.

The distance of the closest vectors to the hyperplane is maximised. These are the support
vectors (SV) and this distance is called the margin.

The separating hyperplane has the equation

with
w being its normal vector and x being any point of the hyperplane. The orthogonal distance to
the origin is given by . Vectors located outside the hyperplane have either w ⋅x +b > 0 or
w ⋅x +b < 0.

Therefore, the classifier function can be written as

The SVs are placed on two hyperplanes which are parallel to the optimal separating one. In order to find the
optimal hyperplane, one sets w and b :

Since there must not be any vector inside the margin, the following constraint can be used:

which can be rewritten as

The orthogonal distances of the 2 parallel hyperplanes to the origin are and . Therefore the
modulus of the margin is equal to and it has to be maximised.

Thus, the problem to be solved is:

Find w and b which minimise

under the constraint : yi(w ⋅xi+b) ≥ 1 i = 1…N.

This problem can be solved by using the Lagrange multipliers with one multiplier per sample. It can be
shown that only the support vectors will have a positive Lagrange multiplier.

In the case where the two classes are not exactly linearly separable, one can modify the constraints above by
using

If ξi> 1, one considers that the sample is wrong. The function which has then to be minimised is
∥w∥2+C;, where C is a tolerance parameter. The optimisation problem is the same than in the
linear case, but one multiplier has to be added for each new constraint ξi≥ 0.

If the decision surface needs to be non-linear, this solution cannot be applied and the kernel approach has to
be adopted.

One drawback of the SVM is that, in their basic version, they can only solve two-class problems. Some
works exist in the field of multi-class SVM (see [4, 137], and the comparison made by [59]), but they are
not used in our system.

You have to be aware that to achieve better convergence of the algorithm it is strongly advised to normalize
feature vector components in the [-1;1] interval.

For problems with N > 2 classes, one can choose either to train N SVM (one class against all the others), or
to train N ×(N -1) SVM (one class against each of the others). In the second approach, which is the one
that we use, the final decision is taken by choosing the class which is most often selected by the whole set of
SVM.

The Random Forests algorithm is also available in OTB machine learning framework. This model builds a
set of decision trees. Each tree may not give a reliable prediction, but taking them together, they form
a robust classifier. The prediction of this model is the mode of the predictions of individual
trees.

There are two implementations: one in OpenCV and the other on in Shark. The Shark implementation has a
noteworthy advantage: the training step is parallel. It uses the following parameters:

The number of trees to train

The number of random attributes to investigate at each node

The maximum node size to decide a split

The ratio of the original training dataset to use as the out of bag sample

Except these specific parameter, its usage is exactly the same as the other machine learning models (such as
the SVM model).

OTB has developed a specific interface for user-defined kernels. However, the following functions use a
deprecated OTB interface. The code source for these Generic Kernels has been removed from the official
repository. It is now available as a remote module: GKSVM.

A function k(⋅,⋅) is considered to be a kernel when:

∀g(⋅) ∈2(Rn)

so that ∫g(x)2dx be finite,

(19.1)

then ∫k(x,y)g(x)g(y)dxdy≥ 0,

which is known as the Mercer condition.

When defined through the OTB, a kernel is a class that inherits from GenericKernelFunctorBase. Several
virtual functions have to be overloaded:

The Evaluate function, which implements the behavior of the kernel itself. For instance, the
classical linear kernel could be re-implemented with:

otb::SpectralAngleKernelFunctor, a kernel that integrates the Spectral Angle, instead of
the Euclidean distance, into an inverse multiquadric kernel. This kernel may be appropriated
when using multispectral data.

otb::ChangeProfileKernelFunctor, a kernel which is dedicated to the supervized
classification of the multiscale change profile presented in section 21.5.1.

The source code for this example can be found in the fileExamples/Learning/SVMGenericKernelImageClassificationExample.cxx.

This example illustrates the modifications to be added to use the otb::SVMClassifier class for
performing SVM classification on images with a user-defined kernel. In this example, we will use an SVM
model estimated in the previous section to separate between water and non-water pixels by using the RGB
values only. The first thing to do is include the header file for the class as well as the header of the
appropriated kernel to be used.

The KMeans algorithm has been implemented in Shark library, and has been wrapped in the OTB machine
learning framework. It is the first unsupervised algorithm in this framework. It can be used in the same way
as other machine learning models. Remember that even if unsupervised model don’t use a label information
on the samples, the target ListSample still has to be set in MachineLearningModel. A ListSample filled
with zeros can be used.

This model uses a hard clustering model with the following parameters:

The source code for this example can be found in the fileExamples/Classification/ScalarImageKmeansClassifier.cxx.

This example shows how to use the KMeans model for classifying the pixel of a scalar image.

The itk::Statistics::ScalarImageKmeansImageFilter is used for taking a scalar image and
applying the K-Means algorithm in order to define classes that represents statistical distributions of
intensity values in the pixels. The classes are then used in this filter for generating a labeled image where
every pixel is assigned to one of the classes.

First we define the pixel type and dimension of the image that we intend to classify. With this image type
we can also declare the otb::ImageFileReader needed for reading the input image, create one and set
its input filename.

In general the classification will produce as output an image whose pixel values are integers associated to
the labels of the classes. Since typically these integers will be generated in order (0, 1, 2, ...N), the output
image will tend to look very dark when displayed with naive viewers. It is therefore convenient
to have the option of spreading the label values over the dynamic range of the output image
pixel type. When this is done, the dynamic range of the pixels is divided by the number of
classes in order to define the increment between labels. For example, an output image of 8
bits will have a dynamic range of [0:255], and when it is used for holding four classes, the
non-contiguous labels will be (0, 64, 128, 192). The selection of the mode to use is done with the method
SetUseContiguousLabels().

For each one of the classes we must provide a tentative initial value for the mean of the class. Given that this
is a scalar image, each one of the means is simply a scalar value. Note however that in a general case of
K-Means, the input image would be a vector image and therefore the means will be vectors of the same
dimension as the image pixels.

The itk::ScalarImageKmeansImageFilter is predefined for producing an 8 bits scalar image as
output. This output image contains labels associated to each one of the classes in the K-Means
algorithm. In the following lines we use the OutputImageType in order to instantiate the type of a
otb::ImageFileWriter . Then create one, and connect it to the output of the classification
filter.

We are now ready for triggering the execution of the pipeline. This is done by simply invoking the
Update() method in the writer. This call will propagate the update request to the reader and then to the
classifier.

At this point the classification is done, the labeled image is saved in a file, and we can take a look at
the means that were found as a result of the model estimation performed inside the classifier
filter.

Figure 19.1 illustrates the effect of this filter with three classes. The means can be estimated by
ScalarImageKmeansModelEstimator.cxx.

The source code for this example can be found in the fileExamples/Classification/ScalarImageKmeansModelEstimator.cxx.

This example shows how to compute the KMeans model of an Scalar Image.

The itk::Statistics::KdTreeBasedKmeansEstimator is used for taking a scalar image and applying
the K-Means algorithm in order to define classes that represents statistical distributions of intensity values
in the pixels. One of the drawbacks of this technique is that the spatial distribution of the pixels is not
considered at all. It is common therefore to combine the classification resulting from K-Means with other
segmentation techniques that will use the classification as a prior and add spatial information to it in order
to produce a better segmentation.

The source code for this example can be found in the fileExamples/Classification/KMeansImageClassificationExample.cxx.

The K-Means classification proposed by ITK for images is limited to scalar images and is not streamed. In
this example, we show how the use of the otb::KMeansImageClassificationFilter allows for a
simple implementation of a K-Means classification application. We will start by including the appropirate
header file.

#include"otbKMeansImageClassificationFilter.h"

We will assume double precision input images and will also define the type for the labeled
pixels.

We instantiate the classifier and the reader objects and we set their parameters. Please note the call
of the GenerateOutputInformation() method on the reader in order to have available the
information about the input image (size, number of bands, etc.) without needing to actually read the
image.

The most common termination criteria is that if there is no measurement vector that changes its cluster
membership from the previous iteration, then the algorithm stops.

The itk::Statistics::KdTreeBasedKmeansEstimator is a variation of this logic. The k-means
clustering algorithm is computationally very expensive because it has to recalculate the mean at each
iteration. To update the mean values, we have to calculate the distance between k means and each and every
measurement vector. To reduce the computational burden, the KdTreeBasedKmeansEstimator uses a special
data structure: the k-d tree ( itk::Statistics::KdTree ) with additional information. The additional
information includes the number and the vector sum of measurement vectors under each node under the tree
architecture.

With such additional information and the k-d tree data structure, we can reduce the computational cost of
the distance calculation and means. Instead of calculating each measurement vectors and k means, we can
simply compare each node of the k-d tree and the k means. This idea of utilizing a k-d tree can be found in
multiple articles [5][105][78]. Our implementation of this scheme follows the article by the Kanungo et al
[78].

Since the NormalVariateGenerator class only supports 1-D, we define our measurement
vector type as one component vector. We then, create a ListSample object for data inputs. Each
measurement vector is of length 1. We set this using the SetMeasurementVectorSize() method.

The following code snippet creates a NormalVariateGenerator object. Since the random variable generator
returns values according to the standard normal distribution (The mean is zero, and the standard deviation is
one), before pushing random values into the sample, we change the mean and standard deviation.
We want two normal (Gaussian) distribution data. We have two for loops. Each for loop uses
different mean and standard deviation. Before we fill the sample with the second distribution data,
we call Initialize(random seed) method, to recreate the pool of random variables in the
normalGenerator.

To see the probability density plots from the two distribution, refer to the Figure 19.2.

Figure 19.2: Two normal distributions’ probability density plot (The means are 100 and 200, and the standarddeviation is 30 )

Once we have the k-d tree, it is a simple procedure to produce k mean estimates.

We create the KdTreeBasedKmeansEstimator. Then, we provide the initial mean values using the
SetParameters(). Since we are dealing with two normal distribution in a 1-D space, the size of the mean
value array is two. The first element is the first mean value, and the second is the second mean value. If we
used two normal distributions in a 2-D space, the size of array would be four, and the first two elements
would be the two components of the first normal distribution’s mean vector. We plug-in the k-d tree using
the SetKdTree().

The remaining two methods specify the termination condition. The estimation process stops when the
number of iterations reaches the maximum iteration value set by the SetMaximumIteration(), or the
distances between the newly calculated mean (centroid) values and previous ones are within the
threshold set by the SetCentroidPositionChangesThreshold(). The final step is to call the
StartOptimization() method.

The for loop will print out the mean estimates from the estimation process.

If we are only interested in finding the mean estimates, we might stop. However, to illustrate how a
classifier can be formed using the statistical classification framework. We go a little bit further in this
example.

Since the k-means algorithm is an minimum distance classifier using the estimated k means and the
measurement vectors. We use the EuclideanDistanceMetric class as membership functions.
Our choice for the decision rule is the itk::Statistics::MinimumDecisionRule that
returns the index of the membership functions that have the smallest value for a measurement
vector.

After creating a SampleClassifierFilter object and a MinimumDecisionRule object, we plug-in the
decisionRule and the sample to the classifier. Then, we must specify the number of classes that will be
considered using the SetNumberOfClasses() method.

The remainder of the following code snippet shows how to use user-specified class labels. The classification
result will be stored in a MembershipSample object, and for each measurement vector, its class label will be
one of the two class labels, 100 and 200 (unsigned int).

The classifier is almost ready to do the classification process except that it needs two membership
functions that represents two clusters respectively.

In this example, the two clusters are modeled by two Euclidean distance functions. The distance function
(model) has only one parameter, its mean (centroid) set by the SetOrigin() method. To plug-in two
distance functions, we call the AddMembershipFunction() method. Then invocation of the Update()
method will perform the classification.

The Self Organizing Map, SOM, introduced by Kohonen is a non-supervised neural learning algorithm. The
map is composed of neighboring cells which are in competition by means of mutual interactions and they
adapt in order to match characteristic patterns of the examples given during the learning. The SOM is
usually on a plane (2D).

The algorithm implements a nonlinear projection from a high dimensional feature space to a lower
dimension space, usually 2D. It is able to find the correspondence between a set of structured data and a
network of much lower dimension while keeping the topological relationships existing in the
feature space. Thanks to this topological organization, the final map presents clusters and their
relationships.

Kohonen’s SOM is usually represented as an array of cells where each cell is, i, associated to a feature (or
weight) vector mi= T∈ℝn (figure 19.3).

Figure 19.3: Kohonen’s Self Organizing Map

A cell (or neuron) in the map is a good detector for a given input vector x= T∈ℝn
if the latter is close to the former. This distance between vectors can be represented by the
scalar product xT⋅mi, but for most of the cases other distances can be used, as for instance
the Euclidean one. The cell having the weight vector closest to the input vector is called the
winner.

The goal of the learning step is to get a map which is representative of an input example set. It is an iterative
procedure which consists in passing each input example to the map, testing the response of each neuron and
modifying the map to get it closer to the examples.

Algorithm 1SOM learning:

t = 0.

Initialize the weight vectors of the map (randomly, for instance).

While t < number of iterations, do:

k = 0.

While k < number of examples, do:

Find the vector mi(t) which minimizes the distance d(xk,mi(t))

For a neighborhood Nc(t) around the winner cell, apply the transformation:

(19.2)

k = k+1

t = t +1.

In 19.2, β(t) is a decreasing function with the geometrical distance to the winner cell. For instance:

(19.3)

with β0(t) and σ(t) decreasing functions with time and r the cell coordinates in the output – map –
space.

Therefore the algorithm consists in getting the map closer to the learning set. The use of a neighborhood
around the winner cell allows the organization of the map into areas which specialize in the recognition of
different patterns. This neighborhood also ensures that cells which are topologically close are also close in
terms of the distance defined in the feature space.

The source code for this example can be found in the fileExamples/Learning/SOMExample.cxx.

This example illustrates the use of the otb::SOM class for building Kohonen’s Self Organizing
Maps.

We will use the SOM in order to build a color table from an input image. Our input image is coded with
3×8 bits and we would like to code it with only 16 levels. We will use the SOM in order to learn which are
the 16 most representative RGB values of the input image and we will assume that this is the optimal color
table for the image.

The first thing to do is include the header file for the class. We will also need the header files for
the map itself and the activation map builder whose utility will be explained at the end of the
example.

Since the otb::SOM class uses a distance, we will need to include the header file for the one we want to
use

The Self Organizing Map itself is actually an N-dimensional image where each pixel contains a neuron. In
our case, we decide to build a 2-dimensional SOM, where the neurons store RGB values with floating point
precision.

We can now define the type for the map. The otb::SOMMap class is templated over the neuron type –
PixelType here –, the distance type and the number of dimensions. Note that the number of dimensions of
the map could be different from the one of the images to be processed.

typedef otb::SOMMap<VectorType, DistanceType, Dimension> MapType;

We are going to perform the learning directly on the pixels of the input image. Therefore, the image type is
defined using the same pixel type as we used for the map. We also define the type for the imge file
reader.

typedef otb::ImageFileReader<ImageType> ReaderType;

Since the otb::SOM class works on lists of samples, it will need to access the input image through an
adaptor. Its type is defined as follows:

typedef itk::Statistics::ListSample<VectorType> SampleListType;

We can now define the type for the SOM, which is templated over the input sample list and the type of the
map to be produced and the two functors that hold the training behavior.

As an alternative to standard SOMType, one can decide to use an otb::PeriodicSOM , which behaves like
otb::SOM but is to be considered to as a torus instead of a simple map. Hence, the neighborhood behavior
of the winning neuron does not depend on its location on the map... otb::PeriodicSOM is defined in
otbPeriodicSOM.h.

We can now start building the pipeline. The first step is to instantiate the reader and pass its output to the
adaptor.

Finally, we set up the las part of the pipeline where the plug the output of the SOM into the writer. The
learning procedure is triggered by calling the Update() method on the writer. Since the map is itself an
image, we can write it to disk with an otb::ImageFileWriter .

Figure 19.4 shows the result of the SOM learning. Since we have performed a learning on RGB pixel
values, the produced SOM can be interpreted as an optimal color table for the input image. It can be
observed that the obtained colors are topologically organised, so similar colors are also close in the map.
This topological organisation can be exploited to further reduce the number of coding levels of the pixels
without performing a new learning: we can subsample the map to get a new color table. Also, a
bilinear interpolation between the neurons can be used to increase the number of coding levels.

We can now compute the activation map for the input image. The activation map tells us how many times a
given neuron is activated for the set of examples given to the map. The activation map is stored as a scalar
image and an integer pixel type is usually enough.

The source code for this example can be found in the fileExamples/Learning/SOMClassifierExample.cxx.

This example illustrates the use of the otb::SOMClassifier class for performing a classification using
an existing Kohonen’s Self Organizing. Actually, the SOM classification consists only in the attribution of
the winner neuron index to a given feature vector.

We will use the SOM created in section 19.4.2.1 and we will assume that each neuron represents a class in
the image.

The first thing to do is include the header file for the class.

#include"otbSOMClassifier.h"

As for the SOM learning step, we must define the types for the otb::SOMMap, and therefore, also for
the distance to be used. We will also define the type for the SOM reader, which is actually an
otb::ImageFileReader which the appropriate image type.

The classification will be performed by the otb::SOMClassifier , which, as most of the classifiers,
works on itk::Statistics::ListSample s. In order to be able to perform an image classification, we
will need to use the itk::Statistics::ImageToListAdaptor which is templated over the type of
image to be adapted. The SOMClassifier is templated over the sample type, the SOMMap type and the
pixel type for the labels.

The classifier can now be instantiated. The input data is set by using the SetSample() method and the
SOM si set using the SetMap() method. The classification is triggered by using the Update()
method.

The source code for this example can be found in the fileExamples/Classification/SOMImageClassificationExample.cxx.

In previous examples, we have used the otb::SOMClassifier , which uses the ITK classification
framework. This good for compatibility with the ITK framework, but introduces the limitations
of not being able to use streaming and being able to know at compilation time the number of
bands of the image to be classified. In OTB we have avoided this limitation by developing the
otb::SOMImageClassificationFilter . In this example we will illustrate its use. We start by including
the appropriate header file.

#include"otbSOMImageClassificationFilter.h"

We will assume double precision input images and will also define the type for the labeled
pixels.

And finally, we define the readers (for the input image and theSOM) and the writer. Since the images, to
classify can be very big, we will use a streamed writer which will trigger the streaming ability of the
classifier.

The source code for this example can be found in the fileExamples/Classification/BayesianPluginClassifier.cxx.

In this example, we present a system that places measurement vectors into two Gaussian classes. The
Figure 19.6 shows all the components of the classifier system and the data flow. This system differs with
the previous k-means clustering algorithms in several ways. The biggest difference is that this classifier uses
the itk::Statistics::GaussianMembershipFunction as membership functions instead of
the itk::Statistics::EuclideanDistanceMetric . Since the membership function is
different, the membership function requires a different set of parameters, mean vectors and
covariance matrices. We choose the itk::Statistics::MeanSampleFilter (sample mean) and
the itk::Statistics::CovarianceSampleFilter (sample covariance) for the estimation
algorithms of the two parameters. If we want more robust estimation algorithm, we can replace these
estimation algorithms with more alternatives without changing other components in the classifier
system.

It is a bad idea to use the same sample for test and training (parameter estimation) of the parameters.
However, for simplicity, in this example, we use a sample for test and training.

Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector type as a one
component vector. We then, create a ListSample object for data inputs.

We also create two Subsample objects that will store the measurement vectors in sample into two separate
sample containers. Each Subsample object stores only the measurement vectors belonging to a single class.
This class sample will be used by the parameter estimation algorithms.

The following code snippet creates a NormalVariateGenerator object. Since the random variable generator
returns values according to the standard normal distribution (the mean is zero, and the standard deviation is
one) before pushing random values into the sample, we change the mean and standard deviation. We want
two normal (Gaussian) distribution data. We have two for loops. Each for loop uses different mean
and standard deviation. Before we fill the sample with the second distribution data, we call
Initialize(random seed) method, to recreate the pool of random variables in the normalGenerator. In
the second for loop, we fill the two class samples with measurement vectors using the AddInstance()
method.

To see the probability density plots from the two distributions, refer to Figure 19.2.

In the following code snippet, notice that the template argument for the MeanSampleFilter and
CovarianceFilter is ClassSampleType (i.e., type of Subsample) instead of SampleType (i.e., type
of ListSample). This is because the parameter estimation algorithms are applied to the class
sample.

After creating a SampleClassifierFilter object and a MaximumRatioDecisionRule object, we plug in the
decisionRule and the sample to the classifier. Then, we specify the number of classes that will be
considered using the SetNumberOfClasses() method.

The MaximumRatioDecisionRule requires a vector of a priori probability values. Such a priori probability
will be the P(ωi) of the following variation of the Bayes decision rule:

(19.4)

The remainder of the code snippet shows how to use user-specified class labels. The classification result
will be stored in a MembershipSample object, and for each measurement vector, its class label will be one
of the two class labels, 100 and 200 (unsigned int).

The classifier is almost ready to perform the classification except that it needs two membership
functions that represent the two clusters.

In this example, we can imagine that the two clusters are modeled by two Euclidean distance functions. The
distance function (model) has only one parameter, the mean (centroid) set by the SetOrigin() method. To
plug-in two distance functions, we call the AddMembershipFunction() method. Then invocation of the
Update() method will perform the classification.

The source code for this example can be found in the fileExamples/Classification/ExpectationMaximizationMixtureModelEstimator.cxx.

In this example, we present ITK’s implementation of the expectation maximization (EM) process to
generate parameter estimates for a two Gaussian component mixture model.

The Bayesian plug-in classifier example (see Section 19.4.3) used two Gaussian probability density
functions (PDF) to model two Gaussian distribution classes (two models for two class). However, in
some cases, we want to model a distribution as a mixture of several different distributions.
Therefore, the probability density function (p(x)) of a mixture model can be stated as follows
:

(19.5)

where i is the index of the component, c is the number of components, αi is the proportion of the
component, and fi is the probability density function of the component.

Now the task is to find the parameters(the component PDF’s parameters and the proportion values) to
maximize the likelihood of the parameters. If we know which component a measurement vector belongs to,
the solutions to this problem is easy to solve. However, we don’t know the membership of each
measurement vector. Therefore, we use the expectation of membership instead of the exact membership.
The EM process splits into two steps:

E step: calculate the expected membership values for each measurement vector to each
classes.

M step: find the next parameter sets that maximize the likelihood with the expected
membership values and the current set of parameters.

The E step is basically a step that calculates the a posteriori probability for each measurement
vector.

The M step is dependent on the type of each PDF. Most of distributions belonging to exponential family
such as Poisson, Binomial, Exponential, and Normal distributions have analytical solutions for updating the
parameter set. The itk::Statistics::ExpectationMaximizationMixtureModelEstimator class
assumes that such type of components.

Since the NormalVariateGenerator class only supports 1-D, we define our measurement vector type as a one
component vector. We then, create a ListSample object for data inputs.

We also create two Subsample objects that will store the measurement vectors in the sample into two
separate sample containers. Each Subsample object stores only the measurement vectors belonging to a
single class. This class sample will be used by the parameter estimation algorithms.

The following code snippet creates a NormalVariateGenerator object. Since the random variable generator
returns values according to the standard normal distribution (the mean is zero, and the standard deviation is
one) before pushing random values into the sample, we change the mean and standard deviation. We want
two normal (Gaussian) distribution data. We have two for loops. Each for loop uses different mean
and standard deviation. Before we fill the sample with the second distribution data, we call
Initialize() method to recreate the pool of random variables in the normalGenerator. In the second
for loop, we fill the two class samples with measurement vectors using the AddInstance()
method.

To see the probability density plots from the two distribution, refer to Figure 19.2.

In the following code snippet notice that the template argument for the MeanSampleFilter and
CovarianceSampleFilter is ClassSampleType (i.e., type of Subsample) instead of SampleType (i.e., type
of ListSample). This is because the parameter estimation algorithms are applied to the class
sample.

The Stochastic Expectation Maximization (SEM) approach is a stochastic version of the EM mixture
estimation seen on section 19.4.4. It has been introduced by [22] to prevent convergence of the EM
approach from local minima. It avoids the analytical maximization issued by integrating a stochastic
sampling procedure in the estimation process. It induces an almost sure (a.s.) convergence to the
algorithm.

From the initial two step formulation of the EM mixture estimation, the SEM may be decomposed into 3
steps:

E-step, calculates the expected membership values for each measurement vector to each
classes.

S-step, performs a stochastic sampling of the membership vector to each classes, according
to the membership values computed in the E-step.

The implementation of the SEM has been turned to a contextual SEM in the sense where the evaluation of
the membership parameters is conditioned to membership values of the spatial neighborhood of each
pixels.

The source code for this example can be found in the fileExamples/Learning/SEMModelEstimatorExample.cxx.

Input/Output images type are define in a classical way. In fact, a itk::VariableLengthVector is to be
considered for the templated MeasurementVectorType, which will be used in the ListSample
interface.

By default, otb::SEMClassifier performs initialization of ModelComponentBase by as many
instantiation of otb::Statistics::GaussianModelComponent as the number of classes to estimate in
the mixture. Nevertheless, the user may add specific distribution into the mixture estimation. It is
permitted by the use of AddComponent for the given class number and the specific distribution.

Figure 19.7 shows the result of the SEM segmentation with 4 different classes and a contextual
neighborhood of 3 pixels.

Figure 19.7: SEM Classification results.

As soon as the segmentation is performed by an iterative stochastic process, it is worth verifying the output
status: does the segmentation ends when it has converged or just at the limit of the iteration
numbers.

The itk::Statistics::MRFImageFilter uses the maximum a posteriori (MAP) estimates
for modeling the MRF. The object traverses the data set and uses the model generated by the
Mahalanobis distance classifier to get the the distance between each pixel in the data set to
a set of known classes, updates the distances by evaluating the influence of its neighboring
pixels (based on a MRF model) and finally, classifies each pixel to the class which has the
minimum distance to that pixel (taking the neighborhood influence under consideration). The
energy function minimization is done using the iterated conditional modes (ICM) algorithm
[12].

The source code for this example can be found in the fileExamples/Classification/ScalarImageMarkovRandomField1.cxx.

This example shows how to use the Markov Random Field approach for classifying the pixel of a scalar
image.

The itk::Statistics::MRFImageFilter is used for refining an initial classification by introducing the
spatial coherence of the labels. The user should provide two images as input. The first image is
the one to be classified while the second image is an image of labels representing an initial
classification.

The following headers are related to reading input images, writing the output image, and making the
necessary conversions between scalar and vector images.

First we define the pixel type and dimension of the image that we intend to classify. With this image type
we can also declare the otb::ImageFileReader needed for reading the input image, create one and set
its input filename.

As a second step we define the pixel type and dimension of the image of labels that provides the initial
classification of the pixels from the first image. This initial labeled image can be the output of a K-Means
method like the one illustrated in section 19.4.1.

Since the Markov Random Field algorithm is defined in general for images whose pixels have multiple
components, that is, images of vector type, we must adapt our scalar image in order to satisfy the interface
expected by the MRFImageFilter. We do this by using the itk::ScalarToArrayCastImageFilter .
With this filter we will present our scalar image as a vector image whose vector pixels contain a single
component.

With the input image type ImageType and labeled image type LabelImageType we instantiate the type of
the itk::MRFImageFilter that will apply the Markov Random Field algorithm in order to refine the
pixel classification.

We set now some of the parameters for the MRF filter. In particular, the number of classes to be used during
the classification, the maximum number of iterations to be run in this filter and the error tolerance that will
be used as a criterion for convergence.

The smoothing factor represents the tradeoff between fidelity to the observed image and the smoothness of
the segmented image. Typical smoothing factors have values between 1 5. This factor will
multiply the weights that define the influence of neighbors on the classification of a given pixel.
The higher the value, the more uniform will be the regions resulting from the classification
refinement.

mrfFilter->SetSmoothingFactor(smoothingFactor);

Given that the MRF filter needs to continually relabel the pixels, it needs access to a set of membership
functions that will measure to what degree every pixel belongs to a particular class. The classification is
performed by the itk::ImageClassifierBase class, that is instantiated using the type of the input
vector image and the type of the labeled image.

The classifier needs a decision rule to be set by the user. Note that we must use GetPointer() in the call of
the SetDecisionRule() method because we are passing a SmartPointer, and smart pointer cannot perform
polymorphism, we must then extract the raw pointer that is associated to the smart pointer. This extraction
is done with the GetPointer() method.

and we set the neighborhood radius that will define the size of the clique to be used in the computation of
the neighbors’ influence in the classification of any given pixel. Note that despite the fact that we call this a
radius, it is actually the half size of an hypercube. That is, the actual region of influence will not be circular
but rather an N-Dimensional box. For example, a neighborhood radius of 2 in a 3D image will
result in a clique of size 5x5x5 pixels, and a radius of 1 will result in a clique of size 3x3x3
pixels.

mrfFilter->SetNeighborhoodRadius(1);

We should now set the weights used for the neighbors. This is done by passing an array of values that
contains the linear sequence of weights for the neighbors. For example, in a neighborhood of
size 3x3x3, we should provide a linear array of 9 weight values. The values are packaged in a
std::vector and are supposed to be double. The following lines illustrate a typical set of values for a
3x3x3 neighborhood. The array is arranged and then passed to the filter by using the method
SetMRFNeighborhoodWeight().

We now scale weights so that the smoothing function and the image fidelity functions have
comparable value. This is necessary since the label image and the input image can have different
dynamic ranges. The fidelity function is usually computed using a distance function, such as the
itk::DistanceToCentroidMembershipFunction or one of the other membership functions. They tend
to have values in the order of the means specified.

Finally, the classifier class is connected to the Markov Random Fields filter.

mrfFilter->SetClassifier(classifier);

The output image produced by the itk::MRFImageFilter has the same pixel type as the
labeled input image. In the following lines we use the OutputImageType in order to instantiate
the type of a otb::ImageFileWriter . Then create one, and connect it to the output of the
classification filter after passing it through an intensity rescaler to rescale it to an 8 bit dynamic
range

We are now ready for triggering the execution of the pipeline. This is done by simply invoking the
Update() method in the writer. This call will propagate the update request to the reader and then to the
MRF filter.

Figure 19.8 illustrates the effect of this filter with four classes. In this example the filter was run with a
smoothing factor of 3. The labeled image was produced by ScalarImageKmeansClassifier.cxx and the
means were estimated by ScalarImageKmeansModelEstimator.cxx described in section 19.4.1. The
obtained result can be compared with the one of figure 19.1 to see the interest of using the MRF approach
in order to ensure the regularization of the classified image.

The ITK approach was considered not to be flexible enough for some remote sensing applications.
Therefore, we decided to implement our own framework.

Figure 19.9: OTB Markov Framework.

The source code for this example can be found in the fileExamples/Markov/MarkovClassification1Example.cxx.

This example illustrates the details of the otb::MarkovRandomFieldFilter . This filter is an application
of the Markov Random Fields for classification, segmentation or restoration.

This example applies the otb::MarkovRandomFieldFilter to classify an image into four classes
defined by their mean and variance. The optimization is done using an Metropolis algorithm with a random
sampler. The regularization energy is defined by a Potts model and the fidelity by a Gaussian
model.

The first step toward the use of this filter is the inclusion of the proper header files.

Then we must decide what pixel type to use for the image. We choose to make all computations with
double precision. The labelled image is of type unsigned char which allows up to 256 different
classes.

Finally, we define the different classes necessary for the Markov classification. A
otb::MarkovRandomFieldFilter is instantiated, this is the main class which connect the other to do the
Markov classification.

Two energy, deriving from the otb::MRFEnergy class need to be instantiated. One energy is required for
the regularization, taking into account the relashionship between neighborhing pixels in the
classified image. Here it is done with the otb::MRFEnergyPotts which implement a Potts
model.

Figure 19.10 shows the output of the Markov Random Field classification after 20 iterations with a random
sampler and a Metropolis optimizer.

Figure 19.10: Result of applying the otb::MarkovRandomFieldFilterto an extract from a PAN Quickbirdimage for classification. The result is obtained after 20 iterations with a random sampler and a Metropolisoptimizer. From left to right : original image, classification.

The source code for this example can be found in the fileExamples/Markov/MarkovClassification2Example.cxx.

Using a similar structure as the previous program and the same energy function, we are now going to
slightly alter the program to use a different sampler and optimizer. The proposed sample is proposed
randomly according to the MAP probability and the optimizer is the ICM which accept the proposed
sample if it enable a reduction of the energy.

As the otb::MRFOptimizerICM does not have any parameters, the call to optimizer->SetParameters()
must be removed

Apart from these, no further modification is required.

Figure 19.11 shows the output of the Markov Random Field classification after 5 iterations with a MAP
random sampler and an ICM optimizer.

Figure 19.11: Result of applying the otb::MarkovRandomFieldFilterto an extract from a PAN Quickbirdimage for classification. The result is obtained after 5 iterations with a MAP random sampler and an ICMoptimizer. From left to right : original image, classification.

The source code for this example can be found in the fileExamples/Markov/MarkovClassification3Example.cxx.

This example illustrates the details of the MarkovRandomFieldFilter by using the Fisher distribution to
model the likelihood energy. This filter is an application of the Markov Random Fields for
classification.

This example applies the MarkovRandomFieldFilter to classify an image into four classes defined by their
Fisher distribution parameters L, M and mu. The optimization is done using a Metropolis algorithm with a
random sampler. The regularization energy is defined by a Potts model and the fidelity or likelihood energy
is modelled by a Fisher distribution. The parameter of the Fisher distribution was determined for each class
in a supervised step. ( See the File OtbParameterEstimatioOfFisherDistribution ) This example is a
contribution from Jan Wegner.

Then we must decide what pixel type to use for the image. We choose to make all computations with
double precision. The labeled image is of type unsigned char which allows up to 256 different
classes.

Finally, we define the different classes necessary for the Markov classification. A MarkovRandomFieldFilter is
instantiated, this is the main class which connect the other to do the Markov classification.

An MRFSamplerRandomMAP, which derives from the MRFSampler, is instantiated. The sampler is in
charge of proposing a modification for a given site. The MRFSamplerRandomMAP, randomly pick one
possible value according to the MAP probability.

An MRFOptimizerMetropolis, which derives from the MRFOptimizer, is instantiated. The optimizer is in
charge of accepting or rejecting the value proposed by the sampler. The MRFSamplerRandomMAP, accept
the proposal according to the variation of energy it causes and a temperature parameter.

typedef otb::MRFOptimizerMetropolis OptimizerType;

Two energy, deriving from the MRFEnergy class need to be instantiated. One energy is required for the
regularization, taking into account the relationship between neighboring pixels in the classified image. Here
it is done with the MRFEnergyPotts, which implements a Potts model.

The second energy is used for the fidelity to the original data. Here it is done with a
MRFEnergyFisherClassification class, which defines a Fisher distribution to model the data.

Figure 19.12 shows the output of the Markov Random Field classification into four classes using the
Fisher-distribution as likelihood term.

Figure 19.12: Result of applying the otb::MarkovRandomFieldFilterto an extract from a PAN Quickbirdimage for classification into four classes using the Fisher-distribution as likehood term. From left to right : originalimage, classification.

The source code for this example can be found in the fileExamples/Markov/MarkovRegularizationExample.cxx.

This example illustrates the use of the otb::MarkovRandomFieldFilter . to regularize a classification
obtained previously by another classifier. Here we will apply the regularization to the output of an SVM
classifier presented in ??.

The reference image and the starting image are both going to be the original classification. Both
regularization and fidelity energy are defined by Potts model.

The convergence of the Markov Random Field is done with a random sampler and a Metropolis model as in
example 1. As you should get use to the general program structure to use the MRF framework, we are not
going to repeat the entire example. However, remember you can find the full source code for this example in
your OTB source directory.

To find the number of classes available in the original image we use the itk::LabelStatisticsImageFilter
and more particularly the method GetNumberOfLabels().

In order to obtain a relevant image classification it is sometimes necessary to fuse several classification
maps coming from different classification methods (SVM, KNN, Random Forest, Artificial
Neural Networks,...). The fusion of classification maps combines them in a more robust and
precise one. Two methods are available in the OTB: the majority voting and the Demspter Shafer
framework.

For each input pixel, the Majority Voting method consists in choosing the more frequent class label among
all classification maps to fuse. In case of not unique more frequent class labels, the undecided value is set
for such pixels in the fused output image.

The source code for this example can be found in the fileExamples/Classification/MajorityVotingFusionOfClassificationMapsExample.cxx.

The Majority Voting fusion filter itk::LabelVotingImageFilter used is based on ITK. For each pixel,
it chooses the more frequent class label among the input classification maps. In case of not unique more
frequent class labels, the output pixel is set to the undecidedLabel value. We start by including the
appropriate header file.

#include"itkLabelVotingImageFilter.h"

We will assume unsigned short type input labeled images.

constunsignedint Dimension=2;typedefunsignedshort LabelPixelType;

The input labeled images to be fused are expected to be scalar images.

A more adaptive fusion method using the Dempster Shafer theory (http://en.wikipedia.org/wiki/Dempster-Shafer_theory)
is available within the OTB. This method is adaptive as it is based on the so-called belief function of each
class label for each classification map. Thus, each classified pixel is associated to a degree of confidence
according to the classifier used. In the Dempster Shafer framework, the expert’s point of view (i.e. with a
high belief function) is considered as the truth. In order to estimate the belief function of each class label,
we use the Dempster Shafer combination of masses of belief for each class label and for each classification
map. In this framework, the output fused label of each pixel is the one with the maximal belief
function.

Like for the majority voting method, the Dempster Shafer fusion handles not unique class labels
with the maximal belief function. In this case, the output fused pixels are set to the undecided
value.

The confidence levels of all the class labels are estimated from a comparison of the classification maps to
fuse with a ground truth, which results in a confusion matrix. For each classification maps, these confusion
matrices are then used to estimate the mass of belief of each class label.

The source code for this example can be found in the fileExamples/Classification/DempsterShaferFusionOfClassificationMapsExample.cxx.

The fusion filter otb::DSFusionOfClassifiersImageFilter is based on the Dempster Shafer (DS)
fusion framework. For each pixel, it chooses the class label Ai for which the belief function bel(Ai) is
maximal after the DS combination of all the available masses of belief of all the class labels. The masses of
belief (MOBs) of all the labels present in each classification map are read from input *.CSV confusion
matrix files. Moreover, the pixels into the input classification maps to be fused which are equal to the
nodataLabel value are ignored by the fusion process. In case of not unique class labels with the maximal
belief function, the output pixels are set to the undecidedLabel value. We start by including the appropriate
header files.

We will assume unsigned short type input labeled images. We define a type for confusion matrices as
itk::VariableSizeMatrix which will be used to estimate the masses of belief of all the class labels for
each input classification map. For this purpose, the otb::ConfusionMatrixToMassOfBelief will be
used to convert each input confusion matrix into masses of belief for each class label.