Saturday, December 28, 2013

In the last post we saw that competent GAs are those that automatically identify building blocks (BBs) and exchange these BBs without disrupting them. As a simple example, the compact genetic algorithm (cGA) was presented, which competently solves the trap-n functions. Despite the good results obtained by cGA in those environments, the algorithm is far too simple for truly tough problems such as the m-concatenated trap-n problems, depicted in Figure 1. In these kinds of problems, m deceptive trap-n functions are concatenated into a single, bigger problem. cGA's probability-vector representation cannot detect the complicated combinations of BBs, so, once again, a new strategy is required to tackle these challenging environments: we need an order-n probabilistic optimization algorithm (in contrast, cGA is of order 1).

Figure 1: the 4-concatenated trap-3: it is composed of four trap-3 functions and the objective is to find 111111111111, but the problem is deceptive and guides the learner towards 000000000000 (a local optimum!).

Following the initial steps of the primeval estimation-of-distribution algorithms (EDAs), a more complex probabilistic model is used to detect BBs (instead of the simple probability vector). One of the most successful EDAs that uses this idea is the extended compact genetic algorithm (ECGA) [1]. This algorithm works by detecting the trap sub-functions and then computing the proportions of all the m combinations of n bits in the corresponding string positions. ECGA does this by learning marginal product models (MPMs). The idea is that an MPM contains the probabilities of each combination of BBs, and these are used to model and sample forthcoming populations. To obtain the right grouping of BBs, the algorithm uses a combined complexity criterion (CC): it greedily tries to minimize the number of bits required to describe the model while keeping the model accurate.

ECGA works as in the following (rather simplified) pseudocode:

1. Generate a random population of size N.

2. Evaluate the fitness of each individual.

3. Build the first MPM assuming that all the variables are independent, and compute its CC.

4. Greedily compact the current MPM (newMPM), trying the combination of BBs with the best score (newCC).
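The greedy compaction step (steps 3 and 4 above) can be sketched in a few lines. This is a minimal Python sketch, not the original implementation: the function names `mpm_cost` and `greedy_mpm` are mine, and the cost uses one common MDL-style formulation of the combined complexity (model complexity log2(N)·(2^|g| − 1) per group, plus N times the empirical entropy of each group as the compressed population complexity); the exact constants vary across descriptions of ECGA.

```python
import math
from itertools import combinations

def mpm_cost(groups, pop):
    """Combined complexity (CC) of a marginal product model:
    model complexity + compressed population complexity (MDL-style)."""
    N = len(pop)
    # model complexity: bits to store the marginal table of each group
    cm = sum(math.log2(N) * (2 ** len(g) - 1) for g in groups)
    # compressed population complexity: N * empirical entropy per group
    cp = 0.0
    for g in groups:
        counts = {}
        for ind in pop:
            key = tuple(ind[i] for i in g)
            counts[key] = counts.get(key, 0) + 1
        for c in counts.values():
            pgc = c / N
            cp += -N * pgc * math.log2(pgc)
    return cm + cp

def greedy_mpm(pop, n):
    """Start with all variables independent; greedily merge the pair of
    groups that most reduces CC, until no merge improves the score."""
    groups = [(i,) for i in range(n)]
    cc = mpm_cost(groups, pop)
    improved = True
    while improved:
        improved = False
        best = None
        for a, b in combinations(range(len(groups)), 2):
            merged = (groups[:a] + groups[a + 1:b] + groups[b + 1:]
                      + [tuple(sorted(groups[a] + groups[b]))])
            new_cc = mpm_cost(merged, pop)
            if best is None or new_cc < best[0]:
                best = (new_cc, merged)
        if best is not None and best[0] < cc:
            cc, groups = best
            improved = True
    return groups, cc
```

For instance, on a population where two bits are always equal to each other and a third bit varies independently, the merge of the two linked bits lowers CC, so the greedy search groups them into one BB and leaves the independent bit alone.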

Sunday, December 15, 2013

We reserve the term Competent Genetic Algorithms for those GAs that solve hard problems quickly, accurately and reliably [1]. We already know that GAs process building blocks (BBs): schemata of low order---few specific bits---and short defining length---small distance between those bits---with above-average fitness. However, crossover may disrupt these BBs. Ideally, crossover should identify the fundamental BBs of the problem at hand and mix them well, but in practice this rarely happens. To tackle this issue a radical approach is required: remove the classic selecto-recombinative operators from the GA loop and develop strategies that automatically identify BBs while ensuring that they are not disrupted. Researchers call this strategy Linkage Learning [2].

Estimation of Distribution Algorithms (EDAs) use probabilistic models to perform this task: they learn a probabilistic model of the most promising solutions and then build new candidates by sampling from that model.
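In its simplest univariate flavor, one generation of this learn-then-sample loop might look like the following. This is an illustrative Python sketch (a UMDA-style step, not code from any paper); the function name `eda_step` and the elite fraction are my own assumptions.

```python
import random

def eda_step(pop, fitness, elite_frac=0.5):
    """One generation of a simple univariate EDA: select the best
    individuals, estimate bitwise marginals, resample a fresh population."""
    pop = sorted(pop, key=fitness, reverse=True)
    elite = pop[: max(1, int(len(pop) * elite_frac))]
    n = len(pop[0])
    # learn: probability of a 1 at each position, estimated from the elite
    p = [sum(ind[i] for ind in elite) / len(elite) for i in range(n)]
    # sample: build new candidates from the model
    return [[1 if random.random() < p[i] else 0 for i in range(n)]
            for _ in range(len(pop))]
```

Note that this treats every bit independently; the point of the rest of this post is what happens when that independence assumption is too weak.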

One of the simplest forms of EDA is the so-called compact genetic algorithm (cGA) [2]. cGA uses a probability vector to represent a population of strings. Furthermore, the population is completely replaced by this probability vector---i.e., no explicit population is stored in memory, hence the algorithm's name. At each iteration cGA generates two solutions from the probability vector, evaluates them, and finally updates the probability vector according to the outcome of the fitness comparison.

I coded a simple cGA in R to solve the trap-n function: a simple boolean function that is deceptive---i.e., it misleads the search toward a local optimum [1].

The situation is the following: for n = 5, the learner has to reach the chromosome 11111, but the fitness computation misleads the search towards 00000 (a local optimum!). Figure 1 depicts this situation for the trap-5 function. This problem is hard for a traditional GA (especially for the simple GA), but cGA solves it quickly and accurately.

Edit: notice that Figure 1 depicted a slightly different trap-5 function---in the pictured version of the function I forgot to count 0 as a valid solution. Also notice that the fitness function leads the system toward 00000, not 00001.
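For reference, the experiment can be reproduced with a minimal sketch of cGA on trap-5. This is a Python rendition, not the original R code from the post; the population size and iteration cap are illustrative assumptions.

```python
import random

def trap5(bits):
    """Deceptive trap-5: the optimum is 11111 (fitness 5), but every other
    string scores 4 - u, so the slope pulls the search toward 00000."""
    u = sum(bits)
    return 5 if u == 5 else 4 - u

def sample(p):
    """Draw one binary string from the probability vector."""
    return [1 if random.random() < pi else 0 for pi in p]

def cga(fitness, n, pop_size=200, max_iters=100000):
    """Compact GA: the population is replaced by a probability vector that is
    nudged by 1/pop_size toward the winner of each pairwise tournament."""
    p = [0.5] * n
    for _ in range(max_iters):
        a, b = sample(p), sample(p)
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(n):
            if winner[i] != loser[i]:
                step = 1.0 / pop_size
                p[i] += step if winner[i] == 1 else -step
                p[i] = min(1.0, max(0.0, p[i]))
        if all(pi in (0.0, 1.0) for pi in p):
            break
    return p
```

Once a position of `p` reaches 0 or 1 both sampled solutions agree there, so it is never updated again: the vector drifts toward a corner of the hypercube, and which corner it reaches on a given run depends on the simulated population size and on luck---exactly what makes trap functions a useful benchmark.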