Readme for hmep-0.1.1

Multi Expression Programming

Not enough Haskell machine learning libraries, you say?

Here is yet another one!

History

There are many other Genetic Algorithm (GA) packages for Haskell.
Personally, I have used simple genetic algorithm, GA, and moo
for quite a long time. The last of these was my favorite, but the
other two are also great.

However, when I came across this MEP paper, to my surprise
there was no MEP implementation in Haskell.
Soon I realized that the existing GA packages are limited,
and that it would be more efficient to implement MEP from scratch.

That is how this package got started. I would also like to thank
the authors of the moo GA library, which inspired the present
hmep package.

About MEP

Multi Expression Programming is a genetic programming variant that encodes
multiple solutions in the same chromosome. A chromosome is a computer program.
Each gene features code reuse: it can refer to the results of earlier genes.

How is MEP different from other genetic programming (GP) methods?
Consider a classical example of tree-based GP.
The number of nodes needed to encode x^N using a binary tree is 2N-1.
With MEP encoding, however, redundancies can be dramatically
reduced, so that the shortest chromosome that encodes the same
expression has only N/2 nodes!
That often results in significantly reduced computational costs
when evaluating MEP chromosomes. Moreover, all the intermediate
solutions such as x^(N/2), x^(N/4), etc. are provided by the
chromosome as well.
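
To make the code reuse concrete, here is a minimal sketch of how such a
chromosome could be evaluated. The Gene type and the evalChromosome
function are illustrative assumptions for this README and are not the
hmep API. For N = 8, a binary tree needs 2*8-1 = 15 nodes, while the
chromosome below needs 4 genes and yields x, x^2, x^4, and x^8 in one pass.

```haskell
module Main where

-- Illustrative types only; hmep's real data structures may differ.
-- A gene is either the input variable or a binary operation whose
-- arguments are indices of earlier genes in the same chromosome.
data Gene
  = Var
  | Add Int Int
  | Mul Int Int
  deriving Show

type Chromosome = [Gene]

-- Evaluate every gene of the chromosome at the point x.
-- The list 'values' is shared, so each gene is computed once and
-- reused by every later gene that refers to it.
evalChromosome :: Chromosome -> Double -> [Double]
evalChromosome genes x = values
  where
    values = map eval genes
    eval Var       = x
    eval (Add i j) = values !! i + values !! j
    eval (Mul i j) = values !! i * values !! j

-- x^8 by repeated squaring: x, x^2, x^4, x^8.
powerOfEight :: Chromosome
powerOfEight = [Var, Mul 0 0, Mul 1 1, Mul 2 2]

main :: IO ()
main = print (evalChromosome powerOfEight 2)
-- prints [2.0,4.0,16.0,256.0]
```

Every element of the result is a candidate solution, which is why the
intermediate powers come for free.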

Versatility. hmep can be applied to solve regression problems with
one or multiple outputs. That means you can approximate unknown functions
or solve classification tasks. The only requirement is a custom
loss function.
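
As an illustration of what a custom loss function might look like, here is
a sketch of a mean squared error over a dataset of (x, y) pairs. The names
mse and datasetLoss are hypothetical, and the loss type that hmep actually
expects may differ.

```haskell
module Main where

-- Mean squared error between predicted and target values.
mse :: [Double] -> [Double] -> Double
mse predicted target =
  sum (zipWith (\p t -> (p - t) ^ (2 :: Int)) predicted target)
    / fromIntegral (length target)

-- Hypothetical helper: the loss of a candidate function f over a
-- dataset of (input, expected output) pairs.
datasetLoss :: [(Double, Double)] -> (Double -> Double) -> Double
datasetLoss dataset f = mse (map (f . fst) dataset) (map snd dataset)

main :: IO ()
main = do
  let dataset = [(x, x * x) | x <- [0, 1, 2, 3]]
  print (datasetLoss dataset (\x -> x * x))  -- exact fit, prints 0.0
  print (datasetLoss dataset id)             -- poor fit, prints 10.0
```

For a classification task, the same shape would apply with a different
error measure in place of mse.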

Effectively, the solution cos^2(x) = 1 - sin^2(x) was found.
Of course, MEP is a stochastic method, so there is
no guarantee of finding the globally optimal solution.

The unknown function approximation problem can be illustrated
by the following suboptimal solution for a given set of random
data points (blue crosses). This example was produced by another run of
the same demo, after 100 generations of 100 chromosomes.
The following expression was obtained:
y(x) = 3*0.31248786462471034 - sin(sin^2(x)).
Interestingly, the approximating function lies symmetrically
between the extrema of the unknown function, approximately
described by the blue crosses.

Example 2

A similar example is to approximate sin(x) using only
addition and multiplication operators, i.e. with polynomials.

$ stack exec hmep-sin-approximation

The algorithm is able to automatically figure out the
powers of x. That is where MEP really shines. We calculate
30 expressions represented by each chromosome with practically no
additional computational penalty. Then, we
choose the best expression. In this run, we automatically obtained a
seventh-degree polynomial
encoded by 14 genes. Pretty cool, yeah?
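
The "choose the best expression" step can be sketched as follows. This is
not the code of the hmep-sin-approximation demo. The Gene type, the
bestGene function, and the hand-written chromosome are assumptions made
for illustration. The chromosome uses only addition and multiplication
and ends in the cubic x - x^3/6, far shorter than the polynomial found
by the demo.

```haskell
module Main where

import Data.List (minimumBy, transpose)
import Data.Ord (comparing)

-- Illustrative types only; hmep's real data structures may differ.
data Gene
  = Var
  | Const Double
  | Add Int Int
  | Mul Int Int

-- Evaluate all genes at x; each gene value is computed once and shared.
evalChromosome :: [Gene] -> Double -> [Double]
evalChromosome genes x = values
  where
    values = map eval genes
    eval Var       = x
    eval (Const c) = c
    eval (Add i j) = values !! i + values !! j
    eval (Mul i j) = values !! i * values !! j

-- Mean squared error between predicted and target values.
mse :: [Double] -> [Double] -> Double
mse predicted target =
  sum (zipWith (\p t -> (p - t) ^ (2 :: Int)) predicted target)
    / fromIntegral (length target)

-- Score every gene as a candidate solution and return the index and
-- loss of the best one. The chromosome is evaluated once per sample.
bestGene :: [Gene] -> [(Double, Double)] -> (Int, Double)
bestGene genes dataset = minimumBy (comparing snd) (zip [0 ..] losses)
  where
    rows    = [evalChromosome genes x | (x, _) <- dataset]
    targets = map snd dataset
    losses  = [mse column targets | column <- transpose rows]

-- Genes: x, x^2, x^3, -1/6, -x^3/6, x - x^3/6.
cubicSine :: [Gene]
cubicSine = [Var, Mul 0 0, Mul 1 0, Const (-1 / 6), Mul 2 3, Add 0 4]

main :: IO ()
main = do
  let dataset = [(x, sin x) | x <- [-1, -0.5, 0, 0.5, 1]]
  print (fst (bestGene cubicSine dataset))
  -- prints 5, the index of the gene x - x^3/6
```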