@PhdThesis{500000550739,
author = "Makoto Tanji",
title = "Program Evolution using Nonparametric Probabilistic
Model",
school = "The University of Tokyo",
year = "2011",
address = "Japan",
keywords = "genetic algorithms, genetic programming",
URL = "http://ci.nii.ac.jp/naid/500000550739
http://iss.ndl.go.jp/books/R100000002-I023389085-00",
abstract = "This thesis describes research on Genetic
Programming (GP) using a nonparametric probabilistic
model. Genetic Programming, a method of Evolutionary
Computation that applies the mechanisms of biological
evolution to optimization, can generate programs and
functions by using tree-structured data. It can be
applied to complex problems that are difficult for
humans to solve, such as robot routing problems and
regression problems.
Conventional Evolutionary Computation methods, GP and
GA (Genetic Algorithm), use crossover and mutation
operators inspired by sexual reproduction. In recent
years, new approaches called EDA (Estimation of
Distribution Algorithm) and PMBGA (Probabilistic Model
Building Genetic Algorithm), which build a
probabilistic model from promising solutions, have
shown promise. EDA and PMBGA can handle dependencies
between variables and can solve complex problems that
are difficult for GA. In particular, it is widely
accepted that EDA and PMBGA are well suited to problems
with dependencies between variables.
In the past decade, EDA-GP and PMBGP (Probabilistic
Model Building GP) have been studied. EDA-GP uses a
PPT (Probabilistic Prototype Tree) or a stochastic
grammar as its probabilistic model. However, it is
difficult to represent dependencies among solutions
because of the variable-length tree structure.
Furthermore, the number of parameters and the number
of promising solutions increase rapidly, so the
computational cost in CPU time and memory becomes
high. In this research, a comparative experiment on
EDA-GP using the PPT model is reported.
I propose two GP methods to overcome the problems
above: PORTS (Program Optimization by Random Tree
Sampling) and PERCE (Program Evolution using Related
Clique Extraction). These methods do not need to
calculate and store the parameters of a probabilistic
model; instead, PORTS and PERCE use the current
solutions directly to create new solutions.
PORTS creates a new solution by sampling and
concatenating fragments of promising trees. The size
of the fragments follows a geometric distribution to
maintain the diversity of the population. Experimental
results show that PORTS solves GP-easy problems much
faster than conventional GP.
PERCE uses a set of subgraphs, called ``related
cliques'', each of which contains a number of mutually
related PPT nodes. It does not require explicit
calculation and storage of the parameters.
Experimental results show that PERCE solves GP-hard
problems such as deceptive and multimodal problems.
I summarize this research and discuss how to apply GP
to practical problems. Finally, further research
topics are presented and discussed.",
}