Even more quick summaries of research papers

Uses the MNIST handwritten digits dataset and trains both a normal deep network and a convolutional network.

Results:

HyperNEAT on its own performs very badly.

Using HyperNEAT to generate a number of layers and then running backprop on the final layer achieves 58.4% for normal ANNs.

Using HyperNEAT to generate a number of layers and then running backprop on the final layer of a convolutional neural net achieves 92.1%.

This paper seems to miss that one of the advantages of HyperNEAT is its ability to scale to different input sizes. It would be nice to see how it performs when given images with different dimensions versus a traditional approach, which has to do a standard Photoshop-style resize and then work off the resized image.

Also would love to see some details of exactly how the algorithms were implemented

A big problem with HyperNEAT is that, though it can express a potentially infinite number of nodes and connections, the number of hidden nodes must be decided in advance.

ES-HyperNEAT is an attempt to extend HyperNEAT so that it can determine how many hidden nodes it should have.

ES-HyperNEAT uses the connection weight values themselves to determine node placement: areas with more complex (higher variance) weight patterns should be given more nodes.

There is an addition called Link Expression Output (LEO) for HyperNEAT, where whether or not there is a connection between 2 nodes is not calculated from the magnitude of the weight but from another evolved parameter. This also works with ES-HyperNEAT.
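A minimal sketch of the difference, assuming the evolved network returns a weight and a separate link-expression value for each pair of nodes. The function names and both threshold values are hypothetical, chosen only for illustration:

```python
def express_by_magnitude(weight, weight_threshold=0.2):
    """Without LEO: keep the connection only if the weight is large enough.

    weight_threshold is an assumed value, not one taken from the paper.
    """
    return weight if abs(weight) > weight_threshold else None


def express_with_leo(weight, leo, leo_threshold=0.0):
    """With LEO: a separate evolved value decides whether the link exists.

    The weight is used as-is, so even a small weight can be expressed.
    """
    return weight if leo > leo_threshold else None
```

With LEO, a small weight such as 0.05 can still produce a connection when the link-expression value is positive, which magnitude thresholding would drop.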

The algorithm is:

When creating a connection between 2 nodes (e.g. an input and an output), rather than just calculating a single weight value, we consider 4 hidden nodes (quadtree style) and calculate the weight for each.

If the variance of the weights is above some threshold, we create all four nodes.

Otherwise we just have the single connection.

When doing the connections for the 4 sub-nodes we may do more subdivisions (up to some max depth).
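The steps above can be sketched roughly as follows. This is only an illustration of the variance-driven quadtree idea as summarized here, not the paper's implementation: `weight_fn` is a hypothetical stand-in for the evolved network that maps a source point and a target point to a weight, and the threshold and depth values are assumptions.

```python
from statistics import pvariance


def subdivide(src, cx, cy, half, depth, max_depth, threshold, weight_fn):
    """Return (x, y, weight) points reached from src inside one square region.

    (cx, cy) is the region center and half is half of its side length.
    A single entry at the region center means no subdivision happened.
    """
    q = half / 2.0
    # Centers of the four quadrants of this region.
    children = [(cx - q, cy - q), (cx - q, cy + q),
                (cx + q, cy - q), (cx + q, cy + q)]
    weights = [weight_fn(src[0], src[1], x, y) for x, y in children]

    # Low variance (or max depth reached): keep just the single connection.
    if depth >= max_depth or pvariance(weights) <= threshold:
        return [(cx, cy, weight_fn(src[0], src[1], cx, cy))]

    # High variance: create all four nodes and try subdividing each further.
    points = []
    for x, y in children:
        points.extend(subdivide(src, x, y, q, depth + 1,
                                max_depth, threshold, weight_fn))
    return points


def flat(x1, y1, x2, y2):
    # Hypothetical weight pattern with no variation at all.
    return 0.5


def sloped(x1, y1, x2, y2):
    # Hypothetical weight pattern that varies everywhere.
    return x2 + 2.0 * y2


flat_points = subdivide((0.0, -1.0), 0.0, 0.0, 1.0, 0, 3, 0.01, flat)
sloped_points = subdivide((0.0, -1.0), 0.0, 0.0, 1.0, 0, 2, 0.0, sloped)
```

A flat weight pattern collapses to one connection, while a pattern that varies everywhere keeps subdividing until the depth limit, so node density follows the variance of the weights.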

Normally genetic algorithms are trained with an objective (fitness) function, and the agents that perform best against it are selected for.

This leads to problems with local minima and often results in lots of similar behaviors with small tweaks being selected for, which does not allow the more complex sets of interactions required to emerge.

Novelty search instead adds a measure of the novelty (distinctness) of the resulting behavior to the fitness function.

Gives the example of teaching a robot to walk: a fitness function would start off by just selecting the robot that fell furthest toward the goal, which would likely never lead to complex, interesting behavior.

But if you reward different kinds of falls, some may arise that balance for a bit in any direction. Over time these can adapt to move toward the actual objective.

Recommends average distance to k nearest neighbors as a good measure of novelty.
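As a rough sketch of that measure, assuming each behavior is summarized as a point (for example, the robot's final x, y position) and that Euclidean distance is a reasonable way to compare them. The function name and the choice of descriptor are hypothetical:

```python
import math


def novelty(behavior, others, k=3):
    """Average distance from behavior to its k nearest neighbors in others."""
    dists = sorted(math.dist(behavior, other) for other in others)
    nearest = dists[:k]
    # With no other behaviors to compare against, report zero novelty.
    return sum(nearest) / len(nearest) if nearest else 0.0


# Final positions of three earlier robots (made-up values).
seen = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
score = novelty((0.0, 0.0), seen, k=2)
```

A behavior that lands far from everything seen so far gets a high score, even if it is no closer to the goal.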