A Genetic Algorithm to Improve Network Coherence

August 19th 2016

Biomimicry is always a pretty entertaining concept. It lends us the likes of fractals, robotics, and in its purest form is used to model parts of the world to better understand the minutiae of natural systems. Probably my favourite aspect of nature, from a somewhat biased standpoint that I share with most other living beings, is natural selection. The very beast which delivered me into existence. Engineered with sublime fitness to be master of the environment I inhabit.

The "Problem"

At work, we had an application that was driven by a state machine, which in turn was controlled by JSON. We maintained two versions of this configuration: one in the application that did the heavy lifting, and the other in some graphing software to allow us to visualise and discuss the system.

Someone Suggested:

Wouldn't it be great if we could just generate the graph from our config?

HELL YES. That would be great (I mean, it's kind of easier all round to have the graph generate our JSON, but being smart now isn't going to allow us to dick around with some AI).

The Network

Our state machine effectively consisted of a bunch of states that were connected to other states, with varying routes between them. The primary goal of our solution would be to 'untangle' the network, so that it could be seen in its simplest form for all to admire. Here's a sample of the input:

Mutating the way to success

I wanted to solve the "problem" by placing all nodes of the network down randomly, connecting them all up, then quantifying the simplicity of the network. I'd do this a whole bunch of times. At a certain point, I'd take the best scoring (fittest) from amongst them and modify those to form the next generation of networks.
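The loop described above can be sketched roughly as follows. This is a minimal illustration, not the original code: the function names, canvas size, population numbers, and the single "nudge one node" mutation are all assumptions, and the scoring function is left as a parameter for the caller to supply.

```python
import random

def random_layout(nodes, width, height, rng):
    """Place every node at a random (x, y) position on the canvas."""
    return {n: (rng.uniform(0, width), rng.uniform(0, height)) for n in nodes}

def mutate(layout, width, height, rng, step=40.0):
    """Assumed mutation: nudge one randomly chosen node, clamped to the canvas."""
    child = dict(layout)
    node = rng.choice(sorted(child))
    x, y = child[node]
    child[node] = (
        min(max(x + rng.uniform(-step, step), 0.0), width),
        min(max(y + rng.uniform(-step, step), 0.0), height),
    )
    return child

def evolve(nodes, score, width=800.0, height=600.0, population=30,
           survivors=5, generations=200, seed=0):
    """Return the lowest-scoring layout found (lower score = fitter)."""
    rng = random.Random(seed)
    pool = [random_layout(nodes, width, height, rng) for _ in range(population)]
    for _ in range(generations):
        # Keep the fittest few unchanged (an assumption), refill with mutants.
        pool.sort(key=score)
        parents = pool[:survivors]
        children = [mutate(rng.choice(parents), width, height, rng)
                    for _ in range(population - survivors)]
        pool = parents + children
    return min(pool, key=score)
```

Because the fittest parents are carried over untouched in this sketch, the best score can never get worse from one generation to the next.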

Scoring Fitness

Intending to create nice, small, and simple networks, the factors I used to score my network's fitness were as follows:

The Euclidean distance of all connecting lines

Area of overlapping nodes

Area of nodes outside of the canvas

Line intersections

Lines intersecting with other nodes

Each of these would increase the score, so the best (fittest) network would be the one with the lowest total score.
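As a rough sketch of how those five factors could be combined into one number: the code below assumes nodes are fixed-size, axis-aligned rectangles centred on their position, that lines run straight between node centres, and that each factor gets a hand-picked weight. The node size, canvas size, weights, and all function names are illustrative assumptions rather than the values I actually used.

```python
import math
from itertools import combinations

NODE_W, NODE_H = 60.0, 30.0  # assumed node size

def rect(pos):
    """Bounding box (left, top, right, bottom) of a node centred at pos."""
    x, y = pos
    return (x - NODE_W / 2, y - NODE_H / 2, x + NODE_W / 2, y + NODE_H / 2)

def overlap_area(a, b):
    """Area shared by two axis-aligned rectangles."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0.0

def _orient(p, q, r):
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def segments_cross(p1, p2, p3, p4):
    """True when two segments properly cross; shared endpoints don't count."""
    d1, d2 = _orient(p3, p4, p1), _orient(p3, p4, p2)
    d3, d4 = _orient(p1, p2, p3), _orient(p1, p2, p4)
    return d1 * d2 < 0 and d3 * d4 < 0

def segment_hits_rect(p, q, r):
    """True when segment p-q passes through the interior of rectangle r."""
    t0, t1 = 0.0, 1.0
    for d, lo, hi, start in ((q[0] - p[0], r[0], r[2], p[0]),
                             (q[1] - p[1], r[1], r[3], p[1])):
        if d == 0:
            if not lo < start < hi:
                return False
        else:
            ta, tb = (lo - start) / d, (hi - start) / d
            t0, t1 = max(t0, min(ta, tb)), min(t1, max(ta, tb))
    return t0 < t1

def fitness(layout, edges, canvas=(800.0, 600.0),
            weights=(1.0, 1.0, 1.0, 500.0, 500.0)):
    """Lower is better. Weights are assumed, one per factor."""
    w_len, w_overlap, w_outside, w_cross, w_hit = weights
    rects = {n: rect(p) for n, p in layout.items()}
    length = sum(math.dist(layout[u], layout[v]) for u, v in edges)
    overlap = sum(overlap_area(rects[a], rects[b])
                  for a, b in combinations(rects, 2))
    canvas_rect = (0.0, 0.0, canvas[0], canvas[1])
    outside = sum(NODE_W * NODE_H - overlap_area(r, canvas_rect)
                  for r in rects.values())
    crossings = sum(segments_cross(layout[a], layout[b], layout[c], layout[d])
                    for (a, b), (c, d) in combinations(edges, 2))
    # A line only "hits" nodes other than the two it connects.
    hits = sum(segment_hits_rect(layout[u], layout[v], rects[n])
               for u, v in edges for n in rects if n not in (u, v))
    return (w_len * length + w_overlap * overlap + w_outside * outside
            + w_cross * crossings + w_hit * hits)
```

The relative weights matter a lot here: if line length is cheap to improve compared with removing a crossing, the search will happily shrink a tangled network rather than untangle it.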

Mutations

I couldn't think of a way to effectively breed between different networks so, like many computer enthusiasts before me, I resorted to asexual reproduction, opting to derive mutated forms of the parents in subsequent generations. These mutations fell into three categories:

Results

Given enough time, I hoped this would produce nice-looking, coherent networks. And, for smaller networks, it did!

However, when trying to render larger, more complex networks, there came a point where the fitness would hit a wall. Tangles that made it past a certain point became less and less likely to be removed:

After sitting and watching my networks fail to unfold for some time, I slowly realised that I'd designed my networks to be greedy. They were being hindered by cheap hits of fitness. By shrinking the size of the network (scored by the Euclidean distance) and drawing all the nodes closer together, there was less and less room available for nodes to mutate and untangle themselves.

In addition, the networks needed to get off to a good start. Some of them would have quite fundamental flaws in their structure which could require, for example, that 4 nodes would need to mutate in unison to the other side of the network to untangle completely. Either I needed to mutate my nodes in groups or I needed to iterate through more candidate networks.

Barbaric Island Species

To this end, I modified the simulation to spawn several island species which mutate in isolation. The fittest networks from each island would then be used to grow a single mainland species.

To ensure these island species had untangled as well as possible, I removed the size constraint and increased the mutation rates. I figured that if the networks could mutate without being limited by size, I could introduce the size constraint once the topology was simplified, to boil down the dimensions of the network.
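A minimal sketch of that two-phase setup, under some loud assumptions: `tangle_score` and `full_score` are hypothetical caller-supplied functions (the first ignores network size, the second includes it), the "increased mutation rate" is modelled simply as a larger nudge distance, and the island counts, population sizes, and canvas size are made up for illustration.

```python
import random

def mutate(layout, rng, step):
    """Assumed mutation: nudge one randomly chosen node by up to `step`."""
    child = dict(layout)
    node = rng.choice(sorted(child))
    x, y = child[node]
    child[node] = (x + rng.uniform(-step, step), y + rng.uniform(-step, step))
    return child

def run_generations(pool, score, rng, step, generations, survivors=5):
    """Evolve a pool in isolation; returns it sorted, fittest (lowest) first."""
    size = len(pool)
    for _ in range(generations):
        parents = sorted(pool, key=score)[:survivors]
        pool = parents + [mutate(rng.choice(parents), rng, step)
                          for _ in range(size - survivors)]
    return sorted(pool, key=score)

def island_evolve(nodes, tangle_score, full_score, islands=4, island_size=20,
                  island_generations=100, mainland_generations=100, seed=0):
    rng = random.Random(seed)
    founders = []
    for _ in range(islands):
        pool = [{n: (rng.uniform(0, 800), rng.uniform(0, 600)) for n in nodes}
                for _ in range(island_size)]
        # Islands: no size penalty and big mutation steps, so nodes have
        # room to jump across the network and untangle.
        ranked = run_generations(pool, tangle_score, rng, step=120.0,
                                 generations=island_generations)
        founders.extend(ranked[:max(1, island_size // islands)])
    # Mainland: seeded by the fittest islanders, now scored with the size
    # constraint and gentler mutations to shrink the untangled layout.
    mainland = run_generations(founders, full_score, rng, step=30.0,
                               generations=mainland_generations)
    return mainland[0]
```

The islands never exchange members in this sketch; they only meet when their fittest layouts are pooled to found the mainland population.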

Nice!

This improved the success rate dramatically. Here are some examples of the networks the GA designed.

In The End

While I'm claiming success, the results wouldn't make for super readable diagrams. There are some readability traits which most of us somewhat instinctively bestow on diagrams we produce. We align the nodes horizontally and vertically with one another as best we can. The lines we use to connect them are usually either bendy or are restricted to right angles. The GA isn't totally reliable either; about 15% of the time the resulting network still contains knots and tangles.

But failings aside, I'm actually kind of excited that my networks developed quirks. The Euclidean distance was intended to shrink the network as much as possible. As it turns out, however, that favours a somewhat circular arrangement of nodes over a denser collection with some potentially longer connections between them. This gives the networks a vaguely organic quality, and they end up resembling weird geological continents or amoebas:

In short: I had fun solving the wrong problem with a powerful technology and created a wonky shape creator that eats CPUs like Weetos.