I finally got around to tinkering with Wallace, the tool for using genetic algorithms to play Nintendo games, again. The latest version of the software has several new features, a couple of which I’ll mention here.

The biggest is that your solutions can now take the form of a button input sequence, instead of being a program that runs in a VM and generates button input. This has the advantage of producing interesting results a lot faster, at the cost of your solutions being tied to your initial conditions. So, while you might be able to generate something that plays a particular Dr. Mario screen very well, it will crash and burn on any different starting condition — it’s just a dumb sequence of moves, after all.
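To make that concrete, here's a minimal sketch of what a button-input-sequence solution might look like. Everything here (the names, the button list, the frame count) is my own illustration, not Wallace's actual representation:

```python
import random

# Hypothetical genome: one button press (or nothing) per frame,
# rather than a program that decides what to press at runtime.
BUTTONS = ["A", "B", "Up", "Down", "Left", "Right"]  # Select/Start omitted

def random_solution(n_frames):
    """A 'dumb sequence of moves': the input for each frame, fixed in advance."""
    return [random.choice(BUTTONS + [None]) for _ in range(n_frames)]

solution = random_solution(600)  # roughly ten seconds of NES play at 60 fps
```

Because the genome is just this flat list, replaying it against any starting condition other than the one it evolved on is exactly the crash-and-burn scenario described above.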

There are many other, more subtle, improvements to the evolver as well. The selection phase now uses tournament selection to choose who lives on to the next generation, instead of just taking the top n solutions so far. This lets you vary selection pressure from brutal (top n) to nonexistent (creationists' "evolution is entirely random" strawman) and anywhere in between. You can also now choose the mutation rate from a much wider range, in particular setting it far lower than the (ridiculously high) minimum available in the previous version.

Also, the evolver prompts you to save your progress before it closes. I can’t tell you how many times the lack of that feature has caused me to lose a lot of data.

So, how do things shape up now? Here are the statistics from a relatively quick sample run, on the same Dr. Mario screen as last time. This time, I'm using button input sequences with Select and Start filtered out (you can do that now too), a population ranging from 200 to 2000 solutions, fairly mild selection pressure (15%ish?), and a maximum mutation rate of 20 mutations per organism. The fitness function now measures final score, with penalties for the time intervals between each time you earn points. So, if you score 300 points at time 5 and 900 points at time 15, your fitness is 300 – 5 + 900 – (15 – 5) = 1185. (The ability to express that kind of fitness function is another new feature.)
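That fitness function is easy to mirror in code. The sketch below is my own restatement of the rule above (the function name and the event-list representation are mine), reproducing the worked example:

```python
def fitness(scoring_events):
    """Total points earned, penalized by the gap between scoring events.
    scoring_events is a chronological list of (time, points) pairs."""
    total, last_time = 0, 0
    for t, points in scoring_events:
        total += points - (t - last_time)  # points minus the wait for them
        last_time = t
    return total

fitness([(5, 300), (15, 900)])  # 300 - 5 + 900 - (15 - 5) = 1185
```

Note that scoring nothing at all yields a fitness of 0, not a negative number, which is why the -1 mentioned below shouldn't be possible.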

So, here are the stats (did I mention collecting these statistics is another new feature?):

A 100% randomly generated solution still managed to score 2400 points in the first generation.

By the sixth generation, at least half the solutions are better than the best solution in the first generation.

Each generation, the mean and median fitness increase.

It should be impossible for the fitness to be -1 with the fitness function I defined. There’s a bug somewhere.

Even so, the minimum shows no improvement from generation to generation; mutations can still render a good solution bad.

Of course, you want to see that top solution in the sixth generation, don’t you? Well, unlike the VM-based ones from last time, which ended up doing the same thing over and over, these solutions look like they’re being played by a monkey on crack who occasionally makes a really good move. Of course, since my lease doesn’t cover monkeys and none of my crack dealers are returning my calls, the monkey-on-crack theory is only a hypothesis.

Anyway, here’s the video of the top solution of generation six:

If you’re feeling adventurous, you can try your hand at Wallace yourself: