I was reading this thread earlier and noticed a link to this website showing an example of a genetic algorithm. I found this very interesting so I decided to create my own version of the app the author describes.

The algorithm quickly makes an approximation, but finds details much harder

For the most part it works quite well, but the small population (only two) means that it quickly gets into an evolutionary dead end and is unable to evolve out of it. If I wanted to experiment properly then I needed a much larger population, and the image generating algorithm isn't suited to that. As the genetic code is its own entity it can be used to control any style of app that takes numeric input and feeds back a quantifiable success rate. As most games take some kind of user input and give a defined score, it made perfect sense to have the gene codes control a simple game and see how well they do. And so was born The Mario Genome.

Given enough time the genetic code can equal any human player

The premise is simple: a platformer with two controls, right and jump, and a population of 1,000 genetically controlled Marios. Each run the genetic codes attempt the level and are given feedback on their success. The 500 least successful codes then die and the remaining 500 reproduce to bring the population back to 1,000. This lifecycle is repeated indefinitely, and every generation, as a whole, makes improvements over the last. In theory the Marios should keep improving until they can complete the level in the perfect time.
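For anyone who wants to play with the idea, the lifecycle described above can be sketched roughly like this. This is only a sketch under assumptions, not the code from the app: the 180-byte genome length is the figure given further down the thread, while `random_genome`, `mutate`, the mutation `rate` and the way a parent is picked for each child are all made up for illustration.

```python
import random

GENOME_LENGTH = 180  # bytes per gene code (figure quoted later in the thread)
POPULATION = 1000
SURVIVORS = 500


def random_genome(rng):
    """Hypothetical starting point: a genome of random bytes."""
    return bytes(rng.randrange(256) for _ in range(GENOME_LENGTH))


def mutate(genome, rng, rate=0.01):
    """Assumed mutation: each byte is replaced by a random byte with probability `rate`."""
    return bytes(rng.randrange(256) if rng.random() < rate else b for b in genome)


def next_generation(population, fitness, rng):
    """Keep the fitter half, then refill the population with mutated copies of survivors."""
    ranked = sorted(population, key=fitness, reverse=True)  # best first
    survivors = ranked[:SURVIVORS]
    children = [
        mutate(rng.choice(survivors), rng)
        for _ in range(len(population) - len(survivors))
    ]
    return survivors + children
```

Because the surviving half is carried over unchanged, the best score in the population can never get worse from one run to the next in this sketch.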

So how did they do? Well, the first task was for them to complete the level, which they did in only 1935 generations. It actually surprised me how quickly they learnt how to get to the end. Next I played the level myself and decided that a good speedrun time was 452 ticks (a tick being 1/60 second). The evolving Marios made the time at generation 3010. Finally I calculated the best time possible was 431 ticks and set them about the task of equalling it. This was much harder and took them a very long time (several hours) to do, but they did make it at generation 7705.

OK, that's enough talk; here are the apps (Windows and Mac) for anyone who wants to try them out. Evo is the image app and you'll need an image for it to test against; 256x256 pixels works best. Evoplat is The Mario Genome and the controls are on screen for that one. Some things to note. The generation count may not go up every run; this is because the last batch of children were all worse performers than their parents, so the same parents have a new batch of children. My CPU is quite beefy so the apps run really fast on my setup, but I can't vouch for the speed on slower machines. The Alpha Mario is the Mario who was most successful on the previous run; he may not actually be the best of the current crop. As the genetic codes adapt to their current environment, it is possible for the same codes to learn more than one game. As long as the codes can interface with the game and get quantifiable feedback, these simple gene codes can play it.

What next? Well I'm done with this as I've used up too much time already. If I was to expand upon it I'd try something like Mario 1-1, and give the gene codes full game controls.

This is awesome, this deserves to get posted somewhere where it will be seen by more people than it would just in the technical forum here. Reddit and such would love this, esp. if you could make a YouTube video demonstrating your process.

I'd be curious how much differently things would turn out if you add recombination (sexual reproduction effectively). This can make genetic algorithms more effective by allowing "good ideas" to spread through the population more quickly. Although I'm not sure how much it would help in this particular case. The "genome" you're mutating is just the string of run-or-jump keypress commands to execute, right? So it seems like one problem here is that once you've got the full run complete, mutations toward the end become extremely desirable and mutations toward the beginning become extremely undesirable, because mutations toward the beginning could possibly invalidate the input after (i.e. you jump at time t to get over a block, but after a mutation earlier in the string you are no longer at that block at time t, making the jump pointless).
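To make the recombination idea concrete, here is a minimal sketch of single-point crossover, assuming the genome is a fixed-length sequence of commands. Single-point is just one common choice, and the `crossover` function and its parameters are hypothetical rather than anything from the app.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: the child takes the start of one parent and the rest of the other.

    Assumes both parents are equal-length sequences with at least two elements.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    cut = rng.randrange(1, len(parent_a))  # cut point strictly inside the genome
    return parent_a[:cut] + parent_b[cut:]
```

Note that this runs into the same problem described above: the tail inherited from the second parent was tuned for a different start of the run, so its jumps may no longer line up with the level.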

Thanks. I didn't really know where to post this as it was just a personal experiment and I wasn't sure if anyone would share my interest. Like you I thought that changes in behaviour early in the run would be less desirable than changes in later behaviour, but after testing for a while earlier changes seem to make the best progress. The behaviour at the end of the run always seems to be more efficient than at the start so the biggest room for improvement is always near the start. I'm not exactly sure why. One thing that should be noted is that the way the test is set up it's a little unfair on the control (random) Mario. To make this a completely fair test the control should have a population of 1,000 too.

@mcc: In the image app the gene codes were variable length, but in the Mario app they are fixed to 180 bytes. The fitness test was the distance to the end of the level, with the time taken as a tie breaker.
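A fitness test of that shape (distance first, time only as a tie breaker) maps neatly onto a tuple sort key. The sketch below is illustrative only; `RunResult` and its field names are invented for the example and are not taken from the app.

```python
from typing import NamedTuple


class RunResult(NamedTuple):
    distance_to_goal: float  # remaining distance to the end of the level; 0 means finished
    ticks: int               # time taken, in 1/60 second ticks


def fitness_key(result):
    """Smaller is better: compare distance first, then time as the tie breaker."""
    return (result.distance_to_goal, result.ticks)


def rank(results):
    """Return the results ordered from most to least successful."""
    return sorted(results, key=fitness_key)
```

With this ordering a Mario that finishes in 431 ticks ranks above one that finishes in 452, and both rank above any Mario that did not reach the end, however quickly it died.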

@agj: This is an evolving AI so to speak; behaviour patterns that are successful are passed on to the next generation. You could also think of it as a kind of 'trial and error' AI I suppose. This technique can be used for far more than just keypresses. If gene codes got more feedback on their current environment they could develop behaviour patterns for set situations, but it would be a far more complicated experiment than I'm prepared to try.

* Sample ten points near the character as relative coordinates.
* A genome contains a transformation of the weightings for each possible action on finding each possible block type at each of the ten points.
* Mutations can involve moving sample points and/or changing behaviours but the probability of N changes decreases as some function of N.
* Crossover is done in the obvious way.
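The list leaves "some function of N" open. One simple option, purely as an assumption, is a geometric distribution, where each extra change is half as likely as the one before; `mutation_count` and `p_stop` are hypothetical names for this sketch.

```python
import random


def mutation_count(rng, p_stop=0.5):
    """Draw the number of changes N >= 1 so that P(N = n) = p_stop * (1 - p_stop) ** (n - 1).

    Larger N is always less likely than smaller N for 0 < p_stop < 1.
    """
    n = 1
    while rng.random() >= p_stop:  # keep adding changes with probability 1 - p_stop
        n += 1
    return n
```

Most children then differ from their parent in one or two places, with the occasional large jump still possible.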

The reason this probably isn't worth doing is that the incremental fitness function you'd need would lead to nasty evolutionary dead ends. For example, if running along the ground gets you 20 blocks in but then you're doomed, whereas running along an elevated platform can finish the level but will kill you after 10 blocks unless you're careful, then early selection pressure will favour the bad route.

Yeah, writing AI to deal with the unexpected dead-ends in the level would be tough. It'd have to learn to back off. Or you know, you could handle it like real evolution. Let the ones that go into the dead-end die and encourage diversity at such branching decisions.

I'd have to think about it, but there's something to it. Pretty amazing stuff; I do remember there was a Mario AI competition a year ago or so. It had some of the same ideas.

I love things like this. I hope I have enough time at some point to make a fully genetic ecosystem sim, possibly with meiosis simulations with linkage and whatnot too. It's interesting to see that it's possible to code learning behaviours.

I actually designed a hugely complex "learning" AI for stress at some point but never implemented it. I think I'll have to try something like this soon, thanks for getting me back onto it.

Or you know, you could handle it like real evolution. Let the ones that go into the dead-end die and encourage diversity at such branching decisions.

That actually won't work with the current setup.

First it would be necessary to change the per-generation death criterion from the 50% least fit to a probability of death based on fitness. That way evolutionary dead ends still leave a small number of individuals without the fatal flaw from whom new approaches can develop.
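As a rough illustration of that change, here is a sketch where each individual survives a generation with a probability that rises with its fitness. The linear mapping and the `floor` parameter are assumptions chosen for the example; the post does not say which function to use.

```python
import random


def cull(population, fitnesses, rng, floor=0.05):
    """Probabilistic selection: fitter individuals are more likely to survive.

    Fitness is rescaled to 0..1 across the population. The worst individual
    survives with probability `floor` and the best with probability 1.
    """
    lo, hi = min(fitnesses), max(fitnesses)
    span = hi - lo
    survivors = []
    for individual, f in zip(population, fitnesses):
        score = (f - lo) / span if span else 1.0  # everyone survives if all are equal
        p_survive = floor + (1.0 - floor) * score
        if rng.random() < p_survive:
            survivors.append(individual)
    return survivors
```

Unlike killing a fixed 50%, the number of survivors varies from run to run, so the population would need topping up to its full size by reproduction afterwards.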

They handle dead ends quite well actually. It only takes one mutated gene code to go down the non dead-end route and he will do much better than the others. He will therefore always survive and go on to have many children who in turn have many more children, and so on. Eventually the group that go down the dead-end will die out. In my example something similar to this happens with the first split path. Initially all the Marios go down the drop as it's easier at first, but further on it is in fact a much harder path and so the few mutations that take one of the higher routes survive better, until eventually they all go that way. I'm sure there would be situations they'd find hard to overcome, but that is more due to the lack of feedback I give them about the level than any flaw in their design. If the game used a path finding algo. to determine their distance from the end of the level they wouldn't really go down the dead-end in the first place.

@Core Xii: How fast does it run? Without the vSync it should run as fast as your computer will allow it, as there isn't any timing code at all. As for the jumping I know plenty of real humans that randomly jump about when playing Mario.