Learning to AI: The Little Stickthing That Could!

Learning more about the practical development of AIs has been a project pretty high on my list of priorities for way too long now. My interest, in particular, is in how one can go about implementing personalities and learning for video game AIs in order to provide a more immersive and nuanced gaming experience. The aim is not necessarily to make the AIs more difficult. Rather, it's to make the AIs more personal and to let them make use of certain game mechanics more effectively.

Anyway - I've been meaning to write a whole blog post of its own about both my process of learning more about AIs and my thoughts on video game AIs. Talk about decision trees, pathfinding, agent systems, sensors, optimal difficulty curves... But this post is not about any of that.

This post is about one of my first forays into simple feedforward neural networks and learning behaviour. Bringing to you.. The Little Stickthing That Could!

At first, I control the creature and after that, as the video speeds up, it's up to the AI.

This is, of course, a ridiculously simple implementation, but I wanted to try something a little different after copying my first Python digit recognizer from some random tutorial.

For its "brain", the Stickthing has a mere 3 input nodes, 3 hidden nodes in a single hidden layer, and 2 output nodes:

The data being fed to the input nodes is fairly simple: Input 1 gets the angle of the back leg, input 2 gets the angle of the front leg, and input 3 gets the height difference between the two legs. This is all the data the "brain" has - two angles and a height difference.

The two output nodes' values range from -1.0 to 1.0 and represent the force impulses to apply to the legs: output 1 is the force for the back leg, output 2 the force for the front leg.
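The forward pass of such a tiny network fits in a few lines. A minimal sketch, with two assumptions the post doesn't spell out: the activation function (I use tanh, since the outputs range from -1.0 to 1.0) and the absence of bias nodes:

```python
import numpy as np

rng = np.random.default_rng(0)

# 3 inputs -> 3 hidden -> 2 outputs; all weights randomized at the start
w_hidden = rng.uniform(-1.0, 1.0, size=(3, 3))   # input-to-hidden weights
w_output = rng.uniform(-1.0, 1.0, size=(3, 2))   # hidden-to-output weights

def forward(back_angle, front_angle, height_diff):
    """One forward pass: two leg angles and a height difference in,
    two force impulses (each in -1.0..1.0) out."""
    x = np.array([back_angle, front_angle, height_diff])
    hidden = np.tanh(x @ w_hidden)     # tanh assumed; keeps values bounded
    return np.tanh(hidden @ w_output)  # outputs land in -1.0..1.0
```

A controller system would call `forward` a few times per second and feed the two returned values straight into the physics engine as impulses on the legs.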

At the start, all the weights between the nodes are randomized. A few times per second, a controller system feeds new inputs to the neural network and grabs the output values, using them directly to apply force impulses to the appropriate legs of the Stickthing.

This is allowed to run for some time, then we look at how far the Stickthing made it. After that, we take a single weight, randomize it, and put the Stickthing back at the start to try again. Each such cycle is called a generation. If a generation makes it further than the last, the new random weight is kept. If it doesn't, the change is discarded and the old value is restored for that particular weight.
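This scheme is essentially single-weight random hill climbing. A sketch of the loop, where `run_episode` is a hypothetical stand-in for the physics simulation that returns the distance travelled:

```python
import random

def train(weights, run_episode, generations=200):
    """Mutate one weight per generation; keep the change only if the
    Stickthing travels further than the current record."""
    best_distance = run_episode(weights)
    for _ in range(generations):
        i = random.randrange(len(weights))
        old = weights[i]
        weights[i] = random.uniform(-1.0, 1.0)   # randomize a single weight
        distance = run_episode(weights)
        if distance > best_distance:
            best_distance = distance             # improvement: keep it
        else:
            weights[i] = old                     # regression: restore old value
    return weights, best_distance
```

Because a change is only ever kept when it beats the record, the best distance can never decrease across generations - which is exactly why the system creeps forward but can also get stuck on a local optimum.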

I didn't really expect very much from the system, but interestingly, it does repeatedly manage to make it about as far as I could by manually controlling the Stickthing. The control system is pretty bad - there are only two joints after all - and I can't make it more than a little bit up the second hill. The neural network achieves about the same result. Also, while it does occasionally take largish steps back, it seems to quickly recover its previous record. I find it quite interesting - and amusing - that such a very, very simple system is capable of at least minor improvement. Capped improvement, slow improvement, but damn it - it's improvement.

Now there are lots of small things that could be improved. Different activation functions could be tried out. More joints could be added to the Stickthing. The whole thing could be changed to a recurrent neural network, to model movement arcs a little better and to be able to take past states into account. I don't know how much I'll really bother to improve this any more, though getting into recurrent networks is my next goal as far as neural networks go.