Computer Program Learns To Play Classic Nintendo Games

Tom Murphy presented a research paper titled "The First Level of Super Mario Bros. is Easy with Lexicographic Orderings and Time Travel . . . after that it gets a little tricky" at SIGBOVIK 2013, in which he sets out a computational method for playing classic NES games. In this video, Murphy walks through the steps he took to build this Nintendo-game-playing artificial intelligence.

Then, through repeated playing, the artificial intelligence system Murphy calls 'Playfun' learns how to produce winning inputs again and again. The program, which Murphy describes as a "technique for automating NES games," can take on nearly every NES game, but it doesn't always win.

The paper, presented at SIGBOVIK 2013, is published online. As the video shows, the program does most of the things a normal human player would do, but it also consistently pulls off very difficult tricks to, say, attack two Goombas in rapid succession.

Mario bounces off one Goomba up into the feet of another. Not only does he lack the velocity to reach the platform, but he's about to hit that Goomba from below. Believe it or not, this ends well: Playfun is happy to exploit bugs in the game; in this case, whenever Mario is moving downward (his jump crests just before the Goomba hits him), this counts as "stomping" on an enemy, even if that enemy is above him. The additional bounce from this Goomba also allows him to make it up to the platform, which he wouldn't have reached otherwise! Source: Tom Murphy VII

Murphy writes:

Bytes in memory (and sometimes 16- and 32-bit words) can contain interesting game facts like the player’s position in the level or score. The central idea of this paper is to use (only) the value of memory locations to deduce when the player is “winning”. The things that a human player perceives, like the video screen and sound effects, are completely ignored. As an additional simplification, we assume that winning always consists of a value going up—either the position in the level getting larger, the score getting larger, the number of lives, the world or level number getting bigger, and so on.
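The "value going up" idea can be sketched in a few lines. The code below is a hypothetical illustration, not Murphy's actual implementation: given RAM snapshots from a human playthrough, it keeps only the byte addresses whose values never decrease, which are candidate "progress" counters like level position or score.

```python
# Toy sketch of the paper's core idea (illustrative, not Murphy's code):
# scan RAM snapshots from a human playthrough and keep the addresses
# whose values only ever go up -- candidate "winning" indicators.

def find_increasing_addresses(ram_snapshots):
    """ram_snapshots: list of equal-length byte sequences, one per frame."""
    n = len(ram_snapshots[0])
    candidates = set(range(n))
    for prev, cur in zip(ram_snapshots, ram_snapshots[1:]):
        # Discard any address that ever decreases between frames.
        candidates = {a for a in candidates if cur[a] >= prev[a]}
    # Drop constant addresses; they carry no progress signal.
    return sorted(a for a in candidates
                  if any(s[a] != ram_snapshots[0][a] for s in ram_snapshots))

# Example: address 0 counts up (like a position counter),
# address 1 jitters, address 2 never changes.
frames = [bytes([0, 5, 7]), bytes([1, 3, 7]), bytes([2, 9, 7])]
print(find_increasing_addresses(frames))  # -> [0]
```

The real paper goes further, learning lexicographic orderings over sets of memory locations rather than single bytes, but the filtering principle is the same.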

By giving the program a little training, in the form of a human player's recorded run, it becomes something of a whiz, stomping turtles and taking no mushroom prisoners. Playfun also excels at exploiting programming bugs in the game, such as when it squashes a Goomba in mid-air.

Murphy also ran a few other games through the system, including Tetris, but found that the program would eventually just pause itself rather than keep playing and lose, a tactic shared by annoying, over-competitive cousins around the world since the '80s.

Murphy concludes his paper by saying:

The approach uses an amusingly simple and mathematically elegant model to learn what constitutes "winning" according to a human player's inputs. It then uses hundreds of CPU hours to search different input sequences that seem to "win", inspecting only the RAM of the simulated game, ignoring any of the human outputs like video and sound. The technique works great on some games, getting farther into the game than the human's inputs did, and produces novel gameplay (for example, bug exploitation and super-human daredevil timing).
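That search loop can be sketched compactly. The snippet below is a deliberately simplified illustration, not Playfun itself (which uses far more elaborate search over emulator savestates): it tries every short input sequence against a toy "game", scores each outcome purely from the resulting state, and commits to the best one. The `step` and `score` functions here are made-up stand-ins for the emulator and the learned RAM-based objective.

```python
# Minimal sketch of the search idea (an illustration, not Playfun itself):
# enumerate short input sequences, score each only by the resulting
# game state, and pick the sequence that scores highest.
from itertools import product

def best_inputs(step, score, state, inputs, horizon=3):
    """Exhaustively pick the input sequence of length `horizon`
    that maximizes the state-derived score, starting from `state`."""
    best_seq, best_val = None, float("-inf")
    for seq in product(inputs, repeat=horizon):
        s = state
        for inp in seq:
            s = step(s, inp)  # simulate one frame of input
        if score(s) > best_val:
            best_seq, best_val = seq, score(s)
    return best_seq

# Toy "game": state is Mario's x position; only RIGHT advances it.
step = lambda x, inp: x + (1 if inp == "RIGHT" else 0)
score = lambda x: x  # objective: the position value keeps going up
print(best_inputs(step, score, 0, ["LEFT", "RIGHT", "JUMP"]))
# -> ('RIGHT', 'RIGHT', 'RIGHT')
```

Exhaustive enumeration like this explodes combinatorially with the horizon, which is why the real system burns hundreds of CPU hours and uses cleverer search strategies.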