Abstract

Game playing has been a core domain of artificial intelligence research since the beginnings of the field. It provides clearly defined arenas within which computational approaches can be readily compared to human expertise through head‐to‐head competition and other benchmarks. Research in the area has identified several simple core algorithms that provide successful foundations, with development focused on the challenge of defeating human experts in specific games. Key developments include minimax search in chess, machine learning from self‐play in backgammon, and Monte Carlo tree search in Go. These approaches have generalized successfully to additional games. While computers have surpassed human expertise in a wide variety of games, open challenges remain, and research focuses on identifying and developing new successful algorithmic foundations.

WIREs Cogn Sci 2014, 5:193–205. doi: 10.1002/wcs.1278

This article is categorized under: Computer Science > Artificial Intelligence

Images

Estimated rankings of the best available chess‐playing programs over the development of the field, compared to the ranking of the human champion. Rankings are given on the standard chess Elo scale. Estimates from before 2000 include systems using high‐end special‐purpose hardware, while estimates from after 2000 are taken from the SSDF Rating List, which tests programs on commodity computing hardware. The maximum rating achieved by human chess champions is approximately 2900; current computer chess performance is well beyond this level.
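To make the Elo scale concrete: the expected score of one player against another follows a logistic curve in the rating difference. The sketch below is the standard Elo expected-score formula (the function name is ours, not from the article):

```python
def elo_expected_score(r_a, r_b):
    """Expected score (win probability, with draws counting 1/2) of a
    player rated r_a against one rated r_b on the Elo scale. A 400-point
    rating advantage corresponds to 10:1 expected odds."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

# A program rated 3300 facing a 2900 human champion is expected to
# score about 0.91 per game.
```

This is why a rating gap of several hundred points, as in the figure, translates into near-certain victory for the stronger player over a match of any length.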

Rankings of Monte Carlo Go programs; each ranking was held for at least 20 rated games on the KGS Go server, where programs compete against a wide variety of human opponents. Rankings are shown relative to the 6 kyu KGS ranking achieved in 2006 by the best classical, pre‐Monte Carlo programs, GNU Go and Many Faces of Go; 6 kyu corresponds to the value 0 at the bottom of the vertical axis. The value 14 at the top of the vertical axis is the maximum ranking on KGS, which corresponds approximately to low professional Go player rankings.

Monte Carlo Tree Search. Each legal move is evaluated by playing out many complete games to the end from the position it produces and taking the average win rate over the final positions of these games. The move with the maximum win rate is then selected for play.
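The evaluation the figure describes can be sketched in a few lines. The game used below (NimState and its methods) is a hypothetical stand-in, not from the article, chosen only so the sketch is self-contained; full Monte Carlo tree search additionally grows a search tree and biases playouts, which this minimal version omits:

```python
import random

class NimState:
    """Tiny stand-in game for illustration: players alternate removing
    1 or 2 stones, and whoever takes the last stone wins."""
    def __init__(self, stones, player=0):
        self.stones, self.player = stones, player
    def legal_moves(self):
        return [n for n in (1, 2) if n <= self.stones]
    def apply(self, move):
        return NimState(self.stones - move, 1 - self.player)
    def is_terminal(self):
        return self.stones == 0
    def winner(self):
        return 1 - self.player  # the player who just took the last stone

def random_playout(state, player):
    """Finish the game with uniformly random moves; return 1 if `player`
    wins the final position, else 0."""
    while not state.is_terminal():
        state = state.apply(random.choice(state.legal_moves()))
    return 1 if state.winner() == player else 0

def monte_carlo_move(state, playouts=200):
    """Evaluate each legal move by its average win rate over many
    complete random games, and pick the move with the maximum rate."""
    player = state.player
    best_move, best_rate = None, -1.0
    for move in state.legal_moves():
        wins = sum(random_playout(state.apply(move), player)
                   for _ in range(playouts))
        rate = wins / playouts
        if rate > best_rate:
            best_move, best_rate = move, rate
    return best_move
```

From a four-stone position the only winning move is to take one stone, leaving the opponent a losing three-stone position; with a few hundred playouts per move, the win-rate estimates reliably pick it out.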

A small neural network. In TD‐Gammon, the input features represent the state of the backgammon board, and the output is an evaluation of the position. Each hidden‐layer and output node computes a weighted sum of its inputs and applies a sigmoid function to this sum to produce its output. The weights are set by a machine learning procedure using gradient descent. TD‐Gammon uses a much larger neural network than is pictured here.
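The pictured computation can be sketched as follows. This is a minimal illustration only: the layer sizes are arbitrary, and it trains by plain gradient descent on a squared-error target, whereas TD-Gammon's actual procedure uses temporal-difference targets generated by self-play:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyNet:
    """One hidden layer; each node takes a weighted sum of its inputs
    and applies a sigmoid, as in the figure. Far smaller than
    TD-Gammon's network."""
    def __init__(self, n_in, n_hidden, rng):
        self.W1 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W2 = rng.normal(0.0, 0.1, (1, n_hidden))

    def forward(self, x):
        self.h = sigmoid(self.W1 @ x)       # hidden activations
        self.y = sigmoid(self.W2 @ self.h)  # position evaluation in (0, 1)
        return self.y

    def gradient_step(self, x, target, lr=0.5):
        """One gradient-descent step on squared error; returns the loss."""
        y = self.forward(x)
        err = y - target
        dy = err * y * (1 - y)              # backprop through output sigmoid
        dW2 = np.outer(dy, self.h)
        dh = (self.W2.T @ dy) * self.h * (1 - self.h)
        dW1 = np.outer(dh, x)
        self.W2 -= lr * dW2
        self.W1 -= lr * dW1
        return (err ** 2).item()
```

Repeated calls to gradient_step drive the evaluation toward the target; in TD-Gammon the targets come from the network's own later evaluations and the final game outcome rather than from labeled examples.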