Monte Carlo Tree Search

Methods

A playout is the basis of all Monte Carlo methods. It is a fast game played with random moves from a starting position to the end of the game, generating either a score or simply a result (win/loss). A playout can be light (completely random moves) or heavy (moves are biased based on heuristics, such as pattern libraries, shape, group status, move history, killer moves, etc.). (See also: playout analysis)

Win rate is a statistic kept by nodes in the game tree, tracking how many of the playouts from this position resulted in a win for the color to move. Research so far finds that this is a more effective statistic to optimize than average score. The use of winrate instead of score results in the characteristic Monte Carlo behavior of winning games by small margins.

CrazyStone started the new wave of MCTS programs by winning the 9x9 gold medal at the ICGA Computer Olympiad, in 2006. The principles of MCTS algorithms were published in the proceedings of the Computer and Games conference, which was held together with the Olympiad.

UCT is a simple and effective form of MCTS, used successfully by MoGo in 2006. It is a best-first search which tries to balance deep searches of high-winrate moves with exploration of untried moves.

RAVE (rapid action value estimate) is a heuristic which takes into account the frequent transpositions found in the game of Go. Normally, only the first move of a playout determines which node's statistics are updated. RAVE updates all children where the move was played at any point in the playout. AMAF (all moves as first) is a general name for this type of heuristic.