Writing a tic-tac-toe solver using minimax

Neel Somani - September 6, 2017

In this post, we’ll build a tic-tac-toe solver using the minimax algorithm. There are a few steps. First, we’ll need to generate a game tree of all possible moves and outcomes. Then, we’ll write the minimax code to calculate the optimal move.

Here’s a link to the GitHub repo. Note: The code in the GitHub repo is slightly refactored, but the logic is all the same.

1. Starter code

Because this post isn’t really about building the tic-tac-toe game itself, I’ve included some starter code below. We’re going to create a class called GameTree, which will represent a game state: the current state of the board and whose move it is. The GameTree will also store all child GameTrees, which are the game states reachable from the current state in a single move.

We’ll internally represent a board as a string of 9 characters, consisting of X, O, and spaces. The string is the board flattened row by row. (For example, the string "XXO XO  X" represents the board with the top row "X X O", the middle row "_ X O", and the bottom row "_ _ X".)
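To make the flattened representation concrete, here is a minimal sketch that turns a 9-character string back into a three-row picture. The name render_board is hypothetical and is not part of the starter code; empty squares are drawn as "_" only so that they are visible.

```python
def render_board(value):
    """Return a three-line picture of a flattened 9-character board."""
    if len(value) != 9:
        raise ValueError("a board is exactly 9 characters")
    # Slice the string into three rows of three squares each.
    rows = [value[3 * r:3 * r + 3] for r in range(3)]
    # Draw empty squares as "_" so they show up when printed.
    return "\n".join(
        " ".join(square if square != " " else "_" for square in row)
        for row in rows
    )


print(render_board("XXO XO  X"))
# X X O
# _ X O
# _ _ X
```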

Before representing the game tree with a class, I was initially concerned about performance. I remember that I previously ran into issues when using a binary search tree class in Python with a few hundred million elements. But a loose upper bound on the number of tic-tac-toe boards is 3^9 = 19683 (3 possible values for each square: X, O, and space). The bound is loose because not all of those configurations are valid: you can’t have three X’s in a row and three O’s in a row on the same board, the counts of X’s and O’s have to be within one of each other, and so on. Python should be able to handle 20000 elements easily.
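As a rough check on that bound, the sketch below computes 3^9 and then counts the distinct boards that can actually occur in play. It assumes X moves first and that play stops as soon as someone wins or the board fills up; the function names are hypothetical.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals


def has_winner(board):
    """True if either symbol has three in a row."""
    return any(board[a] != " " and board[a] == board[b] == board[c]
               for a, b, c in WIN_LINES)


def reachable_boards():
    """Collect every board reachable from the empty board (X moves first)."""
    seen = set()
    stack = [" " * 9]
    while stack:
        board = stack.pop()
        if board in seen:
            continue
        seen.add(board)
        if has_winner(board) or " " not in board:
            continue  # terminal: nobody moves from here
        # X is to move when the counts are equal, otherwise O.
        symbol = "X" if board.count("X") == board.count("O") else "O"
        for i, square in enumerate(board):
            if square == " ":
                stack.append(board[:i] + symbol + board[i + 1:])
    return seen


upper_bound = 3 ** 9
print(upper_bound)               # 19683
print(len(reachable_boards()))   # smaller than the upper bound
```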

I’ve left a TODO for the self.generate_children() function, which we’ll implement in the next step.
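For readers without the repo at hand, here is a minimal sketch of what such a starter class could look like. It is not the author’s code: the attribute names value, children, and players follow how they are used elsewhere in this post, while the player_number attribute name, the ordering of players (0 is X, 1 is O), the default arguments, and the behaviour of print_tree are assumptions.

```python
class GameTree:
    # Assumption: player 0 is X and player 1 is O.
    players = ["X", "O"]

    def __init__(self, value=" " * 9, player_number=0):
        if len(value) != 9 or set(value) - set("XO "):
            raise ValueError("board must be 9 characters from 'X', 'O', ' '")
        self.value = value                  # flattened board
        self.player_number = player_number  # whose move it is
        self.children = []                  # filled in by generate_children()
        self.generate_children()

    def generate_children(self):
        # TODO: reassign self.children to the list of child GameTrees.
        pass

    def __str__(self):
        rows = [self.value[3 * r:3 * r + 3] for r in range(3)]
        return "\n".join(
            " ".join(square if square != " " else "_" for square in row)
            for row in rows
        )

    def print_tree(self):
        """Print this node's board (a guess at what print_tree shows)."""
        print(self)


GameTree("XXO XO  X", 1).print_tree()
```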

2. Generating the game tree

The self.generate_children() function should reassign the self.children variable to a list of child GameTrees. Before we write that function, though, we need to know whether a game state is terminal, that is, whether either player has won, or whether the board is full.

def is_win(self, player_number):
    """ Determine if the player has won the game. """
    player = GameTree.players[player_number]
    t = self.value
    # Check the three rows, then the three columns, then the two diagonals.
    return any([(t[3 * i] == player and t[3 * i + 1] == player and t[3 * i + 2] == player) for i in range(3)]) or \
        any([(t[i] == player and t[i + 3] == player and t[i + 6] == player) for i in range(3)]) or \
        (t[0] == player and t[4] == player and t[8] == player) or (t[2] == player and t[4] == player and t[6] == player)

To see whether the board is full, we’ll check for the presence of " " in the board string.

To generate all child trees, we’ll get all of the indices of the current tree.value string that are spaces (which represent available moves), and generate the trees corresponding to the player placing their symbol in each location.
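The terminal check and the child generation described above can be sketched as follows. To keep the example self-contained it uses plain functions on board strings rather than GameTree methods, so the function names and the (board, next player) return shape are assumptions, as is the ordering of PLAYERS.

```python
PLAYERS = ["X", "O"]  # assumption: player 0 is X, player 1 is O

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals


def has_won(board, player_number):
    """True if the given player has three in a row."""
    symbol = PLAYERS[player_number]
    return any(all(board[i] == symbol for i in line) for line in WIN_LINES)


def is_terminal(board):
    """A state is terminal if either player has won or the board is full."""
    return has_won(board, 0) or has_won(board, 1) or " " not in board


def generate_children(board, player_number):
    """Return (child_board, next_player_number) for every legal move."""
    if is_terminal(board):
        return []
    symbol = PLAYERS[player_number]
    # Every space is an available move; the other player moves next.
    return [(board[:i] + symbol + board[i + 1:], 1 - player_number)
            for i, square in enumerate(board) if square == " "]


print(len(generate_children(" " * 9, 0)))   # 9
print(generate_children("XOXXOOOX ", 0))    # [('XOXXOOOXX', 1)]
```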

3. The minimax code

Let’s write a depth-limited minimax algorithm. We’ll let the user specify how deep into the game tree the search goes. Because the search can stop before it reaches a terminal state, we’ll need some sort of heuristic to evaluate a game state that isn’t terminal. (If a game state is terminal, we can just value it as 1, 1/2, or 0 in the cases of winning, tying, and losing respectively.)
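A minimal sketch of depth-limited minimax along these lines is shown below. It works on board strings directly instead of GameTree objects, and every name in it (winner, other, neutral, minimax, best_move) is hypothetical. The heuristic is passed in as a function; the placeholder used here simply scores every unfinished position as 1/2.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that symbol has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def other(symbol):
    return "O" if symbol == "X" else "X"


def neutral(board, to_move, me):
    """Placeholder heuristic: call every unfinished position even."""
    return 0.5


def minimax(board, to_move, me, depth, heuristic=neutral):
    """Value of `board` for player `me`, looking at most `depth` moves ahead."""
    won = winner(board)
    if won is not None:
        return 1.0 if won == me else 0.0
    if " " not in board:
        return 0.5                            # tie
    if depth == 0:
        return heuristic(board, to_move, me)  # out of depth: estimate
    values = [minimax(board[:i] + to_move + board[i + 1:],
                      other(to_move), me, depth - 1, heuristic)
              for i, square in enumerate(board) if square == " "]
    # `me` picks the best child; the opponent picks the worst one for `me`.
    return max(values) if to_move == me else min(values)


def best_move(board, to_move, depth=9, heuristic=neutral):
    """Index of the square with the highest minimax value for `to_move`."""
    moves = [i for i, square in enumerate(board) if square == " "]
    return max(moves, key=lambda i: minimax(
        board[:i] + to_move + board[i + 1:],
        other(to_move), to_move, depth - 1, heuristic))


# O X O / _ X _ / _ _ X with O to move: only blocking at square 7 avoids a loss.
print(best_move("OXO X   X", "O"))  # 7
```

Passing the heuristic as a parameter keeps the search itself independent of how unfinished positions are scored, so a different estimate can be swapped in without touching minimax.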

Let’s use the "probability of winning" as the heuristic, where the probability of winning is defined as the weighted proportion of descendant leaves that are winning states (valued at 1), tied states (valued at 1/2) and losing states (valued at 0). For example, if there are three possible child moves in a position, one of which is a win, one a tie, and one a loss, then we’ll value the game state as (1/3) * (1 + 1/2 + 0) = 1/2.
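One way to read that definition is: add up the values of all descendant leaves and divide by the number of leaves. The sketch below implements that reading with exact fractions; the function names are hypothetical, and the leaf-counting interpretation is an assumption about what the heuristic means.

```python
from fractions import Fraction

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that symbol has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def leaf_totals(board, to_move, me):
    """Return (sum of leaf values, number of leaves) at or below `board`."""
    won = winner(board)
    if won is not None:
        return (Fraction(1) if won == me else Fraction(0)), 1
    if " " not in board:
        return Fraction(1, 2), 1              # tie
    total, count = Fraction(0), 0
    next_player = "O" if to_move == "X" else "X"
    for i, square in enumerate(board):
        if square == " ":
            child = board[:i] + to_move + board[i + 1:]
            child_total, child_count = leaf_totals(child, next_player, me)
            total += child_total
            count += child_count
    return total, count


def win_probability(board, to_move, me):
    """Average value of all descendant leaves, from `me`'s point of view."""
    total, count = leaf_totals(board, to_move, me)
    return total / count


# The arithmetic from the example above: one win, one tie, one loss.
print(Fraction(1, 3) * (1 + Fraction(1, 2) + 0))      # 1/2

# X X O / O O X / _ X _ with O to move: one leaf is an O win, one is a tie.
print(win_probability("XXOOOX X ", "O", "O"))         # 3/4
```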

4. The probability heuristic

Is the probability heuristic alone really that much worse than minimax? In short, yes. Try the following sample game tree.

sample = GameTree('OXO X   X', 1)
sample.print_tree()

In this position, it’s O to move, and the optimal move is clearly to place in the middle column of the bottom row, to prevent X from winning. Let’s write a helper function to get the "optimal" move that the probability heuristic identifies.

If you run sample.move_by_probability().print_tree(), you’ll find that O plays an entirely suboptimal move: one that does maximize the proportion of situations in which O wins (if the moves were random), but doesn’t prevent X from winning if X plays optimally.
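The effect can be reproduced with a small standalone sketch. It assumes the sample board is the 9-character string "OXO X   X" (rows "O X O", "_ X _", "_ _ X") and that the heuristic is the average value over all descendant leaves. Here move_by_probability is a plain function on strings, not the GameTree method, and all names are hypothetical.

```python
from fractions import Fraction

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that symbol has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def leaf_totals(board, to_move, me):
    """Return (sum of leaf values, number of leaves) at or below `board`."""
    won = winner(board)
    if won is not None:
        return (Fraction(1) if won == me else Fraction(0)), 1
    if " " not in board:
        return Fraction(1, 2), 1
    total, count = Fraction(0), 0
    next_player = "O" if to_move == "X" else "X"
    for i, square in enumerate(board):
        if square == " ":
            t, c = leaf_totals(board[:i] + to_move + board[i + 1:],
                               next_player, me)
            total, count = total + t, count + c
    return total, count


def move_by_probability(board, to_move):
    """Return (chosen square, {square: score}) under the heuristic."""
    next_player = "O" if to_move == "X" else "X"
    scores = {}
    for i, square in enumerate(board):
        if square == " ":
            total, count = leaf_totals(board[:i] + to_move + board[i + 1:],
                                       next_player, to_move)
            scores[i] = total / count
    return max(scores, key=scores.get), scores


sample = "OXO X   X"   # assumed sample position, O to move
choice, scores = move_by_probability(sample, "O")
print(choice)      # 3 (middle row, left column)
print(scores[3])   # 2/5
print(scores[7])   # 1/3 (the blocking move scores lower)
```

Square 3 gets the higher score because one of its leaves is an O win, but it leaves square 7 open, and X wins immediately by playing there.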

About the Author

Neel Somani, a student at the University of California, Berkeley, is the founder of Apptic LLC. In addition to computer science, he's interested in philosophy and entrepreneurship. You can follow him on LinkedIn and Twitter.