Reasoning About Strategy

Deductive and Inductive Reasoning

Modern games tend to have a high degree of complexity, in that they would be very difficult to “solve” in the game theoretical sense. For games with a relatively low degree of complexity, we can use game theory, math and logic to derive truly optimal strategies. When we do this, we are using purely deductive reasoning. A deductively derived optimal strategy for a game can be proven to be optimal, and there is no uncertainty about its optimality.

Nim is a simple game that is familiar to many and is often used in game theory textbooks as an example of an impartial, complete-information game with an elegant mathematical solution. The strategy can also be proven optimal in a satisfyingly mathematical manner (seen here). Tic-Tac-Toe has a game tree of manageable size, so we can find optimal Tic-Tac-Toe strategies by following every branch of the game tree to the end and determining which moves lead to the best outcomes, assuming optimal play by one’s opponent.
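The exhaustive search described for Tic-Tac-Toe can be sketched in a few lines of Python. This is only a minimal illustration: the board encoding (a 9-character string read row by row, with "." for an empty cell) and the function names `winner`, `value`, and `best_moves` are choices made for this example, not part of any standard library.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board whose cells are indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, to_move):
    """Value under optimal play by both sides: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0  # board is full with no winner: a draw
    other = "O" if to_move == "X" else "X"
    # Follow every branch: try each empty cell and evaluate the resulting position.
    outcomes = [value(board[:i] + to_move + board[i + 1:], other)
                for i, cell in enumerate(board) if cell == "."]
    # X picks the branch with the highest value, O the branch with the lowest.
    return max(outcomes) if to_move == "X" else min(outcomes)


def best_moves(board, to_move):
    """Return every empty cell whose resulting position achieves the optimal value."""
    other = "O" if to_move == "X" else "X"
    target = value(board, to_move)
    return [i for i, cell in enumerate(board)
            if cell == "." and value(board[:i] + to_move + board[i + 1:], other) == target]


print(value("." * 9, "X"))           # 0: perfect play from the empty board is a draw
print(best_moves("XX.OO....", "X"))  # [2]: X completes the top row
```

The same recursion is what becomes infeasible for modern games: the number of positions to visit grows far too quickly for every branch to be followed to the end.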

Unsurprisingly, there weren’t very many Nim or Tic-Tac-Toe events to be found at Gen Con this year; we tend to play games whose truly optimal solutions are far out of the reach of humans (and usually computers, too). We may not be able to find optimal strategies for games like Power Grid and Dominion, but some strategies are certainly better than others. In order to develop strategies for modern games, we have to use a combination of deductive and inductive reasoning.

Here I use “induction” not in the sense of mathematical induction, which is a method for constructing mathematical proofs. Here, I use inductive reasoning to mean empirical inference. That is, inductive reasoning about strategy is the type of reasoning we do by examining the outcomes of games and making inferences about the quality of different strategies.

In this sense, playing games is like conducting scientific research. In science, we develop theories to explain complex natural phenomena. A theory is a simplified model representing the more complex natural object. We use deductive reasoning to derive the consequences of the logical principles constituting that theory, and we use inference to tune the parameters of our model and compare the predictive consequences of our model to reality. Similarly, when playing modern games, we develop theories which model a simplified version of the game tree. Using reason, we can deduce which moves we should make to optimize our performance according to our theory, and then we receive feedback by observing the realized progression of the game.

Inference and Luck

In my last blog post, I proposed a definition for luck in games, and here I will use luck in the same way. Put briefly, luck is the component of the uncertainty in the outcome of a game that remains after accounting for a player’s strategy. Therefore, the more luck there is in a game, the less information each game outcome provides about the strength of a player’s strategic choices. In a game with very little luck, a single win provides a lot of evidence that your strategy is superior to your opponent’s strategy, whereas in a game with a lot of luck, a single win provides relatively little evidence of this. Furthermore, here I am using strategy in the game theoretical sense, meaning “the choices a player will make in any given situation” (allowing for the possibility of these choices being probabilistic), despite the fact that in reality, strategies in this sense don’t exist fully formed in a player’s mind.
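One way to make “how much evidence a single win provides” concrete is a small Bayesian calculation. The sketch below rests on simplifying assumptions that are mine rather than part of the definition above: there are exactly two hypotheses (Player A’s strategy is the better one, or Player B’s is), the better strategy wins each game independently with probability `p_better`, and both hypotheses start out equally likely. The function name `posterior_a_better` is hypothetical.

```python
def posterior_a_better(wins_a, wins_b, p_better, prior=0.5):
    """Posterior probability that A holds the better strategy, given the results.

    Assumes the better strategy wins each game independently with
    probability p_better (closer to 0.5 means more luck in the game).
    """
    # Likelihood of the observed results if A is the better player...
    like_a = p_better ** wins_a * (1 - p_better) ** wins_b
    # ...and if B is the better player.
    like_b = (1 - p_better) ** wins_a * p_better ** wins_b
    return prior * like_a / (prior * like_a + (1 - prior) * like_b)


# A single win for A in games with little, moderate, and a lot of luck.
for p_better in (0.95, 0.60, 0.52):
    print(p_better, round(posterior_a_better(1, 0, p_better), 3))
```

Under these assumptions a single win moves the belief from 50% to exactly `p_better`, so in a game where the better strategy wins only 52% of the time, one win barely moves it at all.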

Agricola

It is convenient to consider the simplified scenario in which we are interested in comparing the relative quality of just two strategies. This is not necessarily unrealistic, as there are often two friends who have a disagreement about the proper way to play a game. Player A can argue their case theoretically, as in “stone is worth more than wood in Agricola, so it is better to take three stone than three wood,” but in a game so complex the applicability of such heuristics is likely to be limited. Player B may argue that Player A’s model, which abstracts the value of stone and wood, does not fully account for the unusually high utility of wood in this particular situation. Sometimes it is not possible to prove theoretically that one decision is better than another. Therefore, players may need to resort to empiricism to compare the value of the strategies by actually playing the game.

A Simple Experiment

How many games do players need to play in order to determine whose strategy is better? It depends on three things: the amount of luck in the game, the magnitude of the difference between the quality of the players’ strategies, and the degree of certainty desired about which strategy is better. For instance, consider two players playing a game such that, unbeknownst to the players, Player A has a 60% chance of winning. This could be the case if the players are relatively evenly matched, or if they are quite unevenly matched but the game has a fair amount of luck. The players conduct a test to determine the better player by playing 10 games in a row.

Keep in mind that even if two players were evenly matched, it is unlikely that the results would be exactly 5-5. Results such as 6-4 and 7-3 provide relatively little evidence that the players are not evenly matched. In order to conduct this experiment, the players have to choose a threshold past which they will conclude that one player is better than the other. In statistics, the error rate that such a threshold tolerates is called the “significance level,” and its complement is the “confidence level.” In this case, to ensure that evenly matched players are wrongly separated no more than 10% of the time (a 90% confidence level), the players will have to adopt the rule that one player must win 9 or 10 of the 10 games.

This table shows the possible outcomes of the experiment and the probabilities of each outcome, assuming Player A inherently wins 60% of the time.

Player A Victories    Probability    Better Player?
0                     0.01%          Player B!
1                     0.16%          Player B!
2                     1.06%          No Conclusion
3                     4.25%          No Conclusion
4                     11.15%         No Conclusion
5                     20.07%         No Conclusion
6                     25.08%         No Conclusion
7                     21.50%         No Conclusion
8                     12.09%         No Conclusion
9                     4.03%          Player A!
10                    0.60%          Player A!
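The probabilities in the table follow the binomial distribution, and can be reproduced with a short script. This is a sketch under the same assumptions as the experiment (10 independent games, Player A winning each with probability 0.6, and a threshold of 9 wins); the helper names `binomial_pmf` and `verdict` are made up for this example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k wins in n independent games won with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def verdict(wins_a, n=10, threshold=9):
    """Apply the rule: a player is declared better only with `threshold` or more wins."""
    if wins_a >= threshold:
        return "Player A!"
    if n - wins_a >= threshold:
        return "Player B!"
    return "No Conclusion"


# One row per possible number of Player A victories.
for k in range(11):
    print(f"{k:>2}  {binomial_pmf(k, 10, 0.6):7.2%}  {verdict(k)}")
```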

As we can see from this table, there is only a 4.8% chance that this experiment will designate one player as superior, and a 0.17% chance that it will be misleading by designating Player B. Of course, we could relax our confidence level and conclude that a player is the better player if he wins 7 or more of the games. There would then be a 43.7% chance that we would designate one player better than the other (38.2% for Player A and 5.5% for Player B), with the remaining 56.3% of experiments ending in no conclusion. But we would be reducing our confidence level and accepting a 34.4% chance of designating one player better than the other even if the players were actually evenly matched.
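The trade-off between the two thresholds can be checked by summing the relevant tails of the binomial distribution. The sketch below assumes the same 10-game experiment; `declare_probs` is a hypothetical helper, and it expects a threshold above half the number of games so that the two verdicts cannot overlap.

```python
from math import comb


def pmf(k, n, p):
    """Probability of exactly k wins in n independent games won with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def declare_probs(threshold, p_a, n=10):
    """Return (P(A is declared better), P(B is declared better)).

    A is declared better with `threshold` or more wins; B is declared
    better when A wins n - threshold games or fewer.
    """
    declare_a = sum(pmf(k, n, p_a) for k in range(threshold, n + 1))
    declare_b = sum(pmf(k, n, p_a) for k in range(0, n - threshold + 1))
    return declare_a, declare_b


for threshold in (9, 7):
    a, b = declare_probs(threshold, 0.6)           # Player A truly wins 60%
    even_a, even_b = declare_probs(threshold, 0.5)  # players truly even
    print(f"threshold {threshold}: "
          f"any verdict {a + b:.1%}, misleading verdict {b:.2%}, "
          f"verdict when evenly matched {even_a + even_b:.1%}")
```

Lowering the threshold makes a verdict more likely, but most of what is gained comes at the price of verdicts that would also have been handed out to evenly matched players.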

This experiment illustrates that to compare the quality of two strategies in a game with a lot of luck, or when the differences between the strategies are subtle, a large number of repetitions are required to make inferences with any sort of certainty.
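To get a feel for what “a large number of repetitions” means, the sketch below searches for the shortest series in which the experiment would correctly pick out Player A most of the time. It keeps the 10% limit on wrongly separating evenly matched players and adds one assumption of my own: that an 80% chance of correctly identifying the better player is “good enough.” The function names and the `max_games` cap are hypothetical.

```python
from math import comb


def pmf(k, n, p):
    """Probability of exactly k wins in n independent games won with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def upper_tail(c, n, p):
    """Probability of c or more wins in n games."""
    return sum(pmf(k, n, p) for k in range(c, n + 1))


def critical_wins(n, alpha=0.10):
    """Smallest win count that evenly matched players reach (for either
    player) at most `alpha` of the time, or None if no count is strict enough."""
    for c in range(n // 2 + 1, n + 1):
        if 2 * upper_tail(c, n, 0.5) <= alpha:
            return c
    return None


def games_needed(p_a, power=0.80, alpha=0.10, max_games=500):
    """Shortest series in which A (true win rate p_a > 0.5) is correctly
    declared better with probability at least `power`."""
    for n in range(1, max_games + 1):
        c = critical_wins(n, alpha)
        if c is not None and upper_tail(c, n, p_a) >= power:
            return n
    return None  # not reachable within max_games


print(critical_wins(10))   # 9, matching the rule used in the experiment above
print(games_needed(0.60))  # series length needed when A truly wins 60% of games
print(games_needed(0.75))  # a larger skill gap (or less luck) needs fewer games
```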

Informal Inference

In reality, most of us are not conducting formal experiments when we play games. Nonetheless, observing the outcomes of our strategic decisions is important for developing strategies. How we actually process data about game outcomes is a matter of psychology, and almost certainly immensely complex. I won’t pretend to know much about psychology, but I do know that people aren’t always very good at making accurate inferences on the fly. I first discovered these ideas in Thinking, Fast and Slow by Daniel Kahneman, a psychologist who is largely responsible for the emergence of the field of “Behavioral Finance.” Considering the close relationship between games and economics, I find many of the psychological principles found within immediately applicable here. For now, I’ll just mention a couple of systematic flaws that people tend to exhibit when they are conducting inference in their heads.

Confirmation Bias occurs when people contextualize data through the lens of beliefs that they hold. I suspect this is extremely common in gaming. Players have beliefs about which moves are good and bad, and when they end up winning the game, they interpret the win as strong evidence that their decisions were good, while downplaying the contribution of luck to their success. Meanwhile, when players lose the game, they are much more likely to blame bad luck. Even if players are resistant to this kind of self-aggrandizing thinking, they may still tend to remember their wins more clearly and give more weight to the times a strategic decision resulted in a positive outcome than to the times it resulted in a negative one.

The Clustering Bias occurs because people tend to overestimate the significance of successive positive or negative outcomes. Often, if a player wins a game two or three times in a row, he will interpret this as strong evidence for the superiority of his strategy, even though these outcomes may lack statistical significance. In any sequence of random numbers, apparently significant streaks and patterns tend to occur more often than people expect.
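How often streaks show up by pure chance can be counted exactly for a short series. The sketch below assumes two evenly matched players and 10 independent games, and enumerates all 1,024 possible sequences of results; the function names are made up for this example.

```python
from itertools import product


def longest_run(results, player):
    """Length of the longest unbroken run of wins by `player`."""
    best = run = 0
    for r in results:
        run = run + 1 if r == player else 0
        best = max(best, run)
    return best


def streak_probability(games, length, players=("A",)):
    """Chance that any of `players` wins `length` or more games in a row
    somewhere in a series between evenly matched opponents."""
    hits = total = 0
    for seq in product("AB", repeat=games):  # every equally likely sequence
        total += 1
        if any(longest_run(seq, p) >= length for p in players):
            hits += 1
    return hits / total


print(streak_probability(10, 3))              # about 0.51: A wins 3+ in a row
print(streak_probability(10, 3, ("A", "B")))  # about 0.83: somebody wins 3+ in a row
```

In other words, between perfectly even players, a three-game winning streak for a given player is roughly a coin flip over ten games, and a streak for one player or the other is the norm.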

Overall, I suspect that people underestimate the amount of variance in the outcomes of games. One reason is that humans are always searching for patterns and are naturally inclined to analyses which link factors together in a deterministic way. Another related reason is that many people overlook Type II Randomness and Type III Randomness (discussed in my last post) as contributors to the variance in the outcome. Underestimating the amount of indeterminacy in the outcomes of games can result in players putting too much weight on individual outcomes, which ultimately makes it more difficult to make accurate inferences about the strength of strategic decisions.

Inference Online

Luckily, most online gaming sites have implemented automatic tracking of win-loss data, which allows us to study game outcomes in an accurate and formal way. Players generally have access to a statistic summarizing their performance in the form of a “player rating.” Usually this statistic is calculated using some variant of Elo’s formula (the rating system originally developed for chess).
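For reference, the basic two-player form of Elo’s formula can be sketched as follows. The scale of 400 and the update factor `k=32` are conventional chess values used here as assumptions; individual gaming sites use their own variants and parameters, which this sketch does not attempt to reproduce.

```python
def expected_score(rating_a, rating_b):
    """Elo's predicted score for A against B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))


def update(rating_a, rating_b, score_a, k=32):
    """Return both players' new ratings after one game.

    score_a is A's actual result (1, 0.5, or 0); k controls how far a
    single game can move a rating and is an assumed value here.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


print(expected_score(1500, 1500))  # 0.5: equal ratings predict an even game
print(update(1500, 1500, 1))       # (1516.0, 1484.0)
print(update(1900, 1500, 1))       # the favorite gains little for an expected win
```

Because each update is proportional to the gap between the actual and predicted result, a rating moves quickly after surprising results and slowly after expected ones.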

In theory, rating systems allow us to conduct more sophisticated and practical experiments than simply playing a series of games between two players. Player rating statistics pool information from matches against multiple opponents and give us some information about how our performance compares to that of the overall population.

I personally have used my player rating on sites like boardgamearena.com and boiteajeux.net as a method for inferring the strength of certain strategic choices in games that have highly uncertain outcomes such as Race for the Galaxy and Seasons. Player ratings are possibly the best source of information we have to defend strategies in an inductive way – alongside arguing for the strength of a strategy theoretically, one could point to their player rating as evidence that their strategy performs well (or poorly).

Unfortunately, Elo’s formula is not statistically optimal, and the uncertainty associated with Elo ratings is not generally calculated. I believe there is interesting research to be done on rating systems and their interpretations, but I will write about that elsewhere.

Optimizing Strategies

As I have discussed, we have two ways of developing better and better strategies for games. One is by exploring the consequences of our decisions theoretically, and one is by inferring the quality of our decisions by observing the outcomes of games. In general, it is impossible to derive fully optimal strategies in a theoretical way, because modern games tend to be too complex to do so. But is it possible to derive fully optimal strategies using inference? In general, the answer to this question should also be no.

The first reason is practical; it would take forever. This is for two interacting reasons: first, the strategies are immensely complex and have countless parameters that need to be tuned, and second, the outcomes are partially random, so each outcome provides only limited information about each parameter of a strategy. (The study of optimizing the average output of a random function is called “stochastic optimization,” and in general it is more computationally intensive than deterministic optimization.) In the example of the two-player 10-game experiment above, we are using inference to attempt to compare just two strategies, and even then the amount of data we need to collect to come to any conclusion is quite large for the average player. Now consider that an immensely more precise experiment would need to be conducted for an immensely large number of parameters. Naively optimizing a strategy for, say, Settlers of Catan in this way would be a project that would outlast the sun.

The second reason is mathematical; the problem of optimizing a strategy is almost certainly non-convex. What this means is that altering a parameter of a strategy in one direction may initially cause your chances of winning to decrease, but if you increase that parameter even more your chances of winning will ultimately increase. As a concrete example, it may be that the best way to win Catan is to collect as many sheep as possible – so many sheep that you can trade sheep for any good you want and ultimately win the game. However, if you are currently examining a strategy which involves collecting a balance of goods, increasing your propensity towards sheep may actually decrease your chances of winning. If we always sought to adjust the parameters of our strategy in a way that immediately increased our chances of winning, we would never discover the optimal strategy. The mathematical consequence of the non-convexity of a function is that we can never be certain if we’ve reached a truly optimal strategy, or if we’ve just reached a “local maximum” where adjusting any of the parameters of our strategy would decrease our chances of winning, despite the fact that a better strategy exists.
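The local maximum problem can be illustrated with a toy example. Everything in the sketch below is made up for illustration: a strategy is reduced to a single “sheep propensity” from 0 to 10, and the win probabilities in `WIN_PROB` are invented numbers with one peak at a balanced strategy and a higher peak at the all-sheep extreme. They are not measurements from Catan.

```python
# Invented win probabilities indexed by sheep propensity 0..10.
# There is a local peak at 3 (balanced) and a higher peak at 10 (all sheep).
WIN_PROB = [0.30, 0.34, 0.38, 0.40, 0.38, 0.33, 0.30, 0.35, 0.45, 0.55, 0.60]


def hill_climb(start):
    """Repeatedly move to the better neighboring propensity; stop when
    neither neighbor improves the chance of winning."""
    s = start
    while True:
        neighbors = [n for n in (s - 1, s + 1) if 0 <= n < len(WIN_PROB)]
        best = max(neighbors, key=lambda n: WIN_PROB[n])
        if WIN_PROB[best] <= WIN_PROB[s]:
            return s  # no neighbor is better: a (possibly only local) maximum
        s = best


print(hill_climb(2))  # 3: stuck at the balanced strategy, winning 40%
print(hill_climb(7))  # 10: reaches the all-sheep strategy, winning 60%
```

A player who starts near the balanced strategy and only ever makes small changes that immediately help never finds the better peak, and here the win probabilities are known exactly. In a real game each of those comparisons would itself require a long series of noisy games.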

Conclusion

Part of the fun of playing a game is developing more sophisticated strategies over time and showing them off to your opponents. For a group of players who only play amongst themselves, these strategies will likely be developed through a combination of reasoning within players’ models of the game tree and inferential wisdom gained through experimentation. Often a faster way of improving at a game is to play against players more experienced than yourself and study those players’ decisions so that they might help inform your strategy.

I think it’s important for players to recognize, however, the limits of both deductive and inductive reasoning in the case of games. I have certainly been in situations where I have been convinced of the supremacy of my strategy for a particular game, only to be humiliated by a more skilled opponent. My overconfidence in these cases can be explained by the two methods of reasoning about strategy. In some cases, I have wrongly believed that my theoretical analysis has accounted for the entirety of the strategic space of the game. In other cases, I have overestimated the quality of the inferences I have about the strength of my strategy.