Rock Paper Scissors – Genetic Optimization Visualization

Week 1: “Genetic Optimization Visualization”

For our first session of Games & Art, we looked at Rock Paper Scissors (RPS) and at strategies for winning this time-honored game.

In the standard game, one point is awarded for rock beating scissors, paper beating rock or scissors beating paper. With this point assignment, there is no numerical strategy that will create a distinct advantage. Basically, the best way to win is to try to psych the other player out or, to go a bit further, to make your rock, paper or scissors choices as random as possible. In other words, R, P and S should each have a 33.3% chance of being chosen in each round. This makes sense given that RPS is generally thought of as being more along the lines of a coin toss than a game.

If we change the point distributions awarded to a winning R, P or S, however, the possibility of numerical strategies for winning begins to emerge.

One variation we considered in class was the following set of point assignments: R:+10, P:+3, S:+1

In this variation, the reward for winning with an S is only +1, which is much less than the reward for R or P. To make matters worse, if the other player happens to choose R, then the effective cost of choosing S can be thought of as -11, since the other player has gained 10 while you have also lost the opportunity to gain 1. Therefore, we can see that an optimal strategy for winning at this variation of the game would not be an even 33%/33%/33% distribution between R, P and S. Instead, we should bias our choices towards R or P. But to what extent?

To find this out, I turned to genetic algorithms. With the help of Dan Shiffman’s excellent genetic algorithm text, I devised the following model for an algorithm that could be used to determine the optimal choice-probability distribution for any particular point distribution:

First, I defined a Player class, which contains four data points: an R, P and S choice-probability value, as well as the number of points the player has scored. Next, an initial population of players is generated. The choice-probabilities for each player are initially random, but hover somewhere close to an even 33%/33%/33% distribution. Once the population has been generated, the players are put into matches against one another. In each match, each player chooses an R, P or S in accordance with its choice-probability model. The winner of each match is awarded the appropriate number of points. After the players have played a sufficient number of games against one another, their fitness for mating is determined by the number of points they have scored. So if one player has scored 10 points, the probability of its being selected for mating is ten times that of a player who has scored only 1 point.
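A minimal sketch of this setup might look like the following. The names (`Player`, `choose`, `selectParent`) are assumptions for illustration, not the project's actual code; the selection function implements the standard fitness-proportional ("roulette wheel") scheme described above.

```cpp
#include <array>
#include <random>
#include <vector>

// One player: an R/P/S choice-probability model plus accumulated points.
struct Player {
    std::array<double, 3> prob; // {P(rock), P(paper), P(scissors)}, sums to 1
    double points = 0.0;

    // Pick 0 (R), 1 (P) or 2 (S) according to the probability model.
    int choose(std::mt19937 &rng) const {
        std::uniform_real_distribution<double> uni(0.0, 1.0);
        double r = uni(rng);
        if (r < prob[0]) return 0;
        if (r < prob[0] + prob[1]) return 1;
        return 2;
    }
};

// Fitness-proportional selection: a player with 10 points is ten times
// as likely to be picked for mating as a player with 1 point.
const Player &selectParent(const std::vector<Player> &pop, std::mt19937 &rng) {
    double total = 0.0;
    for (const Player &p : pop) total += p.points;
    std::uniform_real_distribution<double> uni(0.0, total);
    double r = uni(rng);
    for (const Player &p : pop) {
        r -= p.points;
        if (r <= 0.0) return p;
    }
    return pop.back();
}
```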

The mating process consists of pairing two players together and using some of each of their “genetic” traits to form a child player. I had to experiment with the best way to go about this mating process. Since the R, P and S choice probabilities for each player must add up to a total of 100%, we cannot disassociate these values from one another. My first thought was to simply average the two parents’ choice-probability values to form the child’s values, but this turned out to produce somewhat muddy results. The approach I eventually implemented was to take a number line between 0 and 1 and insert two barriers into it: one barrier separating R from P and one separating P from S. These two barriers can be moved anywhere in the number line, and so long as they don’t cross one another or go above 1 or below 0, we will end up with positive R, P and S probability values that sum to 1.0 (or 100%). This way of dividing the probability space works well for the mating process. Once two parents have been chosen for mating, the algorithm randomly decides whether to take one barrier value from each parent or both from one. Finally, to ensure that the gene pool does not get too incestuous, each child is given a 10% chance of mutating. If a child is chosen for mutation, its barrier values are moved randomly in either direction by a small amount.
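The barrier genome can be sketched like this. Again, the names and the exact mutation step size are assumptions for illustration; the invariant is that two sorted cuts in [0, 1] always yield three non-negative probabilities summing to 1.

```cpp
#include <algorithm>
#include <array>
#include <random>

// Two barriers b1 <= b2 cut [0,1] into three segments:
// P(R) = b1, P(P) = b2 - b1, P(S) = 1 - b2.
struct Genome { double b1, b2; };

std::array<double, 3> toProbs(const Genome &g) {
    return { g.b1, g.b2 - g.b1, 1.0 - g.b2 };
}

// Crossover: take each barrier from a randomly chosen parent, then
// re-sort so the barriers cannot cross.
Genome crossover(const Genome &a, const Genome &b, std::mt19937 &rng) {
    std::bernoulli_distribution coin(0.5);
    Genome child{ coin(rng) ? a.b1 : b.b1, coin(rng) ? a.b2 : b.b2 };
    if (child.b1 > child.b2) std::swap(child.b1, child.b2);
    return child;
}

// Mutation: with 10% probability, nudge both barriers by a small random
// amount, clamped and re-sorted so the probabilities stay valid.
void mutate(Genome &g, std::mt19937 &rng) {
    std::bernoulli_distribution hit(0.10);
    if (!hit(rng)) return;
    std::uniform_real_distribution<double> step(-0.05, 0.05);
    g.b1 = std::clamp(g.b1 + step(rng), 0.0, 1.0);
    g.b2 = std::clamp(g.b2 + step(rng), 0.0, 1.0);
    if (g.b1 > g.b2) std::swap(g.b1, g.b2);
}
```

The nice property of this encoding is that crossover and mutation can never produce an invalid distribution, so no renormalization step is needed.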

With this genetic algorithm, the program is able to automatically find an optimal choice-probability model for a given RPS point distribution by using a survival-of-the-fittest approach, where fitness is determined by the player’s accumulated winnings over many games. The size of the population, the number of opponents each player plays against, the number of matches that each pair plays against one another and the number of generations over which the algorithm is run all have an effect on the accuracy of the algorithm’s determination of what constitutes an “optimal” choice-probability model.

Next, I set about the task of creating a visualization that would present the optimal choice-probability distribution for a group of RPS point distribution variations. For visualization purposes, I had the good fortune of RPS being a three-dimensional system, which maps well to both 3D coordinates and RGB values.

I assigned Rock to the X axis, Paper to the Y axis and Scissors to the Z axis. As we move along the X axis from 0 to 9, for instance, we add one to the value of winning with a rock, and so forth for each of the axes and its respective R, P or S point value.

At each coordinate in the grid, I represent the choice-probability distribution with a color. Rock is associated with the Red color dimension, Paper with Green and Scissors with Blue. So if, for instance, Rock were worth 9 and Paper and Scissors were both worth 0, the color of that grid coordinate should in theory be bright red, as an optimal strategy for this point distribution would presumably bias towards the choice of Rock.
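The mapping from an evolved distribution to a cell color is a direct scaling of the three probabilities into RGB. A sketch, with the struct and function names as assumptions (the actual project uses openFrameworks' color types):

```cpp
#include <array>
#include <cstdint>

// A grid cell (x, y, z), each axis in [0, 9], stands for the point scheme
// R:x, P:y, S:z. Its color encodes the evolved choice-probability model:
// P(rock) -> red, P(paper) -> green, P(scissors) -> blue.
struct Color { std::uint8_t r, g, b; };

Color probsToColor(const std::array<double, 3> &prob) {
    return { static_cast<std::uint8_t>(prob[0] * 255),
             static_cast<std::uint8_t>(prob[1] * 255),
             static_cast<std::uint8_t>(prob[2] * 255) };
}
```

A rock-only strategy, {1, 0, 0}, maps to pure red, matching the bright-red cell expected at the R:9, P:0, S:0 corner of the grid.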

To show how the number of players, opponents, matches and generations affects the success of the optimization process, I ran my visualization using varied parameters for these variables and recorded the results. Here are a few samples:

Notice that with a small number of players, opponents and matches, the arrangement of color values is relatively unordered. With a larger number of players, opponents and matches, the color arrangements take on a linear gradient quality. This shows that greater optimization (or accuracy, if you prefer) has been achieved with the larger population. If the smaller population were allowed to run for a much larger number of generations, its accuracy would begin to approach that of the larger population.

I wrote the genetic algorithm in C++ and the visualization uses openFrameworks.

Week 2: “Genetic RPS Newsfeed”

For last week’s assignment, I was interested in exploring the aesthetics of RPS as a type of zero-player game. In his book Evolution and the Theory of Games, John Maynard Smith considers the ways in which game theory, or the investigation of strategic advantage, can be applied to our understanding of individual fitness and evolution within a population. For instance, with the Hawk-Dove game, he demonstrates how the aggressiveness of an individual’s strategy in competing for food affects population balance and, in turn, how evolutionarily stable populations can emerge from the proper mix of strategies utilized by individuals as part of an ecosystem. To call RPS a zero-player game is in some respects counterintuitive. There are, of course, two players. In a statistical sense, though, these two individuals are merely playing out the probabilities. Though each run of the aforementioned genetic algorithm is likely to produce slightly different optimal probability distributions for a given point distribution, the results vary only slightly and therefore indicate that, to a large extent, there can be little individual differentiation in the numerical aspect of choosing a winning strategy.

What remains, then, is a sort of mating dance. If no individual is truly more fit than any other, competition must be about the extent to which an individual is able to present himself or herself as more fit than another. That is to say, RPS is about psyching the other player out. The best way to win seems to be to focus on executing the steps of your own dance while trying to remain undistracted by that of your opponent. It is hard to conjure random or statistically weighted choices while in the midst of a mating dance. So for this week, I was interested in testing whether a strategic advantage could be achieved by enhancing a player’s ability to do so.

Using my genetic algorithm, I precompiled a list of R, P or S choices for a particular point distribution (R:10 P:3 S:1) and converted this list to an audio feed. Before entering a game of RPS with a human opponent, I donned a TV news anchor-style earpiece, which would be used to feed me a stream of R, P or S’s during game play. I then played a number of games against opponents who were not given any technological aid. After playing several opponents over a number of games, it was clear that this aid was indeed giving me an advantage. Of course, RPS remains a game of probability, and so I found that this device created an advantage but did not lead to a complete domination of all opponents in all cases.

Is RPS a zero-player game? Understanding, of course, that I am using this term metaphorically and not by the technical definition (which is intended to describe dynamics more like Conway’s Game of Life), I submit that RPS is a game of narrow margins. But so too is the mating game. If RPS falls somewhat short in meeting the criteria required by Smith’s comprehension of evolutionary advantage, perhaps it forms a better metaphor for the evolutionary state of humankind. That is to say, the availability of technology and the complexity of our society are such that the fitness criteria for an individual human cannot be clearly named. Rather, we live and die by something far more ephemeral: with all the same technologies, medicines and so on at our disposal, we have only the slight bias of our mating dance, our swagger, to put us ahead. In the end, this seems to make all the difference, or possibly none at all.