Artificial Intelligence in Video Games

In a finite state machine (FSM), the developers creating the game assign a specific action to each situation the bot can encounter.

The FSM algorithm is not feasible to use in every game.

Imagine using an FSM in a strategy game, for example.

If a bot were pre-programmed to respond the same way every time, the player would quickly learn how to outsmart the computer.

This creates a repetitive gaming experience which, as you might expect, would not be enjoyable for the player.
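
To make this concrete, here is a minimal sketch of an FSM, assuming a hypothetical guard bot with three states. The state and event names are made up for illustration and are not taken from any particular game.

```python
# Hypothetical guard bot: the states and events below are illustrative only.
TRANSITIONS = {
    ("patrol", "player_spotted"): "chase",
    ("chase", "player_in_range"): "attack",
    ("chase", "player_lost"): "patrol",
    ("attack", "player_out_of_range"): "chase",
    ("attack", "player_lost"): "patrol",
}


def next_state(state, event):
    """Return the state the bot moves to; unknown pairs leave it unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="patrol"):
    """Feed a list of events through the machine and record every state visited."""
    history = [state]
    for event in events:
        state = next_state(state, event)
        history.append(state)
    return history


events = ["player_spotted", "player_in_range", "player_lost"]
print(run(events))  # ['patrol', 'chase', 'attack', 'patrol']
```

Because the table never changes, the same sequence of events always produces the same sequence of states, which is exactly the predictability a player can learn to exploit.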

The Monte Carlo Search Tree (MCST) algorithm, more commonly known as Monte Carlo Tree Search (MCTS), is used to avoid the repeatability of FSM.

MCST works by first laying out all of the moves the bot currently has available.

Then, for each of those moves, it analyzes every move the player could respond with, then every move the bot could make in reply, and so on (Lou, 2017).

You can imagine how quickly this tree becomes massive.

Here is a good chart to visualize how MCST works:

Figure 2 (Lou, 2017)

Figure 2 highlights the process that a computer using MCST goes through before making a move against a human opponent.

It first looks at all of the options it has; in the example above, these are to defend, build technology, or attack.

Then it builds out a tree that predicts the likelihood of success for each potential move thereafter.

In the figure, the option with the highest likelihood of success is ‘attack’ (darker red indicates a higher probability of reward), so the computer chooses to attack.

When the player makes their next move, the computer repeats the tree-building process all over again.

Imagine a game like Civilization, where there is a huge number of choices a computer could make.

The computer would take an extremely long time if it were to build out a detailed tree for every possible choice and every possible scenario across the entire game.

It would never make a move.

So, to avoid this huge calculation, the MCST algorithm will randomly select a handful of possible options and build out the trees for only the ones selected.

That way, the calculation is much quicker and the computer can analyze which selected option would have the highest likelihood of reward.
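
The sampling idea can be sketched as follows. This is a flat Monte Carlo comparison rather than a full tree search, and it makes some loud assumptions: the option names beyond those in Figure 2, the hidden win chances in `TRUE_WIN_CHANCE`, and the helper names are all hypothetical. A real game would produce playout results by actually simulating moves.

```python
import random

# Hypothetical hidden win chance per option, used only to fake playout results.
TRUE_WIN_CHANCE = {
    "defend": 0.35,
    "build technology": 0.50,
    "attack": 0.70,
    "trade": 0.40,
    "explore": 0.45,
}


def simulate_playout(option, rng):
    """Pretend to play one random game after choosing `option`: 1 = win, 0 = loss."""
    return 1 if rng.random() < TRUE_WIN_CHANCE[option] else 0


def choose_move(options, sample_size=3, playouts=200, seed=0):
    """Randomly sample a few options and return the one with the best average result."""
    rng = random.Random(seed)
    # Only a handful of options are examined, not the whole list.
    candidates = rng.sample(options, min(sample_size, len(options)))
    estimates = {}
    for option in candidates:
        wins = sum(simulate_playout(option, rng) for _ in range(playouts))
        estimates[option] = wins / playouts
    best = max(estimates, key=estimates.get)
    return best, estimates


best, estimates = choose_move(list(TRUE_WIN_CHANCE))
print(best, estimates)
```

The trade-off is visible in the code: an option that is never sampled can never be chosen, so the bot gives up some accuracy in exchange for a much faster decision.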

A.I. in Alien: Isolation

One of the more popular forms of advanced A.I. in video games recently is the Alien from Creative Assembly’s Alien: Isolation.

There are some misunderstandings about how the A.I. works behind the scenes.

However, it is a remarkable display of the ways in which A.I. can be used to create an engaging and unpredictable environment for the player.

The alien in Alien: Isolation has two driving A.I. forces controlling its movements and behavior: the Director A.I. and the Alien A.I.

The Director A.I. is a passive controller that is in charge of creating an enjoyable player experience.

In order to achieve this, the Director A.I. knows where both the player and alien are at all times.

However, it does not share this knowledge with the alien.

The Director A.I. keeps an eye on what is referred to as the Menace Gauge, which is essentially a measure of expected player stress levels determined by a multitude of factors such as the alien’s proximity to the player, the amount of time the alien spends near the player, the amount of time spent in sight of the player, and the amount of time spent visible on the motion tracker device.

This Menace Gauge informs the alien’s Job System, which is essentially a task tracker for the alien.

If the Menace Gauge reaches a certain level, the priority of the task “search new location zone” will grow until the alien moves away from the player into a separate area.
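
As a rough sketch of how a gauge like this could feed a task tracker, consider the toy model below. Every number, formula, and name in it (`MENACE_THRESHOLD`, the per-tick increments, the `Director` class, the "hunt near player" job) is an assumption made for illustration; the game's real values and code are not described in this article.

```python
from dataclasses import dataclass, field

MENACE_THRESHOLD = 100.0  # assumed value, for illustration only


@dataclass
class Director:
    menace: float = 0.0
    job_priorities: dict = field(default_factory=lambda: {
        "hunt near player": 5.0,
        "search new location zone": 0.0,
    })

    def tick(self, distance_to_player, alien_visible, on_motion_tracker):
        """Advance one time step: update the gauge, then the job priorities."""
        # The closer the alien is, the more menace accumulates (made-up formula).
        self.menace += max(0.0, 10.0 - distance_to_player)
        if alien_visible:
            self.menace += 5.0
        if on_motion_tracker:
            self.menace += 2.0
        # Once over the threshold, the "back off" job gains priority every tick.
        if self.menace >= MENACE_THRESHOLD:
            self.job_priorities["search new location zone"] += 1.0

    def current_job(self):
        """Return the job with the highest priority."""
        return max(self.job_priorities, key=self.job_priorities.get)


director = Director()
for _ in range(20):
    director.tick(distance_to_player=2.0, alien_visible=True, on_motion_tracker=True)
print(director.current_job())  # search new location zone
```

Note that the job only says which kind of task to perform; nothing here hands the player's position to the alien.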

Behavioral Decision Trees

Before diving into how the Alien A.I. works in action, it is important to first highlight the structure that informs the decision-making process.

If, at any point, one of the nodes returns a fail, the entire sequence fails.

For example, if it turned out that “Do I Have Food?” failed, it would not check to see if there were any enemies around and it would not eat the food.

Instead, the sequence would fail and that would be the end of that sequence.
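
The food example can be sketched in a few lines. The function names and the shape of the `state` dictionary are hypothetical; only the three checks themselves come from the example above.

```python
def sequence(*children):
    """Build a sequence node: run children in order, stop at the first failure."""
    def run(state):
        for child in children:
            if not child(state):
                return False  # one failing child fails the whole sequence
        return True
    return run


def have_food(state):
    return state["food"] > 0


def no_enemies_around(state):
    return not state["enemies_nearby"]


def eat_food(state):
    state["food"] -= 1
    state["log"].append("ate food")
    return True


eat_sequence = sequence(have_food, no_enemies_around, eat_food)

hungry = {"food": 0, "enemies_nearby": False, "log": []}
print(eat_sequence(hungry), hungry["log"])  # False []
```

With no food, the first node fails, so the enemy check and the eating action are never run.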

Sequences can get much more complex and become many layers deep.

Here is a deeper example:

Figure 4 (Simpson, 2014)

Remember, when a sequence either succeeds or fails, it returns the result to its parent node.

In the example above, let’s assume that we have succeeded in approaching the door, but failed to open the door as it was locked and we had no key.

The sequence node was marked as a fail.

As a result, the behavior tree path reverted to the parent node of that sequence.

Here is what this parent node might look like:

Figure 5 (Simpson, 2014)

So, we have failed at opening the door, but we haven’t given up yet.

Our parent node has another sequence for us to try.

This time it involves entering through a window instead.
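
Here is a minimal sketch of that fallback behavior, assuming the parent is a selector node that tries each child sequence in turn. The individual step names are guesses at what the figures contain, and the helpers (`action`, `sequence`, `selector`) are hypothetical.

```python
def action(name, condition=lambda state: True):
    """Leaf node: if `condition` holds, log the action and succeed; otherwise fail."""
    def run(state):
        if condition(state):
            state["log"].append(name)
            return True
        return False
    return run


def sequence(*children):
    """Succeeds only if every child succeeds; stops at the first failure."""
    return lambda state: all(child(state) for child in children)


def selector(*children):
    """Tries children in order; succeeds as soon as one of them succeeds."""
    return lambda state: any(child(state) for child in children)


enter_via_door = sequence(
    action("approach door"),
    action("open door", lambda s: s["has_key"] or not s["door_locked"]),
    action("walk through door"),
)
enter_via_window = sequence(
    action("approach window"),
    action("open window"),
    action("climb through window"),
)
enter_building = selector(enter_via_door, enter_via_window)

state = {"door_locked": True, "has_key": False, "log": []}
print(enter_building(state))  # True
print(state["log"])
# ['approach door', 'approach window', 'open window', 'climb through window']
```

The door sequence fails at "open door", the failure is returned to the selector, and the selector moves on to the window sequence.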

The Alien A.I. has 30 different selector nodes and 100 total nodes, so it is far more complicated than this example, but I hope this gives you an idea of how the Alien A.I. works under the hood.

Back to Alien A.I.

As we know, the Alien A.I. is the system that controls the alien’s actions.

It is never provided information about the player’s location.

The only information it receives from the Director A.I. is which general area it should search.

Beyond that, it must find the player on its own.

It does have some tools to use that help it in hunting down the player.

The first is the Sensor System which allows the alien to pick up on audio and visual cues in the environment.

Noises such as footsteps, gunshots, the opening of doors, and even the beeping of the motion tracker all help the alien to track down the player.

The audio range depends on the type of noise that was created.

In addition to audio cues, the alien can also pick up on visual cues such as glimpsing Ripley running past or seeing a door open in view.
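
A noise-dependent hearing range could be sketched like this. The ranges in `NOISE_RANGE` are invented for illustration (the article does not give the game's real values), and `can_hear` is a hypothetical helper.

```python
import math

# Hypothetical hearing ranges in metres; the game's real values are not known here.
NOISE_RANGE = {
    "footstep": 5.0,
    "motion_tracker": 8.0,
    "door": 10.0,
    "gunshot": 40.0,
}


def can_hear(alien_pos, noise_pos, noise_type):
    """The alien hears a noise if it is within that noise type's range."""
    return math.dist(alien_pos, noise_pos) <= NOISE_RANGE[noise_type]


alien = (0.0, 0.0)
noise = (6.0, 8.0)  # 10 metres away
print(can_hear(alien, noise, "footstep"))  # False
print(can_hear(alien, noise, "gunshot"))   # True
```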

Another tool the alien has to hunt down the player is the Searching System.

There are specific areas that the developers have determined to be good hiding spots that the alien is pre-programmed to search.

However, it does not search them in any particular order, and will even double check areas that have already been visited.

Of course, if the alien hears a noise or sees a visual cue, it will search an area that the developers have not specifically outlined.

The most commonly discussed topic about Alien: Isolation is how the alien seems to learn more about the player as the game progresses.

The actions it makes seem to become more complex as it learns certain traits about a player’s play style.

Surprisingly to some, the way the developers achieved this was not by building a complex neural network into the alien’s A.I. system.

To show how the game accomplishes this sense of alien learning, we need to refer back to the Alien A.I.’s behavioral decision tree.

Figure 6 (Simpson, 2014)

At the start of the game, there are sections of this behavioral tree that are blocked out to the alien.

The areas that are blocked out are inaccessible to the alien, meaning it cannot access certain behaviors and actions.

For example, at the start of the game the section of the tree that responds to the sound of a door opening in the distance may not be active.

If a player opens a door in the alien’s view it could unlock that section of the behavioral tree so that, in the future, the sound of a door opening will trigger a response.

As the player progresses through the game, more and more of the alien’s behavioral tree is unlocked.

This gives the illusion that the alien is learning and adapting to the player’s gaming style.
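
The unlocking mechanism can be sketched with a simple flag on a branch. The class names, event names, and the single gated branch are assumptions for illustration; they follow the door example above rather than the game's actual tree.

```python
class GatedBehavior:
    """A behavior-tree branch that is ignored until it has been unlocked."""

    def __init__(self, name, trigger):
        self.name = name
        self.trigger = trigger
        self.unlocked = False

    def respond(self, stimulus):
        """Return the behavior name if this branch is active and triggered."""
        if self.unlocked and stimulus == self.trigger:
            return self.name
        return None


class Alien:
    def __init__(self):
        self.investigate_door = GatedBehavior("investigate door sound", "door_sound")

    def observe(self, event):
        # Seeing the player open a door unlocks the door-sound branch.
        if event == "saw_player_open_door":
            self.investigate_door.unlocked = True

    def react(self, stimulus):
        return self.investigate_door.respond(stimulus) or "ignore"


alien = Alien()
print(alien.react("door_sound"))  # ignore
alien.observe("saw_player_open_door")
print(alien.react("door_sound"))  # investigate door sound
```

Nothing is learned in the machine learning sense: the behavior was always in the tree and simply becomes reachable.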

Genetic Neural Networks in Video Games

This article would not be complete without at least some mention of neural networks being applied to video games.

There are some very famous recent examples, one being the A.I. that beat a professional Dota 2 team.

However, the best way to cover this topic is to start small and build a basic understanding of how a neural network learns the goals and strategies of a video game.

Figure 7 (Comi, 2018)

The game we will use to build this basic understanding is Snake.

For those of you unfamiliar, Snake is a 2D game where you control a line of squares (referred to as a snake).

You have three choices for movement: left, right, or straight ahead.

If you hit a wall or run into your tail you die instantly and restart.

There is a dot for you to collect (referred to as food) that will grow your tail by one square.

So the more you eat, the longer you become.

Let’s imagine we want to teach our snake how to get as high a score as possible.

For our snake to survive in this world it needs to learn a few things.

For our snake to learn it needs to be provided information about the environment.

We will refer to this information we provide it as inputs.

These inputs can be anything that we have information about.

For example, our inputs could be the following 6 Yes/No questions: is it clear straight, is it clear left, is it clear right, is food straight, is food left, is food right (Designing AI, 2017).

This would provide 6 input nodes with 1 or 0 depending on the answer to each question.

However, these inputs could also be measures of the distance between the head of the snake and the wall, its tail, or the food.

For simplicity, let’s stay at our 6 input nodes example.
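
Here is one way the 6 inputs could be computed, as a sketch under stated assumptions: the board is a grid with x growing to the right and y growing downward, the heading is a unit step such as `(0, -1)` for "up", and "food is left" is taken to mean that the food lies somewhere on the snake's left-hand side. All function names are hypothetical.

```python
def turn_left(heading):
    """Rotate a heading 90 degrees to the snake's left (y grows downward)."""
    dx, dy = heading
    return (dy, -dx)


def turn_right(heading):
    """Rotate a heading 90 degrees to the snake's right (y grows downward)."""
    dx, dy = heading
    return (-dy, dx)


def is_clear(cell, body, width, height):
    """A cell is clear if it is inside the board and not part of the tail."""
    x, y = cell
    return 0 <= x < width and 0 <= y < height and cell not in body


def snake_inputs(head, heading, body, food, width, height):
    """Return [clear straight, clear left, clear right, food straight, food left, food right]."""
    directions = [heading, turn_left(heading), turn_right(heading)]
    clear = [
        int(is_clear((head[0] + dx, head[1] + dy), body, width, height))
        for dx, dy in directions
    ]
    to_food = (food[0] - head[0], food[1] - head[1])
    # A positive dot product means the food lies on that side of the head.
    food_flags = [int(to_food[0] * dx + to_food[1] * dy > 0) for dx, dy in directions]
    return clear + food_flags


# Snake at the left wall heading up, tail below it, food up and to the right.
print(snake_inputs(head=(0, 5), heading=(0, -1), body={(0, 6)},
                   food=(3, 2), width=10, height=10))
# [1, 0, 1, 1, 0, 1]
```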

The next thing we need to tell our snake is what we want it to achieve.

To communicate our desired goal, we implement a reward system.

For example, we might give our snake 1 point every time it moves 1 step towards the food, and maybe 10 points every time it eats the food and grows in length.

However, when Binggeser (Designing AI, 2017) implemented these rewards for his snake, he realized his snake would only move in a very small circle.

This way his snake was able to rack up points while avoiding the hazards posed by walls and a long tail.

Obviously, this wasn’t the intended result.

There needed to be some type of punishment built into the initial model that removed points whenever the snake moved away from the food.

This encouraged the snake to move primarily in the direction of the food.
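
The effect of the penalty is easy to check with a small sketch. The +1 and +10 rewards come from the description above; the size of the penalty (`away_penalty=1`) is an assumption, since the article does not state the value that was used.

```python
def step_reward(old_distance, new_distance, ate_food, away_penalty=1):
    """Reward for one move, based on the distance to the food before and after."""
    if ate_food:
        return 10
    if new_distance < old_distance:
        return 1
    # On a grid every move changes the distance, so this branch means "moved away".
    return -away_penalty


def total_reward(distances, away_penalty=1):
    """Sum the rewards along a path described by its distance to the food at each step."""
    return sum(
        step_reward(a, b, False, away_penalty)
        for a, b in zip(distances, distances[1:])
    )


# One lap of a small circle: two steps toward the food, two steps away.
lap = [5, 4, 5, 6, 5]
print(total_reward(lap, away_penalty=0))  # 2
print(total_reward(lap))                  # 0
```

Without a penalty, each lap of the circle earns points forever; with it, circling earns nothing, so reaching the food becomes the only way to score.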

So now we have a snake that has information from the environment and a reward system that defines what its goal is.

Where do we go from here?

How does our snake actually learn how to play the game?

At this point, it will be helpful to give a quick run through of how a neural network actually works.

Generational Neural Network

A generational neural network is structured the same way as a standard neural network.

It starts off with a certain number of input nodes, which then feed into one or more hidden layers, eventually providing an output.

Here is a good example visual:

Figure 8 (Comi, 2018)

For our snake example, we would have 6 input nodes which are the 6 Yes/No questions we defined earlier: is it clear straight, is it clear left, is it clear right, is food straight, is food left, is food right.

Each input node connects to each of the first hidden layer nodes through what we refer to as weights.

In Figure 8 we see all of the lines (weights) connecting each of the nodes.

These weights are what our model will be adjusting as it learns which inputs to strengthen or weaken to provide the most accurate outputs.

In our case, “the most accurate output” is defined as “the snake that collects the highest amount of points”.

Remember, our snake receives points for moving towards the food, it receives even more points for eating the food, and it receives negative points for moving away from the food.
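
To show what "weights connecting nodes" means in practice, here is a minimal forward pass in plain Python. The layer sizes (6 inputs, 8 hidden nodes, 3 outputs), the tanh activation, the absence of bias terms, and the mapping of outputs to the three moves are all simplifying assumptions, not details taken from the cited articles.

```python
import math
import random

ACTIONS = ["left", "straight", "right"]


def make_network(sizes, rng):
    """Create one weight matrix per pair of adjacent layers, with random weights."""
    return [
        [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(network, inputs):
    """Push the inputs through every layer: weighted sum, then tanh."""
    activations = inputs
    for layer in network:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)))
            for row in layer
        ]
    return activations


def choose_action(network, inputs):
    """Pick the move whose output node has the largest value."""
    outputs = forward(network, inputs)
    return ACTIONS[outputs.index(max(outputs))]


network = make_network([6, 8, 3], random.Random(0))
print(choose_action(network, [1, 0, 1, 1, 0, 1]))
```

With random weights the chosen move is arbitrary; learning consists entirely of finding weights for which this choice scores well.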

The way a generational neural network ‘learns’ is by first deciding the size of each generation (let’s say we want each generation to contain 200 snakes).

Next, it creates micro-variations in the weights of each of the 200 snakes in the first generation.

It then runs all 200 snakes and selects the most successful ones (the snakes that received the most points).

Let’s say we pick the top 10 snakes (top 5%) that received the most points in our first generation.

These 10 snakes then become the ‘parents’ of the second generation.

The weights of these 10 snakes are used to define the second generation's starting point.

The second generation of 200 snakes will again create micro-variations in these weights and the top performers will be selected as ‘parents’ of the third generation, and so on.
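
The generational loop can be sketched as below, using the numbers from the text (200 snakes, top 10 as parents). Two things are assumptions made to keep the example short and runnable: `fitness` is a toy stand-in that scores how close the weights are to a hidden `TARGET` instead of actually playing Snake, and the 10 parents are carried over unchanged into the next generation so that the best score never gets worse.

```python
import random

POPULATION = 200
PARENTS = 10
N_WEIGHTS = 6
# Hypothetical "good" weights; a real run would score snakes by playing the game.
TARGET = [0.5, -0.2, 0.8, 0.1, -0.6, 0.3]


def fitness(weights):
    """Toy stand-in for a Snake score: higher is better, 0 is the maximum."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


def mutate(weights, rng, scale=0.1):
    """Return a copy of the weights with a small random variation on each one."""
    return [w + rng.gauss(0, scale) for w in weights]


def evolve(generations, seed=0):
    """Run the select-and-mutate loop and return the best score of each generation."""
    rng = random.Random(seed)
    start = [0.0] * N_WEIGHTS
    population = [mutate(start, rng) for _ in range(POPULATION)]
    best_per_generation = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:PARENTS]  # top 10 of 200, i.e. the top 5%
        best_per_generation.append(fitness(parents[0]))
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(POPULATION - PARENTS)
        ]
        population = parents + children
    return best_per_generation


scores = evolve(20)
print(scores[0], scores[-1])
```

Swapping `fitness` for a function that plays a game of Snake with the network's weights and returns the points earned would turn this toy into the process described above.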

Back to Snake

So, as we saw above, we can run our first generation snake model over and over (above we ran it 200 times) to see a wide range of variations the snake comes up with through changing each of the weights slightly.

We then select the top performers that will go on to influence the neural network weightings in the second generation.

We repeat this process for every subsequent generation until the snake’s learning rate begins to level out (in other words, until the generational improvement slows or stops).

Maybe in the first, second, and third generations none of the snakes ever ate a piece of food, and therefore never learned that food rewards 10 points.

However, maybe on the fourth generation one snake eats a piece of food.

This snake will likely have the highest amount of points from its generation and will, therefore, be selected to influence future generations.

The weights of future generations will be altered based on the most successful snake ancestors.

After 10, 100, or even 1,000 generations, you can imagine how much learning will take place.

Uses of Video Game A.I. in the Real World

The same type of reinforcement learning that is being used in the video game industry is also being successfully applied to other industries.

For example, the Grand Theft Auto games, which have pre-programmed “traffic rules, roads, and car physics” (Luzgin, 2018), have been used to provide a safe and realistic environment for testing self-driving car algorithms.

Not only is it safe and realistic, but it is also up to 1,000 times faster to collect data in a virtual environment compared to the real world (Luzgin, 2018).

“Video games are a great way of training AI algorithms because they are designed to give human minds gradual progression into harder and harder challenges.” (Luzgin, 2018)

One of the latest advancements of A.I. in video games was made by researchers at OpenAI.

OpenAI created a game-playing algorithm whose sole purpose was simply to explore with a sense of natural curiosity.

The reward system focused on rewarding exploration over progressing further into the game.

The researchers placed this curiosity-driven model into a game of Super Mario Bros. and it successfully passed 11 levels out of pure curiosity.

Obviously, there are downsides to this, as it takes immense computing power and the machine can get easily distracted.

However, this would also be the same for a human player playing the game for the first time.
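
To give a feel for what "rewarding exploration" can mean, here is a deliberately simplified sketch that pays a bonus for visiting states the agent has rarely seen. This count-based bonus is a swapped-in illustration and is not claimed to be the method the researchers used; the class name and the bonus formula are assumptions.

```python
from collections import Counter


class NoveltyBonus:
    """Pays a reward that shrinks each time the same state is visited again."""

    def __init__(self):
        self.visits = Counter()

    def reward(self, state):
        self.visits[state] += 1
        # First visit pays 1.0; repeat visits pay less and less.
        return 1.0 / self.visits[state] ** 0.5


bonus = NoveltyBonus()
print(bonus.reward("level 1, screen 3"))  # 1.0
print(bonus.reward("level 1, screen 3"))  # smaller than 1.0
print(bonus.reward("level 1, screen 4"))  # 1.0
```

Under a reward like this, standing still quickly stops paying, so the agent is pushed toward parts of the game it has not seen yet.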

As Luzgin quoted in his article, “babies appear to employ goal-less exploration to learn skills that will be useful later on in life.” This sense of goal-less exploration is continued throughout life, but the most obvious example is in the exploration of virtual environments through video games.

Summary

There are many forms of A.I. in use across the video game industry today.

Whether it’s a simplistic FSM model or an advanced neural network learning from feedback in its environment, the possibilities that these virtual environments provide to the advancement of A.