Gym Tutorial: The Frozen Lake

In this article, we will learn how to create and explore the Frozen Lake environment using the Gym library, an open source project created by OpenAI for reinforcement learning experiments. Gym defines a uniform interface for environments, which makes it easier for developers to integrate algorithms with environments. Among many ready-to-use environments, the default installation includes a text-mode version of the Frozen Lake game, used as an example in our last post.

The Frozen Lake Environment

The first step to create the game is to import the Gym library and create the environment. The code below shows how to do it:

The first instruction imports the Gym library into our current namespace. The next line calls gym.make() to create the Frozen Lake environment, and then we call env.reset() to put it in its initial state. Finally, we call env.render() to print its state:
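The steps just described can be sketched as follows. This is a minimal reconstruction, not the original listing: it assumes an older Gym release that still registers the id FrozenLake-v0 and uses the classic API, where reset() returns only the state and render() prints to the console. The function name show_initial_state is hypothetical, and the import sits inside it only so that the snippet loads even where Gym is not installed.

```python
def show_initial_state(env_id="FrozenLake-v0"):
    """Create the environment, reset it, and print its initial state."""
    import gym  # imported here so the snippet loads without Gym installed

    env = gym.make(env_id)  # assumption: id as registered in older Gym releases
    state = env.reset()     # classic API: returns the initial state
    env.render()            # prints the grid with the player's position marked
    return env, state

# Call show_initial_state() in a session where Gym is installed.
```

Newer releases use a different version suffix and different reset() and render() signatures, so the id and the calls may need adjusting.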

Output of the env.render() method

So the same grid we saw in the previous post is now represented by a matrix of characters. Their meaning is as follows:

S: initial state

F: frozen lake

H: hole

G: the goal

Red square: indicates the current position of the player
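To make the legend concrete, here is a small sketch in plain Python (not Gym's own rendering code) that draws the default 4×4 layout and marks the player's cell. Two things are assumptions made for illustration: the helper name render_map, and the use of brackets in place of the red square. States are numbered row by row starting at 0, so state 0 is the S cell and state 15 is the G cell.

```python
# Default 4x4 layout of the Frozen Lake map, one string per row.
DEFAULT_MAP = ("SFFF", "FHFH", "FFFH", "HFFG")

def render_map(state, grid=DEFAULT_MAP):
    """Return the grid as text, with the player's cell wrapped in brackets."""
    ncols = len(grid[0])
    row, col = divmod(state, ncols)  # states are numbered row by row from 0
    lines = []
    for r, cells in enumerate(grid):
        line = "".join(
            f"[{cell}]" if (r, c) == (row, col) else f" {cell} "
            for c, cell in enumerate(cells)
        )
        lines.append(line)
    return "\n".join(lines)

print(render_map(0))  # player on S, the top-left cell
```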

Also, we can inspect the possible actions to perform in the environment, as well as the possible states of the game:
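With a real environment these two spaces are the fields env.action_space and env.observation_space. The sketch below uses a hypothetical stand-in class, DiscreteSketch, so that it runs without Gym installed; it only mimics the behaviour discussed here (a size n, a sample() method, and a membership check) and is not the library's implementation.

```python
import random

class DiscreteSketch:
    """Hypothetical stand-in for a discrete space holding the values 0 .. n-1."""

    def __init__(self, n):
        self.n = n

    def sample(self):
        # Pick one of the n values uniformly at random.
        return random.randrange(self.n)

    def contains(self, x):
        return isinstance(x, int) and 0 <= x < self.n

    def __repr__(self):
        return f"Discrete({self.n})"

action_space = DiscreteSketch(4)        # the four possible moves
observation_space = DiscreteSketch(16)  # the 16 cells of the 4x4 grid

print(action_space)       # Discrete(4)
print(observation_space)  # Discrete(16)
```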

Printing the fields action_space and observation_space shows that both are objects of type Discrete, which describes a discrete space of size n. For example, the action_space of the Frozen Lake environment is a discrete space of 4 values, so its possible values are 0, 1, 2 and 3. The observation_space, in turn, is a discrete space of 16 values, ranging from 0 to 15. These objects also offer utility methods, such as sample(), which returns a random value from the space. With this method, we can easily create a dummy agent that plays the game randomly:
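A minimal sketch of such a random agent is shown here. The function run_random_agent works with any environment that follows the classic Gym interface (reset(), step() returning four values, render(), and an action_space with sample()). To keep the example runnable without Gym, it is exercised on CorridorEnv, a hypothetical four-cell stand-in; with Gym installed, the object returned by gym.make() would be passed instead.

```python
import random

class CorridorEnv:
    """Hypothetical stand-in with the classic Gym interface: a 1x4 corridor.

    States 0..3, actions 0 (left) and 1 (right); reaching state 3 ends the game.
    """

    class _Actions:
        n = 2

        def sample(self):
            return random.randrange(self.n)

    def __init__(self):
        self.action_space = self._Actions()
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), 3)
        done = self.state == 3
        reward = 1.0 if done else 0.0
        return self.state, reward, done, {"prob": 1.0}

    def render(self):
        print("".join("P" if i == self.state else "." for i in range(4)))


def run_random_agent(env, max_steps=10):
    """Play one game with random actions; return the list of visited states."""
    visited = [env.reset()]
    for _ in range(max_steps):
        action = env.action_space.sample()                # pick a random action
        new_state, reward, done, info = env.step(action)  # classic four-value tuple
        env.render()
        visited.append(new_state)
        if done:  # terminal state reached, stop early
            break
    return visited

print(run_random_agent(CorridorEnv()))
```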

The code above executes the game for a maximum of 10 iterations using the method sample() from the action_space object to select a random action. Then the env.step() method takes the action as input, executes the action on the environment and returns a tuple of four values:

new_state: the new state of the environment

reward: the reward received for the transition (in Frozen Lake, 1 for reaching the goal and 0 otherwise)

done: a boolean flag indicating if the returned state is a terminal state

info: an object with additional information for debugging purposes

Finally, we use the method env.render() to print the grid on the console and use the returned “done” flag to break the loop. Notice that the selected action is printed together with the grid:

Output of successive calls to the env.render() method, after selecting an action to execute

Stochastic vs Deterministic

Note in the previous output the cases in which the player moves in a different direction from the one chosen by the agent. This behavior is completely normal in the Frozen Lake environment because it simulates a slippery surface. It also reflects an important characteristic of real-world environments: the transitions from one state to another, for a given action, are probabilistic. For example, if we shoot an arrow at a target, we may hit it or miss it. The odds of each outcome depend on our skill and on other factors, such as the direction of the wind. Because of this probabilistic nature, the result of a state transition does not depend entirely on the action taken.

By default, the Frozen Lake environment provided in Gym has probabilistic transitions between states. In other words, even when our agent chooses to move in one direction, the environment can execute a movement in another direction:
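The slippery behavior can be sketched in plain Python, without Gym. The sketch assumes the action codes 0 = left, 1 = down, 2 = right, 3 = up, and the rule that the executed move is either the chosen one or one of its two perpendicular directions, each with probability 1/3. The helper slippery_step is hypothetical and deliberately ignores holes and the goal; it only shows where the player ends up.

```python
import random

LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3  # action codes used by Frozen Lake
DELTAS = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}
SIZE = 4  # 4x4 grid

def slippery_step(state, action, rng=random):
    """Return (new_state, executed_action, prob) for one slippery move."""
    # The executed action is the chosen one or one of its two perpendicular
    # neighbours, each with probability 1/3; the opposite direction never occurs.
    candidates = [(action - 1) % 4, action, (action + 1) % 4]
    executed = rng.choice(candidates)
    row, col = divmod(state, SIZE)
    d_row, d_col = DELTAS[executed]
    # A move that would leave the grid keeps the player at the border.
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    return row * SIZE + col, executed, 1 / 3

state = 0
for _ in range(5):
    state, executed, prob = slippery_step(state, RIGHT)
    print(f"chose {RIGHT}, executed {executed}, now in state {state}, prob {prob:.4f}")
```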

Executing the code above, we can observe different results and paths on each run. Also, using the info object returned by the env.step() method, we can inspect the probability used by the environment to choose the executed movement:

The character moved in directions other than the selected one, with probability of 0.3333…

However, the Frozen Lake environment can also be used in deterministic mode. Passing is_slippery=False when creating the environment turns off the slippery surface, so the environment always executes the action chosen by the agent:
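In Gym itself the flag is passed as a keyword argument when the environment is created, for example gym.make("FrozenLake-v0", is_slippery=False), where the id is an assumption about the Gym release in use. The effect on the transitions can be sketched in plain Python: with slipperiness off, each move reduces to a fixed grid step taken with probability 1.0. The helper deterministic_step is hypothetical and, like any such sketch, ignores holes and the goal.

```python
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3  # action codes used by Frozen Lake
DELTAS = {LEFT: (0, -1), DOWN: (1, 0), RIGHT: (0, 1), UP: (-1, 0)}
SIZE = 4  # 4x4 grid

def deterministic_step(state, action):
    """Return (new_state, prob): the chosen action is always the executed one."""
    row, col = divmod(state, SIZE)
    d_row, d_col = DELTAS[action]
    # A move that would leave the grid keeps the player at the border.
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    return row * SIZE + col, 1.0

# The same action sequence yields the same path on every run.
state = 0
path = [state]
for action in [DOWN, DOWN, RIGHT, RIGHT, DOWN, RIGHT]:
    state, prob = deterministic_step(state, action)
    path.append(state)
print(path)  # [0, 4, 8, 9, 10, 14, 15]
```

On the default 4×4 map this particular sequence avoids every hole and ends on the goal cell, state 15.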

Observe that the probabilities returned in the info object are always equal to 1.0.

In deterministic mode, the agent always moves in the selected direction

Map sizes and custom maps

The default 4×4 map is not the only option for playing the Frozen Lake game. There is also an 8×8 version that we can create in two different ways. The first one is to use the specific environment id for the 8×8 map:
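A minimal sketch of this first option, assuming an older Gym release in which the 8×8 map is registered under the id FrozenLake8x8-v0. The original listing is not available, so the id is an assumption, the function name make_frozen_lake_8x8 is hypothetical, and the import sits inside the function only so that the snippet loads where Gym is not installed.

```python
def make_frozen_lake_8x8(env_id="FrozenLake8x8-v0"):
    """Create the 8x8 Frozen Lake through its dedicated environment id."""
    import gym  # imported here so the snippet loads without Gym installed

    env = gym.make(env_id)  # assumption: id as registered in older Gym releases
    # An 8x8 grid has 64 cells, so the state space should have 64 values.
    assert env.observation_space.n == 64
    return env

# Call make_frozen_lake_8x8() in a session where Gym is installed.
```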

Conclusion

In this post, we learned how to use the Gym library to create an environment for training a reinforcement learning agent. We focused on the Frozen Lake environment, a text-mode game whose simple rules still allow us to explore the fundamental concepts of reinforcement learning.
