Dopamine - Google's open-source TensorFlow-based reinforcement learning framework

Dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with wild ideas (speculative research).

Our design principles are:

Easy experimentation: Make it easy for new users to run benchmark experiments.

Flexible development: Make it easy for new users to try out research ideas.

Compact and reliable: Provide implementations for a few, battle-tested algorithms.

In the spirit of these principles, this first version focuses on supporting the state-of-the-art, single-GPU Rainbow agent (Hessel et al., 2018) applied to Atari 2600 game-playing (Bellemare et al., 2013). Specifically, our Rainbow agent implements the three components identified as most important by Hessel et al.:

n-step Bellman updates (see e.g. Mnih et al., 2016)

Prioritized experience replay (Schaul et al., 2015)

Distributional reinforcement learning (C51; Bellemare et al., 2017)
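To make one of these components concrete, here is a minimal, standalone sketch of the n-step Bellman target used in Rainbow-style updates. This is not Dopamine's implementation (the actual agent lives under dopamine/agents/rainbow); it only illustrates the underlying arithmetic.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Discounted n-step return: sum_k gamma^k * r_k + gamma^n * V(s_n).

    rewards: the n rewards observed after the starting state.
    bootstrap_value: the value estimate at the state n steps later.
    """
    target = bootstrap_value
    # Fold the rewards in from the last step back to the first.
    for r in reversed(rewards):
        target = r + gamma * target
    return target

# Worked example with rewards [1, 0, 1], gamma 0.5, bootstrap 4:
# 1 + 0.5 * (0 + 0.5 * (1 + 0.5 * 4)) = 1.75
```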

What's new

01/11/2018: Download links for each individual checkpoint, to avoid having to download all of the checkpoints.

29/10/2018: Graph definitions now show up in TensorBoard.

16/10/2018: Fixed a subtle bug in the IQN implementation and updated the colab tools, the JSON files, and all the downloadable data.

18/09/2018: Added support for double-DQN style updates for the ImplicitQuantileAgent.

Can be enabled via the double_dqn constructor parameter.
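As with other agent options, this can be set through gin. A hypothetical binding (the parameter name comes from the note above; the agent class name is an assumption):

```
# Enable double-DQN style updates for the implicit quantile agent.
ImplicitQuantileAgent.double_dqn = True
```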

18/09/2018: Added support for reporting in-iteration losses directly from the agent to TensorBoard.

To enable it, set run_experiment.create_agent.debug_mode = True via the configuration file or the gin_bindings flag.

Control the frequency of writes with the summary_writing_frequency agent constructor parameter (default: 500).
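Put together, the bindings might look like this in a gin configuration file (the DQNAgent class name below is illustrative; use the class of the agent you are running):

```
# Report in-iteration losses to TensorBoard.
run_experiment.create_agent.debug_mode = True
# Write summaries every 100 steps instead of the default 500.
DQNAgent.summary_writing_frequency = 100
```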

27/08/2018: Dopamine launched!

Instructions

Install via source

Installing from source allows you to modify the agents and experiments as you please, and is likely to be the pathway of choice for long-term use. These instructions assume that you've already set up your favourite package manager (e.g. apt on Ubuntu, homebrew on Mac OS X), and that a C++ compiler is available from the command-line (almost certainly the case if your favourite package manager works).

The instructions below assume that you will be running Dopamine in a virtual environment. A virtual environment lets you control which dependencies are installed for which program; however, this step is optional and you may choose to ignore it.
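As a concrete sketch, a plain virtual environment can be created with Python's standard venv module (this assumes a Python 3 setup; the steps below use Anaconda instead, and either approach works):

```shell
# Create and activate a virtual environment (the directory name is illustrative).
python3 -m venv dopamine-env
. dopamine-env/bin/activate
```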

Dopamine is a TensorFlow-based framework, and we recommend you also consult the TensorFlow documentation for additional details.

Finally, these instructions are for Python 2.7. While Dopamine is Python 3 compatible, there may be some additional steps needed during installation.

First install Anaconda, which we will use as the environment manager, then proceed below.

To get finer-grained information about the process, you can adjust the experiment parameters in dopamine/agents/dqn/configs/dqn.gin, in particular by reducing Runner.training_steps and Runner.evaluation_steps, which together determine the total number of steps needed to complete an iteration. This is useful if you want to inspect log files or checkpoints, which are generated at the end of each iteration.
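For instance, shortening an iteration might look like this in dqn.gin (the values below are illustrative, not the shipped defaults):

```
# Fewer steps per iteration -> log files and checkpoints appear sooner.
Runner.training_steps = 1000
Runner.evaluation_steps = 500
```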