Reinforcement Learning Repository at UMass, Amherst

Demos and Implementation (Domains)

This section contains programs that demonstrate
reinforcement learning in action, illustrating the concepts and
common algorithms. They may provide a useful
starting point for implementing reinforcement learning to solve
real problems and to advance research in this area. Source
code is included wherever possible.

Please note that use of this software is restricted; you must
read this license agreement and agree to its terms
before downloading any
software from this site. Downloading the software is considered consent
to the terms.

Elevator
A Fortran simulation of an elevator, written by James Lewis and provided by
Christos Cassandras of the UMass ECE Dept. The reinforcement learning
addition to the elevator simulation was implemented by Bob Crites (CS
Dept., UMass) and John McNulty, and is described in the paper
Improving Elevator Performance Using Reinforcement Learning.
Source code: elevator.tar.gz (284 K) or elevator.tar (814 K).
Both require a C compiler and the f2c library
to convert Fortran to C, because the code incorporates C random number
handling routines.

Grid World
This program is a simulation of learning the goal of moving to a
user-defined square of a grid. It uses Q-learning, and was written by
Sridhar Mahadevan.
Source code: grid.tar (72 K;
requires a C compiler and the X11 libraries)
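
As a rough illustration of the kind of learning Grid World demonstrates, the following Python sketch runs tabular Q-learning on a small grid. It is not the code in grid.tar: the function names, grid size, reward scheme (1.0 on reaching the goal, 0.0 otherwise), and parameter values are all assumptions chosen for illustration.

```python
import random

# Hypothetical names throughout; this is not the code shipped in grid.tar.
# Actions are (row, column) offsets: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(state, action, size, goal):
    """Move one cell, clamped to the grid; reward 1.0 on reaching the goal."""
    row = min(max(state[0] + action[0], 0), size - 1)
    col = min(max(state[1] + action[1], 0), size - 1)
    next_state = (row, col)
    done = next_state == goal
    return next_state, (1.0 if done else 0.0), done


def q_learning(size=4, goal=(3, 3), episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Return a dict mapping (state, action_index) to a learned Q-value."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in range(n_actions)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                values = [q[(state, i)] for i in range(n_actions)]
                best = max(values)
                # Break ties randomly so early episodes do not get stuck.
                a = rng.choice([i for i, v in enumerate(values) if v == best])
            next_state, reward, done = step(state, ACTIONS[a], size, goal)
            if done:
                target = reward
            else:
                target = reward + gamma * max(
                    q[(next_state, i)] for i in range(n_actions))
            # Standard Q-learning update toward the bootstrapped target.
            q[(state, a)] += alpha * (target - q[(state, a)])
            state = next_state
            if done:
                break
    return q
```

Calling q_learning() returns the learned table; the greedy action in each cell is the one with the largest value for that cell.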

Interactive Demo of Q-learning
A Java Swing
applet that allows the user to construct a grid by specifying danger and
target cells, and then modify various learning parameters. Upon completion
of learning, the learned policy is displayed as arrows overlaying the grid.
Documentation is available
here.
A French version is available
here
with documentation.
Written by Thierry Masson.
Requirements: JDK 1.3 or higher.
Source code: TM_QLearnerDemo_Src_only.jar (38.8 K).
Classes: TM_QLearnerDemo.jar (22.6 K).
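
To make the "policy as arrows" display concrete, here is a small text-only Python sketch that turns a Q-table into a grid of arrows. It is not taken from the applet: the function name render_policy, the Q-table layout (a dict keyed by ((row, col), action_index)), and the "T"/"X" markers for target and danger cells are assumptions made for this example.

```python
# Hypothetical helper, not part of TM_QLearnerDemo.
# Action indices are assumed to mean: 0 = up, 1 = down, 2 = left, 3 = right.
ARROWS = {0: "^", 1: "v", 2: "<", 3: ">"}


def render_policy(q, rows, cols, targets=(), dangers=()):
    """Return a text grid showing the greedy action in each ordinary cell."""
    lines = []
    for r in range(rows):
        cells = []
        for c in range(cols):
            if (r, c) in targets:
                cells.append("T")  # target cell
            elif (r, c) in dangers:
                cells.append("X")  # danger cell
            else:
                # Greedy action: the index with the largest Q-value
                # (missing entries count as 0.0).
                best = max(range(4), key=lambda a: q.get(((r, c), a), 0.0))
                cells.append(ARROWS[best])
        lines.append(" ".join(cells))
    return "\n".join(lines)
```

For example, a one-row grid whose two left cells prefer moving right toward a target in the last column would be rendered with two right-pointing arrows followed by the target marker.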

Proposed Standard for Reinforcement Learning Software
This standard,
developed by Rich Sutton and Juan Carlos Santamaria, is intended to
facilitate RL research and development, and is available for C++ and
Common Lisp.

Proposed Standard Interface
A proposed standard interface
for RL systems written in C++. It provides standard interface
classes for an agent, an environment, a function approximator,
and states and actions. Written by Bohdana Ratitch.
Source code: si-classes.tar
(72 K, requires C++ compiler, documentation).
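
The idea of separating agent and environment behind standard interface classes can be sketched as follows. This is a Python illustration only, not the C++ classes in si-classes.tar: every class and method name here (Environment, Agent, reset, step, act, learn, run_episode, and the two toy implementations) is an assumption, and the function approximator, state, and action classes mentioned above are omitted for brevity.

```python
from abc import ABC, abstractmethod


class Environment(ABC):
    """Hypothetical environment interface (names are assumptions)."""

    @abstractmethod
    def reset(self):
        """Start a new episode and return the initial state."""

    @abstractmethod
    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""


class Agent(ABC):
    """Hypothetical agent interface (names are assumptions)."""

    @abstractmethod
    def act(self, state):
        """Choose an action for the given state."""

    @abstractmethod
    def learn(self, state, action, reward, next_state, done):
        """Update the agent from one observed transition."""


def run_episode(agent, env, max_steps=100):
    """Run one episode using only the two interfaces; return total reward."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        agent.learn(state, action, reward, next_state, done)
        total += reward
        state = next_state
        if done:
            break
    return total


class CountdownEnv(Environment):
    """Toy environment: pays 1.0 per step and ends after n steps."""

    def __init__(self, n):
        self.n = n
        self.remaining = n

    def reset(self):
        self.remaining = self.n
        return self.remaining

    def step(self, action):
        self.remaining -= 1
        return self.remaining, 1.0, self.remaining == 0


class ConstantAgent(Agent):
    """Toy agent: always picks action 0 and only counts its updates."""

    def __init__(self):
        self.updates = 0

    def act(self, state):
        return 0

    def learn(self, state, action, reward, next_state, done):
        self.updates += 1
```

Because run_episode depends only on the abstract interfaces, any agent can be paired with any environment, which is the benefit a shared interface is meant to provide.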

Programs written by Bohdana Ratitch using this proposed
standard interface are the following. All require a C++ compiler.
Further compiler information can be found
here.

Reinforcement Learning Toolbox
The
Reinforcement Learning Toolbox is a set of classes implementing
a variety of reinforcement learning algorithms,
including TD-lambda, actor-critic, prioritized sweeping, and hierarchical learning. The toolbox also permits logging and error recognition.
Written by Gerhard Neumann and Stephan Neumann.
Download (written in C++, available for both Windows and Linux).

Rumpus Gridworld Simulator
A language-independent simulator
that uses TCP/IP ports for interaction with client
applications (agents).
Requires the scripting language Ruby.
The simulator
allows a gridworld to be specified using a bitmap format and
supports both local and
unique state descriptions, as well as deterministic and
non-deterministic actions.
Written by Torbjorn Dahl.

CLSquare
CLSquare, from the Neuroinformatics Group at the University of
Osnabrueck,
simulates a control loop for
closed-loop control. Although originally designed
for training and testing reinforcement learning controllers, it also applies to other learning
and non-learning controller concepts.
Currently available plants:
acrobot, bicycle, cart pole, cart double pole, pole, mountain car, and maze.
Currently available controllers:
linear controller, reinforcement learning Q-table, neural-network-based Q controller.
It comes with many useful features, such as graphical display and statistics output,
documentation, and many demos for getting started quickly.
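
The plant/controller split in a closed control loop can be illustrated with a very small Python sketch. This is not CLSquare code or its API: the class names, the one-dimensional integrator plant, the proportional linear controller, and all parameter values are assumptions chosen to keep the example short.

```python
# Hypothetical classes for illustration; CLSquare's real interfaces may differ.


class IntegratorPlant:
    """Toy plant: a single state x that integrates the control input."""

    def __init__(self, x0, dt):
        self.x = x0
        self.dt = dt

    def observe(self):
        return self.x

    def apply(self, u):
        # Euler step of dx/dt = u.
        self.x += self.dt * u


class LinearController:
    """Proportional controller that drives the observation toward zero."""

    def __init__(self, gain):
        self.gain = gain

    def control(self, observation):
        return -self.gain * observation


def run_loop(plant, controller, steps):
    """Closed loop: observe, compute the control, apply it, repeat."""
    trajectory = [plant.observe()]
    for _ in range(steps):
        u = controller.control(plant.observe())
        plant.apply(u)
        trajectory.append(plant.observe())
    return trajectory
```

With an initial state of 1.0, a time step of 0.1, and a gain of 0.5, each pass through the loop shrinks the state by a factor of 0.95, so the trajectory decays toward zero. A learning controller would expose the same control method but adapt its behavior from experience.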

PIQLE
PIQLE
is an open source set of Java classes
for quickly experimenting with single- and multi-agent reinforcement
learning schemes (new problems or new algorithms),
by Francesco De Comité.
Version 2, with a major refactoring of classes, English
renamings, and synthetic documentation, was released in November 2006.

The
Pinball domain is
a fairly challenging 4-dimensional continuous and dynamic reinforcement
learning
domain. The goal is to maneuver the blue ball into the red hole, while
avoiding (or using, since the ball is dynamic and collisions are
elastic) the obstacles. The dynamics of the ball and the presence of
obstacles result in a domain with sharp discontinuities, and the
location and shape of the obstacles can be specified so you can make
the domain as hard or as easy as you want.

The source code is in Java and includes full documentation,
an RL-Glue interface, and GUI
programs for editing obstacle
configurations, viewing saved trajectories, etc.