Abstract

Temporal difference (TD) learning methods have become popular reinforcement
learning techniques in recent years. TD methods have had some experimental successes and have
been shown to exhibit some desirable properties in theory, but have often been found very slow in
practice. A key feature of TD methods is that they represent policies in terms of value functions.
In this paper we introduce behavior transfer, a novel approach to speeding up TD learning
by transferring the learned value function from one task to a second related task. We present
experimental results showing that autonomous learners are able to learn one multiagent task
and then use behavior transfer to markedly reduce the total training time for a more complex
task.

BibTeX Entry

@InProceedings{AAMAS05-transfer,
author="Matthew E.\ Taylor and Peter Stone",
title="Behavior Transfer for Value-Function-Based Reinforcement Learning",
booktitle="The Fourth International Joint Conference on Autonomous Agents and Multiagent Systems",
month="July",
year="2005",
editor="Frank Dignum and Virginia Dignum and Sven Koenig and Sarit Kraus and Munindar P.~Singh and Michael Wooldridge",
publisher="{ACM Press}",
address="New York, NY",
pages="53--59",
abstract={
Temporal difference (TD) learning
methods have become popular
reinforcement learning techniques in recent years. TD
methods have had some experimental successes and have
been shown to exhibit some desirable properties in
theory, but have often been found very slow in
practice. A key feature of TD methods is that they
represent policies in terms of value functions. In
this paper we introduce \emph{behavior transfer}, a
novel approach to speeding up TD learning by
transferring the learned value function from one task
to a second related task. We present experimental
results showing that autonomous learners are able to
learn one multiagent task and then use behavior
transfer to markedly reduce the total training time
for a more complex task.
},
wwwnote={<a href="http://www.aamas2005.nl/">AAMAS-2005</a>},
}