Abstract

As machine learning is applied to increasingly complex tasks, it is likely that the diverse challenges encountered
can only be addressed by combining the strengths of different learning algorithms. We examine this aspect of learning
through a case study grounded in the robot soccer context. The task we consider is Keepaway, a popular benchmark for
multiagent reinforcement learning from the simulation soccer domain. Whereas previous successful results in Keepaway have
limited learning to an isolated, infrequent decision that amounts to a turn-taking behavior (passing), we expand the
agents' learning capability to include a much more ubiquitous action (moving without the ball, or getting open), such
that at any given time, multiple agents are executing learned behaviors simultaneously. We introduce a policy search
method for learning "GetOpen" to complement the temporal difference learning approach employed for learning "Pass".
Empirical results indicate that the learned GetOpen policy matches the best hand-coded policy for this task, and outperforms
the best policy found when Pass is learned. We demonstrate that Pass and GetOpen can be learned simultaneously to realize
tightly-coupled soccer team behavior.

BibTeX Entry

@incollection{LNAI09-kalyanakrishnan-1,
  author    = "Shivaram Kalyanakrishnan and Peter Stone",
  title     = "Learning Complementary Multiagent Behaviors: A Case Study",
  booktitle = "{R}obo{C}up 2009: Robot Soccer World Cup {XIII}",
  editor    = "Jacky Baltes and Michail G. Lagoudakis and Tadashi Naruse and Saeed Shiry Ghidary",
  publisher = "Springer Verlag",
  year      = "2010",
  pages     = "153--165",
  abstract  = {As machine learning is applied to increasingly complex tasks, it is likely that the diverse challenges encountered can only be addressed by combining the strengths of different learning algorithms. We examine this aspect of learning through a case study grounded in the robot soccer context. The task we consider is Keepaway, a popular benchmark for multiagent reinforcement learning from the simulation soccer domain. Whereas previous successful results in Keepaway have limited learning to an isolated, infrequent decision that amounts to a turn-taking behavior (passing), we expand the agents' learning capability to include a much more ubiquitous action (moving without the ball, or getting open), such that at any given time, multiple agents are executing learned behaviors simultaneously. We introduce a policy search method for learning ``GetOpen'' to complement the temporal difference learning approach employed for learning ``Pass''. Empirical results indicate that the learned GetOpen policy matches the best hand-coded policy for this task, and outperforms the best policy found when Pass is learned. We demonstrate that Pass and GetOpen can be learned simultaneously to realize tightly-coupled soccer team behavior.},
}