Automated generation of environments to test the general learning capabilities of AI agents

Algorithms for evolving agents that learn during their lifetime have typically been evaluated on only a handful of environments. Designing such environments is labour intensive, potentially biased, and provides only a small sample size that may prevent accurate general conclusions from being drawn. In this paper we introduce a method for automatically generating MDP environments which allows the difficulty to be scaled in several ways. We present a case study in which environments are generated that vary along three key dimensions of difficulty: the number of environment configurations, the number of available actions, and the length of each trial. The study reveals interesting differences between three neural network models (Fixed-Weight, Plastic-Weight, and Modulated Plasticity) that would not have been obvious without sweeping across these different dimensions. Our paper thus introduces a new way of conducting reinforcement learning science: instead of manually designing a few environments, researchers will be able to automatically generate a range of environments across key dimensions of variation. This will allow scientists to more rigorously assess the general learning capabilities of an algorithm, and may ultimately improve the rate at which we discover how to create AI with general-purpose learning.
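To make the idea concrete, the following is a minimal sketch of a parameterised generator for simple tabular MDPs whose difficulty can be scaled along the three dimensions named above. All function names, the tabular representation, and the reward range are illustrative assumptions, not the paper's actual generator.

```python
import random


def generate_mdp(n_configurations, n_actions, trial_length, seed=0):
    """Generate a batch of random tabular MDP 'configurations'.

    Each configuration maps (state, action) -> (next_state, reward).
    The choice of one state per trial step and rewards in [-1, 1]
    are arbitrary assumptions for this sketch.
    """
    rng = random.Random(seed)
    n_states = trial_length  # arbitrary: one state per step of the trial
    configs = []
    for _ in range(n_configurations):
        transitions = {
            (s, a): (rng.randrange(n_states), rng.uniform(-1.0, 1.0))
            for s in range(n_states)
            for a in range(n_actions)
        }
        configs.append(transitions)
    return configs


def run_trial(config, policy, trial_length):
    """Roll a policy out for trial_length steps; return total reward."""
    state, total = 0, 0.0
    for _ in range(trial_length):
        action = policy(state)
        state, reward = config[(state, action)]
        total += reward
    return total
```

Scaling any of the three parameters (`n_configurations`, `n_actions`, `trial_length`) then yields a family of progressively harder evaluation environments, e.g. `run_trial(generate_mdp(10, 4, 20)[0], my_policy, 20)`.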

Pub. Info:

Proceedings of the Genetic and Evolutionary Computation Conference, pp. 161–168.