Are People Successful at Learning Sequential Decisions on a Perceptual Matching Task?

Reiko Yakushijin, Aoyama Gakuin University

Robert Jacobs, University of Rochester

Abstract

Sequential decision-making tasks are commonplace in our everyday
lives. We report the results of an experiment in which human subjects were
trained to perform a perceptual matching task, an instance of a sequential
decision-making task. We use two benchmarks to evaluate the quality of subjects'
learning. One benchmark is based on optimal performance as defined by a dynamic
programming procedure. The other is based on an adaptive computational agent that
uses a reinforcement learning method known as Q-learning to learn to perform the
task. Our analyses suggest that subjects learned to perform the perceptual
matching task in a near-optimal manner by the end of training. Subjects were able
to achieve near-optimal performance because they learned, at least partially, the
causal structure underlying the task. Subjects' learning curves were broadly
consistent with those of model-based reinforcement-learning agents that built and
used internal models of how their actions influenced the external environment. We
hypothesize that, in general, people will achieve near-optimal performance on
sequential decision-making tasks when they can detect the effects of their
actions on the environment, and when they can represent and reason about these
effects using an internal mental model.
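
To make the two benchmarks concrete, the following is a minimal sketch that contrasts a dynamic programming solution with a tabular Q-learning agent. It does not use the perceptual matching task itself: the six-state chain environment, the step cost of -1, and all names and parameter values (`step`, `value_iteration`, `q_learning`, `greedy_steps`, `alpha`, `epsilon`, and so on) are assumptions chosen only for illustration.

```python
import random

# Hypothetical toy environment (not the task from the study):
# states 0..5 on a line, the agent starts at 0 and must reach state 5.
N_STATES = 6
ACTIONS = (-1, +1)          # move left or move right
GOAL = N_STATES - 1


def step(state, action):
    """Deterministic transition; every move costs one step (reward -1)."""
    next_state = min(max(state + action, 0), GOAL)
    return next_state, -1.0, next_state == GOAL


def value_iteration(gamma=1.0, sweeps=100):
    """Dynamic programming benchmark: optimal value of every state."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(GOAL):                 # the goal state keeps value 0
            backups = []
            for a in ACTIONS:
                s2, r, _ = step(s, a)
                backups.append(r + gamma * values[s2])
            values[s] = max(backups)
    return values


def q_learning(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Model-free benchmark: tabular Q-learning with epsilon-greedy choices."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[s][i])  # exploit
            s2, r, done = step(s, ACTIONS[a])
            # Q-learning update: move the estimate toward the one-step target.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


def greedy_steps(q, max_steps=100):
    """Number of moves the learned greedy policy needs from the start state."""
    s, steps, done = 0, 0, False
    while not done and steps < max_steps:
        a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        s, _, done = step(s, ACTIONS[a])
        steps += 1
    return steps


if __name__ == "__main__":
    optimal = value_iteration()
    learned = q_learning()
    print("optimal steps from start:", -optimal[0])
    print("steps taken by learned greedy policy:", greedy_steps(learned))
```

In this sketch, the dynamic programming routine has full knowledge of the transition function, whereas the Q-learning agent learns only from experienced transitions. Comparing a learner's step count per trial against both curves mirrors, in a much simplified form, the kind of comparison described above.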