Abstract

A paradigm shift is underway in reinforcement learning (RL) research.
The dominant approach to RL over the past 50 years has been based on
dynamic programming, whereby RL agents learn an abstract value
function. An alternative approach, Direct Reinforcement (DR), has
recently been revisited, in which agents learn strategies to solve
problems directly. DR can enable a simpler problem representation,
avoid Bellman's curse of dimensionality, and offer compelling advantages
in efficiency.

We review and contrast the major approaches to RL, present Direct
Reinforcement, and describe its application to asset management with
transaction costs. In this very challenging and uncertain domain, DR
agents seek to discover strategies that maximize profit, economic
utility, or risk-adjusted returns. The potential of DR is
illustrated through an asset allocator and an intraday foreign
exchange trader.

Other promising applications of DR may be found in robotics, autonomous
vehicles, industrial control, telecommunications, data mining, adaptive
software agents, and decision support.