Hierarchical Reinforcement Learning with Context Detection (HRL-CD)

Yiğit E. Yücesoy and M. Borahan Tümer

Abstract—Reinforcement learning (RL) agents typically assume that the environment is stationary, an assumption that does not hold for most real-world problems. Most RL approaches adapt to slow changes by forgetting the previous dynamics of the environment. Reinforcement learning with context detection (RL-CD) is a technique that detects changes in the environment's dynamics, equipping the agent with the capability to learn the different dynamics of a non-stationary environment. In this study, we propose an autonomous agent that learns a dynamic environment by taking advantage of hierarchical reinforcement learning (HRL), and we show how a hierarchical structure can be integrated into RL-CD to speed up the convergence of a policy.
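The core idea behind RL-CD is to maintain one predictive model per context and to create a new context when no existing model explains recent transitions well. The sketch below is a minimal, illustrative Python implementation of that idea, not the paper's exact algorithm; the class name, the exponential error trace, and the threshold rule are assumptions chosen for clarity.

```python
class ContextDetector:
    """Illustrative RL-CD-style context detection (a sketch, not the
    authors' exact method).

    Keeps one transition model per context; the active context is the
    model with the lowest recent prediction error. If every model's
    error exceeds a threshold, a new context is created.
    """

    def __init__(self, error_threshold=0.5, decay=0.9):
        self.error_threshold = error_threshold  # assumed hyperparameter
        self.decay = decay                      # smoothing for the error trace
        self.models = []   # per-context counts: (s, a) -> {s': count}
        self.errors = []   # per-context smoothed prediction-error traces
        self._new_context()

    def _new_context(self):
        self.models.append({})
        self.errors.append(0.0)
        self.active = len(self.models) - 1

    def _predict_prob(self, ctx, s, a, s2):
        counts = self.models[ctx].get((s, a), {})
        total = sum(counts.values())
        if total == 0:
            return 0.0  # unseen transition counts as maximally surprising
        return counts.get(s2, 0) / total

    def observe(self, s, a, s2):
        # Update every model's error trace with its prediction surprise.
        for ctx in range(len(self.models)):
            surprise = 1.0 - self._predict_prob(ctx, s, a, s2)
            self.errors[ctx] = (self.decay * self.errors[ctx]
                                + (1 - self.decay) * surprise)
        # Switch to the best-fitting context, or spawn a new one.
        best = min(range(len(self.models)), key=lambda c: self.errors[c])
        if self.errors[best] > self.error_threshold:
            self._new_context()
        else:
            self.active = best
        # Only the active model learns from this transition.
        m = self.models[self.active].setdefault((s, a), {})
        m[s2] = m.get(s2, 0) + 1
        return self.active
```

Feeding the detector transitions from one regime and then abruptly switching the environment's dynamics causes the first model's error trace to climb past the threshold, after which a second context is created and learned separately, rather than the first model being overwritten.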