Automatic Discovery and Transfer of Task Hierarchies in Reinforcement Learning

Neville Mehta, Soumya Ray, Prasad Tadepalli, Thomas Dietterich

Abstract

Sequential decision tasks present many opportunities for the study of transfer learning. A principal one among them is the existence of multiple domains that share the same underlying causal structure for actions. We describe an approach that exploits this shared causal structure to discover a hierarchical task structure in a source domain, which in turn speeds up learning of task execution knowledge in a new target domain. Our approach is theoretically justiﬁed and compares favorably to manually designed task hierarchies in learning efﬁciency in the target domain. We demonstrate that causally motivated task hierarchies transfer more robustly than other kinds of detailed knowledge that depend on the idiosyncrasies of the source domain and are hence less transferable.