chical reinforcement learning that does not rely on a pri ori hierarchical structures Thus the approach deals with a more di cult problem compared with existing work It in volves learning to segment sequences to create hierarchical structures based on reinforcement received during task ex ecution with di erent levels of control communicating with each other through sharing reinforcement estimates obtained by each others The algorithm segments sequences to re duce non Markovian temporal dependencies to facilitate the learning of the overall task Initial experiments demon strated the basic promise of the approach..