Title: Hybrid Supervised/Reinforcement Learning Problems
Umar Syed, Princeton University
In supervised learning, a learner passively receives training examples
labeled by an expert. In reinforcement learning, the learner interacts
with an environment, receiving a training signal in the form of
rewards. In this talk, I will discuss algorithms for learning problems
that have features of both supervised and reinforcement
learning. Specifically, I will describe how one can leverage advice from
an expert to learn a good policy in an environment where the true reward
function is only partially known. An interesting consequence of our
analysis is a novel performance guarantee for multiplicative weights
algorithms. I will also describe a method for learning high-reward
behavior in an environment where the rewards can change abruptly but
predictably. I will present theoretical performance guarantees
for each of our algorithms, which in some cases represent substantial
improvements over guarantees for earlier methods. I will also present
some experimental results. Our algorithms are motivated by problems in
vehicle navigation and web search.
This talk includes joint work with Rob Schapire (Princeton), Nina Mishra
(Microsoft), and Alex Slivkins (Microsoft).