Imitation learning has been commonly applied to solve different tasks in
isolation. This usually requires either careful feature engineering, or a
significant number of samples. This is far from what we desire: ideally, robots
should be able to learn from very few demonstrations of any given task, and
instantly generalize to new situations of the same task, without requiring
task-specific engineering. In this paper, we propose a meta-learning framework
for achieving such capability, which we call one-shot imitation learning.
Specifically, we consider the setting where there is a very large set of
tasks, and each task has many instantiations. For example, a task could be to
stack all blocks on a table into a single tower, another task could be to place
all blocks on a table into two-block towers, etc. In each case, different
instances of the task would consist of different sets of blocks with different
initial states. At training time, our algorithm is presented with pairs of
demonstrations for a subset of all tasks. A neural net is trained that takes as
input one demonstration and the current state (which initially is the initial
state of the other demonstration of the pair), and outputs an action with the
goal that the resulting sequence of states and actions matches as closely as
possible with the second demonstration. At test time, a demonstration of a
single instance of a new task is presented, and the neural net is expected to
perform well on new instances of this new task. The use of soft attention
allows the model to generalize to conditions and tasks unseen in the training
data. We anticipate that by training this model on a much greater variety of
tasks and settings, we will obtain a general system that can turn any
demonstrations into robust policies that can accomplish an overwhelming variety
of tasks.
Videos available at https://bit.ly/one-shot-imitation.

Captured tweets and retweets: 5

Made with a human heart + one part enriched uranium + four parts unicorn blood