DeepMPC

Learning Deep Latent Features for Model Predictive Control

What is DeepMPC?

For robots to be able to perform the large variety of tasks required of them
in the real world, they need to learn layers of abstracted knowledge.
Imagine a robot getting your dinner ready for when you come home from work.
Sure beats takeout!
However, even making a simple recipe like a salad is challenging. There is a huge amount of variety in food items, and each needs to be manipulated differently.

Following traditional control theory, the solution to this problem would be to create a new
controller for each food item we want the robot to chop - one for cucumbers, one
for lemons, one for potatoes, and so on. Obviously, this is very time-consuming and doesn't let us handle new materials. So much for that bok choy! It also struggles with variations within the same type of material, such as
thicker or thinner carrots and potatoes, varying temperature, or layers
like the skin and flesh of a lemon.

A skilled chef understands from experience how each material responds,
and adapts to what they feel while cutting. That's what
DeepMPC does, too. It lets the robot learn a model of how the
world responds to its actions, even under all the variety we see when
cutting food. Then, when acting, it uses this model
to pick the actions which it thinks will give the best future results,
continually updating (100 times a second) based on new information.

Of course, cutting food is just one application of DeepMPC - it's designed
to be a flexible, general approach to solving complex control problems where
the robot interacts with a changing environment. Other such problems
include scrubbing dishes, assembling furniture, maneuvering aerial or underwater
drones, and even robotic surgery.

How does DeepMPC work?

More formally, DeepMPC is an approach to model learning for
predictive control
designed to handle both variations in the robot's environment
and variations that might occur while the robot acts. The two
main components of this algorithm are a Model Predictive
Controller (MPC) and Deep Learning (DL).

Model Predictive Control:
The key idea behind MPC is essentially allowing a robot to
ask "what if?" If the robot has an accurate model of
how the world responds to its actions, it can use that model
to try out candidate actions in advance and pick the one it predicts will do best.
The problem is
that we need an accurate model so these predictions
match the real world. Hand-coding models for complex, varying
tasks like cutting food is almost impossible, so instead
we want the robot to be able to learn from
experience.
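
The "what if?" loop can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-line toy model `predict_next_depth` stands in for a learned model, the names and the `gain` value are hypothetical, and candidate forces are simply enumerated from a fixed grid, which is not necessarily how DeepMPC optimizes its actions. Each candidate force is rolled forward through the model for a short horizon, the one with the lowest predicted cost is applied for a single step, and the process repeats.

```python
def predict_next_depth(depth, force, gain=0.005):
    """Toy stand-in for a learned model: predicted knife depth after one control step."""
    return depth + gain * force


def predicted_cost(model, depth, force, target, horizon):
    """Roll the model forward under a constant force; sum squared distance to target."""
    cost = 0.0
    for _ in range(horizon):
        depth = model(depth, force)
        cost += (depth - target) ** 2
    return cost


def choose_force(model, depth, target, horizon=5):
    """Ask "what if?" for each candidate force and return the best-scoring one."""
    candidates = [f / 4.0 for f in range(-20, 21)]  # -5.0 ... 5.0 in steps of 0.25
    return min(candidates,
               key=lambda f: predicted_cost(model, depth, f, target, horizon))


def run_controller(target=0.05, steps=100):
    """Receding-horizon loop: replan at every step, apply only the first action."""
    depth = 0.0
    for _ in range(steps):
        force = choose_force(predict_next_depth, depth, target)
        # In this sketch the "real world" is the same toy model used for planning.
        depth = predict_next_depth(depth, force)
    return depth


final_depth = run_controller()
```

Because the controller replans at every step, a prediction error made at one step can be corrected at the next. On a real robot the planning model and the world differ, which is why the accuracy of the model matters so much.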

Deep Learning:
To learn these models, we apply deep learning algorithms. Deep
learning is an exciting new class of machine learning
algorithms which train artificial neural networks to solve
problems. These
algorithms are especially powerful because they can learn
good representations (abstractions) directly from data, allowing
us to deal with real-world variety without having to model
it ourselves.
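
To make "learning a model from experience" concrete, here is a deliberately tiny sketch. It is not a deep network and not the DeepMPC architecture: it fits a single hypothetical parameter, `gain`, so that displacement is approximated by `gain * force`, using gradient descent on logged data. The data, the names, and the true gain of 0.005 are all made up for illustration. Deep learning applies the same idea, adjusting parameters to reduce prediction error on recorded experience, to networks with many layers and parameters.

```python
def collect_experience(true_gain=0.005):
    """Hypothetical log of (force, observed displacement) pairs from a toy system."""
    forces = [f / 2.0 for f in range(-10, 11)]  # -5.0 ... 5.0 in steps of 0.5
    return [(f, true_gain * f) for f in forces]


def fit_gain(data, learning_rate=0.01, epochs=200):
    """Fit displacement ~= gain * force by gradient descent on mean squared error."""
    gain = 0.0
    for _ in range(epochs):
        # Gradient of the mean squared prediction error with respect to gain.
        grad = sum(2.0 * (gain * f - d) * f for f, d in data) / len(data)
        gain -= learning_rate * grad
    return gain


learned_gain = fit_gain(collect_experience())
```

The fitted `learned_gain` could then be plugged into a predictive model in place of a hand-tuned constant.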

In DeepMPC, we develop a new deep architecture and learning
algorithm for modeling
complex physical dynamics like we see when cutting food. This
architecture allows us to model the effects of variations
between and within materials without having to manually define
a set of properties to do so. Instead, our algorithm
automatically learns a set of features which let the model
recognize these properties and predict their effects.

Video and Results

More concretely, we performed over 450 experiments on 10
different materials, comparing our approach to conventional
non-adaptive stiffness controllers.

Even compared to stiffness controllers tuned specifically
for a particular material, DeepMPC still gave better
performance. This is because even per-class tuning ignores
all the variation that happens within a single class of
material - for example, the thin end of a carrot is much easier
to cut than the thick end, and cutting a potato gets
much harder as the knife moves from the surface to the inside.
DeepMPC can detect these kinds of variations and adapt to them
while cutting, making for a much more robust controller equipped
to deal with real-world variety.