Publications

We present an approach for one-shot learning from a video of a human by using human and robot demonstration data from a variety of previous tasks to build up prior knowledge through meta-learning. Then, combining this prior knowledge and only a single video demonstration from a human, the robot can perform the task that the human demonstrated. We show experiments on both a PR2 arm and a Sawyer arm, demonstrating that after meta-learning, the robot can learn to place, push, and pick-and-place new objects using just one video of a human performing the manipulation.
arXiv

*, Zoe McCarthy*, Owen Jow, Dennis Lee, Ken Goldberg, Pieter Abbeel

In the IEEE International Conference on Robotics and Automation (ICRA), 2018.

We describe how consumer-grade virtual reality headsets and hand tracking hardware can be used to naturally teleoperate robots to perform complex tasks. We also describe how imitation learning can learn deep neural network policies (mapping from pixels to actions) that can acquire the demonstrated skills. Our experiments showcase the effectiveness of our approach for learning visuomotor skills.
arXiv
Video
Website

One-Shot Visual Imitation Learning via Meta-Learning

Chelsea Finn*, Tianhe Yu*, , Pieter Abbeel, Sergey Levine

In the 1st Annual Conference on Robot Learning (CoRL), 2017.

We present a meta-imitation learning method that enables a robot to learn how to learn more efficiently, allowing it to acquire new skills from just a single demonstration. Unlike prior methods for one-shot imitation, our method can scale to raw pixel inputs and requires data from significantly fewer prior tasks for effective learning of new skills. Our experiments on both simulated and real robot platforms demonstrate the ability to learn new tasks, end-to-end, from a single visual demonstration.
arXiv
Video
Website

Learning from the Hindsight Plan -- Episodic MPC Improvement

Aviv Tamar, Garrett Thomas, , Sergey Levine, Pieter Abbeel

In the IEEE International Conference on Robotics and Automation (ICRA), 2017.

Model predictive control (MPC) is an effective control method, but practical constraints limit it to planning over a short horizon. We propose a general policy improvement scheme for MPC, hindsight iterative MPC (HIMPC), which incorporates long-term reasoning into short-horizon MPC planning and demonstrates superior empirical performance in simulated and real contact-rich manipulation tasks.
arXiv
Video
Website

PLATO: Policy Learning using Adaptive Trajectory Optimization

Gregory Kahn, , Sergey Levine, Pieter Abbeel

In the IEEE International Conference on Robotics and Automation (ICRA), 2017.

We propose PLATO, an algorithm that trains complex control policies with supervised learning, using model-predictive control (MPC) to generate the supervision. PLATO uses an adaptive training method to modify the behavior of MPC to gradually match the learned policy, in order to generate training samples at states that are likely to be visited by the policy while avoiding highly undesirable on-policy actions. We prove that this type of adaptive MPC expert produces supervision that leads to good long-horizon performance of the resulting policy, and empirically demonstrate that MPC can still avoid dangerous on-policy actions in unexpected situations during training.
arXiv
Video
Website
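
A minimal sketch of the idea summarized in the PLATO abstract above, not the paper's implementation. It assumes a hypothetical 1D system (x' = x + u), a one-step quadratic cost standing in for MPC, a linear policy u = k * x, and a squared-deviation penalty (weight `lam`) standing in for whatever measure the method uses to keep MPC close to the learned policy. It also assumes that the supervised labels come from the unpenalized MPC solution at the visited states. All names (`mpc_action`, `fit_gain`, `train`) are illustrative.

```python
import random


def mpc_action(x, u_policy, lam, r=0.1):
    """One-step stand-in for MPC on the toy system x' = x + u.

    Minimizes (x + u)^2 + r * u^2 + lam * (u - u_policy)^2 in closed form.
    With lam = 0 this is the plain (unadapted) controller.
    """
    return (lam * u_policy - x) / (1.0 + r + lam)


def fit_gain(states, actions):
    """Least-squares fit of a linear policy u = k * x (supervised learning step)."""
    sxx = sum(x * x for x in states)
    return sum(x * u for x, u in zip(states, actions)) / sxx


def train(iterations=5, steps=20, lam=1.0, seed=0):
    rng = random.Random(seed)
    k = 0.0  # initial policy gain (assumption: start from the zero policy)
    states, labels = [], []
    for _ in range(iterations):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(steps):
            # Roll out the adapted controller, which is pulled toward the policy,
            # so that visited states resemble those the policy would reach.
            u_rollout = mpc_action(x, k * x, lam)
            states.append(x)
            # Label each visited state with the unadapted controller's action.
            labels.append(mpc_action(x, 0.0, 0.0))
            x = x + u_rollout + rng.gauss(0.0, 0.01)
        k = fit_gain(states, labels)  # retrain on all data collected so far
    return k


if __name__ == "__main__":
    print("learned gain:", train())
```

In this toy setting the labels are exactly linear in the state, so the fitted gain matches the unadapted controller's gain of -1/1.1; the point of the sketch is only the data-collection loop, where the penalty weight trades off task cost against staying near the current policy.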

, Gregory Kahn, Sergey Levine, Pieter Abbeel

Model predictive control (MPC) is crucial for underactuated systems such as autonomous aerial vehicles,
but its application can be computationally demanding.
We propose to combine MPC with reinforcement learning in the framework of guided policy search (GPS).
The resulting neural network policy can successfully control the robot at a fraction of the computational cost of MPC.
arXiv
Video
Slides

Projects

Towards Stochastic Neural Network Control Policies

Deterministic neural networks (DNNs) have been shown to be effective policies with good generalization in robot control. However, their determinism restricts DNNs to modelling only uni-modal controls. Training DNNs on multi-modal controls can be inefficient or produce meaningless results. In contrast, stochastic neural networks (SNNs) are able to learn one-to-many mappings. In this paper, we introduce SNNs as control policies and extend an existing learning algorithm for feed-forward SNNs to recurrent ones.
PDF
Poster

Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN

Craig Hiller, David Zhang, , Zihao Zhang

UC Berkeley CS280 (Computer Vision) Final Project (Spring 2015).

We propose an algorithm to automatically identify window regions on exterior-facing building facades in a colored 3D point cloud generated using data captured from an ambulatory backpack sensor system outfitted with multiple LiDAR sensors and cameras. Our work is based on an R-CNN-inspired algorithm with novel filtering and preprocessing techniques. We use multiscale combinatorial grouping (MCG) for region proposal generation, pass the proposals to a convolutional neural network (CNN), and train a random forest with the CNN output vectors.
PDF