Solutions, Experience, Learnings and Outlook of the Amazon Robotics Challenge

29 May 2017, Monday

08:30-17:00

Co-organized with

ARC

Level 4, Rm. 4511/4512,

Marina Bay Sands Convention Centre, Singapore

The launch of the Amazon Robotics Challenge or ARC (formerly the Amazon Picking Challenge) in 2015 has shone a light on the challenging problem of item picking in an e-commerce fulfillment warehouse. We have seen many developments in all aspects of the picking problem since then. Automated item picking is a quintessential robotics problem that encompasses grasping, vision and other forms of sensing, gripper and robot design, motion planning, optimization, machine learning, software engineering, and system integration, among others.

The main aim of this workshop is to gather past and future participants of ARC and the robotics and automation community to discuss their robotic solutions, their experience of the previous competitions, and their vision for automating item picking in warehouse logistics. Attendees of the workshop will have the opportunity to see the state of the art in item picking research and development and to interact with people who are passionate about solving this complex problem during the presentations, poster discussions, and open forums. This workshop is co-organized by Amazon Robotics, the sponsor of the Amazon Robotics Challenge.

We describe our entry into the 2016 Amazon Picking Challenge (APC) and the lessons learned from deploying a complex robotic system outside of the lab. To help future developments, we decided to create a new physical benchmark challenge for robotic picking to drive scientific progress and make research into (end-to-end) picking comparable. It consists of a set of 42 common objects, a widely available shelf, and exact guidelines for object arrangement using stencils.

The goal of the Amazon Robotics Challenge is to automate pick and place operations in unstructured environments by applying the state of the art in robotics. The two areas of complexity that were expanded relative to previous competitions are storage density and the presence of objects that are unknown prior to the competition. To deal with these two factors, we use two contact-exploitative modes of manipulation together with a soft GelSight [1]-inspired elastomer that prevents possible toppling of objects. We present an overview of our approach to the picking task of the Amazon Robotics Challenge as follows: we begin with our choice of hardware, and then discuss our deep learning based vision approach. We then go into our modes of operation along with the situations they are used in, and conclude with the limitations of our chosen approach.

We identify two main challenges present in warehouse grasping automation: how to search for an object in clutter and how to manipulate an object for feasible grasping. In this work, we address the second challenge by learning a direct mapping from an image to a push action. The experimental results demonstrate that our approach outperforms the baseline and generalizes to novel cuboids.

The Amazon Robotics Challenge (ARC) is a robotics competition aimed at advancing warehouse automation. One of the engineering challenges is making the system robust enough to handle a wide variety of objects, as would be the case in a real warehouse. In this paper, we briefly describe our system used in ARC, featuring a method that uses a convolutional neural network with an RGB-D image as input to obtain object grasp poses containing the location of the object as well as the orientation for the grasp. Through our entry in ARC 2016, we show the effectiveness of our method and the robustness of our network model to a large variety of object types in dense and unstructured environments in which occlusions are possible.

Today a large research and innovation effort is devoted to automating the picking of different objects with a single flexible robot. Picking tasks are characterized by high variability of object size and shape, an a priori unknown object pose, and an imperfectly known environment. Our key idea is to introduce elasticity into the robot design, giving it the adaptivity to deal with different objects and the robustness to be effective in unstructured environments. We achieved this by employing Variable Stiffness Actuators and the Pisa/IIT SoftHand to realize a robot that took part in the Amazon Picking Challenge.

This presentation describes an integrated system for warehouse picking. The focus is on comparing different modalities of end-effectors in terms of their capabilities in grasping a variety of objects. In this context, the presentation also describes how different components of the overall system, such as pose estimation and motion, are integrated.

This work proposes a fully autonomous process to train Convolutional Neural Networks (CNNs) for object detection and pose estimation in setups for robotic manipulation. The application involves detection of objects placed in clutter and in tight environments, such as a shelf. In particular, given access to 3D object models, several aspects of the environment are simulated and the models are placed in physically realistic poses with respect to their environment to generate a labeled synthetic dataset. To further improve object detection, the network self-trains over real images that are labeled using a multi-view pose estimation process. Results show that the proposed process outperforms popular training processes relying on synthetic data generation and manual annotation.

Deep learning methods often require large annotated data sets to estimate their high numbers of parameters, which is not practical for many robotic domains. One way to mitigate this issue is to transfer features learned on large datasets to related tasks. In this work, we describe the perception system developed for the entry of team NimbRo Picking into the Amazon Picking Challenge 2016. Object detection and semantic segmentation methods are adapted to the domain, including incorporation of depth measurements. To avoid the need for large training datasets, we make use of pretrained models whenever possible, e.g. CNNs pretrained on ImageNet, and the whole DenseCap captioning pipeline pretrained on the Visual Genome Dataset. Our system performed well at the APC 2016 and reached second and third places for the stow and pick tasks, respectively.

We describe the application of a recently developed deliberative perception framework to the task of multi-object instance recognition in warehouse environments. Traditional object recognition pipelines based exclusively on discriminative feature-matching and/or statistical learners are often sensitive to inter-object occlusions and the training data used. Deliberative approaches such as PERCH treat multi-object pose estimation as a generative global optimization over possible configurations of objects, thereby predicting and accounting for occlusions. Further, D2P, an extension of PERCH, leverages guidance from modern learning-based techniques to combine the efficiency of discriminative approaches with the robustness provided by global reasoning. We conclude with a discussion of how these approaches were used by Carnegie Mellon University’s team in the 2016 Amazon Picking Challenge, and their role in the upcoming 2017 Amazon Robotics Challenge.

In this talk, we provide details of several improvements that have been made to our Pick & Place robot, which will be used in the upcoming ARC 2017 competition. A UR5 / UR10 system with a fixed base has been selected for our system. The major technological improvements being included are as follows. (1) A set of new deep learning methods combined with a hierarchical two-step strategy is employed to improve detection performance and deal with new objects that will be made available only during the competition. (2) A new gripper is being designed that combines suction and gripping to improve the ability to pick and place all kinds of objects, including non-rigid deformable objects. (3) A new grasping algorithm combines image features with depth information for grasp pose detection and computation of a suitable graspable affordance for a given object. (4) An automated system has been developed to generate the annotated templates needed for training deep networks, which require large amounts of training data. (5) The motion planning of the robot is improved using the RRT algorithm along with the Flexible Collision Library (FCL) and OctoMap to avoid collisions with obstacles. The current system is implemented using standard ROS services, and we are able to achieve a pick rate of 2-3 objects per minute.
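
For readers unfamiliar with the RRT algorithm mentioned in item (5), the following is a minimal 2D sketch of the basic idea, not the team's implementation. It assumes a point robot in a square workspace with circular obstacles; the function name `rrt`, its parameters, and the obstacle representation are all hypothetical, and a real system would delegate collision checks to a library such as FCL over an OctoMap rather than the hand-written test used here.

```python
import math
import random


def rrt(start, goal, obstacles, bounds=(0.0, 10.0), step=0.5,
        goal_tol=0.5, goal_bias=0.1, max_iters=5000, seed=0):
    """Grow a rapidly-exploring random tree from start toward goal.

    obstacles is a list of (cx, cy, radius) circles (an assumption made
    for this sketch). Returns a list of 2D points from start to a node
    within goal_tol of goal, or None if no path was found.
    """
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}  # index of each node's parent in the tree

    def collides(p):
        # Hand-written point-in-circle test standing in for a real checker.
        return any(math.dist(p, (cx, cy)) <= r for cx, cy, r in obstacles)

    for _ in range(max_iters):
        # Sample the goal occasionally, otherwise a uniform random point.
        if rng.random() < goal_bias:
            target = goal
        else:
            target = (rng.uniform(*bounds), rng.uniform(*bounds))

        # Find the tree node nearest to the sample (linear scan).
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue

        # Extend from the nearest node toward the sample by at most `step`.
        scale = min(step, d) / d
        new = (near[0] + (target[0] - near[0]) * scale,
               near[1] + (target[1] - near[1]) * scale)

        # Coarse edge check: test the new point and the edge midpoint.
        mid = ((near[0] + new[0]) / 2.0, (near[1] + new[1]) / 2.0)
        if collides(new) or collides(mid):
            continue

        nodes.append(new)
        parent[len(nodes) - 1] = i_near

        # Stop once the tree reaches the goal region; walk back to the root.
        if math.dist(new, goal) <= goal_tol:
            path = []
            i = len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


if __name__ == "__main__":
    path = rrt((1.0, 1.0), (9.0, 9.0), [(5.0, 5.0, 1.5)])
    print("no path found" if path is None else f"path with {len(path)} waypoints")
```

The path returned by plain RRT is feasible but usually jagged, so practical planners typically follow it with a smoothing or shortcutting pass.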

Developing a successful warehouse picking robot requires an integrative approach across multiple robotics disciplines, including hardware design, distributed systems integration, machine learning, grasp planning, and motion planning. Like other robot systems projects, it also requires frequent testing and debugging. Our university team is currently addressing this problem for the Amazon Robotics Challenge 2017 by emphasizing the minimization of design-debug-test cycle times. This is achieved via the use of rapid hardware prototyping, robot simulation tools, and a software infrastructure built around centralized persistent state and pervasive visualization.

In this work we summarize the solution developed by Team KTH for the Amazon Picking Challenge 2016 in Leipzig, Germany. The competition simulated a warehouse automation scenario and was divided into two tasks: a picking task, where a robot picks items from a shelf and places them in a tote, and a stowing task, the inverse, where the robot picks items from a tote and places them on a shelf. We describe our approach to the problem, starting from a high-level overview of our system and later delving into the details of our perception pipeline and our strategy for manipulation and grasping. The solution was implemented using a Baxter robot equipped with additional sensors.

We will present our framework used in the Amazon Picking Challenge in 2015 and some of the lessons learned that may prove useful to researchers and future teams participating in the competition. The competition proved to be a very useful occasion to integrate the work of various researchers at the Robotics, Perception and Learning laboratory of KTH, measure the performance of our robotics system and define the future direction of our research.

This paper describes the vision-based robotic picking system that was developed by our team, Team Applied Robotics, for the Amazon Picking Challenge 2016. This competition challenged teams to develop a robotic system that is able to pick a large variety of products from a shelf or a tote. We discuss the design considerations and our strategy, the high-resolution 3D vision system, the use of a combination of texture- and shape-based object detection algorithms, and the robot path planning and object manipulators that were developed.

We present the architecture of Team Nanyang's robotic picker for the 2015 Amazon Picking Challenge. Using the competition rules and scoring system, we discuss how the team came up with the design architecture, from the hardware choices to the vision strategy, the grasping strategy, and the gripper design. We also talk about the trade-offs and the calculated risks that we took with the design that we chose. We will conclude the talk with some lessons we learned from participating in the challenge.

The Amazon Robotics Challenge has become one of the biggest robotic challenges in the field of warehouse automation and manipulation. In this paper, we present an overview of the materials available for newcomers to the challenge, describe what we learned from the previous editions, and discuss the new challenges within the Amazon Robotics Challenge 2017. We also outline how we developed our solution, the results of an investigation into suction cup size, and some notable difficulties we encountered along the way. Our aim is to speed up development for those who come after us and who, as first-time contenders like us, have to develop a solution from zero.

Our team participated in the 2015 picking challenge with a 7DOF Barrett WAM arm and hand. Our robot was placed on a fixed base and had difficulty reaching and grasping objects deep inside the shelf bins. Given limited resources, we came up with a simple design: a custom-made mechanism to bring objects to the edge of the shelf bins. During trials we explored both image-based visual servoing and Kinect RGBD vision. The former, while precise, relied on fragile video tracking. The final system used open-loop RGBD vision and readily available open-source tools. In this article we log our strategy and the lessons we have learned.

In this paper, we describe the problem definition and system construction for automatic loading / unloading of consumer products by a dual-arm robot. We define the loading / unloading task as a combination of three elements: installation localization, object recognition, and robot planning. We also propose a hardware configuration and software components to manipulate a wide variety of items. The proposed system was evaluated in a warehouse environment with a variety of items and in a store environment with multiple items, and the results show its ability to handle a wide variety of items in the warehouse and its reliability in long-term continuous operation in the store.

In this talk, we'll summarize our verification-based recognition and manipulation system with a failure-recovery architecture for APC 2015 and 2016, as well as ongoing efforts for the upcoming ARC 2017. These challenges span a variety of robotics research topics and have shifted the trend in manipulation research from classic table-top object manipulation to cluttered multiple objects within narrow spaces, which inspired us to take up new research challenges. We'll discuss our post-challenge topics, including learning-based failure prediction, integration of suction and dexterous hand, re-reinforcement learning of perception, and so on.

Team Delft’s robotic system won both the Picking and Stowing Competitions at the Amazon Picking Challenge 2016. The goal of the challenge is to automate pick and place operations in unstructured environments, specifically the (representative) shelves of an Amazon warehouse. Team Delft’s robot is based on an industrial robot arm, 3D cameras and a customized gripper. The robot’s software uses ROS to integrate off-the-shelf components and modules developed specifically for the competition such as implementing Deep Learning and other AI techniques for object recognition and pose estimation, grasp planning and motion planning. This paper provides an overview of the main functional components of the system, and discusses their performance and results at the Amazon Picking Challenge 2016 finals.

Every day, Amazon is able to quickly pick, pack and ship millions of items to customers from a network of fulfillment centers all over the globe. This wouldn't be possible without leveraging cutting-edge advances in technology. Amazon Robotics sponsors the Amazon Picking Challenge to spur the advancement of automated picking in unstructured environments. This talk will describe lessons learned from the 2016 challenge, the changes made for 2017, and some key elements for the future.

The main goal of this talk is to motivate the need to move from an open-loop approach to grasp planning to a view of grasping as a dynamic reactive process. For the most part, all teams in the 2015 and 2016 Amazon Picking Challenge have developed open-loop solutions that grasp or suction accessible objects. While this has proven sufficient to accomplish the tasks in the first two iterations of the Challenge, grasping of cluttered/packed/occluded/constrained objects still remains an unsolved problem. The future of the Amazon Robotics Challenge can play a decisive role in fostering developments towards a more reactive and dynamic approach to grasping.