With the increased use of high degree-of-freedom robots that must perform tasks in real-time, there is a need for fast algorithms for motion planning. In this work, we view motion planning from a probabilistic perspective. We consider smooth continuous-time trajectories as samples from a Gaussian process (GP) and formulate the planning problem as probabilistic inference. We use factor graphs and numerical optimization to perform inference quickly, and we show how GP interpolation can further increase the speed of the algorithm. Our framework also allows us to incrementally update the solution of the planning problem to contend with changing conditions. We benchmark our algorithm against several recent trajectory optimization algorithms on planning problems in multiple environments. Our evaluation reveals that our approach is several times faster than previous algorithms while retaining robustness. Finally, we demonstrate the incremental version of our algorithm on replanning problems, and show that it can often find successful solutions in a fraction of the time required to replan from scratch.
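The trajectory-as-inference view above can be illustrated with a toy gradient-descent sketch (not the paper's factor-graph implementation; the obstacle, costs, and step sizes below are assumptions chosen for illustration): a discretized trajectory is pushed toward low finite-difference acceleration, the discrete analogue of a GP smoothness prior, while a hinge-loss obstacle term pushes waypoints clear of a single circular obstacle.

```python
import numpy as np

def obstacle_cost(q, center, radius, eps=0.2):
    # Hinge loss on distance to one circular obstacle (assumed scenario).
    d = np.linalg.norm(q - center, axis=1)
    return np.maximum(0.0, radius + eps - d)

def plan(start, goal, center, radius, n=30, iters=500, lr=0.01, w_obs=1.0):
    # Straight-line initialization between fixed endpoints.
    traj = np.linspace(start, goal, n)
    for _ in range(iters):
        # Gradient of the smoothness term: discrete 4th derivative arising
        # from the squared finite-difference acceleration prior.
        grad = np.zeros_like(traj)
        acc = traj[:-2] - 2 * traj[1:-1] + traj[2:]
        grad[:-2] += acc
        grad[1:-1] += -2 * acc
        grad[2:] += acc
        # Gradient of the hinge obstacle term: push active points away
        # from the obstacle center along the radial direction.
        d = traj - center
        dist = np.linalg.norm(d, axis=1, keepdims=True)
        active = (obstacle_cost(traj, center, radius) > 0)[:, None]
        grad += w_obs * active * (-d / np.maximum(dist, 1e-9))
        grad[0] = grad[-1] = 0.0  # keep start and goal fixed
        traj -= lr * grad
    return traj

obstacle_center = np.array([0.5, 0.05])
traj = plan(np.array([0.0, 0.0]), np.array([1.0, 0.0]),
            center=obstacle_center, radius=0.15)
```

Because the endpoints are held fixed, the optimizer deforms only the interior of the trajectory; the paper's approach additionally exploits factor-graph sparsity and GP interpolation for speed.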

We present a new algorithm for task and motion planning (TMP) and discuss the requirements and abstractions necessary to obtain robust solutions for TMP in general. Our Iteratively Deepened Task and Motion Planning (IDTMP) method is probabilistically-complete and offers improved performance and generality compared to a similar, state-of-the-art, probabilistically-complete planner. The key idea of IDTMP is to leverage incremental constraint solving to efficiently add and remove constraints on motion feasibility at the task level. We validate IDTMP on a physical manipulator and evaluate scalability on scenarios with many objects and long plans, showing order-of-magnitude gains compared to the benchmark planner and a fourfold speedup from our extensions in self-comparison tests. Finally, in addition to describing a new method for TMP and its implementation on a physical robot, we also put forward requirements and abstractions for the development of similar planners in the future.

Roadmaps constructed by many sampling-based motion planners coincide, in the absence of obstacles, with standard models of random geometric graphs (RGGs). Those models have been studied for several decades and by now a rich body of literature exists analyzing various properties and types of RGGs. In their seminal work on optimal motion planning, Karaman and Frazzoli [31] conjectured that a sampling-based planner has a certain property if the underlying RGG has this property as well. In this paper we settle this conjecture and leverage it for the development of a general framework for the analysis of sampling-based planners. Our framework, which we call localization-tessellation, allows for easy transfer of arguments on RGGs from the free unit-hypercube to spaces punctured by obstacles, which are geometrically and topologically much more complex. We demonstrate its power by providing alternative and (arguably) simple proofs for probabilistic completeness and asymptotic (near-)optimality of probabilistic roadmaps (PRMs). Furthermore, we introduce two variants of PRMs, analyze them using our framework, and discuss the implications of the analysis.
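The RGG model underlying this analysis is easy to instantiate. The sketch below (parameters are assumptions for illustration) samples n points uniformly in the unit square, connects pairs within a radius r(n) proportional to sqrt(log n / n), the regime in which RGGs are known to be connected with high probability, and checks connectivity by breadth-first search.

```python
import numpy as np
from collections import deque

def random_geometric_graph(n, r, rng):
    # n uniform samples in the unit square; edge iff pairwise distance <= r.
    pts = rng.random((n, 2))
    diff = pts[:, None, :] - pts[None, :, :]
    adj = np.linalg.norm(diff, axis=2) <= r
    np.fill_diagonal(adj, False)
    return pts, adj

def is_connected(adj):
    # Breadth-first search from vertex 0.
    n = len(adj)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in np.flatnonzero(adj[u]):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n

rng = np.random.default_rng(0)
n = 300
r = 2.0 * np.sqrt(np.log(n) / n)  # above the connectivity threshold regime
_, adj = random_geometric_graph(n, r, rng)
print(is_connected(adj))
```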

We describe a process that constructs robot-specific circuitry for motion planning, capable of generating motion plans approximately three orders of magnitude faster than existing methods. Our method is based on building collision detection circuits for a probabilistic roadmap. Collision detection for the roadmap edges is completely parallelized, so that the time to determine which edges are in collision is independent of the number of edges. We demonstrate planning using a 6-degree-of-freedom robot arm in less than 1 millisecond.

In this paper we present an inverse optimal control based transfer of motions from human experiments to humanoid robots and apply it to walking in constrained environments. To this end we introduce a 3D template model, which describes motion on the basis of center of mass trajectory, foot trajectories, upper body orientation and phase duration. Despite its abstract architecture, with prismatic joints combined with damped series elastic actuators instead of knees, the model (including dynamics and constraints) is suitable to describe both human and humanoid locomotion with appropriate parameters. We present and apply an inverse optimal control approach to identify optimality criteria based on human motion capture experiments. The identified optimal strategy is then transferred to the humanoid robot for gait generation by solving an optimal control problem, which takes into account the properties of the robot and differences in the environment. The results of this approach are the center of mass trajectory, the foot trajectories, the torso orientation, and the single and double support phase durations for a sequence of multiple steps allowing the humanoid robot to walk within a new environment. We present one computational cycle (from motion capture data to an optimized robot motion) for the example of walking over irregular step stones, with the aim of transferring the motion to two very different humanoid robots (iCub Heidelberg01 and HRP-2 14). The transfer of these optimized robot motions to the real robots by means of inverse kinematics is work in progress and not part of this paper.

Gaits are crucial to the performance of locomotors. However, it is often difficult to design effective gaits for complex locomotors. Geometric mechanics offers powerful gait design tools, but the utility of these tools has been limited to systems with two joints. Using shape basis functions, it is possible to approximate the kinematics of complex locomotors using only two shape variables. As a result, the tools of geometric mechanics can be used to study complex locomotion in an intuitive way. The choice of shape basis functions plays an important role in determining gait kinematics, and therefore the performance of a locomotor. To find appropriate basis functions, we introduce the shape basis optimization algorithm, an algorithm that iteratively improves basis functions to find effective kinematic programs. Applying this algorithm to a snake robot resulted in a novel gait, which improved the robot's swimming speed in granular materials.

Physiological measures, such as pain, anxiety, effort, or energy consumption, play a crucial role in the evaluation and development of assistive robotic devices. Physiological data are collected and analyzed by researchers and clinicians, and are often used to inform an iterative tuning process of a device and its controller. Currently, these data are collected and then analyzed offline, such that they are only evaluated after the experiment has ended. This makes any iterative design process tedious and time-consuming, since tuning must be done on a subject-by-subject basis and for the variety of tasks that the device is intended to be used for. To overcome these drawbacks, we propose a new type of human-machine interaction that is based on measuring and using physiological measurements in real time. By continuously monitoring a physiological objective through a set of suitable sensors, we propose conducting an optimization of a set of controller parameters that shape the assistance provided by the device. In other words, we pose an optimization that includes the human body in the loop. This Body-in-the-Loop optimization allows for optimal subject-specific control and has the potential to be used for controller adaptation to changing environments. We validated this concept in an extensive human subject study in which we autonomously optimized the actuation onset of a pair of bilateral ankle exoskeletons to minimize the user's metabolic effort.
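The body-in-the-loop idea can be sketched as a black-box one-dimensional optimization (the cost model and parameter range below are stand-in assumptions, not the study's data): the measured physiological objective is treated as an unknown function of a single controller parameter, here an actuation onset in percent of gait cycle, and minimized with golden-section search.

```python
import math

def metabolic_cost(onset):
    # Stand-in for a real-time physiological estimate; assumed quadratic
    # bowl with its minimum at an onset of 43% of the gait cycle.
    return 4.0 + 0.002 * (onset - 43.0) ** 2

def golden_section(f, lo, hi, tol=0.1):
    # Standard golden-section search for a unimodal minimum on [lo, hi].
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

best_onset = golden_section(metabolic_cost, 10.0, 60.0)
print(best_onset)
```

In the real system each evaluation of the objective would require the subject to walk long enough to obtain a steady-state metabolic estimate, which is what makes sample-efficient search methods attractive.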

In this work, we use first principles of kinematics to provide a fundamental insight into mechanical power distribution within multi-actuator machines. Individual actuator powers—not their net sum—determine the efficiency and actuator size of a multi-joint machine. Net power delivered to the environment naturally discards important information about how that power is generated. For example, simultaneous positive and negative powers will cancel within a mechanism, wasting energy and raising peak power requirements. The same effect can bias power draw toward a single actuator while the other actuators do zero work. In general, it is best for all actuators to contribute equally to the net power demand because balance minimizes the mechanical power requirements of individual actuators. In this paper, we present the actuation power space, within which we measure the antagonism in a machine (joints working against each other due to kinematic constraints). We show the difference between the net power consumed by the task and the total power supplied by the actuators. We derive the power quality measure as a smooth objective function which encodes both antagonism and the balance of power between actuators. As a demonstration of our general framework, we apply our technique to a legged-robot design to find improved kinematics for performing a running gait. This technique finds mechanisms with optimal power distribution, regardless of actuator choice or loss models, so it can be applied early in the design process using mechanism kinematics alone. After choosing appropriate kinematics for an application, designers can independently optimize each actuator in a design to minimize local losses.
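The distinction between net and total power can be made concrete in a few lines (the joint powers and the simple antagonism measure below are illustrative assumptions; the paper defines antagonism more generally in its actuation power space):

```python
import numpy as np

def net_power(P):
    # Power delivered to the environment: signed powers cancel.
    return np.sum(P, axis=-1)

def total_actuator_power(P):
    # Each actuator pays for the magnitude of its own power flow;
    # simultaneous positive and negative powers do not cancel.
    return np.sum(np.abs(P), axis=-1)

# Two actuators at one instant: joint 1 does +50 W while joint 2 absorbs -30 W.
P = np.array([50.0, -30.0])
print(net_power(P))               # 20 W reaches the environment
print(total_actuator_power(P))    # 80 W must be supplied/absorbed

# One simple proxy for antagonism: power circulating inside the mechanism.
antagonism = total_actuator_power(P) - abs(net_power(P))
print(antagonism)                 # 60 W wasted by actuators fighting
```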

The sophisticated and intricate connection between bat morphology and flight capabilities makes it challenging to employ conventional flying robots to replicate the aerial locomotion of these creatures. In recent work, a bat-inspired soft Micro Aerial Vehicle (MAV) called Bat Bot (B2) with five Degrees of Actuation (DoA) has been constructed to mimic the flight behavior of a biological bat. This reduced kinematic complexity entails major differences in structural topology, so it is necessary to find the dimensions of B2's structure and the behavior of its actuators such that the wingbeat cycle of B2 closely mimics that of a biological bat. The current work assumes the previously designed structure of B2 and presents a synergistic design approach to imitate the kinematic synergies of a biological bat. Recent findings have revealed that the most dominant synergies in a biological bat can be combined to accurately represent the original kinematic movement, thereby reducing its dimensional complexity. In this work, Principal Component Analysis (PCA) has been employed in order to extract dominant principal components of biological bat flight kinematics. Thereafter, the first and second principal components are chosen to shape the parametric kinematics and actuator trajectories of B2 through finite state nonlinear constrained optimization. The method yields a robot mechanism that, despite having only a few DoAs, possesses several biologically meaningful morphing specializations.
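The PCA step can be sketched on synthetic data (the latent oscillations, joint count, and noise level below are assumptions, not bat recordings): when joint-angle trajectories are driven by two underlying synergies, the first two singular vectors of the mean-centered data recover essentially all of the variance.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 200)

# Assumed 6-joint system driven by two latent oscillations plus small noise.
latent = np.stack([np.sin(t), np.cos(2 * t)], axis=1)      # shape (200, 2)
mixing = rng.standard_normal((2, 6))                       # synergy weights
angles = latent @ mixing + 0.01 * rng.standard_normal((200, 6))

# PCA via the SVD of the mean-centered data matrix.
X = angles - angles.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
explained = (s ** 2) / np.sum(s ** 2)
print(explained[:2].sum())   # first two PCs capture nearly all variance
```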

In this study, we present a framework for phase-space planning and control of agile bipedal locomotion while robustly tracking a set of non-periodic keyframes. By using a reduced-order model, we formulate a hybrid planning framework where the center-of-mass motion is constrained to a general surface manifold. This framework also introduces phase-space bundles to characterize robustness and a robust hybrid automaton to effectively design planning algorithms. A newly defined phase-space locomotion manifold is used as a Riemannian metric to measure the distance between the disturbed state and the planned manifold. Based on this metric, a dynamic programming based hybrid controller is introduced to produce robust locomotions. The robustness of the proposed framework is validated by using simulations of rough terrain locomotion recovery from external disturbances. Additionally, the agility of this framework is demonstrated by using simulations of dynamic locomotion over random rough terrains.

Although the design of fish-like robots has attracted much research in recent years, new techniques are still required for developing superior robot fish. This paper presents a novel robot fish propelled by an active compliant propulsion mechanism. The key innovation is the combination of an active wire-driven mechanism with a soft compliant tail to accomplish subcarangiform swimming (i.e., swimming with an "S" motion). Additionally, an effective design methodology based on a 3R pseudo-rigid-body model is proposed to design the compliant tail. First, the mathematical model based on the design methodology was derived and computer simulations were performed. Second, a prototype was fabricated and numerous experimental studies were conducted. The experiments demonstrated that the predictions of the mathematical model matched the testing results. Compared with existing robot fish that use a multi-link structure with each joint actuated by its own motor, the new robot fish has several advantages: it is simple in structure, easy to control, and capable of high-speed and maneuverable swimming; the maximum swimming speed reached 2.15 body lengths per second and the instantaneous maximum turning speed was 269°/s. Furthermore, the design methodology presented in this paper can also be used in other applications such as flexible probes for medical devices and soft manipulators.

Snake robots can contact their environments along their whole bodies. This distributed contact makes them versatile and robust locomotors, but also makes controlling them a challenging problem involving high-dimensional configuration spaces, with no direct way to break their motion down into "driving" and "steering" actions. In this paper, we use concepts from geometric mechanics—e.g., expanded Lie bracket analysis—to simplify the problem of controlling a snake robot moving across a granular surface. Without needing force laws that model the interaction of the snake robot with the granular surface, the relationship between shape and body velocities can be experimentally derived by perturbing the robot's shape from a sampling of initial configurations, which allows us to: 1. Generate an intuitive and visualizable relationship between gait cycles and the motion they induce; 2. Make accurate predictions about the most efficient gaits available to the robot; and 3. Identify an effective turning gait for the robot that to the best of our knowledge has not previously appeared in the snake robot literature. This geometric analysis of snake robot locomotion serves as a demonstration of how differential-geometric tools can provide insight into the motion of systems that do not have the analytic models often associated with such approaches.

Many applications in robotics such as registration, object tracking, sensor calibration, etc. use Kalman filters to estimate a time-invariant SE(3) element by locally linearizing a non-linear measurement model. Linearization-based filters tend to suffer from inaccurate estimates, and in some cases divergence, in the presence of large initialization errors. In this work, we use a dual quaternion to represent the SE(3) element and use multiple measurements simultaneously to rewrite the measurement model in a truly linear form with state-dependent measurement noise. Use of the linear measurement model bypasses the need for any linearization in the Kalman filter equations, resulting in accurate estimates that are less sensitive to initial estimation error. To show the broad applicability of this approach, we derive linear measurement models for applications that use either position measurements or pose measurements. A procedure to estimate the state-dependent measurement uncertainty is also discussed. The efficacy of the formulation is illustrated using simulations and hardware experiments for two applications in robotics: rigid registration and sensor calibration.
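A minimal dual-quaternion sketch (illustrative of the representation only, not the paper's filter) shows the algebra involved: a pose (R, t) is encoded as the pair (qr, qd) with qd = 0.5 (0, t) qr, poses compose by dual-quaternion multiplication, and the translation is recovered as 2 qd qr*.

```python
import numpy as np

def qmul(p, q):
    # Hamilton product, quaternions stored as (w, x, y, z).
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def qconj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def dq_from_pose(qr, t):
    # Dual part encodes the translation: qd = 0.5 * (0, t) * qr.
    qd = 0.5 * qmul(np.array([0.0, *t]), qr)
    return qr, qd

def dq_mul(a, b):
    # (qr1, qd1) * (qr2, qd2) = (qr1 qr2, qr1 qd2 + qd1 qr2).
    return qmul(a[0], b[0]), qmul(a[0], b[1]) + qmul(a[1], b[0])

def dq_translation(dq):
    # t = vector part of 2 * qd * conj(qr).
    return 2.0 * qmul(dq[1], qconj(dq[0]))[1:]

# 90-degree rotation about z, composed with a +x translation: the +x step
# is carried out in the rotated frame, so it lands on the +y axis.
half = np.sqrt(0.5)
rot = dq_from_pose(np.array([half, 0.0, 0.0, half]), [0.0, 0.0, 0.0])
trans = dq_from_pose(np.array([1.0, 0.0, 0.0, 0.0]), [1.0, 0.0, 0.0])
combined = dq_mul(rot, trans)
print(dq_translation(combined))
```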

In recent years, the topic of multi-robot systems has become very popular. These systems have been demonstrated in various applications, including exploration, construction, and warehouse operations. In order for the whole system to function properly, sensor calibrations such as determining the camera frame relative to the IMU frame are important. Compared to the traditional hand-eye and robot-world calibration, a relatively new problem called the AXB = YCZ calibration problem arises in the multi-robot scenario, where A, B, and C are rigid-body transformations measured from sensors and X, Y, and Z are unknown transformations to be calibrated. Several solvers have been proposed previously in different application areas that can solve for X, Y, and Z simultaneously. However, all of the solvers assume a priori knowledge of the exact correspondence among the data streams {Ai}, {Bi}, and {Ci}. While that assumption may be justified in some scenarios, in the application domain of multi-robot systems, which may use ad hoc and asynchronous communication protocols, knowledge of this correspondence generally cannot be assumed. Moreover, the existing methods in the literature require good initial estimates that are not always easy or possible to obtain. In this paper, we propose two probabilistic approaches that can solve the AXB = YCZ problem without a priori knowledge of the correspondence of the data. In addition, no initial estimates are required for recovering X, Y, and Z. These methods are particularly well suited for multi-robot systems, and also apply to other areas of robotics in which AXB = YCZ arises.

For a given robot and a given task, this paper addresses questions about which modifications may be made to the robot's suite of sensors without impacting the robot's behavior in completing its task. Though this is an important design-time question, few principled methods exist for providing a definitive answer in general. Utilizing and extending the language of combinatorial filters, this paper aims to fill that lacuna by introducing theoretical tools for reasoning about sensors and representations of sensors. It introduces new representations for sensors and filters, exploring the relationship between those elements and the specific information needed to perform a task. It then shows how these tools can be used to algorithmically answer questions about changes to a robot's sensor suite. The paper substantially expands the expressiveness of combinatorial filters so that, where they were previously limited to quite simple sensors, our richer filters are able to reasonably model a much broader variety of real devices. We have implemented the proposed algorithms, and describe their application to an example instance involving a series of simplifications to the sensors of a specific, widely deployed mobile robot.

This paper provides a new fully-decentralized algorithm for Collaborative Localization based on the extended Kalman filter. The major challenge in decentralized collaborative localization is to track inter-robot dependencies – which is particularly difficult in situations where sustained synchronous communication between robots cannot be guaranteed. Current approaches suffer from the need for particular communication schemes, extensive bookkeeping of measurements, overly-conservative assumptions, or the restriction to specific measurement models. To the best of our knowledge, the algorithm we propose in this paper is the first one that tracks inter-robot correlations while fulfilling all of the following relevant conditions: communication is limited to two robots that obtain a relative measurement, the algorithm is recursive in the sense that it does not require storage of measurements and each robot maintains only the latest estimate of its own pose, and it supports generic measurement models. These particularly hard conditions make the approach applicable to a wide range of multi-robot applications. Extensive experiments carried out using real world datasets demonstrate the improved performance of our method compared to several existing approaches.

Safe control of dynamical systems that satisfy temporal invariants expressing various safety properties is a challenging problem that has drawn the attention of many researchers. However, the assumption that such temporal properties are deterministic is far from reality. For example, a robotic system might employ a camera sensor and a machine-learned system to identify obstacles. Consequently, the safety properties the controller has to satisfy will be a function of the sensor data and the associated classifier. We propose a framework for achieving safe control. At the heart of our approach is the new Probabilistic Signal Temporal Logic (PrSTL), an expressive language to define stochastic properties and enforce probabilistic guarantees on them. We also present an efficient algorithm to reason about safe controllers given the constraints derived from the PrSTL specification. One of the key distinguishing features of PrSTL is that the encoded logic is adaptive and changes as the system encounters additional data and updates its beliefs about the latent random variables that define the safety properties. We demonstrate our approach by deriving safe control of quadrotors and autonomous vehicles in dynamic environments.
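For Gaussian uncertainty, chance constraints of the kind PrSTL encodes admit the standard deterministic reformulation: P(a^T x >= b) >= 1 - eps with a ~ N(mu, Sigma) holds iff mu^T x - Phi^{-1}(1-eps) * sqrt(x^T Sigma x) >= b. The sketch below (all numbers are assumptions) checks the deterministic surrogate and validates the probabilistic guarantee by Monte Carlo.

```python
import numpy as np

Z_95 = 1.6449  # Phi^{-1}(0.95), the standard normal 95% quantile

def satisfies_chance_constraint(x, mu, Sigma, b, z=Z_95):
    # Deterministic surrogate of P(a^T x >= b) >= 0.95 for a ~ N(mu, Sigma).
    margin = mu @ x - z * np.sqrt(x @ Sigma @ x)
    return margin >= b

mu = np.array([1.0, 0.5])
Sigma = np.array([[0.02, 0.0], [0.0, 0.02]])
x = np.array([1.0, 1.0])
b = 1.1

print(satisfies_chance_constraint(x, mu, Sigma, b))

# Monte Carlo check of the probabilistic guarantee:
rng = np.random.default_rng(2)
samples = rng.multivariate_normal(mu, Sigma, size=100_000)
print(np.mean(samples @ x >= b))   # empirical probability, at least 0.95
```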

Magnetic field gradients have repeatedly been shown to be the most feasible mechanism for gastrointestinal capsule endoscope actuation. An inverse quartic magnetic force variation with distance results in large force gradients induced by small movements of a driving magnet; this necessitates robotic actuation of magnets to implement stable control of the device. A typical system consists of a serial robot with a permanent magnet at its end effector that actuates a capsule with an embedded permanent magnet. We present a tethered capsule system in which a capsule with an embedded magnet is closed-loop controlled in two degrees of freedom in position and two degrees of freedom in orientation. Capitalizing on the magnetic field of the external driving permanent magnet, the capsule is localized in 6-D, allowing for both position and orientation feedback to be used in a control scheme. We developed a relationship between the serial robot's joint parameters and the magnetic force and torque that is exerted onto the capsule. Our methodology was validated both in a dynamic simulation environment, for which a custom plug-in for magnetic interaction was written, and on an experimental platform. The tethered capsule was demonstrated to follow desired trajectories in both position and orientation with accuracy that is acceptable for colonoscopy.
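The inverse quartic scaling mentioned above can be checked directly from the coaxial point-dipole model, F = 3 mu0 m1 m2 / (2 pi d^4) (the dipole moments and distances below are illustrative assumptions):

```python
import math

MU0 = 4e-7 * math.pi  # vacuum permeability, T*m/A

def coaxial_dipole_force(m1, m2, d):
    # On-axis force between two coaxial point dipoles (A*m^2) at distance d (m).
    return 3.0 * MU0 * m1 * m2 / (2.0 * math.pi * d ** 4)

f_far = coaxial_dipole_force(10.0, 1.0, 0.10)
f_near = coaxial_dipole_force(10.0, 1.0, 0.05)
print(f_near / f_far)   # inverse quartic: halving d multiplies F by 16
```

Halving the separation multiplies the force by sixteen, which is why open-loop positioning of the driving magnet is impractical and robotic closed-loop actuation is needed.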

Coordinated control strategies for multi-robot systems are necessary for tasks that cannot be executed by a single robot. This encompasses tasks where the workspace of the robot is too small or where the load is too heavy for one robot to handle. Using multiple robots makes the task feasible by extending the workspace and/or increasing the payload of the overall robotic system. In this paper, we consider two instances of such tasks: a co-worker scenario in which a human hands over a large object to a robot, and the interception of a large flying object. The problem is made difficult as the pick-up/intercept motions must take place while the object is in motion and because the object's motion is not deterministic. The challenge is then to adapt the motion of the robotic arms in coordination with one another and with the object. Determining the pick-up/intercept point is done by taking into account the workspace of the multi-arm system. The point is continuously recomputed to adapt to changes in the object's trajectory. We propose a virtual-object-based dynamical systems (DS) control law to generate autonomous and synchronized motions for a multi-arm robot system. We show theoretically that the multi-arm + virtual object system converges asymptotically to the moving object. We validate our approach on a dual-arm robotic system and demonstrate that it can re-synchronize and adapt the motion of each arm in a fraction of a second, even when the motion of the object is fast and not accurately predictable.

Having many degrees of freedom is both a blessing and a curse. A mechanism with a large number of degrees of freedom can better comply to and therefore better move in complex environments. Yet, possessing many degrees of freedom is only an advantage if the system is capable of coordinating them to achieve desired goals in real time. This work supports the belief that a middle layer of abstraction between conventional planning and control is needed to enable robust locomotion of articulated systems in complex terrains. The basis for this abstraction is the notion that a system's shape can be used to capture joint-to-joint coupling and provide an intuitive set of controllable parameters that adapt the system to the environment in real time. This paper presents a generalizable framework that specifies desired shapes in terms of shape functions. We show how shape functions can be used to link low-level controllers to high-level planners in a compliant control framework that directly controls shape parameters. The resultant shape-based controllers produce behaviors that enable robots to robustly feel their way through unknown environments. This framework is applied to the control of two separate mechanisms, a snake-like robot and a hexapod robot.

Perception of transparent objects has been an open challenge in robotics despite advances in sensors and data-driven learning approaches. In this paper, we introduce a new approach that combines recent advances in learnt object detectors with perceptual grouping in 2D, and projective geometry of apparent contours in 3D. We train a state-of-the-art structured edge detector on an annotated set of foreground glassware. We assume that we deal with surfaces of revolution (SOR) and apply perceptual symmetry grouping in a 2D spherical transformation of the image to obtain a 2D detection of the glassware object and a hypothesis about its 2D axis. Rather than stopping at a single-view detection, we ultimately want to reconstruct the 3D shape of the object and its 3D pose to allow for a robot to grasp it. Using two views allows us to decouple the 3D axis localization from the shape estimation. We develop a parametrization that uniquely relates the shape reconstruction of an SOR to a given set of contour points and tangents. Finally, we provide the first annotated dataset for 2D detection, 3D pose and 3D shape of glassware, and we show results comparable to category-based detection and localization of opaque objects without any training on the object shape.

In robotic systems with moving cameras, gaze control allows for image stabilization, tracking and attention switching. Proper integration of these capabilities lets the robot exploit the kinematic redundancy of the oculomotor system to improve tracking performance and extend the field of view, while at the same time stabilizing vision to reduce image blur induced by the robot's own movements. Gaze may be driven not only by vision but also by other sensors (e.g. inertial sensors or motor encoders) that carry information about the robot's own movement. Humanoid robots have sophisticated oculomotor systems, usually mounting inertial devices, and are therefore an ideal platform to study this problem. We present a complete architecture for gaze control of a humanoid robot. Our system is able to control the neck and the eyes in order to track a 3D Cartesian fixation point in space. The redundancy of the kinematic problem is exploited to implement additional behaviors, namely passive gaze stabilization, saccadic movements, and the vestibulo-ocular reflex. We implement this framework on the iCub's head, which is equipped with a 3-DoF neck and a 3-DoF eye system and includes an inertial unit that provides feedback on the acceleration and angular speed of the head. The framework presented in this work can be applied to any robot equipped with an anthropomorphic head. In addition we provide an open-source, modular implementation, which has already been ported to other robotic platforms.

We introduce a novel paradigm for model-based multi-object recognition and 3-DoF pose estimation from 3D sensor data that integrates exhaustive global reasoning with discriminatively-trained algorithms in a principled fashion. Typical approaches for this task are based on scene-to-model feature matching or regression by statistical learners trained on a large database of annotated scenes. These approaches are fast but sensitive to occlusions, features, and/or training data. Generative approaches, on the other hand, e.g., methods based on rendering and verification, are robust to occlusions and require no training, but are slow at test time. We conjecture that robust and efficient perception can be achieved through a combination of generative methods and discriminatively-trained approaches. To this end, we introduce the Discriminatively-guided Deliberative Perception (D2P) paradigm, which has the following desirable properties: a) D2P is a single search algorithm that looks for the 'best' rendering of the scene that matches the input, b) it can be guided by one or multiple discriminative algorithms, and c) it generates a solution that is provably bounded suboptimal with respect to the chosen cost function. In addition, we introduce the notions of completeness and resolution completeness for multi-object pose estimation problems, and show that D2P is resolution complete. We conduct extensive evaluations on a benchmark dataset to study various aspects of D2P in relation to existing approaches.

In order to track dynamic objects in a robot's environment, one must first segment the scene into a collection of separate objects. Most real-time robotic vision systems today rely on simple spatial relations to segment the scene into separate objects. However, such methods fail under a variety of real-world situations such as occlusions or crowds of closely-packed objects. We propose a probabilistic 3D segmentation method that combines spatial, temporal, and semantic information to make better-informed decisions about how to segment a scene. We begin with a coarse initial segmentation. We then compute the probability that a given segment should be split into multiple segments or that multiple segments should be merged into a single segment, using spatial, semantic, and temporal cues. Our probabilistic segmentation framework enables us to significantly reduce both undersegmentations and oversegmentations on the KITTI dataset [3, 4, 5] while still running in real-time. By combining spatial, temporal, and semantic information, we are able to create a more robust 3D segmentation system that leads to better overall perception in crowded dynamic environments.
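The split/merge decision can be caricatured as naive-Bayes evidence combination (the likelihood ratios and prior below are assumptions, not the paper's learned models): each cue contributes a likelihood ratio for "same object" versus "different objects", and the posterior odds decide the merge.

```python
def merge_probability(cues, prior=0.5):
    # cues: likelihood ratios P(cue | same object) / P(cue | different objects),
    # combined under an independence assumption via posterior odds.
    odds = prior / (1.0 - prior)
    for lr in cues:
        odds *= lr
    return odds / (1.0 + odds)

# Two segments that are close (spatial LR 3), moved consistently over time
# (temporal LR 4), but carry different semantic labels (semantic LR 0.2):
p = merge_probability([3.0, 4.0, 0.2])
print(round(p, 3))
```

Here the semantic cue argues against the merge while the spatial and temporal cues argue for it; the combined posterior still favors merging, but far less confidently than spatial evidence alone would suggest.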

The advantage of modular robot systems lies in their flexibility, but this advantage can only be realized if there exists some reliable, effective way of generating configurations (shapes) and behaviors (controlling programs) appropriate for a given task. In this paper, we present an end-to-end system for addressing tasks with modular robots, and demonstrate that it is capable of accomplishing challenging multi-part tasks in hardware experiments. The system consists of four tightly integrated components: (1) A high-level mission planner, (2) A large design library spanning a wide set of functionality, (3) A design and simulation tool for populating the library with new configurations and behaviors, and (4) Modular robot hardware. The broader goal of this project is enabling users to address real-world tasks using modular robots. We believe this work represents an important step toward this larger goal.

We conducted a study to investigate trust in and dependence upon robotic decision support among nurses and doctors on a labor and delivery floor. There is evidence that suggestions provided by embodied agents engender inappropriate degrees of trust and reliance among humans. This concern is a critical barrier that must be addressed before fielding intelligent hospital service robots that take initiative to coordinate patient care. Our experiment was conducted with nurses and physicians, and evaluated the subjects' levels of trust in and dependence on high- and low-quality recommendations issued by robotic versus computer-based decision support. The support, generated through action-driven learning from expert demonstration, was shown to produce high-quality recommendations that were accepted by nurses and physicians at a compliance rate of 90%. Rates of Type I and Type II errors were comparable between robotic and computer-based decision support. Furthermore, embodiment appeared to benefit performance, as indicated by a higher degree of appropriate dependence after the quality of recommendations changed over the course of the experiment. These results support the notion that a robotic assistant may be able to safely and effectively assist in patient care. Finally, we conducted a pilot demonstration in which a robot assisted resource nurses on a labor and delivery floor at a tertiary care center.

In this paper, we propose an approach to designing online feedback controllers for input-saturated robotic systems evolving on Lie groups by extending the recently developed Sequential Action Control (SAC). In contrast to existing feedback controllers, our approach poses the nonconvex constrained nonlinear optimization problem as the tracking of a desired negative mode insertion gradient on the configuration space of a Lie group. This results in a closed-form feedback control law even with input saturation and thus is well suited for online application. In extending SAC to Lie groups, the associated mode insertion gradient is derived and the switching time optimization on Lie groups is studied. We demonstrate the efficacy and scalability of our approach in the 2D kinematic car on SE(2) and the 3D quadrotor on SE(3). We also implement iLQG on a quadrotor model and compare to SAC, demonstrating that SAC is both faster to compute and has a larger basin of attraction.

A set of practical mathematical tools for creating self-correcting, self-compensating optical angular encoders is presented. Included are [a] a discussion and proof of the so-called Angular Encoder Theorem, which is the fundamental element of real-time self-correction based on read-head symmetry averaging; [b] a discussion and derivation of the self-compensation equations, which can be used in tandem with the encoder theorem to increase accuracy further; [c] a kinematic model which can be used to simulate realistic operational scenarios; and [d] rules of thumb and implications for the designer. Sample simulation results and sample test results are also presented. These tools enable the designer to work out cost-effective encoder designs to meet different accuracy requirements for different applications. Such encoders automatically retain their accuracy over indefinite periods of time without human intervention of any kind and are thus well suited for challenging robotic applications. These tools can and have been used to construct cost-effective encoders with demonstrated sub-arcsecond accuracy. Finally, sample source code is available online.

Traditionally, autonomous cars make predictions about other drivers' future trajectories, and plan to stay out of their way. This tends to result in defensive and opaque behaviors. Our key insight is that an autonomous car's actions will actually affect what other cars will do in response, whether the car is aware of it or not. Our thesis is that we can leverage these responses to plan more efficient and communicative behaviors. We model the interaction between an autonomous car and a human driver as a dynamical system, in which the robot's actions have immediate consequences on the state of the car, but also on human actions. We model these consequences by approximating the human as an optimal planner, with a reward function that we acquire through Inverse Reinforcement Learning. When the robot plans with this reward function in this dynamical system, it comes up with actions that purposefully change human state: it merges in front of a human to get them to slow down or to reach its own goal faster; it blocks two lanes to get them to switch to a third lane; or it backs up slightly at an intersection to get them to proceed first. Such behaviors arise from the optimization, without relying on hand-coded signaling strategies and without ever explicitly modeling communication. Our user study results suggest that the robot is indeed capable of eliciting desired changes in human state by planning using this dynamical system.

This paper examines the relationship between system dynamics and problem complexity of collision avoidance in multi-agent systems. Motivated particularly by results in the field of automated driving, a variant of the reciprocal n-body collision avoidance problem is considered. In this problem, agents must avoid collision while moving according to individual reward functions in a crowded environment. The main contribution of this work is the novel result that there is a quantifiable relationship between system dynamics and the requirement for agent coordination, and that this requirement can change the complexity class of the problem dramatically: from P to NEXP or even NEXP^NP. In addition, a constructive proof is provided that demonstrates the relationship, and potential real-world applications of the result are discussed.

This paper introduces a new kinematic model to describe the planar motion of an Autonomous Underwater Vehicle (AUV) moving in constant current flows. The AUV is modeled as a rigid body moving at maximum attainable forward velocity with symmetric bounds on the control input for the turning rate. The model incorporates the effect a flow will induce on the turning rate of the AUV due to the non-symmetric geometry of the vehicle. The model is then used to characterize and construct the minimum time paths that take the AUV from a given initial configuration to a final configuration in the plane. Two algorithms for the time-optimal path synthesis problem are also introduced along with several simulations to validate the proposed method.

This paper considers the problem of routing and rebalancing a shared fleet of autonomous (i.e., self-driving) vehicles providing on-demand mobility within a capacitated transportation network, where congestion might disrupt throughput. We model the problem within a network flow framework and show that under relatively mild assumptions the rebalancing vehicles, if properly coordinated, do not lead to an increase in congestion (in stark contrast to common belief). From an algorithmic standpoint, such theoretical insight suggests that the problem of routing customers and rebalancing vehicles can be decoupled, which leads to a computationally-efficient routing and rebalancing algorithm for the autonomous vehicles. Numerical experiments and case studies corroborate our theoretical insights and show that the proposed algorithm outperforms state-of-the-art point-to-point methods by avoiding excess congestion on the road. Collectively, this paper provides a rigorous approach to the problem of congestion-aware, system-wide coordination of autonomously driving vehicles, and to the characterization of the sustainability of such robotic systems.
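The decoupling result above implies that, once customer routes are fixed, surplus vehicles can be matched to stations with deficits as a separate step. The sketch below illustrates that rebalancing step with a simple greedy matching; the station names, the surplus/deficit encoding, and the greedy pairing are illustrative assumptions, not the paper's capacitated network flow formulation.

```python
def rebalance_plan(supply):
    """Pair stations with surplus vehicles to stations with deficits.

    `supply` maps station -> net vehicle surplus (negative = deficit).
    Returns a list of (from_station, to_station, count) moves.
    Toy greedy sketch; the paper solves this within a congestion-aware
    network flow model instead.
    """
    surplus = sorted((s, n) for s, n in supply.items() if n > 0)
    deficit = sorted((s, -n) for s, n in supply.items() if n < 0)
    moves = []
    i = j = 0
    while i < len(surplus) and j < len(deficit):
        s, have = surplus[i]
        d, need = deficit[j]
        k = min(have, need)          # move as many vehicles as both ends allow
        moves.append((s, d, k))
        have -= k
        need -= k
        surplus[i] = (s, have)
        deficit[j] = (d, need)
        if have == 0:
            i += 1
        if need == 0:
            j += 1
    return moves
```

For example, a station with two spare vehicles serving two one-vehicle deficits yields two unit moves.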

This work addresses the problem of how a robot can improve a manipulation skill in a sample-efficient and secure manner. As an alternative to the standard reinforcement learning formulation where all objectives are defined in a single reward function, we propose a generalized formulation that consists of three components: 1) A known analytic control cost function; 2) A black-box return function; and 3) A black-box binary success constraint. While the overall policy optimization problem is high-dimensional, in typical robot manipulation problems we can assume that the black-box return and constraint only depend on a lower-dimensional projection of the solution. With our formulation we can exploit this structure for a sample-efficient learning framework that iteratively improves the policy with respect to the objective functions under the success constraint. We employ efficient 2nd-order optimization methods to optimize the high-dimensional policy w.r.t. the analytic cost function while keeping the lower dimensional projection fixed. This is alternated with safe Bayesian optimization over the lower-dimensional projection to address the black-box return and success constraint. During both improvement steps the success constraint is used to keep the optimization in a secure region and to clearly distinguish between motions that lead to success or failure. The learning algorithm is evaluated on a simulated benchmark problem and a door opening task with a PR2.

Our goal is to automate the understanding of natural hand-object manipulation by developing computer vision-based techniques. Our hypothesis is that it is necessary to model the grasp types of hands and the attributes of manipulated objects in order to accurately recognize manipulation actions. Specifically, we focus on recognizing hand grasp types, object attributes and actions from a single image within a unified model. First, we explore the contextual relationship between grasp types and object attributes, and show how that context can be used to boost the recognition of both grasp types and object attributes. Second, we propose to model actions with grasp types and object attributes based on the hypothesis that grasp types and object attributes contain complementary information for characterizing different actions. Our proposed action model outperforms traditional appearance-based models which are not designed to take into account semantic constraints such as grasp types or object attributes. Experimental results on public egocentric activities datasets strongly support our hypothesis.

We describe a combined force and distance sensor using a commodity infrared distance sensor embedded in a transparent elastomer with applications in robotic manipulation. Prior to contact, the sensor works as a distance sensor (0–10 cm), whereas after contact the material doubles as a spring, with force proportional to the compression of the elastomer (0–5 N). We describe its principle of operation and design parameters, including polymer thickness, mixing ratio, and emitter current, and show that the sensor response has an inflection point at contact that is independent of an object's surface properties. We then demonstrate how two arrays of eight sensors, each mounted on a standard Baxter gripper, can be used to (1) improve gripper alignment during grasping, (2) determine contact points with objects, and (3) obtain crude 3D models that can serve to determine possible grasp locations.
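The two-regime behavior described above (distance before contact, Hooke's-law force after) can be sketched in a few lines. The rest thickness and spring constant below are hypothetical placeholders, not the paper's calibration values.

```python
def sensor_output(raw_distance_cm, rest_thickness_cm=1.0, spring_k_n_per_cm=5.0):
    """Interpret a combined force/distance sensor reading.

    Readings above the elastomer's rest thickness are treated as distance
    to the object; readings below it indicate compression of the elastomer,
    which maps linearly to force (Hooke's law). Constants are illustrative.
    Returns a (mode, value) pair.
    """
    if raw_distance_cm >= rest_thickness_cm:
        # Pre-contact regime: report distance beyond the elastomer surface.
        return ("distance_cm", raw_distance_cm - rest_thickness_cm)
    # Post-contact regime: compression is proportional to applied force.
    compression = rest_thickness_cm - raw_distance_cm
    return ("force_n", spring_k_n_per_cm * compression)
```

The inflection point at contact noted in the abstract corresponds to the switch between the two branches.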

We describe the winning entry to the Amazon Picking Challenge. From the experience of building this system and competing in the Amazon Picking Challenge, we derive several conclusions: 1) We suggest characterizing robotic system building along four key aspects, each of them spanning a spectrum of solutions—modularity vs. integration, generality vs. assumptions, computation vs. embodiment, and planning vs. feedback. 2) To understand which region of each spectrum most adequately addresses which robotic problem, we must explore the full spectrum of possible approaches. To achieve this, our community should agree on key aspects that characterize the solution space of robotic systems. 3) For manipulation problems in unstructured environments, certain regions of each spectrum match the problem most adequately, and should be exploited further. This is supported by the fact that our solution deviated from the majority of the other challenge entries along each of the spectra.

Our goal is to develop models that allow a robot to understand natural language instructions in the context of its world representation. Contemporary models learn possible correspondences between parsed instructions and candidate groundings that include objects, regions and motion constraints. However, these models cannot reason about abstract concepts expressed in an instruction like, "pick up the middle block in the row of five blocks". In this work, we introduce a probabilistic model that incorporates an expressive space of abstract spatial concepts as well as notions of cardinality and ordinality. The graph is structured according to the parse structure of language and introduces a factorisation over abstract concepts correlated with concrete constituents. Inference in the model is posed as an approximate search procedure that leverages partitioning of the joint in terms of concrete and abstract factors. The algorithm first estimates a set of probable concrete constituents that constrains the search procedure to a reduced space of abstract concepts, pruning away improbable portions of the exponentially large search space. Empirical evaluation demonstrates accurate grounding of abstract concepts embedded in complex natural language instructions commanding a robot manipulator. The proposed inference method leads to significant efficiency gains compared to the baseline, with minimal trade-off in accuracy.

In this paper, we consider robotic surveillance tasks that involve visual perception. The robot has a limited access to a remote operator to ask for help. However, humans may not be able to accomplish the visual task in many scenarios, depending on the sensory input. In this paper, we propose a machine learning-based approach that allows the robot to probabilistically predict human visual performance for any visual input. Based on this prediction, we then present a methodology that allows the robot to properly optimize its field decisions in terms of when to ask for help, when to sense more, and when to rely on itself. The proposed approach enables the robot to ask the right questions, only querying the operator with the sensory inputs for which humans have a high chance of success. Furthermore, it allows it to autonomously locate the areas that need more sensing. We test the proposed predictor on a large validation set and show a Normalized Mean Square Error of 0.0199, as well as a reduction of about an order of magnitude in error as compared to the state-of-the-art. We then run a number of robotic surveillance experiments on our campus as well as a larger-scale evaluation with real data/human feedback in a simulation environment. The results showcase the efficacy of our approach, indicating a considerable increase in the success rate of human queries (a few-fold in several cases) and the overall performance (30%-41% increase in success rate).

We present a framework for representing scenarios with complex object interactions, in which a robot cannot directly interact with the object it wishes to control, but must instead do so via intermediate objects. For example, a robot learning to drive a car can only indirectly change its pose, by rotating the steering wheel. We formalize such complex interactions as chains of Markov decision processes and show how they can be learned and used for control. We describe two systems in which a robot uses learning from demonstration to achieve indirect control: playing a computer game, and using a hot water dispenser to heat a cup of water.

Communication with robots is challenging, partly due to their differences from humans and the consequent discrepancy in people's mental model of what robots can see, hear, or understand. Transparency mechanisms aim to mitigate this challenge by providing users with information about the robot's internal processes. While most research in human-robot interaction aims toward natural transparency using human-like verbal and non-verbal behaviors, our work advocates for the use of visualization-based transparency. In this paper, we first present an end-to-end system that infers task commands that refer to objects or surfaces in everyday human environments, using Bayesian inference to combine scene understanding, pointing detection, and speech recognition. We characterize capabilities of this system through systematic tests with a corpus collected from people (N=5). Then we design human-like and visualization-based transparency mechanisms and evaluate them in a user study (N=20). The study demonstrates the effects of visualizations on the accuracy of people's mental models, as well as their effectiveness and efficiency in communicating task commands.

We explore the capabilities of Auto-Encoders to fuse the information available from cameras and depth sensors, and to reconstruct missing data, for scene understanding tasks. In particular we consider three input modalities: RGB images; depth images; and semantic label information. We seek to generate complete scene segmentations and depth maps, given images and partial and/or noisy depth and semantic data. We formulate this objective of reconstructing one or more types of scene data using a Multi-modal stacked Auto-Encoder. We show that suitably designed Multi-modal Auto-Encoders can solve the depth estimation and the semantic segmentation problems simultaneously, in the partial or even complete absence of some of the input modalities. We demonstrate our method using the outdoor dataset KITTI that includes LIDAR and stereo cameras. Our results show that as a means to estimate depth from a single image, our method is comparable to the state-of-the-art, and can run in real time (i.e., less than 40ms per frame). But we also show that our method has a significant advantage over other methods in that it can seamlessly use additional data that may be available, such as a sparse point-cloud and/or incomplete coarse semantic labels.

Convolutional network techniques have recently achieved great success in vision-based detection tasks. This paper introduces the recent development of our research on transplanting the fully convolutional network technique to detection tasks on 3D range scan data. Specifically, the scenario is set as the vehicle detection task from the range data of a Velodyne 64E lidar. We propose to present the data in a 2D point map and use a single 2D end-to-end fully convolutional network to predict the objectness confidence and the bounding boxes simultaneously. By carefully designing the bounding box encoding, the network is able to predict full 3D bounding boxes even using a 2D convolutional network. Experiments on the KITTI dataset show the state-of-the-art performance of the proposed method.

Loop closure detection is an essential component for simultaneous localization and mapping in a variety of robotics applications. One of the most challenging problems is to perform long-term place recognition with strong perceptual aliasing and appearance variations due to changes of illumination, vegetation, weather, etc. To address this challenge, we propose a novel Robust Multimodal Sequence-based (ROMS) method for long-term loop closure detection, by formulating image sequence matching as an optimization problem regularized by structured sparsity-inducing norms. Our method is able to model the sparsity nature of place recognition, i.e., the current location should match only a small subset of previously visited places, as well as to model underlying structures of image sequences and incorporate multiple feature modalities to construct a discriminative scene representation. In addition, a new optimization algorithm is developed to efficiently solve the formulated problem, which has a theoretical guarantee to converge to the globally optimal solution. To evaluate the ROMS algorithm, extensive experiments are performed using large-scale benchmark datasets, including the St Lucia, CMU-VL, and Nordland datasets. Experimental results have validated that our algorithm outperforms previous loop closure detection methods, and obtains state-of-the-art performance on long-term place recognition.

We propose a system for performing structural change detection in street-view videos captured by a vehicle-mounted monocular camera over time. Our approach is motivated by the need for more frequent and efficient updates in the large-scale maps used in autonomous vehicle navigation. Our method chains a multi-sensor fusion SLAM and fast dense 3D reconstruction pipeline, which provide coarsely registered image pairs to a deep deconvolutional network for pixel-wise change detection. To train and evaluate our network we introduce a new urban change detection dataset which is an order of magnitude larger than existing datasets and contains challenging changes due to seasonal and lighting variations. Our method outperforms existing literature on this dataset, which we make available to the community, and an existing panoramic change detection dataset, demonstrating its wide applicability.

We consider the problem of assigning software processes (or tasks) to hardware processors in distributed robotics environments. We introduce the notion of a task variant, which supports the adaptation of software to specific hardware configurations. Task variants facilitate the trade-off of functional quality versus the requisite capacity and type of target execution processors. We formalise the problem of assigning task variants to processors as a mathematical model that incorporates typical constraints found in robotics applications; the model is a constrained form of a multi-objective, multi-dimensional, multiple-choice knapsack problem. We propose and evaluate three different solution methods to the problem: constraint programming, a constructive greedy heuristic and a local search metaheuristic. Furthermore, we demonstrate the use of task variants in a real instance of a distributed interactive multi-agent navigation system, showing that our best solution method (constraint programming) improves the system's quality of service, as compared to the local search metaheuristic, the greedy heuristic and a randomised solution, by an average of 16%, 41% and 56% respectively.
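To make the multiple-choice knapsack structure concrete, the sketch below shows a constructive greedy heuristic in the spirit of the abstract: each task offers several variants trading quality against processor load, and exactly one variant must be chosen per task. The quality-per-load selection rule and the single shared capacity are simplifying assumptions; the paper's model is multi-objective and multi-dimensional.

```python
def greedy_assign(tasks, capacity):
    """Constructive greedy for a (simplified) multiple-choice knapsack.

    `tasks` maps task name -> list of (quality, load) variants; exactly one
    variant per task must fit within the single shared `capacity`.
    Picks, per task, the feasible variant with the best quality/load ratio.
    Illustrative heuristic only, not the paper's exact method.
    Returns (assignment, total_quality) or None if some task cannot fit.
    """
    remaining = capacity
    assignment = {}
    total_quality = 0.0
    for name, variants in tasks.items():
        feasible = [(q, c) for q, c in variants if c <= remaining]
        if not feasible:
            return None  # no variant of this task fits the remaining capacity
        q, c = max(feasible,
                   key=lambda v: v[0] / v[1] if v[1] else float("inf"))
        assignment[name] = (q, c)
        remaining -= c
        total_quality += q
    return assignment, total_quality
```

A real solver would also handle multiple resource dimensions and processor types, which is where the constraint programming approach of the paper pays off.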

We introduce a functional gradient descent trajectory optimization algorithm for robot motion planning in Reproducing Kernel Hilbert Spaces (RKHSs). Functional gradient algorithms are a popular choice for motion planning in complex many-degree-of-freedom robots, since they (in theory) work by directly optimizing within a space of continuous trajectories to avoid obstacles while maintaining geometric properties such as smoothness. However, in practice, implementations such as CHOMP and TrajOpt typically commit to a fixed, finite parametrization of trajectories, often as a sequence of waypoints. Such a parameterization can lose much of the benefit of reasoning in a continuous trajectory space: e.g., it can require taking an inconveniently small step size and large number of iterations to maintain smoothness. Our work generalizes functional gradient trajectory optimization by formulating it as minimization of a cost functional in an RKHS. This generalization lets us represent trajectories as linear combinations of kernel functions. As a result, we are able to take larger steps and achieve a locally optimal trajectory in just a few iterations. Depending on the selection of kernel, we can directly optimize in spaces of trajectories that are inherently smooth in velocity, jerk, curvature, etc., and that have a low-dimensional, adaptively chosen parameterization. Our experiments illustrate the effectiveness of the planner for different kernels, including Gaussian RBFs with independent and coupled interactions among robot joints, Laplacian RBFs, and B-splines, as compared to the standard discretized waypoint representation.
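The kernel-based trajectory representation described above—a trajectory as a linear combination of kernel functions, xi(t) = sum_i w_i k(t, t_i)—can be sketched directly. The Gaussian RBF kernel, the support times, the weights, and the bandwidth below are illustrative placeholders, not the paper's experimental settings.

```python
import math

def make_rbf_trajectory(times, weights, bandwidth=0.5):
    """Build a 1-DOF trajectory xi(t) = sum_i w_i * k(t, t_i) in an RKHS
    induced by a Gaussian RBF kernel. Evaluating xi at any t gives a
    smooth, continuous-time value; no waypoint discretization is needed.
    Kernel choice and bandwidth are illustrative assumptions.
    """
    def k(t, ti):
        # Gaussian RBF kernel: smooth, so xi inherits smoothness.
        return math.exp(-((t - ti) ** 2) / (2.0 * bandwidth ** 2))

    def xi(t):
        return sum(w * k(t, ti) for w, ti in zip(weights, times))

    return xi

# A trajectory supported on three kernel centers with fixed weights.
traj = make_rbf_trajectory([0.0, 0.5, 1.0], [1.0, -0.5, 2.0])
```

A functional gradient step in this representation simply updates the weights (and possibly the centers), which is what permits the larger steps the abstract mentions.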

Autonomous surface and underwater vehicles (ASVs and AUVs) are increasingly being used for persistent monitoring of ocean phenomena. Typically, these vehicles are deployed for long periods of time and must operate with limited energy budgets. As a result, there is increased interest in recent years on developing energy efficient motion plans for these vehicles that leverage the dynamics of the surrounding flow field. In this paper, we present a graph-search-based method to plan time and energy optimal paths in a flow field where the kinematic actuation constraints on the vehicles are captured in our cost functions. We also use tools from topological path planning to generate optimal paths in different homotopy classes, which facilitates simultaneous exploration of the environment. The proposed strategy is validated using analytical flow models for large scale ocean circulation and in experiments using an indoor laboratory testbed capable of creating flows with ocean-like features. We also present a Riemannian metric based approximation for these cost functions which provides an alternative method for computing time and energy optimal paths. The Riemannian approximation results in smoother trajectories in contrast to the graph based approach while requiring less computational time.
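The graph-search idea above—edge costs that depend on whether the ambient flow assists or opposes the vehicle—can be sketched with a flow-aware Dijkstra search. The graph encoding, the scalar per-edge flow model, and the constant vehicle speed are simplifying assumptions standing in for the paper's kinematics-aware cost functions.

```python
import heapq

def min_time_path(graph, flows, start, goal, speed=1.0):
    """Dijkstra over a graph whose edge traversal time depends on flow.

    `graph[u]` lists (v, length) edges; `flows[(u, v)]` is the flow speed
    along the edge (positive = assisting the vehicle). Traversal time is
    length / (vehicle speed + flow), and edges where the net speed is
    non-positive are untraversable. Returns minimum travel time or None.
    """
    dist = {start: 0.0}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, length in graph.get(u, []):
            ground_speed = speed + flows.get((u, v), 0.0)
            if ground_speed <= 0:
                continue  # cannot make progress against this flow
            nd = d + length / ground_speed
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return None
```

With a favorable current on one branch, the planner prefers the flow-assisted route even when the geometric lengths are equal, which is the qualitative behavior the abstract describes.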

16:00 - 16:10

Break to walk to poster session

16:10 - 18:00

Poster Session: Posters 25-47. Location: Michigan League Ballroom

Posters are numbered by talk order (i.e. Talk 1 corresponds to Poster 1). Light refreshments and beverages will be served.

18:15 - 20:00

Transportation to Banquet. Banquet buses depart Michigan League at 18:15 and 18:45

19:00 - 21:30

Conference Banquet. Location: The Henry Ford Museum

20:45 - 22:30

Return Transportation from Banquet. Buses depart for both Michigan League and southern Ann Arbor hotels at 20:45, 21:15, and 22:45