In this paper, a case-supported principle-based behavior paradigm is proposed to help ensure ethical behavior of autonomous machines. We argue that ethically significant behavior of autonomous systems should be guided by explicit ethical principles determined through a consensus of ethicists. Such a consensus is likely to emerge in many areas in which autonomous systems are apt to be deployed and for the actions they are liable to undertake. We believe that this is the case since we are more likely to agree on how machines ought to treat us than on how human beings ought to treat one another. Given such a consensus, particular cases of ethical dilemmas where ethicists agree on the ethically relevant features and the right course of action can be used to help discover principles that balance these features when they are in conflict. Such principles not only help ensure ethical behavior of complex and dynamic systems but also can serve as a basis for justification of this behavior. The requirements, methods, implementation, and evaluation components of the paradigm are detailed as well as its instantiation in both a simulated and real robot functioning in the domain of eldercare.

For many service robots, reactivity to changes in their surroundings is a must. However, developing software suitable for dynamic environments is difficult. Existing robotic middleware allows engineers to design behavior graphs by organizing communication between components. But because these graphs are structurally inflexible, they hardly support the development of complex reactive behavior. To address this limitation, we propose Playful, a software platform that applies reactive programming to the specification of robotic behavior.

We propose a method for instance-level segmentation that uses RGB-D data as input and provides detailed information about the location, geometry and number of individual objects in the scene. This level of understanding is fundamental for autonomous robots. It enables safe and robust decision-making under the large uncertainty of the real world. In our model, we propose to use the first and second order moments of the object occupancy function to represent an object instance. We train an hourglass Deep Neural Network (DNN) where each pixel in the output votes for the 3D position of the corresponding object center and for the object's size and pose. The final instance segmentation is achieved through clustering in the space of moments. The object-centric training loss is defined on the output of the clustering. Our method outperforms the state-of-the-art instance segmentation method on our synthesized dataset. We show that our method generalizes well to real-world data, achieving visually better segmentation results.
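
As a minimal illustration of the moment representation described above, the following sketch computes the first-order moment (centroid) and the second-order central moment (covariance) of a set of 3D points assumed to belong to a single object. The function name `occupancy_moments` and the point-list input are assumptions made for this example; the abstract does not state how the moments are computed or normalized in the actual model.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def occupancy_moments(points: List[Point]) -> Tuple[List[float], List[List[float]]]:
    """Return (centroid, covariance) of the 3D points occupied by one object.

    Hypothetical helper: the centroid is the first-order moment and the
    covariance is the second-order central moment of the occupied points.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    # First-order moment: mean position along each axis.
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    # Second-order central moment: 3x3 covariance of the positions.
    cov = [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
         for j in range(3)]
        for i in range(3)
    ]
    return mean, cov
```

Under these assumptions, the centroid encodes where the object is, while the covariance summarizes its extent and orientation, which is why the two together can stand in for an object instance.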

We address the challenging problem of robotic grasping and manipulation in the presence of uncertainty. This uncertainty is due to noisy sensing, inaccurate models and hard-to-predict environment dynamics. Our approach emphasizes the importance of continuous, real-time perception and its tight integration with reactive motion generation methods. We present a fully integrated system where real-time object and robot tracking as well as ambient world modeling provide the necessary input to feedback controllers and continuous motion optimizers. Specifically, they provide attractive and repulsive potentials based on which the controllers and motion optimizer can compute movement policies online at different time intervals. We extensively evaluate the proposed system on a real robotic platform in four scenarios that exhibit either challenging workspace geometry or a dynamic environment. We compare the proposed integrated system with a more traditional sense-plan-act approach that is still widely used. In 333 experiments, we show the robustness and accuracy of the proposed system.
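
To make the idea of attractive and repulsive potentials concrete, the sketch below takes one gradient step on a classic artificial potential field: a quadratic attractive term pulling towards a goal and a short-range repulsive term pushing away from obstacles. This textbook form, the function name `potential_step`, and all gains are assumptions chosen for illustration; the abstract does not specify the potentials or controllers used in the actual system.

```python
import math
from typing import List, Sequence


def potential_step(
    pos: Sequence[float],
    goal: Sequence[float],
    obstacles: Sequence[Sequence[float]],
    k_att: float = 1.0,
    k_rep: float = 0.5,
    influence: float = 1.0,
    step: float = 0.05,
) -> List[float]:
    """Move one small step along the negative gradient of the potential."""
    # Attractive force: negative gradient of 0.5 * k_att * |pos - goal|^2.
    force = [k_att * (g - p) for p, g in zip(pos, goal)]
    for obs in obstacles:
        diff = [p - o for p, o in zip(pos, obs)]
        dist = math.sqrt(sum(d * d for d in diff))
        # Repulsion acts only within the influence radius of the obstacle.
        if 0.0 < dist < influence:
            # Negative gradient of 0.5 * k_rep * (1/dist - 1/influence)^2.
            mag = k_rep * (1.0 / dist - 1.0 / influence) / dist ** 2
            force = [f + mag * d / dist for f, d in zip(force, diff)]
    return [p + step * f for p, f in zip(pos, force)]
```

Calling `potential_step` repeatedly with freshly tracked goal and obstacle positions gives a simple reactive policy, which is the kind of tight coupling between perception and motion generation the abstract argues for.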

One of the most basic skills a robot should possess is predicting the effect of physical interactions with objects in the environment. This enables optimal action selection to reach a certain goal state. Traditionally, dynamics are approximated by physics-based analytical models. These models rely on specific state representations that may be hard to obtain from raw sensory data, especially if no knowledge of the object shape is assumed. More recently, we have seen learning approaches that can predict the effect of complex physical interactions directly from sensory input. It is, however, an open question how far these models generalize beyond their training data. In this work, we investigate the advantages and limitations of neural network based learning approaches for predicting the effects of actions based on sensory input and show how analytical and learned models can be combined to leverage the best of both worlds. As the physical interaction task, we use planar pushing, for which there exists a well-known analytical model and a large real-world dataset. We propose to use a convolutional neural network to convert raw depth images or organized point clouds into a suitable representation for the analytical model and compare this approach to using neural networks for both perception and prediction. A systematic evaluation of the proposed approach on a very large real-world dataset shows two main advantages of the hybrid architecture. Compared to a pure neural network, it significantly (i) reduces required training data and (ii) improves generalization to novel physical interaction.

For complex robots such as humanoids, model-based control is highly beneficial for accurate tracking while keeping negative feedback gains low for compliance. However, in such multi degree-of-freedom lightweight systems, conventional identification of rigid body dynamics models using CAD data and actuator models is inaccurate due to unknown nonlinear robot dynamic effects. An alternative method is data-driven parameter estimation, but significant noise in measured and inferred variables affects it adversely. Moreover, standard estimation procedures may give physically inconsistent results due to unmodeled nonlinearities or insufficiently rich data. This paper addresses these problems, proposing a Bayesian system identification technique for linear or piecewise linear systems. Inspired by Factor Analysis regression, we develop a computationally efficient variational Bayesian regression algorithm that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and output noise. We evaluate our approach on rigid body parameter estimation for various robotic systems, achieving an error up to three times lower than other state-of-the-art machine learning methods.

One of the hallmarks of the performance, versatility, and robustness
of biological motor control is the ability to adapt the impedance of
the overall biomechanical system to different task requirements and
stochastic disturbances. A transfer of this principle to robotics is
desirable, for instance to enable robots to work robustly and safely
in everyday human environments. It is, however, not trivial to derive
variable impedance controllers for practical high degree-of-freedom
(DOF) robotic tasks.
In this contribution, we accomplish such variable impedance control
with the reinforcement learning (RL) algorithm PISq (Policy
Improvement with Path Integrals). PISq is a
model-free, sampling based learning method derived from first
principles of stochastic optimal control. The PISq algorithm requires no tuning
of algorithmic parameters besides the exploration noise. The designer
can thus fully focus on cost function design to specify the task. From
the viewpoint of robotics, a particularly useful property of PISq is
that it can scale to problems of many DOFs, so that reinforcement learning on real robotic
systems becomes feasible.
We sketch the PISq algorithm and its theoretical properties, and how
it is applied to gain scheduling for variable impedance control.
We evaluate our approach by presenting results on several simulated and real robots.
We consider tasks involving accurate tracking through via-points, and manipulation tasks requiring physical contact with the environment.
In these tasks, the optimal strategy requires tuning both a reference trajectory and the impedance of the end-effector.
The results show that we can use path integral based reinforcement learning not only for
planning but also to derive variable gain feedback controllers in
realistic scenarios. Thus, the power of variable impedance control
is made available to a wide variety of robotic systems and practical
applications.
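
The core idea of a sampling-based, model-free policy improvement step can be sketched as follows: perturb the policy parameters with exploration noise, evaluate the cost of each perturbed rollout, and update the parameters with a cost-weighted average of the noise. This is a heavily simplified sketch under stated assumptions, not the algorithm as published: it treats each rollout as a single cost value, ignores time-dependent weighting and basis functions, and the function name `pi2_style_update`, the normalization constant `h`, and the default values are chosen purely for illustration.

```python
import math
import random
from typing import Callable, List, Optional


def pi2_style_update(
    theta: List[float],
    cost_fn: Callable[[List[float]], float],
    n_rollouts: int = 20,
    noise_std: float = 0.1,
    h: float = 10.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """One cost-weighted averaging update of the parameter vector theta."""
    rng = rng if rng is not None else random.Random()
    # Sample exploration noise and evaluate the cost of each perturbed policy.
    noises = [[rng.gauss(0.0, noise_std) for _ in theta] for _ in range(n_rollouts)]
    costs = [cost_fn([t + e for t, e in zip(theta, eps)]) for eps in noises]
    # Map costs to weights: low cost gives high weight (softmax over rollouts).
    lo, hi = min(costs), max(costs)
    spread = (hi - lo) or 1.0
    weights = [math.exp(-h * (c - lo) / spread) for c in costs]
    total = sum(weights)
    # Update: add the weight-averaged exploration noise to the parameters.
    return [
        t + sum(w * eps[i] for w, eps in zip(weights, noises)) / total
        for i, t in enumerate(theta)
    ]
```

In this simplified form the only quantity left to choose besides the cost function is the exploration noise `noise_std`, which mirrors the property highlighted in the abstract. For variable impedance control, `theta` would additionally contain the parameters of the gain schedule.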

2008

One of the most general frameworks for phrasing control problems for
complex, redundant robots is operational space control. However, while
this framework is of essential importance for robotics and well understood
from an analytical point of view, it can be prohibitively hard to achieve
accurate control in the face of modeling errors, which are inevitable in
complex robots, e.g., humanoid robots. In this paper, we suggest a learning
approach for operational space control as a direct inverse model learning
problem. A first important insight for this paper is that a physically
correct solution to the inverse problem with redundant degrees-of-freedom
does exist when learning of the inverse map is performed in a suitable
piecewise linear way. The second crucial component for our work is based
on the insight that many operational space controllers can be understood
in terms of a constrained optimal control problem. The cost function
associated with this optimal control problem allows us to formulate a
learning algorithm that automatically synthesizes a globally consistent desired
resolution of redundancy while learning the operational space controller.
From the machine learning point of view, this learning problem
corresponds to a reinforcement learning problem that maximizes an immediate
reward. We employ an expectation-maximization policy search algorithm
in order to solve this problem. Evaluations on a three degrees of freedom
robot arm are used to illustrate the suggested approach. The
application to a physically realistic simulator of the anthropomorphic SARCOS
Master arm demonstrates feasibility for complex high degree-of-freedom
robots. We also show that the proposed method works in the setting of
learning resolved motion rate control on a real, physical Mitsubishi PA-10
medical robotics arm.

Dexterous manipulation with a highly redundant movement system is one of the hallmarks of
human motor skills. From numerous behavioral studies, there is strong evidence that humans employ
compliant task space control, i.e., they focus control only on task variables while keeping redundant
degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances
and simultaneously safe for the operator and the environment. The theory of operational space
control in robotics aims to achieve similar performance properties. However, despite various compelling
theoretical lines of research, advanced operational space control is hardly found in actual robotics
implementations, in particular new kinds of robots like humanoids and service robots, which would strongly
profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches
to operational space control, this paper focuses on a theoretical and empirical evaluation of different
methods that have been suggested in the literature, but also some new variants of operational space
controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate
all controllers in a common notational framework, including quaternion-based orientation control, and
discuss some of their theoretical properties. Second, we present experimental comparisons of these
approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks.
As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which
ensures physical consistency, as this issue was crucial for our successful robot implementations. Our
extensive empirical results demonstrate that one of the simplified acceleration-based approaches can
be advantageous in terms of task performance, ease of parameter tuning, and general robustness and
compliance in face of inevitable modeling errors.
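
For orientation, the simplest velocity-level formulation can be sketched as mapping a desired task-space velocity to joint velocities through the Jacobian pseudo-inverse. The example below does this for a redundant planar arm and returns the minimum-norm joint velocity. It is only an illustrative sketch under stated assumptions: the planar arm, the function names `jacobian_planar` and `joint_velocities`, and the absence of any null-space or compliance term are choices made for this example and are not taken from the paper.

```python
import math
from typing import List, Sequence


def jacobian_planar(q: Sequence[float], lengths: Sequence[float]) -> List[List[float]]:
    """2 x n position Jacobian of a planar serial arm with revolute joints."""
    n = len(q)
    # Absolute orientation of each link is the cumulative sum of joint angles.
    cum, total = [], 0.0
    for angle in q:
        total += angle
        cum.append(total)
    jac = [[0.0] * n, [0.0] * n]
    for j in range(n):
        # Joint j moves every link from j onward.
        jac[0][j] = -sum(lengths[k] * math.sin(cum[k]) for k in range(j, n))
        jac[1][j] = sum(lengths[k] * math.cos(cum[k]) for k in range(j, n))
    return jac


def joint_velocities(
    q: Sequence[float], lengths: Sequence[float], xdot: Sequence[float]
) -> List[float]:
    """Minimum-norm joint velocities qdot = J^T (J J^T)^{-1} xdot."""
    jac = jacobian_planar(q, lengths)
    # Entries of the symmetric 2 x 2 matrix J J^T = [[a, b], [b, d]].
    a = sum(v * v for v in jac[0])
    b = sum(u * v for u, v in zip(jac[0], jac[1]))
    d = sum(v * v for v in jac[1])
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("singular configuration")
    # Solve (J J^T) y = xdot in closed form, then map back with J^T.
    y0 = (d * xdot[0] - b * xdot[1]) / det
    y1 = (-b * xdot[0] + a * xdot[1]) / det
    return [jac[0][j] * y0 + jac[1][j] * y1 for j in range(len(q))]
```

With three joints and a two-dimensional task, one degree of freedom remains redundant; how that redundancy is resolved is exactly where the velocity, acceleration and force level formulations compared in the paper differ.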

In this paper we introduce an improved implementation of locally weighted projection regression
(LWPR), a supervised learning algorithm that is capable of handling high-dimensional input data.
As the key features, our code supports multi-threading, is available for multiple platforms, and
provides wrappers for several programming languages.

1998

We introduce a constructive, incremental learning system for regression problems that models data by means of spatially localized linear models. In contrast to other approaches, the size and shape of the receptive field of each locally linear model as well as the parameters of the locally linear model itself are learned independently, i.e., without the need for competition or any other kind of communication. Independent learning is accomplished by incrementally minimizing a weighted local cross validation error. As a result, we obtain a learning system that can allocate resources as needed while dealing with the bias-variance dilemma in a principled way. The spatial localization of the linear models increases robustness towards negative interference. Our learning system can be interpreted as a nonparametric adaptive bandwidth smoother, as a mixture of experts where the experts are trained in isolation, and as a learning system which profits from combining independent expert knowledge on the same problem. This paper illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.

Incremental learning of sensorimotor transformations in high dimensional spaces is one of the basic prerequisites for the success of autonomous robot devices as well as biological movement systems. So far, due to sparsity of data in high dimensional spaces, learning in such settings requires a significant amount of prior knowledge about the learning task, usually provided by a human expert. In this paper we suggest a partial revision of the view. Based on empirical studies, we observed that, despite being globally high dimensional and sparse, data distributions from physical movement systems are locally low dimensional and dense. Under this assumption, we derive a learning algorithm, Locally Adaptive Subspace Regression, that exploits this property by combining a dynamically growing local dimensionality reduction technique as a preprocessing step with a nonparametric learning technique, locally weighted regression, that also learns the region of validity of the regression. The usefulness of the algorithm and the validity of its assumptions are illustrated for a synthetic data set, and for data of the inverse dynamics of human arm movements and an actual 7 degree-of-freedom anthropomorphic robot arm.

1994

This paper explores issues involved in implementing robot learning for a challenging dynamic task, using a case study from robot juggling. We use a memory-based local modeling approach (locally weighted regression) to represent a learned model of the task to be performed. Statistical tests are given to examine the uncertainty of a model, to optimize its prediction quality, and to deal with noisy and corrupted data. We develop an exploration algorithm that explicitly deals with prediction accuracy requirements during exploration. Using all these ingredients in combination with methods from optimal control, our robot achieves fast real-time learning of the task within 40 to 100 trials.
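
As a minimal sketch of memory-based locally weighted regression, the example below stores all training pairs and, for each query, fits a line by weighted least squares with Gaussian weights centred on the query. The one-dimensional input, the Gaussian kernel, the fixed `bandwidth`, and the function name `lwr_predict` are assumptions made for illustration; the statistical tests and the exploration algorithm mentioned in the abstract are not reproduced here.

```python
import math
from typing import Sequence


def lwr_predict(
    xs: Sequence[float], ys: Sequence[float], query: float, bandwidth: float = 0.5
) -> float:
    """Predict y at `query` from stored data with a locally weighted linear fit."""
    # Gaussian kernel weights: nearby stored points count more.
    w = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    if sw == 0.0:
        raise ValueError("query is too far from all stored data")
    # Weighted means of inputs and outputs.
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    # Weighted least-squares slope of the local line.
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 1e-12 else 0.0
    return my + slope * (query - mx)
```

Because the model is just the stored data, new trials can be added to `xs` and `ys` and are used by the very next prediction, which is what makes this family of methods attractive for fast learning within a limited number of trials.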

1994
