Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC), pages: 5193-5200, IEEE, IEEE Conference on Decision and Control, December 2017 (conference)

Abstract

Finding optimal feedback controllers for nonlinear dynamic systems from data
is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful
framework for direct controller tuning from experimental trials. For selecting
the next query point and finding the global optimum, BO relies on a
probabilistic description of the latent objective function, typically a
Gaussian process (GP). As is shown herein, GPs with a common kernel choice can,
however, lead to poor learning outcomes on standard quadratic control problems.
For a first-order system, we construct two kernels that specifically leverage
the structure of the well-known Linear Quadratic Regulator (LQR), yet retain
the flexibility of Bayesian nonparametric learning. Simulations of uncertain
linear and nonlinear systems demonstrate that the LQR kernels yield superior
learning performance.

Recent approaches in robotics follow the insight that perception is facilitated by interactivity with the environment. These approaches are subsumed under the term of Interactive Perception (IP). We argue that IP provides the following benefits: (i) any type of forceful interaction with the environment creates a new type of informative sensory signal that would otherwise not be present and (ii) any prior knowledge about the nature of the interaction supports the interpretation of the signal. This is facilitated by knowledge of the regularity in the combined space of sensory information and action parameters. The goal of this survey is to postulate this as a principle and collect evidence in support by analyzing and categorizing existing work in this area. We also provide an overview of the most important applications of Interactive Perception. We close this survey by discussing the remaining open questions. Thereby, we hope to define a field and inspire future work.

We propose a novel long-term optimization criterion to improve the robustness of model-based reinforcement learning in real-world scenarios. Learning a dynamics model to derive a solution promises much greater data-efficiency and reusability compared to model-free alternatives. In practice, however, modelbased RL suffers from various imperfections such as noisy input and output data, delays and unmeasured (latent) states. To achieve higher resilience against such effects, we propose to optimize a generative long-term prediction model directly with respect to the likelihood of observed trajectories as opposed to the common approach of optimizing a dynamics model for one-step-ahead predictions. We evaluate the proposed method on several artificial and real-world benchmark problems and compare it to PILCO, a model-based RL framework, in experiments on a manipulation robot. The results show that the proposed method is competitive compared to state-of-the-art model learning methods. In contrast to these more involved models, our model can directly be employed for policy search and outperforms a baseline method in the robot experiment.

Understanding physical phenomena is a key component of human intelligence and enables physical interaction with previously unseen environments. In this paper, we study how an artificial agent can autonomously acquire this intuition through interaction with the environment. We created a synthetic block stacking environment with physics simulation in which the agent can learn a policy end-to-end through trial and error. Thereby, we bypass to explicitly model physical knowledge within the policy. We are specifically interested in tasks that require the agent to reach a given goal state that may be different for every new trial. To this end, we propose a deep reinforcement learning framework that learns policies which are parametrized by a goal. We validated the model on a toy example navigating in a grid world with different target positions and in a block stacking task with different target structures of the final tower. In contrast to prior work, our policies show better generalization across different goals.

In Proceedings of the IEEE/RSJ International Conference of Intelligent Robots and Systems, September 2017 (inproceedings)Accepted

Abstract

We aim to reliably predict whether a grasp on a known object is successful before it is executed in the real world. There is an entire suite of grasp metrics that has already been developed which rely on precisely known contact points between object and hand. However, it remains unclear whether and how they may be combined into a general purpose grasp stability predictor. In this paper, we analyze these questions by leveraging a large scale database of simulated grasps on a wide variety of objects. For each grasp, we compute the value of seven metrics. Each grasp is annotated by human subjects with ground truth stability labels. Given this data set, we train several classification methods to find out whether there is some underlying, non-trivial structure in the data that is difficult to model manually but can be learned. Quantitative and qualitative results show the complexity of the prediction problem. We found that a good prediction performance critically depends on using a combination of metrics as input features. Furthermore, non-parametric and non-linear classifiers best capture the structure in the data.

An event-based state estimation approach for reducing communication in a networked control system is proposed. Multiple distributed sensor agents observe a dynamic process and sporadically transmit their measurements to estimator agents over a shared bus network. Local event-triggering protocols ensure that data is transmitted only when necessary to meet a desired estimation accuracy. The event-based design is shown to emulate the performance of a centralised state observer design up to guaranteed bounds, but with reduced communication. The stability results for state estimation are extended to the distributed control system that results when the local estimates are used for feedback control. Results from numerical simulations and hardware experiments illustrate the effectiveness of the proposed approach in reducing network communication.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 5295-5301, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), pages: 1557-1563, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2017 (inproceedings)

We propose a probabilistic filtering method which fuses joint measurements with depth images to yield a precise, real-time estimate of the end-effector pose in the camera frame. This avoids the need for frame transformations when using it in combination with visual object tracking methods.
Precision is achieved by modeling and correcting biases in the joint measurements as well as inaccuracies in the robot model, such as poor extrinsic camera calibration. We make our method computationally efficient through a principled combination of Kalman filtering of the joint measurements and asynchronous depth-image updates based on the Coordinate Particle Filter.
We quantitatively evaluate our approach on a dataset recorded from a real robotic platform, annotated with ground truth from a motion capture system. We show that our approach is robust and accurate even under challenging conditions such as fast motion, significant and long-term occlusions, and time-varying biases. We release the dataset along with open-source code of our approach to allow for quantitative comparison with alternative approaches.

Abstract Anticipation can enhance the capability of a robot in its interaction with humans, where the robot predicts the humans' intention for selecting its own action. We present a novel framework of anticipatory action selection for human-robot interaction, which is capable to handle nonlinear and stochastic human behaviors such as table tennis strokes and allows the robot to choose the optimal action based on prediction of the human partner's intention with uncertainty. The presented framework is generic and can be used in many human-robot interaction scenarios, for example, in navigation and human-robot co-manipulation. In this article, we conduct a case study on human-robot table tennis. Due to the limited amount of time for executing hitting movements, a robot usually needs to initiate its hitting movement before the opponent hits the ball, which requires the robot to be anticipatory based on visual observation of the opponent's movement. Previous work on Intention-Driven Dynamics Models (IDDM) allowed the robot to predict the intended target of the opponent. In this article, we address the problem of action selection and optimal timing for initiating a chosen action by formulating the anticipatory action selection as a Partially Observable Markov Decision Process (POMDP), where the transition and observation are modeled by the \{IDDM\} framework. We present two approaches to anticipatory action selection based on the \{POMDP\} formulation, i.e., a model-free policy learning method based on Least-Squares Policy Iteration (LSPI) that employs the \{IDDM\} for belief updates, and a model-based Monte-Carlo Planning (MCP) method, which benefits from the transition and observation model by the IDDM. Experimental results using real data in a simulated environment show the importance of anticipatory action selection, and that \{POMDPs\} are suitable to formulate the anticipatory action selection problem by taking into account the uncertainties in prediction. We also show that existing algorithms for POMDPs, such as \{LSPI\} and MCP, can be applied to substantially improve the robot's performance in its interaction with humans.

2012

In IEEE International Workshop on Haptic Audio Visual Environments and Games (HAVE), pages: 25-31, October 2012 (inproceedings)

Abstract

In this paper, we address some of the challenges that arise as model-mediated teleoperation is applied to systems with multiple degrees of freedom and multiple sensors. Specifically we use a system with position, force, and vision sensors to explore an environment geometry in two degrees of freedom. The inclusion of vision is proposed to alleviate the difficulties of estimating an increasing number of environment properties. Vision can furthermore increase the predictive nature of model-mediated teleoperation, by effectively predicting touch feedback before the slave is even in contact with the environment. We focus on the case of estimating the location and orientation of a local surface patch at the contact point between the slave and the environment. We describe the various information sources with their respective limitations and create a combined model estimator as part of a multi-d.o.f. model-mediated controller. An experiment demonstrates the feasibility and benefits of utilizing vision sensors in teleoperation.

In International Conference on Intelligent Robots and Systems, October 2012 (inproceedings)

Abstract

Building robots capable of long term autonomy has been a long standing goal of robotics research. Such systems must be capable of performing certain tasks with a high degree of robustness and repeatability. In the context of personal robotics, these tasks could range anywhere from retrieving
items from a refrigerator, loading a dishwasher, to setting up a dinner table. Given the complexity of tasks there are a multitude of failure scenarios that the robot can encounter, irrespective
of whether the environment is static or dynamic. For a robot to be successful in such situations, it would need to know how to recover from failures or when to ask a human for help.
This paper, presents a novel shared autonomy behavioral executive to addresses these issues. We demonstrate how this executive combines generalized logic based recovery and human
intervention to achieve continuous failure free operation. We tested the systems over 250 trials of two different use case experiments. Our current algorithm drastically reduced human intervention from 26% to 4% on the first experiment and 46% to 9% on the second experiment. This system provides a new dimension to robot autonomy, where robots can exhibit long term failure free operation with minimal human supervision. We also discuss how the system can be generalized.

In this paper, we present an approach towards autonomous grasping of objects according to their category and a given task. Recent advances in the field of object segmentation and categorization as well as task-based grasp inference have been leveraged by integrating them into one pipeline. This allows us to transfer task-specific grasp experience between objects of the same category. The effectiveness of the approach is demonstrated on the humanoid robot ARMAR-IIIa.

In Seventeenth International Conference on Artificial Intelligence and Statistics, La Palma, Canary Islands, Fifteenth International Conference on Artificial Intelligence and Statistics , April 2012 (inproceedings)

In this contribution we propose an inverse dynamics controller for a humanoid robot that exploits torque redundancy to minimize any combination of linear and quadratic costs in the contact forces and the commands. In addition the controller satisfies linear equality and inequality constraints in the contact forces and the commands such as torque limits, unilateral contacts or friction cones limits. The originality of our approach resides in the formulation of the problem as a quadratic program where we only need to solve for the control commands and where the contact forces are optimized implicitly. Furthermore, we do not need a structured representation of the dynamics of the robot (i.e. an explicit computation of the inertia matrix). It is in contrast with existing methods based on quadratic programs. The controller is then robust to uncertainty in the estimation of the dynamics model and the optimization is fast enough to be implemented in high bandwidth torque control loops that are increasingly available on humanoid platforms. We demonstrate properties of our controller with simulations of a human size humanoid robot.

Movement primitives as basis of movement planning and control have become a popular topic in recent years. The key idea of movement primitives is that a rather small set of stereotypical movements should suffice to create a large set of complex manipulation skills. An interesting side effect of stereotypical movement is that it also creates stereotypical sensory events, e.g., in terms of kinesthetic variables, haptic variables, or, if processed appropriately, visual variables. Thus, a movement primitive executed towards a particular object in the environment will associate a large number of sensory variables that are typical for this manipulation skill. These association can be used to increase robustness towards perturbations, and they also allow failure detection and switching towards other behaviors. We call such movement primitives augmented with sensory associations Associative Skill Memories (ASM). This paper addresses how ASMs can be acquired by imitation learning and how they can create robust manipulation skill by determining subsequent ASMs online to achieve a particular manipulation goal. Evaluation for grasping and manipulation with a Barrett WAM/Hand illustrate our approach.

The ability to grasp unknown objects is an important skill for personal robots, which has been addressed by many present and past research projects, but still remains an open problem. A crucial aspect of grasping is choosing an appropriate grasp configuration, i.e. the 6d pose of the hand relative to the object and its finger configuration. Finding feasible grasp configurations for novel objects, however, is challenging because of the huge variety in shape and size of these objects. Moreover, possible configurations also depend on the specific kinematics of the robotic arm and hand in use. In this paper, we introduce a new grasp selection algorithm able to find object grasp poses based on previously demonstrated grasps. Assuming that objects with similar shapes can be grasped in a similar way, we associate to each demonstrated grasp a grasp template. The template is a local shape descriptor for a possible grasp pose and is constructed using 3d information from depth sensors. For each new object to grasp, the algorithm then finds the best grasp candidate in the library of templates. The grasp selection is also able to improve over time using the information of previous grasp attempts to adapt the ranking of the templates. We tested the algorithm on two different platforms, the Willow Garage PR2 and the Barrett WAM arm which have very different hands. Our results show that the algorithm is able to find good grasp configurations for a large set of objects from a relatively small set of demonstrations, and does indeed improve its performance over time.

In this paper, we derive a probabilistic registration algorithm for object modeling and tracking. In many robotics applications, such as manipulation tasks, nonvisual information about the movement of the object is available, which we will combine with the visual information. Furthermore we do not only consider observations of the object, but we also take space into account which has been observed to not be part of the object. Furthermore we are computing a posterior distribution over the relative alignment and not a point estimate as typically done in for example Iterative Closest Point (ICP). To our knowledge no existing algorithm meets these three conditions and we thus derive a novel registration algorithm in a Bayesian framework. Experimental results suggest that the proposed methods perform favorably in comparison to PCL [1] implementations of feature mapping and ICP, especially if nonvisual information is available.

2008

One of the most general frameworks for phrasing control problems for
complex, redundant robots is operational space control. However, while
this framework is of essential importance for robotics and well-understood
from an analytical point of view, it can be prohibitively hard to achieve
accurate control in face of modeling errors, which are inevitable in com-
plex robots, e.g., humanoid robots. In this paper, we suggest a learning
approach for opertional space control as a direct inverse model learning
problem. A ï¬rst important insight for this paper is that a physically cor-
rect solution to the inverse problem with redundant degrees-of-freedom
does exist when learning of the inverse map is performed in a suitable
piecewise linear way. The second crucial component for our work is based
on the insight that many operational space controllers can be understood
in terms of a constrained optimal control problem. The cost function as-
sociated with this optimal control problem allows us to formulate a learn-
ing algorithm that automatically synthesizes a globally consistent desired
resolution of redundancy while learning the operational space controller.
From the machine learning point of view, this learning problem corre-
sponds to a reinforcement learning problem that maximizes an immediate
reward. We employ an expectation-maximization policy search algorithm
in order to solve this problem. Evaluations on a three degrees of freedom
robot arm are used to illustrate the suggested approach. The applica-
tion to a physically realistic simulator of the anthropomorphic SARCOS
Master arm demonstrates feasibility for complex high degree-of-freedom
robots. We also show that the proposed method works in the setting of
learning resolved motion rate control on real, physical Mitsubishi PA-10
medical robotics arm.

Dexterous manipulation with a highly redundant movement system is one of the hallmarks of hu-
man motor skills. From numerous behavioral studies, there is strong evidence that humans employ
compliant task space control, i.e., they focus control only on task variables while keeping redundant
degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances
and simultaneously safe for the operator and the environment. The theory of operational space con-
trol in robotics aims to achieve similar performance properties. However, despite various compelling
theoretical lines of research, advanced operational space control is hardly found in actual robotics imple-
mentations, in particular new kinds of robots like humanoids and service robots, which would strongly
profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches
to operational space control, this paper focuses on a theoretical and empirical evaluation of different
methods that have been suggested in the literature, but also some new variants of operational space
controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate
all controllers in a common notational framework, including quaternion-based orientation control, and
discuss some of their theoretical properties. Second, we present experimental comparisons of these
approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks.
As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which
ensures physical consistency, as this issue was crucial for our successful robot implementations. Our
extensive empirical results demonstrate that one of the simplified acceleration-based approaches can
be advantageous in terms of task performance, ease of parameter tuning, and general robustness and
compliance in face of inevitable modeling errors.

In Abstracts of the Eighteenth Annual Meeting of Neural Control of Movement (NCM), Naples, Florida, April 29-May 4, 2008, clmc (inproceedings)

Abstract

Force field experiments have been a successful paradigm for studying the principles of planning, execution, and learning in human arm movements. Subjects have been shown to cope with the disturbances generated by force fields by learning internal models of the underlying dynamics to predict disturbance effects or by increasing arm impedance (via co-contraction) if a predictive approach becomes infeasible.
Several studies have addressed the issue uncertainty in force field learning. Scheidt et al. demonstrated that subjects exposed to a viscous force field of fixed structure but varying strength (randomly changing from trial to trial), learn to adapt to the mean disturbance, regardless of the statistical distribution. Takahashi et al. additionally show a decrease in strength of after-effects after learning in the randomly varying environment. Thus they suggest that the nervous system adopts a dual strategy: learning an internal model of the mean of the random environment, while simultaneously increasing arm impedance to minimize the consequence of errors.
In this study, we examine what role variance plays in the learning of uncertain force fields. We use a 7 degree-of-freedom exoskeleton robot as a manipulandum (Sarcos Master Arm, Sarcos, Inc.), and apply a 3D viscous force field of fixed structure and strength randomly selected from trial to trial. Additionally, in separate blocks of trials, we alter the variance of the randomly selected strength multiplier (while keeping a constant mean). In each block, after sufficient learning has occurred, we apply catch trials with no force field and measure the strength of after-effects.
As expected in higher variance cases, results show increasingly smaller levels of after-effects as the variance is increased, thus implying subjects choose the robust strategy of increasing arm impedance to cope with higher levels of uncertainty. Interestingly, however, subjects show an increase in after-effect strength with a small amount of variance as compared to the deterministic (zero variance) case. This result implies that a small amount of variability aides in internal model formation, presumably a consequence of the additional amount of exploration conducted in the workspace of the task.

Real-time control of the endeffector of a humanoid robot in external coordinates requires
computationally efficient solutions of the inverse kinematics problem. In this context, this
paper investigates methods of resolved motion rate control (RMRC) that employ optimization
criteria to resolve kinematic redundancies. In particular we focus on two established techniques,
the pseudo inverse with explicit optimization and the extended Jacobian method. We prove that
the extended Jacobian method includes pseudo-inverse methods as a special solution. In terms of
computational complexity, however, pseudo-inverse and extended Jacobian differ significantly in
favor of pseudo-inverse methods. Employing numerical estimation techniques, we introduce a
computationally efficient version of the extended Jacobian with performance comparable to the
original version. Our results are illustrated in simulation studies with a multiple degree-offreedom
robot, and were evaluated on an actual 30 degree-of-freedom full-body humanoid robot.

In Abstracts of the Eighteenth Annual Meeting of Neural Control of Movement (NCM), Naples, Florida, April 29-May 4, 2008, clmc (inproceedings)

Abstract

Reinforcement learning (RL) - learning solely based on reward or cost
feedback - is widespread in robotics control and has been also suggested
as computational model for human motor control. In human motor control,
however, hardly any experiment studied reinforcement learning. Here, we
study learning based on visual cost feedback in a reaching task and did
three experiments: (1) to establish a simple enough experiment for RL,
(2) to study spatial localization of RL, and (3) to study the dependence
of RL on the cost function.
In experiment (1), subjects sit in front of a drawing tablet and look at
a screen onto which the drawing pen's position is projected. Beginning
from a start point, their task is to move with the pen through a target
point presented on screen. Visual feedback about the pen's position is
given only before movement onset. At the end of a movement, subjects get
visual feedback only about the cost of this trial. We choose as cost the
squared distance between target and virtual pen position at the target
line. Above a threshold value, the cost was fixed at this value. In the
mapping of the pen's position onto the screen, we added a bias (unknown
to subject) and Gaussian noise. As result, subjects could learn the
bias, and thus, showed reinforcement learning.
In experiment (2), we randomly altered the target position between three
different locations (three different directions from start point: -45,
0, 45). For each direction, we chose a different bias. As result,
subjects learned all three bias values simultaneously. Thus, RL can be
spatially localized.
In experiment (3), we varied the sensitivity of the cost function by
multiplying the squared distance with a constant value C, while keeping
the same cut-off threshold. As in experiment (2), we had three target
locations. We assigned to each location a different C value (this
assignment was randomized between subjects). Since subjects learned the
three locations simultaneously, we could directly compare the effect of
the different cost functions. As result, we found an optimal C value; if
C was too small (insensitive cost), learning was slow; if C was too
large (narrow cost valley), the exploration time was longer and learning
delayed. Thus, reinforcement learning in human motor control appears to
be sen

In this paper we introduce an improved implementation of locally weighted projection regression
(LWPR), a supervised learning algorithm that is capable of handling high-dimensional input data.
As the key features, our code supports multi-threading, is available for multiple platforms, and
provides wrappers for several programming languages.

Stochastic optimal control is a framework for computing control commands that lead to an optimal behavior under a given cost. Despite the long history of optimal control in engineering, it has been only recently applied to describe human motion. So far, stochastic optimal control has been mainly used in tasks that are already learned, such as reaching to a target. For learning, however, there are only few cases where optimal control has been applied. The main assumptions of stochastic optimal control that restrict its application to tasks after learning are the a priori knowledge of (1) a quadratic cost function (2) a state space model that captures the kinematics and/or dynamics of musculoskeletal system and (3) a measurement equation that models the proprioceptive and/or exteroceptive feedback. Under these assumptions, a sequence of control gains is computed that is optimal with respect to the prespecified cost function.
In our work, we relax the assumption of the a priori known cost function and provide a computational framework for modeling tasks that involve learning. Typically, a cost function consists of two parts: one part that models the task constraints, like squared distance to goal at movement endpoint, and one part that integrates over the squared control commands. In learning a task, the first part of this cost function will be adapted. We use an expectation-maximization scheme for learning: the expectation step optimizes the task constraints through gradient descent of a reward function and the maximizing step optimizes the control commands.
Our computational model is tested and compared with data given from a behavioral experiment. In this experiment, subjects sit in front of a drawing tablet and look at a screen onto which the drawing-pen's position is projected. Beginning from a start point, their task is to move with the pen through a target point presented on screen. Visual feedback about the pen's position is given only before movement onset. At the end of a movement, subjects get visual feedback only about the cost of this trial. In the mapping of the pen's position onto the screen, we added a bias (unknown to subject) and Gaussian noise. Therefore the cost is a function of this bias. The subjects were asked to reach to the target and minimize this cost over trials.
In this behavioral experiment, subjects could learn the bias and thus showed reinforcement learning. With our computational model, we could model the learning process over trials. Particularly, the dependence on parameters of the reward function (Gaussian width) and the modulation of movement variance over time were similar in experiment and model.

Local linearizations are ubiquitous in the control of robotic systems. Analytical methods, if available, can be used to obtain the linearization, but in complex robotics systems where the the dynamics and kinematics are often not faithfully obtainable, empirical linearization may be preferable. In this case, it is important to only use data for the local linearization that lies within a ``reasonable'' linear regime of the system, which can be defined from the Hessian at the point of the linearization -- a quantity that is not available without an analytical model. We introduce a Bayesian approach to solve statistically what constitutes a ``reasonable'' local regime. We approach this problem in the context local linear regression. In contrast to previous locally linear methods, we avoid cross-validation or complex statistical hypothesis testing techniques to find the appropriate local regime. Instead, we treat the parameters of the local regime probabilistically and use approximate Bayesian inference for their estimation. This approach results in an analytical set of iterative update equations that are easily implemented on real robotics systems for real-time applications. As in other locally weighted regressions, our algorithm also lends itself to complete nonlinear function approximation for learning empirical internal models. We sketch the derivation of our Bayesian method and provide evaluations on synthetic data and actual robot data where the analytical linearization was known.

The planning and execution of human arm movements is still unresolved. An ongoing controversy is whether we plan a movement in kinematic coordinates and convert these coordinates with an inverse internal model into motor commands (like muscle activation) or whether we combine a few muscle synergies or equilibrium points to move a hand, e.g., between two targets. The first hypothesis implies that a planner produces a desired end-effector position for all time points; the second relies on the dynamics of the muscular-skeletal system for a given control command to produce a continuous end-effector trajectory. To distinguish between these two possibilities, we use a visuomotor adaptation experiment.
Subjects moved a pen on a graphics tablet and observed the pen's mapped position onto a screen (subjects quickly adapted to this mapping). The task was to move a cursor between two points in a given time window. In the adaptation test, we manipulated the velocity profile of the cursor feedback such that the shape of the trajectories remained unchanged (for straight paths). If humans would use a kinematic plan and map at each time the desired end-effector position onto control commands, subjects should adapt to the above manipulation. In a similar experiment, Wolpert et al (1995) showed adaptation to changes in the curvature of trajectories. This result, however, cannot rule out a shift of an equilibrium point or an additional synergy activation between start and end point of a movement.
In our experiment, subjects did two sessions, one control without and one with velocity-profile manipulation. To skew the velocity profile of the cursor trajectory, we added to the current velocity, v, the function 0.8*v*cos(pi + pi*x), where x is the projection of the cursor position onto the start-goal line divided by the distance start to goal (x=0 at the start point).
As result, subjects did not adapt to this manipulation: for all subjects, the true hand motion was not significantly modified in a direction consistent with adaptation, despite that the visually presented motion differed significantly from the control motion.
One may still argue that this difference in motion was insufficient to be processed visually. Thus, as a control experiment, we replayed control and modified motions to the subjects and asked which of the two motions appeared 'more natural'. Subjects chose the unperturbed motion as more natural significantly better than chance.
In summary, for a visuomotor transformation task, the hypothesis of a planned continuous end-effector trajectory predicts adaptation to a modified velocity profile. The current experiment found no adaptation under such transformation.

1999

This review will focus on two recent developments in artificial intelligence and neural computation: learning from imitation and the development of humanoid robots. It will be postulated that the study of imitation learning offers a promising route to gain new insights into mechanisms of perceptual motor control that could ultimately lead to the creation of autonomous humanoid robots. This hope is justified because imitation learning channels research efforts towards three important issues: efficient motor learning, the connection between action and perception, and modular motor control in form of movement primitives. In order to make these points, first, a brief review of imitation learning will be given from the view of psychology and neuroscience. In these fields, representations and functional connections between action and perception have been explored that contribute to the understanding of motor acts of other beings. The recent discovery that some areas in the primate brain are active during both movement perception and execution provided a first idea of the possible neural basis of imitation. Secondly, computational approaches to imitation learning will be described, initially from the perspective of traditional AI and robotics, and then with a focus on neural network models and statistical learning research. Parallels and differences between biological and computational approaches to imitation will be highlighted. The review will end with an overview of current projects that actually employ imitation learning for humanoid robots.

1999

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments and to use this understanding to design future systems