This paper presents work on vision based robotic grasping. The proposed method adopts a learning framework where prototypical grasping points are learnt from several examples and then used on novel objects. For representation purposes, we apply the concept of shape context and for learning we use a supervised learning approach in which the classifier is trained with labelled synthetic images. We evaluate and compare the performance of linear and non-linear classifiers. Our results show that a combination of a descriptor based on shape context with a non-linear classification algorithm leads to a stable detection of grasping points for a variety of objects.
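
As a rough illustration of the descriptor mentioned above, a shape context can be sketched as a log-polar histogram of where the other contour points lie relative to a reference point. The function name `shape_context`, the bin counts, and the radial range are assumptions chosen for this sketch; they are not taken from the paper.

```python
import numpy as np

def shape_context(points, index, n_r=5, n_theta=12):
    """Log-polar histogram of the other points as seen from points[index]."""
    pts = np.asarray(points, dtype=float)
    others = np.delete(pts, index, axis=0)
    diff = others - pts[index]
    r = np.hypot(diff[:, 0], diff[:, 1])
    theta = np.arctan2(diff[:, 1], diff[:, 0]) % (2 * np.pi)
    # Normalise radii by their mean so the descriptor does not depend on scale.
    r = r / r.mean()
    # Log-spaced radial bin edges (assumed range); clip so every point is counted.
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1)
    r = np.clip(r, r_edges[0], r_edges[-1])
    theta_edges = np.linspace(0.0, 2 * np.pi, n_theta + 1)
    hist, _, _ = np.histogram2d(r, theta, bins=[r_edges, theta_edges])
    return hist
```

Each of the `n - 1` remaining points falls into exactly one bin, so the histogram can be flattened and passed to a linear or non-linear classifier as a fixed-length feature vector.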

For complex robots such as humanoids, model-based control is highly beneficial for accurate tracking
while keeping negative feedback gains low for compliance. However, in such multi degree-of-freedom
lightweight systems, conventional identification of rigid body dynamics models using CAD data and
actuator models is inaccurate due to unknown nonlinear robot dynamic effects. An alternative method
is data-driven parameter estimation, but significant noise in measured and inferred variables affects it
adversely. Moreover, standard estimation procedures may give physically inconsistent results due to
unmodeled nonlinearities or insufficiently rich data. This paper addresses these problems, proposing
a Bayesian system identification technique for linear or piecewise linear systems. Inspired by Factor
Analysis regression, we develop a computationally efficient variational Bayesian regression algorithm
that is robust to ill-conditioned data, automatically detects relevant features, and identifies input and
output noise. We evaluate our approach on rigid body parameter estimation for various robotic systems,
achieving errors up to three times lower than those of other state-of-the-art machine learning methods.

In this work we present the first constrained stochastic optimal
feedback controller applied to a fully nonlinear, tendon-driven
index finger model. Our model also takes into account an
extensor mechanism, and muscle force-length and force-velocity
properties. We show this feedback controller is robust to noise
and perturbations to the dynamics, while successfully handling
the nonlinearities and high dimensionality of the system. By extending
prior methods, we are able to approximate physiological
realism by ensuring positivity of neural commands and tendon
tensions at all times, and thus can, for the first time, use the optimal control framework
to predict biologically plausible tendon tensions for a nonlinear
neuromuscular finger model.
METHODS
1 Muscle Model
The rigid-body triple pendulum finger model with slightly
viscous joints is actuated by Hill-type muscle models. Joint
torques are generated by the seven muscles of the index fin-

Robot learning methods which allow autonomous robots to adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise as only few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the new upcoming trend of humanoid robotics. If possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. To do so, we study two major components of such an approach: first, policy learning algorithms which can be applied in the general setting of motor skill learning, and, second, a theoretically well-founded general approach to representing the required control structures for task representation and execution.

We present a novel algorithm for efficient learning and feature selection in high-
dimensional regression problems. We arrive at this model through a modification of
the standard regression model, enabling us to derive a probabilistic version of the
well-known statistical regression technique of backfitting. Using the Expectation-
Maximization algorithm, along with variational approximation methods to overcome
intractability, we extend our algorithm to include automatic relevance detection
of the input features. This Variational Bayesian Least Squares (VBLS) approach
retains its simplicity as a linear model, but offers a novel statistically robust "black-box"
approach to generalized linear regression with high-dimensional inputs. It can
be easily extended to nonlinear regression and classification problems. In particular,
we derive the framework of sparse Bayesian learning, e.g., the Relevance Vector
Machine, with VBLS at its core, offering significant computational and robustness
advantages for this class of methods. We evaluate our algorithm on synthetic and
neurophysiological data sets, as well as on standard regression and classification
benchmark data sets, comparing it with other competitive statistical approaches
and demonstrating its suitability as a drop-in replacement for other generalized
linear regression techniques.
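
For readers unfamiliar with backfitting, the classical (non-probabilistic) procedure for a linear model can be sketched as below. This is only the textbook technique that the abstract says VBLS builds on, not the VBLS algorithm itself; the function name `backfit_linear` and the sweep count are assumptions of this sketch.

```python
import numpy as np

def backfit_linear(X, y, n_sweeps=200):
    """Classical backfitting: refit one coefficient at a time on partial residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_features = X.shape[1]
    b = np.zeros(n_features)
    for _ in range(n_sweeps):
        for j in range(n_features):
            # Partial residual: what all features except j leave unexplained.
            partial = y - X @ b + X[:, j] * b[j]
            # One-dimensional least squares of the partial residual on feature j.
            b[j] = (X[:, j] @ partial) / (X[:, j] @ X[:, j])
    return b
```

Each update costs only a one-dimensional regression and never inverts a matrix, which is what makes this style of iteration attractive when the input dimensionality is high.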

In the proceedings of American Control Conference (ACC 2010), 2010, clmc (article)

Abstract

We present a generalization of the classic Differential Dynamic Programming algorithm. We assume the existence of state- and control-dependent process noise, and proceed to derive the second-order expansion of the cost-to-go. Despite having quartic and cubic terms in the initial expression, we show that these vanish, leaving us with the same quadratic structure as standard DDP.

In a not too distant future, robots will be a natural part of
daily life in human society, providing assistance in many
areas ranging from clinical applications, education and care
giving, to normal household environments [1]. It is hard to
imagine that all possible tasks can be preprogrammed in such
robots. Robots need to be able to learn, either by themselves
or with the help of human supervision. Additionally, wear and
tear on robots in daily use needs to be automatically compensated
for, which requires a form of continuous self-calibration,
another form of learning. Finally, robots need to react to stochastic
and dynamic environments, i.e., they need to learn
how to optimally adapt to uncertainty and unforeseen
changes. Robot learning is going to be a key ingredient for the
future of autonomous robots.
While robot learning covers a rather large field, from learning
to perceive, to plan, to make decisions, etc., we will focus
this review on topics of learning control, in particular, as it is
concerned with learning control in simulated or actual physical
robots. In general, learning control refers to the process of
acquiring a control strategy for a particular control system and
a particular task by trial and error. Learning control is usually
distinguished from adaptive control [2] in that the learning system
can have rather general optimization objectives (not just,
e.g., minimal tracking error) and is permitted to fail during
the process of learning, while adaptive control emphasizes fast
convergence without failure. Thus, learning control resembles
the way that humans and animals acquire new movement
strategies, while adaptive control is a special case of learning
control that fulfills stringent performance constraints, e.g., as
needed in life-critical systems like airplanes.
Learning control has been an active topic of research for at
least three decades. However, given the lack of working robots
that actually use learning components, more work needs to be
done before robot learning will make it beyond the laboratory
environment. This article will survey some ongoing and past
activities in robot learning to assess where the field stands and
where it is going. We will largely focus on nonwheeled robots
and less on topics of state estimation, as typically explored in
wheeled robots [3]-[6], and we emphasize learning in continuous
state-action spaces rather than discrete state-action spaces [7], [8].
We will illustrate the different topics of robot learning with
examples from our own research with anthropomorphic and
humanoid robots.

We present a control architecture for fast quadruped locomotion over rough terrain. We approach the problem by decomposing
it into many sub-systems, in which we apply state-of-the-art learning, planning, optimization, and control techniques
to achieve robust, fast locomotion. Unique features of our control strategy include: (1) a system that learns optimal
foothold choices from expert demonstration using terrain templates, (2) a body trajectory optimizer based on the Zero-
Moment Point (ZMP) stability criterion, and (3) a floating-base inverse dynamics controller that, in conjunction with force
control, allows for robust, compliant locomotion over unperceived obstacles. We evaluate the performance of our controller
by testing it on the LittleDog quadruped robot, over a wide variety of rough terrains of varying difficulty levels. The
terrain that the robot was tested on includes rocks, logs, steps, barriers, and gaps, with obstacle sizes up to the leg length
of the robot. We demonstrate the generalization ability of this controller by presenting results from testing performed by
an independent external test team on terrain that has never been shown to us.
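
To make the ZMP stability criterion concrete, the sketch below uses the common cart-table simplification (constant center-of-mass height, no angular momentum about the center of mass). That simplification and the names `zmp_cart_table` and `zmp_inside_support` are assumptions for illustration only; the body trajectory optimizer described above is not reproduced here.

```python
GRAVITY = 9.81  # gravitational acceleration in m/s^2

def zmp_cart_table(com_pos, com_acc, com_height):
    """ZMP coordinate along one horizontal axis under the cart-table model."""
    return com_pos - (com_height / GRAVITY) * com_acc

def zmp_inside_support(zmp, support_min, support_max):
    """True if the ZMP lies within the support interval along this axis."""
    return support_min <= zmp <= support_max
```

With zero acceleration the ZMP coincides with the ground projection of the center of mass; accelerating the body forward shifts the ZMP backward, which is why a trajectory optimizer has to bound accelerations to keep the ZMP inside the support polygon.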

A distinct property of robot vision systems is that they are embodied. Visual information is extracted for the purpose of moving in and interacting with the environment. Thus, different types of perception-action cycles need to be implemented and evaluated.
In this paper, we study the problem of designing a vision system for the purpose of object grasping in everyday environments. This vision system is firstly targeted at the interaction with the world through recognition and grasping of objects and secondly at being an interface for the reasoning and planning module to the real world. The latter provides the vision system with a certain task that drives it and defines a specific context, i.e. search for or identify a certain object and analyze it for potential later manipulation. We deal with cases of: (i) known objects, (ii) objects similar to already known objects, and (iii) unknown objects. The perception-action cycle is connected to the reasoning system based on the idea of affordances. All three cases are also related to the state of the art and the terminology in the neuroscientific area.

Computational models of the neuromuscular system hold the potential to allow us to reach a deeper understanding of neuromuscular function and clinical rehabilitation by complementing experimentation. By serving as a means to distill and explore specific hypotheses, computational models emerge from prior experimental data and motivate future experimental work. Here we review computational tools used to understand neuromuscular function including musculoskeletal modeling, machine learning, control theory, and statistical model analysis. We conclude that these tools, when used in combination, have the potential to further our understanding of neuromuscular function by serving as a rigorous means to test scientific hypotheses in ways that complement and leverage experimental data.

Abstract The paper presents a two-layered system for (1) learning and encoding a periodic signal without any knowledge on its frequency and waveform, and (2) modulating the learned periodic trajectory in response to external events. The system is used to learn periodic tasks on a humanoid HOAP-2 robot. The first layer of the system is a dynamical system responsible for extracting the fundamental frequency of the input signal, based on adaptive frequency oscillators. The second layer is a dynamical system responsible for learning of the waveform based on a built-in learning algorithm. By combining the two dynamical systems into one system we can rapidly teach new trajectories to robots without any knowledge of the frequency of the demonstration signal. The system extracts and learns only one period of the demonstration signal. Furthermore, the trajectories are robust to perturbations and can be modulated to cope with a dynamic environment. The system is computationally inexpensive, works on-line for any periodic signal, requires no additional signal processing to determine the frequency of the input signal and can be applied in parallel to multiple dimensions. Additionally, it can adapt to changes in frequency and shape, e.g. to non-stationary signals, such as hand-generated signals and human demonstrations.

Locally-weighted regression is a computationally-efficient technique for non-linear regression.
However, for high-dimensional data, this technique becomes numerically brittle and computationally
too expensive if many local models need to be maintained simultaneously. Thus, local linear
dimensionality reduction combined with locally-weighted regression seems to be a promising solution.
In this context, we review linear dimensionality-reduction methods, compare their performance on nonparametric
locally-linear regression, and discuss their ability to extend to incremental learning. The
considered methods belong to the following three groups: (1) reducing dimensionality only on the input
data, (2) modeling the joint input-output data distribution, and (3) optimizing the correlation between
projection directions and output data. Group 1 contains principal component regression (PCR); group
2 contains principal component analysis (PCA) in joint input and output space, factor analysis, and
probabilistic PCA; and group 3 contains reduced rank regression (RRR) and partial least squares
(PLS) regression. Among the tested methods, only group 3 managed to achieve robust performance
even for a non-optimal number of components (factors or projection directions). In contrast, groups 1
and 2 failed for fewer components since these methods rely on the correct estimate of the true intrinsic
dimensionality. In group 3, PLS is the only method for which a computationally-efficient incremental
implementation exists. Thus, PLS appears to be ideally suited as a building block for a locally-weighted
regressor in which projection directions are incrementally added on the fly.
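
The contrast between group 1 and group 3 can be illustrated with a deliberately constructed toy problem, in which the output depends only on a low-variance input direction. The data, the single-component restriction, and the helper `first_component_fit` are assumptions of this sketch and are unrelated to the experiments reported in the paper.

```python
import numpy as np

def first_component_fit(X, y, direction):
    """Regress y on the 1-D projection of X onto `direction`; return predictions."""
    t = X @ direction
    return t * (t @ y) / (t @ t)

# Zero-mean toy data: column 0 has large variance but is irrelevant,
# column 1 has small variance and fully determines the output.
angles = 2 * np.pi * np.arange(100) / 100
X = np.column_stack([10.0 * np.cos(angles), np.sin(angles)])
y = X[:, 1].copy()

# PCR-style direction: largest input variance (first right singular vector).
_, _, vt = np.linalg.svd(X, full_matrices=False)
pcr_direction = vt[0]

# PLS-style direction: largest covariance between projected input and output.
pls_direction = X.T @ y
pls_direction = pls_direction / np.linalg.norm(pls_direction)

pcr_mse = np.mean((y - first_component_fit(X, y, pcr_direction)) ** 2)
pls_mse = np.mean((y - first_component_fit(X, y, pls_direction)) ** 2)
```

With a single component, the variance-based direction misses the output entirely while the covariance-based direction recovers it, which mirrors the observation that methods in group 3 tolerate too few components better than those in group 1.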

Recent experimental and theoretical work [1] investigated the neural control of contact transition between motion and force during tapping with the index finger as a nonlinear optimization problem. Such transitions from motion to well-directed contact force are a fundamental part of dexterous manipulation. There are 3 alternative hypotheses of how this transition could be accomplished by the nervous system as a function of changes in direction and magnitude of the torque vector controlling the finger. These hypotheses are 1) an initial change in direction with a subsequent change in magnitude of the torque vector; 2) an initial change in magnitude with a subsequent directional change of the torque vector; and 3) a simultaneous and proportionally equal change of both direction and magnitude of the torque vector. Experimental work in [2] shows that the nervous system selects the first strategy, and in [1] we suggest that this may in fact be the optimal strategy. In [4] the framework of Iterative Linear Quadratic Optimal Regulator (ILQR) was extended to incorporate motion and force control. However, our prior simulation work assumed direct and instantaneous control of joint torques, which ignores the known delays and filtering properties of skeletal muscle.
In this study, we implement an ILQR controller for a more biologically plausible biomechanical model of the index finger than [4], and add activation-contraction dynamics to the system to simulate muscle function. The planar biomechanical model includes the kinematics of the 3 joints while the applied torques are driven by activation-contraction dynamics with biologically plausible time constants [3]. In agreement with our experimental work [2], the task is to, within 500 ms, move the finger from a given resting configuration to a target configuration with a desired terminal velocity. ILQR not only stabilizes the finger dynamics according to the objective function, but also generates smooth joint space trajectories with minimal tuning and without an a-priori initial control policy (which is difficult to find for high-dimensional biomechanical systems).
Furthermore, the use of this optimal control framework and the addition of activation-contraction dynamics consider the full nonlinear dynamics of the index finger and produce a sequence of postures which are compatible with experimental motion data [2]. These simulations combined with prior experimental results suggest that optimal control is a strong candidate for the generation of finger movements prior to abrupt motion-to-force transitions.
This work is funded in part by grants NIH R01 0505520 and NSF EFRI-0836042 to Dr. Francisco J. Valero-Cuevas.
1. Venkadesan M, Valero-Cuevas FJ. Effects of neuromuscular lags on controlling contact transitions. Philosophical Transactions of the Royal Society A, 2008.
2. Venkadesan M, Valero-Cuevas FJ. Neural Control of Motion-to-Force Transitions with the Fingertip. J. Neurosci., Feb 2008; 28: 1366-1373.
3. Zajac. Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. Crit Rev Biomed Eng, 17
4. Li W, Valero-Cuevas FJ. Linear Quadratic Optimal Control of Contact Transition with Fingertip. ACC 2009.

Dexterous manipulation with a highly redundant movement system is one of the hallmarks of human
motor skills. From numerous behavioral studies, there is strong evidence that humans employ
compliant task space control, i.e., they focus control only on task variables while keeping redundant
degrees-of-freedom as compliant as possible. This strategy is robust towards unknown disturbances
and simultaneously safe for the operator and the environment. The theory of operational space control
in robotics aims to achieve similar performance properties. However, despite various compelling
theoretical lines of research, advanced operational space control is hardly found in actual robotics implementations,
in particular new kinds of robots like humanoids and service robots, which would strongly
profit from compliant dexterous manipulation. To analyze the pros and cons of different approaches
to operational space control, this paper focuses on a theoretical and empirical evaluation of different
methods that have been suggested in the literature, but also some new variants of operational space
controllers. We address formulations at the velocity, acceleration and force levels. First, we formulate
all controllers in a common notational framework, including quaternion-based orientation control, and
discuss some of their theoretical properties. Second, we present experimental comparisons of these
approaches on a seven-degree-of-freedom anthropomorphic robot arm with several benchmark tasks.
As an aside, we also introduce a novel parameter estimation algorithm for rigid body dynamics, which
ensures physical consistency, as this issue was crucial for our successful robot implementations. Our
extensive empirical results demonstrate that one of the simplified acceleration-based approaches can
be advantageous in terms of task performance, ease of parameter tuning, and general robustness and
compliance in face of inevitable modeling errors.

One of the most general frameworks for phrasing control problems for
complex, redundant robots is operational space control. However, while
this framework is of essential importance for robotics and well-understood
from an analytical point of view, it can be prohibitively hard to achieve
accurate control in face of modeling errors, which are inevitable in complex
robots, e.g., humanoid robots. In this paper, we suggest a learning
approach for operational space control as a direct inverse model learning
problem. A first important insight for this paper is that a physically correct
solution to the inverse problem with redundant degrees-of-freedom
does exist when learning of the inverse map is performed in a suitable
piecewise linear way. The second crucial component for our work is based
on the insight that many operational space controllers can be understood
in terms of a constrained optimal control problem. The cost function associated
with this optimal control problem allows us to formulate a learning
algorithm that automatically synthesizes a globally consistent desired
resolution of redundancy while learning the operational space controller.
From the machine learning point of view, this learning problem corresponds
to a reinforcement learning problem that maximizes an immediate
reward. We employ an expectation-maximization policy search algorithm
in order to solve this problem. Evaluations on a three degrees of freedom
robot arm are used to illustrate the suggested approach. The application
to a physically realistic simulator of the anthropomorphic SARCOS
Master arm demonstrates feasibility for complex high degree-of-freedom
robots. We also show that the proposed method works in the setting of
learning resolved motion rate control on a real, physical Mitsubishi PA-10
medical robotics arm.

In this paper we introduce an improved implementation of locally weighted projection regression
(LWPR), a supervised learning algorithm that is capable of handling high-dimensional input data.
As the key features, our code supports multi-threading, is available for multiple platforms, and
provides wrappers for several programming languages.

HFSP Journal Frontiers of Interdisciplinary Research in the Life Sciences, 1(2):115-126, 2007, clmc (article)

Abstract

Research in robotics has moved away from its primary focus on industrial
applications. The New Robotics is a vision that has been developed in past years
by our own university and many other national and international research
institutions and addresses how increasingly more human-like robots can live
among us and take over tasks where our current society has shortcomings. Elder
care, physical therapy, child education, search and rescue, and general
assistance in daily life situations are some of the examples that will benefit from
the New Robotics in the near future. With these goals in mind, research for the
New Robotics has to embrace a broad interdisciplinary approach, ranging from
traditional mathematical issues of robotics to novel issues in psychology,
neuroscience, and ethics. This paper outlines some of the important research
problems that will need to be resolved to make the New Robotics a reality.

While the predictive nature of the primate smooth pursuit system has been evident through several behavioural and neurophysiological experiments, few models have attempted to explain these results comprehensively. The model we propose in this paper is in line with previous models employing optimal control theory; however, we introduce two new hypotheses: (1) the medial superior temporal (MST) area in the cerebral cortex implements a recurrent neural network (RNN) in order to predict the current or future target velocity, and (2) a forward model of the target motion is acquired by on-line learning. We use simulation studies to demonstrate how our new model supports these hypotheses.

This paper introduces a provably stable learning adaptive control framework with statistical learning. The proposed algorithm employs nonlinear function approximation with automatic growth of the learning network according to the nonlinearities and the working domain of the control system. The unknown function in the dynamical system is approximated by piecewise linear models using a nonparametric regression technique. Local models are allocated as necessary and their parameters are optimized on-line. Inspired by composite adaptive control methods, the proposed learning adaptive control algorithm uses both the tracking error and the estimation error to update the parameters. We first discuss statistical learning of nonlinear functions, and motivate our choice of the locally weighted learning framework. Second, we begin with a class of first order SISO systems for theoretical development of our learning adaptive control framework, and present a stability proof including a parameter projection method that is needed to avoid potential singularities during adaptation. Then, we generalize our adaptive controller to higher order SISO systems, and discuss further extension to MIMO problems. Finally, we evaluate our theoretical control framework in numerical simulations to illustrate the effectiveness of the proposed learning adaptive controller for rapid convergence and high accuracy of control.

2004

This paper develops a general policy for learning relevant features of an imitation task. We restrict our study to imitation of manipulative tasks or of gestures. The imitation process is modeled as a hierarchical optimization system, which minimizes the discrepancy between two multi-dimensional datasets. To classify across manipulation strategies, we apply a probabilistic analysis to data in Cartesian and joint spaces. We determine a general metric that optimizes the policy of task reproduction, following strategy determination. The model successfully discovers strategies in six different imitative tasks and controls task reproduction by a full body humanoid robot.

Rhythmic movements, like walking, chewing, or scratching, are phylogenetically old motor behaviors found in many organisms, ranging from insects to primates. In contrast, discrete movements, like reaching, grasping, or kicking, are behaviors that have reached sophistication primarily in younger species, particularly in primates. Neurophysiological and computational research on arm motor control has focused almost exclusively on discrete movements, essentially assuming similar neural circuitry for rhythmic tasks. In contrast, many behavioral studies focused on rhythmic models, subsuming discrete movement as a special case. Here, using a human functional neuroimaging experiment, we show that in addition to areas activated in rhythmic movement, discrete movement involves several higher cortical planning areas, even though both movement conditions were confined to the same single wrist joint. These results provide the first neuroscientific evidence that rhythmic arm movement cannot be part of a more general discrete movement system, and may require separate neurophysiological and theoretical treatment.

In this paper, we introduce a framework for learning biped locomotion using dynamical movement primitives based on non-linear oscillators. Our ultimate goal is to establish a design principle of a controller in order to achieve natural human-like locomotion. We suggest dynamical movement primitives as a central pattern generator (CPG) of a biped robot, an approach we have previously proposed for learning and encoding complex human movements. Demonstrated trajectories are learned through movement primitives by locally weighted regression, and the frequency of the learned trajectories is adjusted automatically by a novel frequency adaptation algorithm based on phase resetting and entrainment of coupled oscillators. Numerical simulations and experimental implementation on a physical robot demonstrate the effectiveness of the proposed locomotion controller.

In this paper, we present our theoretical investigations of the technique of feedback error learning (FEL) from the viewpoint of adaptive control. We first discuss the relationship between FEL and nonlinear adaptive control with adaptive feedback linearization, and show that FEL can be interpreted as a form of nonlinear adaptive control. Second, we present a Lyapunov analysis suggesting that the condition of strictly positive realness (SPR) associated with the tracking error dynamics is a sufficient condition for asymptotic stability of the closed-loop dynamics. Specifically, for a class of second order SISO systems, we show that this condition reduces to KD^2 > KP, where KP and KD are positive position and velocity feedback gains, respectively. Moreover, we provide a 'passivity'-based stability analysis which suggests that SPR of the tracking error dynamics is a necessary and sufficient condition for asymptotic hyperstability. Thus, the condition KD^2 > KP mentioned above is not only a sufficient but also necessary condition to guarantee asymptotic hyperstability of FEL, i.e. the tracking error is bounded and asymptotically converges to zero. As a further point, we explore the adaptive control and FEL framework for feedforward control formulations, and derive an additional sufficient condition for asymptotic stability in the sense of Lyapunov. Finally, we present numerical simulations to illustrate the stability properties of FEL obtained from our mathematical analysis.
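
The gain condition can be explored numerically under an explicit assumption. The abstract does not state the transfer function involved; purely for illustration, the sketch below assumes H(s) = (KD s + KP) / (s^2 + KD s + KP), for which the real part on the imaginary axis works out to (KP^2 + (KD^2 - KP) w^2) / |denominator|^2 and is therefore positive at all frequencies when KD^2 > KP. The function name `min_real_part` and the frequency grid are likewise hypothetical, and a finite grid is a sanity check rather than a proof.

```python
import numpy as np

def min_real_part(kp, kd, omegas):
    """Smallest real part of the assumed H(jw) over the given frequencies."""
    s = 1j * np.asarray(omegas, dtype=float)
    # Assumed error-dynamics transfer function (not taken from the paper).
    h = (kd * s + kp) / (s**2 + kd * s + kp)
    return h.real.min()

# Logarithmic frequency grid from 0.01 to 1000 rad/s.
omegas = np.logspace(-2, 3, 2000)

# Gains satisfying KD^2 > KP versus gains violating it.
satisfied = min_real_part(kp=1.0, kd=2.0, omegas=omegas)
violated = min_real_part(kp=4.0, kd=1.0, omegas=omegas)
```

Under this assumed form, `satisfied` stays positive across the grid, while `violated` becomes negative once the frequency exceeds roughly KP / sqrt(KP - KD^2).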

2003

Philosophical Transactions of the Royal Society of London: Series B, Biological Sciences, 358(1431):537-547, 2003, clmc (article)

Abstract

Movement imitation requires a complex set of mechanisms that map an observed movement of a teacher onto one's own movement apparatus. Relevant problems include movement recognition, pose estimation, pose tracking, body correspondence, coordinate transformation from external to egocentric space, matching of observed against previously learned movement, resolution of redundant degrees-of-freedom that are unconstrained by the observation, suitable movement representations for imitation, modularization of motor control, etc. All of these topics by themselves are active research problems in computational and neurobiological sciences, such that their combination into a complete imitation system remains a daunting undertaking - indeed, one could argue that we need to understand the complete perception-action loop. As a strategy to untangle the complexity of imitation, this paper will examine imitation purely from a computational point of view, i.e. we will review statistical and mathematical approaches that have been suggested for tackling parts of the imitation problem, and discuss their merits, disadvantages and underlying principles. Given the focus on action recognition of other contributions in this special issue, this paper will primarily emphasize the motor side of imitation, assuming that a perceptual system has already identified important features of a demonstrated movement and created their corresponding spatial information. Based on the formalization of motor control in terms of control policies and their associated performance criteria, useful taxonomies of imitation learning can be generated that clarify different approaches and future research directions.

2002

Locally weighted learning (LWL) is a class of techniques from nonparametric statistics that provides useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of robotic systems. This paper introduces several LWL algorithms that have been tested successfully in real-time learning of complex robot tasks. We discuss two major classes of LWL: memory-based LWL, and purely incremental LWL that does not need to remember any data explicitly. In contrast to the traditional belief that LWL methods cannot work well in high-dimensional spaces, we provide new algorithms that have been tested on learning problems with up to 90 dimensions. The applicability of our LWL algorithms is demonstrated in various robot learning examples, including the learning of devil-sticking, pole-balancing by a humanoid robot arm, and inverse-dynamics learning for a seven and a 30 degree-of-freedom robot. In all these examples, the application of our statistical neural network techniques allowed either faster or more accurate acquisition of motor control than classical control engineering.
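To make the memory-based variant concrete, here is a minimal sketch of locally weighted linear regression in one dimension. It is a generic textbook formulation, not the algorithms of the paper: the Gaussian kernel, the fixed bandwidth, and the function name `lwr_predict` are assumptions chosen for illustration. For each query, every stored sample is weighted by its distance to the query, and a weighted least-squares line is fitted around the query point.

```python
import math

# Minimal sketch of memory-based locally weighted regression (1-D).
# ASSUMPTIONS: Gaussian kernel, fixed bandwidth, local linear model.


def lwr_predict(xs, ys, query, bandwidth):
    """Predict y at `query` from a weighted linear fit centred on the query."""
    sw = sx = sy = sxx = sxy = 0.0
    for x, y in zip(xs, ys):
        dx = x - query
        w = math.exp(-0.5 * (dx / bandwidth) ** 2)  # kernel weight
        sw += w
        sx += w * dx
        sy += w * y
        sxx += w * dx * dx
        sxy += w * dx * y
    # Solve the 2x2 weighted normal equations for the intercept a in
    # y ~ a + b * (x - query); the intercept is the prediction at the query.
    det = sw * sxx - sx * sx
    if det < 1e-12:
        return sy / sw  # degenerate case: fall back to a weighted mean
    return (sxx * sy - sx * sxy) / det


if __name__ == "__main__":
    xs = [0.1 * i for i in range(50)]
    ys = [math.sin(x) for x in xs]
    print(lwr_predict(xs, ys, 1.0, 0.2))
```

A smaller bandwidth reduces bias but uses fewer effective samples per query, which is the trade-off that adaptive-bandwidth variants address.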

In recent years, an increasing number of research projects investigated whether the central nervous system employs internal models in motor control. While inverse models in the control loop can be identified more readily in both motor behavior and the firing of single neurons, providing direct evidence for the existence of forward models is more complicated. In this paper, we will discuss such an identification of forward models in the context of the visuomotor control of an unstable dynamic system, the balancing of a pole on a finger. Pole balancing imposes stringent constraints on the biological controller, as it needs to cope with the large delays of visual information processing while keeping the pole at an unstable equilibrium. We hypothesize various model-based and non-model-based control schemes of how visuomotor control can be accomplished in this task, including Smith Predictors, predictors with Kalman filters, tapped-delay line control, and delay-uncompensated control. Behavioral experiments with human participants allow exclusion of most of the hypothesized control schemes. In the end, our data support the existence of a forward model in the sensory preprocessing loop of control. As an important part of our research, we will provide a discussion of when and how forward models can be identified and also the possible pitfalls in the search for forward models in control.

2001

Sensory-motor integration is one of the key issues in robotics. In this paper, we propose an approach to rhythmic arm movement control that is synchronized with an external signal by exploiting a simple neural oscillator network. Trajectory generation by the neural oscillator is a biologically inspired method that allows us to generate a smooth and continuous trajectory. The parameter tuning of the oscillators is used to generate a synchronized movement with wide intervals. We apply the method to drumming as an example task. By using this method, the robot can realize synchronized drumming with wide drumming intervals in real time. The paper also shows the experimental results of drumming by a humanoid robot.

The 2/3 power law, the nonlinear relationship between tangential velocity and radius of curvature of the end-effector trajectory, has been suggested as a fundamental constraint of the central nervous system in the formation of rhythmic endpoint trajectories. However, studies on the 2/3 power law have largely been confined to planar drawing patterns of relatively small size. With the hypothesis that this strategy overlooks nonlinear effects that are constitutive in movement generation, the present experiments tested the validity of the power law in elliptical patterns which were not confined to a planar surface and which were performed by the unconstrained 7-DOF arm with significant variations in pattern size and workspace orientation. Data were recorded from five human subjects, and the seven joint angles and the endpoint trajectories were analyzed. Additionally, an anthropomorphic 7-DOF robot arm served as a "control subject" whose endpoint trajectories were generated on the basis of the human joint angle data, modeled as simple harmonic oscillations. Analyses of the endpoint trajectories demonstrate that the power law is systematically violated with increasing pattern size, in both exponent and the goodness of fit. The origins of these violations can be explained analytically based on smooth rhythmic trajectory formation and the kinematic structure of the human arm. We conclude that in unconstrained rhythmic movements, the power law seems to be a by-product of a movement system that favors smooth trajectories, and that it is unlikely to serve as a primary movement generating principle. Our data instead suggest that subjects employed smooth oscillatory pattern generators in joint space to realize the required movement patterns.
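For readers unfamiliar with the power law, the sketch below shows the idealized planar case in which it holds exactly. Written in terms of tangential velocity v and radius of curvature R, the law reads v = K R^(1/3), which is equivalent to an exponent of 2/3 when stated for angular velocity and curvature. For an ellipse traced by harmonic motion, x = a cos t, y = b sin t, the curvature is ab / v^3, so v = (ab)^(1/3) R^(1/3). The semi-axes, sample count, and function names are illustrative assumptions. The sketch does not reproduce the violations reported for large, non-planar patterns.

```python
import math

# Minimal sketch: fit the power-law exponent for a planar ellipse traced
# with harmonic motion. ASSUMPTIONS: semi-axes a, b and the sampling are
# illustrative; this is the idealized case, not the experimental data.


def ellipse_speed_and_radius(a, b, t):
    """Tangential speed and radius of curvature of (a cos t, b sin t)."""
    dx, dy = -a * math.sin(t), b * math.cos(t)
    ddx, ddy = -a * math.cos(t), -b * math.sin(t)
    speed = math.hypot(dx, dy)
    curvature = abs(dx * ddy - dy * ddx) / speed ** 3
    return speed, 1.0 / curvature


def fit_power_law(speeds, radii):
    """Least-squares fit of log v = log K + beta * log R; returns (K, beta)."""
    lx = [math.log(r) for r in radii]
    ly = [math.log(v) for v in speeds]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    beta = sxy / sxx
    return math.exp(my - beta * mx), beta


if __name__ == "__main__":
    a, b = 0.3, 0.1
    samples = [ellipse_speed_and_radius(a, b, 2 * math.pi * k / 200)
               for k in range(200)]
    gain, beta = fit_power_law([s for s, _ in samples],
                               [r for _, r in samples])
    print(gain, beta)
```

The same log-log regression can be applied to measured endpoint data, where deviations of the fitted exponent from 1/3 and a poor fit quality indicate violations of the law.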

Labeled Graph Matching (LGM) has been shown to be successful in numerous object vision tasks. This method is the basis for arguably the best face recognition system in the world. We present an algorithm for visual pattern recognition that is an extension of LGM ("LGM+"). We compare the performance of LGM and LGM+ algorithms with a state of the art statistical method based on Mutual Information Maximization (MIM). We present an adaptation of the MIM method for multi-dimensional Gabor wavelet features. The three pattern recognition methods were evaluated on an object detection task, using a set of stimuli on which none of the methods had been tested previously. The results indicate that while the performance of the MIM method operating upon Gabor wavelets is superior to the same method operating on pixels and to LGM, it is surpassed by LGM+. LGM+ offers a significant improvement in performance over LGM without losing LGM's virtues of simplicity, biological plausibility, and a computational cost that is 2-3 orders of magnitude lower than that of the MIM algorithm.

Oculomotor control in a humanoid robot faces similar problems as biological oculomotor systems, i.e. the stabilization of gaze in the face of unknown perturbations of the body, selective attention, stereo vision, and dealing with large information processing delays. Given the nonlinearities of the geometry of binocular vision as well as the possible nonlinearities of the oculomotor plant, it is desirable to accomplish accurate control of these behaviors through learning approaches. This paper develops a learning control system for the phylogenetically oldest behaviors of oculomotor control, the stabilization reflexes of gaze. In a step-wise procedure, we demonstrate how control-theoretically reasonable choices of control components result in an oculomotor control system that resembles the known functional anatomy of the primate oculomotor system. The core of the learning system is derived from the biologically inspired principle of feedback-error learning combined with a state-of-the-art non-parametric statistical learning network. With this circuitry, we demonstrate that our humanoid robot is able to acquire high performance visual stabilization reflexes after about 40 s of learning despite significant nonlinearities and processing delays in the system.

Rhythmically bouncing a ball with a racket was investigated and modeled with a nonlinear map. Model analyses provided a variable defining a dynamically stable solution that obviates computationally expensive corrections. Three experiments evaluated whether dynamic stability is optimized and what perceptual support is necessary for stable behavior. Two hypotheses were tested: (a) Performance is stable if racket acceleration is negative at impact, and (b) variability is lowest at an impact acceleration between -4 and -1 m/s^2. In Experiment 1 participants performed the task, eyes open or closed, bouncing a ball confined to a 1-dimensional trajectory. Experiment 2 eliminated constraints on racket and ball trajectory. Experiment 3 excluded visual or haptic information. Movements were performed with negative racket accelerations in the range of highest stability. Performance with eyes closed was more variable, leaving acceleration unaffected. With haptic information, performance was more stable than with visual information alone.

Oculomotor control in a humanoid robot faces similar problems as biological oculomotor systems, i.e., capturing targets accurately on a very narrow fovea, dealing with large delays in the control system, the stabilization of gaze in the face of unknown perturbations of the body, selective attention, and the complexity of stereo vision. In this paper, we suggest control circuits to realize three of the most basic oculomotor behaviors and their integration - the vestibulo-ocular and optokinetic reflex (VOR-OKR) for gaze stabilization, smooth pursuit for tracking moving objects, and saccades for overt visual attention. Each of these behaviors and the mechanism for their integration was derived with inspiration from computational theories as well as behavioral and physiological data in neuroscience. Our implementations on a humanoid robot demonstrate good performance of the oculomotor behaviors, which proves to be a viable strategy to explore novel control mechanisms for humanoid robotics. Conversely, insights gained from our models have been able to directly influence views and provide new directions for computational neuroscience research.

2000

We report on our empirical studies of a new controller for a two-link brachiating robot. Motivated by the pendulum-like motion of an ape's brachiation, we encode this task as the output of a "target dynamical system." Numerical simulations indicate that the resulting controller solves a number of brachiation problems that we term the "ladder," "swing-up," and "rope" problems. Preliminary analysis provides some explanation for this success. The proposed controller is implemented on a physical system in our laboratory. The robot achieves behaviors including "swing locomotion" and "swing up" and is capable of continuous locomotion over several rungs of a ladder. We discuss a number of formal questions whose answers will be required to gain a full understanding of the strengths and weaknesses of this approach.

The study investigates a single-joint movement task that combines a translatory and a cyclic component, with the objective of investigating the interaction of discrete and rhythmic movement elements. Participants performed an elbow movement in the horizontal plane, oscillating at a prescribed frequency around one target and shifting to a second target upon a trigger signal, without stopping the oscillation. Analyses focused on extracting the mutual influences of the rhythmic and the discrete component of the task. Major findings are: (1) The onset of the discrete movement was confined to a limited phase window in the rhythmic cycle. (2) Its duration was influenced by the period of oscillation. (3) The rhythmic oscillation was "perturbed" by the discrete movement as indicated by phase resetting. On the basis of these results we propose a model for the coordination of discrete and rhythmic actions. For rhythmic movements an oscillatory pattern generator is developed following models of half-center oscillations (K. Matsuoka, Sustained oscillations generated by mutually inhibiting neurons with adaptations, Biological Cybernetics 52 (1985) 367-376; Mechanisms of frequency and pattern control in the neural rhythm generators, Biological Cybernetics 56 (1987) 345-353). For discrete movements a point attractor dynamics is developed close to the VITE model (D. Bullock, S. Grossberg, The VITE model: a neural command circuit for generating arm and articulated trajectories, in: J.A.S. Kelso, A.J. Mandel, M. F. Shlesinger (Eds.), Dynamic Patterns in Complex Systems. World Scientific. Singapore. 1988. pp. 305-326). For each joint degree of freedom both pattern generators co-exist but exert mutual inhibition onto each other. The suggested modeling framework provides a unified account for both discrete and rhythmic movements on the basis of neuronal circuitry. Simulation results demonstrated that the effects observed in human performance can be replicated using the two pattern generators with a mutually inhibiting coupling.
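As a rough illustration of the rhythmic pattern generator mentioned above, the following sketch simulates a two-neuron half-center oscillator of the Matsuoka type: two neurons with self-adaptation that inhibit each other. It is not the model of the paper. The parameter values, the Euler integration, and the function names are assumptions chosen so that the unit oscillates, and the discrete (point attractor) generator and the coupling between the two generators are omitted.

```python
# Minimal sketch of a two-neuron mutually inhibiting oscillator
# (Matsuoka-type). ASSUMPTIONS: all parameter values, the initial state,
# and the forward-Euler step are illustrative, not taken from the paper.


def simulate_half_center(duration=20.0, dt=0.001, tau=0.25, tau_adapt=0.5,
                         beta=2.5, w=2.5, c=1.0):
    """Return the output y1 - y2 of the oscillator sampled every dt seconds."""
    x1, x2 = 0.1, 0.0      # membrane states (asymmetric start breaks symmetry)
    v1, v2 = 0.0, 0.0      # adaptation (fatigue) states
    output = []
    for _ in range(int(duration / dt)):
        y1, y2 = max(0.0, x1), max(0.0, x2)   # rectified firing rates
        dx1 = (-x1 - beta * v1 - w * y2 + c) / tau
        dx2 = (-x2 - beta * v2 - w * y1 + c) / tau
        dv1 = (-v1 + y1) / tau_adapt
        dv2 = (-v2 + y2) / tau_adapt
        x1 += dt * dx1
        x2 += dt * dx2
        v1 += dt * dv1
        v2 += dt * dv2
        output.append(y1 - y2)
    return output


def count_sign_changes(signal):
    """Number of sign flips in the signal, ignoring exact zeros."""
    changes, previous = 0, 0
    for value in signal:
        sign = (value > 0) - (value < 0)
        if sign != 0:
            if previous != 0 and sign != previous:
                changes += 1
            previous = sign
    return changes


if __name__ == "__main__":
    trace = simulate_half_center()
    print(count_sign_changes(trace[len(trace) // 2:]))
```

The alternating output y1 - y2 can serve as a rhythmic drive for a joint; in a fuller model, a discrete generator would inhibit and be inhibited by this unit.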

On the basis of a modified bouncing-ball model, we investigated whether humans utilize principles of dynamic stability in their performance of a similar movement task. Stability analyses of the model provided predictions about conditions indicative of a dynamically stable period-one regime. In a series of experiments, human subjects bounced a ball rhythmically on a racket and displayed these conditions, supporting the view that they attuned to and exploited the dynamic stability properties of the task.

1999

This review will focus on two recent developments in artificial intelligence and neural computation: learning from imitation and the development of humanoid robots. It will be postulated that the study of imitation learning offers a promising route to gain new insights into mechanisms of perceptual motor control that could ultimately lead to the creation of autonomous humanoid robots. This hope is justified because imitation learning channels research efforts towards three important issues: efficient motor learning, the connection between action and perception, and modular motor control in the form of movement primitives. In order to make these points, first, a brief review of imitation learning will be given from the view of psychology and neuroscience. In these fields, representations and functional connections between action and perception have been explored that contribute to the understanding of motor acts of other beings. The recent discovery that some areas in the primate brain are active during both movement perception and execution provided a first idea of the possible neural basis of imitation. Secondly, computational approaches to imitation learning will be described, initially from the perspective of traditional AI and robotics, and then with a focus on neural network models and statistical learning research. Parallels and differences between biological and computational approaches to imitation will be highlighted. The review will end with an overview of current projects that actually employ imitation learning for humanoid robots.

While it is generally assumed that complex movements consist of a sequence of simpler units, the quest to define these units of action, or movement primitives, still remains an open question. In this context, two hypotheses of movement segmentation of endpoint trajectories in 3D human drawing movements are re-examined: (1) the stroke-based segmentation hypothesis based on the results that the proportionality coefficient of the 2/3 power law changes discontinuously with each new "stroke", and (2) the segmentation hypothesis inferred from the observation of piecewise planar endpoint trajectories of 3D drawing movements. In two experiments human subjects performed a set of elliptical and figure-8 patterns of different sizes and orientations using their whole arm in 3D. The kinematic characteristics of the endpoint trajectories and the seven joint angles of the arm were analyzed. While the endpoint trajectories produced similar segmentation features as reported in the literature, analyses of the joint angles show no obvious segmentation but rather continuous oscillatory patterns. By approximating the joint angle data of human subjects with sinusoidal trajectories, and by implementing this model on a 7-degree-of-freedom anthropomorphic robot arm, it is shown that such a continuous movement strategy can produce exactly the same features as observed by the above segmentation hypotheses. The origin of this apparent segmentation of endpoint trajectories is traced back to the nonlinear transformations of the forward kinematics of human arms. The presented results demonstrate that principles of discrete movement generation may not be reconciled with those of rhythmic movement as easily as has been previously suggested, while the generalization of nonlinear pattern generators to arm movements can offer an interesting alternative to approach the question of units of action.

1998

We introduce a constructive, incremental learning system for regression problems that models data by means of spatially localized linear models. In contrast to other approaches, the size and shape of the receptive field of each locally linear model as well as the parameters of the locally linear model itself are learned independently, i.e., without the need for competition or any other kind of communication. Independent learning is accomplished by incrementally minimizing a weighted local cross validation error. As a result, we obtain a learning system that can allocate resources as needed while dealing with the bias-variance dilemma in a principled way. The spatial localization of the linear models increases robustness towards negative interference. Our learning system can be interpreted as a nonparametric adaptive bandwidth smoother, as a mixture of experts where the experts are trained in isolation, and as a learning system which profits from combining independent expert knowledge on the same problem. This paper illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.

Incremental learning of sensorimotor transformations in high dimensional spaces is one of the basic prerequisites for the success of autonomous robot devices as well as biological movement systems. So far, due to sparsity of data in high dimensional spaces, learning in such settings requires a significant amount of prior knowledge about the learning task, usually provided by a human expert. In this paper we suggest a partial revision of the view. Based on empirical studies, we observed that, despite being globally high dimensional and sparse, data distributions from physical movement systems are locally low dimensional and dense. Under this assumption, we derive a learning algorithm, Locally Adaptive Subspace Regression, that exploits this property by combining a dynamically growing local dimensionality reduction technique as a preprocessing step with a nonparametric learning technique, locally weighted regression, that also learns the region of validity of the regression. The usefulness of the algorithm and the validity of its assumptions are illustrated for a synthetic data set, and for data of the inverse dynamics of human arm movements and an actual 7 degree-of-freedom anthropomorphic robot arm.

1997
