In several research projects, we investigate data-driven approaches for optimal and robust control, with applications e.g. in robotics. By combining optimal -- a principled way of decision-making and control, with reinforcement learning for control designs, we are tackling various challenges arising in robotic systems.

Voluntary behavior of humans appears to be composed of small, elementary building blocks or behavioral primitives. While this modular organization seems crucial for the learning of complex motor skills and the flexible adaption of behavior to new circumstances, the problem of learning meaningful, compositional abstractions from sensorimotor experiences remains an open challenge. Here, we introduce a computational learning architecture, termed surprise-based behavioral modularization into event-predictive structures (SUBMODES), that explores behavior and identifies the underlying behavioral units completely from scratch. The SUBMODES architecture bootstraps sensorimotor exploration using a self-organizing neural controller. While exploring the behavioral capabilities of its own body, the system learns modular structures that predict the sensorimotor dynamics and generate the associated behavior. In line with recent theories of event perception, the system uses unexpected prediction error signals, i.e., surprise, to detect transitions between successive behavioral primitives. We show that, when applied to two robotic systems with completely different body kinematics, the system manages to learn a variety of complex behavioral primitives. Moreover, after initial self-exploration the system can use its learned predictive models progressively more effectively for invoking model predictive planning and goal-directed control in different tasks and environments.

We present an approach to identify concise equations from data using a
shallow neural network approach. In contrast to ordinary black-box
regression, this approach allows understanding functional relations and
generalizing them from observed data to unseen parts of the parameter
space. We show how to extend the class of learnable equations for a
recently proposed equation learning network to include divisions, and we
improve the learning and model selection strategy to be useful for
challenging real-world data. For systems governed by analytical
expressions, our method can in many cases identify the true underlying
equation and extrapolate to unseen domains. We demonstrate its
effectiveness by experiments on a cart-pendulum system, where only 2
random rollouts are required to learn the forward dynamics and
successfully achieve the swing-up task.

One of the challenges of this century is to understand the neural mechanisms behind cognitive control and learning.
Recent investigations propose biologically plausible synaptic mechanisms for self-organizing controllers, in the spirit of Hebbian learning. In particular, differential extrinsic plasticity (DEP) [Der and Martius, PNAS 2015], has proven to enable embodied agents to self-organize their individual sensorimotor development, and generate highly coordinated behaviors during their interaction with the environment. These behaviors are attractors of a dynamical system. In this paper, we use the DEP rule to generate attractors and we combine it with a “repelling potential” which allows the system to actively explore all its attractor behaviors in a systematic way. With a view to a self-determined exploration of goal-free behaviors, our framework enables switching between different motion patterns in an autonomous and sequential fashion. Our algorithm is able to recover all the attractor behaviors in a toy system and it is also effective in two simulated environments. A spherical robot discovers all its major rolling modes and a hexapod robot learns to locomote in 50 different ways in 30min.

2015

Proceedings of the National Academy of Sciences, 112(45):E6224-E6232, 2015 (article)

Abstract

Grounding autonomous behavior in the nervous system is a fundamental challenge for neuroscience. In particular, self-organized behavioral development provides more questions than answers. Are there special functional units for curiosity, motivation, and creativity? This paper argues that these features can be grounded in synaptic plasticity itself, without requiring any higher-level constructs. We propose differential extrinsic plasticity (DEP) as a new synaptic rule for self-learning systems and apply it to a number of complex robotic systems as a test case. Without specifying any purpose or goal, seemingly purposeful and adaptive rhythmic behavior is developed, displaying a certain level of sensorimotor intelligence. These surprising results require no system-specific modifications of the DEP rule. They rather arise from the underlying mechanism of spontaneous symmetry breaking, which is due to the tight brain body environment coupling. The new synaptic rule is biologically plausible and would be an interesting target for neurobiological investigation. We also argue that this neuronal mechanism may have been a catalyst in natural evolution.

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments and to use this understanding to design future systems