Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and state and action spaces are continuous. Thus, most reinforcement learning algorithms would become computationally infeasible and require a prohibitive number of trials to explore such high-dimensional spaces. In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. The algorithm, called Policy Improvement with Path Integrals (PI2), has a surprisingly simple form, has no open tuning parameters besides the exploration noise, is model-free, and performs numerically robustly in high-dimensional learning problems. We demonstrate how PI2 is able to learn full-body motor skills on a 34-DOF humanoid robot. To demonstrate the generality of our approach, we also apply PI2 in the context of variable impedance control, where both planned trajectories and gain schedules for each joint are optimized simultaneously.

In our earlier series of papers on the subject, we proposed a novel statistical hypothesis
testing method for the detection of objects in noisy images. The method uses results from
percolation theory and random graph theory. We developed algorithms that can detect
objects of unknown shapes in the presence of nonparametric noise of unknown level and of
unknown distribution. No boundary shape constraints were imposed on the objects; only a weak
bulk condition for the object's interior was required. Our algorithms have linear complexity and
exponential accuracy. In the present paper, we describe an implementation of our nonparametric
hypothesis testing method. We provide a program that can be used for statistical experiments in
image processing. This program is written in the statistical programming language R.

Table tennis is a sufficiently complex motor task
for studying complete skill learning systems. It consists of several
elementary motions and requires fast movements, accurate
control, and online adaptation. To represent the elementary
movements needed for robot table tennis, we rely on dynamical
systems motor primitives (DMPs). While such DMPs have been
successfully used for learning a variety of simple motor tasks,
they only represent single elementary actions. In order to select
and generalize among different striking movements, we present
a new approach, called Mixture of Motor Primitives, which uses
a gating network to activate appropriate motor primitives. The
resulting policy enables us to select among the appropriate
motor primitives as well as to generalize between them. In
order to obtain a fully learned robot table tennis setup, we
also address the problem of predicting the necessary context
information, i.e., the hitting point in time and space where
we want to hit the ball. We show that the resulting setup
was capable of playing rudimentary table tennis using an
anthropomorphic robot arm.

We propose a novel algorithm to solve the expectation propagation relaxation of Bayesian inference for continuous-variable graphical models. In contrast to most previous algorithms, our method is provably convergent. By marrying convergent EP ideas from (Opper & Winther 05) with covariance decoupling techniques (Wipf & Nagarajan 08, Nickisch & Seeger 09), it runs at least an order of magnitude faster than the most commonly used EP solver.

Many successful applications of computer vision to image or video manipulation are interactive by nature. However, parameters of such systems are often trained neglecting the user. Traditionally, interactive systems have been treated in the same manner as their fully automatic counterparts. Their performance is evaluated by computing the accuracy of their solutions under some fixed set of user interactions. This paper proposes a new evaluation and learning method which brings the user into the loop. It is based on the use of an active robot user -- a simulated model of a human user. We show how this approach can be used to evaluate and learn parameters of state-of-the-art interactive segmentation systems. We also show how simulated user models can be integrated into the popular max-margin method for parameter learning and propose an algorithm to solve the resulting optimisation problem.

We present a method for fully automated selection of treatment beam ensembles for external radiation therapy. We reformulate the beam angle selection problem as a clustering problem of locally ideal beam orientations distributed on the unit sphere. For this purpose we construct an infinite mixture of von Mises-Fisher distributions, which is suited in general for density estimation from data on the D-dimensional sphere. Using a nonparametric Dirichlet process prior, our model infers probability distributions over both the number of clusters and their parameter values. We describe an efficient Markov chain Monte Carlo inference algorithm for posterior inference from experimental data in this model. The performance of the suggested beam angle selection framework is illustrated for one intracranial, one pancreas, and one prostate case. The infinite von Mises-Fisher mixture model (iMFMM) creates between 18 and 32 clusters, depending on the patient anatomy. This suggests using the iMFMM directly for beam ensemble selection in robotic radio surgery, or to generate low-dimensional input for both subsequent optimization of trajectories for arc therapy and beam ensemble selection for conventional radiation therapy.

Robot learning methods which allow autonomous robots to adapt to novel situations have been a long-standing vision of robotics, artificial intelligence, and cognitive sciences. However, to date, learning techniques have yet to fulfill this promise, as only a few methods manage to scale into the high-dimensional domains of manipulator robotics, or even the upcoming trend of humanoid robotics. Where it was possible, scaling was usually only achieved in precisely pre-structured domains. In this paper, we investigate the ingredients for a general approach to policy learning with the goal of an application to motor skill refinement in order to get one step closer towards human-like performance. To do so, we study two major components for such an approach, i.e., firstly, we study policy learning algorithms which can be applied in the general setting of motor skill learning, and, secondly, we study a theoretically well-founded general approach to representing the required control structures for task representation and execution.

Building on recent results for submodular minimization with combinatorial constraints, and on online submodular minimization, we address online approximation
algorithms for submodular minimization with combinatorial constraints. We discuss two types of algorithms and outline approximation algorithms that integrate into those.

We consider the problem of local graph clustering
where the aim is to discover the local cluster corresponding
to a point of interest. The most popular algorithms to solve
this problem start a random walk at the point of interest and
let it run until some stopping criterion is met. The vertices
visited are then considered the local cluster. We suggest a more
powerful alternative, the multi-agent random walk. It consists
of several agents connected by a fixed rope of length l. All
agents move independently like a standard random walk on
the graph, but they are constrained to have distance at most l
from each other. The main insight is that for several agents it is
harder to simultaneously travel over the bottleneck of a graph
than for just one agent. Hence, the multi-agent random walk
has less tendency to mistakenly merge two different clusters
than the original random walk. In our paper we analyze
the multi-agent random walk theoretically and compare it
experimentally to the major local graph clustering algorithms
from the literature. We find that our multi-agent random walk
consistently outperforms these algorithms.
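
The walk described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the graph is an adjacency dictionary, distance is measured in hops, one randomly chosen agent proposes a move per step, and a move that would stretch the rope beyond `rope` hops is simply rejected. The function names and the example graph `two_triangles` are hypothetical.

```python
import random
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: hop distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def multi_agent_random_walk(graph, start, n_agents=3, rope=1, n_steps=200, seed=0):
    """Return the set of vertices visited by agents tied together by a rope.

    Assumption: a proposed move is rejected (the agent stays put) whenever it
    would place the agent more than `rope` hops away from any other agent.
    """
    rng = random.Random(seed)
    positions = [start] * n_agents
    visited = {start}
    for _ in range(n_steps):
        i = rng.randrange(n_agents)                  # pick one agent to move
        candidate = rng.choice(graph[positions[i]])  # ordinary random-walk proposal
        dist = hop_distances(graph, candidate)
        others = positions[:i] + positions[i + 1:]
        if all(dist.get(p, float("inf")) <= rope for p in others):
            positions[i] = candidate
            visited.add(candidate)
    return visited


# Hypothetical example: two triangles joined by the single bottleneck edge 2-3.
two_triangles = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
local_cluster = multi_agent_random_walk(two_triangles, start=0)
```

With a single agent the constraint is vacuous and the sketch reduces to a standard random walk; with a rope of length 0 and two or more agents, no agent can ever leave the start vertex.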

Resting state activity is brain activation that arises in the absence of any task, and is usually measured
in awake subjects during prolonged fMRI scanning sessions where the only instruction given is to
close the eyes and do nothing. It has been recognized in recent years that resting state activity is
implicated in a wide variety of brain functions. While certain networks of brain areas have different
levels of activation at rest and during a task, there is nevertheless significant similarity between
activations in the two cases. This suggests that recordings of resting state activity can be used as
a source of unlabeled data to augment kernel canonical correlation analysis (KCCA) in a semisupervised
setting. We evaluate this setting empirically yielding three main results: (i) KCCA tends
to be improved by the use of Laplacian regularization even when no additional unlabeled data are
available, (ii) resting state data seem to have a similar marginal distribution to that recorded during
the execution of a visual processing task implying largely similar types of activation, and (iii) this
source of information can be broadly exploited to improve the robustness of empirical inference in
fMRI studies, an inherently data poor domain.

The GPML toolbox provides a wide range of functionality for Gaussian process (GP) inference and prediction. GPs are specified by mean and covariance functions; we offer a library of simple mean and covariance functions and mechanisms to compose more complex ones. Several likelihood functions are supported, including Gaussian and heavy-tailed for regression as well as others suitable for classification. Finally, a range of inference methods is provided, including exact and variational inference, Expectation Propagation, and Laplace's method for dealing with non-Gaussian likelihoods, and FITC for dealing with large regression tasks.

Brain-Computer Interfaces based on electrocorticography (ECoG) or electroencephalography (EEG), in combination with robot-assisted active physical therapy, may support traditional rehabilitation procedures for patients with
severe motor impairment due to cerebrovascular brain damage caused by stroke. In this short report, we briefly review the state of the art in this exciting new field,
give an overview of the work carried out at the Max Planck Institute for Biological Cybernetics and the University of Tübingen, and discuss challenges that need to be addressed in order to move from basic research to clinical studies.

Proceedings of the National Academy of Sciences of the United States of America, 107(46):19748-19753, November 2010 (article)

Abstract

Protein biosynthesis, the translation of the genetic code into polypeptides, occurs on ribonucleoprotein particles called ribosomes. Although X-ray structures of bacterial ribosomes are available, high-resolution structures of eukaryotic 80S ribosomes are lacking. Using cryoelectron microscopy and single-particle reconstruction, we have determined the structure of a translating plant (Triticum aestivum) 80S ribosome at 5.5-Å resolution. This map, together with a 6.1-Å map of a Saccharomyces cerevisiae 80S ribosome, has enabled us to model ∼98% of the rRNA. Accurate assignment of the rRNA expansion segments (ES) and variable regions has revealed unique ES–ES and r-protein–ES interactions, providing insight into the structure and evolution of the eukaryotic ribosome.

Policy gradient methods are a type of reinforcement learning technique that relies upon optimizing parametrized policies with respect to the expected return (long-term cumulative reward) by gradient descent. They do not suffer from many of the problems that have been marring traditional reinforcement learning approaches, such as the lack of guarantees of a value function, the intractability problem resulting from uncertain state information, and the complexity arising from continuous states and actions.
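
To make the idea concrete, the following is a minimal sketch of a likelihood-ratio (REINFORCE-style) policy gradient on a two-armed bandit. It is an illustrative toy, not a method from the abstract: the one-parameter sigmoid policy, the reward values, the learning rate and the function name `train_bandit_policy` are all assumptions chosen for brevity.

```python
import math
import random


def train_bandit_policy(rewards=(0.0, 1.0), lr=0.5, n_episodes=2000, seed=0):
    """Likelihood-ratio policy gradient on a two-armed bandit.

    The policy picks arm 1 with probability sigmoid(theta).
    Returns that probability after training.
    """
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(n_episodes):
        p1 = 1.0 / (1.0 + math.exp(-theta))
        action = 1 if rng.random() < p1 else 0
        ret = rewards[action]                  # return of this one-step episode
        # d/dtheta log pi(action): (1 - p1) for arm 1, -p1 for arm 0
        grad_log_pi = (1.0 - p1) if action == 1 else -p1
        # Stochastic gradient step in the direction that increases expected return
        theta += lr * ret * grad_log_pi
    return 1.0 / (1.0 + math.exp(-theta))


prob_best_arm = train_bandit_policy()
```

The update uses only sampled returns and the gradient of the log-policy, so no value function or model of the environment is needed; in this toy setting the probability of choosing the rewarded arm grows towards 1.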

40(616.2), 40th Annual Meeting of the Society for Neuroscience (Neuroscience), November 2010 (poster)

Abstract

Rhythms in the gamma band (30-100Hz) are observed in the mammalian brain with a large variety of functional correlates. Nevertheless, their functional role is still debated. One way to disentangle this issue is to go beyond usual correlation analysis and apply causality measures that quantify the directed interactions between the gamma rhythms and other aspects of neural activity. These measures can be further compared with other aspects of neurophysiological signals to find markers of neural interactions.
In a recent study, we analyzed extracellular recordings in the primary visual cortex of 4 anesthetized macaques during the presentation of movie stimuli using a causality measure named Transfer Entropy. We found causal interactions between high frequency gamma rhythms (60-100Hz) recorded in different electrodes, involving in particular their phase, and between the gamma phase and spiking activity quantified by the instantaneous envelope of the MUA band (1-3kHz).
Here, we further investigate in the same dataset the meaning of these phase-MUA and phase-phase causal interactions by studying the distribution of phases at multiple recording sites at lags around the occurrence of spiking events.
First, we found a sharpening of the gamma phase distribution in one electrode when spikes occur at another recording site. This phenomenon appeared as a form of phase-spike synchronization and was quantified by an information theoretic measure. We found that this measure correlates significantly with phase-MUA causal interactions. Additionally, we quantified in a similar way the interplay between spiking and the phase difference between two recording sites (reflecting the well-known concept of phase synchronization). We found that, depending on the pair of recording sites, spiking can correlate either with a phase synchronization or with a desynchronization with respect to the baseline. This effect correlates very well with the phase-phase causality measure.
These results provide evidence for high frequency phase-spike synchronization to reflect communication between distant neural populations in V1. Conversely, either phase synchronization or desynchronization may favor neural communication between recording sites. This new result, which contrasts with current hypotheses on the role of phase synchronization, could be interpreted as the presence of inhibitory interactions that are suppressed by desynchronization. Finally, our findings give new insights into the role of gamma rhythms in regulating local computation in the visual cortex.

Proceedings of the National Academy of Sciences of the United States of America, 107(46):19754-19759, November 2010 (article)

Abstract

Protein synthesis in all living organisms occurs on ribonucleoprotein particles, called ribosomes. Despite the universality of this process, eukaryotic ribosomes are significantly larger in size than their bacterial counterparts due in part to the presence of 80 r proteins rather than 54 in bacteria. Using cryoelectron microscopy reconstructions of a translating plant (Triticum aestivum) 80S ribosome at 5.5-Å resolution, together with a 6.1-Å map of a translating Saccharomyces cerevisiae 80S ribosome, we have localized and modeled 74/80 (92.5%) of the ribosomal proteins, encompassing 12 archaeal/eukaryote-specific small subunit proteins as well as the complete complement of the ribosomal proteins of the eukaryotic large subunit. Near-complete atomic models of the 80S ribosome provide insights into the structure, function, and evolution of the eukaryotic translational apparatus.

Combined PET/MR provides at the same time molecular and functional imaging as well as excellent soft tissue contrast. It does not allow one to directly measure the attenuation properties of scanned tissues, despite the fact that accurate attenuation maps are necessary for quantitative PET imaging. Several methods have therefore been proposed for MR-based attenuation correction (MR-AC). So far, they have only been evaluated on data acquired from separate MR and PET scanners. We evaluated several MR-AC methods on data from 10 patients acquired on a combined BrainPET/MR scanner. This allowed the consideration of specific PET/MR issues, such as the RF coil that attenuates and scatters 511 keV gammas. We evaluated simple MR thresholding methods as well as atlas and machine learning-based MR-AC. CT-based AC served as gold standard reference. To comprehensively evaluate the MR-AC accuracy, we used RoIs from 2 anatomic brain atlases with different levels of detail.
Visual inspection of the PET images indicated that even the basic FLASH threshold MR-AC may be sufficient for several applications. Using a UTE sequence for bone prediction in MR-based thresholding occasionally led to false prediction of bone tissue inside the brain, causing a significant overestimation of PET activity. Although it yielded a lower mean underestimation of activity, it exhibited the highest variance of all methods. The atlas averaging approach had a smaller mean error, but showed high maximum overestimation on the RoIs of the more detailed atlas. The Naive Bayes and Atlas-Patch MR-AC yielded the smallest variance, and the Atlas-Patch also showed the smallest mean error.
In conclusion, Atlas-based AC using only MR information on the BrainPET/MR yields a high level of accuracy that is sufficient for clinical quantitative imaging requirements. The Atlas-Patch approach was superior to alternative atlas-based methods, yielding a quantification error below 10% for all RoIs except very small ones.

This letter presents a graph kernel for spatio-spectral remote sensing image classification with support vector machines (SVMs). The method considers higher order relations in the neighborhood (beyond pairwise spatial relations) to iteratively compute a kernel matrix for SVM learning. The proposed kernel is easy to compute and constitutes a powerful alternative to existing approaches. The capabilities of the method are illustrated in several multi- and hyperspectral remote sensing images acquired over both urban and agricultural areas.

Inferring the causal structure that links $n$ observables is usually based upon detecting statistical dependences and choosing simple graphs that make the joint measure Markovian. Here we argue why causal inference is also possible when the sample size is one. We develop a theory of how to generate causal graphs explaining similarities between single objects. To this end, we replace the notion of conditional stochastic independence in the causal Markov condition with the vanishing of conditional algorithmic mutual information and describe the corresponding causal inference rules. We explain why a consistent reformulation of causal inference in terms of algorithmic complexity implies a new inference principle that also takes into account the complexity of conditional probability densities, making it possible to select among Markov equivalent causal graphs. This insight provides a theoretical foundation of a heuristic principle proposed in earlier work. We also sketch some ideas on how to replace Kolmogorov complexity with decidable complexity criteria. This can be seen as an algorithmic analog of replacing the empirically undecidable question of statistical independence with practical independence tests that are based on implicit or explicit assumptions on the underlying distribution.

Reinforcement learning for partially observable Markov decision problems (POMDPs) is a challenge as it requires policies with an internal state. Traditional approaches suffer significantly from this shortcoming and usually make strong assumptions on the problem domain such as perfect system models, state-estimators and a Markovian hidden system. Recurrent neural networks (RNNs) offer a natural framework for dealing with policy learning using hidden state and require only a few limiting assumptions. As they can be trained well using gradient descent, they are suited for policy gradient approaches.
In this paper, we present a policy gradient method, the Recurrent Policy Gradient, which constitutes a model-free reinforcement learning method. It is aimed at training limited-memory stochastic policies on problems which require long-term memories of past observations. The approach involves approximating a policy gradient for a recurrent neural network by backpropagating return-weighted characteristic eligibilities through time. Using a Long Short-Term Memory RNN architecture, we are able to outperform previous RL methods on three important benchmark tasks. Furthermore, we show that using history-dependent baselines helps reduce estimation variance significantly, thus enabling our approach to tackle more challenging, highly stochastic environments.

In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (SMC 2010), pages: 121-126, IEEE, Piscataway, NJ, USA, IEEE International Conference on Systems, Man and Cybernetics (SMC), October 2010 (inproceedings)

Abstract

Brain-Computer Interfaces (BCIs) in combination with robot-assisted physical therapy may become a valuable tool for neurorehabilitation of patients with
severe hemiparetic syndromes due to cerebrovascular brain damage (stroke) and other neurological conditions. A key aspect of this approach is reestablishing
the disrupted sensorimotor feedback loop, i.e., determining the intended movement using a BCI and helping a human with impaired motor function to move
the arm using a robot. It has not been studied yet, however, how artificially closing the sensorimotor feedback loop affects the BCI decoding performance.
In this article, we investigate this issue in six healthy subjects, and present evidence that haptic feedback facilitates the decoding of arm movement
intention. The results provide evidence of the feasibility of future rehabilitative efforts combining robot-assisted physical therapy with BCIs.

manner. First the backbone resonances are assigned. This is usually achieved from sequential information provided by three chemical shifts: CA, CB and C’. Once the sequence is solved, the second assignment step takes place. For this purpose, the CA-CB and HA chemical shifts are used as a starting point for assignment of the side chain resonances, thus connecting the backbone resonances to their respective side chains. This strategy is unfortunately limited by
the size of the protein due to increasing signal overlap and missing signals. Therefore, amino acid recognition is in many cases not possible as the CA-CB chemical shift pattern is not sufficient to discriminate between the 20 amino acids. As a result, the first step of the strategy
described above remains tedious and time consuming. The combination of modern NMR techniques with new spectrometers now provides information that was not always accessible
in the past, due to sensitivity problems. These experiments can be applied efficiently to measure proteins of up to 45 kDa in size and furthermore provide a unique combination of
sequential carbon spin system information. The assignment process can thus benefit from a maximum knowledge input, containing "all" backbone and side chain chemical shifts as
well as an immediate amino acid recognition from the side chain spin system. We propose to extend the software PASTA (Protein ASsignment by Threshold Accepting) to achieve
a general sequential assignment of backbone and side-chain resonances in a semi- to fully automatic per-residue approach. PASTA will offer the possibility to achieve the sequential assignment using any kind of chemical shifts (carbons and/or protons) that can provide sequential information combined with an amino acid recognition feature based on carbon spin system analysis.

This paper addresses the problem of learning and efficiently representing discriminative probabilistic models of object-specific grasp affordances, particularly when the number of labeled grasps is extremely limited. The proposed method does not require an explicit 3D model but rather learns an implicit manifold on which it defines a probability distribution over grasp affordances. We obtain hypothetical grasp configurations from visual descriptors that are associated with the contours of an object. While these hypothetical configurations are abundant, labeled configurations are very scarce as these are acquired via time-costly experiments carried out by the robot. Kernel logistic regression (KLR) via joint kernel maps is trained to map the hypothesis space of grasps into continuous class-conditional probability values indicating their achievability. We propose a soft-supervised extension of KLR and a framework to combine the merits of semi-supervised and active learning approaches to tackle the scarcity of labeled grasps. Experimental evaluation shows that combining active and semi-supervised learning is favorable in the presence of an oracle. Furthermore, semi-supervised learning outperforms supervised learning, particularly when the labeled data is very limited.

Programming-by-demonstration promises to significantly reduce the burden of coding robots to perform new tasks. However, service robots will be presented with a variety of different situations that were not specifically demonstrated to them. In such cases, the robot must autonomously generalize its learned motions to these new situations. We propose a system that can generalize movements to new target locations and even new objects. The former is achieved by using a task-specific coordinate system together with dynamical systems motor primitives. Generalizing actions to new objects is a more complex problem, which we solve by treating it as a continuum-armed bandits problem. Using the bandits framework, we can efficiently optimize the learned action for a specific object. The proposed method was implemented on a real robot and successfully adapted the grasping action to three different objects. Although we focus on grasping as an example of a task, the proposed methods are much more widely applicable to robot manipulation tasks.

The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can obtain a near-optimal solution using greedy feature selection. Second, our submodular quality criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining.
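
The greedy selection step can be illustrated with a short Python sketch. Note the assumptions: the quality function used here is a simple coverage count, which is a standard monotone submodular function standing in for the CORK criterion (not reproduced here), and the feature names, the `occurrences` table and the helper `greedy_select` are hypothetical.

```python
def greedy_select(candidates, quality, k):
    """Greedily pick up to k items, each time adding the largest marginal gain."""
    selected = []
    remaining = list(candidates)
    for _ in range(k):
        base = quality(selected)
        best, best_gain = None, 0
        for c in remaining:
            gain = quality(selected + [c]) - base
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:          # no candidate improves the criterion
            break
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: ids of the graphs that contain each candidate subgraph.
occurrences = {
    "f1": {0, 1, 2, 3},
    "f2": {2, 3, 4},
    "f3": {4, 5},
    "f4": {0, 1},
}


def coverage(features):
    """Number of graphs containing at least one selected feature.

    Stand-in objective (monotone submodular), not the CORK criterion.
    """
    covered = set()
    for f in features:
        covered |= occurrences[f]
    return len(covered)


chosen = greedy_select(occurrences, coverage, k=2)
```

For monotone submodular objectives under a cardinality constraint, this kind of greedy procedure is known to reach at least a (1 - 1/e) fraction of the optimal value, which is the sense in which a greedy solution can be called near-optimal.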

Although human beings see and move slower than table tennis or baseball robots, they manage to outperform such robot systems. One important aspect of this better performance is the human movement generation. In this paper, we study trajectory generation for table tennis from a biomimetic point of view. Our focus lies on generating efficient stroke movements capable of mastering variations in the environmental conditions, such as changing ball speed, spin and position. We study table tennis from a human motor control point of view. To make headway towards this goal, we construct a trajectory generator for a single stroke using the discrete movement stages hypothesis and the virtual hitting point hypothesis to create a model that produces a human-like stroke movement. We verify the functionality of the trajectory generator for a single forehand stroke both in a simulation and using a real Barrett WAM.

The combination of PET and MRI is an emerging field of current research. It is known that the positron range is shortened in high magnetic fields (MF), leading to an improved resolution in PET images. Interestingly, only the fraction of positron range (PR) orthogonal to the MF is reduced; the fraction along the MF is not affected, which leads to a non-isotropic count distribution. We measured the PR effect with PET isotopes like F-18, Cu-64, C-11, N-13 and Ga-68. A piece of paper (1 cm²) was soaked with each isotope and placed in the cFOV of a clinical 3T BrainPET/MR scanner. A polyethylene board (PE) was placed as a positron (β+) stopper at an axial distance of 3 cm from the soaked paper. The area under the peaks of one-pixel-wide profiles along the z-axis in coronal images was compared. Based on these measurements we confirmed our data in organic tissue. A larynx/trachea and lung of a butchered swine were injected with a mixture of NiSO4 for T1 MRI signals and Ga-68, simulating tumor lesions in the respiratory tract. The trachea/larynx were aligned at 35° to the MF lines and a small mass lesion was inserted to imitate a primary tracheal tumor, whereas the larynx was injected submucosally in the lower medial part of the epiglottis. Reconstructed PET data show that the ratio of β+ annihilated at the origin position and in the PE depends on the isotope energy and the direction of the MF. The annihilation ratios of the source and PE are 52.4/47.6 (F-18), 57.5/42.5 (Cu-64), 43.7/56.7 (C-11), 31.1/68.9 (N-13) and 14.9/85.1 (Ga-68). In the swine larynx measurement, an artefact with approximately 39% of the lesion activity formed along MF lines 3 cm away from the original injected position (fig.1). The data of the trachea showed two shine artefacts with a symmetric alignment along the MF lines. About 58% of the positrons annihilated at the lesion and 21% formed each artefact. The PR effects are minor in tissue of higher or equal density to water (0.096 cm⁻¹). However, the effect is severe in low density tissue or air and might lead to misinterpretation of clinical data.

Grasping an object is a task that inherently needs to be treated in a hybrid fashion. The system must decide both where and how to grasp the object. While selecting where to grasp requires learning about the object as a whole, the execution only needs to reactively adapt to the context close to the grasp's location. We propose a hierarchical controller that reflects the structure of these two sub-problems, and attempts to learn solutions that work for both. A hybrid architecture is employed by the controller to make use of various machine learning methods that can cope with the large amount of uncertainty inherent to the task. The controller's upper level selects where to grasp the object using a reinforcement learner, while the lower level comprises an imitation learner and a vision-based reactive controller to determine appropriate grasping motions. The resulting system is able to quickly learn good grasps of a novel object in an unstructured environment, by executing smooth reaching motions and preshaping the hand depending on the object's geometry. The system was evaluated both in simulation and on a real robot.

We formulate weighted graph clustering as a prediction problem: given a subset of edge weights we analyze the ability of graph clustering to predict the remaining edge weights. This formulation enables practical and theoretical comparison of different approaches to graph clustering as well as comparison of graph clustering with other possible ways to model the graph. We adapt the PAC-Bayesian analysis of co-clustering (Seldin and Tishby, 2008; Seldin, 2009) to derive a PAC-Bayesian generalization bound for graph clustering. The bound shows that graph clustering should optimize a trade-off between empirical data fit and the mutual information that clusters preserve on the graph nodes. A similar trade-off derived from information-theoretic considerations was already shown to produce state-of-the-art results in practice (Slonim et al., 2005; Yom-Tov and Slonim, 2009). This paper supports the empirical evidence by providing a better theoretical foundation, suggesting formal generalization guarantees, and offering a more accurate way to deal with finite sample issues. We derive a bound minimization algorithm and show that it provides good results in real-life problems and that the derived PAC-Bayesian bound is reasonably tight.

We study the problem of multimodal dimensionality reduction assuming that data samples can be missing at training time,
and not all data modalities may be present at application time. Maximum covariance analysis, as a generalization of PCA, has
many desirable properties, but its application to practical problems is limited by its need for perfectly paired data. We
overcome this limitation by a latent variable approach that allows working with weakly paired data and is still able to
efficiently process large datasets using standard numerical routines. The resulting weakly paired maximum covariance analysis
often finds better representations than alternative methods, as we show in two exemplary tasks: texture discrimination and transfer learning.
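
For orientation, the classical, fully paired form of maximum covariance analysis can be sketched as a singular value decomposition of the cross-covariance matrix between the two modalities. This is only the baseline that requires perfectly paired data, not the weakly paired latent variable method described above; the function name and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def maximum_covariance_analysis(X, Y, k):
    """Plain MCA for perfectly paired data: row i of X pairs with row i of Y.

    Returns projection directions for each modality and the covariances
    attained along them. (Hypothetical helper, not the paper's code.)
    """
    Xc = X - X.mean(axis=0)           # center each modality
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)  # empirical cross-covariance matrix
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, :k], Vt[:k].T, s[:k]

# Synthetic paired data sharing one latent factor (illustrative only).
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 1))
X = z @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(200, 5))
Y = z @ rng.normal(size=(1, 4)) + 0.1 * rng.normal(size=(200, 4))
Wx, Wy, s = maximum_covariance_analysis(X, Y, k=2)
```

The leading singular value equals the covariance between the two one-dimensional projections `X @ Wx[:, 0]` and `Y @ Wy[:, 0]`. Because the cross-covariance matrix is built from matching rows, the computation breaks down as soon as the row-wise pairing is lost, which is the limitation the weakly paired variant is meant to remove.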

Most current algorithms for blind steganalysis of images are based on a two-stage approach: first, features are extracted in order to reduce dimensionality and to highlight potential manipulations; second, a classifier trained on pairs of clean and stego images finds a decision rule for these features to detect stego images. The components of the feature vectors can vary significantly in their values, so normalization of the feature vectors is crucial. Furthermore, most classifiers contain free parameters, and an automatic model selection step has to be carried out to adapt these parameters. However, the commonly used cross-validation destroys some information needed by the classifier because of the arbitrary splitting of image pairs (stego and clean version) in the training set. In this paper, we propose simple modifications to normalization and to standard cross-validation. In our experiments, we show that these methods lead to a significant improvement of the standard blind steganalyzer of Lyu and Farid.
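
The abstract does not spell out the modified cross-validation, so the following is only one plausible reading: folds are drawn over image pairs rather than over individual samples, so that the clean and stego versions of an image always land on the same side of a split. The function names and the interleaved sample layout are assumptions made for illustration.

```python
import numpy as np

def paired_folds(n_pairs, n_folds, seed=0):
    """Assign every clean/stego pair, as a unit, to exactly one fold."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_pairs), n_folds)

def expand_pairs(pair_ids):
    """Map pair ids to sample indices, assuming the (hypothetical) layout
    [clean_0, stego_0, clean_1, stego_1, ...]."""
    return np.sort(np.concatenate([2 * pair_ids, 2 * pair_ids + 1]))

def paired_cv_indices(n_pairs, n_folds, seed=0):
    """Yield (train_idx, val_idx) arrays that never split an image pair."""
    folds = paired_folds(n_pairs, n_folds, seed)
    for k in range(n_folds):
        train_pairs = np.concatenate(
            [folds[j] for j in range(n_folds) if j != k]
        )
        yield expand_pairs(train_pairs), expand_pairs(folds[k])

splits = list(paired_cv_indices(n_pairs=10, n_folds=5))
```

Each element of `splits` can be used to index a feature matrix whose rows follow the interleaved layout. A sample-wise shuffle, by contrast, would routinely place the clean version of an image in the training part and its stego version in the validation part.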

We study nonparametric regression between Riemannian manifolds based on regularized empirical risk minimization. Regularization functionals for mappings between manifolds should respect the geometry of input and output manifold and be independent of the chosen parametrization of the manifolds. We define and analyze the three simplest regularization functionals with these properties and present a rather general scheme for solving the resulting optimization problem. As application examples, we discuss interpolation on the sphere, fingerprint processing, and correspondence computations between three-dimensional surfaces. We conclude by characterizing interesting and sometimes counterintuitive implications and new open problems that are specific to learning between Riemannian manifolds and are not encountered in multivariate regression in Euclidean space.

Remote sensing image segmentation requires multi-category classification, typically with a limited number of labeled training samples. While semi-supervised learning (SSL) has emerged as a sub-field of machine learning to tackle the scarcity of labeled samples, most SSL algorithms to date have had trade-offs in terms of scalability and/or applicability to multi-categorical data. In this paper, we evaluate semi-supervised logistic regression (SLR), a recent information theoretic semi-supervised algorithm, for remote sensing image classification problems. SLR is a probabilistic discriminative classifier and a specific instance of the generalized maximum entropy framework with a convex loss function. Moreover, the method is inherently multi-class and easy to implement. These characteristics make SLR a strong alternative to the widely used semi-supervised variants of SVM for the segmentation of remote sensing images. We demonstrate the competitiveness of SLR in multispectral, hyperspectral and radar image classification.

Our winning approach to the 2010 MLSP Competition is based on a generative method for P300-based BCI decoding, successfully applied to visual spellers. Here, generative has a double meaning. On the one hand, we work with a probability density model of the data given the target/non-target labeling, as opposed to discriminative (e.g. SVM-based) methods. On the other hand, the natural consequence of this approach is a decoding based on comparing the observation to templates generated from the data.

We introduce several new formulations for sparse nonnegative matrix approximation. Subsequently,
we solve these formulations by developing generic algorithms. Further, to help select a particular sparse formulation,
we briefly discuss the interpretation of each formulation. Finally, preliminary experiments are presented
to illustrate the behavior of our formulations and algorithms.
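
To make the problem class concrete, here is a minimal sketch of one widely used sparse formulation: nonnegative factorization with an L1 penalty on one factor, solved by multiplicative updates. It is not claimed to be one of the new formulations or algorithms introduced in the paper; the function name, penalty weight, and test matrix are assumptions made for illustration.

```python
import numpy as np

def sparse_nmf(A, rank, lam=0.1, n_iter=200, seed=0, eps=1e-9):
    """Approximately minimize 0.5 * ||A - B C||_F^2 + lam * sum(C)
    subject to B >= 0 and C >= 0, using multiplicative updates.

    Generic textbook-style scheme (an assumption, not the paper's method).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    B = rng.random((m, rank)) + eps
    C = rng.random((rank, n)) + eps
    history = []
    for _ in range(n_iter):
        # The penalty weight in the denominator shrinks the entries of C.
        C *= (B.T @ A) / (B.T @ B @ C + lam + eps)
        # Standard update for the unpenalized factor B.
        B *= (A @ C.T) / (B @ C @ C.T + eps)
        history.append(0.5 * np.linalg.norm(A - B @ C) ** 2 + lam * C.sum())
    return B, C, history

# Small nonnegative test matrix of rank at most 4 (illustrative only).
rng = np.random.default_rng(1)
A = rng.random((20, 4)) @ rng.random((4, 30))
B, C, history = sparse_nmf(A, rank=4)
```

Multiplicative updates keep both factors nonnegative without any projection step. Note that this sketch does not normalize `B`, so the penalty can partly be absorbed by rescaling the two factors; practical formulations usually add a norm constraint on `B` to rule that out.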

We formulate the multiframe blind deconvolution problem in an incremental
expectation maximization (EM) framework. Beyond deconvolution,
we show how to use the same framework to address: (i)
super-resolution despite noise and unknown blurring; (ii) saturation correction
of overexposed pixels that confound image restoration.
The abundance of data allows us to address both of these without
using explicit image or blur priors. The end result is a simple but effective
algorithm with no hyperparameters. We apply this algorithm
to real-world images from astronomy and to super-resolution tasks:
for both, our algorithm yields increased resolution and deconvolved
images simultaneously.

Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data sets. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields poor density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets.

We propose a novel statistical hypothesis testing method for detection of objects
in noisy images. The method uses results from percolation theory and random graph theory.
We present an algorithm that can detect objects of unknown shapes in the presence of
nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints
are imposed on the object, only a weak bulk condition for the object's interior is required. The
algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems.
In this paper, we further develop the mathematical formalism of our method and explore
important connections to the mathematical theory of percolation and statistical physics. We prove
results on consistency and algorithmic complexity of our testing procedure. In addition, we address
not only the asymptotic behavior of the method, but also the finite-sample performance of our test.
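
The general shape of such a percolation-based test can be sketched as follows: binarize the image with a pixel threshold, find the largest connected cluster of "on" pixels, and report a detection when that cluster is large. This is a minimal sketch under stated assumptions; the choice of 4-connectivity, the function names, and both threshold values are hypothetical and are not taken from the paper.

```python
from collections import deque

import numpy as np

def largest_cluster_size(mask):
    """Size of the largest 4-connected cluster of True pixels.

    Breadth-first search marks every pixel at most once, so the running
    time grows linearly with the number of pixels.
    """
    rows, cols = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            seen[r, c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best

def detect_object(image, pixel_threshold, cluster_threshold):
    """Report an object if thresholding leaves a sufficiently large cluster.

    Both thresholds are illustrative assumptions, not calibrated values.
    """
    mask = np.asarray(image) > pixel_threshold
    return largest_cluster_size(mask) >= cluster_threshold

# Deterministic toy image: a 3x3 block plus one isolated bright pixel.
demo = np.zeros((8, 8))
demo[2:5, 2:5] = 1.0
demo[7, 7] = 1.0
found = detect_object(demo, pixel_threshold=0.5, cluster_threshold=5)
```

On the toy image the block forms a cluster of nine pixels, so `found` is `True`, while the isolated pixel alone would not trigger a detection. In a real test the two thresholds would have to be calibrated against the noise model, which this sketch does not attempt.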

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments and to use this understanding to design future systems