Large graphs abound in machine learning, data mining, and several related areas. A useful step towards analyzing such graphs is that of obtaining certain summary statistics, e.g., the expected length of a shortest path between two nodes or the expected weight of a minimum spanning tree of the graph. These statistics provide insight into the structure of a graph, and they can help predict global
properties of a graph. Motivated thus, we propose to study statistical properties of structured subgraphs (of a given graph), in particular, to estimate the expected objective function value of a combinatorial optimization problem over these subgraphs. The general task is very difficult, if not unsolvable; so for concreteness we describe a more specific statistical estimation problem based on spanning trees.
We hope that our position paper encourages others to also study other types of graphical structures for which one can prove nontrivial statistical estimates.
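
One of the summary statistics mentioned above, the expected weight of a minimum spanning tree, can be illustrated with a small Monte Carlo sketch. The random-graph model (a complete graph with independent Uniform(0, 1) edge weights), the function names, and the trial count are assumptions made for illustration; they are not taken from the paper.

```python
import random

def mst_weight(n, edges):
    """Kruskal's algorithm: total weight of a minimum spanning tree.

    edges is a list of (weight, u, v) tuples on nodes 0..n-1; the graph
    is assumed to be connected.
    """
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0.0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so keep it
            parent[ru] = rv
            total += w
    return total

def expected_mst_weight(n, trials=200, seed=0):
    """Monte Carlo estimate of the expected MST weight on the complete
    graph with n nodes and i.i.d. Uniform(0, 1) edge weights
    (an illustrative model, assumed here)."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        edges = [(rng.random(), u, v)
                 for u in range(n) for v in range(u + 1, n)]
        acc += mst_weight(n, edges)
    return acc / trials
```

Calling `expected_mst_weight(20)` gives a sample average over 200 random graphs; the estimate tightens as `trials` grows.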

In pages: 3210-3215, IEEE, Piscataway, NJ, USA, 50th IEEE Conference on Decision and Control and European Control Conference (CDC - ECC), December 2011 (inproceedings)

Abstract

We analyze the problem of data set reduction for support vector classification. The work is also motivated by distributed problems, where sensors moving inside an environment collect binary measurements at different locations, and the environment needs to be divided into a collection of regions labeled in two different ways. The aim is to let each agent retain and exchange only those measurements that are most informative for the collective reconstruction of the decision boundary. For the case of separable classes, we provide the exact conditions and an efficient algorithm to determine if an element in the training set can become a support vector when new data arrive. The analysis is then extended to the non-separable case, deriving a sufficient discardability condition and a general data selection
scheme for classification. Numerical experiments relative to the distributed problem show that the proposed procedure allows the agents to exchange a small amount of the collected data to obtain a highly predictive decision boundary.

Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. The case of two random variables is particularly challenging since no (conditional) independences can be exploited. Recent methods that are based on additive noise models suggest the following principle: Whenever the joint distribution $\mathbf{P}^{(X,Y)}$ admits such a model in one direction, e.g., $Y = f(X) + N$, $N \perp\!\!\!\perp X$, but does not admit the reversed model $X = g(Y) + \tilde{N}$, $\tilde{N} \perp\!\!\!\perp Y$, one infers the former direction to be causal (i.e., $X \rightarrow Y$). Up to now, these approaches only dealt with continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work, we extend the notion of additive noise models to these cases. We prove that it almost never occurs that additive noise models can be fit in both directions. We further propose an efficient algorithm that is able to perform this way of causal inference on finite samples of discrete variables. We show that the algorithm works on both synthetic and real data sets.

Heritable epigenetic polymorphisms, such as differential cytosine methylation, can underlie phenotypic variation [1, 2]. Moreover, wild strains of the plant Arabidopsis thaliana differ in many epialleles [3, 4], and these can influence the expression of nearby genes [1, 2]. However, to understand their role in evolution [5], it is imperative to ascertain the emergence rate and stability of epialleles, including those that are not due to structural variation. We have compared genome-wide DNA methylation among 10 A. thaliana lines, derived 30 generations ago from a common ancestor [6]. Epimutations at individual positions were easily detected, and close to 30,000 cytosines in each strain were differentially methylated. In contrast, larger regions of contiguous methylation were much more stable, and the frequency of changes was in the same low range as that of DNA mutations [7]. Like individual positions, the same regions were often affected by differential methylation in independent lines, with evidence for recurrent cycles of forward and reverse mutations. Transposable elements and short interfering RNAs have been causally linked to DNA methylation [8]. In agreement, differentially methylated sites were farther from transposable elements and showed less association with short interfering RNA expression than invariant positions. The biased distribution and frequent reversion of epimutations have important implications for the potential contribution of sequence-independent epialleles to plant evolution.

There are (at least) three approaches to quantifying information. The first, algorithmic information or Kolmogorov complexity, takes events as strings and, given a universal Turing machine, quantifies the information content of a string as the length of the shortest program producing it [1]. The second, Shannon information, takes events as belonging to ensembles and quantifies the information resulting from observing the given event in terms of the number of alternate events that have been
ruled out [2]. The third, statistical learning theory, has introduced measures of capacity that control (in part) the expected risk of classifiers [3]. These capacities quantify the expectations regarding future data that learning algorithms embed into classifiers. Solomonoff and Hutter have applied algorithmic information to prove remarkable results on universal induction. Shannon information provides the mathematical foundation for communication
and coding theory. However, both approaches have shortcomings. Algorithmic information is not computable, severely limiting its practical usefulness. Shannon information refers to ensembles rather than actual events: it makes no sense to compute the Shannon information of a single string – or rather, there are many answers to this question depending on how a related ensemble is constructed.
Although there are asymptotic results linking algorithmic and Shannon information, it is unsatisfying that there is such a large gap – a difference in kind – between the two measures. This note describes a new method of quantifying information, effective information, that links algorithmic
information to Shannon information, and also links both to capacities arising in statistical learning theory [4, 5]. After introducing the measure, we show that it provides a non-universal analog of Kolmogorov complexity. We then apply it to derive basic capacities in statistical learning
theory: empirical VC-entropy and empirical Rademacher complexity. A nice byproduct of our approach is an interpretation of the explanatory power of a learning algorithm in terms of the number of hypotheses it falsifies [6], counted in two different ways for the two capacities. We also discuss how effective information relates to information gain, Shannon and mutual information.

State-space modeling provides a powerful tool for system identification and prediction. In linear state-space models the data are usually assumed to be Gaussian and the models have certain structural constraints such that they are identifiable. In this paper we propose a non-Gaussian state-space model which does not have such constraints. We prove that this model is fully identifiable. We then propose an efficient two-step method for parameter estimation: one first extracts the subspace of the latent processes based on the temporal information of the data, and then performs multichannel blind deconvolution, making use of both the temporal information and non-Gaussianity. We conduct a series of simulations to illustrate the performance of the proposed method. Finally, we apply the proposed model and parameter estimation method to real data, including major world stock indices and magnetoencephalography (MEG) recordings. Experimental results are encouraging and show the practical usefulness of the proposed model and method.

Taking a sharp photo at several megapixel resolution traditionally relies on high grade lenses. In this paper, we present an approach to alleviate image degradations caused by imperfect optics. We rely on a calibration step to encode the optical aberrations in a space-variant point spread function and obtain a corrected image by non-stationary deconvolution. By including the Bayer array in our image formation model, we can perform demosaicing as part of the deconvolution.

Output kernel learning techniques make it possible to simultaneously learn a vector-valued function and a positive semidefinite matrix which describes the relationships between the outputs. In this paper, we introduce a new formulation that imposes a low-rank constraint on the output kernel and operates directly on a factor of the kernel matrix. First, we investigate the connection between output kernel learning and a regularization problem for an architecture
with two layers. Then, we show that a variety of methods such as nuclear norm regularized regression, reduced-rank regression, principal component analysis, and low rank matrix approximation can be seen as special cases of the output kernel learning framework. Finally, we introduce a block coordinate descent strategy for learning low-rank output kernels.

Motivation: Over the last decade, both static and dynamic fragment libraries for protein structure prediction have been introduced. The former are built from clusters in either sequence or structure space and aim to extract a universal structural alphabet. The latter are tailored for a particular query protein sequence and aim to provide local structural templates that need to be assembled in order to build the full-length structure.
Results: Here, we introduce HHfrag, a dynamic HMM-based fragment search method built on the profile–profile comparison tool HHpred. We show that HHfrag provides advantages over existing fragment assignment methods in that it: (i) improves the precision of the fragments at the expense of a minor loss in sequence coverage; (ii) detects fragments of variable length (6–21 amino acid residues); (iii) allows for gapped fragments and (iv) does not assign fragments to regions where there is no clear sequence conservation. We illustrate the usefulness of fragments detected by HHfrag on targets from the most recent CASP.

Direct policy search is a promising reinforcement learning framework, in particular for controlling continuous, high-dimensional systems. Policy search often requires a large number of samples for obtaining a stable policy update estimator, and this is prohibitive when sampling is expensive. In this letter, we extend an expectation-maximization-based policy search method so that previously collected samples can be efficiently reused. The usefulness of the proposed method, reward-weighted regression with sample reuse (R³), is demonstrated through robot learning experiments.
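
The basic reward-weighted regression update that this line of work builds on can be sketched as a weighted least-squares fit in which high-reward samples count more. The sketch below is a simplified scalar version and does not include the sample-reuse extension described in the abstract; the linear policy `a = theta * s`, the exponential reward transformation, the parameter `beta`, and the function name are all assumptions made for illustration.

```python
import math

def reward_weighted_fit(states, actions, rewards, beta=1.0):
    """One reward-weighted regression update for a scalar linear policy
    a = theta * s: a weighted least-squares fit in which each sample is
    weighted by exp(beta * reward), so high-reward samples dominate."""
    r_max = max(rewards)
    # Subtracting r_max keeps the exponentials numerically stable and
    # does not change the solution (the common factor cancels).
    weights = [math.exp(beta * (r - r_max)) for r in rewards]
    num = sum(w * s * a for w, s, a in zip(weights, states, actions))
    den = sum(w * s * s for w, s in zip(weights, states))
    if den == 0.0:
        raise ValueError("all states are zero; theta is not identifiable")
    return num / den
```

With equal rewards the update reduces to ordinary least squares; as `beta` grows, the fit moves towards the actions taken in the best-rewarded samples.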

This paper focuses on the stability condition of a teleoperation system with packet loss in the communication channel. The communication channel between master and slave causes packet loss, which leads to performance degradation and instability of the teleoperation system. We consider a two-channel control architecture for the teleoperation system, in which control inputs to the remote site are produced from the positions of master and slave. The teleoperation system is modeled in the discrete domain to include the packet loss process. The stability condition for the teleoperation system with packet loss is then discussed in terms of input-to-state stability. Finally, the stability condition is presented using a linear matrix inequality (LMI) approach.

Models are among the most essential tools in robotics, such as kinematics and dynamics models of the robot's own body and controllable external objects. It is widely believed that intelligent mammals also rely on internal models in order to generate their actions. However, while classical robotics relies on manually generated models that are based on human insights into physics, future autonomous, cognitive
robots need to be able to automatically generate models that are based on information which is extracted from the data streams accessible to the robot. In this paper, we
survey the progress in model learning with a strong focus on robot control on a kinematic as well as dynamical level. Here, a model describes essential information about the behavior of the environment and the influence of an agent on this environment. In the context of model-based learning control, we view the model from three different perspectives. First, we need to study the different possible model learning architectures for robotics. Second, we discuss what kind of problems these architectures and the domain of robotics imply for the applicable learning methods. From this discussion, we deduce future directions of real-time learning algorithms. Third, we show where these
scenarios have been used successfully in several case studies.

Camera shake leads to non-uniform image blurs. State-of-the-art methods for removing camera shake model the blur as a linear combination of homographically transformed versions of the true image. While this is conceptually interesting, the resulting algorithms are computationally demanding. In this paper we develop a forward model based on the efficient filter flow framework, incorporating the particularities of camera shake, and show how an efficient algorithm for blur removal can be obtained. Comprehensive comparisons on a number of real-world blurry images show that our approach is not only substantially faster, but it also leads to better deblurring results.

We describe factored spectrally transformed linear mixed models (FaST-LMM), an algorithm for genome-wide association studies (GWAS) that scales linearly with cohort size in both run time and memory use. On Wellcome Trust data for 15,000 individuals, FaST-LMM ran an order of magnitude faster than current efficient algorithms. Our algorithm can analyze data for 120,000 individuals in just a few hours, whereas current algorithms fail on data for even 20,000 individuals (http://mscompbio.codeplex.com/).

The amount of information encoded by networks of neurons critically depends on the correlation structure of their activity. Neurons with similar stimulus preferences tend to have higher noise correlations than others. In homogeneous populations of neurons, this limited range correlation structure is highly detrimental to the accuracy of a population code. Therefore, reduced spike count correlations under attention, after adaptation, or after learning have been interpreted as evidence for a more efficient population code. Here, we analyze the role of limited range correlations in more realistic, heterogeneous population models. We use Fisher information and maximum-likelihood decoding to show that reduced correlations do not necessarily improve encoding accuracy. In fact, in populations with more than a few hundred neurons, increasing the level of limited range correlations can substantially improve encoding accuracy. We found that this improvement results from a decrease in noise entropy that is associated with increasing correlations if the marginal distributions are unchanged. Surprisingly, for constant noise entropy and in the limit of large populations, the encoding accuracy is independent of both structure and magnitude of noise correlations.

Our method for attenuation correction (AC) in MR-BrainPET with segmented T1-weighted MR images of the patient's head was applied to data from different MR-BrainPET scanners (Jülich, Tübingen) and compared to CT-based results. The study objectives presented in this paper are twofold. The first objective is to examine whether the segmentation method developed for and successfully applied to 3D MP-RAGE data can also be used to segment other T1-weighted MR data such as 3D FLASH data. The second aim is to show whether the similarity of segmented MR-based (SBA) and CT-based AC (CBA) obtained at HR+ PET can also be confirmed for BrainPET, for which the new AC method is intended. In order to reach the first objective, 14 segmented MR data sets (three 3D MP-RAGE data sets from Jülich and eleven 3D FLASH data sets from Tübingen) were compared to the respective CT data based on the Dice coefficient and scatter plots. For bone, a CT threshold of HU > 500 was applied. Dice coefficients (mean±std) for the upper cranial part of the skull, the skull above cavities, and the caudal part including the cerebellum are 0.73±0.1, 0.79±0.04, and 0.49±0.02 for the Jülich data and 0.7±0.1, 0.72±0.1, and 0.60±0.05 for the Tübingen data. To reach the second aim, SBA and CBA were compared for six subjects based on VOI (AAL atlas) analysis. Mean absolute relative difference (maRD) values are maRD(JüFVBW1-FDG): 0.99%±0.83%, maRD(JüFVBW2-FDG): 0.90%±0.89%, and maRD(JüEP-Flumazenil): 1.85%±1.25% for the Jülich data and maRD(TuTP02-FDG): 2.99%±1.65%, maRD(TuNP01-FDG): 5.37%±2.29%, and maRD(TuNP02-FDG): 6.52%±1.69% for the three best-segmented Tübingen data sets. The results show similar segmentation quality for both T1-weighted MR sequence types. The application to AC in BrainPET shows a high similarity to CT-based AC if the standardized ACF value for bone used in SBA is in good accordance with the bone density of the patient in question.
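
The Dice coefficient used for this comparison measures the overlap of two binary masks as 2|A ∩ B| / (|A| + |B|). A minimal sketch, assuming the masks are flat lists of voxel values; the function names, the list-based data layout, and the convention for two empty masks are illustrative choices, not details from the study.

```python
def dice_coefficient(a, b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two binary masks given as
    equal-length flat sequences of 0/1 (or False/True) values."""
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for x in a if x)
    size_b = sum(1 for x in b if x)
    if size_a + size_b == 0:
        return 1.0  # convention chosen here: two empty masks agree
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    return 2.0 * overlap / (size_a + size_b)

def bone_mask(hu_values, threshold=500):
    """Binary bone mask from CT values in Hounsfield units: a voxel is
    labeled bone when its value exceeds the threshold."""
    return [hu > threshold for hu in hu_values]
```

For example, `dice_coefficient(bone_mask(ct_voxels), mr_bone_mask)` would compare a thresholded CT volume with an MR-derived bone segmentation, where `ct_voxels` and `mr_bone_mask` are hypothetical inputs of equal length.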

In this paper, we analyze the convergence of two general classes of optimization algorithms for regularized kernel methods with convex loss function and quadratic norm regularization. The first methodology is a new class of algorithms based on fixed-point iterations that are well-suited for a parallel implementation and can be used with any convex loss function. The second methodology is based on coordinate descent, and generalizes some techniques previously proposed for linear support vector machines. It exploits the structure of additively separable loss functions to compute solutions of line searches in closed form. The two methodologies are both very easy to implement. In this paper, we also show how to remove non-differentiability of the objective functional by exactly reformulating a convex regularization problem as an unconstrained differentiable stabilization problem.

Playing table tennis is a difficult motor task that requires fast movements, accurate control and adaptation
to task parameters. Although human beings see and move slower than most robot systems, they significantly
outperform all table tennis robots. One important reason for this higher performance is the human movement
generation. In this paper, we study human movements during table tennis and present a robot system that mimics
human striking behavior. Our focus lies on generating hitting motions capable of adapting to variations in environmental conditions, such as changes in ball speed and position. Therefore, we model the human movements
involved in hitting a table tennis ball using discrete movement stages and the virtual hitting point hypothesis.
The resulting model was evaluated both in a physically realistic simulation and on a real anthropomorphic seven
degrees of freedom Barrett WAM™ robot arm.

The plant Arabidopsis thaliana occurs naturally in many different habitats throughout Eurasia. As a foundation for identifying genetic variation contributing to adaptation to diverse environments, a 1001 Genomes Project to sequence geographically diverse A. thaliana strains has been initiated. Here we present the first phase of this project, based on population-scale sequencing of 80 strains drawn from eight regions throughout the species' native range. We describe the majority of common small-scale polymorphisms as well as many larger insertions and deletions in the A. thaliana pan-genome, their effects on gene function, and the patterns of local and global linkage among these variants. The action of processes other than spontaneous mutation is identified by comparing the spectrum of mutations that have accumulated since A. thaliana diverged from its closest relative 10 million years ago with the spectrum observed in the laboratory. Recent species-wide selective sweeps are rare, and potentially deleterious mutations are more common in marginal populations.

Playing table tennis is a difficult task for robots, especially due to their limitations of acceleration. A key bottleneck is the amount of time needed to reach the desired hitting position and velocity of the racket for returning the incoming ball. Here, it often does not suffice to simply extrapolate the ball's trajectory after the opponent returns it but more information is needed. Humans are able to predict the ball's trajectory based on the opponent's moves and, thus, have a considerable advantage. Hence, we propose to incorporate an anticipation system into robot table tennis players, which enables the robot to react earlier while the opponent is performing the striking movement. Based on visual observation of the opponent's racket movement, the robot can predict the aim of the opponent and adjust its movement generation accordingly. The policies for deciding how and when to react are obtained by reinforcement learning. We conduct experiments with an existing robot player to show that the learned reaction policy can significantly improve the performance of the overall system.

This paper relates a recently proposed measure of information integration to experiments investigating the evoked
high-density electroencephalography (EEG) response to transcranial magnetic stimulation (TMS) during wakefulness, early non-rapid eye movement (NREM) sleep and under anesthesia. We show that bistability, arising at the cellular and population level during NREM sleep and under anesthesia, dramatically reduces the brain’s ability to integrate information.

Many state-of-the-art denoising algorithms focus on recovering high-frequency details in noisy images. However, images corrupted by large amounts of noise are also degraded in the lower frequencies. Thus properly handling all frequency bands allows us to better denoise in such regimes. To improve existing denoising algorithms we propose a meta-procedure that applies existing denoising algorithms across different scales and combines the resulting images into a single denoised image. With a comprehensive evaluation we show that the performance of many state-of-the-art denoising algorithms can be improved.

Genetic differences between Arabidopsis thaliana accessions underlie the plant’s extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.

In this article, we propose a family of efficient kernels for large graphs with discrete node labels. Key to our method is a rapid feature extraction scheme based on the Weisfeiler-Lehman test of isomorphism on graphs. It maps the original graph to a sequence of graphs, whose node attributes capture topological and label information. A family of kernels can be defined based on this Weisfeiler-Lehman sequence of graphs, including a highly efficient kernel comparing subtree-like patterns. Its runtime scales only linearly in the number of edges of the graphs and the length of the Weisfeiler-Lehman graph sequence. In our experimental evaluation, our kernels outperform state-of-the-art graph kernels on several graph classification benchmark data sets in terms of accuracy and runtime. Our kernels open the door to large-scale applications of graph kernels in various disciplines such as computational biology and social network analysis.
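
The relabeling idea behind the subtree kernel can be sketched in a few lines: in each round, every node's label is replaced by a compressed version of its own label together with the sorted labels of its neighbours, and the kernel is the dot product of the label counts collected over all rounds. This is a simplified illustration under assumed data structures (adjacency dictionaries and label dictionaries); the function names are hypothetical, and the sorting step means this sketch does not reproduce the linear runtime reported in the abstract.

```python
from collections import Counter

def wl_features(adj, labels, h, table):
    """Count node labels over h rounds of Weisfeiler-Lehman relabeling.

    adj:    dict node -> list of neighbour nodes (undirected graph)
    labels: dict node -> initial discrete label (same keys as adj)
    table:  dict shared between graphs so that identical
            (label, neighbour-labels) signatures get identical new labels
    """
    # Round 0 counts the original labels; keys carry the round index so
    # labels from different rounds never get mixed up.
    counts = Counter((0, lab) for lab in labels.values())
    current = dict(labels)
    for it in range(1, h + 1):
        new = {}
        for v in adj:
            signature = (current[v],
                         tuple(sorted(current[u] for u in adj[v])))
            new[v] = table.setdefault(signature, len(table))
        current = new
        counts.update((it, lab) for lab in current.values())
    return counts

def wl_subtree_kernel(g1, g2, h=2):
    """Kernel value for two graphs, each given as an (adj, labels) pair."""
    table = {}
    f1 = wl_features(g1[0], g1[1], h, table)
    f2 = wl_features(g2[0], g2[1], h, table)
    return sum(c * f2[key] for key, c in f1.items())
```

Two graphs score highly when many of their nodes share the same label and the same neighbourhood structure up to depth `h`.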

International Journal of Bioelectromagnetism, 13(3):115-116, September 2011 (article)

Abstract

While research on brain-computer interfacing (BCI) has seen tremendous progress in recent years, performance still varies substantially between as well as within subjects, with roughly 10 - 20% of subjects being incapable of successfully operating a BCI system. In this short report, I argue that this variation in performance constitutes one of the major obstacles that impedes a successful commercialization of BCI systems. I review the current state of research on the neuro-physiological causes of performance variation in BCI, discuss recent progress and open problems, and delineate potential research programs for addressing this issue.

Learning to grasp novel objects is an essential skill for robots operating in unstructured environments. We therefore propose a probabilistic approach for learning to grasp. In particular, we learn a function that predicts the success probability of grasps performed on surface points of a given object. Our approach is based on Markov Random Fields (MRF), and motivated by the fact that points that are geometrically close to each other tend to have similar grasp success probabilities. The MRF approach is successfully tested in simulation, and on a real robot using 3-D scans of various types of objects. The empirical results show a significant improvement over methods that do not utilize the smoothness assumption and classify each point separately from the others.

In recent work, we have provided evidence that fronto-parietal γ-range oscillations are a cause of within-subject performance variations in brain-computer interfaces (BCIs) based on motor-imagery. Here, we explore the feasibility of using neurofeedback of fronto-parietal γ-power to induce a mental state that is beneficial for BCI-performance. We provide empirical evidence based on two healthy subjects that intentional attenuation of fronto-parietal γ-power results in an enhanced resting-state sensorimotor-rhythm (SMR). As a large resting-state amplitude of the SMR has been shown to correlate with good BCI-performance, our approach may provide a means to reduce performance variations in BCIs.

Learning inverse kinematics of robots with redundant degrees of freedom (DoF) is a difficult problem in robot learning. The difficulty lies in the non-uniqueness of the inverse kinematics function. Existing methods tackle non-uniqueness by segmenting the configuration space and building a global solution from local experts. The use of local experts implies the definition of an oracle, which governs the global consistency of the local models; the definition of this oracle is difficult. We propose an algorithm suitable for learning the inverse kinematics function in a single global model despite its multivalued nature. Inverse kinematics is approximated from examples using structured output learning methods. Unlike most of the existing methods, which estimate inverse kinematics on the velocity level, we address the learning of the direct function on the position level. This problem is significantly harder. To support the proposed method, we conducted real-world experiments on a tracking control task and tested our algorithms on these models.

A challenging problem in image restoration is to recover an image with a blurry foreground. Such images can easily occur with modern cameras, when the auto-focus aims mistakenly at the background (which will appear sharp) instead of the foreground, where usually the object of interest is. In this paper we propose an automatic procedure that (i) estimates the amount of out-of-focus blur, (ii) segments the image into foreground and background incorporating clues from the blurriness, (iii) recovers the sharp foreground, and finally (iv) blurs the background to refocus the scene. On several real photographs with blurry foreground and sharp background, we demonstrate the effectiveness and limitations of our method.

GRavitational lEnsing Accuracy Testing 2010 (GREAT10) is a public image analysis challenge aimed at the development of algorithms to analyze astronomical images. Specifically, the challenge is to measure varying image distortions in the presence of a variable convolution kernel, pixelization and noise. This is the second in a series of challenges set to the astronomy, computer science and statistics communities, providing a structured environment in which methods can be improved and tested in preparation for planned astronomical surveys. GREAT10 extends upon previous work by introducing variable fields into the challenge. The “Galaxy Challenge” involves the precise measurement of galaxy shape distortions, quantified locally by two parameters called shear, in the presence of a known convolution kernel. Crucially, the convolution kernel and the simulated gravitational lensing shape distortion both now vary as a function of position within the images, as is the case for real data. In addition, we introduce the “Star Challenge” that concerns the reconstruction of a variable convolution kernel, similar to that in a typical astronomical observation. This document details the GREAT10 Challenge for potential participants. Continually updated information is also available from www.greatchallenges.info.

Many complex robot motor skills can be represented using elementary movements, and there exist efficient
techniques for learning parametrized motor plans using demonstrations and self-improvement. However, in
many cases, the robot currently needs to learn a new elementary movement even if a parametrized motor
plan exists that covers a similar, related situation. Clearly, a method is needed that modulates the elementary
movement through the meta-parameters of its representation. In this paper, we show how to learn such
mappings from circumstances to meta-parameters using reinforcement learning. We introduce an appropriate
reinforcement learning algorithm based on a kernelized version of the reward-weighted regression. We
compare this algorithm to several previous methods on a toy example and show that it performs well in
comparison to standard algorithms. Subsequently, we show two robot applications of the presented setup;
i.e., the generalization of throwing movements in darts, and of hitting movements in table tennis. We show
that both tasks can be learned successfully using simulated and real robots.

Robust dry EEG electrodes are arguably the key to making EEG Brain-Computer Interfaces (BCIs) a practical technology. Existing studies on dry EEG electrodes can be characterized by the recording method (stand-alone dry electrodes or simultaneous recording with wet electrodes), the dry electrode technology (e.g. active or passive), the paradigm used for testing (e.g. event-related potentials), and the measure of performance (e.g. comparing dry and wet electrode frequency spectra). In this study, an active-dry electrode prototype is tested, during a motor-imagery task, with EEG-BCI in mind. It is used simultaneously with wet electrodes and assessed using classification accuracy. Our results indicate that the two types of electrodes are comparable in their performance but there are improvements to be made, particularly in finding ways to reduce motion-related artifacts.

Task-space tracking control is essential for robot manipulation. In practice, task-space control of redundant robot systems is known to be susceptible to modeling errors. Here, data-driven learning methods may present an interesting alternative approach. However, learning models for task-space tracking control from sampled data is an ill-posed problem. In particular, the same input data point can yield many different output values which can form a non-convex solution space. Because the problem is ill-posed, models cannot be learned from such data using common regression methods. While learning of task-space control mappings is globally ill-posed, it has been shown in recent work that it is locally a well-defined problem. In this paper, we use this insight to formulate a local kernel-based learning approach for online model learning for task-space tracking control. For evaluation, we show in simulation that the method is capable of online model learning for task-space tracking control of redundant robots.

In pages: 6, International Workshop on Microscopic Image Analysis with Application in Biology (MIAAB), September 2011 (inproceedings)

Abstract

An automatic particle picking algorithm for processing electron micrographs of a large molecular complex, the 26S proteasome, is described. The algorithm makes use of a coherence-enhancing diffusion filter to denoise the data, and a random forest classifier for removing false positives. It does not make use of a 3D reference model, but uses a training set of manually picked particles instead. False positive and false negative rates of around 25% to 30% are achieved on a testing set. The algorithm was developed for a specific particle, but contains steps that should be useful for developing automatic picking algorithms for other particles.

This paper presents a comparative study of active learning (AL) and semi-supervised learning (SSL) for the classification of remote sensing (RS) images. The two learning paradigms are analyzed from both the theoretical and the experimental point of view. The aim of this work is to identify the advantages and disadvantages of AL and SSL methods, and to point out the boundary conditions on the applicability of these methods with respect to both the number of available labeled samples and the reliability of classification results. In our experimental analysis, AL and SSL techniques have been applied to the classification of both synthetic and real RS data, defining different classification problems starting from different initial training sets and considering different distributions of the classes. This analysis allowed us to derive important conclusions about the use of these classification approaches and to obtain insight into which of the two approaches is more appropriate according to the specific classification problem, the available initial training set and the available budget for the acquisition of new labeled samples.

PET/MRI is an emerging dual-modality imaging technology that requires new approaches to PET attenuation correction (AC). We assessed 2 algorithms for whole-body MRI-based AC (MRAC): a basic MR image segmentation algorithm and a method based on atlas registration and pattern recognition (AT&PR).
METHODS:
Eleven patients each underwent a whole-body PET/CT study and a separate multibed whole-body MRI study. The MR image segmentation algorithm uses a combination of image thresholds, Dixon fat-water segmentation, and component analysis to detect the lungs. MR images are segmented into 5 tissue classes (not including bone), and each class is assigned a default linear attenuation value. The AT&PR algorithm uses a database of previously aligned pairs of MRI/CT image volumes. For each patient, these pairs are registered to the patient MRI volume, and machine-learning techniques are used to predict attenuation values on a continuous scale. MRAC methods are compared via the quantitative analysis of AC PET images using volumes of interest in normal organs and on lesions. We assume the PET/CT values after CT-based AC to be the reference standard.
RESULTS:
In regions of normal physiologic uptake, the average error of the mean standardized uptake value was 14.1% ± 10.2% and 7.7% ± 8.4% for the segmentation and the AT&PR methods, respectively. Lesion-based errors were 7.5% ± 7.9% for the segmentation method and 5.7% ± 4.7% for the AT&PR method.
CONCLUSION:
The MRAC method using AT&PR provided better overall PET quantification accuracy than the basic MR image segmentation approach. This better quantification was due to the significantly reduced volume of errors made regarding volumes of interest within or near bones and the slightly reduced volume of errors made regarding areas outside the lungs.

Many motor skills consist of many lower-level elementary movements that need to be sequenced in order to achieve a task. In order to learn such a task, both the primitive movements and the higher-level strategy need to be acquired at the same time. In contrast, most learning approaches focus either on learning to combine a fixed set of options or on learning just single options. In this paper, we discuss a new approach that allows improving the performance of lower-level actions while pursuing a higher-level task. The presented approach is applicable to learning a wider range of motor skills, but in this paper, we employ it for learning games where the player wants to improve his performance at the individual actions of the game while still performing well at the strategy level of the game. We propose to learn the lower-level actions using Cost-regularized Kernel Regression and the higher-level actions using a form of Policy Iteration. The two approaches are coupled by their transition probabilities. We evaluate the approach on a side-stall-style throwing game both in simulation and with a real BioRob.

In Proceedings of the 58th World Statistics Congress, pages: 4456-4461, ISI, August 2011 (inproceedings)

Abstract

We develop a novel method for detection of signals and reconstruction of images in the presence of random noise. The method uses results from percolation theory. We specifically address the problem of detection of multiple objects of unknown shapes in the case of nonparametric noise. The noise density is unknown and can be heavy-tailed. The objects of interest have unknown varying intensities. No boundary shape constraints are imposed on the objects; only a set of weak bulk conditions is required. We view the object detection problem as hypothesis testing for discrete statistical inverse problems. We present an algorithm that detects greyscale objects of various shapes in noisy images. We prove results on consistency and algorithmic complexity of our procedures. Applications to cryo-electron microscopy are presented.
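One way to picture a percolation-style detector is the minimal sketch below. It is an illustrative assumption and not the authors' procedure: pixels are thresholded, 4-connected clusters of bright pixels are collected, and an object is reported only if some cluster reaches a size cutoff. The names `largest_cluster`, `detect`, `threshold` and `min_size` are hypothetical.

```python
from collections import deque

def largest_cluster(image, threshold):
    """Size of the largest 4-connected cluster of pixels above threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first search over one cluster of bright pixels.
            size, queue = 0, deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best

def detect(image, threshold, min_size):
    """Report an object if some bright cluster has at least min_size pixels."""
    return largest_cluster(image, threshold) >= min_size

# Toy image: a 2 x 3 bright block plus two isolated bright noise pixels.
noisy = [
    [0.1, 0.9, 0.2, 0.1, 0.0],
    [0.0, 0.1, 0.8, 0.9, 0.7],
    [0.2, 0.0, 0.9, 0.8, 0.9],
    [0.1, 0.3, 0.1, 0.2, 0.1],
    [0.9, 0.0, 0.2, 0.0, 0.3],
]
print(detect(noisy, threshold=0.5, min_size=4))
```

Because the decision depends only on cluster size after thresholding, a sketch of this kind needs no assumption about the shape of the object boundary.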

Kernel canonical correlation analysis (KCCA) is a general technique for subspace learning that incorporates principal components analysis (PCA) and Fisher linear discriminant analysis (LDA) as special cases. By finding directions that maximize correlation, KCCA learns representations that are more closely tied to the underlying process that generates the data and can ignore high-variance noise directions. However, for data where acquisition in one or more modalities is expensive or otherwise limited, KCCA may suffer from small sample effects. We propose to use semi-supervised Laplacian regularization to utilize data that are present in only one modality. This approach is able to find highly correlated directions that also lie along the data manifold, resulting in a more robust estimate of correlated subspaces.
Functional magnetic resonance imaging (fMRI) acquired data are naturally amenable to subspace techniques as data are well aligned. fMRI data of the human brain are a particularly interesting candidate. In this study we implemented various supervised and semi-supervised versions of KCCA on human fMRI data, with regression to single and multi-variate labels (corresponding to video content subjects viewed during the image acquisition). In each variate condition, the semi-supervised variants of KCCA performed better than the supervised variants, including a supervised variant with Laplacian regularization. We additionally analyze the weights learned by the regression in order to infer brain regions that are important to different types of visual processing.

Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a new modeling technique that adaptively balances exploitability and risk reduction. An opponent’s strategy is modeled with a set of possible strategies that contains the actual strategy with a high probability. The algorithm is safe, as the expected payoff is above the minimax payoff with a high probability, and it can exploit the opponents’ preferences when sufficient observations have been obtained. We apply the approach to normal-form games and stochastic games with a finite number of stages. The performance of the proposed approach is first demonstrated on repeated rock-paper-scissors games. Subsequently, the approach is evaluated in a human-robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand or middle preparation pose before they serve. The learned strategies can exploit the opponent’s preferences, leading to a higher rate of successful returns.
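The trade-off between safety and exploitation can be sketched for rock-paper-scissors as follows. This is a simplified illustration under stated assumptions, not the algorithm of the paper: the set of plausible opponent strategies is taken to be an L1 ball around the observed action frequencies, the radius formula and the parameter `delta` are assumptions of this sketch, and `choose_strategy` is a hypothetical name. The player leaves the minimax strategy only when some action has a positive worst-case payoff over that set.

```python
import math

ACTIONS = ("rock", "paper", "scissors")
# PAYOFF[a][b]: payoff to us when we play a and the opponent plays b.
PAYOFF = {
    "rock": {"rock": 0, "paper": -1, "scissors": 1},
    "paper": {"rock": 1, "paper": 0, "scissors": -1},
    "scissors": {"rock": -1, "paper": 1, "scissors": 0},
}

def choose_strategy(counts, delta=0.05):
    """Return a mixed strategy as a dict mapping action to probability."""
    n = sum(counts.values())
    uniform = {a: 1.0 / 3 for a in ACTIONS}  # minimax strategy, value 0
    if n == 0:
        return uniform
    freq = {b: counts[b] / n for b in ACTIONS}
    # Assumed radius of the L1 ball around the empirical frequencies.
    eps = math.sqrt(2.0 * math.log(6.0 / delta) / n)
    best_action, best_lower = None, 0.0
    for a in ACTIONS:
        expected = sum(freq[b] * PAYOFF[a][b] for b in ACTIONS)
        # Payoffs lie in [-1, 1], so the expectation moves by at most eps
        # for any opponent strategy inside the ball.
        lower = expected - eps
        if lower > best_lower:
            best_action, best_lower = a, lower
    if best_action is None:
        return uniform  # too little evidence: stay with the safe strategy
    return {a: 1.0 if a == best_action else 0.0 for a in ACTIONS}

few = {"rock": 3, "paper": 0, "scissors": 0}
many = {"rock": 800, "paper": 100, "scissors": 100}
print(choose_strategy(few))
print(choose_strategy(many))
```

With only three observations the radius is too large to justify any deviation, so the uniform strategy is returned; with a thousand observations dominated by rock, the sketch commits to paper.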

Many natural processes occur over characteristic spatial and temporal scales. This paper presents tools for (i) flexibly and scalably coarse-graining cellular automata and (ii) identifying which coarse-grainings express an automaton’s dynamics well, and which express its dynamics badly. We apply the tools to investigate a range of examples in Conway’s Game of Life and Hopfield networks and demonstrate that they capture some basic intuitions about emergent processes. Finally, we formalize the notion that a process is emergent if it is better expressed at a coarser granularity.
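As a concrete picture of what a coarse-graining of a cellular automaton might look like, the sketch below runs Conway’s Game of Life on a small wrap-around grid and summarizes each block of cells by its number of live cells. Block counting is an assumption made here for illustration; the coarse-grainings studied in the paper may be defined differently, and the function names are hypothetical.

```python
def life_step(grid):
    """One Game of Life update on a grid with wrap-around borders."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            alive = grid[r][c] == 1
            if neighbours == 3 or (alive and neighbours == 2):
                new[r][c] = 1
    return new

def coarse_grain(grid, block):
    """Replace each block x block patch by its number of live cells.

    Assumes that both grid dimensions are multiples of block.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [sum(grid[r + i][c + j] for i in range(block) for j in range(block))
         for c in range(0, cols, block)]
        for r in range(0, rows, block)
    ]

# A horizontal "blinker" on a 6 x 6 grid; it oscillates with period 2.
blinker = [[0] * 6 for _ in range(6)]
for col in (1, 2, 3):
    blinker[2][col] = 1

print(coarse_grain(blinker, 2))
print(coarse_grain(life_step(blinker), 2))
```

Comparing how well the coarse description at one time step predicts the coarse description at the next is one informal way to ask whether a given granularity expresses the dynamics well.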

We develop a novel method for detection of signals and reconstruction of images in the presence of random noise. The method uses results from percolation theory. We specifically address the problem of detection of multiple objects of unknown shapes in the case of nonparametric noise. The noise density is unknown. The objects of interest have unknown varying intensities. No boundary shape constraints are imposed on the objects; only a set of weak bulk conditions is required. We view the object detection problem as a multiple hypothesis testing problem for discrete statistical inverse problems. We present an algorithm that detects greyscale objects of various shapes in noisy images. We prove results on consistency and algorithmic complexity of our procedures. Applications to cryo-electron microscopy are presented.

Genome-wide association studies (GWAS) have not been able to discover strong associations between many complex human diseases and single genetic loci. Mapping these phenotypes to pairs of genetic loci is hindered by the huge number of candidates leading to enormous computational and statistical problems. In GWAS on single nucleotide polymorphisms (SNPs), one has to consider in the order of 10^10 to 10^14 pairs, which is infeasible in practice. In this article, we give the first algorithm for 2-locus genome-wide association studies that is subquadratic in the number, n, of SNPs. The running time of our algorithm is data-dependent, but large experiments over real genomic data suggest that it scales empirically as n^(3/2). As a result, our algorithm can easily cope with n ~ 10^7, i.e., it can efficiently search all pairs of SNPs in the human genome.
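The size of the search space follows from the number of unordered pairs, n(n - 1)/2: roughly 5·10^9 pairs for n = 10^5 SNPs and roughly 5·10^13 for n = 10^7. The short check below compares this count with a growth rate of n^(3/2). The range of n is chosen here for illustration, and constant factors in the running time are ignored.

```python
def num_pairs(n):
    """Number of unordered pairs among n SNPs."""
    return n * (n - 1) // 2

for n in (10**5, 10**6, 10**7):
    pairs = num_pairs(n)
    subquadratic = n ** 1.5
    # The ratio shows how much smaller n^(3/2) is than the number of pairs.
    print(f"n = {n}: pairs = {pairs:.2e}, n^1.5 = {subquadratic:.2e}, "
          f"ratio = {pairs / subquadratic:.0f}")
```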

Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern filter (CSP) as a preprocessing step before feature extraction and classification. The CSP method is a supervised algorithm and therefore needs subject-specific training data for calibration, which is very time-consuming to collect. In order to reduce the amount of calibration data that is needed for a new subject, one can apply multitask (from now on called multisubject) machine learning techniques to the preprocessing phase. Here, the goal of multisubject learning is to learn a spatial filter for a new subject based on its own data and that of other subjects. This paper outlines the details of the multitask CSP algorithm and shows results on two data sets. In certain subjects a clear improvement can be seen, especially when the number of training trials is relatively low.

We present a novel technique for addressing domain adaptation problems in the classification of remote sensing images with active learning. Domain adaptation is the important problem of adapting a supervised classifier trained on a given image (source domain) to the classification of another similar (but not identical) image (target domain) acquired on a different area, or on the same area at a different time. The main idea of the proposed approach is to iteratively label and add to the training set the minimum number of the most informative samples from the target domain, while removing the source-domain samples that do not fit the distributions of the classes in the target domain. In this way, the classification system exploits already available information, i.e., the labeled samples of the source domain, in order to minimize the number of target-domain samples to be labeled, thus reducing the cost associated with the definition of the training set for the classification of the target domain. Experimental results obtained in the classification of a hyperspectral image confirm the effectiveness of the proposed technique.

Many complex robot motor skills can be represented using elementary movements, and there exist efficient techniques for learning parametrized motor plans using demonstrations and self-improvement. However, with current techniques, the robot in many cases needs to learn a new elementary movement even if a parametrized motor plan exists that covers a related situation. A method is needed that modulates the elementary movement through the meta-parameters of its representation. In this paper, we describe how to learn such mappings from circumstances to meta-parameters using reinforcement learning. In particular, we use a kernelized version of the reward-weighted regression. We show two robot applications of the presented setup: the generalization of throwing movements in darts, and of hitting movements in table tennis. We demonstrate that both tasks can be learned successfully using simulated and real robots.

Most results for online decision problems with structured concepts, such as trees or cuts, assume linear costs. In many settings, however, nonlinear costs are more realistic. Owing to their non-separability, these lead to much harder optimization problems. Going beyond linearity, we address online approximation algorithms for structured concepts that allow the cost to be submodular, i.e., nonseparable. In particular, we show regret bounds for three Hannan-consistent strategies that capture different settings. Our results also tighten a regret bound for unconstrained online submodular minimization.
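To make the notion of a submodular, nonseparable cost concrete, the sketch below defines a hypothetical cost on sets of edges (the square root of the total edge weight, chosen here only for illustration) and checks the diminishing-returns property by brute force. The edge names and weights are made up, and the check is exponential in the size of the ground set, so it is meant for toy examples only.

```python
import math
from itertools import combinations

def cost(edges, weights):
    """Hypothetical nonseparable cost: square root of the total edge weight."""
    return math.sqrt(sum(weights[e] for e in edges))

def is_submodular(f, ground):
    """Brute-force check of diminishing returns.

    For all A contained in B and every element e outside B, require
    f(A + e) - f(A) >= f(B + e) - f(B).
    """
    subsets = [frozenset(s) for k in range(len(ground) + 1)
               for s in combinations(ground, k)]
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue
            for e in ground:
                if e in b:
                    continue
                gain_a = f(a | {e}) - f(a)
                gain_b = f(b | {e}) - f(b)
                if gain_a < gain_b - 1e-12:
                    return False
    return True

weights = {"e1": 1.0, "e2": 4.0, "e3": 9.0}
ground = list(weights)

print(is_submodular(lambda s: cost(s, weights), ground))
# The squared total weight has increasing marginal gains, so it fails.
print(is_submodular(lambda s: sum(weights[e] for e in s) ** 2, ground))
```

The square-root cost is nonseparable in the sense that the cost of a set is not the sum of the costs of its elements, which is what rules out treating each edge independently as in the linear case.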

In pages: 1-8, ICML Workshop on Online Trading of Exploration and Exploitation 2, July 2011 (inproceedings)

Abstract

We develop a coherent framework for integrative simultaneous analysis of the exploration-exploitation and model order selection trade-offs. We improve over our preceding results on the same subject (Seldin et al., 2011) by combining PAC-Bayesian analysis with a Bernstein-type inequality for martingales. Such a combination is also of independent interest for studies of multiple simultaneously evolving martingales.

Motivation: Classifying biological data into different groups is a central task of bioinformatics: for instance, to predict the function of a gene or protein, the disease state of a patient or the phenotype of an individual based on its genotype. Support Vector Machines are a widespread approach for classifying biological data, due to their high accuracy, their ability to deal with structured data such as strings, and the ease of integrating various types of data. However, it is unclear how to correct for confounding factors such as population structure, age, gender or experimental conditions in Support Vector Machine classification.
Results: In this article, we present a Support Vector Machine classifier that can correct the prediction for observed confounding factors. This is achieved by minimizing the statistical dependence between the classifier and the confounding factors. We prove that this formulation can be transformed into a standard Support Vector Machine with rescaled input data. In our experiments, our confounder correcting SVM (ccSVM) improves tumor diagnosis based on samples from different labs, tuberculosis diagnosis in patients of varying age, ethnicity and gender, and phenotype prediction in the presence of population structure and outperforms state-of-the-art methods in terms of prediction accuracy.

We describe a method that infers whether statistical dependences between two observed variables X and Y are due to a "direct" causal link or only due to a connecting causal path that contains an unobserved variable of low complexity, e.g., a binary variable. This problem is motivated by statistical genetics. Given a genetic marker that is correlated with a phenotype of interest, we want to detect whether this marker is causal or it only correlates with a causal one. Our method is based on the analysis of the location of the conditional distributions P(Y|x) in the simplex of all distributions of Y. We report encouraging results on semi-empirical data.
