Large graphs abound in machine learning, data mining, and several related areas. A useful step towards analyzing such graphs is to obtain summary statistics, e.g., the expected length of a shortest path between two nodes or the expected weight of a minimum spanning tree of the graph. These statistics provide insight into the structure of a graph, and they can help predict its global properties. Motivated thus, we propose to study statistical properties of structured subgraphs (of a given graph), in particular, to estimate the expected objective function value of a combinatorial optimization problem over these subgraphs. The general task is very difficult, if not unsolvable; so for concreteness we describe a more specific statistical estimation problem based on spanning trees. We hope that our position paper encourages others to also study other types of graphical structures for which one can prove nontrivial statistical estimates.
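As an illustration of the kind of statistic discussed above, the following is a minimal sketch of a Monte Carlo estimate of the expected minimum spanning tree weight over random induced subgraphs. The sampling scheme (uniform node subsets of a fixed size), the function names, and the choice to skip disconnected samples are assumptions made for illustration; they are not the estimation problem formalized in the paper.

```python
import random

def mst_weight(nodes, edges):
    """Kruskal's algorithm; returns the MST weight, or None if disconnected.

    edges is a list of (weight, u, v) tuples.
    """
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    total, used = 0.0, 0
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
            used += 1
    return total if used == len(nodes) - 1 else None

def expected_mst_weight(nodes, edges, subset_size, trials=1000, seed=0):
    """Average MST weight over random induced subgraphs (hypothetical scheme)."""
    rng = random.Random(seed)
    weights = []
    for _ in range(trials):
        subset = set(rng.sample(sorted(nodes), subset_size))
        sub_edges = [(w, u, v) for w, u, v in edges
                     if u in subset and v in subset]
        w = mst_weight(subset, sub_edges)
        if w is not None:  # assumption: disconnected samples are skipped
            weights.append(w)
    return sum(weights) / len(weights) if weights else None
```

For a triangle with edge weights 1, 2, and 3, `expected_mst_weight([0, 1, 2], [(1, 0, 1), (2, 1, 2), (3, 0, 2)], subset_size=2)` averages the three single-edge subgraphs, so the estimate lies between 1 and 3.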

In pages: 3210-3215, IEEE, Piscataway, NJ, USA, 50th IEEE Conference on Decision and Control and European Control Conference (CDC - ECC), December 2011 (inproceedings)

Abstract

We analyze the problem of data set reduction for support vector classification. The work is also motivated by distributed problems, where sensors collect binary measurements at different locations while moving inside an environment that needs to be divided into a collection of regions labeled in two different ways. The aim is to let each agent retain and exchange only those measurements that are most informative for the collective reconstruction of the decision boundary. For the case of separable classes, we provide the exact conditions and an efficient algorithm to determine whether an element in the training set can become a support vector when new data arrive. The analysis is then extended to the non-separable case, deriving a sufficient discardability condition and a general data selection scheme for classification. Numerical experiments on the distributed problem show that the proposed procedure allows the agents to exchange a small amount of the collected data to obtain a highly predictive decision boundary.

There are (at least) three approaches to quantifying information. The first, algorithmic information or Kolmogorov complexity, takes events as strings and, given a universal Turing machine, quantifies the information content of a string as the length of the shortest program producing it [1]. The second, Shannon information, takes events as belonging to ensembles and quantifies the information resulting from observing the given event in terms of the number of alternate events that have been ruled out [2]. The third, statistical learning theory, has introduced measures of capacity that control (in part) the expected risk of classifiers [3]. These capacities quantify the expectations regarding future data that learning algorithms embed into classifiers. Solomonoff and Hutter have applied algorithmic information to prove remarkable results on universal induction. Shannon information provides the mathematical foundation for communication and coding theory. However, both approaches have shortcomings. Algorithmic information is not computable, severely limiting its practical usefulness. Shannon information refers to ensembles rather than actual events: it makes no sense to compute the Shannon information of a single string – or rather, there are many answers to this question depending on how a related ensemble is constructed. Although there are asymptotic results linking algorithmic and Shannon information, it is unsatisfying that there is such a large gap – a difference in kind – between the two measures. This note describes a new method of quantifying information, effective information, that links algorithmic information to Shannon information, and also links both to capacities arising in statistical learning theory [4, 5]. After introducing the measure, we show that it provides a non-universal analog of Kolmogorov complexity. We then apply it to derive basic capacities in statistical learning theory: empirical VC-entropy and empirical Rademacher complexity. A nice byproduct of our approach is an interpretation of the explanatory power of a learning algorithm in terms of the number of hypotheses it falsifies [6], counted in two different ways for the two capacities. We also discuss how effective information relates to information gain, Shannon and mutual information.

State-space modeling provides a powerful tool for system identification and prediction. In linear state-space models the data are usually assumed to be Gaussian and the models have certain structural constraints such that they are identifiable. In this paper we propose a non-Gaussian state-space model which does not have such constraints. We prove that this model is fully identifiable. We then propose an efficient two-step method for parameter estimation: one first extracts the subspace of the latent processes based on the temporal information of the data, and then performs multichannel blind deconvolution, making use of both the temporal information and non-Gaussianity. We conduct a series of simulations to illustrate the performance of the proposed method. Finally, we apply the proposed model and parameter estimation method on real data, including major world stock indices and magnetoencephalography (MEG) recordings. Experimental results are encouraging and show the practical usefulness of the proposed model and method.

Taking a sharp photo at several megapixel resolution traditionally relies on high grade lenses. In this paper, we present an approach to alleviate image degradations caused by imperfect optics. We rely on a calibration step to encode the optical aberrations in a space-variant point spread function and obtain a corrected image by non-stationary deconvolution. By including the Bayer array in our image formation model, we can perform demosaicing as part of the deconvolution.

Output kernel learning techniques make it possible to simultaneously learn a vector-valued function and a positive semidefinite matrix which describes the relationships between the outputs. In this paper, we introduce a new formulation that imposes a low-rank constraint on the output kernel and operates directly on a factor of the kernel matrix. First, we investigate the connection between output kernel learning and a regularization problem for an architecture with two layers. Then, we show that a variety of methods such as nuclear norm regularized regression, reduced-rank regression, principal component analysis, and low-rank matrix approximation can be seen as special cases of the output kernel learning framework. Finally, we introduce a block coordinate descent strategy for learning low-rank output kernels.

This paper focuses on the stability condition of a teleoperation system with packet loss in the communication channel. The communication channel between master and slave causes packet loss, which leads to performance degradation and instability of the teleoperation system. We consider a two-channel control architecture for the teleoperation system, in which the control inputs to the remote site are produced from the positions of the master and slave. The teleoperation system is modeled in the discrete domain to include the packet loss process. The stability condition for the teleoperation system with packet loss is then discussed in terms of input-to-state stability. Finally, the stability condition is presented as a linear matrix inequality (LMI).

Camera shake leads to non-uniform image blurs. State-of-the-art methods for removing camera shake model the blur as a linear combination of homographically transformed versions of the true image. While this is conceptually interesting, the resulting algorithms are computationally demanding. In this paper we develop a forward model based on the efficient filter flow framework, incorporating the particularities of camera shake, and show how an efficient algorithm for blur removal can be obtained. Comprehensive comparisons on a number of real-world blurry images show that our approach is not only substantially faster, but it also leads to better deblurring results.

The 3D shape of the human body is useful for applications in fitness, games and apparel. Accurate body scanners, however, are expensive, limiting the availability of 3D body models. We present a method for human shape reconstruction from noisy monocular image and range data using a single inexpensive commodity sensor. The approach combines low-resolution image silhouettes with coarse range data to estimate a parametric model of the body. Accurate 3D shape estimates are obtained by combining multiple monocular views of a person moving in front of the sensor. To cope with varying body pose, we use a SCAPE body model which factors 3D body shape and pose variations. This enables the estimation of a single consistent shape while allowing pose to vary. Additionally, we describe a novel method to minimize the distance between the projected 3D body contour and the image silhouette that uses analytic derivatives of the objective function. We propose a simple method to estimate standard body measurements from the recovered SCAPE model and show that the accuracy of our method is competitive with commercial body scanning systems costing orders of magnitude more.

Our method for attenuation correction (AC) in MR-BrainPET with segmented T1-weighted MR images of the patient's head was applied to data from different MR-BrainPET scanners (Jülich, Tübingen) and compared to CT-based results. The study objectives presented in this paper are twofold. The first objective is to examine whether the segmentation method developed for and successfully applied to 3D MP-RAGE data can also be used to segment other T1-weighted MR data such as 3D FLASH data. The second aim is to show whether the similarity of segmented MR-based AC (SBA) and CT-based AC (CBA) obtained at HR+ PET can also be confirmed for BrainPET, for which the new AC method is intended. To reach the first objective, 14 segmented MR data sets (three 3D MP-RAGE data sets from Jülich and eleven 3D FLASH data sets from Tübingen) were compared to the respective CT data based on the Dice coefficient and scatter plots. For bone, a CT threshold of HU>500 was applied. Dice coefficients (mean±std) for the upper cranial part of the skull, the skull above cavities, and the caudal part including the cerebellum are 0.73±0.1, 0.79±0.04, and 0.49±0.02 for the Jülich data and 0.7±0.1, 0.72±0.1, and 0.60±0.05 for the Tübingen data. To reach the second aim, SBA and CBA were compared for six subjects based on VOI (AAL atlas) analysis. Mean absolute relative difference (maRD) values are maRD(JüFVBW1-FDG): 0.99%±0.83%, maRD(JüFVBW2-FDG): 0.90%±0.89%, and maRD(JüEP-Flumazenil): 1.85%±1.25% for the Jülich data and maRD(TuTP02-FDG): 2.99%±1.65%, maRD(TuNP01-FDG): 5.37%±2.29%, and maRD(TuNP02-FDG): 6.52%±1.69% for the three best-segmented Tübingen data sets. The results show similar segmentation quality for both T1-weighted MR sequence types. The application to AC in BrainPET shows a high similarity to CT-based AC if the standardized ACF value for bone used in SBA is in good accordance with the bone density of the patient in question.
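The Dice coefficient used for the segmentation comparison above can be sketched as follows. This is a minimal illustration on flattened binary masks, not the evaluation code of the study; the function names and the convention for two empty masks are assumptions, while the bone threshold of 500 HU is taken from the abstract.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    size_a = sum(1 for a in mask_a if a)
    size_b = sum(1 for b in mask_b if b)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)

def bone_mask_from_ct(hounsfield_values, threshold=500):
    """Label a voxel as bone when its CT value exceeds the HU threshold."""
    return [hu > threshold for hu in hounsfield_values]
```

A value of 1 means the MR-based and CT-based bone masks coincide exactly, and 0 means they do not overlap at all.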

The statistical analysis of large corpora of human body scans requires that these scans be in alignment, either for a small set of key landmarks or densely for all the vertices in the scan. Existing techniques tend to rely on hand-placed landmarks or algorithms that extract landmarks from scans. The former is time consuming and subjective while the latter is error prone. Here we show that a model-based approach can align meshes automatically, producing alignment accuracy similar to that of previous methods that rely on many landmarks. Specifically, we align a low-resolution, artist-created template body mesh to many high-resolution laser scans. Our alignment procedure employs a robust iterative closest point method with a regularization that promotes smooth and locally rigid deformation of the template mesh. We evaluate our approach on 50 female body models from the CAESAR dataset that vary significantly in body shape. To make the method fully automatic, we define simple feature detectors for the head and ankles, which provide initial landmark locations. We find that, if body poses are fairly similar, as in CAESAR, the fully automated method provides dense alignments that enable statistical analysis and anthropometric measurement.

Playing table tennis is a difficult task for robots, especially due to their limitations of acceleration. A key bottleneck is the amount of time needed to reach the desired hitting position and velocity of the racket for returning the incoming ball. Here, it often does not suffice to simply extrapolate the ball's trajectory after the opponent returns it but more information is needed. Humans are able to predict the ball's trajectory based on the opponent's moves and, thus, have a considerable advantage. Hence, we propose to incorporate an anticipation system into robot table tennis players, which enables the robot to react earlier while the opponent is performing the striking movement. Based on visual observation of the opponent's racket movement, the robot can predict the aim of the opponent and adjust its movement generation accordingly. The policies for deciding how and when to react are obtained by reinforcement learning. We conduct experiments with an existing robot player to show that the learned reaction policy can significantly improve the performance of the overall system.

This paper relates a recently proposed measure of information integration to experiments investigating the evoked high-density electroencephalography (EEG) response to transcranial magnetic stimulation (TMS) during wakefulness, early non-rapid eye movement (NREM) sleep and under anesthesia. We show that bistability, arising at the cellular and population level during NREM sleep and under anesthesia, dramatically reduces the brain’s ability to integrate information.

Many state-of-the-art denoising algorithms focus on recovering high-frequency details in noisy images. However, images corrupted by large amounts of noise are also degraded in the lower frequencies. Thus properly handling all frequency bands allows us to better denoise in such regimes. To improve existing denoising algorithms we propose a meta-procedure that applies existing denoising algorithms across different scales and combines the resulting images into a single denoised image. With a comprehensive evaluation we show that the performance of many state-of-the-art denoising algorithms can be improved.

Learning to grasp novel objects is an essential skill for robots operating in unstructured environments. We therefore propose a probabilistic approach for learning to grasp. In particular, we learn a function that predicts the success probability of grasps performed on surface points of a given object. Our approach is based on Markov Random Fields (MRF), and motivated by the fact that points that are geometrically close to each other tend to have similar grasp success probabilities. The MRF approach is successfully tested in simulation, and on a real robot using 3-D scans of various types of objects. The empirical results show a significant improvement over methods that do not utilize the smoothness assumption and classify each point separately from the others.

In recent work, we have provided evidence that fronto-parietal γ-range oscillations are a cause of within-subject performance variations in brain-computer interfaces (BCIs) based on motor-imagery. Here, we explore the feasibility of using neurofeedback of fronto-parietal γ-power to induce a mental state that is beneficial for BCI-performance. We provide empirical evidence based on two healthy subjects that intentional attenuation of fronto-parietal γ-power results in an enhanced resting-state sensorimotor-rhythm (SMR). As a large resting-state amplitude of the SMR has been shown to correlate with good BCI-performance, our approach may provide a means to reduce performance variations in BCIs.

Learning inverse kinematics of robots with redundant degrees of freedom (DoF) is a difficult problem in robot learning. The difficulty lies in the non-uniqueness of the inverse kinematics function. Existing methods tackle non-uniqueness by segmenting the configuration space and building a global solution from local experts. The usage of local experts implies the definition of an oracle, which governs the global consistency of the local models; the definition of this oracle is difficult. We propose an algorithm suitable for learning the inverse kinematics function in a single global model despite its multivalued nature. Inverse kinematics is approximated from examples using structured output learning methods. Unlike most existing methods, which estimate inverse kinematics on the velocity level, we address the learning of the direct function on the position level. This problem is significantly harder. To support the proposed method, we conducted real-world experiments on a tracking control task and tested our algorithms on these models.

A challenging problem in image restoration is to recover an image with a blurry foreground. Such images can easily occur with modern cameras, when the auto-focus aims mistakenly at the background (which will appear sharp) instead of the foreground, where usually the object of interest is. In this paper we propose an automatic procedure that (i) estimates the amount of out-of-focus blur, (ii) segments the image into foreground and background incorporating clues from the blurriness, (iii) recovers the sharp foreground, and finally (iv) blurs the background to refocus the scene. On several real photographs with blurry foreground and sharp background, we demonstrate the effectiveness and limitations of our method.

Many complex robot motor skills can be represented using elementary movements, and there exist efficient techniques for learning parametrized motor plans using demonstrations and self-improvement. However, in many cases, the robot currently needs to learn a new elementary movement even if a parametrized motor plan exists that covers a similar, related situation. Clearly, a method is needed that modulates the elementary movement through the meta-parameters of its representation. In this paper, we show how to learn such mappings from circumstances to meta-parameters using reinforcement learning. We introduce an appropriate reinforcement learning algorithm based on a kernelized version of the reward-weighted regression. We compare this algorithm to several previous methods on a toy example and show that it performs well in comparison to standard algorithms. Subsequently, we show two robot applications of the presented setup; i.e., the generalization of throwing movements in darts, and of hitting movements in table tennis. We show that both tasks can be learned successfully using simulated and real robots.

Robust dry EEG electrodes are arguably the key to making EEG Brain-Computer Interfaces (BCIs) a practical technology. Existing studies on dry EEG electrodes can be characterized by the recording method (stand-alone dry electrodes or simultaneous recording with wet electrodes), the dry electrode technology (e.g. active or passive), the paradigm used for testing (e.g. event-related potentials), and the measure of performance (e.g. comparing dry and wet electrode frequency spectra). In this study, an active-dry electrode prototype is tested, during a motor-imagery task, with EEG-BCI in mind. It is used simultaneously with wet electrodes and assessed using classification accuracy. Our results indicate that the two types of electrodes are comparable in their performance but there are improvements to be made, particularly in finding ways to reduce motion-related artifacts.

Task-space tracking control is essential for robot manipulation. In practice, task-space control of redundant robot systems is known to be susceptible to modeling errors. Here, data-driven learning methods may present an interesting alternative approach. However, learning models for task-space tracking control from sampled data is an ill-posed problem. In particular, the same input data point can yield many different output values which can form a non-convex solution space. Because the problem is ill-posed, models cannot be learned from such data using common regression methods. While learning of task-space control mappings is globally ill-posed, it has been shown in recent work that it is locally a well-defined problem. In this paper, we use this insight to formulate a local kernel-based learning approach for online model learning for task-space tracking control. For evaluation, we demonstrate in simulation that the method can learn models online for task-space tracking control of redundant robots.

In pages: 6, International Workshop on Microscopic Image Analysis with Application in Biology (MIAAB), September 2011 (inproceedings)

Abstract

An automatic particle picking algorithm for processing electron micrographs of a large molecular complex, the 26S proteasome, is described. The algorithm makes use of a coherence enhancing diffusion filter to denoise the data, and a random forest classifier for removing false positives. It does not make use of a 3D reference model, but uses a training set of manually picked particles instead. False positive and false negative rates of around 25% to 30% are achieved on a testing set. The algorithm was developed for a specific particle, but contains steps that should be useful for developing automatic picking algorithms for other particles.

This paper presents a comparative study of active learning (AL) and semi-supervised learning (SSL) for the classification of remote sensing (RS) images. The two learning paradigms are analyzed from both the theoretical and the experimental point of view. The aim of this work is to identify the advantages and disadvantages of AL and SSL methods, and to point out the boundary conditions on the applicability of these methods with respect to both the number of available labeled samples and the reliability of classification results. In our experimental analysis, AL and SSL techniques have been applied to the classification of both synthetic and real RS data, defining different classification problems starting from different initial training sets and considering different distributions of the classes. This analysis allowed us to derive important conclusions about the use of these classification approaches and to obtain insight into which of the two approaches is more appropriate according to the specific classification problem, the available initial training set and the available budget for the acquisition of new labeled samples.

Many motor skills consist of many lower level elementary movements that need to be sequenced in order to achieve a task. In order to learn such a task, both the primitive movements and the higher-level strategy need to be acquired at the same time. In contrast, most learning approaches focus either on learning to combine a fixed set of options or on learning just single options. In this paper, we discuss a new approach that allows improving the performance of lower level actions while pursuing a higher level task. The presented approach is applicable to learning a wider range of motor skills, but in this paper, we employ it for learning games where the player wants to improve his performance at the individual actions of the game while still performing well at the strategy level of the game. We propose to learn the lower level actions using Cost-regularized Kernel Regression and the higher level actions using a form of Policy Iteration. The two approaches are coupled by their transition probabilities. We evaluate the approach on a side-stall-style throwing game both in simulation and with a real BioRob.

In Proceedings of the 58th World Statistics Congress, pages: 4456-4461, ISI, August 2011 (inproceedings)

Abstract

We develop a novel method for detection of signals and reconstruction of images in the presence of random noise. The method uses results from percolation theory. We specifically address the problem of detection of multiple objects of unknown shapes in the case of nonparametric noise. The noise density is unknown and can be heavy-tailed. The objects of interest have unknown varying intensities. No boundary shape constraints are imposed on the objects; only a set of weak bulk conditions is required. We view the object detection problem as hypothesis testing for discrete statistical inverse problems. We present an algorithm that can detect greyscale objects of various shapes in noisy images. We prove results on consistency and algorithmic complexity of our procedures. Applications to cryo-electron microscopy are presented.

Opponent modeling is a critical mechanism in repeated games. It allows a player to adapt its strategy in order to better respond to the presumed preferences of its opponents. We introduce a new modeling technique that adaptively balances exploitability and risk reduction. An opponent’s strategy is modeled with a set of possible strategies that contains the actual strategy with high probability. The algorithm is safe, as the expected payoff is above the minimax payoff with high probability, and it can exploit the opponents’ preferences when sufficient observations have been obtained. We apply the technique to normal-form games and stochastic games with a finite number of stages. The performance of the proposed approach is first demonstrated on repeated rock-paper-scissors games. Subsequently, the approach is evaluated in a human-robot table-tennis setting where the robot player learns to prepare to return a served ball. By modeling the human players, the robot chooses a forehand, backhand or middle preparation pose before they serve. The learned strategies can exploit the opponent’s preferences, leading to a higher rate of successful returns.

Many natural processes occur over characteristic spatial and temporal scales. This paper presents tools for (i) flexibly and scalably coarse-graining cellular automata and (ii) identifying which coarse-grainings express an automaton’s dynamics well, and which express its dynamics badly. We apply the tools to investigate a range of examples in Conway’s Game of Life and Hopfield networks and demonstrate that they capture some basic intuitions about emergent processes. Finally, we formalize the notion that a process is emergent if it is better expressed at a coarser granularity.
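To make the idea of coarse-graining a cellular automaton concrete, here is a minimal sketch of one Game of Life update on a torus together with a simple block coarse-graining. The "any cell alive" block rule and the function names are assumptions chosen for illustration; the paper's tools for constructing and scoring coarse-grainings are not reproduced here.

```python
def life_step(grid):
    """One synchronous update of Conway's Game of Life on a torus.

    grid is a list of rows of 0/1 cells; use at least 3 rows and 3 columns
    so that wrap-around neighbours are distinct cells.
    """
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = sum(grid[(r + dr) % rows][(c + dc) % cols]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                    if (dr, dc) != (0, 0))
            # Birth with exactly 3 live neighbours, survival with 2 or 3.
            new[r][c] = 1 if n == 3 or (grid[r][c] == 1 and n == 2) else 0
    return new

def coarse_grain(grid, block):
    """Map each block x block patch to 1 if any cell in it is alive.

    Assumes the grid dimensions are multiples of block (illustrative rule).
    """
    rows, cols = len(grid), len(grid[0])
    return [[int(any(grid[r + i][c + j]
                     for i in range(block) for j in range(block)))
             for c in range(0, cols, block)]
            for r in range(0, rows, block)]
```

Comparing `coarse_grain(life_step(grid), 2)` with the coarse view of `grid` itself shows how much of the fine-grained dynamics remains visible at the coarser scale.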

We develop a novel method for detection of signals and reconstruction of images in the presence of random noise. The method uses results from percolation theory. We specifically address the problem of detection of multiple objects of unknown shapes in the case of nonparametric noise. The noise density is unknown. The objects of interest have unknown varying intensities. No boundary shape constraints are imposed on the objects; only a set of weak bulk conditions is required. We view the object detection problem as a multiple hypothesis testing problem for discrete statistical inverse problems. We present an algorithm that can detect greyscale objects of various shapes in noisy images. We prove results on consistency and algorithmic complexity of our procedures. Applications to cryo-electron microscopy are presented.

Genome-wide association studies (GWAS) have not been able to discover strong associations between many complex human diseases and single genetic loci. Mapping these phenotypes to pairs of genetic loci is hindered by the huge number of candidates leading to enormous computational and statistical problems. In GWAS on single nucleotide polymorphisms (SNPs), one has to consider in the order of $10^{10}$ to $10^{14}$ pairs, which is infeasible in practice. In this article, we give the first algorithm for 2-locus genome-wide association studies that is subquadratic in the number, $n$, of SNPs. The running time of our algorithm is data-dependent, but large experiments over real genomic data suggest that it scales empirically as $n^{3/2}$. As a result, our algorithm can easily cope with $n \sim 10^7$, i.e., it can efficiently search all pairs of SNPs in the human genome.

We present a novel technique for addressing domain adaptation problems in the classification of remote sensing images with active learning. Domain adaptation is the important problem of adapting a supervised classifier trained on a given image (source domain) to the classification of another similar (but not identical) image (target domain) acquired on a different area, or on the same area at a different time. The main idea of the proposed approach is to iteratively label and add to the training set the minimum number of the most informative samples from the target domain, while removing the source-domain samples that do not fit the distributions of the classes in the target domain. In this way, the classification system exploits already available information, i.e., the labeled samples of the source domain, in order to minimize the number of target-domain samples to be labeled, thus reducing the cost associated with the definition of the training set for the classification of the target domain. Experimental results obtained in the classification of a hyperspectral image confirm the effectiveness of the proposed technique.

Many complex robot motor skills can be represented using elementary movements, and there exist efficient techniques for learning parametrized motor plans using demonstrations and self-improvement. However, with current techniques, the robot in many cases needs to learn a new elementary movement even if a parametrized motor plan exists that covers a related situation. A method is needed that modulates the elementary movement through the meta-parameters of its representation. In this paper, we describe how to learn such mappings from circumstances to meta-parameters using reinforcement learning. In particular, we use a kernelized version of the reward-weighted regression. We show two robot applications of the presented setup: the generalization of throwing movements in darts, and of hitting movements in table tennis. We demonstrate that both tasks can be learned successfully using simulated and real robots.

Most results for online decision problems with structured concepts, such as trees or cuts, assume linear costs. In many settings, however, nonlinear costs are more realistic. Owing to their non-separability, these lead to much harder optimization problems. Going beyond linearity, we address online approximation algorithms for structured concepts that allow the cost to be submodular, i.e., nonseparable. In particular, we show regret bounds for three Hannan-consistent strategies that capture different settings. Our results also tighten a regret bound for unconstrained online submodular minimization.
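For readers unfamiliar with submodular costs, the defining inequality f(A) + f(B) ≥ f(A ∪ B) + f(A ∩ B) can be checked by brute force on a small ground set. The sketch below uses a weighted graph cut as the example cost, which is a standard submodular function; the function names and the example graph are assumptions, and the online algorithms and regret bounds of the paper are not implemented here.

```python
from itertools import combinations

def cut_value(edges, subset):
    """Total weight of edges (u, v, w) with exactly one endpoint in subset."""
    return sum(w for u, v, w in edges if (u in subset) != (v in subset))

def is_submodular(f, ground_set, tol=1e-12):
    """Brute-force check of f(A) + f(B) >= f(A | B) + f(A & B) for all A, B.

    Exponential in the size of ground_set; meant for tiny examples only.
    """
    items = sorted(ground_set)
    subsets = [frozenset(c) for k in range(len(items) + 1)
               for c in combinations(items, k)]
    return all(f(a) + f(b) >= f(a | b) + f(a & b) - tol
               for a in subsets for b in subsets)
```

A cost such as `lambda s: len(s) ** 2` fails the check, whereas a linear (separable) cost satisfies the inequality with equality.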

In pages: 1-8, ICML Workshop on Online Trading of Exploration and Exploitation 2, July 2011 (inproceedings)

Abstract

We develop a coherent framework for integrative simultaneous analysis of the exploration-exploitation and model order selection trade-offs. We improve over our preceding results on the same subject (Seldin et al., 2011) by combining PAC-Bayesian analysis with a Bernstein-type inequality for martingales. Such a combination is also of independent interest for studies of multiple simultaneously evolving martingales.

We describe a method that infers whether statistical dependences between two observed variables X and Y are due to a "direct" causal link or only due to a connecting causal path that contains an unobserved variable of low complexity, e.g., a binary variable. This problem is motivated by statistical genetics. Given a genetic marker that is correlated with a phenotype of interest, we want to detect whether this marker is causal or it only correlates with a causal one. Our method is based on the analysis of the location of the conditional distributions P(Y|x) in the simplex of all distributions of Y. We report encouraging results on semi-empirical data.

We show how the SVM can be viewed as a maximum likelihood estimate of a class of probabilistic models. This model class can be viewed as a reparametrization of the SVM in a similar vein to the ν-SVM reparametrizing the classical (C-)SVM. It is not discriminative, but has a non-uniform marginal. We illustrate the benefits of this new view by rederiving and re-investigating two established SVM-related algorithms.

This work addresses the following question: Under what assumptions on the data generating process can one infer the causal graph from the joint distribution? The approach taken by conditional independence-based causal discovery methods is based on two assumptions: the Markov condition and faithfulness. It has been shown that under these assumptions the causal graph can be identified up to Markov equivalence (some arrows remain undirected) using methods like the PC algorithm. In this work we propose an alternative based on Identifiable Functional Model Classes (IFMOCs). As our main theorem we prove that if the data generating process belongs to an IFMOC, one can identify the complete causal graph. To the best of our knowledge this is the first identifiability result of this kind that is not limited to linear functional relationships. We discuss how the IFMOC assumption and the Markov and faithfulness assumptions relate to each other and explain why we believe that the IFMOC assumption can be tested more easily on given data. We further provide a practical algorithm that recovers the causal graph from finitely many data; experiments on simulated data support the theoretical findings.

Nearest neighbor ($k$-NN) graphs are widely used in machine learning and data mining applications, and our aim is to better understand what they reveal about the cluster structure of the unknown underlying distribution of points. Moreover, is it possible to identify spurious structures that might arise due to sampling variability? Our first contribution is a statistical analysis that reveals how certain subgraphs of a $k$-NN graph form a consistent estimator of the cluster tree of the underlying distribution of points. Our second and perhaps most important contribution is the following finite sample guarantee. We carefully work out the tradeoff between aggressive and conservative pruning and are able to guarantee the removal of all spurious cluster structures while at the same time guaranteeing the recovery of salient clusters. This is the first such finite sample result in the context of clustering.
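
To make the object of study concrete, the sketch below builds a symmetric $k$-NN graph and reads off its connected components as candidate clusters. It only illustrates the graph construction; the pruning procedure and the cluster-tree estimator analyzed in the paper are not reproduced, and the function names and the toy data (two well-separated 5 x 5 grids) are assumptions.

```python
import numpy as np

def knn_graph(points, k):
    """Symmetric k-NN adjacency: i ~ j if j is among the k nearest
    neighbours of i, or i is among the k nearest neighbours of j."""
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]
    n = len(points)
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), neighbours.ravel()] = True
    return adj | adj.T

def connected_components(adj):
    """Label each node with the index of its connected component."""
    n = adj.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for nb in np.flatnonzero(adj[node]):
                if labels[nb] < 0:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels

# Toy data (assumed): two 5 x 5 grids placed far apart.
grid = np.array([(i, j) for i in range(5) for j in range(5)], dtype=float)
points = np.vstack([grid, grid + 100.0])
labels = connected_components(knn_graph(points, k=4))
```

On this toy input each grid forms its own component, because every point's four nearest neighbours lie in its own grid and all unit-distance grid edges are included.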

We propose a method that infers whether linear relations between two high-dimensional variables X and Y are due to a causal influence from X to Y or from Y to X. The earlier proposed so-called Trace Method is extended to the regime where the dimension of the observed variables exceeds the sample size. Based on previous work, we postulate conditions that characterize a causal relation between X and Y. Moreover, we describe a statistical test and argue that both causal directions are typically rejected if there is a common cause. A full theoretical analysis is presented for the deterministic case but our approach seems to be valid for the noisy case, too, for which we additionally present an approach based on a sparsity constraint. The discussed method yields promising results for both simulated and real world data.

The internal structure of a measuring device, which depends on what its components are and how they are organized, determines how it categorizes its inputs. This paper presents a geometric approach to studying the internal structure of measurements performed by distributed systems such as probabilistic cellular automata. It constructs the quale, a family of sections of a suitably defined presheaf, whose elements correspond to the measurements performed by all subsystems of a distributed system. Using the quale we quantify (i) the information generated by a measurement; (ii) the extent to which a measurement is context-dependent; and (iii) whether a measurement is decomposable into independent submeasurements, which turns out to be equivalent to context-dependence. Finally, we show that only indecomposable measurements are more informative than the sum of their submeasurements.

A neurorehabilitation approach that combines robot-assisted active physical therapy and Brain-Computer Interfaces (BCIs) may provide additional mileage with respect to traditional rehabilitation methods for patients with severe motor impairment due to cerebrovascular brain damage (e.g., stroke) and other neurological conditions. In this paper, we describe the design and modes of operation of a robot-based rehabilitation framework that enables artificial support of the sensorimotor feedback loop. The aim is to increase cortical plasticity by means of Hebbian-type learning rules. A BCI-based shared-control strategy is used to drive a Barrett WAM 7-degree-of-freedom arm that guides a subject's arm. Experimental validation of our setup is carried out both with healthy subjects and stroke patients. We review the empirical results which we have obtained to date, and argue that they support the feasibility of future rehabilitative treatments employing this novel approach.

Time plays an essential role in the diffusion of information, influence and disease over networks. In many cases we only observe when a node copies information, makes a decision or becomes infected -- but the connectivity, transmission rates between nodes and transmission sources are unknown. Inferring the underlying dynamics is of outstanding interest since it enables forecasting, influencing and retarding infections, broadly construed. To this end, we model diffusion processes as discrete networks of continuous temporal processes occurring at different rates. Given cascade data -- observed infection times of nodes -- we infer the edges of the global diffusion network and estimate the transmission rates of each edge that best explain the observed data. The optimization problem is convex. The model naturally (without heuristics) imposes sparse solutions and requires no parameter tuning. The problem decouples into a collection of independent smaller problems, thus scaling easily to networks on the order of hundreds of thousands of nodes. Experiments on real and synthetic data show that our algorithm both recovers the edges of diffusion networks and accurately estimates their transmission rates from cascade data.

We derive a generalized notion of f-divergences, called (f,l)-divergences. We show that this generalization enjoys many of the nice properties of f-divergences, although it is a richer family. It also provides alternative definitions of standard divergences in terms of surrogate risks. As a first practical application of this theory, we derive a new estimator for the Kullback-Leibler divergence that we use for clustering sets of vectors.
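
For orientation, the standard f-divergence that this work generalizes can be written as follows, assuming P is absolutely continuous with respect to Q. This is the textbook definition only; the (f,l)-generalization and the new estimator are not reproduced here.

```latex
D_f(P \,\|\, Q) = \int f\!\left(\frac{dP}{dQ}\right) dQ,
\qquad f \text{ convex},\; f(1) = 0,
\qquad\text{and}\qquad
\mathrm{KL}(P \,\|\, Q) = \int \log\!\left(\frac{dP}{dQ}\right) dP
\quad\text{for } f(t) = t \log t .
```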

Conditional independence testing is an important problem, especially in Bayesian network learning and causal discovery. Due to the curse of dimensionality, testing for conditional independence of continuous variables is particularly challenging. We propose a Kernel-based Conditional Independence test (KCI-test), by constructing an appropriate test statistic and deriving its asymptotic distribution under the null hypothesis of conditional independence. The proposed method is computationally efficient and easy to implement. Experimental results show that it outperforms other methods, especially when the conditioning set is large or the sample size is not very large, in which case other methods encounter difficulties.

We analyze a family of probability distributions that are characterized by an embedded combinatorial structure. This family includes models having arbitrary treewidth and arbitrary sized factors. Unlike general models with such freedom, where the “most probable explanation” (MPE) problem is inapproximable, the combinatorial structure within our model, in particular the indirect use of submodularity, leads to several MPE algorithms that all have approximation guarantees.

In pages: 1-4, CVPR Workshop on Inference in Graphical Models with Structured Potentials, June 2011 (inproceedings)

Abstract

Recently, a family of global, non-submodular energy functions has been proposed that is expressed as coupling edges in a graph cut. This formulation provides a rich modelling framework and also leads to efficient approximate inference algorithms. So far, the results addressed binary random variables. Here, we extend these results to the multi-label case, and combine edge coupling with move-making algorithms.

We propose a new family of non-submodular global energy functions that still use submodularity internally to couple edges in a graph cut. We show it is possible to develop an efficient approximation algorithm that, thanks to the internal submodularity, can use standard graph cuts as a subroutine. We demonstrate the advantages of edge coupling in a natural setting, namely image segmentation. In particular, for fine-structured objects and objects with shading variation, our structured edge coupling leads to significant improvements over standard approaches.

Cross-spectral density (CSD) is widely used to find linear dependency between two real or complex valued time series. We define a non-linear extension of this measure by mapping the time series into two Reproducing Kernel Hilbert Spaces. The dependency is quantified by the Hilbert-Schmidt norm of a cross-spectral density operator between these two spaces. We prove that, by choosing a characteristic kernel for the mapping, this quantity detects any pairwise dependency between the time series. Then we provide a fast estimator for the Hilbert-Schmidt norm based on the Fast Fourier Transform. We demonstrate the usefulness of this approach for quantifying non-linear dependencies between frequency bands of simulated signals and intra-cortical neural recordings.
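
For reference, the ordinary linear CSD that serves as the starting point can be estimated with the FFT as sketched below. This is only the linear baseline, not the kernel cross-spectral density operator or its Hilbert-Schmidt norm estimator; the function name, the non-overlapping Hann-windowed segments, and the toy signals are assumptions.

```python
import numpy as np

def cross_spectral_density(x, y, fs=1.0, nperseg=256):
    """Welch-style linear CSD estimate from non-overlapping,
    Hann-windowed segments of two real-valued time series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n_segments = min(len(x), len(y)) // nperseg
    if n_segments == 0:
        raise ValueError("time series shorter than one segment")
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)  # density normalization
    pxy = np.zeros(nperseg // 2 + 1, dtype=complex)
    for i in range(n_segments):
        seg = slice(i * nperseg, (i + 1) * nperseg)
        fx = np.fft.rfft(window * (x[seg] - x[seg].mean()))
        fy = np.fft.rfft(window * (y[seg] - y[seg].mean()))
        pxy += np.conj(fx) * fy  # averaged cross-periodogram
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, pxy / (n_segments * scale)

# Toy signals (assumed): both share a 10 Hz component, with independent noise.
rng = np.random.default_rng(0)
fs = 100.0
t = np.arange(4000) / fs
x = np.sin(2 * np.pi * 10.0 * t) + 0.5 * rng.standard_normal(t.size)
y = np.sin(2 * np.pi * 10.0 * t + 0.3) + 0.5 * rng.standard_normal(t.size)
freqs, pxy = cross_spectral_density(x, y, fs=fs, nperseg=200)
peak_frequency = freqs[np.argmax(np.abs(pxy))]
```

The magnitude of the estimate peaks at the shared 10 Hz component. Being a linear measure, it would miss dependencies that are purely non-linear, which is the gap the kernel extension is designed to close.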

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2011), pages: 3719-3726, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2011 (inproceedings)

Abstract

Many real-world tasks require fast planning of highly dynamic movements for their execution in real-time. The success often hinges on quickly finding one of the few plans that can achieve the task at all. A further challenge is to quickly find a plan which optimizes a desired cost. In this paper, we will discuss this problem in the context of catching small flying targets efficiently. This can be formulated as a non-linear optimization problem where the desired trajectory is encoded by an adequate parametric representation. The optimizer generates an energy-optimal trajectory by efficiently using the robot kinematic redundancy while taking into account maximal joint motion, collision avoidance and local minima. To enable the resulting method to work in real-time, examples of the global planner are generalized using nearest neighbour approaches, Support Vector Machines and Gaussian process regression, which are compared in this context. Evaluations indicate that the presented method is highly efficient in complex tasks such as ball-catching.

In recent work, we have provided evidence that fronto-parietal γ-oscillations of the electromagnetic field of the brain modulate the sensorimotor-rhythm. It is unclear, however, what impact this effect may have on explaining and addressing within-subject performance variations of brain-computer interfaces (BCIs). In this paper, we provide evidence that, on a group average, classification accuracies in a two-class motor-imagery paradigm differ by up to 22.2% depending on the state of fronto-parietal γ-power. As such, this effect may have a large impact on the design of future BCI-systems. We further investigate whether adapting classification procedures to the current state of γ-power improves classification accuracy, and discuss other approaches to exploiting this effect.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2011), pages: 1856-1861, IEEE, Piscataway, NJ, USA, IEEE International Conference on Robotics and Automation (ICRA), May 2011 (inproceedings)

Abstract

Future service robots will need to perform a wide range of tasks using various objects. In order to perform complex tasks, robots require a suitable internal representation of the task. We propose a hybrid framework for representing manipulation tasks, which combines continuous motion planning and discrete task-level planning. In addition, we use a mid-level planner to optimize individual actions according to the plan. The proposed framework incorporates biologically-inspired concepts, such as affordances and motor primitives, in order to efficiently plan for manipulation tasks. The final framework is modular, can generalize well to different situations, and is straightforward to expand. Our demonstrations also show how the use of affordances and mid-level planning can lead to improved performance.
