The problem of inferring causal relations from observed correlations is relevant to a wide variety of scientific disciplines. Yet given the correlations between just two classical variables, it is impossible to determine whether they arose from a causal influence of one on the other or a common cause influencing both. Only a randomized trial can settle the issue. Here we consider the problem of causal inference for quantum variables. We show that the analogue of a randomized trial, causal tomography, yields a complete solution. We also show that, in contrast to the classical case, one can sometimes infer the causal structure from observations alone. We implement a quantum-optical experiment wherein we control the causal relation between two optical modes, and two measurement schemes—with and without randomization—that extract this relation from the observed correlations. Our results show that entanglement and quantum coherence provide an advantage for causal inference.

The Kalman filter is a well-established approach to get information on the time-dependent state of a system from noisy observations. It was developed in the context of the Apollo project to see the deviation of the true trajectory of a rocket from the desired trajectory. Afterwards it was applied to many different systems with small numbers of components of the respective state vector (typically about 10). In all cases the equation of motion for the state vector was known exactly. The fast dissipative magnetization dynamics is often investigated by x-ray magnetic circular dichroism movies (XMCD movies), which are often very noisy. In this situation the number of components of the state vector is extremely large (about 105), and the equation of motion for the dissipative magnetization dynamics (especially the values of the material parameters of this equation) is not well known. In the present paper it is shown by theoretical considerations that – nevertheless – there is no principle problem for the use of the Kalman filter to denoise XMCD movies of fast dissipative magnetization dynamics.

Photometry of stars from the K2 extension of NASA’s Kepler mission is afflicted by systematic effects caused by small (few-pixel) drifts in the telescope pointing and other spacecraft issues. We present a method for searching K2 light curves for evidence of exoplanets by simultaneously fitting for these systematics and the transit signals of interest. This method is more computationally expensive than standard search algorithms but we demonstrate that it can be efficiently implemented and used to discover transit signals. We apply this method to the full Campaign 1 data set and report a list of 36 planet candidates transiting 31 stars, along with an analysis of the pipeline performance and detection efficiency based on artificial signal injections and recoveries. For all planet candidates, we present posterior distributions on the properties of each system based strictly on the transit observables.

Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, 471(2179), 2015 (article)

Abstract

We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such uncertainties, arising from the loss of precision induced by numerical calculation with limited time or hardware, are important for much contemporary science and industry. Within applications such as climate science and astrophysics, the need to make decisions on the basis of computations with large and complex data have led to a renewed focus on the management of numerical uncertainty. We describe how several seminal classic numerical methods can be interpreted naturally as probabilistic inference. We then show that the probabilistic view suggests new algorithms that can flexibly be adapted to suit application specifics, while delivering improved empirical performance. We provide concrete illustrations of the benefits of probabilistic numeric algorithms on real scientific problems from astrometry and astronomical imaging, while highlighting open problems with these new algorithms. Finally, we describe how probabilistic numerical methods provide a coherent framework for identifying the uncertainty in calculations performed with a combination of numerical algorithms (e.g. both numerical optimizers and differential equation solvers), potentially allowing the diagnosis (and control) of error sources in computations.

In a previous study we have shown that human motion trajectories can be characterized by translating continuous trajectories into symbol sequences with well-defined complexity measures. Here we test the hypothesis that the motion complexity individuals generate in their movements might be correlated to the degree of creativity assigned by a human observer to the visualized motion trajectories. We asked participants to generate 55 novel hand movement patterns in virtual reality, where each pattern had to be repeated 10 times in a row to ensure reproducibility. This allowed us to estimate a probability distribution over trajectories for each pattern. We assessed motion complexity not only by the previously proposed complexity measures on symbolic sequences, but we also propose two novel complexity measures that can be directly applied to the distributions over trajectories based on the frameworks of Gaussian Processes and Probabilistic Movement Primitives. In contrast to previous studies, these new methods allow computing complexities of individual motion patterns from very few sample trajectories. We compared the different complexity measures to how a group of independent jurors rank ordered the recorded motion trajectories according to their personal creativity judgment. We found three entropic complexity measures that correlate significantly with human creativity judgment and discuss differences between the measures. We also test whether these complexity measures correlate with individual creativity in divergent thinking tasks, but do not find any consistent correlation. Our results suggest that entropic complexity measures of hand motion may reveal domain-specific individual differences in kinesthetic creativity.

Abstraction and hierarchical information-processing are hallmarks of human and animal intelligence underlying the unrivaled flexibility of behavior in biological systems. Achieving such a flexibility in artificial systems is challenging, even with more and more computational power. Here we investigate the hypothesis that abstraction and hierarchical information-processing might in fact be the consequence of limitations in information-processing power. In particular, we study an information-theoretic framework of bounded rational decision-making that trades off utility maximization against information-processing costs. We apply the basic principle of this framework to perception-action systems with multiple information-processing nodes and derive bounded optimal solutions. We show how the formation of abstractions and decision-making hierarchies depends on information-processing costs. We illustrate the theoretical ideas with example simulations and conclude by formalizing a mathematically unifying optimization principle that could potentially be extended to more complex systems.

Although complex forms of communication like human language are often assumed to have evolved out of more simple forms of sensorimotor signaling, less attention has been devoted to investigate the latter. Here, we study communicative sensorimotor behavior of humans in a two-person joint motor task where each player controls one dimension of a planar motion. We designed this joint task as a game where one player (the sender) possesses private information about a hidden target the other player (the receiver) wants to know about, and where the sender's actions are costly signals that influence the receiver's control strategy. We developed a game-theoretic model within the framework of signaling games to investigate whether subjects' behavior could be adequately described by the corresponding equilibrium solutions. The model predicts both separating and pooling equilibria, in which signaling does and does not occur respectively. We observed both kinds of equilibria in subjects and found that, in line with model predictions, the propensity of signaling decreased with increasing signaling costs and decreasing uncertainty on the part of the receiver. Our study demonstrates that signaling games, which have previously been applied to economic decision-making and animal communication, provide a framework for human signaling behavior arising during sensorimotor interactions in continuous and dynamic environments.

Previous studies have shown that sensorimotor processing can often be described by Bayesian learning, in particular the integration of prior and feedback information depending on its degree of reliability. Here we test the hypothesis that the integration process itself can be tuned to the statistical structure of the environment. We exposed human participants to a reaching task in a three-dimensional virtual reality environment where we could displace the visual feedback of their hand position in a two dimensional plane. When introducing statistical structure between the two dimensions of the displacement, we found that over the course of several days participants adapted their feedback integration process in order to exploit this structure for performance improvement. In control experiments we found that this adaptation process critically depended on performance feedback and could not be induced by verbal instructions. Our results suggest that structural learning is an important meta-learning component of Bayesian sensorimotor integration.

Rate distortion theory describes how to communicate relevant information most efficiently over a channel with limited capacity. One of the many applications of rate distortion theory is bounded rational decision making, where decision makers are modeled as information channels that transform sensory input into motor output under the constraint that their channel capacity is limited. Such a bounded rational decision maker can be thought to optimize an objective function that trades off the decision maker's utility or cumulative reward against the information processing cost measured by the mutual information between sensory input and motor output. In this study, we interpret a spiking neuron as a bounded rational decision maker that aims to maximize its expected reward under the computational constraint that the mutual information between the neuron's input and output is upper bounded. This abstract computational constraint translates into a penalization of the deviation between the neuron's instantaneous and average firing behavior. We derive a synaptic weight update rule for such a rate distortion optimizing neuron and show in simulations that the neuron efficiently extracts reward-relevant information from the input by trading off its synaptic strengths against the collected reward.

Free energy models of learning and acting do not only care about utility or extrinsic value, but also about intrinsic value, that is, the information value stemming from probability distributions that represent beliefs or strategies. While these intrinsic values can be interpreted as epistemic values or exploration bonuses under certain conditions, the framework of bounded rationality offers a complementary interpretation in terms of information-processing costs that we discuss here.

Inferring the causal structure of a set of random variables from a finite sample of the joint distribution is an important problem in science. The case of two random variables is particularly challenging since no (conditional) independences can be exploited. Recent methods that are based on additive noise models suggest the following principle: Whenever the joint distribution {\bf P}^{(X,Y)} admits such a model in one direction, e.g., Y=f(X)+N, N \perp\kern-6pt \perp X, but does not admit the reversed model X=g(Y)+\tilde{N}, \tilde{N} \perp\kern-6pt \perp Y, one infers the former direction to be causal (i.e., X\rightarrow Y). Up to now, these approaches only dealt with continuous variables. In many situations, however, the variables of interest are discrete or even have only finitely many states. In this work, we extend the notion of additive noise models to these cases. We prove that it almost never occurs that additive noise models can be fit in both directions. We further propose an efficient algorithm that is able to perform this way of causal inference on finite samples of discrete variables. We show that the algorithm works on both synthetic and real data sets.

Heritable epigenetic polymorphisms, such as differential cytosine methylation, can underlie phenotypic variation1, 2. Moreover, wild strains of the plant Arabidopsis thaliana differ in many epialleles3, 4, and these can influence the expression of nearby genes1, 2. However, to understand their role in evolution5, it is imperative to ascertain the emergence rate and stability of epialleles, including those that are not due to structural variation. We have compared genome-wide DNA methylation among 10 A. thaliana lines, derived 30 generations ago from a common ancestor6. Epimutations at individual positions were easily detected, and close to 30,000 cytosines in each strain were differentially methylated. In contrast, larger regions of contiguous methylation were much more stable, and the frequency of changes was in the same low range as that of DNA mutations7. Like individual positions, the same regions were often affected by differential methylation in independent lines, with evidence for recurrent cycles of forward and reverse mutations. Transposable elements and short interfering RNAs have been causally linked to DNA methylation8. In agreement, differentially methylated sites were farther from transposable elements and showed less association with short interfering RNA expression than invariant positions. The biased distribution and frequent reversion of epimutations have important implications for the potential contribution of sequence-independent epialleles to plant evolution.

The interplay between optimization and machine learning is one of the most important developments in modern computational science. Optimization formulations and methods are proving to be vital in designing algorithms to extract essential knowledge from huge volumes of data. Machine learning, however, is not simply a consumer of optimization technology but a rapidly evolving field that is itself generating new optimization ideas. This book captures the state of the art of the interaction between optimization and machine learning in a way that is accessible to researchers in both fields.
Optimization approaches have enjoyed prominence in machine learning because of their wide applicability and attractive theoretical properties. The increasing complexity, size, and variety of today's machine learning models call for the reassessment of existing assumptions. This book starts the process of reassessment. It describes the resurgence in novel contexts of established frameworks such as first-order methods, stochastic approximations, convex relaxations, interior-point methods, and proximal methods. It also devotes attention to newer themes such as regularized optimization, robust optimization, gradient and subgradient methods, splitting techniques, and second-order methods. Many of these techniques draw inspiration from other fields, including operations research, theoretical computer science, and subfields of optimization. The book will enrich the ongoing cross-fertilization between the machine learning community and these other fields, and within the broader optimization community.

Motivation: Over the last decade, both static and dynamic fragment libraries for protein structure prediction have been introduced. The former are built from clusters in either sequence or structure space and aim to extract a universal structural alphabet. The latter are tailored for a particular query protein sequence and aim to provide local structural templates that need to be assembled in order to build the full-length structure.
Results: Here, we introduce HHfrag, a dynamic HMM-based fragment search method built on the profile–profile comparison tool HHpred. We show that HHfrag provides advantages over existing fragment assignment methods in that it: (i) improves the precision of the fragments at the expense of a minor loss in sequence coverage; (ii) detects fragments of variable length (6–21 amino acid residues); (iii) allows for gapped fragments and (iv) does not assign fragments to regions where there is no clear sequence conservation. We illustrate the usefulness of fragments detected by HHfrag on targets from most recent CASP.

Direct policy search is a promising reinforcement learning framework, in particular for controlling continuous, high-dimensional systems. Policy search often requires a large number of samples for obtaining a stable policy update estimator, and this is prohibitive when the sampling cost is expensive. In this letter, we extend an expectation-maximization-based policy search method so that previously collected samples can be efficiently reused. The usefulness of the proposed method, reward-weighted regression with sample reuse (R), is demonstrated through robot learning experiments.

Models are among the most essential tools in robotics, such as kinematics and dynamics models of the robot's own body and controllable external objects. It is widely believed that intelligent mammals also rely on internal models in order to generate their actions. However, while classical robotics relies on manually generated models that are based on human insights into physics, future autonomous, cognitive
robots need to be able to automatically generate models that are based on information which is extracted from the data streams accessible to the robot. In this paper, we
survey the progress in model learning with a strong focus on robot control on a kinematic as well as dynamical level. Here, a model describes essential information about the behavior of the environment and the in uence of an agent on this environment. In the context of model based learning control, we view the model from three different perspectives. First, we need to study the dierent possible model learning architectures for robotics. Second, we discuss what kind of problems these architecture and the domain of robotics imply for the applicable learning methods. From this discussion, we deduce future directions of real-time learning algorithms. Third, we show where these
scenarios have been used successfully in several case studies.

We describe factored spectrally transformed linear mixed models (FaST-LMM), an algorithm for genome-wide association studies (GWAS) that scales linearly with cohort size in both run time and memory use. On Wellcome Trust data for 15,000 individuals, FaST-LMM ran an order of magnitude faster than current efficient algorithms. Our algorithm can analyze data for 120,000 individuals in just a few hours, whereas current algorithms fail on data for even 20,000 individuals (http://mscompbio.codeplex.com/).

The amount of information encoded by networks of neurons critically depends on the correlation structure of their activity. Neurons with similar stimulus preferences tend to have higher noise correlations than others. In homogeneous populations of neurons, this limited range correlation structure is highly detrimental to the accuracy of a population code. Therefore, reduced spike count correlations under attention, after adaptation, or after learning have been interpreted as evidence for a more efficient population code. Here, we analyze the role of limited range correlations in more realistic, heterogeneous population models. We use Fisher information and maximum-likelihood decoding to show that reduced correlations do not necessarily improve encoding accuracy. In fact, in populations with more than a few hundred neurons, increasing the level of limited range correlations can substantially improve encoding accuracy. We found that this improvement results from a decrease in noise entropy that is associated with increasing correlations if the marginal distributions are unchanged. Surprisingly, for constant noise entropy and in the limit of large populations, the encoding accuracy is independent of both structure and magnitude of noise correlations.

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments and to use this understanding to design future systems