Linear causal analysis is central to a wide range of important applications spanning finance, the physical sciences, and engineering. Much of the existing literature on linear causal analysis operates in the time domain. Unfortunately, directly applying time-domain linear causal analysis to many real-world time series presents three critical challenges: irregular temporal sampling, long-range dependencies, and scale. Real-world data is often collected at irregular time intervals across vast arrays of decentralized sensors, and its long-range dependencies render naive time-domain correlation estimators spurious. In this paper we present a frequency-domain estimation framework that naturally handles irregularly sampled data and long-range dependencies while enabling memory- and communication-efficient distributed processing of time series data. By operating in the frequency domain we eliminate the need to interpolate and help mitigate the effects of long-range dependencies. We implement and evaluate our new workflow in the distributed setting using Apache Spark and demonstrate, on both Monte Carlo simulations and high-frequency financial trading data, that we can accurately recover causal structure at scale.
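The key idea of avoiding interpolation can be illustrated with a non-uniform discrete Fourier transform, which evaluates the Fourier sum directly at the irregular timestamps. This is a minimal numpy sketch of that idea, not the paper's full estimator; the signal frequencies and noise levels are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Irregular sampling: 500 timestamps drawn at random over [0, 100) seconds
t = np.sort(rng.uniform(0, 100, size=500))

# Two series sharing a 0.2 Hz component, plus independent noise
x = np.sin(2 * np.pi * 0.2 * t) + 0.3 * rng.standard_normal(t.size)
y = 0.8 * np.sin(2 * np.pi * 0.2 * t) + 0.3 * rng.standard_normal(t.size)

def nudft(values, times, freqs):
    """Non-uniform DFT: evaluates the Fourier sum at the actual sample
    timestamps, so no interpolation onto a regular grid is needed."""
    return np.exp(-2j * np.pi * np.outer(freqs, times)) @ values

freqs = np.linspace(0.01, 1.0, 100)
X, Y = nudft(x, t, freqs), nudft(y, t, freqs)

# Cross-spectrum: large magnitude at frequencies where the series co-vary
S_xy = X * np.conj(Y)
peak = freqs[np.argmax(np.abs(S_xy))]
print(f"cross-spectral peak near {peak:.2f} Hz")
```

Because each frequency's transform is a sum over samples, a distributed implementation can compute partial sums per sensor or partition and combine them, which is what makes the frequency domain attractive for decentralized data.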

Recent research applying information theory to causal analysis has shown that the causal structure of some systems can actually come into focus and be more informative at a macroscale. That is, a macroscale description of a system (a map) can be more informative than a fully detailed microscale description of the system (the territory). This has been called “causal emergence.”
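This line of work quantifies informativeness with effective information (EI): the mutual information between a maximum-entropy intervention over a system's states and the resulting effects. The following numpy sketch uses a standard toy example of this kind (a noisy micro transition matrix whose coarse-graining is deterministic); the specific matrices are illustrative, not drawn from any particular paper:

```python
import numpy as np

def effective_information(tpm):
    """EI of a transition matrix: mutual information between a uniform
    (maximum-entropy) intervention over states and the resulting effects."""
    tpm = np.asarray(tpm, dtype=float)
    effect = tpm.mean(axis=0)  # effect distribution under do(uniform)
    kl = np.where(tpm > 0, tpm * np.log2(np.where(tpm > 0, tpm, 1) / effect), 0.0)
    return kl.sum(axis=1).mean()

# Micro: states 0-2 wander noisily among themselves; state 3 is absorbing
micro = np.array([[1/3, 1/3, 1/3, 0],
                  [1/3, 1/3, 1/3, 0],
                  [1/3, 1/3, 1/3, 0],
                  [0,   0,   0,   1]])

# Macro: group {0,1,2} -> A and {3} -> B; coarse dynamics are deterministic
macro = np.array([[1, 0],
                  [0, 1]])

print(effective_information(micro))  # ≈ 0.81 bits
print(effective_information(macro))  # 1.0 bit: the macroscale is more informative
```

The macro description carries more effective information than the micro one, which is the signature of causal emergence in this framework.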

One of the basic assumptions implicit in the way physics is usually done is that all causation flows bottom-up, from micro to macro scales. However, this is wrong in many cases in biology, and in particular in the way the brain functions. Here I make the case that it is also wrong in the case of digital computers, the paradigm of mechanistic algorithmic causation, and in many cases in physics, ranging from the origin of the arrow of time to the process of state vector preparation. I consider some examples from classical physics, as well as the case of digital computers, and then explain why this is possible without contradicting the causal powers of the underlying microphysics. Understanding the emergence of genuine complexity out of the underlying physics depends on recognising this kind of causation.

Unlike previous approaches,
CGNN leverages both conditional independences and distributional asymmetries
to seamlessly discover bivariate and multivariate causal structures, with or without
hidden variables. CGNN estimates not only the causal structure but also a full,
differentiable generative model of the data.

We believe that our approach opens new avenues of research, both from the point of view of leveraging
the power of deep learning in causal discovery and from the point of view of building deep networks
with better structural interpretability. Once the model is learned, CGNNs have the advantage
of being fully parametrized and may be used to simulate interventions on one or more variables of the
model and evaluate their impact on a set of target variables. This usage is relevant in a wide variety
of domains, notably medicine and sociology.
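The interventional use of a learned generative model can be made concrete with a toy structural model. The linear mechanisms below are illustrative stand-ins for CGNN's learned neural networks; the variables, coefficients, and intervention value are invented for this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generative causal model: each variable is a function of its parents
# plus independent noise, mirroring how a fitted CGNN generates samples.
def simulate(n, do_x=None):
    # do_x=None samples observationally; otherwise X is clamped (do-operator)
    x = rng.standard_normal(n) if do_x is None else np.full(n, do_x)
    y = 2.0 * x + rng.standard_normal(n)   # mechanism X -> Y
    z = -1.0 * y + rng.standard_normal(n)  # mechanism Y -> Z
    return x, y, z

# Compare the target Z under the observational and interventional regimes
_, _, z_obs = simulate(100_000)
_, _, z_do = simulate(100_000, do_x=1.0)

print(z_obs.mean())  # ≈ 0: E[Z] observationally
print(z_do.mean())   # ≈ -2: E[Z | do(X=1)] = (-1) * 2 * 1
```

Because the whole model is generative, any variable can be clamped and the downstream effects re-simulated, which is exactly the usage the abstract describes.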

This calculus entails finding and applying controlled interventions to an evolving object to estimate how its algorithmic information content is affected, in terms of positive or negative shifts toward or away from randomness, in connection with causation. The approach is an alternative to statistical approaches for inferring causal relationships and for formulating theoretical expectations from perturbation analysis.
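The flavor of this perturbation calculus can be sketched with a crude, computable proxy: using compressed length in place of (uncomputable) algorithmic complexity, perturb an object element by element and record the resulting complexity shifts. This is only an illustration of the idea, not the authors' actual estimator, which uses more refined measures than general-purpose compression:

```python
import random
import zlib

def complexity(s: bytes) -> int:
    """Compressed length: a rough, computable stand-in for algorithmic
    (Kolmogorov) complexity, which cannot be computed exactly."""
    return len(zlib.compress(s, 9))

def mean_shift(s: bytes) -> float:
    """Substitute a fixed byte at each position in turn and average the
    absolute shift in complexity that each perturbation causes."""
    base = complexity(s)
    shifts = [abs(complexity(s[:i] + b"X" + s[i + 1:]) - base)
              for i in range(len(s))]
    return sum(shifts) / len(shifts)

random.seed(0)
structured = b"01" * 128                                  # simple, regular object
noisy = bytes(random.getrandbits(8) for _ in range(256))  # random object

# Perturbing a simple object moves it sharply toward randomness, while a
# random object is already incompressible, so perturbations barely register.
print(mean_shift(structured), mean_shift(noisy))
```

Elements whose perturbation produces large shifts are, in this framework, the ones carrying causal/algorithmic structure, which is the information the calculus extracts.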

Current machine learning systems operate, almost exclusively, in a purely statistical mode,
which puts severe theoretical limits on their performance. We consider the feasibility of leveraging
counterfactual reasoning in machine learning tasks and identify areas where such
reasoning could lead to major breakthroughs in machine learning applications.
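What separates counterfactual reasoning from the statistical mode can be shown with Pearl's abduction-action-prediction recipe on a toy linear structural causal model. The model, values, and coefficients below are invented for illustration:

```python
# Toy SCM: T = u_t,  Y = 2*T + u_y  (treatment T causes outcome Y).
# Counterfactual query: given the factual observation (T=1, Y=2.5),
# what WOULD Y have been had T been 0?

t_obs, y_obs = 1.0, 2.5  # factual instance

# 1. Abduction: recover the exogenous noise consistent with the observation
u_y = y_obs - 2.0 * t_obs  # u_y = 0.5

# 2. Action: intervene on the treatment, do(T = 0)
t_cf = 0.0

# 3. Prediction: replay the mechanism with the recovered noise
y_cf = 2.0 * t_cf + u_y

print(y_cf)  # 0.5 -> "had T been 0, Y would have been 0.5"
```

No purely statistical summary of (T, Y) pairs answers this unit-level question: it requires the mechanism and the noise recovered for this specific instance, which is why counterfactuals sit above statistical learning in the causal hierarchy.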

We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for when the deconfounder leads to unbiased causal estimates, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data on smoking and lung cancer, semi-simulated genome-wide association study data, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating close-to-truth causal effects.
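The substitute-confounder idea can be sketched in a few lines of numpy. Here a single latent confounder drives many observed causes and the outcome; the first principal component of the causes stands in for the fitted factor model, and all coefficients and sample sizes are invented for illustration (the paper itself uses probabilistic factor models plus predictive checks, not plain PCA):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10_000, 20

# One unobserved confounder c drives all m causes and the outcome
c = rng.standard_normal(n)
A = c[:, None] + 0.5 * rng.standard_normal((n, m))
y = 1.0 * A[:, 0] + 2.0 * c + 0.3 * rng.standard_normal(n)  # true effect of A[:,0] is 1.0

# Naive regression of y on the focal cause alone is confounded by c
naive = np.linalg.lstsq(A[:, :1], y, rcond=None)[0][0]

# Deconfounder-style substitute: a one-factor summary of all the causes
# (first principal component, standing in for an inferred latent variable)
A_centered = A - A.mean(axis=0)
_, _, vt = np.linalg.svd(A_centered, full_matrices=False)
z = A_centered @ vt[0]  # substitute confounder

# Adjusting for the substitute yields a near-unbiased effect estimate
X = np.column_stack([A[:, 0], z])
adjusted = np.linalg.lstsq(X, y, rcond=None)[0][0]

print(f"naive {naive:.2f}  adjusted {adjusted:.2f}  (truth 1.0)")
```

Because the confounder affects many causes at once, it leaves a shared signature across them that the factor model can recover; this multiple-cause structure is what lets the deconfounder substitute for an unobserved variable.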