Iterative Algorithms for Complex Causal Network Discovery

The study of complex networks is a growing area of research with applications in fields ranging from neuroscience, genetics, ecology, social anthropology, and informatics to economics, energetics, and climate research. A key principle in complex network research is to view the system at hand as a network of interacting subsystems; a central question is then how to estimate the pattern of mutual or causal interactions among them. While in some cases (computer or social networks) the existence of connections can be defined naturally, in other systems (neuroscience, climatology) it is often difficult to determine this structure by direct observation. For this reason, methods for inferring causal structure from time series alone have been developed.
A central concept of causality was introduced by Sir Clive Granger in the 1960s. It states that a variable is considered causal with respect to a target variable if its inclusion in a model improves the prediction of the target. This concept has been generalized to nonlinear processes using the framework of information theory. However, the practical applicability of this methodology has been hindered by the need to properly account for all other potentially intervening variables in the system, which brings both computational and accuracy problems that grow with network size (the so-called curse of dimensionality).
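The prediction-improvement idea behind Granger causality can be illustrated for the simplest linear, bivariate, lag-1 case. The following is a minimal sketch under stated assumptions, not the method evaluated in this work: the helper names `residual_variance` and `granger_gain`, the coupling strengths, and the sample size are all hypothetical choices made for illustration. It compares the residual variance of a model that predicts the target from its own past with that of a model that also includes the past of the candidate source.

```python
import numpy as np

def residual_variance(target, predictors):
    """Least-squares fit of target on predictors plus intercept; return residual variance."""
    design = np.column_stack([np.ones(len(target))] + predictors)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)

def granger_gain(source, target):
    """Log ratio of restricted to full residual variance at lag 1 (hypothetical helper)."""
    past_target, past_source, present = target[:-1], source[:-1], target[1:]
    restricted = residual_variance(present, [past_target])
    full = residual_variance(present, [past_target, past_source])
    return np.log(restricted / full)

# Toy data (assumed for illustration): x drives y, but y does not drive x.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.standard_normal()
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

gain_xy = granger_gain(x, y)  # improvement from adding the past of x to the model of y
gain_yx = granger_gain(y, x)  # improvement from adding the past of y to the model of x
print(gain_xy, gain_yx)
```

In this toy setting the gain in the direction x to y should be clearly larger than in the reverse direction. In a network with many nodes, the restricted and full models would both have to condition on the other variables as well, which is where the dimensionality problem described above arises.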
Recently, several methods based on iterative procedures for assessing conditional dependencies have been developed to mitigate this problem. Based on theoretical analysis and numerical simulations, we compare the accuracy and computational demands of two general principles. Numerical simulations were performed on a vector autoregressive process of order 1 with different structural matrices, including a random Erdős-Rényi matrix and matrices that model realistic (neuroscientific and climatic) complex systems.
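A test system of the kind described above, a vector autoregressive process of order 1 driven by a random Erdős-Rényi structural matrix, might be generated as in the following sketch. The network size, link density, coupling strength, noise model (unit-variance Gaussian), and the rescaling used to keep the process stationary are assumptions made for illustration; the function names are hypothetical and the settings are not those used in the simulations reported here.

```python
import numpy as np

def erdos_renyi_matrix(n_nodes, density, coupling, rng):
    """Random directed structure: each off-diagonal link is present with probability `density`.

    Convention (assumed): matrix[i, j] != 0 means node j drives node i.
    """
    links = rng.random((n_nodes, n_nodes)) < density
    np.fill_diagonal(links, False)
    matrix = coupling * links.astype(float)
    # Rescale if needed so the spectral radius stays below 1 (stationary process).
    radius = np.max(np.abs(np.linalg.eigvals(matrix)))
    if radius >= 0.9:
        matrix *= 0.9 / radius
    return matrix

def simulate_var1(matrix, n_samples, rng):
    """Simulate x[t] = matrix @ x[t-1] + noise[t] with unit-variance Gaussian noise."""
    n_nodes = matrix.shape[0]
    data = np.zeros((n_samples, n_nodes))
    for t in range(1, n_samples):
        data[t] = matrix @ data[t - 1] + rng.standard_normal(n_nodes)
    return data

rng = np.random.default_rng(1)
A = erdos_renyi_matrix(n_nodes=20, density=0.1, coupling=0.3, rng=rng)
X = simulate_var1(A, n_samples=1000, rng=rng)
print(X.shape, int(np.count_nonzero(A)))
```

The nonzero pattern of `A` is the ground-truth causal structure against which an estimated network can be scored.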
Our simulations suggest that the accuracy of the studied algorithms is (under a suitable choice of parameters) very similar: for a fixed value of the false negative ratio, the algorithms have approximately the same value of the false positive ratio. However, the computational demands of the algorithms differ in a manner that depends on the density and size of the estimated network. One of the methods is more efficient in estimating very sparse networks, the other in estimating networks with higher density. Based on an analysis of the current algorithms, we have also proposed a new hybrid Fast Approximate Causal Discovery Algorithm (FACDA), designed for optimized performance and accuracy. Unlike the other studied algorithms, FACDA appears, on simulated data, to be computationally efficient also for estimating causality in complex systems with hundreds of elements.
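The false positive and false negative ratios used above to compare accuracy can be computed from a true and an estimated link matrix as in the sketch below. The definitions are assumptions, since the text does not spell them out: the false positive ratio is taken as the fraction of truly absent off-diagonal links that were reported as present, and the false negative ratio as the fraction of truly present links that were missed. The function name `error_ratios` and the example matrices are hypothetical.

```python
import numpy as np

def error_ratios(true_links, estimated_links):
    """Return (false positive ratio, false negative ratio) over off-diagonal links.

    Assumes the true network has at least one present and one absent link.
    """
    true_links = np.asarray(true_links, dtype=bool)
    estimated_links = np.asarray(estimated_links, dtype=bool)
    off_diagonal = ~np.eye(true_links.shape[0], dtype=bool)
    present = true_links & off_diagonal
    absent = ~true_links & off_diagonal
    false_positive_ratio = np.sum(estimated_links & absent) / np.sum(absent)
    false_negative_ratio = np.sum(~estimated_links & present) / np.sum(present)
    return false_positive_ratio, false_negative_ratio

# Three-node example: two true links, one of them recovered, one spurious link added.
true_links = np.array([[0, 1, 0],
                       [0, 0, 1],
                       [0, 0, 0]])
estimated_links = np.array([[0, 1, 1],
                            [0, 0, 0],
                            [0, 0, 0]])
fpr, fnr = error_ratios(true_links, estimated_links)
print(fpr, fnr)  # 0.25 0.5
```

Sweeping the detection threshold of an algorithm and recording both ratios gives the kind of trade-off curve on which two algorithms can be compared at a fixed false negative ratio.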