Interrogation of genome-wide networks in biology: comparison of knowledge-based and statistical methods

Abstract

Networks are used extensively in the study of biological systems to address a wide range of questions, such as understanding the complex behaviour of a given system or identifying key alterations that lead to a disease phenotype. Numerous network-based methods have been developed for inferring molecular interactions from transcriptomic and proteomic data. Each method comes with its own advantages and limitations, and different methods often give different results for the same data. A systematic study is therefore essential to understand how well the methods predict known biological processes and yield testable biological hypotheses. To this end, we compared four methods for deriving context-specific perturbations in two case studies and evaluated their performance. The methods fall into two broad classes: statistical inference and knowledge-based. Two of the four, WGCNA and ARACNE, are data-driven statistical approaches that do not rely on prior network information. The other two, ResponseNet and jActiveModules, start from knowledge-based protein–protein interaction networks and integrate condition-specific transcriptome or proteome data. We evaluated the interactions inferred by each approach and assessed their biological relevance on three criteria: (1) enrichment of gold-standard gene sets, (2) agreement with gold-standard pathways and (3) recovery, as hubs of the context-specific perturbed network, of genes known to be related to the given condition. Across both case studies, tuberculosis and melanoma, ResponseNet performed best on all three criteria.
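As an illustration of how criteria (1) and (3) might be quantified, the following is a minimal sketch and not the procedure used in this study. It assumes a one-sided hypergeometric test for gene-set enrichment and node degree as the definition of a hub; the abstract specifies neither. All function names and the toy inputs are hypothetical.

```python
from collections import Counter
from math import comb


def hypergeom_enrichment_p(universe_size, gold_size, module_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    Assumption: enrichment is scored by drawing module_size genes from a
    universe of universe_size genes, gold_size of which are gold-standard.
    """
    upper = min(gold_size, module_size)
    total = comb(universe_size, module_size)
    tail = sum(
        comb(gold_size, k) * comb(universe_size - gold_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def top_hubs(edges, n):
    """Return the n highest-degree nodes of an undirected edge list.

    Assumption: a hub is simply a node with high degree.
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    # Sort by descending degree, then by name, so ties are deterministic.
    ranked = sorted(degree, key=lambda g: (-degree[g], g))
    return ranked[:n]


def hub_recovery(edges, known_genes, n):
    """Fraction of the top-n hubs found in a set of known condition-related genes."""
    hubs = top_hubs(edges, n)
    if not hubs:
        return 0.0
    return sum(g in known_genes for g in hubs) / len(hubs)


if __name__ == "__main__":
    # Toy inputs for illustration only; not data from the study.
    toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
    print(hypergeom_enrichment_p(10, 4, 3, 2))
    print(top_hubs(toy_edges, 2))
    print(hub_recovery(toy_edges, {"A", "C"}, 2))
```

In this sketch a smaller p-value indicates stronger enrichment of the gold-standard set in an inferred module, and a hub-recovery fraction closer to 1 indicates that more of the top-ranked hubs are already linked to the condition. Other enrichment statistics or centrality measures could be substituted without changing the overall structure.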