Abstract

The accumulation of electronically accessible data and knowledge is posing theoretical and practical challenges for study design and statistical data analysis. A central challenge is the fusion of data and knowledge, which consists of the use of the results of earlier high-throughput measurements of genetic variations, microRNA, and gene expression levels, and the use of biological knowledge bases. We investigate fusion in the phases of study design, data analysis, and interpretation; specifically, we present methodologies and bioinformatic tools in the Bayesian framework to deepen, lengthen, and broaden this fusion. First, we give an overview of a Bayesian decision support system for the design of partial genetic association studies (GASs) that incorporates domain literature, knowledge bases, and the results of the analysis of earlier studies. Second, we present a Bayesian multilevel analysis (BMLA) for GASs, which performs an integrated analysis at the univariate and multivariate levels, and at the level of interactions. Third, we present a Bayesian logic to support interpretation, which integrates the results of data analysis and factual domain knowledge. Finally, we discuss the advantages of the Bayesian framework in coping with small sample sizes, the fusion of data and knowledge, the challenges of multiple testing, meta-analysis, and positive results bias (i.e., the communication of scientific uncertainty). The genomics of asthma serves as the application domain.

Notes

Acknowledgments

We thank Yves Moreau for his insightful suggestion to apply the SNP study design system for prior generation in our Bayesian data analysis. This work was supported by grants from the OTKA National Scientific Research Fund (PD-76348), NKTH TECH_08-A1/2-2008-0120 (Genagrid), and the János Bolyai Research Scholarship of the Hungarian Academy of Sciences (P. Antal).