Gene-gene interaction is believed to play an important role in understanding complex traits. Multifactor dimensionality reduction (MDR) was proposed by Ritchie et al. [2001. Am J Hum Genet 69:138-147] to identify multiple loci that simultaneously affect disease susceptibility. Although the MDR method has been widely used to detect gene-gene interactions, few studies have been reported on MDR analysis when there are missing data. Currently, there are four approaches available in MDR analysis to handle missing data. The first approach uses only complete observations that have no missing data, which can cause a severe loss of data. The second approach is to treat missing values as an additional genotype category, but interpretation of the results may then be not clear and the conclusions may be misleading. Furthermore, it performs poorly when the missing rates are unbalanced between the case and control groups. The third approach is a simple imputation method that imputes missing genotypes as the most frequent genotype, which may also produce biased results. The fourth approach, Available, uses all data available for the given loci to increase power. In any real data analysis, it is not clear which MDR approach one should use when there are missing data. In this article, we consider a new EM Impute approach to handle missing data more appropriately. Through simulation studies, we compared the performance of the proposed EM Impute approach with the current approaches. Our results showed that Available and EM Impute approaches perform better than the three other current approaches in terms of power and precision.

1 Comments:

Saw this a few weeks ago. If I remember from reading the article, their results show that their fancy EM method doesn't perform any better than simply using the available data, as presented in Bush et al 2006 in Bioinformatics.

About Me

Edward Rose Professor of Informatics,
Director of the Institute for Biomedical Informatics, Director of the Division of Informatics in the Department of Biostatistics and Epidemiology,
Senior Associate Dean for Informatics,
The Perelman School of Medicine,
University of Pennsylvania