In silico discovery and experimental validation of new protein-protein interactions

We introduce a framework for predicting novel protein-protein interactions (PPIs). It uses Fisher's method to combine the probabilities of predictions derived from different data sources, such as the biomedical literature, protein domain information and mRNA expression information. The framework compares favorably with our previous method, which was based on text mining alone, and with other methods such as STRING. We evaluated our algorithms on their ability to predict experimentally found protein interactions underlying muscular dystrophy, Huntington's disease and polycystic kidney disease that had not yet been recorded in protein-protein interaction databases. Compared to STRING, the mean average prediction precision increased 1.74-fold for dysferlin and 3.09-fold for huntingtin. The ten highest-ranked predicted interaction partners of huntingtin were analysed in depth: five had been identified previously, and the other five are new potential interaction partners. The full matrix of human protein pairs and their prediction scores is available for download. Our framework can be extended to predict other types of relationships, such as proteins that share a complex, a pathway or a related disease mechanism.
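
To illustrate the combination step, the following minimal sketch implements Fisher's method in its textbook form: for k independent p-values, the statistic X = -2 Σ ln p_i follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. This is not our implementation. The function name `fisher_combine`, the assumption that each data source yields an independent p-value per protein pair, and the example values are all hypothetical and serve only to show the calculation.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (textbook form).

    Hypothetical helper for illustration; assumes the p-values are
    independent and lie in (0, 1].
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic is X = -2 * sum(ln p_i); we keep X / 2 for convenience.
    half_stat = -sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k, the chi-square survival function has
    # the closed form exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Invented per-source p-values for one protein pair (illustration only):
# literature, protein domain and mRNA expression evidence.
example_sources = [0.04, 0.20, 0.10]
combined = fisher_combine(example_sources)
```

A smaller combined value indicates stronger joint evidence for an interaction. Fisher's method treats the sources as independent, so correlated evidence would make the combined value too optimistic.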