Abstract

Background: The prediction of protein-protein interactions is
an important step toward the elucidation of protein functions and the
understanding of the molecular mechanisms inside the cell. While
experimental methods for identifying these interactions remain costly
and often noisy, the increasing quantity of solved 3D protein
structures suggests that in silico methods to predict interactions
between two protein structures will play an increasingly important
role in screening candidate interacting pairs. Approaches that use
structural knowledge are presumably more accurate than those
based on sequence only. Approaches based on docking protein structures
solve a variant of this problem, but these methods remain very
computationally intensive and will not scale in the near future to the
detection of interactions at the level of an interactome, involving
millions of candidate pairs of proteins.

Results: Here, we describe a computational method to efficiently
predict in silico whether two protein structures interact. This
yes/no question is presumably easier to answer than the standard
protein docking question, "How do these two protein structures
interact?" Our approach is to discriminate between interacting and
non-interacting protein pairs using a statistical pattern recognition
method known as a support vector machine (SVM). We demonstrate that
our structure-based method performs well on this task and scales well
to the size of an interactome.

Conclusions: The use of structure information for the
prediction of protein interactions yields significantly better
performance than sequence-based methods. Among structure-based
classifiers, the SVM algorithm, combined with the metric learning
pairwise kernel and the MAMMOTH kernel, performs best in our
experiments.
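
To illustrate how a pairwise kernel compares two protein pairs, the following is a minimal sketch, not the implementation used in the experiments. It assumes the commonly cited form of the metric learning pairwise kernel, K((a,b),(c,d)) = (K(a,c) - K(a,d) - K(b,c) + K(b,d))^2, where K is a base kernel between single proteins. The base kernel values below are invented placeholders standing in for MAMMOTH-derived similarities, and the names `BASE`, `base_kernel`, `mlpk`, and `gram_matrix` are hypothetical.

```python
# Placeholder base kernel between single proteins, stored as a symmetric
# lookup table. In the abstract this role is played by the MAMMOTH kernel;
# the numbers here are invented for illustration only.
BASE = {
    ("A", "A"): 1.0, ("B", "B"): 1.0, ("C", "C"): 1.0, ("D", "D"): 1.0,
    ("A", "B"): 0.2, ("A", "C"): 0.9, ("A", "D"): 0.1,
    ("B", "C"): 0.3, ("B", "D"): 0.8, ("C", "D"): 0.2,
}


def base_kernel(x, y):
    """Look up the placeholder similarity, regardless of argument order."""
    return BASE[(x, y)] if (x, y) in BASE else BASE[(y, x)]


def mlpk(pair1, pair2):
    """Metric learning pairwise kernel between two protein pairs (assumed form)."""
    a, b = pair1
    c, d = pair2
    diff = (base_kernel(a, c) - base_kernel(a, d)
            - base_kernel(b, c) + base_kernel(b, d))
    # Squaring makes the value independent of the order within each pair.
    return diff ** 2


def gram_matrix(pairs):
    """Kernel matrix over a list of protein pairs."""
    return [[mlpk(p, q) for q in pairs] for p in pairs]
```

A Gram matrix built this way over labeled interacting and non-interacting pairs could then be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`.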