Abstract

Protein-protein docking algorithms are powerful computational tools capable of analyzing protein-protein interactions at the atomic level. In this chapter, we review the theoretical concepts behind different protein-protein docking algorithms, highlighting their strengths and limitations and pointing to important case studies for each method. The methods covered include a range of search strategies and scoring techniques: exhaustive global search, fast Fourier transform search, spherical Fourier transform-based search, direct search in Cartesian space, local shape feature matching, geometric hashing, genetic algorithms, randomized search, and Monte Carlo search. We also discuss the different ways in which protein flexibility has been incorporated into the docking procedure, as well as future directions in this field, suggesting possible ways to improve the different methods.

Introduction

Protein-protein interactions play key roles in many biological processes, from signal transduction and cellular transport to gene expression and immune responses. All these processes are mediated by selective and potent protein-protein interactions (Waksman & Sansom, 2005). Furthermore, many diseases have been associated with either an overactivated or an under-regulated protein-protein interaction, and efforts to treat these diseases have focused on inhibiting or stimulating these interactions, respectively. For example, the p53-MDM2 interaction is associated with a severe downregulation of the p53 pathway. An inhibitor of this interaction (e.g., Nutlin-3) can reactivate the p53 pathway, forcing cancer cells to undergo apoptosis (Barakat, Gajewski, & Tuszynski, 2012; Barakat, Issack, Stepanova, & Tuszynski, 2011; Barakat, Mane, Friesen, & Tuszynski, 2010; Chène, 2003; Kojima et al., 2006). The more we know about such crucial interactions, the better we can map vital protein networks and apply this knowledge to identify treatments for many diseases. Moreover, characterizing these interactions at the atomic level can help in rationally designing new therapeutic agents that either enhance or inhibit them. Constructing a three-dimensional structure of such protein complexes is an essential step toward identifying their binding interface and recognizing any hot spots that can be targeted for their regulation (Elcock et al., 2001; Kann, 2007; Kortemme & Baker, 2004).

For the last few decades, X-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy, and electron microscopy have been the main sources of such structures. Despite their accuracy, efficiency, and the large amount of detail they can provide, these techniques are expensive and demand considerable labour and skill. A simple comparison between protein structure and gene sequence databases reveals the great discrepancy between the two: although hundreds of thousands of gene sequences have been characterized, fewer than thirty thousand protein structures have been determined so far, and most of these structures are either redundant or describe only apo (unbound) proteins (Villoutreix et al., 2014). Moreover, protein complexes are more difficult to crystallize than individual proteins. Consequently, they are underrepresented in the Protein Data Bank (PDB) (Berman et al., 2000) and constitute only a small fraction of the experimentally determined structures. This discrepancy and the resulting lack of structural detail have motivated many computational groups to fill the gap by developing rapid and inexpensive ways to predict these interactions (Barakat & Tuszynski, 2011; Barakat, Houghton, Tyrrell, & Tuszynski, 2014; Barakat, Mane, & Tuszynski, 2011; Gógl et al., 2015; Nillegoda et al., 2015; Pedotti, Simonelli, Livoti, & Varani, 2011; Taylor et al., 2015). One such solution, which is also the focus of this chapter, is protein-protein docking.