Amy E. Keating

We are studying the specificity of protein-protein interactions in a research program that combines bioinformatic analysis, structural modeling, computational design and experimental characterization. Our aim is to understand, at a high level of detail, how the interaction properties of proteins are encoded in their sequences and structures. Most of our work is focused on two protein families that are important for human health: the α-helical coiled coil and the Bcl-2 family of apoptosis-regulating proteins.

Protein-protein interactions establish the architecture of the cell, regulate biological signaling, underlie the assembly of macromolecular machines and mediate chemical transformations. Although we now have fairly complete lists of the proteins found in various organisms, our knowledge of which proteins interact with one another, as well as how and why they interact as they do, is limited. In the Keating lab, we are particularly interested in the question of interaction specificity, i.e. how a protein selects a particular interaction partner out of a large number of closely related alternatives. Both computational and experimental methods are needed to accelerate discovery and understanding in this area. Our lab integrates both approaches, tackling the complex problems of characterizing, analyzing, designing and predicting protein-protein interaction specificity by studying domains with relatively simple structures. Three goals are: (1) to develop and apply techniques to assess the interaction specificity of biologically interesting protein families in vitro, (2) to achieve an understanding of how specificity is encoded biophysically, through the analysis of protein sequence and structure, and (3) to develop and test computational methods for predicting and designing specific protein-protein interactions.

Coiled coils

The α-helical coiled coil is the simplest of all protein-protein interaction motifs. Coiled coils consist of two or more α-helices that wrap around each other with a superhelical twist. They are characterized by a repeating sequence of seven amino acids, (abcdefg)n, in which the a- and d-position residues are predominantly hydrophobic and the e- and g-position residues are usually polar or charged. The regular sequence makes it possible to predict the occurrence of coiled coils in genomic sequence data. We estimate that >5% of all proteins in S. cerevisiae, C. elegans, A. thaliana and D. melanogaster contain a coiled-coil region. It is likely that many of these coiled coils mediate protein-protein interactions or oligomerization. An important, unanswered question about coiled coils is how their interaction specificity is encoded in their sequences. We call this the “partnering problem” for coiled coils and are studying it using both computational and experimental approaches.
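The heptad pattern described above can be illustrated with a minimal Python sketch. It is not a coiled-coil prediction method used by the lab; it only labels residues with heptad positions and asks what fraction of the a and d positions are hydrophobic. The function names, the hydrophobic residue set and the toy sequence are all assumptions made for illustration.

```python
HEPTAD = "abcdefg"
# Assumption: a deliberately simple hydrophobic alphabet, for illustration only.
HYDROPHOBIC = set("AVILMFWC")


def assign_heptad(sequence, start="a"):
    """Pair each residue with a heptad position, given the position of residue 0."""
    offset = HEPTAD.index(start)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(sequence)]


def core_hydrophobic_fraction(sequence, start="a"):
    """Fraction of a- and d-position residues that are hydrophobic."""
    core = [res for res, pos in assign_heptad(sequence, start) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(sequence):
    """Starting position that places the most hydrophobic residues at a and d."""
    return max(HEPTAD, key=lambda s: core_hydrophobic_fraction(sequence, s))


# Hypothetical idealized sequence: four repeats with I at a and L at d.
toy = "IEKLKSE" * 4
fraction = core_hydrophobic_fraction(toy, "a")
register = best_register(toy)
```

Real coiled-coil detection relies on statistical scoring of known coiled-coil sequences rather than a single hydrophobicity count, so this sketch only conveys why a regular repeat is detectable from sequence alone.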

Our experimental approach to the partnering problem started with an analysis of human bZIP transcription factor interactions. In these proteins, the coiled-coil region determines the homo- or heterodimerization specificity of the transcription factor, which in turn influences its DNA-binding properties and biological function. To determine how sequence encodes interaction preferences in the bZIPs, we used protein microarray technology to measure all of the pair-wise interactions between 48 human and 10 yeast bZIP peptides. We found that the interactions are very specific, and that interaction profiles are largely, but not universally, conserved within bZIP subfamilies. This work established the protein microarray as a powerful method for generating large amounts of high quality interaction specificity data. We continue to develop techniques that can increase the throughput and improve the reliability of protein-protein interaction measurements.

In addition to providing a wealth of data about important transcription factor interactions, the bZIP microarray data provided an opportunity to test and improve computational models. We have used this information to develop and/or test several different methods for predicting coiled-coil interactions. A machine-learning algorithm trained on data from the literature shows excellent performance in detecting correct bZIP pairings. We have also used structure-based methods for prediction. Because the coiled coil has a very simple structure, it is particularly amenable to molecular modeling. We have shown that structural modeling can be used in conjunction with machine-learning models to provide good predictions of bZIP coiled-coil interactions. We are now applying structure-based methods more broadly to the problem of predicting interaction specificity, with recent work focused on predicting parallel vs. antiparallel helix orientation preferences.

Another way to understand factors that mediate protein association is through the process of design. The field of protein design has seen exciting advances in the past ten years with the application of fast search algorithms to the problem of side-chain selection and positioning. This has allowed the design of proteins with new folds and functions. We are applying methods developed for the computational design of stable protein folds to the study of protein interaction specificity. In one study we designed and characterized a mini-protein heterotetramer in collaboration with Barbara Imperiali’s group at MIT. More recently, we have designed coiled-coil peptides that bind specifically to native bZIP transcription factor targets, and validated these using the protein microarray assay. Designed coiled coils could not only serve as research tools for probing the cell and disrupting native interactions, but also hold significant promise for applications in the emerging area of synthetic biology.

Bcl-2 family proteins

The Bcl-2 family comprises ~25 proteins important for controlling apoptosis. Critical junctures that govern cellular life-vs-death decisions are regulated by specific interactions among pro- and anti-apoptotic members of this family. The delicate balance between these two groups is often disrupted in cancers. Five mammalian anti-apoptotic family members have a conserved structure with a surface binding cleft, and all known family members share a weakly conserved short BH3 (Bcl-2 homology 3) sequence. Peptides corresponding to the BH3 region have been shown in several instances to bind as α-helices in the hydrophobic groove on the surface of the anti-apoptotic proteins (see figure). We are interested in how the interaction specificity of Bcl-2 family proteins is determined by sequence and structure and are exploring this using X-ray crystallography, mutational analysis, selection experiments and computational protein design. Using new computational methods for varying the backbone structure of α-helices, we designed several novel ligands to bind the anti-apoptotic protein Bcl-xL. Solution binding studies confirm that many of these designed peptides bind with low- to mid-nanomolar affinity and have specificity profiles that differ from those of known native BH3s. More recently we have solved crystal structures that provide additional insights into the structural plasticity of Bcl-2 complexes, and we have selected BH3 peptides with novel sequences and specific binding behaviors from libraries.

Computational methodology

We apply a wide range of computational tools to the analysis of protein interactions, including structure-based modeling, sequence analysis and machine learning. Some projects in the lab are focused on developing or improving computational methods, and a recent exciting advance was our collaboration with the Ceder group (MIT Materials Science) to adapt the technique of cluster expansion for use in protein modeling. Cluster expansion allows the energy of a protein folded into a certain structure to be expressed directly as a function of sequence. The approach brings dramatic speed-ups to modeling calculations while retaining many of the benefits of physical, structure-based approaches. We have also explored many methodological aspects of computational protein design (including the use of cluster expansion in design).
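To make the idea of expressing energy directly as a function of sequence concrete, here is a minimal Python sketch of a cluster expansion truncated at pair terms: a constant, plus one contribution per residue at each position, plus one contribution per pair of residues at each pair of positions. The function name, the dictionary layout and all coefficient values are hypothetical; in practice the coefficients would be fit to energies computed with a structure-based model, and higher-order terms may be included.

```python
from itertools import combinations


def cluster_expansion_energy(sequence, constant, point_terms, pair_terms):
    """Sum constant, single-site and pair contributions for one sequence.

    point_terms maps (position, residue) to a contribution.
    pair_terms maps (pos_i, pos_j, res_i, res_j) with pos_i < pos_j to a contribution.
    Missing entries are treated as zero.
    """
    energy = constant
    for i, res in enumerate(sequence):
        energy += point_terms.get((i, res), 0.0)
    for i, j in combinations(range(len(sequence)), 2):
        energy += pair_terms.get((i, j, sequence[i], sequence[j]), 0.0)
    return energy


# Hypothetical coefficients for a three-position toy system (arbitrary units).
constant = -1.0
point_terms = {(0, "L"): -0.5, (1, "E"): -0.2, (2, "K"): -0.3}
pair_terms = {(1, 2, "E", "K"): -0.4}

energy_lek = cluster_expansion_energy("LEK", constant, point_terms, pair_terms)
energy_lee = cluster_expansion_energy("LEE", constant, point_terms, pair_terms)
```

Once the coefficients are known, evaluating a sequence requires only table lookups and additions, with no explicit structural modeling, which is the source of the speed-up.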