The natural symmetry of protein folds provides a framework for the computational design of new molecules.

Rapid recent progress in the study of protein structure has led investigators to pursue the ambitious goal of protein design. Modifying known proteins, or constructing proteins de novo, could produce molecules with useful functions for biomedical, industrial, or environmental applications.

One of the greatest challenges in the rational approach to protein design is to understand and predict the structural determinants of protein folding. A protein’s correct three-dimensional shape can be difficult to infer from its sequence, as illustrated by the growing number of proteins that have similar folds but divergent sequences. However, the daunting task of protein structure prediction is simplified to some degree by the observation that the tertiary structures of many proteins can be classified into one of ten fundamental protein folds. Furthermore, six of these basic folds are symmetric at the fold level: a set unit of secondary structure appears repeatedly in a defined order and in a symmetric spatial arrangement (Figure 1). This high frequency of symmetrical folds can be explained, at least in part, by the enhanced stability of symmetric homodimeric proteins in nature. Duplication and fusion of the gene encoding a homodimer would yield a single gene encoding a monomeric protein that contains a symmetric repeat structure. Repeating this process several times would produce multiple repeat units within a single protein fold.

Figure 1. Six of ten fundamental protein superfolds have a symmetric topology. These are the αβ-plait, TIM-barrel, β-trefoil, “jelly roll”, IG-like, and “up-down”, each shown above. Figure provided by Jens Meiler. Images taken from CATH (www.cathdb.info).

The frequent occurrence of symmetrical structures in natural proteins inspired Vanderbilt Institute of Chemical Biology (VICB) investigator Jens Meiler to exploit symmetry in his approach to computational protein design. Requiring that symmetry be maintained limits the number of possible structures and reduces the complexity (and thereby the resources) of protein modeling calculations. The Meiler lab has now tested this approach using the enzyme imidazole glycerol phosphate synthase (HisF) as a model. The tertiary structure of HisF is an eight-strand β-barrel [(βα)8-barrel] composed of two highly symmetrical (βα)4-half-barrels. The two half-barrels share only 16% sequence identity, but are highly superimposable in three-dimensional space. The Meiler group used HisF as a starting point to computationally engineer a (βα)8-barrel protein comprising two identical half-barrels [C. Fortenberry et al. (2011) J. Amer. Chem. Soc., published online October 6, DOI: 10.1021/ja2051217].
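For readers unfamiliar with the term, a sequence identity figure such as the 16% quoted above is typically computed from a pairwise alignment. The following is a minimal sketch only: the function name and the toy sequences are hypothetical, and it assumes one common convention (identical residues divided by the number of aligned, non-gap positions). Other conventions use a different denominator, and the source does not say which one was used for HisF.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) positions at which two sequences match.

    Both inputs must already be aligned, i.e. the same length with gap
    characters inserted where needed. This is an illustrative convention,
    not necessarily the one used in the study described above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy alignment (made-up residues): 3 matches over 5 gap-free columns.
print(percent_identity("MKV-LA", "MRVQLS"))  # 60.0
```

Two half-barrels can score low on this measure and still superimpose closely, which is the point the HisF example makes.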

The Meiler group started by aligning two copies of the HisF structure, one rotated 180° relative to the other, and identifying 62 sites at which the protein backbones superimposed to within 2.1 Å. They used these sites as crossover points between the two structures to create a series of new proteins, each comprising two identical half-barrels. For example, as illustrated in Figure 2, one protein construct consisted of amino acids 94 through 215 from one copy of the protein connected to amino acids 94′ to 215′ from the second copy. Distinct combinations of the 62 crossover sites yielded 62 protein constructs. Stringent energy minimization using the publicly available ROSETTA protein design software showed that all of the most stable constructs were derived from duplicating β-strands 4 through 7. The single most stable structure was formed by crossover at amino acids 94 and 215. The resulting construct was named FLR for the amino acids present at the crossover site.
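The two geometric steps described above, finding closely superimposed sites and joining a segment to its own copy, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the group's actual pipeline: the function names and toy coordinates are hypothetical, the two coordinate lists are assumed to be already superimposed and paired residue by residue, and the ROSETTA energy minimization step is not represented at all.

```python
import math


def close_sites(coords_a, coords_b, cutoff=2.1):
    """Return 1-based positions whose paired atoms lie closer than `cutoff` (in Å).

    `coords_a` and `coords_b` are equal-length lists of (x, y, z) tuples,
    assumed to be already superimposed (for example, one copy rotated 180°).
    """
    return [
        position
        for position, (a, b) in enumerate(zip(coords_a, coords_b), start=1)
        if math.dist(a, b) < cutoff
    ]


def duplicate_segment(sequence, first, last):
    """Join residues first..last (1-based, inclusive) to an identical second copy."""
    segment = sequence[first - 1:last]
    return segment + segment


# Toy backbone coordinates (made up): the paired distances are 0.5, 3.0 and 2.0 Å.
copy_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
copy_2 = [(0.5, 0.0, 0.0), (3.8, 3.0, 0.0), (7.6, 0.0, 2.0)]
print(close_sites(copy_1, copy_2))  # [1, 3]

# Toy sequence: duplicating residues 3 through 5 of an 8-residue string.
print(duplicate_segment("ABCDEFGH", 3, 5))  # CDECDE
```

Applied to a crossover at residues 94 and 215, `duplicate_segment` would return 244 residues, twice the 122-residue segment. In the work described here, each such candidate was then ranked by energy minimization.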

Figure 2. Approach for designing symmetric HisF constructs. (Top left) Two HisF structures (shown in blue and orange) were superimposed at 180° to each other to identify 62 sites that superimpose closely (shown in dark blue and red at the top right). (Bottom, left and right) Example of a construct made by linking amino acid 94 of protein copy 1 to amino acid 215′ of copy 2 and amino acid 94′ of copy 2 to amino acid 215 of copy 1. Reproduced with permission from Fortenberry et al. [(2011) J. Amer. Chem. Soc., published online Oct. 6, DOI: 10.1021/ja2051217]. Copyright 2011, American Chemical Society.

Having identified the most energetically stable structure computationally, the Meiler lab expressed the FLR protein as well as halfFLR, which included only a single half-barrel structure. A combination of size exclusion chromatography, circular dichroism spectroscopy, guanidine-induced denaturation, differential scanning calorimetry, and two-dimensional nuclear magnetic resonance spectroscopy indicated that both proteins consisted of a well-folded and stable (βα)8-barrel structure. All of the halfFLR was detectable as a dimer, supporting the hypothesis that the stability of the symmetrical homodimer contributed to the evolution of more complex monomeric proteins.

X-ray crystallography of both FLR and halfFLR revealed structures that agreed well with those predicted from ROSETTA calculations (Figure 3). Since the active site residues of HisF are all in the C-terminus of the protein, the FLR constructs recapitulated all of these critical amino acids. However, neither FLR nor the halfFLR dimer was catalytically active, possibly due to greater flexibility in the region of the active site than is seen in HisF. Ongoing research in the Meiler lab is targeted at restoring catalytic activity to the FLR structure.

Despite the lack of catalytic activity, FLR represents the first example of an engineered (βα)8-barrel protein composed of two identical halves. The results confirm that the Meiler group’s approach can be used to construct symmetrical, stably folded proteins, providing a scaffold for further protein refinement. In addition, the results support the hypothesis that gene duplication and fusion formed the foundation for the evolution of complex symmetric protein folds.