Protein molecules can be classified into ~1000 distinct structural classes called ''folds''. A ''fold'' consists of secondary structural elements arranged and connected in a specific way. Although we know a great deal about protein evolution at the sequence level, how new folds arise is still a mystery.

We plan to use a combination of bioinformatics approaches, protein design, molecular simulations and experiments to gain insight into the ancient evolutionary events that gave rise to the current diversity of protein structures. A viable hypothesis is that longer chains are the result of fusion events that merged shorter peptide segments. However, a number of questions remain open. In the ancient peptide world, were these short peptides able to form stable structures by themselves? Were they able to associate to form oligomers similar to present-day domains? What was their function as monomers?

Our research plan involves designing and finding short peptides that can associate to form a more complex fold. We will computationally redesign the sequences of known folds and employ Monte Carlo and discrete molecular dynamics simulations to predict the folding and association behavior of the designed peptides. Finally, the most promising sequences will be tested experimentally.

We expect that the results will lead to a plausible evolutionary scenario, supported by simulations and experiments, for the origin of protein folds. They will also provide new strategies for efficient protein design.

The main result of the project is the identification of a new, previously undescribed group of proteins with important evolutionary and functional properties. We call them segment-swapped proteins. They are multidomain proteins with structurally similar domains, in which equivalent portions are swapped between the domains. This is similar to the well-known "3D domain swapping" phenomenon, but here the swap occurs within a chain rather than between chains. We identified 18 segment-swapped protein families among proteins with known structures. We posited two evolutionary models to explain the formation of segment-swapped proteins: "domain swapping and fusion" and "circular permutation". Using a variety of tests, we showed that the majority of segment-swapped proteins likely arose by the "domain swapping and fusion" mechanism. This is also an important new mechanism for generating new multidomain architectures. Studying the functional properties of segment-swapped proteins, we found that many of them exhibit hinge-bending domain motions, and we propose that these motions are facilitated by the two linker regions that result from the segment swapping. Thus, segment swapping has functional implications. Using simplified protein models, we also found that increasing protein size promotes structural ordering, which in part explains why protein size tends to increase during evolution.