Abstract

Analysis of protein structures reveals that they are made up of independent globular substructures known as protein domains.
The fold that these domains assume is typically the same in evolutionarily related proteins. However, exceptions to this rule
allow us to begin to determine the process by which novel folds can develop from ancestral folds and possibly even how the
first folds came into existence. Various lines of research have shown that thermodynamic stability, designability, functional
flexibility and structural drift all play important roles in shaping the distribution and variation of structural families
in nature.

Figure 1.

The CATH hierarchy. The four major hierarchical levels in the CATH structural classification – (C)lass, (A)rchitecture, (T)opology or fold level and (H)omologous superfamily. Three of the
most highly populated architectures in the classification are illustrated.
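As an illustration of the four levels, a CATH code can be read as four dot-separated numbers, one per level. The sketch below is not part of the CATH software; the names `CathCode`, `parse_cath_code` and `deepest_shared_level` are hypothetical helpers introduced here. It parses the two codes quoted in the Figure 2 caption and reports the deepest level at which they still agree.

```python
from typing import NamedTuple, Optional


class CathCode(NamedTuple):
    """One number per CATH level (hypothetical helper type)."""
    cath_class: int
    architecture: int
    topology: int
    superfamily: int


# Level names in hierarchical order, from broadest to most specific.
LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


def parse_cath_code(code: str) -> CathCode:
    """Split a code such as '1.10.110.10' into its four levels."""
    parts = code.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def deepest_shared_level(a: CathCode, b: CathCode) -> Optional[str]:
    """Return the most specific level shared by both codes, or None."""
    shared = None
    for level, x, y in zip(LEVELS, a, b):
        if x != y:
            break  # the hierarchy diverges here; deeper levels cannot match
        shared = level
    return shared


if __name__ == "__main__":
    first = parse_cath_code("1.10.110.10")
    second = parse_cath_code("1.20.58.250")
    print(deepest_shared_level(first, second))
```

For these two codes the comparison stops at the second number (10 versus 20), so the domains share a class but are assigned to different architectures.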

Figure 2.

Structure is not always more conserved than sequence. In this case, domains 1du2a00 (CATH code: 1.10.110.10; blue) and 1se7a00 (CATH code: 1.20.58.250; blue) superpose badly and have a low structural similarity (SSAP score = 55.48). However, the
alignment produced by the sequence alignment software MUSCLE shows clear sequence similarity between the domains (sequence
identity = 60%).
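The sequence identity quoted above is a percentage of matching positions in a pairwise alignment. The sketch below shows one common way to compute it, assuming that columns containing a gap in either sequence are excluded from the denominator; other conventions exist, and the caption does not say which one was used. The function name `percent_identity` and the toy sequences are hypothetical and are not the 1du2a00 or 1se7a00 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns with identical residues
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x.upper() == y.upper():
            matches += 1

    if compared == 0:
        return 0.0  # nothing to compare
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned pair: six gap-free columns, five of them identical.
    print(round(percent_identity("ACDE-FG", "ACDQKFG"), 1))
```

Choosing a different denominator, such as the full alignment length or the length of the shorter sequence, gives a different percentage for the same alignment, so reported identities are only comparable when the convention is stated.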

Figure 3.

Large insertions and conserved cores. Highlighted in red are two examples of domains from the adenosine triphosphate (ATP)‐Grasp family. These examples vary significantly in size, with the largest example (left) containing many inserts and embellishments.
Despite this variation, the cores are recognizably similar and the location of the active site (the yellow and green residues)
appears to be conserved.

Figure 4.

Identifiable metabolic paths of the last universal common ancestor (LUCA). Using homology data derived from structure rather than sequence, it has been possible to derive a more complex view of the
abilities of the LUCA. Sequence‐based approaches had only identified systems involved in information transfer (i.e. translation and transcription).
The figure was adapted by Stathis Sedaris from Ranea J, Sillero A, Thornton J and Orengo C () Protein superfamily evolution and the Last Common Universal Ancestor. Journal of Molecular Biology 63(4): 513–525.