Protein Superfamily Evolution and the Last Universal Common Ancestor (LUCA)

Abstract

By exploiting three-dimensional structure comparison, which is more sensitive than conventional sequence-based methods for detecting remote homology, we have identified a set of 140 ancestral protein domains using very restrictive criteria to minimize the potential error introduced by horizontal gene transfer. These domains are highly likely to have been present in the Last Universal Common Ancestor (LUCA), based on their universality in almost all of the 114 completed prokaryotic (Bacteria and Archaea) and eukaryotic genomes. Functional analysis of these ancestral domains reveals a genetically complex LUCA with practically all the essential functional systems present in extant organisms, supporting the theory that life achieved its modern cellular status well before the separation of the main kingdoms (Doolittle 2000). In addition, we have calculated several estimates of the genetic and functional versatility of all the superfamilies and functional groups in the prokaryote subsample. These estimates reveal that some ancestral superfamilies have been more versatile than others during evolution, allowing more genetic and functional variation. Furthermore, the differences in genetic versatility between protein families are more attributable to their functional nature than to the length of time they have been evolving. These differences in tolerance to mutation suggest that some protein families have eroded their phylogenetic signal faster than others, in many cases hiding their ancestral origin, and that the count of 140 ancestral domains is probably an underestimate.

Notes

Acknowledgments

We would like to thank Beatriz Simas Magalhaes for her useful advice and comments, Stathis Sideris for help with the figures, and Corin Yeats for text review. This work was supported by grants from the MRC (Christine A. Orengo) and European Union (Juan A. G. Ranea). A.S. was a visiting professor at UCL (from UAM) aided by the Spanish Ministry of Education and Science and supported by grants from Direccion General de Investigacion Cientifica y Tecnica (08/0021.1/2001) and Instituto de Salud Carlos III, RMN (C03/08) Madrid, Spain.