Summary

The classical textbook assumption about the structure of a eukaryotic gene is that it codes for a single polypeptide more than 100 amino acids long. This is also the implicit assumption in most gene annotation pipelines. A gene family has now been discovered in insects showing that a eukaryotic mRNA can code for peptides as short as eleven amino acids, and that a single mRNA can code for several such peptides. This raises the question of whether short open reading frames might also be functional in other mRNAs, in particular those that occur in the 5' UTR of many mRNAs. A number of these have been shown to act in cis to regulate translation of the main open reading frame of the mRNA, but there may be others that act in trans on other biological processes. The question of how many peptide-coding genes exist is therefore worth revisiting. It poses new bioinformatic challenges that can only be resolved by comparing multiple genomes across a range of evolutionary distances.