Reactome introduces a controlled vocabulary for peptides

Reactome often needs to represent a protein in several different forms: the initial translated form, fragments produced by subsequent processing, or forms carrying any of many kinds of post-translational modification. In response, we have developed a systematic nomenclature for peptide names.

As part of the systematic renaming of Reactome pathway components, we have used HGNC gene symbols as the label for gene products. These are identified from UniProt via the Reactome reference molecule.

Reactome often represents several peptides that are derived from the same translated protein, all sharing a common UniProt external reference. To generate unique names for these, we have added the start and end coordinates of the peptide as a suffix to the gene symbol. The coordinates of the Reactome peptide are compared with UniProt’s ‘Chain’ feature; in UniProt this feature is part of an annotation group called Molecular Features. This feature is used because it represents the ‘default’ peptide; this usage is consistent with our use of UniProt IDs as our primary external peptide reference. If the start and end coordinates of the Reactome peptide agree with the UniProt Chain feature, the coordinates are not added to the gene symbol. If either coordinate differs from the Chain feature, both Reactome coordinates are added as a gene symbol suffix. When the true peptide start or end coordinate is unknown, the ‘?’ symbol is used in its place. This combination of gene symbol plus coordinates is usually sufficient to generate a unique name.
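
The coordinate rule described above can be sketched in a few lines of Python. This is only an illustration, not Reactome's actual renaming code: the function name `peptide_name`, the `symbol(start-end)` suffix layout, the use of `None` for an unknown coordinate, and the example symbol and coordinates are all assumptions made for this sketch.

```python
from typing import Optional


def peptide_name(
    symbol: str,
    start: Optional[int],
    end: Optional[int],
    chain_start: int,
    chain_end: int,
) -> str:
    """Build a peptide label from a gene symbol and peptide coordinates.

    `start` / `end` are the Reactome peptide coordinates (None = unknown).
    `chain_start` / `chain_end` are the UniProt Chain feature coordinates.
    The "SYMBOL(start-end)" layout is an assumed format for illustration.
    """
    # Coordinates that match the Chain feature are omitted from the name.
    if start == chain_start and end == chain_end:
        return symbol

    # Otherwise both coordinates are added, with '?' for unknown values.
    def fmt(coord: Optional[int]) -> str:
        return "?" if coord is None else str(coord)

    return f"{symbol}({fmt(start)}-{fmt(end)})"


# Hypothetical gene symbol and made-up coordinates, for illustration only.
print(peptide_name("ABC1", 1, 100, 1, 100))     # matches the Chain feature
print(peptide_name("ABC1", 25, 100, 1, 100))    # start differs, both shown
print(peptide_name("ABC1", None, 100, 1, 100))  # unknown start
```

Note that a single differing coordinate causes both coordinates to appear in the suffix, and an unknown coordinate never counts as matching the Chain feature.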

Post-translational modifications (PTMs) are shown as a prefix to the gene symbol. Some Reactome peptides are exempt from systematic renaming; these are named in a style that is as close to the systematic style as possible.

A full explanation of the renaming process is available on the Reactome Wiki.