Abstract

Multicellular organisms have evolved specialized cells that perform specific functions. Neurons and fibroblasts are just two examples out of more than 200 cell types. These cellular differences are established on the basis of the same sequence information, which is stored in the nucleus of all cells of an organism. Consequently, the different cell types must interpret this sequence information differently. During cellular development, this interpretation of the genetic code has to be tightly regulated in space and time. It involves the controlled activation and silencing of specific genes, so that certain proteins are made in one cell type but not in others. This requires an additional layer of regulatory information beyond the pure base sequence. One aspect of this regulatory layer relies on functional groups that are attached to the C(5) position of the canonical base dC. Currently, four regulatory, non-canonical bases are known, carrying a methyl (CH3), a hydroxymethyl (CH2OH), a formyl (CHO) or a carboxyl (COOH) group. While 5-methyl-cytidine has long been recognised as a regulatory base in the genome, the other three bases and the enzymes responsible for generating them were discovered only recently.