Simon says, “turn this gene off.” Now Simon says, “turn this gene on!”

One of the most fundamental processes in cell biology is turning genes on and off. Regulation of gene expression is central to all cellular processes; to perform their “jobs” in the body, cells must selectively turn on genes. For example, to support brain function, neurons must express neuronal genes and must concomitantly turn off (silence) non-neuronal genes. Interfering with gene expression regulation can have extremely deleterious effects, and dysregulation of gene expression is known to be one of the principal causes of cancer.

Thus, in order to facilitate proper function, cells must specifically express certain genes and must silence other genes. This combination of upregulated and downregulated genes is necessary to maintain cellular function and phenotype.

While turning genes on and off sounds a lot like flipping a light switch, as we can all anticipate, it definitely isn’t so easy. There are many levels of control over gene expression: at the chromatin level, at the transcriptional level, and at the post-transcriptional level.

At the level of genomic DNA, there are two broad ways to regulate gene expression: histone modification and DNA modification. Modifications to histones (the proteins that DNA wraps around to form chromatin), such as methylation and acetylation, are one way to control gene expression. For example, histone acetylation at a gene almost always leads to the activation of that gene (with only one exception). Acetylating histones is believed to loosen the wrapping of DNA around them, thereby making the DNA more accessible to the RNA polymerase machinery, the complex of proteins that transcribes the gene into mRNA.

Another way to regulate gene expression is through DNA modification. While histones can be methylated at specific amino acid residues, DNA itself can also be methylated, at critical “CpG islands”. When a cytosine is immediately followed by a guanine (cytosine, phosphodiester bond, guanine, hence “CpG”; regions dense in such sites are the “CpG islands”), the cytosine can be methylated to produce 5-methylcytosine. When the cytosines in a CpG island are converted into 5-methylcytosine, the CpG island is said to be methylated.
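To make the idea of a CpG site concrete, here is a minimal Python sketch that scans one DNA strand for cytosines immediately followed by a guanine. The function name and the example sequence are made up for illustration; real CpG-island calling also considers GC content and region length, which this sketch ignores.

```python
def find_cpg_sites(sequence):
    """Return the 0-based positions of every cytosine that is
    immediately followed by a guanine on the same strand."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i] == "C" and seq[i + 1] == "G"]


# Hypothetical example sequence (not taken from any real gene).
example = "ATCGGCGTACG"
print(find_cpg_sites(example))  # [2, 5, 9]
```

Each reported position is a cytosine that could, in principle, be methylated to 5-methylcytosine.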

Methylation of CpG islands in a gene leads to the silencing of that gene: it is no longer transcribed into mRNA. The methylation of CpG islands is therefore a cellular mechanism to silence genes. The cell methylates CpG islands on viral genes and transposons, thereby silencing them, a very clever form of intracellular immunity against viral expression. Disruption of proper CpG methylation patterns can lead to cancer: when proliferation genes are unsilenced and turned on in normal cells, the cells proliferate and undergo oncogenic transformation, turning into cancer cells. CpG methylation is also critical during development, when enzymes known as methyltransferases (DNMT3A and DNMT3B) establish a characteristic methylation pattern in the genomes of newly made cells. The pattern of genes that are turned on and genes that are turned off establishes the cell’s identity, telling it what bodily “job” it is supposed to perform. When these cells divide to form daughter cells, another methyltransferase (DNMT1) copies the mother cell’s methylation pattern into the daughter cells, so the daughter cells are just like the mother cell.
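The copying step performed by DNMT1 can be sketched as a toy model. It relies on the fact that a CpG site reads “CG” on both strands, so a methylated cytosine at position i on the template strand sits next to the cytosine that pairs with the guanine at position i + 1 on the newly made strand. The function name and data layout below are assumptions made for illustration, not a description of the enzyme’s actual kinetics.

```python
def maintenance_methylation(sequence, template_methylated):
    """Toy model of maintenance methylation.

    sequence            -- template (parent) strand, read 5' to 3'
    template_methylated -- positions of methylated cytosines on that strand

    Returns the positions (in template-strand coordinates) whose partner
    cytosine on the newly made strand should be methylated.
    """
    seq = sequence.upper()
    new_strand_marks = set()
    for i in template_methylated:
        # Only a methylated C that is part of a CpG site is copied.
        if seq[i:i + 2] == "CG":
            # The new strand's cytosine pairs with the G at position i + 1.
            new_strand_marks.add(i + 1)
    return new_strand_marks


# Hypothetical sequence with CpG sites at positions 2, 5 and 9;
# only the sites at 2 and 9 are methylated in the mother cell.
marks = maintenance_methylation("ATCGGCGTACG", {2, 9})
print(sorted(marks))  # [3, 10]
```

The unmethylated CpG site at position 5 stays unmethylated on the new strand, which is how the mother cell’s pattern is preserved in this simplified picture.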

This poses a very interesting situation. Focusing just on CpG islands, one can envisage that a gene can be activated if it is demethylated, and that a gene is silenced if it is methylated; this silencing process is mediated by methyltransferases. But then, are silenced genes ever turned back on? Are methylated genes ever demethylated and expressed? Certain evidence shows that in order for us to store new memories, we must activate a certain class of genes to make new brain cells (neurons; the process is known as neurogenesis, making more neurons), and these newly-born neurons may play a role in increasing our memory storage capacity. But obviously, these neurogenesis genes aren’t turned on all the time when we don’t need them (when our memories aren’t being stimulated). So is there a magic switch that can turn on silenced neurogenesis genes and express them when we need to capture new memories?

It has been a long scientific campaign to find “demethylases” that will demethylate and re-express silenced genes. The scientific community’s efforts to find an enzyme that can clip the methyl group off a 5-methylcytosine and convert it directly into cytosine haven’t borne fruit yet. However, we might have found the elusive genes that may mediate a multistep pathway to convert 5-methylcytosine into cytosine.

A paper in the Journal of Biological Chemistry in 2004 by a British group showed that a DNA deaminase, AID, could deaminate 5-methylcytosine (the methylated component of methyl-CpG islands) into thymine. While this initially seems unrelated to demethylating 5-methylcytosine, consider this. An unmethylated CpG island has the cytosine base-paired to a complementary guanine (C:G). Methylating this CpG island makes it look like 5-methyl-C:G. Now, if you add AID, the 5-methylcytosine becomes deaminated and the pair looks like T:G. This can be recognized as a base-pair mismatch and a mutation, as thymine doesn’t form a Watson-Crick base pair with guanine. Therefore, if you add an enzyme that preferentially removes the thymine, and then use a polymerase to add back a cytosine, you convert the T:G into C:G … without the methyl group on cytosine! This process would therefore convert 5-methylcytosine into cytosine through a thymine intermediate, using DNA repair processes to “repair” a T:G mismatch.

An excellent Cell paper in 2008 by a group from the University of Utah showed that the overexpression of both AID and another gene, MBD4, is sufficient to initiate genome-wide demethylation in zebrafish embryos: expressing AID and MBD4 leads to the demethylation of CpG islands throughout the entire genome! Furthermore, the authors found that this demethylation process involves a T:G intermediate in the genome. According to our proposed model, AID would deaminate 5-methylcytosine into thymine. MBD4 is a DNA repair enzyme that recognizes T:G mismatches and preferentially removes the thymine. This leaves an open lesion in the DNA, and DNA repair processes pick it up from there, adding a complementary C opposite the G and re-forming a C:G base pair, that is, an unmethylated CpG island! This paper also found that overexpression of GADD45, a DNA repair controller, enhanced this process, strongly suggesting that DNA repair mechanisms are necessary to fill the cytosine back in and re-form the C:G base pair.
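The proposed pathway can be walked through as a toy simulation: deamination by AID, thymine excision by MBD4, then repair synthesis. This is only a sketch of the model described above. The function names, the tuple representation of a base pair, and the use of `None` for the excised base are assumptions made for illustration, and only the top strand’s methylation is modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def aid_deaminate(pair):
    """Step 1 (AID): deaminate 5-methylcytosine into thymine."""
    top, bottom = pair
    return ("T", bottom) if top == "5mC" else pair


def mbd4_excise(pair):
    """Step 2 (MBD4): remove the thymine from a T:G mismatch, leaving a gap."""
    top, bottom = pair
    return (None, bottom) if (top, bottom) == ("T", "G") else pair


def repair_fill(pair):
    """Step 3 (DNA repair): fill a gap with the base complementary to the partner."""
    top, bottom = pair
    return (COMPLEMENT[bottom], bottom) if top is None else pair


def demethylate(duplex):
    """Apply the three steps in order to every base pair of the duplex."""
    for step in (aid_deaminate, mbd4_excise, repair_fill):
        duplex = [step(pair) for pair in duplex]
    return duplex


# Hypothetical four-base-pair duplex with one methylated CpG cytosine.
duplex = [("A", "T"), ("5mC", "G"), ("G", "C"), ("T", "A")]
print(demethylate(duplex))
# [('A', 'T'), ('C', 'G'), ('G', 'C'), ('T', 'A')]
```

Note that the ordinary T:A pair is left alone; only a thymine sitting opposite a guanine is treated as a mismatch and excised.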

Together, these two papers provide a mechanistic basis for how AID would orchestrate the first biochemical step in a three-step process (deamination, thymine excision, and repair synthesis) to demethylate CpG islands. Recently, strong evidence has emerged to support the idea that AID is a “CpG demethylase”. For example, stem cells are known to highly express AID (this was originally shown in the J. Biol. Chem. paper discussed above). It would make sense for stem cells to need a way to re-activate certain genes. Embryonic stem cells/pluripotent stem cells are responsible for generating all of the hundreds of cell types of the human body, and obviously, they can’t turn on the genes of all these hundreds of cell types at once! Instead, they must express none of these genes initially, and when they differentiate to produce one of these hundreds of cell types, they selectively turn on that cell type’s genes and keep the rest silenced. Thus, pluripotent stem cells must have a mechanism to activate the (originally silenced) genes of the differentiated cell type they are about to become.

Concordantly, pluripotent stem cells are known to express AID and related family members (e.g. APOBEC1). The fact that pluripotent stem cells, which need to reactivate silenced genes, express AID provides further evidence to suggest that AID is indeed a CpG demethylase.

Several other genes, including TET1 and Elp3, have also recently been shown to have putative CpG demethylase activity.

The identification of putative DNA demethylases has strong clinical implications. Good cells “go bad” and become cancer cells when they start expressing genes they’re not supposed to, like proliferation genes (allowing them to multiply uncontrollably) and migration genes (allowing them to metastasize). DNA methylation patterns are known to be aberrant in cancer cells, suggesting that CpG demethylation may allow normal cells to begin expressing cancer genes. Are AID and other DNA demethylases therefore responsible for allowing cells to become cancerous? If so, inhibition of these demethylases may represent a novel therapeutic strategy to prevent the initiation or progression of cancer.