The TATA box was first identified in 1978[1] as a component of eukaryotic promoters. Transcription is initiated at the TATA box in TATA-containing genes. The TATA box is the binding site of the TATA-binding protein (TBP) and other transcription factors in some eukaryotic genes. Gene transcription by RNA polymerase II depends on the regulation of the core promoter by long-range regulatory elements such as enhancers and silencers.[5] Without proper regulation of transcription, eukaryotic organisms would not be able to respond appropriately to their environment.

Most research on the TATA box has been conducted on yeast, human, and Drosophila genomes; however, similar elements have been found in archaea and ancient eukaryotes.[2] In archaeal species, the promoter contains an 8 bp AT-rich sequence located ~24 bp upstream of the transcription start site. This sequence was originally called Box A and is now known to interact with the archaeal homologue of the TATA-binding protein (TBP). Although some studies have uncovered several similarities between archaeal and eukaryotic TBP, others have detected notable differences. The archaeal protein exhibits greater symmetry in its primary sequence and in its distribution of electrostatic charge, which is important because the higher symmetry lowers the protein's ability to bind the TATA box in a polar manner.[2]

Even though the TATA box is present in many eukaryotic promoters, it is important to note that it is not contained in the majority of promoters. One study found that fewer than 30% of 1031 potential promoter regions in humans contain a putative TATA box motif.[9] In Drosophila, fewer than 40% of 205 core promoters contain a TATA box.[8] When there is an absence of the TATA box and TBP is not present, the downstream promoter element (DPE), in cooperation with the initiator element (Inr), binds to transcription factor II D (TFIID), initiating transcription in TATA-less promoters. The DPE has been identified in three Drosophila TATA-less promoters and in the TATA-less human IRF-1 promoter.[10]

In prokaryotes, promoter regions may contain a Pribnow box, which serves a purpose analogous to that of the eukaryotic TATA box. The Pribnow box has a 6 bp region centered around the -10 position and an 8-12 bp sequence around the -35 region, both of which are conserved.[10]

A CAAT box (also CAT box) is a region of nucleotides with the following consensus sequence: 5'-GGCCAATCT-3'. The CAAT box is located about 75-80 bases upstream of the transcription initiation site and about 150 bases upstream of the TATA box. It binds transcription factors (CAAT TF or CTFs) and thereby stabilizes the nearby preinitiation complex for easier binding of RNA polymerases. CAAT boxes are rarely found in genes that express proteins ubiquitous in all cell types.[10]

The TATA box is a component of the eukaryotic core promoter and generally contains the consensus sequence 5'-TATA(A/T)A(A/T)-3'.[3] In yeast, for example, one study found that various Saccharomyces genomes had the consensus sequence 5'-TATA(A/T)A(A/T)(A/G)-3', yet only about 20% of yeast genes contained the TATA sequence.[12] Similarly, in humans only 24% of genes have promoter regions containing the TATA box.[13] Genes containing the TATA box tend to be involved in stress responses and certain types of metabolism, and they are more highly regulated than TATA-less genes.[12][14] Generally, TATA-containing genes are not involved in essential cellular functions such as cell growth, DNA replication, transcription, and translation because of their highly regulated nature.[14]
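The consensus sequence above maps naturally onto a regular expression, with each (A/T) position becoming a character class. The following is a minimal sketch, assuming a plain DNA string on the sense strand; the function name `find_tata_boxes` and the choice to report overlapping matches are illustrative assumptions, not a method taken from the cited studies. The test sequence is the adenovirus promoter fragment 5'-CGCTATAAAAGGGC-3' quoted elsewhere in this article.

```python
import re

# 5'-TATA(A/T)A(A/T)-3' written as a regular expression.
# The lookahead (?=...) lets overlapping matches be reported as well.
TATA_CONSENSUS = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_boxes(sequence):
    """Return (0-based start, matched text) for every consensus match.

    Hypothetical helper for illustration only: it checks one strand and
    ignores position relative to any transcription start site.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in TATA_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Adenovirus model sequence quoted in the article.
    print(find_tata_boxes("CGCTATAAAAGGGC"))  # [(3, 'TATAAAA')]
```

A match only shows that the motif is present in the string; as the percentages above suggest, whether a given match acts as a functional TATA box depends on its promoter context.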

Additionally, binding of TBP is facilitated by stabilizing interactions with the DNA flanking the TATA box, which consists of G-C rich sequences.[19] These secondary interactions induce bending of the DNA and helical unwinding.[20] The degree of DNA bending is species- and sequence-dependent. For example, one study used the adenovirus TATA promoter sequence (5'-CGCTATAAAAGGGC-3') as a model binding sequence and found that human TBP binding to the TATA box induced a 97° bend toward the major groove, while yeast TBP induced only an 82° bend.[21] X-ray crystallography studies of TBP/TATA-box complexes generally agree that the DNA goes through an ~80° bend during TBP binding.[16][17][18]

Figure 3. Effects of mutations on TBP binding to the TATA box. The wild type shows normal transcription. An insertion or deletion shifts the TATA box recognition site, which results in a shifted transcription start site.[26] Point mutations risk leaving TBP unable to bind for initiation.[27]

One of the first studies of TATA box mutations looked at a DNA sequence from Agrobacterium tumefaciens for the octopine-type cytokinin gene.[26] This gene has three TATA boxes. A phenotype change was observed only when all three TATA boxes were deleted. An insertion of extra base pairs between the last TATA box and the transcription start site shifted the start site, resulting in a phenotypic change. This original mutation study showed that transcription changes when no TATA box is present to promote it, whereas transcription of the gene still occurs when base pairs are inserted into the sequence, although the insertion may affect the nature of the resulting phenotype.

Savinkova et al. have written a simulation to predict the KD value for a selected TATA box sequence and TBP.[37] This can be used to directly predict the phenotypic traits resulting from a selected mutation based on how tightly TBP binds to the TATA box.
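To give a feel for why a predicted KD is informative, the sketch below converts a dissociation constant into the fraction of TATA box sites occupied by TBP using the standard single-site binding isotherm, occupancy = [TBP] / ([TBP] + KD). This is not the simulation of Savinkova et al.; the function name, the units, and the concentration and KD values are all hypothetical and chosen only for illustration.

```python
def fraction_bound(tbp_conc_nm, kd_nm):
    """Fraction of TATA box sites occupied at equilibrium.

    Assumes simple 1:1 binding with free TBP in excess
    (single-site isotherm); both arguments are in nM.
    """
    if tbp_conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return tbp_conc_nm / (tbp_conc_nm + kd_nm)


if __name__ == "__main__":
    tbp = 10.0  # hypothetical free TBP concentration, nM
    # Hypothetical KD values: a tighter-binding and a weaker-binding variant.
    for label, kd in [("tighter binding", 10.0), ("weaker binding", 90.0)]:
        print(f"{label}: KD = {kd} nM, occupancy = {fraction_bound(tbp, kd):.2f}")
```

Under these assumptions, a mutation that raises KD lowers occupancy at the same TBP concentration, which is the intuition behind relating binding strength to phenotype.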

Gastric cancer is correlated with TATA box polymorphism.[38] The TATA box has a binding site for the transcription factor of the PG2 gene. This gene produces PG2 serum, which is used as a biomarker for tumours in gastric cancer. Longer TATA box sequences correlate with higher levels of PG2 serum, which indicate gastric cancer conditions. Carriers with shorter TATA box sequences may produce lower levels of PG2 serum.

MicroRNAs also play a role in the replication of viruses such as HIV-1.[43] Novel HIV-1-encoded microRNAs have been found to enhance production of the virus and to activate HIV-1 latency by targeting the TATA box region.

Many of the studies so far have been performed in vitro, providing only a prediction of what may happen, not a real-time representation of what is happening in cells. Studies in 2016 were conducted to demonstrate TATA-binding activity in vivo. Core promoter-specific mechanisms for transcription initiation by the canonical TBP/TFIID-dependent basal transcription machinery have recently been documented in vivo, showing activation by the SRF-dependent upstream activating sequence (UAS) of the human ACTB gene, which is involved in TATA binding.[5]

Evolutionary changes have pushed plants to adapt to changing environmental conditions. In the history of Earth, the development of an aerobic atmosphere resulted in an iron deficiency in plants.[45] The iron-regulated transporter 1 (IRT1) promoter region of apple trees, genus Malus, has recently been modified to increase iron uptake. Inserting a TATA box into the promoter region upstream of IRT1 enhances promoter activity, which increases TFIID activity and helps to up-regulate transcription initiation.[45]