RNA polymerase II stalling: loading at the start prepares genes for a sprint

Abstract

Stalling of RNA polymerase II near the promoter has recently been found to be much more common than previously thought. Genome-wide surveys of the phenomenon suggest that it is likely to be a rate-limiting control on gene activation that poises developmental and stimulus-responsive genes for prompt expression when inducing signals are received.

The recruitment of RNA polymerase II (Pol II) to the promoter has been generally believed to be the rate-limiting step in gene activation [1]. However, a series of discoveries made since the mid-1980s, combined with recent genome-wide studies, suggest that many developmental and inducible Drosophila and mammalian genes, prior to their expression, contain Pol II bound predominantly in their promoter proximal regions in a 'stalled' state [1–8]. Activation of the stalled polymerase is thought to be responsible for the expression of these genes [1].

Distinct sets of accessory factors are associated with Pol II stalling and its escape from stalling, acting either by direct interaction with Pol II, or by manipulating the chromatin environment - for example, by affecting histone modifications by histone methyltransferases (HMTs) or histone acetyltransferases (HATs) [1]. Proteins associated with Pol II stalling include the DRB sensitivity-inducing factor (DSIF) and the negative elongation factor (NELF) [9, 10], whereas proteins such as the positive transcription-elongation factor-b (P-TEFb) complex, and the general transcription factors TFIIS and TFIIF contribute to escape from stalling [11, 12]. These latter proteins enable Pol II to begin transcription elongation on induction by heat shock, for example. Thus, all the factors mentioned above may serve as a stalling checkpoint to poise genes for prompt expression.

Although initial studies revealed Pol II stalling on several genes, recent work has shown it to be more common than previously thought [13–16]. Technologies such as ChIP-chip (chromatin immunoprecipitation in combination with genomic DNA microarrays) allow transcriptional regulation to be examined genome wide [17, 18]. Through the ENCODE (Encyclopedia of DNA Elements) project [19], in which 1% of the human genome has been extensively analyzed, and through genomic studies in Drosophila, human embryonic stem cells (hESCs) and other human cell lines, Pol II stalling is now seen as a genome-wide phenomenon. Moreover, it is enriched at highly regulated genes that are essential for responses to stimuli and for embryonic development [13, 14]. In addition, stalled Pol II signals are associated with active histone modification marks, including trimethylation of lysine 4 on histone H3 (H3K4me3) and acetylation of H3 lysines 9 and 14 (H3K9ac and H3K14ac) [16]. Thus, Pol II promoter-proximal stalling could help to provide an active chromatin environment and prepare developmental and stimulus-responsive genes for timely expression [1, 20]. However, for a gene to proceed to transcriptional elongation, additional histone modifications are necessary [16]. In this article, we review the mechanisms of Pol II stalling, with a focus on recent genomic and epigenomic findings, and discuss the biological implications of the widespread stalling phenomenon.

Pol II stalling and transcription elongation

Pol II promoter-proximal stalling was first described in Drosophila heat-shock-inducible genes (for example, Hsp70) using ultraviolet-crosslinking and chromatin immunoprecipitation (UV ChIP), which captures the specific proteins and their bound DNA in vivo [4]. Pol II was found to be recruited to the promoter of the uninduced Hsp70 gene, where it initiates RNA synthesis but stalls after synthesis of 20-50 nucleotides of RNA [1, 21]. Heat-shock stimulation enabled Pol II to escape from the Hsp70 promoter-proximal region and transcribe the full-length RNA. Thus, the regulation of Pol II stalling rather than of transcription initiation is rate-limiting for expression of this gene. After this initial discovery, Pol II stalling was observed in more than a dozen Drosophila and viral (HIV) genes, as well as in mammalian genes (Myc, Junb, and Fos), in studies using UV ChIP and nuclear run-on methods [1–8].

Pol II stalling was found to result from repression of transcript elongation by at least two protein complexes: DSIF and NELF [9, 10]. ChIP assays showed that both DSIF and NELF are co-localized with the stalled polymerase in the promoter-proximal regions of uninduced Drosophila heat-shock genes [9, 10]. The mechanisms by which DSIF and NELF regulate Pol II stalling are still under investigation. As NELF has been found to bind to RNA, it is possible that NELF exerts its repressive function through interaction with the nascent RNA [22].

The negative effects of DSIF and NELF on transcriptional elongation are relieved by the action of factors that include the P-TEFb complex, TFIIF and TFIIS [11, 12] (Figure 1). The P-TEFb complex includes cyclin-dependent kinase 9 (CDK9) and cyclin T. P-TEFb facilitates the release of Pol II from stalling by phosphorylating DSIF, NELF and the carboxy-terminal domain (CTD) of the largest Pol II subunit (Rpb1) at its Ser2 residue [8, 11]. TFIIS also helps Pol II to escape from promoter-proximal regions and carry out full-length gene transcription. Stalled polymerases are prone to backtracking along the DNA template, such that the 3'-OH of the nascent RNA becomes misaligned with the Pol II active site. By associating with the stalled Pol II, TFIIS enables transcription elongation to continue by stimulating the intrinsic RNA cleavage activity of Pol II to generate a new 3'-OH end that is aligned with the active site [12]. Moreover, TFIIF, eleven-nineteen lysine-rich in leukemia (ELL), and elongin also help to overcome stalling and stimulate Pol II transcription [1]. It is thought that TFIIF functions mainly during the promoter-escape stage, whereas ELL and elongin exert their effects once the nascent RNA transcript is eight to nine nucleotides long [1]. After Pol II escapes from the promoter-proximal regions, NELF dissociates from the elongation complex while DSIF, TFIIS and P-TEFb remain associated and continue to stimulate full-length gene transcription (Figure 1) [1, 9, 12]. Identification and characterization of additional elongation factors should further our understanding of Pol II stalling and its escape.

Figure 1

RNA polymerase II promoter-proximal stalling and subsequent escape to transcriptional elongation. At many genes, RNA polymerase II (Pol II) stalls after the initiation of transcription, producing a short transcript typically less than 50 nucleotides long (left). Escape from stalling (right) is induced by developmental or environmental signals. In the stalled complex, only Ser5 of the carboxy-terminal domain (CTD) of Pol II is phosphorylated [9]. The P-TEFb complex (composed of CDK9 and cyclin T) facilitates release of Pol II from stalling by phosphorylating DSIF, NELF and the carboxy-terminal domain of Pol II at Ser2 residues [8,11]. See text for details of other proteins shown in the diagram.

Genome-wide Pol II stalling phenomena

Until very recently, Pol II stalling was known only at the promoters of a limited number of genes. Large-scale transcriptional regulation studies have enabled Pol II locations to be examined on a whole-genome scale. ChIP-chip experiments in both human and Drosophila using tiling oligonucleotide microarrays have demonstrated that Pol II stalling is a genome-wide phenomenon and more common than previously thought [17, 18]. Kim et al. [15] showed that, in addition to genes, such as Fos, that were known to have stalled Pol II, large numbers of promoters of human genes are bound by the Pol II preinitiation complex (PIC), although no expression was detected for these genes. These authors used specific antibodies against the PIC components of Pol II and the general transcription factor TFIID in ChIP-chip assays to map global PIC-binding sites in human primary fibroblast IMR90 cells. Promoter occupancy by the PIC was then compared with the genome-wide expression profiles of IMR90 cells. Kim et al. found that more than 600 genes were bound by the PIC but were not expressed. Regulatory mechanisms other than Pol II recruitment to the promoter must therefore be required to express these transcripts.
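
The comparison described above, in which promoter occupancy by the PIC is set against expression status, amounts to a set difference. The Python sketch below illustrates the idea on toy data only. The gene names, the function name and the inputs are hypothetical, and the sketch does not reproduce the thresholds or statistics used by Kim et al. [15].

```python
def pic_bound_but_silent(pic_bound, expressed):
    """Return genes whose promoters are PIC-bound but show no detected expression."""
    return sorted(set(pic_bound) - set(expressed))


# Toy, hypothetical inputs (not real data).
pic_bound = {"geneA", "geneB", "geneC", "geneD"}   # promoters with a PIC ChIP-chip call
expressed = {"geneA", "geneC", "geneE"}            # genes with detected transcripts

candidates = pic_bound_but_silent(pic_bound, expressed)
print(candidates)
```

In a real analysis, both input sets would come from thresholded ChIP-chip and expression data, and the choice of thresholds would strongly affect the size of the resulting list.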

More recently, Muse et al. [13] reported a genome-wide search using ChIP-chip for Pol II promoter-proximal stalling in Drosophila. Among the genes bound by Pol II, some carry uniform Pol II binding throughout the gene, including the coding region, whereas in others Pol II binding signals were prominent in the promoter regions, and either absent or present at a low level within the full-length genes, consistent with the unique features of stalled Pol II. The genes with stalled Pol II showed low or no expression, whereas the uniformly bound genes showed a good correlation between Pol II occupancy and expression level. Other evidence supported the notion that the promoter-enriched binding was indeed due to Pol II stalling. Permanganate footprinting assays were carried out to test promoter melting by monitoring the reactivity of thymine residues. The hyper-reactivity of single-stranded thymine residues confirmed the presence of the stalled polymerase in the transcription bubble [23]. In addition, NELF occupancy was also detected via ChIP at the genes with stalled Pol II. Depleting NELF by RNA interference significantly decreased Pol II signals in the promoter region only and not throughout the gene. Interestingly, Gene Ontology analysis revealed that the genes bound by stalled polymerase are enriched in developmental and stimulus-responsive genes involved in cell differentiation, cell-cell signaling and immune response pathways. Finally, it was shown that the stalled Pol II could be rapidly released upon gene induction, such as with UV irradiation [13].

Using similar methods, Zeitlinger et al. [14] reported a comprehensive Pol II ChIP-chip study in Drosophila embryos. This study took advantage of a mutant embryo that consists of mesodermal precursor cells alone. Neuronal- and ectodermal-specific genes are repressed in this mutant embryo. Pol II stalling was observed at more than 1,000 genes. In these cases, there is a much higher Pol II signal associated with the 5' region of the gene than the 3' region, and neuronal and ectodermal developmental genes were over-represented among the genes with stalled Pol II. On the other hand, ubiquitously expressed genes, such as genes for ribosomal components, exhibited uniform Pol II binding signal throughout the entire gene and a high level of expression. Zeitlinger et al. [14] also found genes that lacked Pol II binding altogether. These genes were enriched in genes for adult functions that do not need to be expressed in the embryo.
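
The distinction drawn above between promoter-enriched and uniform Pol II binding can be expressed as a ratio of promoter-proximal to gene-body signal. The following Python sketch is one possible way to compute such a ratio and is not the procedure used in [13] or [14]. The function names, the pseudocount and both cutoffs are assumptions chosen for illustration, and the signal values are invented.

```python
from statistics import mean


def stalling_index(promoter_signal, body_signal, pseudocount=1.0):
    """Ratio of mean promoter-proximal to mean gene-body Pol II signal.

    The pseudocount (an assumed value) avoids division by zero.
    """
    return (mean(promoter_signal) + pseudocount) / (mean(body_signal) + pseudocount)


def classify(promoter_signal, body_signal, bound_cutoff=2.0, ratio_cutoff=4.0):
    """Label a gene 'no Pol II', 'stalled' or 'uniform' using illustrative cutoffs."""
    if mean(promoter_signal) < bound_cutoff and mean(body_signal) < bound_cutoff:
        return "no Pol II"
    if stalling_index(promoter_signal, body_signal) >= ratio_cutoff:
        return "stalled"
    return "uniform"


# Toy enrichment values in arbitrary units (hypothetical, not real data).
print(classify([12.0, 15.0, 9.0], [1.0, 0.5, 1.5]))  # signal concentrated at the 5' end
print(classify([8.0, 9.0, 10.0], [7.0, 8.0, 9.0]))   # signal throughout the gene
print(classify([0.5, 1.0, 0.0], [0.5, 0.0, 1.0]))    # no appreciable signal
```

The three toy genes correspond to the three classes described in the text: stalled, uniformly bound and unbound.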

Connecting chromatin structure with Pol II stalling

In addition to polymerase binding, an 'open' chromatin structure is essential for active transcription. Various covalent histone modifications around the transcription start site are thought to be important for nucleosome depletion and chromatin decondensation, which enable Pol II to move forward and transcribe a DNA template. The most common chromatin modifications include histone acetylation by HATs and methylation by HMTs, which can alter the properties of chromatin and affect nucleosome repositioning. Genome-wide studies in Saccharomyces cerevisiae and Drosophila have shown that trimethylation of H3K4, and acetylation of H3K9 and H3K14 are usually associated with active transcription [1, 16, 24]. In humans, the correlation between Pol II occupancy and histone marks is likely to be more complex. As mentioned above, large numbers of human genes have Pol II bound at their promoter-proximal regions [15, 16, 25]. In addition, Guenther et al. [16] found that most annotated genes are associated with H3K4me3, H3K9ac and H3K14ac modifications in the promoter regions, although more than half of these genes are inactive. This is true not only in hESCs but also in differentiated cells, including primary hepatocytes and B cells [16]. Most genes that are bound by Pol II initiate transcription, but only genes with histone H3 trimethylation of lysine 36 (H3K36me3) and dimethylation of lysine 79 (H3K79me2) proceed to elongation and produce a mature transcript (Figure 2). Furthermore, quantitative reverse transcription-PCR (RT-PCR) showed that the genes that lack these additional histone marks do bind Pol II but at a level lower than the actively transcribed genes, and produce mainly short 5' transcripts of fewer than 70 nucleotides [16, 18]. This observation is consistent with the findings of Barski and colleagues on genome-wide histone methylation [25]. In addition, studies in the ENCODE Project [18] and by Weber et al. [26] revealed that H3K4me2 and H3K4me3 are enriched at unmethylated CpG island promoters, regardless of gene-expression status, suggesting that these histone marks may protect DNA from methylation and irreversible transcriptional silencing [18, 26]. It is presently unclear whether chromatin modifications are the result of transcriptional events, or facilitate them, or both. For example, H3 trimethylation of lysine 36 by the enzyme Set2 associated with the elongating transcription complex might alter chromatin structure and thereby facilitate subsequent rounds of transcription [1, 27].
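
The relationship between histone marks and transcriptional state summarized above can be written as a small decision rule. The Python sketch below is a deliberate simplification for illustration: it treats each mark as simply present or absent and requires every mark in a group, which are assumptions and not criteria taken from [16]. The function name and state labels are hypothetical.

```python
# Marks reported at promoters of genes that initiate transcription [16].
INITIATION_MARKS = {"H3K4me3", "H3K9ac", "H3K14ac"}
# Additional marks reported at genes that proceed to elongation [16].
ELONGATION_MARKS = {"H3K36me3", "H3K79me2"}


def transcription_state(marks):
    """Assign a coarse state from a collection of histone marks (present/absent only)."""
    marks = set(marks)
    has_initiation = INITIATION_MARKS <= marks   # all initiation marks present
    has_elongation = ELONGATION_MARKS <= marks   # all elongation marks present
    if has_initiation and has_elongation:
        return "elongating"
    if has_initiation:
        return "initiated, not elongating"
    return "other"


print(transcription_state(["H3K4me3", "H3K9ac", "H3K14ac"]))
print(transcription_state(["H3K4me3", "H3K9ac", "H3K14ac", "H3K36me3", "H3K79me2"]))
print(transcription_state([]))
```

Real ChIP data are quantitative, so any such rule would need signal thresholds and would not capture genes with only a subset of the marks.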

Figure 2

Histone-modification patterns associated with Pol II stalling and escape. Histone modifications typical of (a) genes with stalled Pol II and (b) genes after Pol II escape to transcription elongation. Stalled Pol II signals are associated with active histone-modification marks, including histone H3 trimethylation on lysine 4 (H3K4me3) and acetylation of lysines 9 and 14 (H3K9ac, H3K14ac) [16]. These marks alone are not sufficient for transcript elongation to proceed; additional histone modifications, including H3 trimethylation on K36 (H3K36me3) and dimethylation on K79 (H3K79me2), are also needed [16]. The colored bars indicate the location of the histone modifications on the transcripts.

Pol II stalling and new transcription

Recent studies of the transcriptome using microarrays or sequencing have disclosed large amounts of transcriptional activity throughout the human genome, with many of the transcripts being present as polyA+ RNA [18, 28–30]. The biological role of this 'unannotated' transcription, much of which is not expected to encode proteins, remains elusive. In particular, it was reported that short transcribed sequences of fewer than 200 nucleotides are clustered at the 5' and 3' ends of genes, producing the so-called promoter-associated sRNAs (PASRs) and termini-associated sRNAs (TASRs) [31]. Intriguingly, the approximate lengths of one major class of PASRs are 26, 38 and 50 nucleotides, which is consistent with the lengths of the short transcripts reported to be produced as a result of Pol II stalling. However, there are usually multiple PASRs in the promoter region, not necessarily starting from the same site. It is not yet clear whether these represent multiple transcriptional start sites, or how these PASRs relate to Pol II stalling. Finally, Pol II binding signals are also observed at the 3' ends of transcripts. It is possible that Pol II also pauses upon termination of transcription (Z Lian, A Karpikov, J Lian, MC Mahajan, S Hartman, M Gerstein, MS and SM Weissman, unpublished data). Further study is necessary to reveal the role of Pol II stalling in global transcription. The emerging technology of massively parallel sequencing will facilitate this effort [32–34].

The implication of genome-wide Pol II stalling and future prospects

What is the broad implication of Pol II stalling beyond the heat-shock response [4]? The genomic studies discussed in this review suggest that Pol II stalling is much more widespread than previously thought, and could serve as a rate-limiting step in transcriptional regulation to prepare organisms to respond to dynamic environmental and developmental changes. The prompt generation of gene products is crucial for the development and survival of the organism. Zeitlinger et al. [14] found that genes with stalled Pol II were highly enriched among developmental genes that are destined to be transcribed very soon - within 12 hours. Furthermore, Pol II promoter-proximal stalling could help to establish an active chromatin structure and allow prompt regulation of gene expression upon environmental stimuli and developmental signals. However, it is not clear how active histone marks are established around the 'poised' genes and whether the deposition of the histone marks depends on Pol II occupancy. Only about 70% of the genes marked by H3K4me3 or H3K9ac and H3K14ac are associated with Pol II promoter-proximal binding [16]. How Pol II stalling is regulated also remains to be investigated. As noted above, data from Hsp70 indicate that stalling is likely to be mediated through a repressive mechanism. One possible developmental regulator of Pol II stalling may be the protein Snail, a transcriptional repressor that silences neuroectodermal genes in the presumptive mesoderm of Drosophila embryos [14, 35].

Further research is needed to identify additional factors that play important roles in Pol II stalling and the escape from stalling to transcriptional elongation. For example, the loss of key factors such as NELF and DSIF does not completely eliminate Pol II stalling. Also, there are indications from co-immunoprecipitation assays that additional elongation activators interact with the P-TEFb kinase, but the evidence is not conclusive [11]. Lastly, the issue of how stimuli and developmental signals trigger the elongation machinery to release the stalled Pol II will be of prime interest for future investigations.

Declarations

Acknowledgements

We thank Maya Kasowski, Wei Zheng, Karl Waern, and Christopher Heffelfinger for critical reading of the manuscript and discussion. We acknowledge the members of the Snyder lab for help and support. JQW is supported by an NIH Ruth L Kirschstein National Research Service Award and an NIH training grant. MS and research in the Snyder laboratory are supported by grants from the NIH.
