Abstract

Polyadenylation, an essential step of pre‐mRNA processing, has been recognized in recent years as playing an important regulatory role in eukaryotic gene expression. Such regulation occurs mostly through the use of alternative polyadenylation (APA) sites, which generates different transcripts with altered coding capacity for proteins and/or RNA. However, the molecular mechanisms that underlie APA are poorly understood. Besides the APA cases demonstrated in animal embryo development, cancers, and other diseases, a number of APA examples have been reported in plants. The best‐known ones are related to flowering time control pathways and stress responses. Genome‐wide studies have revealed that plants use APA extensively to generate diversity in their transcriptomes. Although each transcript produced by RNA polymerase II has a poly(A) tail, over 50% of the plant genes studied possess multiple APA sites in their transcripts. The signals defining poly(A) sites in plants were mostly studied through classical genetic means, and our understanding of these signals has been enhanced by tallies from whole plant transcriptomes. The profiles of these signals have been used to build computer models that can predict poly(A) sites in newly sequenced genomes, predict potential APA sites in genes of interest, and identify, and then mutate, unwanted poly(A) sites in target transgenes to facilitate crop improvement. In this review, we provide readers with an update on recent research advances that shed light on polyadenylation, APA, and their roles in the regulation of gene expression in plants.

WIREs RNA 2011, 2:445–458. DOI: 10.1002/wrna.59

This article is categorized under:
RNA Processing > Splicing Regulation/Alternative Splicing
RNA Processing > 3' End Processing

Images

Polyadenylation signals of nuclear mRNA in plants and representative algae. The algal signal information was mostly from three species with sequenced genomes: the green alga Chlamydomonas reinhardtii and two diatoms, Thalassiosira pseudonana and Phaeodactylum tricornutum. The signals specific to C. reinhardtii are denoted with *. The signal strength information covers both plants and algae and is based on classical genetics analyses (plants) and bioinformatics analyses (algae). The percentage data were estimated from the results of bioinformatics analysis. The question mark after FUE indicates that some of the species might not have this signal. CDS, protein coding sequence; UTR, untranslated region; FUE, far upstream element; NUE, near upstream element; CE, cleavage element; PAS, poly(A) site; YA, predominant dinucleotide located at the poly(A) or cleavage site, where Y = U or C. The ‘A’ is the last nucleotide before the poly(A) tail; > or ≫ indicates that one nucleotide appears more often than another. Not drawn to scale.

Schematic representation of alternative polyadenylation of sense transcripts of FCA, FPA, and CPSF30, and of antisense transcripts of FLC. (a) Gene structure; (b) the transcript derived from the proximal poly(A) site (P); (c) the transcript derived from the distal poly(A) site (D). The open boxes denote exons, the solid lines denote introns, and the dotted lines indicate the joining of exons. The filled black boxes represent the gene segment that is an exon (and/or part of the UTR) when the proximal poly(A) site is used, but an intron when the distal poly(A) site is used. The double backward slash lines denote exons that are not shown.