Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.

Abstract

We have previously demonstrated that the KH-domain protein αCP binds to a 3' untranslated region (3'UTR) C-rich motif of the nascent human alpha-globin (hα-globin) transcript and enhances the efficiency of 3' processing. Here we assess the genome-wide impact of αCP RNA-protein (RNP) complexes on 3' processing with a specific focus on their role in alternative polyadenylation (APA) site utilization. The major isoforms of αCP were acutely depleted from a human hematopoietic cell line, and the impact on mRNA representation and poly(A) site utilization was determined by direct RNA sequencing (DRS). Bioinformatic analysis revealed 357 significant alterations in poly(A) site utilization that could be specifically linked to the αCP depletion. These APA events correlated strongly with the presence of C-rich sequences in close proximity to the impacted poly(A) addition sites. The most significant linkage was the presence of a C-rich motif within a window 30 to 40 bases 5' to poly(A) signals (AAUAAA) that were repressed upon αCP depletion. This linkage is consistent with a general role for αCPs as enhancers of 3' processing. These findings predict a role for αCPs in posttranscriptional control pathways that can alter the coding potential and/or levels of expression of subsets of mRNAs in the mammalian transcriptome.
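The positional relationship described in the abstract (a C-rich motif located 30 to 40 bases 5' to an AAUAAA signal) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the placeholder motif `CCC`, and the choice to measure distance from the first base of the hexamer are all assumptions made for illustration.

```python
def c_rich_upstream_of_signals(rna, motif="CCC", signal="AAUAAA", near=30, far=40):
    """Return start indices of poly(A) signals that have `motif` lying
    entirely within the window `near`..`far` bases 5' of the signal.

    Assumptions (not from the source): the motif is a literal string,
    and distances are counted from the first base of the signal.
    """
    hits = []
    start = rna.find(signal)
    while start != -1:
        window_start = start - far       # base located `far` nt upstream
        window_end = start - near + 1    # exclusive end; includes base `near` nt upstream
        if window_start >= 0 and motif in rna[window_start:window_end]:
            hits.append(start)
        start = rna.find(signal, start + 1)
    return hits


# Hypothetical sequences for illustration only.
with_motif = "GGG" + "CCC" + "G" * 34 + "AAUAAA"   # CCC sits 35-37 nt upstream
without_motif = "G" * 40 + "AAUAAA"
hits_with = c_rich_upstream_of_signals(with_motif)
hits_without = c_rich_upstream_of_signals(without_motif)
```

A real analysis would replace the literal `CCC` with a position weight matrix or the motif reported by a discovery tool.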

siRNA-mediated codepletion of αCP1 and αCP2 from K562 cells. (A) Experimental procedure. K562 cells were separately transfected with two distinct siRNAs, each cotargeting αCP1 and αCP2 mRNAs [αCP(1/2)-1 and αCP(1/2)-2]. Parallel transfections were carried out with two distinct control siRNAs targeting an unrelated protein (GLD-2 mRNA; CTRL-1 and CTRL-2). At 24 h posttransfection, cells were retransfected with the same siRNAs, cultured for an additional 2 days (total, 3 days of culture), and assessed for effective siRNA-mediated depletion by protein and RNA analyses. RNA isolated from each culture was subjected to DRS analysis for mapping and quantification of 3′ processing sites. (B) Assessment of αCP depletion by real-time RT-PCR. Levels of mRNAs encoding the two αCP isoforms, αCP1 and αCP2, are displayed. The values on the y axis represent the αCP mRNA levels normalized to levels of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) mRNA in the respective samples. The ratio of αCP to GAPDH for CTRL-1 is arbitrarily defined as 1.0. The standard deviation for each sample is shown (n = 2). (C) Assessment of αCP depletion by Western blotting. Affinity-purified antibodies specific for either αCP1 or αCP2 were used for detection in the top and middle panels. Detection of the large ribosomal subunit protein L7a, which controlled for sample loading, is shown in the bottom panel.
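The normalization described for panel B (αCP mRNA relative to GAPDH mRNA, with the CTRL-1 ratio set to 1.0) can be sketched as follows. The function name and all numeric values are hypothetical and serve only to show the arithmetic; how the underlying quantities were derived from raw real-time data is not stated in the legend.

```python
def relative_expression(target, reference, control_sample):
    """Normalize target levels to a reference mRNA, then scale so that
    the chosen control sample equals 1.0."""
    ratios = {sample: target[sample] / reference[sample] for sample in target}
    baseline = ratios[control_sample]
    return {sample: ratio / baseline for sample, ratio in ratios.items()}


# Hypothetical quantities in arbitrary units (not data from the study).
acp_levels = {"CTRL-1": 8.0, "CTRL-2": 7.6, "aCP(1/2)-1": 1.2, "aCP(1/2)-2": 1.0}
gapdh_levels = {"CTRL-1": 4.0, "CTRL-2": 4.0, "aCP(1/2)-1": 4.0, "aCP(1/2)-2": 5.0}

normalized = relative_expression(acp_levels, gapdh_levels, "CTRL-1")
```

With these made-up inputs, the CTRL-1 value is 1.0 by construction and the two knockdown samples come out near 0.15 and 0.10.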

Impact of αCP depletion on the K562 transcriptome. (A) Heat map. Direct RNA sequencing (DRS) analyses were carried out on poly(A) RNAs isolated from cell cultures treated with the αCP siRNAs or the control siRNAs (as described for ). The heat map represents all 117 mRNA species that showed a >2-fold change in expression (increased or decreased) subsequent to the αCP depletion. The color gradient (log scale) for the heat map represents the change in the overall representation of each mRNA normalized to the corresponding level in the RNA isolated from the cells treated in parallel with the control siRNAs. The positions of the direct siRNA targets, αCP1 and αCP2 mRNAs, are indicated by the arrows to the left of the heat map. (B) GO analysis of mRNAs altered in overall expression (DGE levels) by αCP depletion. All mRNAs that underwent an alteration in steady-state levels (1.5-fold or greater change) subsequent to αCP depletion were included in the GO analysis (DAVID algorithm). The data were assessed with Fisher's exact test with the FDR adjustment. The asterisks indicate the level of significance of the effect. *, 0.01 < FDR < 0.05; **, 0.001 < FDR < 0.01; ***, FDR < 0.001.
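The two thresholds in this legend (a >2-fold change in either direction, and the asterisk scale for FDR values) can be expressed as a short sketch. The function names are hypothetical, and the handling of values that fall exactly on a boundary (for example, FDR = 0.01) is an assumption, since the legend does not specify it.

```python
import math


def exceeds_fold_change(depleted, control, threshold=2.0):
    """True if the depleted/control ratio changed by more than `threshold`-fold,
    in either direction (increase or decrease)."""
    return abs(math.log2(depleted / control)) > math.log2(threshold)


def significance_label(fdr):
    """Map an FDR value to the asterisk code used in the legend.
    Boundary values are assigned to the less significant bin (assumption)."""
    if fdr < 0.001:
        return "***"
    if fdr < 0.01:
        return "**"
    if fdr < 0.05:
        return "*"
    return ""
```

Working on the log2 scale treats a 2-fold increase and a 2-fold decrease symmetrically, which matches the "increased or decreased" wording of the legend.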

Confirmation of DRS results by targeted real-time RT-PCR analysis. Results of real-time analyses of three mRNAs that increased and three mRNAs that decreased in overall abundance relative to controls subsequent to αCP depletion (see Table S2 in the supplemental material) are shown. These studies were carried out on the same RNA preparations as were used in the original DRS analysis. To further validate these results, analyses of mRNA levels in K562 cells were additionally carried out with cells treated with a third distinct control siRNA corresponding to an unrelated mRNA (CTRL-3; cyclophilin siRNA). All values shown were normalized to the corresponding levels of GAPDH mRNA. The data are represented as ratios, with the ratio for the CTRL-3 siRNA sample defined as 1.0. The standard deviation for each sample is shown (n = 3).

Motif analysis within the 3′UTRs of mRNAs impacted by αCP depletion. (A) MEME analyses of the sequences 5′ to the dominant poly(A) sites of all mRNAs that underwent a 1.5-fold or greater change (up or down) in their representation subsequent to αCP depletion (differentially expressed gene [DEG] mRNAs). The RNA map encompassed the 200-nt segments immediately 5′ to the sites of poly(A) addition. The top 3 motifs as detected by MEME are shown. For each motif, the P value and number of mRNAs containing corresponding motifs among the total number of mRNAs being studied are shown. The distance distributions [the poly(A) cleavage site is defined as base 0] are shown below each motif (x axis). The y axis indicates the percentage of nucleotides at each indicated site. An asterisk symbolizes a significant peak detected by the Wilcoxon rank sum test (FDR < 0.05). The P value measures the significance of a motif, and the ratio measures the fraction of mRNAs harboring corresponding motifs in the whole set of mRNAs. DEG mRNAs, 1.5-fold or greater change in expression as determined in comparisons of the αCP-depleted cells with control siRNA-treated cells. Unchanged mRNAs represent mRNAs whose expression was not changed subsequent to αCP depletion in comparison with mRNAs from control siRNA-treated samples. (B) Summary of MEME analyses of all mRNAs downregulated by greater than 1.5-fold subsequent to αCP depletion. Data are displayed as described for panel A. (C) Summary of MEME analyses of all mRNAs upregulated by greater than 1.5-fold subsequent to αCP depletion. Data are displayed as described for panel A.
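The distance distribution plots described above report the percentage of each nucleotide at each position 5′ to the cleavage site (base 0). A minimal sketch of that tabulation, assuming equal-length upstream segments whose last base is position −1, is shown below. The function name and the toy sequences are hypothetical; the study used 200-nt segments.

```python
from collections import Counter


def positional_composition(upstream_seqs, alphabet="ACGU"):
    """Return {position: {nucleotide: percent}} for equal-length sequences
    that end immediately 5' of the cleavage site (defined as base 0)."""
    length = len(upstream_seqs[0])
    if any(len(seq) != length for seq in upstream_seqs):
        raise ValueError("all upstream segments must have the same length")
    profile = {}
    for i in range(length):
        counts = Counter(seq[i] for seq in upstream_seqs)
        position = i - length  # runs from -length up to -1
        profile[position] = {
            nt: 100.0 * counts.get(nt, 0) / len(upstream_seqs) for nt in alphabet
        }
    return profile


# Toy 4-nt segments for illustration only.
profile = positional_composition(["ACCC", "UCCA", "GCCC", "ACCU"])
```

Comparing such profiles between the differentially expressed and unchanged mRNA sets, position by position, is what the asterisked peaks in the plots summarize.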

Motif analysis of transcripts undergoing APA in response to αCP depletion. (A) Analyses of motifs 5′ to poly(A) sites that are involved in APA [DE-APA and SE-APA categories combined; 198 poly(A) sites] and are repressed in their representation subsequent to αCP depletion. The distance distribution plot is shown below each corresponding motif. For each motif, the P value and number of mRNAs containing corresponding motifs among the total number of mRNAs being studied are shown. The y axis indicates the percentage of each nucleotide at each indicated site at the indicated distance to the poly(A) cleavage site location (defined as base 0). The analyses from the cells treated with the control or the αCP siRNAs are directly compared in each setting. Asterisks illustrate positions with FDR (Benjamini-Hochberg algorithm) of less than 0.05 as determined by the Wilcoxon rank sum test. (B) Analyses of motifs 5′ to poly(A) sites that are involved in APA [DE-APA and SE-APA categories combined; 159 poly(A) sites] and are enhanced in their representation by αCP depletion (figure organized as described for panel A). (C) Analysis of motifs 5′ to poly(A) sites that are involved specifically in APA between sites in the same terminal exon (SE)-APA category. A discriminative motif discovery analysis by MEME was executed to specifically identify motifs overrepresented in the downregulated SE-APA [58 poly(A)s] and underrepresented in the upregulated SE-APA [44 poly(A)s]. The repressed SE-APA was defined in this comparison as the positive set, and the enhanced SE-APA was defined as the negative/background set. The position-specific prior probabilities were first estimated for the background set. Next, a normal motif search was done in downregulated SE-APA based on the position-specific prior probabilities. Note that the nucleotide at position 7 can be any of the four nucleotides. (D) Motifs 5′ to poly(A) sites that are involved specifically in APA between sites in different exons [DE-APA poly(A) sites] [122 poly(A)s]. The analysis was carried out as described for panel A.
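The legend states that per-position P values from the Wilcoxon rank sum test were adjusted with the Benjamini-Hochberg algorithm and flagged at FDR < 0.05. The standard step-up adjustment can be sketched as below. This is a generic textbook implementation, not code from the study, and the input P values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-position P values.
raw_p = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(raw_p)
flagged = [i for i, q in enumerate(fdr) if q < 0.05]
```

In this toy example only the first position passes the 0.05 cutoff after adjustment, even though three of the raw P values are below 0.05.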

αCP depletion alters 3′ processing of the αCP2 transcript. (A) Genome browser view of the DRS reads at the αCP2 locus. Comparison of 3′ processing site utilization in cells treated with αCP siRNAs (pooled, upper panels) and control siRNAs (pooled, lower panels). The y axis represents the number of read counts corresponding to each of the poly(A) sites. The two novel poly(A) sites observed in the αCP-depleted cells are encompassed in the dotted oval. (B) Activation of the two novel poly(A) sites in αCP mRNA subsequent to αCP depletion reflects linked alterations in splicing and 3′ processing. Exons 13 and 14 (terminal exon) of the αCP gene are shown in the diagram. The splicing patterns and the positions of the corresponding 3′ poly(A) termini detected in the control cells are indicated by the solid lines and the solid vertical arrows, respectively. (Upper panel) The alternative splicing and 3′ processing events that occur subsequent to αCP depletion are indicated by the corresponding set of dotted lines and dotted vertical arrows, respectively. (Middle panel) The RT-PCR analysis, shown in the gel image, represents amplification between primers 1 and 2. The amplified fragment (boxed) was excised and sequenced. The sequence confirmed the use of the novel splice acceptor site in exon 13a, with the two poly(A) signals highlighted (lower panel: the partial exon 13 sequence is shown in italics; the arrow indicates the starting point of the novel exon 13). (C) Real-time PCR analysis confirms the switch in terminal intron splicing and 3′ poly(A) site selection within the αCP2 transcript subsequent to αCP depletion. The amplimer generated between primers F and R assesses the canonical poly(A) usage, the amplimer generated between primers c and d assesses use of the distal novel poly(A) site within exon 13a, and the amplimer generated by primers a and b assesses overall levels of alternative splicing and 3′ processing with exon 13a. The assays were carried out on RNAs isolated from cells treated with each of the three distinct control siRNAs and with each of the two distinct αCP2-targeting siRNAs. Each real-time assay was normalized to the GAPDH amplicon. The ratio in the CTRL-3 sample is defined as 1.0. The standard deviation for each sample is shown (n = 3). (D) In vitro RNA-protein interaction assay. A 24-nt RNA oligonucleotide (shown below the diagram), encompassing the region immediately 5′ to exon 13a (dashed rectangle), was synthesized and 32P labeled. The DNA sequence 5′ to the splice site is also shown. (Left panel) The labeled RNA oligonucleotide was incubated with HeLa cell nuclear extract, UV-cross-linked, subjected to immunoprecipitation (IP) with anti-αCP2/KL, and resolved on an SDS-PAGE gel. The position of the αCP complex is defined by IP using anti-αCP2/KL antibody. (Right panel) The same 32P-labeled probe was subjected to RNA EMSA with poly(C) competition and anti-αCP supershift analysis using K562 cell S100 extract as described previously.

RT-qPCR validations of APA events. A subset of APA events identified in the DRS analysis was independently assessed by targeted RT-qPCR. Each set of DRS data is shown in the context of the genome browser diagram of the respective locus. The positions of the proximal 3′UTR and the distal 3′UTR are marked below the browser view. The arrow indicates the position of the site of reduced polyadenylation site usage triggered by αCP depletion. All the real-time RT-PCR quantifications were normalized to GAPDH mRNA and are presented as ratios versus CTRL-3 (cyclophilin siRNA) defined as 1.0. The standard deviation for each sample is shown (n = 3). (A and B) Examples of APA involving competing PA sites within the same terminal exon (SE-APA). KD, knockdown. (A) Amino-terminal enhancer of split (AES) gene. (B) Protein phosphatase 2, subunit B, isoform delta (PPP2r2d) gene. The sequences encompassing and 5′ to each APA site are shown, and the C-rich sequences (putative αCP binding sites) and the alternative poly(A) signals (AAUAAA) are highlighted in red. These two examples of mRNAs with the SE-APA pattern were confirmed by real-time RT-PCR. Panel A shows the analysis of a transcript with a C-rich motif preceding the proximal alternative poly(A) site; the decreased usage of the proximal poly(A) site subsequent to αCP depletion was confirmed and demonstrated in the individual histogram and is presented as the elevated ratio of the distal 3′UTR isoform [i.e., use of the distal poly(A) site] relative to the total poly(A) site usage (as described in reference ). Panel B shows a transcript with a C-rich motif preceding the distal site of the alternative poly(A) sites; the decreased usage of the distal poly(A) site was confirmed and is shown in the individual histogram, simply presented as the ratio of the distal 3′UTR isoform [i.e., use of the distal poly(A) site] to control GAPDH mRNA. RNA EMSAs were performed on 32P-labeled RNA probes corresponding to each C-rich segment in the tested genes to confirm αCP complex assembly. RNA oligonucleotide sequences are shown above the EMSA data. Brackets indicate the αCP complex, and the arrows indicate the supershift that occurred when anti-αCP antibody was included in the reaction.

In vitro polyadenylation assay. 32P-labeled RNA substrates representing the poly(A) addition sites for the PRG2 and AES mRNAs were synthesized by in vitro transcription. In each case, the wild-type (WT) sequence or a sequence of a corresponding substrate containing a mutation of the C-rich region (Mut) was synthesized. (A) EMSA was performed to determine the efficiency of αCP assembly on the WT and Mut templates (as described for ). (B) In vitro polyadenylation assay. The 32P-labeled templates were added to an in vitro polyadenylation reaction mixture (HeLa nuclear extract), and the products were resolved on a denaturing polyacrylamide gel. The poly(A) tails were added to the template (bracketed) in the presence (+) but not the absence (−) of the nuclear extract. The polyadenylation efficiency was calculated as the ratio of labeled RNA in the poly(A) tail region divided by total RNA activity (polyadenylated plus remaining substrate). These values (% PA) are shown at the bottom of the respective lanes. Standard deviations, P values, and numbers of repeats (n) are shown below the respective gels. (C) αCP proteins increase the efficiency of polyadenylation in vitro. The WT (left) and mutant (right) PRG2 and AES RNA substrates were incubated with HeLa cell nuclear extract (as described for panel B). Increasing equivalent amounts (0 μg, 0.3 μg, 0.6 μg, and 1.2 μg) of BSA or recombinant αCP2 were added to the indicated reaction mixtures. The reactions were terminated after a 90-min incubation, and the RNAs were analyzed on 6% denaturing PAGE for the addition of poly(A) tails. The PA efficiency, the ratio of labeled RNA in the poly(A) tail region divided by total RNA activity (polyadenylated plus remaining substrate), was quantified as indicated below the respective lanes (the activity at the lowest level of BSA for each set of reactions was defined as 1.0).
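The polyadenylation efficiency defined in panels B and C (labeled RNA in the poly(A) tail region divided by polyadenylated plus remaining substrate, with the lowest-BSA reaction set to 1.0 in panel C) reduces to the arithmetic sketched below. The function names and the signal values are hypothetical; how band intensities were quantified from the gels is not described in the legend.

```python
def polyadenylation_efficiency(polyadenylated_signal, substrate_signal):
    """Percent of labeled RNA found in the poly(A) tail region (% PA)."""
    total = polyadenylated_signal + substrate_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * polyadenylated_signal / total


def relative_to_baseline(efficiencies, baseline_index=0):
    """Scale a series of efficiencies so that the baseline reaction equals 1.0."""
    baseline = efficiencies[baseline_index]
    return [value / baseline for value in efficiencies]


# Hypothetical band intensities (arbitrary units) for a titration series.
percent_pa = [
    polyadenylation_efficiency(pa, sub)
    for pa, sub in [(20.0, 80.0), (30.0, 70.0), (40.0, 60.0)]
]
relative_pa = relative_to_baseline(percent_pa)
```

Expressing each lane relative to a baseline reaction makes titration series from different substrates comparable even when their absolute efficiencies differ.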

Impact of αCPs on expression and alternative polyadenylation of Pol II transcripts. Based on our analysis of 3′ processing of the hα-globin transcript and on the present genome-wide analysis, we propose that αCPs can act as general regulators of 3′ processing. The αCP RNP complex recruits core components of the 3′-end processing machinery to a defined subset of human transcripts containing cognate C-rich binding motifs situated in proximity to poly(A) signals. This RNP complex serves as an upstream sequence element (USE) enhancer of 3′ cleavage and polyadenylation. This enhancement of 3′ processing can increase the net levels of steady-state mRNA and/or alter the pattern of poly(A) site selection. Physiological and/or pathological shifts in the levels or biological activities of αCP can result in major alterations in the transcriptome by altering steady-state levels of subsets of mRNAs and/or by shifts in the relative utilization of alternative poly(A) sites.