Abstract

DEAH helicases participate in pre‐messenger RNA splicing and ribosome biogenesis. The structure of yeast Prp43p‐ADP reveals the homology of DEAH helicases to DNA helicases and the presence of an oligonucleotide‐binding motif. A β‐hairpin from the second RecA domain is wedged between two carboxy‐terminal domains and blocks access to the occluded RNA binding site formed by the RecA domains and a C‐terminal domain. ATP binding and hydrolysis are likely to induce conformational changes in the hairpin that are important for RNA unwinding or ribonucleoprotein remodelling. The structure of Prp43p provides the framework for functional and genetic analysis of all DEAH helicases.

Introduction

RNA helicases use nucleoside triphosphates (NTPs) to unwind double‐stranded RNA or remodel protein–RNA complexes (Liu et al, 2008; Hilbert et al, 2009). Prp43p is a DEAH‐box helicase involved in the release of the intron lariat during pre‐messenger RNA (pre‐mRNA) splicing and in pre‐ribosomal RNA (pre‐rRNA) processing in Saccharomyces cerevisiae (Arenas & Abelson, 1997; Combs et al, 2006; Tanaka et al, 2007). Many RNA duplexes that are part of the splicing machinery are short and therefore expected to require proteins for stabilization; such ribonucleoprotein (RNP) complexes might be regulated by remodelling factors (Staley & Guthrie, 1998). Whereas the Prp43p binding site in the spliceosome is still unknown, Prp43p contacts with pre‐rRNA have recently been mapped to several different stretches in 18S and 25S rRNA and the helicase was suggested to regulate the association of small nucleolar RNA (snoRNA) with pre‐rRNA (Bohnsack et al, 2009). The pre‐rRNA binding sites for Prp43p span 14–20 nucleotides, indicating the length of RNA required for Prp43p binding and potential sites of action. Prp43p is an 88 kDa protein that is universally conserved in eukaryotes and contains two canonical RecA helicase domains, which include eight conserved motifs (I, Ia, Ib, II, III, IV, V and VI; Cordin et al, 2006), followed by a carboxy‐terminal region. These features are shared with the other five known DEAH‐box proteins in S. cerevisiae: splicing factors Prp2p, Prp16p and Prp22p (Burgess et al, 1990; Chen & Lin, 1990; Company et al, 1991; Arenas & Abelson, 1997), and Dhr1p and Dhr2p, which are involved in ribosome biogenesis (Colley et al, 2000). Prp43p has been reported to interact with Ntr1p, Pfa1p/Sqs1p and Pxr1p/Gno1p (Lebaron et al, 2005; Tsai et al, 2005), all of which contain the conserved G‐patch motif (Aravind & Koonin, 1999). 
The interactions with Ntr1p and Pfa1p/Sqs1p are essential for the biological functions of Prp43p in spliceosomal disassembly and pre‐rRNA processing, respectively (Tanaka et al, 2007; Pertschy et al, 2009).

Results and Discussion

The architecture of a DEAH‐box helicase

We have determined the crystal structure of yeast Prp43p in complex with ADP at 2.2 Å resolution (Table 1; supplementary Fig S1 online) and traced the entire molecule except for residues 756–767. The molecule is roughly box‐shaped with dimensions of 97 × 69 × 65 Å³ and comprises six structural entities (Fig 1; supplementary Fig S2 online). DEAH helicases contain two RecA domains, which in Prp43p‐ADP (residues 59–270 and 271–454, respectively) are packed tightly around the nucleotide (Fig 2). As the sequence of Prp43p and other DEAH helicases outside the RecA domains does not show homology to proteins of known structure, we searched for proteins with structural homology by using the DALI server (Holm et al, 2008). In one search with residues 455–634, the best matches were two DNA helicases, Hjm (Research Collaboratory for Structural Bioinformatics (RCSB) entry 2ZJ8, Z‐score 8.1, identity 15%) and Hel308 (RCSB entry 2P6R, Z‐score 7.3, identity 13%), which are highly related structurally (Buttner et al, 2007; Oyama et al, 2009). This region contains a degenerate winged‐helix domain (WHD; residues 455–520) with a weakly defined β‐sheet. It is followed by a seven‐helical bundle, named the ratchet domain (residues 521–634) owing to its putative function as a ratchet for DNA and RNA (see below; Fig 1A; supplementary Fig S2 online). The second search with residues 635–750 identified structural homology to eIF1A (RCSB entry 2OKQ, Z‐score 3.8). We term this region the C‐terminal domain (CTD), which includes a five‐stranded β‐barrel arranged in an oligonucleotide/oligosaccharide‐binding (OB) motif (residues 660–712). The CTD is terminated with a long α‐helix protruding from the bulk of Prp43p (Fig 1A; supplementary Fig S2 online). The WHD, ratchet domain and CTD make intimate contacts with residues 10–36 of the extended amino‐terminal domain (residues 1–58) and the WHD also interacts extensively with RecA‐1 (Fig 1A). 
The interfaces between RecA‐2 and the other domains are relatively small and predominantly polar, suggesting that they are dynamic (supplementary Table S1 online).
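
The domain boundaries quoted in this article can be collected into a small lookup table, which is convenient when mapping a mutation or a conserved residue onto the architecture. The following is a minimal sketch, not part of the original analysis: the names `DOMAINS`, `FEATURES`, `domain_of` and `features_of` are hypothetical, the CTD range is assumed to equal the 635–750 search window, and residues 751–755 are left unassigned because the text does not place them in a named domain.

```python
# Residue ranges (inclusive) as quoted in the article text.
# Assumption: the CTD is taken to span the 635-750 DALI search window.
DOMAINS = {
    "NTD": (1, 58),
    "RecA-1": (59, 270),
    "RecA-2": (271, 454),
    "WHD": (455, 520),
    "ratchet": (521, 634),
    "CTD": (635, 750),
}

# Named sub-features that lie inside one of the domains above.
FEATURES = {
    "5'HP": (396, 415),
    "OB motif": (660, 712),
}

# Residues reported as not traced in the electron density.
UNTRACED = (756, 767)


def domain_of(resnum):
    """Return the name of the domain containing resnum, or None."""
    for name, (start, end) in DOMAINS.items():
        if start <= resnum <= end:
            return name
    return None


def features_of(resnum):
    """Return the list of named sub-features containing resnum."""
    return [name for name, (start, end) in FEATURES.items()
            if start <= resnum <= end]


if __name__ == "__main__":
    # Residues discussed in the text: the adenine-stacking pair and Ser 414.
    for label, res in [("Arg 159", 159), ("Phe 357", 357), ("Ser 414", 414)]:
        print(label, domain_of(res), features_of(res))
```

With these ranges, Arg 159 falls in RecA‐1, Phe 357 in RecA‐2, and Ser 414 in RecA‐2 within the 5′HP, in line with the descriptions given elsewhere in the text.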

The structure of Prp43p. (A) Representation of Prp43p bound to ADP in two orientations differing by a 180° rotation. The extended NTD region is coloured blue, RecA‐1 green, RecA‐2 magenta, the 5′HP yellow, the WHD orange, the ratchet domain cyan and the CTD red. A long helix in the ratchet domain, containing three conserved arginines potentially involved in RNA interaction, is labelled '3R'. (B) Prp43p sequence conservation, based on supplementary Fig S2 online, mapped on the molecular surface in the same orientations as in (A). The 5′HP surface is outlined in yellow in the right panel. (C) Close‐up of the interaction between the 5′HP and the WHD. Hydrogen bonds and salt bridges are shown as dashed lines. The orientation is the same as in (A), but the focus is behind the ratchet domain. (D) Predicted movement of the 5′HP between the ADP state (yellow) and a modelled ATP state (grey). The orientation is rotated −120° about the vertical axis with respect to the right panel of (A). C, conserved; CTD, carboxy‐terminal domain; NC, non‐conserved; NSD, not sufficient data; NTD, amino‐terminal domain; SC, strictly conserved; WHD, winged‐helix domain.

The ADP binding site. (A) The DEAH‐specific binding site for the adenine base between Arg 159 in RecA‐1 and Phe 357 in RecA‐2. Water molecules are shown as red spheres. (B) Comparison of the nucleotide binding sites of a DEAH‐box and a DEAD‐box (grey) helicase (Andersen et al, 2006). (C) Schematic representation of direct interactions between the ADP molecule and Prp43p. Dashed lines represent backbone interactions (electrostatic or hydrogen bonds) and full lines represent side‐chain interactions. The different motifs are shown in bold. (D) The interactions between ADPNP and eIF4AIII (DEAD‐box helicase) are shown as in (C). The DEAD‐box specific Q motif is responsible for direct recognition of adenine in this class of helicases.

Helicases of the two superfamilies I and II bound to ATP and DNA or RNA adopt a conserved 'closed' conformation of the two RecA domains (Le Hir & Andersen, 2008), in which single‐stranded RNA (ssRNA) or DNA is recognized by motifs Ia and Ib in RecA‐1 and motifs IV and V in RecA‐2. On ATP hydrolysis and subsequent Pi release (Nielsen et al, 2009), an 'open' conformation of the helicase is adopted, which can be highly variable (Caruthers et al, 2000; Andersen et al, 2006). To investigate the potential effect of ATP and RNA binding to Prp43p, we first superimposed the RecA‐1 domains of eIF4A3‐RNA‐ADPNP and Prp43p, and then matched the RecA‐2 domain of Prp43p onto its counterpart in eIF4A3‐RNA‐ADPNP. In the resulting model, RecA‐2 was rotated approximately 18° relative to Prp43p‐ADP (supplementary Fig S3 online). This also allowed us to tentatively model ssRNA bound to the RecA domains (see below).
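
A rigid‐body domain rotation of this kind is commonly summarized as a single angle derived from the rotation matrix that relates the two domain placements. The sketch below illustrates that relation only, θ = arccos((tr R − 1)/2), and is not the procedure used to obtain the value reported above. The function names are hypothetical, and the example matrix is a synthetic 18° rotation about the z axis rather than the matrix from the actual superposition.

```python
import math


def rotation_about_z(angle_deg):
    """Build a 3x3 rotation matrix for a rotation about the z axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def rotation_angle_deg(R):
    """Rotation angle (degrees) of a proper 3x3 rotation matrix.

    Uses trace(R) = 1 + 2*cos(theta); the cosine is clamped to [-1, 1]
    to guard against rounding error in matrices from real superpositions.
    """
    trace = R[0][0] + R[1][1] + R[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Synthetic example only: a pure 18 degree rotation about z.
    R = rotation_about_z(18.0)
    print(round(rotation_angle_deg(R), 3))
```

In practice the matrix would come from a least‐squares superposition of equivalent Cα atoms of RecA‐2 after the RecA‐1 domains have been aligned; the angle is independent of the rotation axis, so the same formula applies.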

ADP inhibits binding of RNA to Prp43p

A long, twisted β‐hairpin (residues 396–415) located between motifs V and VI of RecA‐2 is inserted into a cleft between the WHD and CTD (Figs 1 and 3). We will refer to this hairpin as the 5′HP, as it blocks access to the 5′ end of the putative binding pocket for ssRNA expected to be formed in the ATP state. Pulldown experiments with biotinylated RNA verified in vitro that the presence of ADP inhibits the interaction of Prp43p with ssRNA compared with both Prp43p‐ADPNP and apo‐Prp43p (Fig 3C). A hairpin of the same length and location is present in the flavivirus NS3 RNA helicase (Luo et al, 2008). Similarly to Prp43p, the NS3 hairpin also closes the 5′ end of the ssRNA binding pocket and interacts with a C‐terminal helical region. Other elements bounding this pocket are the conserved helicase motifs Ia, Ib, IV and V in the RecA domains, two loops and the last helix in the Prp43p ratchet domain (Fig 3A,B). This helix (residues 610–630) corresponds to a helix in Hel308, which was suggested to provide a ratchet for the directional movement of ssDNA across the RecA domains (Buttner et al, 2007). An equivalent helix is also present in the RNA helicase Brr2, which is required for unwinding of the extremely stable U4/U6 duplex during splicing (Pena et al, 2009; Zhang et al, 2009). In Prp43p, three conserved arginines in this helix face the 3′‐half of this putative RNA binding pocket (Fig 1A; supplementary Fig S2 online), suggesting their involvement in RNA contacts in Prp43p. The 3′ end of this binding pocket is sealed by interactions involving RecA‐1, the ratchet domain and the last residues of the WHD (Fig 3B).

RNA binding to Prp43p. (A) Surface representation of Prp43p with the 5′HP present (left) and removed (right) to show its occlusion of a pocket between RecA‐1, RecA‐2 and the ratchet domains expected to bind ssRNA in the ATP state. An RNA placed by comparison with the exon junction complex (Research Collaboratory for Structural Bioinformatics entry 2HYI) is shown in light‐blue sticks. (B) Inside view of the RNA binding area of the two RecA domains with the 5′ end of the binding site obstructed by the 5′HP and the 3′ end bounded by the RecA‐1 and ratchet domains (Helix 1 and Helix 2 of the ratchet domain are indicated). (C) SDS–PAGE analysis of streptavidin agarose pulldown using a U30 biotinylated RNA. All reactions contained Prp43p and bovine serum albumin (BSA) to reduce background binding and the indicated nucleotide or U30‐biotin. 'M' denotes marker. CTD, carboxy‐terminal domain; NTD, amino‐terminal domain; SDS–PAGE, SDS–polyacrylamide gel electrophoresis; ssRNA, single‐stranded RNA; WHD, winged‐helix domain.

WHDs have been observed to interact with both DNA (Gajiwala & Burley, 2000) and RNA (Alfano et al, 2004), as do OB motifs (Theobald et al, 2003), suggesting that in Prp43p they could have a similar function. The conserved loop connecting β‐strands 4 and 5 of the OB motif (residues 701–704) neighbours the 5′HP (supplementary Fig S4A,B online) and is therefore located close to where the 5′ end of ssRNA bound to the RecA helicase domains is expected to be. The equivalent loop in AspRS interacts with the anticodon of transfer RNA (RCSB entry 1ASY). Nearby, in the ratchet domain, helix four and the preceding loop (residues 557–569) contain a highly conserved, positively charged surface patch (supplementary Fig S4B online), and comparison with DNA‐bound Hel308 (Buttner et al, 2007) suggests that this region could interact with RNA.

Insertion of the 5′HP between the CTD and WHD is likely to lock the orientation of RecA‐2 in the ADP state relative to the rest of the molecule. Both the 5′HP and the residues it contacts are in general highly conserved in Prp43p and other DEAH helicases (Fig 1B,C; supplementary Figs S2 and S5 online). We predict that binding of ATP to Prp43p will induce a conformational change releasing the 5′HP from the RNA binding pocket, with the tip of the β‐hairpin moving more than 7 Å (Fig 1D; supplementary Fig S3 online), and will allow binding of ssRNA to the RecA domains. In addition, conformational changes at the interface between RecA‐1 and the WHD and ratchet domain are required for opening of the 3′ end of the occluded RNA binding pocket observed in Prp43p‐ADP. Conversely, binding of ssRNA to the RecA domains probably causes them to adopt the closed conformation, thereby creating an optimal ATP binding site, in agreement with the strong stimulation of Prp43p ATPase activity by ssRNA (Tanaka & Schwer, 2006).

The functional importance of the 5′HP is supported by several observations. First, there is a strong asymmetry in the distribution of conserved surface residues in Prp43p, and the 5′HP is located at the most conserved surface (Fig 1B). Second, the 5′HP interacts functionally with the N‐terminal region of Ntr1p (Tanaka et al, 2007), which significantly stimulates the helicase activity of Prp43p (Tanaka & Schwer, 2006; Pertschy et al, 2009). A similar hairpin in the same location in Hel308 is probably responsible for DNA unwinding (Buttner et al, 2007), suggesting a potential role of the 5′HP in RNA unwinding and/or RNP remodelling catalysed by DEAH helicases. Third, a mutation in the 5′HP of Prp43p (Ser 414) contributes to accumulation of 20S pre‐rRNA during ribosome biogenesis (Pertschy et al, 2009; supplementary Table S2 and Fig S6 online). Hence, the 5′HP might have at least two functions: in the ADP state it probably has an autoinhibitory role, preventing binding of ssRNA to Prp43p, and in the ATP state it possibly participates in strand separation or remodelling of protein–RNA interactions.

A novel binding site for the ADP molecule

An ADP molecule is sandwiched between the two RecA domains (Fig 2). Compared with DEAD‐box and other helicases, the adenine base is stacked at a novel binding site between Phe 357 and Arg 159, which are conserved in all DEAH helicases (Fig 2; supplementary Figs S2 and S5 online). Asp 386 from motif V and Arg 430 from motif VI interact with the ribose, and the conserved Ser 155 and Gln 354 coordinate water molecules to form hydrogen bonds with the adenine, but the base is not specifically recognized as seen in the ATP‐specific DEAD‐box proteins (Fig 2C,D; supplementary Fig S7 online). In agreement with a non‐specific nucleotide binding site, DEAH helicases Prp43p (Tanaka & Schwer, 2006), Prp2p (Kim et al, 1992), Prp16p (Schwer & Guthrie, 1992) and Prp22p (Tanaka & Schwer, 2005) have all been shown to hydrolyse various NTPs and dNTPs. The ADP phosphates are coordinated by interactions with Gly 119, Lys 122 and Thr 123 from motif I and the adjacent Thr 124 involving both backbone and side‐chain interactions (supplementary Fig S7 online). The Mg²⁺ ion is coordinated by four water molecules, a β‐phosphate oxygen and Thr 123. Three of the water molecules coordinating the Mg²⁺ ion are kept in place by Thr 381 and Ser 382 from motif V (supplementary Fig S7 online). Motifs I, II, V and VI interact through contacts between Thr 123–Asp 215, Glu 216–Thr 381 and Arg 430–Gly 119, while Glu 216 and His 218 in motif II form a salt bridge (supplementary Fig S7 online). In the ATP state of DEAH helicases, we expect no significant changes in the adenine binding pocket, whereas the triphosphate moiety probably binds similarly to other superfamily I and II helicases surrounded by motifs I, II and VI (Le Hir & Andersen, 2008).

In summary, our structure defines the structural building blocks present in Prp43p, and based on sequence comparisons, these are also present in the other DEAH helicases. These findings might also be extended to DExH RNA helicases, such as DHX9, that are involved in transcription (Fuller‐Pace, 2006). The structure also reveals an unexpected structural homology of these DEAH helicases to the Hel308 and Hjm DNA helicases and establishes the presence of motifs involved in RNA binding in other proteins, thereby providing straightforward suggestions for regions that are potentially involved in intermolecular interactions. Finally, our structure provides the rationale for the well‐established functional importance of the 5′HP in Prp43p shown by genetic and functional studies of both pre‐mRNA splicing and ribosome biogenesis (Tanaka et al, 2007; Pertschy et al, 2009). Together with the recent identification of regions in pre‐rRNA and snoRNA that can be crosslinked to Prp43p (Bohnsack et al, 2009), our structure paves the way for a rational approach to obtain a detailed molecular understanding of the function of Prp43p. However, several important questions about the mode of action of Prp43p remain, including (i) whether Prp43p is a processive or non‐processive helicase; (ii) whether it unwinds double‐stranded RNA and/or remodels messenger RNP (mRNP); and (iii) how binding partners recruit and localize Prp43p to a specific place for unwinding and mRNP remodelling (Guenther & Jankowsky, 2009).

Methods

RNA pulldown. High Capacity Streptavidin Agarose Resin (Thermo Scientific, Waltham, MA, USA) and 3′ biotinylated pU30 (Dharmacon, Lafayette, CO, USA) were used for RNA pulldown of Prp43p. For each reaction, 20 μl of streptavidin agarose resin was washed three times with 500 μl of binding buffer (20 mM Tris–HCl, pH 7.6, 200 mM NaCl, 5 mM MgCl₂, 0.5 mM dithiothreitol and 10% glycerol), then mixed together with 0.1 nmol of biotinylated polyU, 10 μg Prp43p and, when necessary, 1 mM of ADP or ADPNP in the binding buffer to a final volume of 100 μl. The reaction mixtures were incubated at 4°C overnight. After thorough washing of the resin, Prp43p was eluted by resuspending the resin in 20 μl of 2.5 × SDS–polyacrylamide gel electrophoresis loading buffer and heating for 5 min at 100°C. The bound Prp43p was visualized by Coomassie staining.
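
The quantities in this protocol can be converted to molar terms to check the protein:RNA ratio and the nucleotide excess. This is a rough worked calculation rather than part of the published method: it assumes the 88 kDa mass quoted in the Introduction applies to the protein preparation used here (any purification tag would change it slightly), and the function names are hypothetical.

```python
def protein_nmol(mass_ug, mw_kda):
    """Amount in nmol; 1 ug of a 1 kDa species corresponds to 1 nmol."""
    return mass_ug / mw_kda


def conc_um(amount_nmol, volume_ul):
    """Concentration in uM; nmol per ul equals mM, hence the factor 1000."""
    return amount_nmol / volume_ul * 1000.0


# Values from the protocol; 88 kDa is an assumed mass for the construct.
PRP43_NMOL = protein_nmol(10.0, 88.0)   # 10 ug of Prp43p
PRP43_UM = conc_um(PRP43_NMOL, 100.0)   # in a 100 ul reaction
RNA_UM = conc_um(0.1, 100.0)            # 0.1 nmol biotinylated polyU
NUCLEOTIDE_UM = 1000.0                  # 1 mM ADP or ADPNP

if __name__ == "__main__":
    print(f"Prp43p: {PRP43_NMOL:.3f} nmol, {PRP43_UM:.2f} uM")
    print(f"RNA: {RNA_UM:.2f} uM")
    print(f"protein:RNA ratio: {PRP43_UM / RNA_UM:.2f}")
    print(f"nucleotide excess over protein: {NUCLEOTIDE_UM / PRP43_UM:.0f}-fold")
```

Under these assumptions the reaction contains roughly equimolar protein and RNA (about 1.1 μM and 1 μM), with nucleotide in several‐hundred‐fold excess over both.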

Crystallization and structure determination. Needle‐shaped crystals with maximum dimensions of 500 × 20 × 20 μm³ of the Prp43p–ADP complex were grown in hanging drops by vapour diffusion at 4°C against reservoirs containing 100 mM Mes‐NaOH, pH 6.75, 14% PEG8000, 300 mM NaOAc and 100 mM BaCl₂. The crystals were transferred to a cryoprotection buffer (100 mM Mes‐NaOH, pH 6.75, 200 mM NaCl, 15% PEG8000, 20% glycerol) and flash frozen in liquid nitrogen. Native data were recorded at 100 K at the PXI beamline of the Swiss Light Source (Villigen, Switzerland), and selenomethionine derivative data were collected at the I911‐3 beamline of MAX‐lab (Lund, Sweden). Diffraction data were processed and scaled by using the XDS package (Kabsch, 2001). SHELXD (Sheldrick, 2008) was used to identify 34 of 36 possible selenium sites from the anomalous data with a correlation coefficient of 0.54 for the resolution range 50–4 Å. Phasing and initial model building were done using the AutoSol and Resolve procedures as implemented in PHENIX (Adams et al, 2002). Iteratively improved solvent flattened and averaged maps starting from combined phases (experimental phases and current model phases) were obtained with CNS (Brunger et al, 1998) until the model was refined to an Rfree of 35%. Subsequently, all rebuilding was done in averaged 2mFo−DFc maps. Manual rebuilding was done in O (Jones et al, 1991) and COOT (Emsley & Cowtan, 2004), after which the model was refined with PHENIX using tight non‐crystallographic symmetry restraints on the two copies of Prp43p‐ADP present in the asymmetric unit. Refinement without non‐crystallographic symmetry restraints resulted in an increase in Rfree and was therefore abandoned. In the final model, 96.28% of the residues fell into the favoured regions of the Ramachandran plot and there were no outliers. 
Structure analysis was done with MolProbity (Davis et al, 2004), DynDom (Hayward & Berendsen, 1998) and PISA (Krissinel & Henrick, 2007), and figures with PyMOL (DeLano, 2002) or ALINE (Bond & Schuttelkopf, 2009).

Conflict of Interest

Supplementary Information

Acknowledgements

We are grateful to the staff at MAX‐lab, European Synchrotron Radiation Facility and Swiss Light Source for their help with data collection, Lars Sottrup‐Jensen for amino acid analysis, and Mickaël Blaise, Laure Yatime and Christian B.F. Andersen for their help with data processing. K.H.N. was supported by the Alfred Benzon Foundation. G.R.A. was supported by the Danish Science Research Council (FNU), the Danish National Research Foundation and a Hallas‐Møller stipend from the Novo‐Nordisk Foundation.