Abstract

Herpesviruses possess a genome-pressurized capsid. The 235-kilobase genome of human cytomegalovirus (HCMV) is by far the largest of any herpesvirus, yet it has been unclear how its capsid, which is similar in size to those of other herpesviruses, is stabilized. Here we report a HCMV atomic structure consisting of the herpesvirus-conserved capsid proteins MCP, Tri1, Tri2, and SCP and the HCMV-specific tegument protein pp150—totaling ~4000 molecules and 62 different conformers. MCPs manifest as a complex of insertions around a bacteriophage HK97 gp5–like domain, which gives rise to three classes of capsid floor–defining interactions; triplexes, composed of two “embracing” Tri2 conformers and a “third-wheeling” Tri1, fasten the capsid floor. HCMV-specific strategies include using hexon channels to accommodate the genome and pp150 helix bundles to secure the capsid via cysteine tetrad–to-SCP interactions. Our structure should inform rational design of countermeasures against HCMV, other herpesviruses, and even HIV/AIDS.

Human cytomegalovirus (HCMV), a member of the Herpesviridae family and β-herpesvirinae subfamily, is a leading cause of congenital defects (1, 2) and a major contributor to life-threatening complications in immunocompromised individuals such as AIDS (3, 4) and organ-transplant patients (5, 6). Yet HCMV’s ability to establish relatively nontoxic lifelong latency in hosts (7), its high seroprevalence (>90% in some populations) (8), and its large genetic capacity (9) are characteristics shared among herpesviruses that together give them an advantage over other viral vectors as tools for the development of gene delivery vehicles (10), oncolytic vectors (11), and vaccines against not just herpesviruses, but even HIV/AIDS (12, 13).

HCMV is a double-stranded DNA (dsDNA) virus whose 235-kb genome is twice the size of that of the chickenpox-causing varicella zoster virus and >50% larger than that of the cold sore–causing herpes simplex virus 1 (HSV-1) (14), two common human α-herpesviruses. Despite enclosing a much larger genome, the HCMV capsid is similar in size to that of HSV-1 (as well as those of other herpesviruses) (15); both have an icosahedrally ordered nucleocapsid with triangulation number (T) = 16, composed of 12 pentons, 150 hexons, and 320 triplexes. Thus, the high capsid pressure in HSV-1 (up to 20 atm) (16) resulting from tightly packed, electrostatically repulsive genomic material must be a condition further exacerbated in HCMV. Evidence from biochemical and structural studies suggests that the β-herpesvirus–specific tegument protein pp150 contributes to a netlike tegument density layer enclosing, and therefore perhaps stabilizing, the C-capsid to facilitate the formation of infectious virions (17–19).

Nonetheless, the exact mechanism through which capsid stability is achieved has remained unclear in the absence of an atomic description of HCMV particles. At more than 2000 Å in diameter, the sheer size of herpesviruses and the potential fragility associated with such large molecular assemblies present tremendous obstacles for high-resolution reconstructions (20, 21). Despite recent successes in high-resolution studies of macromolecular complexes (22–24) and smaller viruses (25–27), the highest-resolution reconstructions of α-, β-, and γ-herpesvirus particles have so far been 6.8 (28), 6 (19), and 6 Å (29), respectively—none of which are adequate for reliable de novo atomic modeling of the viral capsid. In this study, by using an improved sample preparation strategy and electron-counting cryo–electron microscopy (cryoEM), we obtained a three-dimensional (3D) reconstruction of HCMV at 3.9-Å resolution and derived atomic models for the capsid and tegument protein pp150.

Two types of particles can be identified in the resulting images: DNA-containing and DNA-devoid, corresponding to virions and noninfectious enveloped particles (NIEPs), respectively (fig. S1). A comparison of reconstructions of these two particle types at 15-Å resolution reveals concentrically packaged dsDNA manifesting as multiple concentric density shells contoured against the inner surface of the capsid (Fig. 1A). Consistent with its markedly larger genome compared with all other human herpesviruses, HCMV’s DNA layers are more densely packed, having a ~23-Å interlayer distance versus the 25- to 26-Å spacing in HSV-1 (30) and Kaposi’s sarcoma–associated herpesvirus (KSHV, a γ-herpesvirus) (29). Unexpectedly, DNA density was observed to contour even the narrow interior confines of the capsid’s hexon channels (Fig. 1A, inset)—a feature not observed in HSV-1 or KSHV, and perhaps indicative of the space premium and massive pressures within the HCMV capsid.

Next, we sought to achieve a high-resolution structure of the tegument-coated HCMV capsid. We first obtained a 4.5-Å 3D reconstruction (fig. S2, A to C) by combining 60,000 particle images collected from more than 3800 photographic films. However, we deemed the degree of confidence and coverage with which we could accurately build an atomic structure to be insufficient for our goals. We subsequently used direct electron-counting techniques (then newly introduced) to record 12,000 movies and obtained a new reconstruction at 3.9-Å resolution by combining 39,600 particle images from this data set (Fig. 1B, fig. S2C, and Movie 1). The resulting density map features well-resolved side chains consistent with this resolution (Fig. 1C and figs. S3 to S7).

We then derived atomic models for all four capsid proteins that together make up the HCMV capsid shell and for the N-terminal one-third of pp150 (pp150nt), a capsid-associated tegument protein. The HCMV capsid structure is an assembly of 60 asymmetric units. An asymmetric unit contains 16 copies of major capsid protein (MCP), which account for the bulk of the proteinaceous capsid and exist in penton and hexon capsomers of five and six subunits, respectively. Hexons further exist in three subtypes, designated C (center), P (peripentonal), and E (edge) in reference to their positions relative to the capsid’s icosahedral symmetry. Each asymmetric unit contains a C hexon, P hexon, one-half of an E hexon, and one-fifth of a penton. Additionally, an asymmetric unit contains 16 copies of smallest capsid protein (SCP) that sit atop each MCP; five and one-third heterotrimeric triplexes (Ta, Tb, Tc, Td, Te, and one-third of Tf, located at the threefold axis), each composed of a triplex monomer protein Tri1 (also known as minor capsid binding protein) coupled with two dimer-forming conformers of triplex dimer protein Tri2 known as Tri2A and Tri2B (also known as minor capsid proteins); and 16 copies of pp150 molecules that cluster in groups of three above each triplex (Fig. 1B). Our atomic models accounted for all 16 conformers of MCP in an asymmetric unit, 16 conformers of SCP, five conformers of Tri1, 10 conformers of Tri2 divided between Tri2A and Tri2B, and 15 conformers of pp150nt (Fig. 1D). We were unable to model triplex Tf and its three associated molecules of pp150, because icosahedral averaging destroys the region’s density as a result of symmetric mismatching at threefold axes.
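The copy numbers above can be cross-checked with a short bookkeeping sketch. It uses only counts quoted in the text (60 asymmetric units, 12 pentons, 150 hexons, 320 triplexes, one SCP per MCP, three pp150 per triplex); the variable names are illustrative and not part of the original analysis.

```python
from fractions import Fraction

ASYMMETRIC_UNITS = 60   # icosahedral capsid
T = 16                  # triangulation number quoted in the text

# Whole-capsid counts quoted in the text.
pentons, hexons, triplexes = 12, 150, 320

# Standard relation for icosahedral capsids: 12 pentons plus 10*(T - 1) hexons.
assert hexons == 10 * (T - 1)

# MCP subunits: five per penton, six per hexon.
mcp_total = pentons * 5 + hexons * 6
mcp_per_unit = Fraction(mcp_total, ASYMMETRIC_UNITS)

# Triplexes are shared across threefold axes, hence a fractional count per unit.
triplexes_per_unit = Fraction(triplexes, ASYMMETRIC_UNITS)
triplex_chains_per_unit = triplexes_per_unit * 3   # one Tri1 + two Tri2 each

# One SCP sits on each MCP; three pp150 molecules cluster on each triplex.
scp_per_unit = mcp_per_unit
pp150_per_unit = triplexes_per_unit * 3

molecules_total = ASYMMETRIC_UNITS * (
    mcp_per_unit + scp_per_unit + triplex_chains_per_unit + pp150_per_unit
)

# Conformers actually modeled (triplex Tf and its pp150 copies were not built).
modeled_conformers = {"MCP": 16, "SCP": 16, "Tri1": 5, "Tri2": 10, "pp150nt": 15}
conformers_total = sum(modeled_conformers.values())

print(f"MCP per asymmetric unit:       {mcp_per_unit}")
print(f"Triplexes per asymmetric unit: {triplexes_per_unit}")
print(f"Total molecules in the shell:  {int(molecules_total)}")
print(f"Modeled conformers:            {conformers_total}")
```

Under these counts the shell contains 3840 protein molecules, consistent with the "~4000 molecules" quoted in the abstract, and the modeled conformers sum to 62.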

MCP structure and tower features

At 1370 amino acids in length, MCPs are enormous proteins. Each MCP is folded into seven distinct domains: upper (residues 481 to 1031), channel (398 to 480 and 1322 to 1370), buttress (1107 to 1321), helix-hairpin (190 to 233), dimerization (291 to 362), N-lasso (1 to 59), and Johnson fold (60 to 189, 234 to 290, 363 to 397, and 1032 to 1106) (Fig. 2A and Movie 2). The lattermost domain is named after a characteristic fold first identified in bacteriophage HK97 gp5 (31), later found in the major capsid proteins of other DNA bacteriophages (32, 33) and herpesviruses (34), and now modeled atomically in a herpesvirus in our HCMV MCP model (Fig. 2B).
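As a consistency check on the domain boundaries listed above, the following sketch encodes each domain as a list of inclusive residue ranges and verifies that the segments tile residues 1 to 1370 without gaps or overlaps. The dictionary layout and function names are illustrative assumptions; only the residue numbers come from the text.

```python
MCP_LENGTH = 1370

# Inclusive residue ranges for each MCP domain, as listed in the text.
MCP_DOMAINS = {
    "N-lasso": [(1, 59)],
    "Johnson fold": [(60, 189), (234, 290), (363, 397), (1032, 1106)],
    "helix-hairpin": [(190, 233)],
    "dimerization": [(291, 362)],
    "channel": [(398, 480), (1322, 1370)],
    "upper": [(481, 1031)],
    "buttress": [(1107, 1321)],
}


def segment_length(segment):
    """Number of residues in an inclusive (start, end) range."""
    start, end = segment
    return end - start + 1


def tiles_sequence(domains, length):
    """Return True if the segments cover 1..length exactly once."""
    segments = sorted(seg for segs in domains.values() for seg in segs)
    expected_start = 1
    for start, end in segments:
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == length + 1


domain_lengths = {
    name: sum(segment_length(seg) for seg in segs)
    for name, segs in MCP_DOMAINS.items()
}

for name, n_residues in sorted(domain_lengths.items(), key=lambda kv: -kv[1]):
    print(f"{name:14s} {n_residues:5d} residues")
print("Segments tile the full sequence:", tiles_sequence(MCP_DOMAINS, MCP_LENGTH))
```

The Johnson fold is split into four segments because the other domains are inserted into it, so a list of ranges per domain is a more natural representation than a single start and end.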

A prominent feature of viruses that use Johnson folds is a five-stranded β-core within the fold that serves as an organizational hub of the major capsid protein (31, 33) (Fig. 2B, inset, and fig. S8, A to F). With the exception of the N-lasso domain (itself an extension of the Johnson fold domain’s N element), all other domains of the MCP can be understood as simply modular insertions into the peripheral loops of the Johnson fold β-core (Fig. 2C). The MCP domains can further be organized into two regions: a tower region composed of the upper, channel, buttress, and helix-hairpin domains, which makes up the penton and hexon capsomer protrusions, and a floor region composed of the N-lasso, dimerization, and Johnson fold domains.

Extensive interactions exist between the tower regions of adjacent MCPs in a capsomer (Fig. 2D). Within the upper domain, the bulk of interactions occur at the interface of a long, exposed helix and a loop region from its neighboring MCP’s upper domain (Fig. 2D, magenta box), corroborating observations from the crystal structure of the HSV-1 MCP (VP5) upper domain (referred to as VP5ud), the only other herpesvirus capsid atomic structure in existence (35). A structural alignment of the VP5ud model to our HCMV MCP model’s upper domain reveals the two to be highly similar in secondary and tertiary structure, with an abundance of helices in conserved orientations. Differences are more pronounced in loop regions, in particular at the top of the upper domains where SCP binding occurs (Fig. 2D, insets).

Descending the tower, the channel domain presents an interesting arrangement in hexons, where six β-strands from the channel domain of one MCP are augmented by a β-strand (β22) from an adjacent MCP’s channel domain to form a seven-stranded β-sheet angled toward the interior of the hexon channel (Fig. 3A). Together, six such β-sheets come together to form a β-sheet ring, connected by six constricting loops that descend from each β-sheet, constituting the narrowest region of the hexon channel (Fig. 3B). This “daisy chain” arrangement is possible because each constricting loop is flanked by β21 and the augmenting β22, which participates in an adjacent β-sheet. We presume that this constricted region, with an internal diameter measured at ~12 Å, is responsible for preventing DNA forced into the lower hexon channels from escaping the capsid altogether.

An electrostatic surface rendering of residues lining the hexon channel further reveals a substantial difference in the properties of residues above and below the channel’s constricted region (Fig. 3C). Residues above the constriction are demonstrably more negative in charge (Fig. 3D) than residues below the constriction, which overall tend to be positively charged (Fig. 3E). This makes sense because the lower hexon channel is the DNA-accommodating region, and positive residues should accommodate the packing of negatively charged DNA better than negative residues, which would only serve to hinder packing through repulsion. These results support our interpretation of DNA-packed hexon channels as a means for HCMV to simultaneously package its large genome and alleviate extreme capsid pressure, thereby improving capsid stability.

Capsid floor–defining MCP-MCP interactions

Within the floor region, three notable types of MCP-MCP interactions contribute to the structural integrity of the HCMV capsid shell (Fig. 4A). Type I interactions are intracapsomer interactions that occur between adjacent MCPs within a capsomer and are dominated by two sets of β-sheet augmentations. An archetypal type I interaction is illustrated between C4 and C5 MCPs in Fig. 4B. β-strands from C5’s dimerization domain (β16) and the E-loop of its Johnson fold domain (β6 and β8) are joined by two β-strands from C4’s N-lasso domain (β3 and β4) to form a five-stranded β-sheet. Similarly, a β-strand from C5’s E-loop (β7) joins three β-strands from C4’s E-loop (β9) and Johnson fold domain’s P-subdomain (β38 and β39) to form a four-stranded β-sheet. Conversely, type II interactions are intercapsomer dimerization interactions that feature quasi-equivalent interactions between two pairs of helices in the dimerization domains of MCPs across local twofold axes. Type II interactions are illustrated by E2 and C5 MCPs (Fig. 4C).

Unlike type I and II interactions, type III interactions occur among three MCPs and are characterized by the lassoing action of the N-lasso domain, which extends out and lashes around an E-loop and an N-lasso neck from two MCPs located diagonally across a local twofold axis. Three sets of N-lasso interactions form an enclosed triangle around local threefold axes, creating a “lasso triangle.” E1, C4, and C5 MCPs illustrate a type III N-lasso interaction, and E1, C5, and P3 MCPs illustrate the lasso triangle (Fig. 4D). E1’s N-lasso encircles C5’s E-loop and the neck of C4’s N-lasso, both of which also participate in type I interactions. In addition to lashing these two elements, E1’s N-lasso also contributes two β-strands (β1 and β2) that augment the existing five-stranded β-sheet from C4 and C5’s type I interaction to form a seven-stranded β-sheet complex (Fig. 4D, lower inset). Thus, type III interactions build on and likely lend increased stability to type I intracapsomer interactions. Lastly, a small helix bundle formed from E1 N-lasso’s helix H2, two helices from C5’s helix-hairpin domain (H6 and H7), and a helix from C5’s buttress domain (H49) further secure the E1 N-lasso in place.

MCP adaptations at the fivefold axis

MCPs adjust to variations in capsid geometry at fivefold vertices by adopting conformational changes. A comparison of hexon and penton MCPs reveals a contracted and rotated penton MCP floor region relative to that of hexon MCP, whereas their tower regions are highly similar (Fig. 5A). Pentons thus have a more “closed umbrella” shape in comparison with hexons and accordingly have a smaller diameter but greater height (Fig. 5B).

In addition to differences in domain orientation, penton MCPs also exhibit several notable local conformation changes. Specifically, the penton MCP N-lasso adopts an “open” configuration, effectively eliminating its lassoing ability (Fig. 5C, magenta insets), whereas the dimerization domain exists in an extended configuration that differs from the compact fold of its canonical hexon counterpart (Fig. 5C, green insets). As such, the penton MCP neither lashes an adjacent P1 (hexon) MCP, nor does it establish quasi-equivalent interactions with a (different) P1 MCP’s dimerization domain (which itself also adopts a distinctive configuration). The P6 (hexon) MCP, whose N-lasso would normally lash a penton MCP in a traditional lasso triangle, adopts an open N-lasso as well. Thus, pentons neither lash adjacent hexons, nor are they lashed by adjacent hexons, nor do they participate in dimerization interactions with surrounding hexons (Fig. 5D). This lack of type II and III interactions at pentons likely accounts for the documented vulnerability of pentonal vertices in herpesvirus capsids (36, 37).

Lastly, in the absence of a canonical P6 N-lasso and type III interactions to stabilize the penton MCP floor’s type I β-sheet complex, an element that we term the “buttress support” from MCP’s buttress domain provides reinforcement. In hexon MCPs, the buttress support consists of two helices (H48 and H49) and a flexible loop region, and it acts as a flexed “elbow” resting on a neighboring MCP’s N-lasso (Fig. 5E, left panel), so that it serves as both a clamp for the N-lasso and a structural support for the MCP tower. In penton MCPs, the elbow is extended as the two helices combine into a long helix (H49) that reaches down to the floor of the penton MCP (Fig. 5E, right panel), where it contributes β37 to type I β-sheet interactions. There, β37’s augmentation with the penton MCP floor enhances type I β-sheet interactions in the absence of a lashing canonical N-lasso from P6 and simultaneously cements the lower end of the buttress support to the MCP floor, thereby facilitating additional structural propping of the penton tower.

Triplex structure and roles in capsid architecture

Triplexes are heterotrimers consisting of two dimer-forming Tri2 protein conformers, Tri2A and Tri2B, coupled with a Tri1 protein (Movie 3). Triplexes sit atop the MCP N-lasso triangle (Fig. 6A) and play an important role in the capsid’s architecture by plugging the large voids in the capsid floor between capsomers.

Both conformers of Tri2 exhibit three domains: clamp (residues 1 to 88), trunk (89 to 187 and 286 to 306), and embracing arm (188 to 285), which are named to reflect their structural roles in the triplex (Fig. 1D). Clamp domains are the primary elements through which the Tri2 dimer establishes contact with the floor regions of adjacent capsomers (Fig. 6B). In doing so, the Tri2 dimer reinforces MCP-MCP interactions in the vicinity of the lasso triangle. Trunk domains are structural elements whose key role is to support the helix-laden embracing arm domains, through which Tri2A and Tri2B “embrace” each other to form a large helix bundle that clasps the Tri2 dimer together (Fig. 6C). Superimposing Tri2A and Tri2B reveals that their clamp and trunk domains are nearly identical, whereas their embracing arms exist in markedly different spatial configurations (Fig. 6D).

Tri1 similarly has three domains, including an N-anchor (residues 1 to 44), trunk (45 to 168), and “third-wheel” (169 to 290) domain (Figs. 1D and 6E). But whereas Tri2A and Tri2B have clamp domains that interact extensively with the floor regions of surrounding MCPs, the bulk of Tri1 exhibits considerably less contact with the capsid floor. Instead, Tri1 functions largely as a latch-and-anchor protein that secures the Tri2 dimer in place. In keeping with our imagery, Tri1’s third-wheel domain wedges its helices into the helix bundle formed from Tri2 dimer’s embracing arms, thereby latching Tri1 to the dimer (Fig. 6E, green perspective). Meanwhile, Tri1’s N-anchor penetrates the capsid floor and extends along the inner surface of the capsid shell, anchoring Tri1 and, by extension, the entire triplex to the capsid floor (Fig. 6E, red and blue perspectives). Lastly, the extensive triplex-MCP interactions at the capsid floor are complemented topside by three helix-containing buttress arms that extend from the buttress domains of three adjacent MCPs (Fig. 6A, insets). These rest on the triplex, further clamping the triplex in place while creating an additional point of support for their respective MCP towers.

In the greater context of the capsid’s global structure, the triplex forms the centerpiece of the lock unit, which is a conceptual means of encompassing the complete set of intricate molecular interactions found within the HCMV capsid. Each lock unit is composed of six MCPs from three different capsomers and features three pairs of type I interactions, three pairs of type II interactions, and one lasso triangle—all organized around a triplex that not only reinforces MCP-MCP interactions about the lasso triangle, but also plugs what would otherwise be a large perforation in the capsid floor at local threefold axes (fig. S9A and Movie 4). Six lock units interact in overlapping fashion to constitute a group of six (GOS), at the center of which sits a hexon (fig. S9B). Because each MCP is included in two lock units and directly interacts with four lock units including its own, each lock unit interacts with all lock units in its GOS except the far opposite unit. GOSs overlap so that each lock unit takes part in the makeup of three GOSs, allowing an individual lock unit to interact directly with nine distinct lock units (fig. S9C). From a global perspective, lock units are a capsid organizational schema that helps illustrate the highly interwoven nature of the structural proteins that come together to constitute the HCMV capsid.

SCP structure and interactions with MCP

The 75–amino acid UL48.5 gene product known as the SCP is the smallest of the HCMV capsid proteins and also the smallest of all its functional homologs in human herpesviruses. Unlike in HSV-1, where SCPs bind only hexon MCPs (38), SCPs in HCMV bind both penton and hexon MCPs so that each MCP is bound by exactly one SCP. Both penton and hexon SCPs exhibit nearly identical globular structures, and both sit atop MCP upper domains at the outer circumference of their respective capsomers, exhibiting similar interactions with their underlying MCPs (Fig. 7A).

Our modeling efforts successfully resolved 63 of the 75 amino acids of SCP (residues 13 to 75), revealing a structure characterized by three helices connected by short loops (Fig. 7B). In all SCP copies in our density map, the density for residues 1 to 12 degrades from highly disordered to completely invisible toward the N terminus, suggesting that this fragment is inherently flexible—and thus not resolved in cryoEM structures obtained by averaging tens of thousands of individual viral particles. Indeed, a previous mutagenesis study of HCMV SCP found that the deletion of residues 1 to 11 had no effect on SCP binding to MCP (39). The same study also found residues 56 to 75 to be required for SCP-to-MCP binding. This is consistent with our results, which show that SCP’s H3 (residues 57 to 72) and C-terminal loop (residues 73 to 75) serve as the main interacting residues between SCP and MCP, inserting into a shallow cleft in the MCP upper domain (Fig. 7B, green box).

Tegumental pp150 structure and capsid binding

Three pp150 molecules—conformers a, b, and c—cluster on each triplex and extend toward the top of three nearby MCPs, contributing to the netlike layer of tegument densities that enmesh HCMV capsids (Fig. 7C). The atomic model constructed for pp150nt (the N-terminal one-third of pp150, residues 1 to 285) confirms that it is predominantly helical (19), with a series of roughly parallel helices arranged in upper and lower bundles joined by a central long helix (Fig. 7D). The remainder of pp150 was invisible in our reconstructed density map, again suggesting that the region exhibits flexibility and/or lacks a fixed orientation, which is consistent with data showing that pp150’s N-terminal residues 1 to 275 alone are sufficient for pp150-to-capsid binding (17). Several conserved regions in the N-terminal 275 residues have also been identified, including a 27–amino acid cysteine tetrad conserved across all primate CMVs and two regions known as conserved regions 1 and 2 (CR1 and CR2) that are conserved among β-herpesviruses (17). Our model reveals the cysteine tetrad and CR1 to be in pp150nt’s upper helix bundle, whereas CR2 is in the lower helix bundle (Fig. 7D and fig. S10, A and B).

Interactions between pp150nt and the capsid occur at pp150nt’s upper and lower ends and reveal how pp150 works to secure the HCMV capsid (Fig. 7E and Movie 5). At the upper end, all three conformers of pp150nt rest primarily on the SCPs of nearby MCPs in identical fashion through a maintained cysteine tetrad–to-SCP interaction (Fig. 7E, right insets). In contrast, interactions that cement the lower end of pp150nt to the capsid are far less specific and differ among conformers, as necessitated by the fact that pp150’s underlying triplex does not exhibit perfect threefold symmetry. At most triplexes, the lower end of pp150nt conformer a (pp150nt-a) interacts with Tri2A, Tri2B, and the side of an adjacent MCP’s upper domain (Fig. 7E, left insets), whereas the lower ends of pp150nt-b and pp150nt-c interact exclusively with Tri1 and Tri2B (Fig. 7F) and Tri2A and Tri2B (Fig. 7G), respectively. Consistent with the higher degree of interaction specificity at the upper end, pp150nt’s upper helix bundle contains more conserved elements in CR1 and the cysteine tetrad and exhibits greater structural uniformity across the three pp150nt conformers than the lower helix bundle (fig. S10C). Interestingly, although CR2 is in the lower helix bundle, its location is on the central long helix that forms part of the upper helix bundle and on which 18 residues of the cysteine tetrad are also found.

Although the lock unit is a concept readily applicable throughout the herpesviruses, pp150 in HCMV seems to be an adaptation that specifically allows HCMV to cope with the pressure of its exceptionally large genome. Whereas other human herpesviruses exhibit auxiliary tegument proteins that bind exclusively to pentons and peripentonal triplexes [UL17 and UL25 in HSV-1 (30, 40, 41) and ORF32 and ORF19 in KSHV (42)], pp150 is globally bound to all capsomers and triplexes in HCMV. We posit this as a consequence of the fact that HSV-1 and KSHV, possessing high capsid pressures but smaller genomes, can manage with structural reinforcements limited to their pentonal vertices, which we demonstrated lack both type II dimerization and type III N-lasso lashing interactions. Indeed, atomic force microscopy studies of HSV-1 have shown that UL25 binding at pentons considerably increases the mechanical stiffness of the capsid (43). But the vastly greater pressures in HCMV that result from a similar-sized capsid containing a substantially larger genome occupying every last cubic angstrom of space (as evidenced by smaller DNA interlayer distances and DNA-filled hexon channels) require more robust methods to stabilize the capsid than simply penton reinforcement, necessitating the recruitment of pp150 at hexons as well (Fig. 8, A to D). Our structures reveal the specific nature of SCP’s role in pp150’s recruitment, illustrating how pp150nt binds capsomer protrusions through a well-defined cysteine tetrad–to-SCP interaction, which incidentally accounts for why pp150 loss-of-function HCMV mutants can be rescued by pp150 from primate CMV (in which the cysteine tetrad is conserved), but not from nonprimate CMV (44). Thus, the strengthening of the DNA-containing capsid by pp150 relies on its interaction with a mediator protein that has no apparent structural role itself and is the least conserved capsid protein across Herpesviridae subfamilies: the 8-kDa HCMV SCP.

Discussion

Like herpesviruses, dsDNA bacteriophages contain tightly packaged genomes and are highly pressurized, often with capsid internal pressures in the tens of atmospheres (45, 46). Both classes of virus inject their genomes into host cells by using a pressure-driven DNA ejection strategy, despite likely billions of years of evolution separating eukaryotic viruses and bacteriophages (16). Likewise, maintaining capsid structural integrity under such pressured conditions is a common challenge faced by herpesviruses and dsDNA bacteriophages, and the solution is necessarily an architectural one. Since the Johnson fold was first discovered in bacteriophage HK97 (31), many dsDNA viruses have been found to use the fold as a core structural motif, though often with elaborations—presumably to enhance capsid stability—that seem to correlate in complexity with the physical size and organizational complexity of the virus (47). In some viruses, these elaborations manifest as domains that insert into the Johnson fold, as exemplified by phage P22 (48). Others, such as lambda phage, recruit an auxiliary protein to stabilize the capsid (49). Among viruses whose atomic structures are known, HK97 represents perhaps the most optimized solution, achieving a highly stable capsid through covalent bonds cross-linking a simple MCP that is merely 282 amino acids in length (31). The HCMV structure presented here contributes an example at the other extreme: a 1300-Å-diameter capsid with T = 16 icosahedral symmetry that uses (i) an enormous 1370–amino acid MCP consisting of six domain insertions elaborating the archetypal Johnson fold to establish the capsid’s basic chassis, (ii) auxiliary triplex heterotrimers to stabilize its MCP floor, and (iii) a network of helix bundles from the auxiliary tegument protein pp150 to further secure its outmost regions.

Beyond providing a framework to understand mechanisms of capsid stabilization in the large family of herpesviruses in general and HCMV in particular, the HCMV atomic structures presented here also hold promise for the rational design of therapeutic strategies. Since the 1970s, tremendous efforts have been invested in the development of live-attenuated HCMV vaccines, with little success. Our structure should allow a more precise, structure-based mutagenesis approach in developing effective live-attenuated mutants suitable for vaccination against HCMV and potentially other herpesviruses. Additionally, recent studies have shown rhesus cytomegalovirus to be effective as a persistent vector in not just controlling but clearing simian immunodeficiency virus in rhesus macaques (12, 13). These ground-breaking advances have inspired researchers to develop vaccines against HIV by using an analogous strategy. Because wild-type HCMV cannot be used as a vector for HIV vaccine development given its virulence, the central issue of this approach is the construction of appropriate live-attenuated HCMV vectors. The HCMV atomic model should prove invaluable in this effort. These findings thus will have profound impacts on the development of new strategies for therapeutic intervention against both HCMV and HIV infections.

Materials and methods

Sample preparation

Human fibroblast MRC-5 cells were grown in Dulbecco’s Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum (FBS). Cells were infected with HCMV strain AD169 at a multiplicity of infection (MOI) of 0.1 to 0.5 when cells reached ~80 to 100% confluence. At 7 days postinfection, with roughly 80% of the cells lysed, the culture medium was collected and centrifuged at 10,000g for 12 min to remove cell debris. The supernatant was collected and then centrifuged at 80,000g for 1 hour to pellet HCMV particles. The pellet was resuspended in phosphate buffered saline (10 mM PBS, pH 7.4) and further purified by centrifugation through a 15 to 50% (w/v) sucrose gradient at 100,000g for 1 hour. The light-scattering band of virus particles was collected, diluted with PBS, and then pelleted at 80,000g for another hour.

HCMV particles (virions and NIEPs) have a pleomorphic viral envelope and a particle diameter ranging from 2000 to 3000 Å, which presents a problem in obtaining thin enough samples when preparing cryoEM grids (see Ewald sphere curvature “depth of focus” problem below). Since the pleomorphic envelope and the majority of nonordered tegument proteins it envelops manifest as noise and increased thickness in our single-particle cryoEM reconstruction of the nucleocapsid, it is desirable to remove them before preparing the cryoEM grids. Thus, to reduce both noise and viral particle size while still maintaining nucleocapsid integrity, we added NP-40 detergent at a 1% final concentration to purified intact HCMV particles to partially solubilize the viral envelope (fig. S1). Immediately after, aliquots of 2 μl of this treated sample were transferred to Quantifoil grids (2/1), which had previously been baked overnight by exposure to a strong electron beam. The grids were then blotted for 20 s in an FEI vitrobot with 100% humidity and plunged into liquid ethane. Sample-containing grids were subsequently kept in liquid nitrogen storage.

CryoEM imaging

CryoEM imaging was performed with an FEI Titan Krios electron microscope operated at 300 kV and liquid nitrogen temperature, using the image acquisition software Leginon (50, 51). Before cryoEM data collection, the electron microscope was carefully aligned to minimize beam tilt with coma-free alignment. Note that our project was begun in 2011, when direct electron-counting technology was not yet commercially available. For this reason, cryoEM data recorded on both photographic films and a direct electron detector were used in our effort to obtain a reconstruction of sufficient resolution for atomic model building.

The initial set of cryoEM images was recorded on Kodak SO163 films with a dosage of ~25 e−/Å² at 47,000× nominal magnification and defocus values between 2.0 and 2.5 μm. A total of 3800 films were recorded and digitized using Nikon Super CoolScan 9000 ED scanners at 6.35 μm per pixel (corresponding to 1.351 Å per pixel at the sample level at the nominal magnification). Magnification was then calibrated using a catalase crystal sample, giving a specimen pixel size of 1.39 Å per pixel. Despite tremendous effort and time invested, we were only able to push the resolution to 4.5 Å from this initial set of film data.
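The relationship between scanner step size, magnification, and specimen pixel size for the film data can be sketched as follows. The function names are hypothetical; the inputs (6.35 μm scanner step, 47,000× nominal magnification, 1.39 Å calibrated pixel size) are the values quoted above.

```python
ANGSTROM_PER_MICRON = 1.0e4


def specimen_pixel_size(detector_step_um, magnification):
    """Pixel size at the specimen (in angstroms) for a given detector step."""
    return detector_step_um * ANGSTROM_PER_MICRON / magnification


def implied_magnification(detector_step_um, pixel_size_angstrom):
    """Magnification implied by a calibrated specimen pixel size."""
    return detector_step_um * ANGSTROM_PER_MICRON / pixel_size_angstrom


SCANNER_STEP_UM = 6.35
nominal_pixel = specimen_pixel_size(SCANNER_STEP_UM, 47_000)
calibrated_mag = implied_magnification(SCANNER_STEP_UM, 1.39)

print(f"Pixel size at nominal magnification:   {nominal_pixel:.3f} A")
print(f"Magnification implied by 1.39 A/pixel: {calibrated_mag:,.0f}x")
```

The calibrated pixel size of 1.39 Å corresponds to an effective magnification slightly below the nominal 47,000×, which is why the two quoted pixel sizes differ.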

When the direct electron-counting camera eventually became available (52), we decided to take advantage of this cutting-edge technology to record a second set of cryoEM data. Movies from this set were recorded using a Gatan K2 Summit direct electron detection camera operated in counting mode at a nominal magnification of 18,000×. Using a catalase crystal sample, the magnification was calibrated to 31,120×, giving a pixel size of 1.61 Å per pixel on the specimen. The dose rate of the electron beam was set to ~7 electrons per physical pixel per second on the camera, giving a corresponding dose rate of ~2.7 e−/Å2/s on the specimen. Image stacks were recorded at 4 frames per second (giving a per-frame dose of 1.75 e− per physical pixel) for 14 s, and a total of 12,000 movies were ultimately captured from two grids over a whole month of imaging.
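The dose figures quoted above follow from the camera dose rate, the calibrated pixel size, and the frame rate. The following is a minimal sketch of that arithmetic; the variable and function names are hypothetical, and all input values are taken from the text.

```python
PIXEL_SIZE_A = 1.61        # calibrated specimen pixel size (Å/pixel), from the text
DOSE_RATE_PER_PIXEL = 7.0  # e-/physical pixel/s on the camera, from the text
FRAME_RATE_HZ = 4          # frames per second, from the text
EXPOSURE_S = 14            # exposure time per movie (s), from the text

# Each pixel covers PIXEL_SIZE_A^2 square angstroms of specimen
dose_rate_per_a2 = DOSE_RATE_PER_PIXEL / PIXEL_SIZE_A**2     # e-/Å^2/s
dose_per_frame_pixel = DOSE_RATE_PER_PIXEL / FRAME_RATE_HZ   # e-/pixel/frame
n_frames = FRAME_RATE_HZ * EXPOSURE_S                        # frames per movie


def accumulated_dose(frames: int) -> float:
    """Specimen dose (e-/Å^2) accumulated over the first `frames` frames."""
    return dose_rate_per_a2 * frames / FRAME_RATE_HZ


print(f"dose rate on specimen: {dose_rate_per_a2:.2f} e-/Å^2/s")
print(f"dose per frame: {dose_per_frame_pixel:.2f} e-/pixel")
print(f"frames per movie: {n_frames}")
print(f"dose over all frames: {accumulated_dose(n_frames):.1f} e-/Å^2")
print(f"dose over first 36 frames: {accumulated_dose(36):.1f} e-/Å^2")
```

Under these inputs the full 56-frame exposure comes to roughly 38 e−/Å2 and the first 36 frames to roughly 24 e−/Å2, in line with the approximate totals of ~40 and ~24 e−/Å2 given for the two types of merged images in the image processing section.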

Image processing and 3D reconstruction

For each of the over 3800 film micrographs that we obtained and manually scanned (a laborious process that took hundreds of man-hours), defocus values and astigmatism parameters were determined using CTFFIND (53). Individual particle images (1280 × 1280 pixels) were boxed out automatically using the autoBox program in the IMIRS package (54), then checked manually using the boxer program in EMAN (55) to keep only well-separated, artifact-free particles. With the resulting 60,000 particle images, each individually screened, we proceeded to obtain a reconstruction of the HCMV particle at 4.5-Å resolution (fig. S2, A to C) with IMIRS. Building reliable atomic models at this resolution for capsid proteins as complex as those found in HCMV proved to be difficult. Thus, although our 4.5-Å reconstruction was the highest resolution yet achieved for a herpesvirus, and despite the sheer effort it took to obtain, we sought to push to an even higher resolution with the end goal of producing higher-fidelity atomic models.

For the movies obtained via direct electron counting from our second imaging session, drift correction was carried out between frames in each image stack using the UCSF software suite (52). Two types of final images were produced. The first type, with a total dose of ~40 e−/Å2, was generated by merging all frames (56 frames) of each image stack together. These were used for the determination of defocus values and astigmatism parameters, again with the program CTFFIND. The second type, with a total dose of ~24 e−/Å2 and used to carry out further data processing, was generated by merging the first 36 frames of each image stack. From the 10,264 movies ultimately processed, a total of 50,500 particle images (1024 × 1024 pixels) were selected, once again using autoBox from IMIRS initially and then checked manually in EMAN. From these particle images, initial particle orientation and center parameters were determined, and using IMIRS, the initial 3D reconstruction was obtained, computed using a graphics processing unit (GPU) set-up running eLite3D (56). Projection models obtained from the initial reconstruction were then used to refine initial particle orientation and center parameters to produce improved 3D reconstructions. After multiple iterative refinements in which astigmatism in the CTF correction was incorporated at each iteration, we achieved a final reconstruction with a resolution of 3.9 Å, obtained from 39,600 particles (Fig. 1, B and C). The effective resolution of our map was assessed with the 0.5 criterion of the reference-based Fourier shell correlation (FSC) coefficient (Cref = 0.5 or FSC = 0.143), as defined by Rosenthal and Henderson (57) (fig. S2C). Lastly, the map was deconvolved by a temperature factor of 100 Å2 to enhance higher resolution features, and the final reconstruction was filtered to 3.9-Å resolution by low-pass filtering with a cosine-shaped cutoff of 11 Fourier pixels (full width at half maximum). To further evaluate our map quality, a local resolution map was produced for the density surrounding an asymmetric unit using ResMap (58) (fig. S3).
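The equivalence of the two thresholds quoted above (Cref = 0.5 and FSC = 0.143) follows from the relation Cref = sqrt(2·FSC / (1 + FSC)) given by Rosenthal and Henderson (57). The following is a minimal NumPy sketch of that relation together with a generic shell-by-shell FSC between two half maps. It is not the IMIRS implementation; the function names and the random toy maps are assumptions for illustration only.

```python
import numpy as np


def cref_from_fsc(fsc):
    """Rosenthal-Henderson relation between half-map FSC and Cref."""
    fsc = np.asarray(fsc, dtype=float)
    return np.sqrt(2.0 * fsc / (1.0 + fsc))


def fourier_shell_correlation(map_a, map_b):
    """FSC per integer-radius Fourier shell for two cubic maps of equal shape."""
    fa = np.fft.fftshift(np.fft.fftn(map_a))
    fb = np.fft.fftshift(np.fft.fftn(map_b))
    n = map_a.shape[0]
    # Integer distance of every Fourier voxel from the zero-frequency voxel
    grid = np.indices(map_a.shape) - n // 2
    radius = np.rint(np.sqrt((grid**2).sum(axis=0))).astype(int)
    fsc = np.zeros(n // 2)
    for r in range(n // 2):
        shell = radius == r
        num = np.real(np.sum(fa[shell] * np.conj(fb[shell])))
        den = np.sqrt(np.sum(np.abs(fa[shell])**2) * np.sum(np.abs(fb[shell])**2))
        fsc[r] = num / den if den > 0 else 0.0
    return fsc


# Toy half maps: a shared random "signal" plus independent noise (illustrative only)
rng = np.random.default_rng(0)
signal = rng.normal(size=(16, 16, 16))
half_a = signal + 0.5 * rng.normal(size=signal.shape)
half_b = signal + 0.5 * rng.normal(size=signal.shape)

fsc_curve = fourier_shell_correlation(half_a, half_b)
print("FSC per shell:", np.round(fsc_curve, 3))
print("Cref at FSC = 0.143:", round(float(cref_from_fsc(0.143)), 3))
```

In practice, the reported resolution corresponds to the spatial frequency of the shell at which the half-map FSC curve falls to 0.143.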

On a related note, the large number of particle images required to produce our 3.9-Å reconstruction is a result of what is known as the Ewald sphere curvature (or “depth of focus”) problem. Essentially, the Fourier transform of a cryoEM image corresponds to the sum of the Fourier values on two spheres in reciprocal space (59, 60), but most reconstruction methods, including IMIRS used here, are based on the Central Projection Theorem, which treats the two spheres as flat, so that they degenerate into a single central section of the 3D Fourier transform of the original object. Such treatment imposes a resolution limit on the resulting 3D reconstructions if the particles do not have rotational symmetry (as is the case for ribosomes). As this limiting effect is a function of both electron voltage and particle size (sample thickness) (60, 61), our use of a 300-kV accelerating voltage helped alleviate the problem somewhat, but the colossal ~2000-Å particle size of HCMV remained a sizeable hurdle. Fortunately, in the presence of rotational symmetry, as is the case in our study with an icosahedral virus, the limiting effect can be gradually removed by imposing symmetry during 3D reconstruction. This utilizes structural information near the central region of the particle (i.e., “good” information) to average out the compromised structural information contributed by the top and bottom of the particle (i.e., “bad” information) beyond the depth of focus of the microscope, where the effects of Ewald sphere curvature are the greatest. For this reason, the net result of ignoring the effects of Ewald sphere curvature in large icosahedral virus reconstructions is similar to applying an additional envelope damping function (i.e., the so-called “B-factor”) to high-resolution structural information. Indeed, as shown in our study, the total number of particle images needed for near-atomic resolution structural determination is considerably increased, even though the particle images were recorded with a direct electron-counting camera, which yields a relatively high signal-to-noise ratio.
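To give a sense of scale for the dependence on voltage and particle size described above, the sketch below computes the relativistic electron wavelength and a rough depth-of-focus resolution limit. It assumes the standard defocus phase term πλΔz·s², a defocus spread of ±t/2 across a particle of thickness t, and a tolerated phase error of π/2. That tolerance is an assumption made here purely for illustration; published criteria differ by constant factors, and this is not the analysis of references (60, 61). The function names are hypothetical.

```python
import math

H = 6.62607015e-34     # Planck constant (J s)
M0 = 9.1093837015e-31  # electron rest mass (kg)
E = 1.602176634e-19    # elementary charge (C)
C = 2.99792458e8       # speed of light (m/s)


def electron_wavelength_angstrom(voltage_v: float) -> float:
    """Relativistic electron wavelength (Å) at a given accelerating voltage (V)."""
    ev = E * voltage_v
    momentum = math.sqrt(2.0 * M0 * ev * (1.0 + ev / (2.0 * M0 * C**2)))
    return H / momentum * 1.0e10


def depth_of_focus_limit_angstrom(thickness_a: float, voltage_v: float,
                                  max_phase_error: float = math.pi / 2) -> float:
    """Resolution (Å) at which the defocus phase error across the particle
    reaches `max_phase_error` (an assumed tolerance, not a published value)."""
    lam = electron_wavelength_angstrom(voltage_v)
    # pi * lam * (t / 2) * s^2 <= max_phase_error, solved for s
    s_max = math.sqrt(2.0 * max_phase_error / (math.pi * lam * thickness_a))
    return 1.0 / s_max


wavelength_300kv = electron_wavelength_angstrom(300e3)
limit_300kv = depth_of_focus_limit_angstrom(2000.0, 300e3)
limit_200kv = depth_of_focus_limit_angstrom(2000.0, 200e3)

print(f"wavelength at 300 kV: {wavelength_300kv:.4f} Å")
print(f"estimated limit, 2000-Å particle, 300 kV: {limit_300kv:.1f} Å")
print(f"estimated limit, 2000-Å particle, 200 kV: {limit_200kv:.1f} Å")
```

Whatever tolerance is chosen, the estimate scales as the square root of wavelength times thickness, so a higher accelerating voltage and a thinner specimen both relax the limit, consistent with the qualitative argument above.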

Atomic-model building, refinement, and 3D visualization

Our initial attempts to atomically model HCMV using the first 4.5-Å reconstruction met with considerable obstacles and had mixed success. The main challenge lay in density quality, which progressively deteriorated at larger radii from the capsid center. Whereas large side chains were fairly well resolved in MCP floor regions as distinct density protrusions from the main chain (fig. S2B), smaller side chains were sometimes indistinguishable from the main chain’s Cα bumps, let alone from each other. This problem only worsened moving up toward the MCP tower regions. Furthermore, main chain density was often compromised, in places appearing broken or “branched” and giving the illusion of multiple possible main chain traces. Nevertheless, when it became apparent that 4.5 Å was to be the maximum attainable resolution of our initial film-based reconstruction, we began what would become a year-long effort of modeling with the 4.5-Å map.

In determining the main chain traces of the capsid proteins, we used EMAN to produce several versions of density maps filtered with different B-factors showing local regions around our target proteins. Maps were filtered to either better show low density threshold features (i.e., side chain densities) at the expense of main chain connectivity, or to better emphasize main chain connectivity at the expense of finer features. As automated Cα-building programs were out of the question for a map of this resolution, we utilized the Marker utility found in the UCSF Chimera tool suite (62) to first trace possible main chain paths through the density. Ambiguous breaks and “branches” were left unconnected such that these could be revisited after all confident main chain segments were traced and built. Chimera marker files and the filtered cryoEM density maps were then imported into the crystallographic program COOT (63) for further analysis.
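The trade-off between fine features and main chain connectivity when filtering with different B-factors can be illustrated by a Gaussian weighting of Fourier amplitudes, exp(−B·s²/4), where s is spatial frequency in Å⁻¹. The following is a minimal NumPy sketch, not the EMAN implementation we used; the function name, the sign convention (positive B smooths, negative B sharpens), the B values, and the random toy map are assumptions for illustration only.

```python
import numpy as np


def apply_b_factor(density, pixel_size_a, b_factor_a2):
    """Weight Fourier amplitudes of a 3D map by exp(-B * s^2 / 4).

    Assumed convention: B > 0 damps high-resolution terms (smoother map,
    better apparent connectivity); B < 0 boosts them (sharper map).
    """
    # Spatial frequency (1/Å) along each axis
    freqs = [np.fft.fftfreq(k, d=pixel_size_a) for k in density.shape]
    sx, sy, sz = np.meshgrid(*freqs, indexing="ij")
    s2 = sx**2 + sy**2 + sz**2
    weight = np.exp(-b_factor_a2 * s2 / 4.0)
    return np.real(np.fft.ifftn(np.fft.fftn(density) * weight))


# Toy map of random values (illustrative only), sampled at 1.39 Å/pixel
rng = np.random.default_rng(1)
toy_map = rng.normal(size=(16, 16, 16))

smoothed = apply_b_factor(toy_map, 1.39, 150.0)    # emphasizes connectivity
sharpened = apply_b_factor(toy_map, 1.39, -150.0)  # emphasizes fine features

print("std of original map: ", round(float(toy_map.std()), 3))
print("std of smoothed map: ", round(float(smoothed.std()), 3))
print("std of sharpened map:", round(float(sharpened.std()), 3))
```

Viewing the same region in both a smoothed and a sharpened version is what allows main chain paths and side chain features to be assessed separately.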

We then began constructing Cα models following our marker trace files using the manual Baton_build utility in COOT. Observable main chain residue bumps in the density were used as reference points to approximate Cα positions. After Cα backbones of all confident main chain segment traces were built, previously unconnected main chain ends in ambiguous regions were analyzed to determine the final correct trace, relying on several levels of constraints. These included: the linearity of the protein; secondary structure predictions of the protein obtained from prediction servers JPRED (64) and Phyre2 (65), which we used to cross-reference helices and β-sheets visible in our density and traced in our main chain segments; and the protein amino acid sequence, which we cross-referenced with our Cα segments and density features to locate large side chain features that served as “landmarks” to evaluate the accuracy of our trace. Upon determining a complete main chain trace, a refined Cα model was rebuilt for model-able regions of the protein, and amino acid registration was accomplished by virtue of landmark side chain features. We thus worked out coarse models for most regions of the MCP, Tri1, Tri2A, and Tri2B proteins in our initial modeling attempt. Of note, the MCP upper domain trace was determined referencing HSV-1’s VP5ud model (35), which we fitted into our HCMV density map. Finally, at 4.5-Å resolution, attempts to perform large-scale refinement of our atomic models yielded highly inconsistent results. Consequently, we refined these first models completely by hand using the Regularization utility in COOT.

Soon after concluding our major modeling efforts on the 4.5-Å map, we gained access to the aforementioned direct electron-counting camera, at which point we decided to undertake a second imaging session in pursuit of a higher-resolution density map from which we could perhaps improve our atomic models. The resulting 3.9-Å density map obtained from our second round of imaging and reconstruction was drastically improved. On a qualitative level, densities at the capsid floor versus outer regions were much more consistent in side chain resolution and main chain connectivity (figs. S4 to S7). With the new 3.9-Å map, we validated our initial models and, importantly, the accuracy of our main chain traces in regions that were previously resolved through constraint-based deduction. The new map also affirmed that our residue registration using the 4.5-Å map was mostly correct across all modeled proteins, but contained occasional errors involving localized registration shifts of a few amino acids. These errors were corrected, and several regions previously left unmodeled in the 4.5-Å attempt due to poor density quality were modeled in this attempt. As before, we opted to leave regions with suboptimal density quality as gaps in the atomic model, as opposed to approximating poly-Ala traces. Notable regions and proteins that we had attempted but failed to model using the 4.5-Å map, and that we were able to model successfully using the improved 3.9-Å map, included: the N-anchor of Tri1; upper loop regions in the embracing arms of Tri2A and Tri2B; and the C-terminal 63 residues of the 75-residue SCP as well as the N-terminal one-third of the pp150 tegument protein, both situated at the outer extremes of the capsid.

The 3.9-Å map permitted us to refine our improved full-atom models using real space refinement in Phenix (66). After multiple rounds of refinement, the final coordinate file was submitted along with the EM density map to the Worldwide Protein Data Bank. Lastly, atomic models and densities in figures were visualized and rendered in Chimera, and movies were recorded using the Animations utility in Chimera.

Figure S3. Local resolution assessment. Local resolution heat maps of density slices through an asymmetric unit, rendered using ResMap (58). The red through blue color scheme corresponds to regions of relative low through high resolution.

Figure S4. Density map and atomic model of MCP. Insets correspond to zoomed-in views of boxed regions and illustrate residue features in the density map (mesh).

Figure S5. Density map and atomic model of Tri1. Insets correspond to zoomed-in views of boxed regions and illustrate residue features in the density map (mesh).

Figure S9. Lock unit interactions and global capsid organization. (A) A lock unit comprises six MCPs organized around a central triplex. Each lock unit includes a complete set of all interactions found within the HCMV capsid, including three pairs of type I interactions (blue), three pairs of type II interactions (red), and three pairs of type III interactions forming one lasso triangle (yellow), upon which the triplex sits. (B) Lock units are arranged in six overlapping units such that each hexon is at the center of a lock unit group-of-six (GOS). Each MCP is included in two lock units and directly interacts with four lock units. Thus, a lock unit interacts with all lock units within its GOS except the far opposite unit. Intersecting lock unit boundaries denote direct interactions. (C) A global view of GOSs reveals that each lock unit (filled hexagons) takes part in three GOSs (blue, teal, and green GOSs not shown for simplicity). Individual lock units are thus able to interact with nine unique lock units (four from each of three GOSs, with lock units shared between GOSs counted only once). Concatenated rings representing HK97’s covalent chainmail are superposed on overlapping GOSs for comparative reference.

Figure S10. Conserved regions and conformers of pp150nt. (A) Schematic showing the amino acid sequence and secondary structure of pp150nt. Lines represent loops, and pink cylinders represent helices. β-herpesvirus-conserved regions CR1 and CR2 are boxed in green, while the primate CMV-conserved cysteine tetrad is boxed in yellow. (B) Rainbow ribbon model (blue at the N terminus through green and yellow to red at the C terminus) of pp150nt with labeled helices. (C) Structural alignment based on Cα positions of the three pp150nt conformers associated with the Tb triplex reveals a greater degree of structural similarity at the upper helix bundle compared to the lower helix bundle.

Acknowledgments

Our research has been supported in part by grants from the NIH (GM071940, DE025567, and AI094386) and NSF (DMR-1548924). We acknowledge the use of instruments at the Electron Imaging Center for Nanomachines supported by the University of California–Los Angeles and by instrumentation grants from NIH (1S10RR23057 and 1U24GM116792) and NSF (DBI-1338135). We thank H. Zhu for providing AD169 HCMV virus stock, X. Zhang and S. Shivakati for assistance with tissue culture, W. H. Hui for assistance with cryoEM data collection, B. K. Zhou for scanning some of the films, and X. Zhang for help with the refinement of atomic models. The cryoEM density map and atomic coordinates of models reported here are deposited in the Electron Microscopy Data Bank and the Protein Data Bank with accession codes EMD-8703 and 5VKU, respectively. Z.H.Z. conceived the project; X.Y. and Z.H.Z. designed the experiments; X.Y. prepared the samples; X.Y., J.Jia., and Z.H.Z. recorded and processed the EM data; J.Jih, X.Y., and Z.H.Z. built the atomic models; J.Jih, X.Y., and Z.H.Z. analyzed and interpreted the models; J.Jih and X.Y. prepared the illustrations and media; Z.H.Z., X.Y., and J.Jih wrote the manuscript.