Abstract

The 1918 influenza pandemic resulted in about 20 million deaths. This enormous impact, coupled with renewed interest in emerging infections, makes characterization of the virus involved a priority. Receptor binding, the initial event in virus infection, is a major determinant of virus transmissibility that, for influenza viruses, is mediated by the hemagglutinin (HA) membrane glycoprotein. We have determined the crystal structures of the HA from the 1918 virus and two closely related HAs in complex with receptor analogs. They explain how the 1918 HA, while retaining receptor binding site amino acids characteristic of an avian precursor HA, is able to bind human receptors and how, as a consequence, the virus was able to spread in the human population.

The HAs of influenza viruses mediate receptor binding and membrane fusion, the first stages of virus infection (1). The receptors that they recognize are sialic acids of cell-surface glycoproteins and glycolipids, and the nature of the interactions involved in determining binding specificity has been described in biochemical, genetic, and structural studies (2–8). Sialic acids are usually found in either α2,3 or α2,6 linkages to galactose, the predominant penultimate sugar of N-linked carbohydrate side chains. The binding preference of a given HA for one or other of these linkage types correlates with the species specificity for infection. Thus, the HAs of all 15 antigenic subtypes found in avian influenza viruses bind preferentially to sialic acid in α2,3 linkage (9), and it is this form of the sialosaccharide that predominates in avian enteric tracts where these viruses replicate (10). Swine influenza viruses are reported to bind sialic acid in α2,6, and sometimes also α2,3, linkages (11), and sialic acid in both linkages is detected in porcine tracheae (10). Human viruses of the H1, H2, and H3 subtypes that are known to have caused pandemics in 1918, 1957, and 1968, respectively, recognize α2,6-linked sialic acid (5), the major form found on cells of the human respiratory tract (12, 13).

Because an avian origin is proposed for the HAs of swine and human viruses (14), changes in binding specificity are required for cross-species transfer. The mechanism that human viruses have used to achieve these changes appears to be different for different subtypes. For the HAs of the H2 and H3 human viruses, a minimum of two changes in receptor binding site amino acids, Gln226 to Leu226 and Gly228 to Ser228, correlates with the shift from avian to human receptor binding (15, 16). By contrast, HAs of human H1 viruses acquire the ability to bind to human receptors while retaining Gln226 and Gly228 (11). To understand how they do this, we determined the structures of HAs from the 1918 pandemic virus (1918-human) with the use of HA expressed from DNA of the sequence recovered from tissues infected with virus in 1918 (17) and from the prototype human (1934-human) and swine (1930-swine) H1 influenza viruses, A/Puerto Rico/8/34 and A/swine/Iowa/30, respectively (Fig. 1). A/Puerto Rico/8/34 was one of the first human influenza viruses isolated in the Americas (18) and has been widely used in laboratory investigations of influenza. A/swine/Iowa/30 was the first influenza virus isolated from mammals in 1930 (19), 3 years before the first isolate was recovered from humans (20).

Fig. 1.

Sequence alignment of 1918-human, 1930-swine, and 1934-human HAs. The subdomain structure of HA in the two polypeptides, HA1 and HA2, that form each monomer is indicated by the colored bars above the sequences. Carbohydrate attachment sites are shaded gray. Residue numbering is based on the H3 HA sequence (1). Nonconserved residues are in red.

Overall structure. The structures were solved by molecular replacement, and crystallographic statistics are given in Table 1 and table S1. The overall trimeric structures of the three H1 HAs are similar (Fig. 2), but they show notable differences to HAs of other subtypes with respect to the arrangements of the receptor binding, vestigial esterase, and membrane fusion subdomains, both within the HA trimer and also within individual monomers (21) (table S2). As predicted from their placement in the same phylogenetic and structure-based clade, the H1 HAs are most similar to those of the H5 subtype (21). We have examined the structures in detail in relation to their receptor binding and membrane fusion activities and to the antigenic variation that occurred in both periods of human H1 virus prevalence, 1918 to 1957 and 1977 to date (Fig. 2), and we will present a detailed description of this analysis elsewhere (22). We have, however, concluded that the receptor binding properties are the most distinctive features of the 1918 virus HA, and we focus on these here.

Fig. 2.

Structures of 1918-human, 1934-human, and 1930-swine HAs. (A) Ribbons diagram of the trimer of 1918-human HA. Monomers 2 and 3 are in silver and gold, respectively, and the featured monomer is colored according to its individual subdomains as in Fig. 1: receptor binding (RB) in blue, vestigial esterase (E) in yellow, and fusion subdomains (F′ and F) in magenta and red, respectively (21). Changes in the relative orientations of individual subdomains of H1 HA relative to H3 HA can usefully be described (21) after superposing the two structures with the use of just the F domain. The RB, E, and F′ subdomains of H1 are related to the equivalent subdomains in H3 by a clockwise rotation of 23° about the trimer threefold axis. Glycosylation sites are indicated by spheres; in green are those observed in the 1918-human HA (21, 33, 94, and 289 in the F′ and E subdomains and 154 in the F subdomain) and in red are the sites (63, 81, 129 or 131, 158, 163, and 271 in the F′, E, and RB subdomains) that have accumulated on the membrane-distal surface between 1918 and 2002 (17, 25, 29) and that probably influence antigenicity. (B) An expanded view of the superposed polypeptide backbone of the receptor binding site of all three H1 HAs. The positions of the three secondary structure units making up the site (the 190 helix and the 130 and 220 loops) are indicated. Also shown are the side chains of some residues important for receptor binding. Certain Cα positions are indicated by black spheres for residues discussed in the text. Overall, the three H1 structures are very similar, with rmsd on all Cα of 0.82 Å and 0.56 Å for 1918-human versus 1934-human and 1930-swine, respectively. (C) An expanded view of a region of the F subdomain indicating differences between H1 and H5 subtype HAs in the position of the loop connecting helix A to helix B. Because H1 and H5 are in the same phylogenetic clade, these differences imply that the position of the interhelical loop relative to helix B is not a clade-specific feature, as suggested previously (21). Interactions between the C-terminal region of the loop and the RB and E subdomains (110 helix) influence the dispositions of the subdomains relative to the central coiled coil formed in the trimer by the B helices.
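The Cα rmsd values quoted in the legend above summarize how closely two superposed backbones agree. As a minimal sketch of that calculation (not the procedure used in this study, which is not described here), the function below computes the root-mean-square deviation between two equally long lists of Cα coordinates that are assumed to be already superposed. The function name `ca_rmsd` and the toy coordinates are hypothetical and serve only as illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) positions.

    Assumes the two coordinate sets are already superposed and paired
    residue by residue; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates (hypothetical, NOT taken from the deposited structures):
# every atom in toy_b is displaced by exactly 1 unit along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

print(ca_rmsd(toy_a, toy_b))  # 1.0
```

With real structures, the coordinates would first have to be extracted and optimally superposed, a step this sketch deliberately leaves out.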

Table 1.

Crystallographic statistics. More extensive crystallographic data are given in table S1.

H1 HA type                Resolution (Å)   Rwork (%)   Rfree (%)   Rmsd bonds
Human 1918                2.9              24.8        28.9        0.007
Swine 1930 uncomplexed    2.7              24.8        26.5        0.006
  + avian receptor        2.5              21.4        25.4        0.006
  + human receptor        2.5              21.7        25.4        0.007
Human 1934 uncomplexed    2.3              23.5        26.6        0.006
  + avian receptor        2.2              22.7        26.0        0.006
  + human receptor        2.25             22.5        25.7        0.006
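Readers who wish to compare the refinement statistics across the seven structures may find it convenient to tabulate the difference between Rfree and Rwork for each. The sketch below does only that, using the values from Table 1. The text itself does not discuss this difference, and the names `TABLE_1`, `r_gap`, and `gaps` are hypothetical.

```python
# (label, resolution in Å, Rwork in %, Rfree in %), copied from Table 1.
TABLE_1 = [
    ("Human 1918", 2.9, 24.8, 28.9),
    ("Swine 1930 uncomplexed", 2.7, 24.8, 26.5),
    ("Swine 1930 + avian receptor", 2.5, 21.4, 25.4),
    ("Swine 1930 + human receptor", 2.5, 21.7, 25.4),
    ("Human 1934 uncomplexed", 2.3, 23.5, 26.6),
    ("Human 1934 + avian receptor", 2.2, 22.7, 26.0),
    ("Human 1934 + human receptor", 2.25, 22.5, 25.7),
]


def r_gap(rwork, rfree):
    """Rfree minus Rwork in percentage points, rounded to one decimal."""
    return round(rfree - rwork, 1)


# Map each structure label to its Rfree - Rwork difference.
gaps = {label: r_gap(rwork, rfree) for label, _, rwork, rfree in TABLE_1}

for label, gap in gaps.items():
    print(f"{label:30s} {gap:.1f}")
```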

The receptor binding subdomain. The receptor binding sites are located at the membrane-distal tip of each subunit of the HA trimer (Fig. 2). Three secondary structure elements, the 190 helix (residues 190 to 198), the 130 loop (residues 135 to 138), and the 220 loop (residues 221 to 228), form the sides of each site, with the base made up of the conserved residues Tyr98, Trp153, His183, and Tyr195 (1) (Fig. 2B). The conformations adopted by the 130 and 220 loops of the three H1 HAs are similar, but they are significantly different from those of the equivalent loops in the HAs of other influenza subtypes (2, 3) (Figs. 2B, 3, and 4 and table S2). To understand the structural basis of the receptor specificity of H1 HAs, we determined the structures of the 1934-human and the 1930-swine HAs in complex with α2,3- and α2,6-linked sialopentasaccharides as analogs of avian and human receptors, respectively (2) (Fig. 3 and figs. S1 and S2). As observed with other HAs (1, 3, 8), the terminal sialic acids of the human and avian receptors interact with binding site residues through a series of conserved hydrogen bonds (Fig. 3 and fig. S2).

Fig. 3.

Interactions of 1934-human HA (top) and 1930-swine HA (bottom) with human receptor and with avian receptor analogs. The view of the receptor binding site is about the same as in Fig. 2B. The three secondary structure components of the binding site are labeled in this backbone representation together with some of the most relevant side chains. The broken lines indicate potential hydrogen-bond interactions between the protein and the receptors; residues making interactions via main-chain carbonyl groups are shown as red spheres, whereas those interacting via main-chain nitrogens are shown as blue spheres. In all four panels, the sialosaccharides are colored yellow for carbon atoms, blue for nitrogen, and red for oxygen. Water molecules are indicated by green spheres. (A) 1934-human HA in complex with human receptor and (B) 1934-human HA in complex with avian receptor; in both cases, the HA is colored in green for backbone and carbon atoms. (C) 1930-swine HA in complex with human receptor and (D) 1930-swine HA in complex with avian receptor; in these cases, the backbone and carbon atoms of the HA are colored in blue. The small black arrows in (A), (B), and (C) indicate that for the two human receptor complexes the Sia-1-Gal-2 linkage adopts a cis conformation about the glycosidic bond, whereas for the avian complex it adopts a trans conformation. The large black arrow in (C) indicates the direction of an axis parallel to the trimer threefold axis.

Fig. 4.

Differences in the orientations of bound receptors in the receptor binding sites of three different HAs. The receptor binding sites of 1934-human HA (green), 1930-swine HA (blue), and human H3 HA (red) are overlaid. The view matrix is about the same as in Fig. 3. The sialopentasaccharides are colored according to the HAs to which they are bound. The side chains of Gln226 (H1 HAs) and Leu226 (H3 HA) are shown. (A) Human receptor complexes. (B) Avian receptor complexes.

The 1934-human HA/human receptor complex. The electron density maps reveal well-ordered features for the Sia-1, Gal-2, and GlcNAc-3 of the sialopentasaccharide in this complex (Fig. 3A and figs. S1 and S2). Gal-2 forms five hydrogen bonds that have not been previously observed in other HA/receptor complexes. Four bonds are possible between the 2- and 3-hydroxyls of Gal-2 and the side chains of Lys222 and Asp225, and a fifth between the 4-hydroxyl of Gal-2 and the main-chain amide of 227 that is mediated by a water molecule.

The 1934-human HA/avian receptor complex. Again only the Sia-1, Gal-2, and GlcNAc-3 moieties of the sialopentasaccharide are ordered in this complex (Fig. 3B and figs. S1 and S2). This observation is consistent with the results of hemagglutination assays showing dual binding specificity for this HA (5, 11). The side-chain carbonyl of Gln226 forms a hydrogen bond with the 4-hydroxyl of Gal-2, as observed in other HA/avian receptor complexes (3). In addition, there is a previously unknown water-mediated interaction between the 4-hydroxyl of Gal-2, the main-chain carbonyl of residue 225, and the side chain of Lys222 (Fig. 3B).

The 1930-swine HA/human receptor complex. The sialic acid of the receptor is located similarly in the 1930-swine and 1934-human HA receptor binding sites, but in the 1930-swine complex all five saccharides of the receptor analog are detected (Fig. 3C and figs. S1 and S2). Lys222 again forms hydrogen bonds with the 2- and 3-hydroxyls of Gal-2, although in this case Gal-2 sits higher in the binding site. Asp190 hydrogen-bonds to the amino nitrogen of GlcNAc-3, Ser193 hydrogen-bonds to the 2-hydroxyl of Gal-4, and there is a water-mediated interaction between Thr189 and GlcNAc-5. The last three interactions have not been observed before in HA receptor complexes. In addition, the sialopentasaccharide exits the binding site in an orientation not previously seen, crossing the 190 helix near its N terminus, about parallel to the threefold symmetry axis of the HA trimer (Fig. 3C).

The 1930-swine HA/avian receptor complex. The electron density for the avian receptor analog bound to the 1930-swine HA is weak and mainly represents the sialic acid moiety (Fig. 3D and figs. S1 and S2). A similar situation was observed for an H5 avian HA in complex with a human receptor analog, where only a subset of the atoms for the sialic acid could be located (3). These observations probably reflect the low affinity of the HAs for their respective ligands, consistent with the preference of the 1930 swine virus for human receptor in hemagglutination assays as detailed in (11).

To ascertain the essential differences in the binding sites that allow the “avian” binding site structure of H1 HAs to recognize both human and avian receptors, we compared their structures with those of H3 avian (A/duck/Ukraine/63) and human (X-31) HAs determined previously to prefer either avian (23) or human (2) receptors.

Human receptor complexes. Complexes of human receptor analog bound to 1934-human HA (green) and 1930-swine HA (blue) and to human H3 HA (red) are superposed in Fig. 4A. Perhaps the most important feature of this comparison is the difference in structure adopted by the 130 and 220 loops of the receptor binding site between the H1 and H3 HAs. One consequence of the change in the 130 loop structure is that the sialic acid of the receptor is tilted about 10° into the receptor binding sites of the H1 HAs. This effect, together with different orientations about the glycosidic bond, contributes to Gal-2 being located almost 2 Å lower in the H1 HAs than in the human H3 HAs. Gal-2 is able to adopt this position because structural differences in the 220 loop locate Gln226 lower in the binding site than the equivalent Leu226 of human H3 HA. Consequently, in the H1 HAs Gal-2 is located closer to the 220 loop and is able to form hydrogen bonds with Lys222. In the case of the 1934-human HA, Gal-2 also interacts with Asp225. Thus, a combination of factors relating to the structure of the 130 and 220 loops enables the H1 HAs to make favorable hydrogen-bond interactions with Gal-2 of the human receptor. Gln226 plays an essentially passive role in this process, in marked contrast to the role played by Leu226 in the binding of human H3 HA to human receptor. In that case, Gal-2 makes hydrophobic contacts with Leu226, and the higher position and the nature of this side chain are important for human receptor binding (2, 23).

Avian receptor complexes. Complexes of avian receptor analogs with 1934-human HA (green), 1930-swine HA (blue), and an avian H3 HA (red) (23) are overlaid in Fig. 4B. Again the differences in the structure of the 130 loop between the H1 and H3 HAs result in the sialic acid of the avian receptor being located lower in the receptor binding site of the H1 HAs. Comparison of the 1934-human and avian H3 complexes also reveals that Gal-2 of the avian receptor is located about 1 Å lower in the binding site of the H1 complex, as is Gln226. In both complexes, the 4-hydroxyl of Gal-2 hydrogen-bonds with the side-chain carbonyl of Gln226 (Fig. 3B), and the coordinated differences in position of the bound receptor and Gln226 enable this interaction to be conserved. It seems therefore that 1934-human HA is able to bind avian receptor in a manner reminiscent of avian HAs (3), with Gln226 playing a key role.

Given that the overall structures of 1930-swine and 1934-human HA receptor binding sites are very similar (Fig. 2B) and that both contain a glutamine residue at position 226, why does the 1930-swine HA bind less effectively to avian receptors than 1934-human HA? The reason seems to be that the position adopted by Gln226 in 1934-human HA is about 1 Å higher in its complex with the avian receptor than it is either in the human receptor complex or uncomplexed (Fig. 5). By contrast, the position of Gln226 in 1930-swine HA is about the same uncomplexed and in the human and avian receptor complexes. The apparent inability of Gln226 to adopt a higher position in the receptor binding site seems to explain the failure of 1930-swine HA to interact as effectively with the avian receptor. This explanation is supported by the structural observation that, in the 1934-human HA complex with avian receptor, Glu190 interacts through two water molecules with Gln226 (Fig. 5). This network of hydrogen bonds may be necessary to position Gln226 in the binding site for its interaction with Gal-2. Glu190 is conserved in avian H1 HAs, all of which specifically bind α2,3-linked receptors. By contrast, residue 190 of 1930-swine HA is an aspartic acid, which does not interact with either the 9-hydroxyl of Sia-1 or Gln226 and is thus unable to facilitate binding to avian receptor.

Fig. 5.

Superposition of the binding site of 1934-human HA in its uncomplexed state and complexed with avian receptor analog. The HA is shown in green in both cases, and the avian receptor is colored as in Fig. 3. Two water molecules, shown as green spheres, link Glu190 to Gln226 in the avian receptor complex. This hydrogen-bonded network is not formed in the uncomplexed structure or in the human receptor complex (not shown).

Many of the viruses from the first H1 influenza pandemic period, 1918 to 1957, recognized to some extent both avian and human receptors, whereas those from the second period, 1977 to date, appear to be more human receptor–specific (11). Loss of the ability to bind avian receptors, which may be an advantage in the face of infectivity-blocking α2,3-linked soluble sialosides in human lungs (13), was proposed to correlate with the amino acid substitution of Ala138 to Ser138 (11). Our structural data also support this interpretation, because an interaction between Ser138 and Gln226 is feasible and would favor the lower positioning in the receptor binding site of Gln226 and the consequent negative effect on avian receptor binding.

The receptor binding specificity of the 1918-human HA. Although we have been unable to obtain receptor analog complexes with the 1918 HA, we can deduce its likely specificity on the basis of our observations of receptor binding to 1930-swine and 1934-human HAs. Table 2 compares potentially important contact residues in the receptor binding sites of 1934-human, 1930-swine, 1918-human, and a representative avian H1 HA (9). These data, together with the close similarity of the structures of the human and swine HA binding sites (Fig. 2), indicate that two of the five HAs reported from 1918 viruses (17, 24) have receptor binding sites, and presumably binding specificities, indistinguishable from those of 1930-swine HA. The other three 1918 HAs, which differ only at residue 225 (Table 2), most likely also prefer binding to human rather than avian receptors. If interactions with Lys222 and Asp225 are formed with human receptors, as in the 1934-human HA complex, then the overall orientation of the oligosaccharide in the 1918 HA binding sites may also be similar. In this case, Asp190 would not contact the amino group of GlcNAc-3, and the entire complex would closely resemble that formed by the 1934-human HA rather than by the 1930-swine HA. However, the fact that Asp190 is often conserved in human H1 HAs (25), as is Ser193, suggests that, in the 1918 HA and in H1 HAs generally, these residues interact with GlcNAc-3 and Gal-4 in similar ways to those seen in the 1930-swine HA/human receptor complex. This conclusion is also consistent with the ability of human H1 HAs to discriminate between the receptor analogs 6′-sialyllactosamine and 6′-sialyllactose, which is reported to be dependent on Asp190 (26, 27).

Table 2.

Residue types at four positions in the receptor binding sites of 1918-human, 1930-swine, 1934-human, and a representative avian H1 HA. Numbers in parentheses are the number of sequences available.

Residue of HA   1918-human          1930-swine   1934-human   Avian
190             Asp                 Asp          Glu          Glu
193             Ser                 Ser          Asn          Ser
222             Lys                 Lys          Lys          Lys
225             Asp (3)/Gly (2)     Gly          Asp          Gly
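The comparison drawn from Table 2 can be made explicit with a short sketch that lists, for each of the two 1918-human sequence variants, the positions at which it differs from the other HAs. The residue assignments are copied from Table 2. The dictionary layout, the variant labels, and the function name `differing_positions` are hypothetical and are introduced only for illustration.

```python
# Receptor binding site positions compared in Table 2 (H3 numbering).
POSITIONS = (190, 193, 222, 225)

# Residues at those positions, copied from Table 2. The two 1918-human
# entries represent the Asp225 (3 sequences) and Gly225 (2 sequences) variants.
BINDING_SITE = {
    "1918-human (Asp225)": ("Asp", "Ser", "Lys", "Asp"),
    "1918-human (Gly225)": ("Asp", "Ser", "Lys", "Gly"),
    "1930-swine": ("Asp", "Ser", "Lys", "Gly"),
    "1934-human": ("Glu", "Asn", "Lys", "Asp"),
    "avian H1": ("Glu", "Ser", "Lys", "Gly"),
}


def differing_positions(name_a, name_b):
    """Return the Table 2 positions at which two HAs carry different residues."""
    residues_a = BINDING_SITE[name_a]
    residues_b = BINDING_SITE[name_b]
    return [pos for pos, ra, rb in zip(POSITIONS, residues_a, residues_b) if ra != rb]


for variant in ("1918-human (Asp225)", "1918-human (Gly225)"):
    for reference in ("1930-swine", "1934-human", "avian H1"):
        print(variant, "vs", reference, "->", differing_positions(variant, reference))
```

At these four positions the Gly225 variant is identical to 1930-swine HA, and the Asp225 variant differs from it only at residue 225, in line with the discussion above. The sketch covers only the four tabulated positions and says nothing about the rest of the binding site.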

Irrespective of the single amino acid difference of Asp or Gly at residue 225 (Table 2) between the sequences of 1918-human HAs, by recognizing human receptors all would meet the first requirement of an epidemic virus: the ability to spread in the human population. The importance of this requirement was emphasized in the 1997 outbreak of H5 “chicken” influenza in Hong Kong, when the virus was extremely virulent but did not acquire the ability to bind α2,6-linked sialosides (28) and was therefore unable to spread. With the ability to ensure the efficiency of the initial stages of virus infection, coupled with novel antigenicity, the 1918-human HA may have been the prime determinant of extensive mortality in the 1918 pandemic.

We thank S. Smerdon and P. Walker for assistance and discussion. Diffraction data were collected at Daresbury Synchrotron Radiation Source (SRS) and European Synchrotron Radiation Facility (ESRF) Grenoble. We thank J. Nicholson (SRS) and S. McSweeney (ESRF) for rapid access to and assistance with beamtime as well as other beamline staff. This research was supported by the MRC (UK), NIH grant AI-13654, by a supplement to this grant for Expanded International Research on Emerging and Re-Emerging Diseases, by an International Partnership Research Award in Veterinary Epidemiology of the Wellcome Trust, and by the Howard Hughes Medical Institute. The coordinates for the 1918-human, 1934-human, 1934-human/human, 1934-human/avian, 1930-swine, 1930-swine/human, and 1930-swine/avian HAs have been deposited in the Protein Data Bank (accession codes 1RUZ, 1RU7, 1RVZ, 1RVX, 1RUY, 1RVT, and 1RV0).